Aljamel, A, Osman, T ORCID: https://orcid.org/0000-0001-8781-2658, Acampora, G ORCID: https://orcid.org/0000-0003-4082-5616, Vitiello, A and Zhang, Z ORCID: https://orcid.org/0000-0002-8587-8618, 2018. Smart information retrieval: domain knowledge centric optimization approach. IEEE Access, 7, pp. 4167-4183. ISSN 2169-3536
12995_Osman.pdf - Post-print (999kB)
Abstract
In the age of the Internet of Things (IoT), online data has grown significantly in volume and diversity, and information retrieval has become an important theme in Internet-oriented data science research. In information retrieval, machine-learning techniques have been widely adopted to automate the challenging process of relation extraction from text data, which is critical to the accuracy and efficiency of information retrieval-based applications including recommender systems and sentiment analysis. In this context, this paper introduces a novel, domain knowledge centric methodology that aims to improve the accuracy of machine-learning methods for relation classification, and then utilises Genetic Algorithms (GAs) to optimise the feature selection for the learning algorithms. The proposed methodology makes a significant contribution to the processes of domain knowledge-based relation extraction, including interrogating Linked Open Datasets to generate the relation classification training data, addressing the imbalanced classification in the training datasets, determining the probability threshold of the best learning algorithm, and establishing the optimum parameters for the genetic algorithm utilised in feature selection. The experimental evaluation of the proposed methodology reveals that the adopted machine-learning algorithms exhibit higher precision and recall in relation extraction in the reduced feature space optimised by the implementation. The machine-learning algorithms considered include Support Vector Machine, Perceptron Algorithm Uneven Margin and K-Nearest Neighbours. The outcome is verified by comparing against the Random Mutation Hill-Climbing optimisation algorithm using Wilcoxon signed-rank statistical analysis.
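To illustrate the idea of using a genetic algorithm to optimise feature selection, the following is a minimal sketch and not the authors' implementation. It assumes a binary mask over features, tournament selection, single-point crossover, bit-flip mutation and elitism; the function names, the parameter defaults and the toy fitness function are all hypothetical. In the setting the abstract describes, the fitness would instead be a classifier's performance on the selected features.

```python
import random

def genetic_feature_selection(n_features, fitness, pop_size=20, generations=30,
                              crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximises fitness(mask)."""
    rng = random.Random(seed)
    # Each individual is a list of 0/1 flags, one per feature.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament():
        # Return the fitter of two randomly chosen individuals.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_population = [best[:]]  # elitism: carry over the best mask so far
        while len(next_population) < pop_size:
            p1, p2 = tournament(), tournament()
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation: toggle each flag with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best

# Hypothetical toy problem: pretend only these feature indices carry signal.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    # Reward informative features; penalise every selected feature slightly,
    # standing in for "accuracy with a preference for fewer features".
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)

if __name__ == "__main__":
    mask = genetic_feature_selection(10, toy_fitness)
    print("selected features:", [i for i, bit in enumerate(mask) if bit])
```

Population size, number of generations, and crossover and mutation rates are the kind of genetic algorithm parameters the abstract refers to establishing; the values above are arbitrary placeholders.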
Item Type: | Journal article |
---|---|
Publication Title: | IEEE Access |
Creators: | Aljamel, A., Osman, T., Acampora, G., Vitiello, A. and Zhang, Z. |
Publisher: | Institute of Electrical and Electronics Engineers |
Date: | 7 December 2018 |
Volume: | 7 |
ISSN: | 2169-3536 |
Identifiers: | DOI: 10.1109/ACCESS.2018.2885640 |
Rights: | This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/ |
Divisions: | Schools > School of Science and Technology |
Record created by: | Jonathan Gallacher |
Date Added: | 04 Jan 2019 16:22 |
Last Modified: | 09 Sep 2019 15:22 |
URI: | https://irep.ntu.ac.uk/id/eprint/35473 |