Smart information retrieval: domain knowledge centric optimization approach

Aljamel, A, Osman, T (ORCID: https://orcid.org/0000-0001-8781-2658), Acampora, G (ORCID: https://orcid.org/0000-0003-4082-5616), Vitiello, A and Zhang, Z (ORCID: https://orcid.org/0000-0002-8587-8618), 2018. Smart information retrieval: domain knowledge centric optimization approach. IEEE Access, 7, pp. 4167-4183. ISSN 2169-3536

12995_Osman.pdf - Post-print

Abstract

In the age of the Internet of Things (IoT), online data has grown significantly in volume and diversity, and information retrieval has become one of the important themes in Internet-oriented data science research. In information retrieval, machine-learning techniques have been widely adopted to automate the challenging process of relation extraction from text data, which is critical to the accuracy and efficiency of information retrieval-based applications, including recommender systems and sentiment analysis. In this context, this paper introduces a novel, domain knowledge centric methodology that aims to improve the accuracy of machine-learning methods for relation classification and then utilises Genetic Algorithms (GAs) to optimise feature selection for the learning algorithms. The proposed methodology makes significant contributions to the processes of domain knowledge-based relation extraction, including interrogating Linked Open Datasets to generate the relation classification training data, addressing imbalanced classification in the training datasets, determining the probability threshold of the best learning algorithm, and establishing the optimum parameters for the genetic algorithm used in feature selection. The experimental evaluation of the proposed methodology reveals that the adopted machine-learning algorithms exhibit higher precision and recall in relation extraction in the reduced feature space optimised by the implementation. The machine-learning algorithms considered are Support Vector Machine, Perceptron Algorithm with Uneven Margins and K-Nearest Neighbours. The outcome is verified by comparison against the Random Mutation Hill-Climbing optimisation algorithm using Wilcoxon signed-rank statistical analysis.
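
As an illustration of the GA-based feature selection mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It evolves a bit mask over features using tournament selection, one-point crossover, bit-flip mutation and elitism. The function names (`run_ga`, `toy_fitness`), the parameter values and the `INFORMATIVE` set are all assumptions made for this example; in the paper's setting the fitness would presumably be derived from the performance of a relation classifier trained on the selected features, which is replaced here by a toy scoring function.

```python
import random


def run_ga(fitness, n_features, pop_size=20, generations=40,
           crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Evolve a 0/1 feature mask; returns (best_mask, best_score)."""
    rng = random.Random(seed)
    # Random initial population of feature masks.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament():
        # Binary tournament: pick two masks, keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_gen = [best[:]]  # elitism: the best mask always survives
        while len(next_gen) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                # One-point crossover.
                cut = rng.randint(1, n_features - 1)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation, applied independently to each gene.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=fitness)
    return best, fitness(best)


# Hypothetical stand-in for classifier performance: features 0, 3 and 5
# are treated as informative, and every selected feature costs 0.1.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


best_mask, best_score = run_ga(toy_fitness, n_features=10)
print(best_mask, best_score)
```

Because the best mask is carried into each new generation, the best score never decreases between generations. The population size, crossover rate and mutation rate used above are placeholders; the abstract states that the optimum GA parameters are established experimentally.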

Item Type: Journal article
Publication Title: IEEE Access
Creators: Aljamel, A., Osman, T., Acampora, G., Vitiello, A. and Zhang, Z.
Publisher: Institute of Electrical and Electronics Engineers
Date: 7 December 2018
Volume: 7
ISSN: 2169-3536
Identifiers: 10.1109/ACCESS.2018.2885640 (DOI)
Rights: This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/
Divisions: Schools > School of Science and Technology
Record created by: Jonathan Gallacher
Date Added: 04 Jan 2019 16:22
Last Modified: 09 Sep 2019 15:22
URI: https://irep.ntu.ac.uk/id/eprint/35473
