Osman, T (ORCID: https://orcid.org/0000-0001-8781-2658), Khalil, H, Miltan, M, Shaalan, K and Alfrjani, R (ORCID: https://orcid.org/0009-0007-4624-4245), 2023.
Exploiting functional discourse grammar to enhance complex Arabic relation extraction using a hybrid semantic knowledge base - machine learning approach.
ACM Transactions on Asian and Low-Resource Language Information Processing, 22 (8): 214.
ISSN 2375-4699
Text: 2412478_Osman.pdf - Post-print (2MB)
Abstract
Relation extraction from unstructured Arabic text is especially challenging due to the Arabic language's complex morphology and the variation in word semantics and lexical categories. The research documented in this paper presents a hybrid Semantic Knowledge base - Machine Learning (SKML) approach for extracting complex Arabic relations from unstructured Arabic documents. The proposed approach exploits the principles of Functional Discourse Grammar (FDG) to emphasise the semantic and pragmatic properties of the language and to facilitate the identification of relation elements. In its initial phase, the novel FDG-SKML relation extraction approach deploys a lexical-based mechanism that utilises a purpose-built, domain-specific Semantic Knowledge base to encode the semantic associations between the elements of the identified relations. The evaluation of this initial stage evidenced improved accuracy in extracting most complex Arabic relations. The initial relation extraction mechanism was then extended by feeding its output into a Machine Learning classifier, which facilitated the extraction of especially complex relations with significant disparity in the presence, order, and correlation of their elements. Using Economics as the problem domain, experimental evaluation evidenced the high accuracy of our FDG-SKML approach in the complex Arabic relation extraction task and demonstrated its further improvement upon integration with machine learning classifiers.
Item Type: | Journal article |
---|---|
Publication Title: | ACM Transactions on Asian and Low-Resource Language Information Processing |
Creators: | Osman, T., Khalil, H., Miltan, M., Shaalan, K. and Alfrjani, R. |
Publisher: | Association for Computing Machinery (ACM) |
Date: | August 2023 |
Volume: | 22 |
Number: | 8 |
ISSN: | 2375-4699 |
Identifiers: | DOI: 10.1145/3610581; Other: 2412478 |
Rights: | © ACM 2023. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM Transactions on Asian and Low-Resource Language Information Processing, http://dx.doi.org/10.1145/10.1145/3615980 |
Divisions: | Schools > School of Science and Technology |
Record created by: | Laura Borcherds |
Date Added: | 21 Mar 2025 10:13 |
Last Modified: | 21 Mar 2025 10:13 |
URI: | https://irep.ntu.ac.uk/id/eprint/53276 |