Ensemble methods for instance-based Arabic language authorship attribution

Al-Hadhrami, T. ORCID: 0000-0001-7441-604X, Al-Sarem, M., Boulila, W., Saeed, F. and Alsaeedi, A., 2020. Ensemble methods for instance-based Arabic language authorship attribution. IEEE Access. ISSN 2169-3536

1268351_Al-Hadhrami.pdf - Published version



Authorship attribution (AA) is a subfield of authorship analysis, and it has become an important problem as the amount of anonymous information has grown with the rapid increase in internet usage worldwide. For languages such as English, Spanish and Chinese, the problem is well studied. For Arabic, however, AA has received less attention from the research community because of the complexity and nature of Arabic sentences. The paper presents an intensive review of previous studies on Arabic. Based on that review, the study employs the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to choose the base classifier of the ensemble methods. For attribution features, hundreds of stylometric features and distinct words are extracted using several tools. The AdaBoost and Bagging ensemble methods are then applied to a dataset of Arabic enquiries (Fatwa). The findings show an improvement in the effectiveness of the authorship attribution task for the Arabic language.
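As an illustration of how TOPSIS can rank candidate base classifiers, the following is a minimal sketch of the standard TOPSIS procedure (vector normalisation, weighting, distances to the ideal and anti-ideal solutions, closeness score). It is not the authors' implementation: the candidate names, the criteria (accuracy, F1, training time), the weights and all numeric values are hypothetical and chosen only to make the example runnable.

```python
import math


def topsis(matrix, weights, benefit):
    """Return one TOPSIS closeness score per row of `matrix` (higher is better).

    matrix  : list of rows, one row per alternative, one column per criterion
    weights : one weight per criterion
    benefit : one bool per criterion; True = larger is better, False = smaller is better
    Assumes no column is all zeros and the alternatives are not all identical.
    """
    n = len(weights)
    # Step 1: vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    # Step 2: ideal and anti-ideal solutions, column by column.
    columns = list(zip(*weighted))
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    anti = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    # Step 3: relative closeness to the ideal solution.
    scores = []
    for row in weighted:
        d_best = math.dist(row, ideal)
        d_worst = math.dist(row, anti)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical candidates: [accuracy, F1, training time in seconds].
candidates = {
    "clf_a": [0.90, 0.88, 12.0],
    "clf_b": [0.85, 0.86, 3.0],
    "clf_c": [0.70, 0.65, 20.0],
}
weights = [0.4, 0.4, 0.2]      # assumed importance of each criterion
benefit = [True, True, False]  # training time is a cost criterion

scores = dict(zip(candidates, topsis(list(candidates.values()), weights, benefit)))
best = max(scores, key=scores.get)
print(scores, best)
```

The candidate with the highest closeness score would be taken as the base classifier for the ensemble. The result depends on the chosen criteria and weights, which the abstract does not specify.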

Item Type: Journal article
Publication Title: IEEE Access
Creators: Al-Hadhrami, T., Al-Sarem, M., Boulila, W., Saeed, F. and Alsaeedi, A.
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 8 January 2020
ISSN: 2169-3536
Rights: This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.
Divisions: Schools > School of Science and Technology
Record created by: Linda Sullivan
Date Added: 27 Jan 2020 16:15
Last Modified: 27 Jan 2020 16:15
URI: https://irep.ntu.ac.uk/id/eprint/39092
