Enhancement of Arabic Text Classification Using Semantic Relations of Arabic WordNet

Suhad A. Yousif; Venus W. Samawi; Islam Elkabani; Rached Zantout

doi:10.3844/jcssp.2015.498.509

Research Article Open Access

Suhad A. Yousif¹, Venus W. Samawi², Islam Elkabani¹ and Rached Zantout³

¹ Beirut Arab University, Lebanon
² Amman Arab University, Jordan
³ Rafik Hariri University, Lebanon

Abstract

Arabic text classification methods have emerged as a natural response to the massive amount of varied Arabic-language textual information on the web. In most text classification processes, feature selection is a crucial task because it strongly affects classification accuracy. Generally, two types of features can be used: statistical features, and semantic and concept features. The main aim of this paper is to identify the semantic and concept features that are most effective for Arabic text classification. In this study, two novel feature sets that use the lexical, semantic and lexico-semantic relations of the Arabic WordNet (AWN) ontology are proposed. The first feature set is the List of Pertinent Synsets (LoPS), which is a list of synsets that have a specific relation with the original terms. The second feature set is the List of Pertinent Words (LoPW), which is a list of words that have a specific relation with the original terms. Fifteen different relations (defined in the AWN ontology) are used with both proposed feature sets. A Naïve Bayes classifier is used to perform the classification. The experimental results, obtained on the BBC Arabic dataset, show that using the LoPS feature set improves the accuracy of Arabic text classification compared with the well-known Bag-of-Word feature and the recent Bag-of-Concept (synset) feature. It was also found that LoPW (especially with the related-to relation) improves classification accuracy compared with LoPS, Bag-of-Word and Bag-of-Concept.
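The LoPW idea can be illustrated with a minimal sketch: each term is replaced by the words linked to it through one chosen relation, and the resulting word lists are fed to a multinomial Naïve Bayes classifier. Everything in the code is an assumption made for illustration and not the authors' implementation: `TOY_RELATIONS` is a hypothetical hand-written table standing in for real AWN lookups, the English tokens stand in for Arabic terms, the function name `lopw` and the class `NaiveBayes` are invented here, and keeping a term unchanged when it has no related words is a choice of this sketch that the abstract does not specify.

```python
import math
from collections import Counter

# Hypothetical toy table standing in for AWN lookups (not real AWN data).
# Keys are (term, relation); values are the words linked by that relation.
TOY_RELATIONS = {
    ("match", "related_to"): ["game", "team"],
    ("goal", "related_to"): ["score"],
    ("bank", "related_to"): ["money"],
    ("loan", "related_to"): ["money", "credit"],
}


def lopw(tokens, relation):
    """Map each token to the words linked to it by `relation`.

    Assumption of this sketch: a token with no entry is kept as-is.
    """
    features = []
    for token in tokens:
        features.extend(TOY_RELATIONS.get((token, relation), [token]))
    return features


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of every feature.
            score = math.log(n_docs / total_docs)
            for word in doc:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data: two one-document classes.
train_docs = [["match", "goal"], ["bank", "loan"]]
train_labels = ["sports", "finance"]
model = NaiveBayes().fit(
    [lopw(doc, "related_to") for doc in train_docs], train_labels
)
prediction = model.predict(lopw(["goal"], "related_to"))
```

A LoPS variant would follow the same pattern with synset identifiers in place of words, and switching the `relation` argument corresponds to trying another of the relations defined in AWN.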

Journal of Computer Science

Volume 11 No. 3, 2015, 498-509

DOI: https://doi.org/10.3844/jcssp.2015.498.509

Submitted On: 13 January 2015 Published On: 15 April 2015

How to Cite: Yousif, S. A., Samawi, V. W., Elkabani, I. & Zantout, R. (2015). Enhancement of Arabic Text Classification Using Semantic Relations of Arabic WordNet. Journal of Computer Science, 11(3), 498-509. https://doi.org/10.3844/jcssp.2015.498.509

Copyright: © 2015 Suhad A. Yousif, Venus W. Samawi, Islam Elkabani and Rached Zantout. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Keywords

Arabic Text Classification
Naïve Bayes
Arabic WordNet
Semantic Relations