AN IMPROVED ARABIC WORD'S ROOTS EXTRACTION METHOD USING N-GRAM TECHNIQUE

Nidal Yousef; Aymen Abu-Errub; Ashraf Odeh; Hayel Khafajeh

doi:10.3844/jcssp.2014.716.719

Research Article Open Access

Nidal Yousef¹, Aymen Abu-Errub², Ashraf Odeh¹ and Hayel Khafajeh³

¹ AL Isra University, Jordan
² Al-Ahliyya Amman University, Jordan
³ Zarqa University, Jordan

Abstract

The Arabic language is distinguished by its morphological richness. As a result, those working in Arabic language processing (e.g., information retrieval, document classification, text summarization) must deal with many words that appear different but in fact derive from the same root. One way to overcome this problem is to reduce words to their roots. This research presents a new algorithm that extracts the roots of Arabic words using the n-gram technique, without relying on morphological rules. This avoids the complexity that arises from the morphological richness of the language on the one hand and from the multiplicity of morphological rules on the other. The proposed algorithm uses a list containing more than 4,500 root words.
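The general idea of matching a word against a root list by n-gram overlap can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the abstract does not specify the n-gram size, the similarity measure, or the matching procedure, so the sketch assumes adjacent character bigrams and the Dice coefficient. The function names and the three-entry root list are hypothetical stand-ins for the list of more than 4,500 roots.

```python
from typing import Iterable, Optional, Set


def char_bigrams(word: str) -> Set[str]:
    """Return the set of adjacent character pairs in a word."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def dice(a: Set[str], b: Set[str]) -> float:
    """Dice coefficient between two bigram sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def extract_root(word: str, roots: Iterable[str]) -> Optional[str]:
    """Return the root whose bigrams overlap most with the word's bigrams.

    Returns None when no root shares any bigram with the word.
    No morphological rules are applied; only n-gram overlap is used.
    """
    word_grams = char_bigrams(word)
    best_root, best_score = None, 0.0
    for root in roots:
        score = dice(word_grams, char_bigrams(root))
        if score > best_score:  # on ties, the earlier root is kept
            best_root, best_score = root, score
    return best_root


# Hypothetical toy root list (assumption; the paper's list is far larger).
ROOTS = ["كتب", "درس", "علم"]

for w in ["مكتوب", "مدرسة", "معلوم"]:
    print(w, "->", extract_root(w, ROOTS))
```

Adjacent bigrams are a deliberately simple choice here. Because Arabic derivation often inserts letters between root consonants, a practical system might also need non-adjacent letter pairs or a different similarity measure.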

Journal of Computer Science

Volume 10 No. 4, 2014, 716-719

DOI: https://doi.org/10.3844/jcssp.2014.716.719

Submitted On: 11 August 2013
Published On: 26 December 2013

How to Cite: Yousef, N., Abu-Errub, A., Odeh, A. & Khafajeh, H. (2014). AN IMPROVED ARABIC WORD'S ROOTS EXTRACTION METHOD USING N-GRAM TECHNIQUE. Journal of Computer Science, 10(4), 716-719. https://doi.org/10.3844/jcssp.2014.716.719

Copyright: © 2014 Nidal Yousef, Aymen Abu-Errub, Ashraf Odeh and Hayel Khafajeh. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Keywords

Arabic Root Extraction
Natural Language Processing
N-Gram