Journal of Computer Science

A HYBRID METHOD USING LEXICON-BASED APPROACH AND NAIVE BAYES CLASSIFIER FOR ARABIC OPINION QUESTION ANSWERING

Khalid Khalifa and Nazlia Omar

DOI : 10.3844/jcssp.2014.1961.1968

Journal of Computer Science

Volume 10, Issue 10

Pages 1961-1968

Abstract

Opinion Question Answering (Opinion QA) is the task of enabling users to explore others opinions toward a particular service of product in order to make decisions. Arabic Opinion QA is more challenging due to its complex morphology compared to other languages and has many varieties dialects. On the other hand, there are insignificant research efforts and resources available that focus on Opinion QA in Arabic. This study aims to address the difficulties of Arabic opinion QA by proposing a hybrid method of lexicon-based approach and classification using Naïve Bayes classifier. The proposed method contains pre-processing phases such as, transformation, normalization and tokenization and exploiting auxiliary information (thesaurus). The lexicon-based approach is executed by replacing some words with its synonyms using the domain dictionary. The classification task is performed by Naïve Bayes classifier to classify the opinions based on the positive or negative sentiment polarity. The proposed method has been evaluated using the common information retrieval metrics i.e., Precision, Recall and F-measure. For comparison, three classifiers have been applied which are Naïve Bayes (NB), Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The experimental results have demonstrated that NB outperforms SVM and KNN by achieving 91% accuracy.

Copyright

© 2014 Khalid Khalifa and Nazlia Omar. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.