Journal of Computer Science

Arabic Named Entity Recognition Using Artificial Neural Network

Naji F. Mohammed and Nazlia Omar

DOI: 10.3844/jcssp.2012.1285.1293

Volume 8, Issue 8

Pages 1285-1293

Abstract

Problem statement: Named Entity Recognition (NER) is the task of identifying proper names, as well as temporal and numeric expressions, in open-domain text. NER can improve the performance of Natural Language Processing (NLP) applications such as Information Extraction (IE), Information Retrieval (IR) and Question Answering (QA). This study addresses Named Entity Recognition for Arabic (NERA). It is motivated by the lack of resources for Arabic named entities and by the aim of improving on the accuracy reached by previous NERA systems. Approach: The system is based on a neural network, a machine learning approach that automatically learns to recognize patterns in the available data and can then be applied to classify new information within large databases. Neural networks have been applied successfully in many areas of artificial intelligence. The use of a neural network to classify named entities in Arabic text is proposed. The system involves three stages: the first is pre-processing, which cleans the collected data; the second converts Arabic letters to the Roman alphabet; and the final stage applies a neural network to classify the collected data. Results: The accuracy of the system is 92%. The system was compared with a decision tree using the same data, and the neural network approach performed better than the decision tree. Conclusion: These results show that the technique is capable of recognizing named entities in Arabic texts.
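
The first two stages described in the abstract (cleaning, then converting Arabic letters to the Roman alphabet) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which cleaning steps or which romanization scheme were used, so the sketch assumes that cleaning means removing diacritics and tatweel, and it assumes a Buckwalter-style letter mapping. The function names `preprocess`, `romanize` and `pipeline` are hypothetical. The third stage, the neural network classifier, is left out because the abstract gives no architecture or feature details.

```python
import re

# Assumption: "cleaning" means stripping Arabic diacritics (U+064B-U+0652)
# and the tatweel elongation mark (U+0640). The abstract does not specify this.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")

# Assumption: a Buckwalter-style mapping; the abstract names no scheme.
# Arabic letters U+0621-U+063A and U+0641-U+064A, in code point order.
_LETTERS = [chr(c) for c in range(0x0621, 0x063B)] + \
           [chr(c) for c in range(0x0641, 0x064B)]
_ROMAN = "'|>&<}AbptvjHxd*rzs$SDTZEg" + "fqklmnhwYy"
BUCKWALTER = dict(zip(_LETTERS, _ROMAN))


def preprocess(text):
    """Stage 1 (hypothetical): remove diacritics/tatweel, split on whitespace."""
    return DIACRITICS.sub("", text).split()


def romanize(token):
    """Stage 2 (hypothetical): map each Arabic letter to a Roman character.

    Characters outside the mapping (digits, Latin letters) pass through
    unchanged, so numeric expressions are preserved.
    """
    return "".join(BUCKWALTER.get(ch, ch) for ch in token)


def pipeline(text):
    """Run stages 1 and 2; the output would feed the classifier in stage 3."""
    return [romanize(tok) for tok in preprocess(text)]


# "Muhammad" written with diacritics, followed by "Cairo" and a year.
sample = ("\u0645\u064F\u062D\u064E\u0645\u0651\u064E\u062F "
          "\u0627\u0644\u0642\u0627\u0647\u0631\u0629 2012")
print(pipeline(sample))  # ['mHmd', 'AlqAhrp', '2012']
```

A one-to-one letter mapping of this kind keeps the romanized form reversible, which is one plausible reason to romanize before classification; whether the original system relied on that property is not stated.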

Copyright

© 2012 Naji F. Mohammed and Nazlia Omar. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.