Boosting Arabic Named Entity Recognition with K-Fold Cross Validation on LSTM and Bi-LSTM Models
- 1 Department of Computer Science, College of Basic Education, University of Diyala, Iraq
- 2 Department of Computer Science, Faculty of Computer Science and Mathematics, University of Kufa, Iraq
Abstract
Named Entity Recognition (NER) is one of the most important Information Extraction (IE) tasks and is used to improve the performance of Natural Language Processing (NLP) tasks such as Relation Extraction (RE) and Question Answering (QA). Recently, researchers have tackled Arabic NER in a variety of ways. In this study, we assess the performance of two widely used models, LSTM and Bi-LSTM, on the NER task in the Arabic language and perform a comparative study between them. In contrast to the traditional data partitioning technique widely used during training, we employ k-fold cross-validation to improve the performance of each model. The experimental results reveal that the performance of all models improves when k-fold cross-validation is applied. Additionally, according to our experimental results, the Bi-LSTM model outperforms the LSTM model in terms of our evaluation metric. We achieve the best F1 score of 94.17% with CNN-Bi-LSTM-CRF. An ablation study on k-fold cross-validation demonstrates that the F1 score increased from 87.28% to 94.17%.
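To illustrate how k-fold cross-validation differs from a single fixed train/validation split, the following is a minimal, generic sketch in plain Python. It is not the authors' implementation: the abstract does not state the value of k, how folds were shuffled, or how per-fold results were combined, so the function names `k_fold_indices` and `cross_validate`, the `seed` parameter, and the `train_and_score` callback are all hypothetical and chosen only for illustration.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation.

    Hypothetical helper: every sample lands in exactly one validation fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for repeatable folds
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


def cross_validate(sentences, k, train_and_score):
    """Return the mean score over k folds.

    train_and_score(train, val) is an assumed user-supplied callable that
    trains a tagger (e.g. an LSTM or Bi-LSTM) on `train` and returns a
    score such as F1 on `val`.
    """
    scores = []
    for train_idx, val_idx in k_fold_indices(len(sentences), k):
        train = [sentences[i] for i in train_idx]
        val = [sentences[i] for i in val_idx]
        scores.append(train_and_score(train, val))
    return sum(scores) / len(scores)
```

In this sketch each sentence is used for validation exactly once and for training in the remaining k - 1 folds, which is the property that distinguishes k-fold cross-validation from a single hold-out partition.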
DOI: https://doi.org/10.3844/jcssp.2022.792.800
Copyright: © 2022 Hamid Sadeq Mahdi Alsultani and Ahmed H. Aliwy. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Arabic Named Entity Recognition
- LSTM
- BiLSTM
- K-Fold Cross Validation