American Journal of Applied Sciences

Advances in Document Clustering with Evolutionary-Based Algorithms

Sarmad Makki, Razali Yaakob, Norwati Mustapha and Hamidah Ibrahim

DOI: 10.3844/ajassp.2015.689.708

Volume 12, Issue 10

Pages 689-708

Abstract

Document clustering is the process of organizing an electronic corpus of documents into subgroups that share similar text features. A number of conventional algorithms have been applied to this task, and recent efforts seek to improve clustering performance by employing evolutionary algorithms, an emerging topic that has gained attention in recent years. The aim of this paper is to present an up-to-date and self-contained review fully devoted to document clustering via evolutionary algorithms. It first provides a comprehensive examination of the document clustering model, revealing its components and related concepts. It then presents and analyzes the principal research work on this topic. Finally, it compiles and classifies the various objective functions, the core of evolutionary algorithms, drawn from the related research papers. The paper ends by addressing some important issues and challenges that can be the subject of future work.

Copyright

© 2015 Sarmad Makki, Razali Yaakob, Norwati Mustapha and Hamidah Ibrahim. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.