Journal of Computer Science

Document Clustering Based on Firefly Algorithm

Athraa Jasim Mohammed, Yuhanis Yusof and Husniza Husni

DOI : 10.3844/jcssp.2015.453.465

Volume 11, Issue 3

Pages 453-465


Document clustering is widely used in Information Retrieval; however, existing clustering techniques suffer from the local optima problem when determining the number of clusters, k. Various efforts have been made to address this drawback, including the use of swarm-based algorithms such as Particle Swarm Optimization (PSO) and Ant Colony Optimization. This study explores the adaptation of another swarm algorithm, the Firefly Algorithm (FA), to text clustering. We present two variants of FA: the Weight-based Firefly Algorithm (WFA) and the Weight-based Firefly Algorithm II (WFAII). The two differ in that WFAII imposes a more restrictive condition when determining the members of a cluster. The proposed FA methods are evaluated on the 20Newsgroups dataset. The clustering quality of the two FA variants is reported and compared against that of PSO, K-means and a hybrid of FA and K-means (FA-Kmeans). The results show that WFAII outperformed WFA, PSO, K-means and FA-Kmeans. This indicates that better clustering can be obtained when the exploitation of a search solution is improved.


© 2015 Athraa Jasim Mohammed, Yuhanis Yusof and Husniza Husni. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.