Research Article Open Access

An Integrated Framework for Mixed Data Clustering Using Self Organizing Map

Hari Prasad Devaraj and M. Punithavalli

Abstract

Problem statement: Clustering plays an important role in mining large data sets and supports their analysis, which makes the development of better clustering techniques an important research topic. Several techniques exist for clustering data of a similar kind, but only a few exist for clustering mixed data items. This creates a need for a better clustering technique for mixed data. A good clustering increases the similarity of items within a cluster and reduces the similarity of items from different clusters. The existing techniques have several advantages, but they also have various disadvantages. Approach: To overcome those drawbacks, the Self-Organizing Map (SOM) and Extended Attribute-Oriented Induction (EAOI) can be used to cluster mixed-type data. This approach, however, takes more time for clustering. A modified SOM based on batch learning was therefore proposed. Results: The proposed technique was evaluated on the UCI Adult Data Set. It produced fewer clusters than the SOM, and it produced no outliers. Conclusion: The experimental results suggest that the proposed technique can be used to cluster mixed data items with better classification accuracy.
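For readers unfamiliar with batch learning in a SOM, the following is a minimal sketch of the textbook batch update on numeric data only. It is not the authors' modified SOM and does not cover EAOI or categorical attributes. The function name `batch_som`, the one-dimensional grid, the Gaussian neighbourhood, the linear radius schedule, and the toy data are all assumptions made for illustration.

```python
import math
import random


def batch_som(data, n_units=2, epochs=20, sigma0=1.5, seed=0):
    """Train a 1-D SOM with the standard batch rule (numeric data only).

    Illustrative sketch: grid shape, schedule and defaults are assumptions.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise the units from distinct samples so no two units start identical.
    weights = [list(x) for x in rng.sample(data, n_units)]

    def bmu(x, w):
        # Best Matching Unit: index of the unit closest to x (squared Euclidean).
        return min(
            range(len(w)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, w[j])),
        )

    for epoch in range(epochs):
        # Shrink the neighbourhood radius linearly, keeping it strictly positive.
        sigma = max(sigma0 * (1 - epoch / epochs), 0.1)
        # Batch step 1: find the BMU of every sample before changing any weight.
        bmus = [bmu(x, weights) for x in data]
        # Batch step 2: each unit becomes a neighbourhood-weighted mean of all samples.
        new_weights = []
        for j in range(n_units):
            num = [0.0] * dim
            den = 0.0
            for x, b in zip(data, bmus):
                h = math.exp(-((j - b) ** 2) / (2 * sigma ** 2))
                den += h
                for k in range(dim):
                    num[k] += h * x[k]
            new_weights.append([v / den for v in num] if den > 0 else weights[j])
        weights = new_weights
    return weights, [bmu(x, weights) for x in data]


# Toy usage: two well-separated groups of 2-D points (made-up data).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (10.0, 10.0), (10.1, 10.2), (10.2, 10.1)]
weights, labels = batch_som(points)
```

Unlike the online rule, which adjusts the weights after every sample, the batch rule assigns all samples to their BMUs first and then recomputes every unit in one pass, so the result does not depend on the order in which samples are presented.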

Journal of Computer Science
Volume 7 No. 11, 2011, 1639-1645

DOI: https://doi.org/10.3844/jcssp.2011.1639.1645

Submitted On: 4 March 2011 Published On: 18 August 2011

How to Cite: Devaraj, H. P. & Punithavalli, M. (2011). An Integrated Framework for Mixed Data Clustering Using Self Organizing Map. Journal of Computer Science, 7(11), 1639-1645. https://doi.org/10.3844/jcssp.2011.1639.1645

Keywords

  • Attribute-oriented induction
  • clustering technique
  • data mining
  • training pattern
  • self-organizing map
  • batch learning
  • Best Matching Unit (BMU)
  • numeric attributes
  • scientific data analysis