Secured Disclosure of Sensitive Data in Data Mining Techniques
Kirubhakar Gurusamy and Venkatesh Chakrapani
DOI: 10.3844/jcssp.2012.2042.2052
Journal of Computer Science
Volume 8, Issue 12
Recent advances in data collection, data dissemination and related technologies have opened a new era of research in which existing data mining algorithms must be reconsidered with a view to securing sensitive data. People have become increasingly unwilling to share their data, and as a result individuals frequently either refuse to share it or provide incorrect values. Such problems in data collection can in turn undermine data mining, which relies on sufficient amounts of accurate data to produce meaningful results. Based on an analysis of the shortcomings of earlier technologies, this study proposes a new method for securing both numerical and categorical data. In this method, categorical data is converted into binary form and perturbation-based noise is introduced as the security mechanism, with the amount of noise set by the desired security level. Several noise addition methods were employed, and the generalized results were evaluated in terms of misclassification error and privacy level. The average misclassification error was below 50% at security levels of 75-90%, an improvement over earlier methods, which did not handle categorical data. The results show that the proposed method outperforms some existing methods, making it possible to secure sensitive data whether it is numerical or categorical.
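As a rough illustration of the idea described above, the following sketch converts a categorical attribute to binary form and perturbs both numerical and binary values according to a security level. It is not the authors' implementation: the function names, the choice of Gaussian noise scaled by the column's standard deviation, and the rule of flipping bits with probability `security_level / 2` are all assumptions made for this example, since the abstract does not specify the noise models used.

```python
import random


def one_hot(values):
    """Encode categorical values as binary indicator vectors (hypothetical helper)."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        encoded.append(row)
    return categories, encoded


def perturb_numeric(values, security_level, rng):
    """Add zero-mean Gaussian noise to a numerical column.

    Assumption: the noise standard deviation is security_level times the
    column's own standard deviation, with security_level in [0, 1].
    """
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [v + rng.gauss(0.0, security_level * std) for v in values]


def perturb_binary(rows, security_level, rng):
    """Flip each bit independently with probability security_level / 2.

    Assumption: at security_level = 1 every bit is flipped with
    probability 0.5, so the released bits carry no information.
    """
    p = security_level / 2.0
    return [[1 - b if rng.random() < p else b for b in row] for row in rows]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    ages = [23.0, 35.0, 41.0, 52.0]
    colours = ["red", "blue", "red", "green"]

    categories, bits = one_hot(colours)
    print(categories)
    print(perturb_numeric(ages, 0.8, rng))
    print(perturb_binary(bits, 0.8, rng))
```

A higher security level distorts the released values more, which is the trade-off the study measures through misclassification error and privacy level.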
© 2012 Kirubhakar Gurusamy and Venkatesh Chakrapani. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.