A Survey of Data Anonymization Techniques for Privacy-Preserving Mining in Bigdata

Helen Wilfred Raj; Santhi Balachandran

doi:10.3844/jcssp.2020.194.201

Review Article Open Access

Helen Wilfred Raj¹ and Santhi Balachandran¹

¹ SASTRA Deemed University, India

Abstract

The big data era is seeing data grow along several dimensions, commonly summarized as the 4Vs (Volume, Velocity, Variety, Veracity). When inferring information from data, care must be taken not to reveal the identity of the data owner, since doing so breaches the owner's privacy rights. Information can leak at the point of data collection, in data storage, during distribution of the data to data users or miners, and finally through published results. Crossing each of these points with the 4Vs of big data (a list that is still growing) makes it a considerable challenge to extract the maximum possible information without compromising the privacy of the data owner. The original data should therefore be anonymized at one or more of these stages before they are handed over for mining. This work surveys the anonymization techniques used to transform the data so that the privacy of the data owner is not compromised. In addition, any sample drawn should resemble and represent the original dataset in as many dimensions as possible. The results of the various methodologies are analyzed and the observations are presented.
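
As a rough illustration of the kind of transformation the survey covers, the sketch below applies generalization (one of the listed keywords) to two quasi-identifiers before a dataset is released for mining. It is not taken from the article: the record fields, the helper names, the bucket width, and the use of the smallest group size (the k of k-anonymity) as a privacy check are all assumptions made for this example.

```python
from collections import Counter

def generalize_age(age, width=10):
    """Replace an exact age with a range such as '30-39' (hypothetical helper)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def generalize_zip(zip_code, keep=3):
    """Keep the first `keep` characters and mask the rest with '*' (hypothetical helper)."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)

def anonymize(records):
    """Generalize the quasi-identifiers; the sensitive value is left untouched."""
    return [
        {
            "age": generalize_age(r["age"]),
            "zip": generalize_zip(r["zip"]),
            "diagnosis": r["diagnosis"],
        }
        for r in records
    ]

def smallest_group(records, quasi_identifiers=("age", "zip")):
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

# Made-up records, for illustration only.
records = [
    {"age": 31, "zip": "61301", "diagnosis": "flu"},
    {"age": 37, "zip": "61305", "diagnosis": "asthma"},
    {"age": 42, "zip": "61402", "diagnosis": "flu"},
    {"age": 48, "zip": "61407", "diagnosis": "diabetes"},
]
released = anonymize(records)

print(smallest_group(records))   # every original record is unique on (age, zip)
print(smallest_group(released))  # after generalization, records share their group
```

In this toy dataset every original record is unique on age and zip code, while after generalization each record shares its quasi-identifier values with at least one other. The coarser values also carry less detail for mining, which is the privacy and utility trade-off the abstract describes.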

Journal of Computer Science

Volume 16 No. 2, 2020, 194-201

DOI: https://doi.org/10.3844/jcssp.2020.194.201

Submitted On: 15 July 2019

Published On: 31 December 2019

How to Cite: Raj, H. W. & Balachandran, S. (2020). A Survey of Data Anonymization Techniques for Privacy-Preserving Mining in Bigdata. Journal of Computer Science, 16(2), 194-201. https://doi.org/10.3844/jcssp.2020.194.201

Copyright: © 2020 Helen Wilfred Raj and Santhi Balachandran. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Keywords

Privacy-Preserving
Anonymization
Perturbation
Generalization
Dimensionality Reduction