An Extensive Survey of Privacy Preserving Data Mining Techniques


Data mining techniques are increasingly used by organizations to analyze customer or user data and to uncover non-obvious patterns and details. However, such mining can disclose proprietary or sensitive information and can be misused. To avoid this, we introduce the concept of privacy preserving data mining (PPDM). The fundamental notions of the existing privacy preserving data mining methods, their merits, and their shortcomings are presented. We discuss five techniques, namely anonymization based PPDM, perturbation based PPDM, randomized response based PPDM, condensation based PPDM, and cryptography based PPDM.
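To make one of the five techniques concrete, the following is a minimal sketch of randomized response in its classic single-question form, where each respondent reports the true answer with probability p and the opposite answer otherwise. The function names, the value p = 0.8, and the synthetic population with 30% "yes" answers are illustrative assumptions, not taken from the paper.

```python
import random

def randomize(true_answer: bool, p: float, rng: random.Random) -> bool:
    """Report the true answer with probability p, otherwise its negation."""
    return true_answer if rng.random() < p else not true_answer

def estimate_proportion(reports, p: float) -> float:
    """Estimate the true share of 'yes' answers from randomized reports.

    The expected observed share is p * pi + (1 - p) * (1 - pi),
    which is solved here for pi.
    """
    if p == 0.5:
        raise ValueError("p = 0.5 makes the reports carry no information")
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)

# Hypothetical population: 3000 of 10000 respondents hold the sensitive attribute.
rng = random.Random(0)
true_answers = [i < 3000 for i in range(10000)]
reports = [randomize(a, 0.8, rng) for a in true_answers]
estimate = estimate_proportion(reports, 0.8)
print(f"estimated share of 'yes': {estimate:.3f}")
```

No individual report reveals the respondent's true answer with certainty, yet the aggregate proportion can still be recovered approximately. Moving p closer to 0.5 strengthens individual privacy but increases the variance of the estimate.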

Published In : IJCSN Journal Volume 6, Issue 5

Date of Publication : October 2017

Pages : 547-550

Figures : 01

Tables : --

Bhargav Sundararajan : SRM University, Chennai, Tamil Nadu 603203, India

Deepthi Peri : SRM University, Chennai, Tamil Nadu 603203, India

Nita Radhakrishnan : SRM University, Chennai, Tamil Nadu 603203, India

Mehul Awasthi : SRM University, Chennai, Tamil Nadu 603203, India

privacy, data mining, anonymization, perturbation

The primary objective of PPDM is to develop algorithms that conceal sensitive data and preserve privacy, so that these data are not revealed to unauthorized parties or intruders. In data mining there is a trade-off between the utility and the privacy of data: improving one inevitably has a detrimental impact on the other. Many existing PPDM techniques are reviewed in the paper. It concludes that no single PPDM technique outperforms every other technique on all possible criteria, such as data utility, performance, complexity, and compatibility with data mining procedures. A particular algorithm may perform better than another on one criterion and worse on a different one. Research continues on ensuring that a person's sensitive data is not revealed while the utility of the data is preserved, so that the data remains useful for many purposes.
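The trade-off between privacy and utility can be illustrated with a small sketch of additive-noise perturbation. Zero-mean Gaussian noise, the synthetic "ages" data, and the two error measures below are assumptions chosen for illustration; the paper does not prescribe them.

```python
import random
import statistics

def perturb(values, sigma, rng):
    """Add independent zero-mean Gaussian noise with standard deviation sigma."""
    return [v + rng.gauss(0.0, sigma) for v in values]

def tradeoff(values, sigma, rng):
    """Return (per-record distortion, error of the aggregate mean) for one noise level."""
    noisy = perturb(values, sigma, rng)
    # Larger per-record distortion makes individual values harder to recover.
    record_error = statistics.mean(abs(a - b) for a, b in zip(values, noisy))
    # Error in the mean is one simple proxy for lost utility.
    mean_error = abs(statistics.mean(values) - statistics.mean(noisy))
    return record_error, mean_error

rng = random.Random(42)
ages = [rng.randint(18, 80) for _ in range(5000)]  # hypothetical sensitive attribute
results = {sigma: tradeoff(ages, sigma, rng) for sigma in (1.0, 10.0, 50.0)}
for sigma, (record_error, mean_error) in results.items():
    print(f"sigma={sigma:5.1f}  record distortion={record_error:6.2f}  mean error={mean_error:.3f}")
```

Increasing sigma distorts each record more, which hides individual values better, while aggregate statistics computed from the perturbed data tend to become less accurate.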
