An Efficient Sliced Data Algorithm Design for Data Protection


Today, most enterprises actively collect and store data in large databases, and privacy has become a key issue for progress in data mining. Privacy-preserving data mining has become increasingly popular because it allows privacy-sensitive data to be shared for analysis while safeguarding sensitive information from unsanctioned disclosure. Privacy has also become an important issue in data publishing in recent years because of the increasing ability to store personal data about users. Privacy-preserving data publishing (PPDP) provides methods and tools for publishing useful information while preserving data privacy. A number of techniques, such as bucketization and generalization, have been proposed to perform privacy-preserving data mining. Recent work has shown that generalization does not work well for high-dimensional data. Bucketization cannot prevent membership disclosure and does not apply to data that lack a clear separation between quasi-identifying attributes and sensitive attributes. A new technique known as slicing is introduced, which partitions the data both horizontally and vertically. Slicing preserves better data utility than generalization, can be used for membership disclosure protection, and can handle high-dimensional data. Slicing can also be used for attribute disclosure protection, and an efficient algorithm is developed for computing sliced data that obeys the l-diversity requirement. Slicing is more effective than bucketization in workloads involving the sensitive attribute.
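The horizontal and vertical partitioning described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the function name `slice_table`, the fixed-size bucketing, the attribute names, and the sample rows are all assumptions made for illustration. Attributes are grouped into columns (vertical partition), tuples are grouped into buckets (horizontal partition), and the values of each column are permuted independently within a bucket to break the linkage between columns.

```python
import random


def slice_table(rows, columns, bucket_size, seed=0):
    """Hypothetical helper: return a list of sliced buckets.

    rows        -- list of dicts, one per tuple
    columns     -- list of attribute-name tuples (the vertical partition)
    bucket_size -- number of tuples per bucket (the horizontal partition)
    """
    rng = random.Random(seed)
    buckets = []
    for start in range(0, len(rows), bucket_size):
        bucket = rows[start:start + bucket_size]
        sliced_bucket = []
        for attrs in columns:
            # Project the bucket onto the attributes of this column.
            values = [tuple(row[a] for a in attrs) for row in bucket]
            # Permute within the bucket so columns can no longer be linked.
            rng.shuffle(values)
            sliced_bucket.append(values)
        buckets.append(sliced_bucket)
    return buckets


# Illustrative data only; not taken from the paper.
rows = [
    {"Age": 22, "Sex": "M", "Zipcode": "47906", "Disease": "dyspepsia"},
    {"Age": 22, "Sex": "F", "Zipcode": "47906", "Disease": "flu"},
    {"Age": 52, "Sex": "F", "Zipcode": "47905", "Disease": "bronchitis"},
    {"Age": 54, "Sex": "M", "Zipcode": "47302", "Disease": "gastritis"},
]
columns = [("Age", "Sex"), ("Zipcode", "Disease")]
sliced = slice_table(rows, columns, bucket_size=2)
```

Each bucket in `sliced` holds one list per column. Within a bucket, every column keeps the same multiset of values as the original tuples, but the order no longer reveals which values belonged to the same tuple.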

Published In : IJCSN Journal Volume 3, Issue 4

Date of Publication : 01 August 2014

Pages : 187 - 190



G. Hima Bindhu : M.Tech, CSE, LBRCE, Mylavaram, India

Dr. S. Sai Satyanarayana Reddy : Professor, CSE, LBRCE, Mylavaram, India

Data publishing

Generalization

Bucketization

Slicing

The slicing strategy overcomes the limitations of the generalization and bucketization methods. It preserves better utility while protecting against privacy threats. In basic slicing, each attribute appears in exactly one column; an extension, overlapping slicing, duplicates an attribute in more than one column. The proposed tuple grouping algorithm is an optimized l-diversity check algorithm that obtains more effective tuple grouping and provides secure data. Another advantage of slicing is that it can handle high-dimensional data. Privacy preservation remains a major issue for future work: as the number of large datasets increases, security for such data must be available. Encryption, decryption, and compression could therefore be applied to such databases in future work.
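To make the l-diversity check used during tuple grouping concrete, here is a minimal sketch. It assumes the simplest variant, distinct l-diversity (each bucket contains at least l distinct sensitive values), which is a simplification and not necessarily the check the proposed algorithm performs. The function names and the sample buckets are hypothetical.

```python
def is_l_diverse(bucket, sensitive_attr, l):
    """Assumed check: the bucket has at least l distinct sensitive values."""
    distinct_values = {row[sensitive_attr] for row in bucket}
    return len(distinct_values) >= l


def all_buckets_l_diverse(buckets, sensitive_attr, l):
    """A grouping is acceptable only if every bucket passes the check."""
    return all(is_l_diverse(b, sensitive_attr, l) for b in buckets)


# Illustrative buckets only; not taken from the paper.
buckets = [
    [{"Disease": "flu"}, {"Disease": "flu"}, {"Disease": "gastritis"}],
    [{"Disease": "flu"}, {"Disease": "flu"}, {"Disease": "flu"}],
]
first_ok = is_l_diverse(buckets[0], "Disease", 2)
second_ok = is_l_diverse(buckets[1], "Disease", 2)
grouping_ok = all_buckets_l_diverse(buckets, "Disease", 2)
```

Here the first bucket passes for l = 2, while the second bucket contains a single sensitive value and fails, so a tuple grouping algorithm would need to regroup its tuples before publishing.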
