Privacy Preservation through Data Generation
Authors
Abstract
Many databases will not or cannot be disclosed without strong guarantees that no sensitive information can be extracted. To address this concern, several data perturbation techniques have been proposed. However, it has been shown that either sensitive information can still be extracted from the perturbed data with little prior knowledge, or many patterns are lost. In this paper we show that generating new data is an inherently safer alternative. We present a data generator based on the models obtained by the MDL-based KRIMP [18] algorithm. These are accurate representations of the data distributions and can thus be used to generate data with the same characteristics as the original data. Experimental results show a very large pattern similarity between the generated and the original data, ensuring that viable conclusions can be drawn from the anonymised data. Furthermore, anonymity is guaranteed for suitable databases and the quality–privacy trade-off can be balanced explicitly.
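As a rough illustration of generating data from a pattern-based model, the sketch below draws synthetic rows from a small, hand-written code table. It is an assumption-laden toy, not the generator described in the paper: the code table contents, the attribute names, the `generate_row` helper, and the rule of sampling itemsets in proportion to their usage counts are all hypothetical choices made for this example.

```python
import random

# Hypothetical code table: (itemset, usage count) pairs.
# Each item is an (attribute, value) pair; usage counts are assumed positive.
CODE_TABLE = [
    (frozenset({("colour", "red"), ("shape", "round")}), 5),
    (frozenset({("colour", "red")}), 1),
    (frozenset({("colour", "blue")}), 3),
    (frozenset({("shape", "round")}), 1),
    (frozenset({("shape", "square")}), 2),
]
ATTRIBUTES = ["colour", "shape"]


def generate_row(code_table, attributes, rng):
    """Build one synthetic row by sampling non-conflicting itemsets.

    Itemsets are drawn with probability proportional to their usage
    count until every attribute has a value or no candidate remains.
    """
    row = {}
    candidates = list(code_table)
    while len(row) < len(attributes) and candidates:
        weights = [usage for _, usage in candidates]
        itemset, _ = rng.choices(candidates, weights=weights, k=1)[0]
        row.update(dict(itemset))
        # Drop every itemset that touches an attribute already filled in.
        candidates = [
            (s, u) for s, u in candidates
            if all(attr not in row for attr, _ in s)
        ]
    return row


rng = random.Random(0)  # fixed seed so repeated runs give the same rows
rows = [generate_row(CODE_TABLE, ATTRIBUTES, rng) for _ in range(5)]
for r in rows:
    print(r)
```

Because the toy code table contains a singleton itemset for every attribute value, each generated row ends up with exactly one value per attribute. No generated row is copied from an original record, which is the intuition behind treating generation as a safer alternative to perturbation.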
Similar resources
A Survey of Cryptographic and Non-cryptographic Techniques for Privacy Preservation
Cryptography is to become familiar with the requirement of large, complex, information-rich data sets for its privacy preservation. The privacy preserving data mining has been generated; to go through the concept of privacy in data mining is hard. Several algorithms and approaches are being generated theoretically, but practically it is hard. Privacy in data mining can be achieved through seve...
Privacy-preserving Clustering of Data Streams
Because most previous studies on privacy-preserving data mining placed specific importance on the security of massive amounts of data from a static database, privacy preservation often leads to a decline in the accuracy of mining results. Furthermore, following the rapid advancement of Internet and telecommunication technology, data types have transformed...
Analyzing the Privacy Preserving Using Big Data Techniques
Recently, big data has become a hot research topic. The rising amount of big data also increases the chance of violating the privacy of individuals. Since big data needs high computational power and large storage, distributed systems are used. As multiple parties are involved in these systems, the risk of privacy violation increases. There have been a number of privacy-preserving methods develop...
Privacy Preserving Data Mining Using Additive Perturbation on Relational Streaming Data
Data mining is concerned with extracting the required important data from the database and ignoring the rest. With the success of data mining, privacy preservation has also acquired great importance. The new concept of privacy-preserving data mining (PPDM) is concerned with preserving the privacy of individuals' sensitive data. In this paper, privacy of sensitive attribute data concerned with individual...
A review on Security in Distributed Information Sharing
In recent years, privacy-preserving data mining has emerged as a very active research area in data mining. Over the last few years this has naturally led to a growing interest in security or privacy issues in data mining. More precisely, it became clear that discovering knowledge through a combination of different databases raises important security issues. Privacy preserving data mining is on...
A Privacy Preservation Framework for Big Data (Using Differential Privacy and Overlapped Slicing)
We are in the midst of big data. The rate of data generation is increasing very rapidly. We need to understand and analyze this data as quickly as possible. A delay of a millisecond in understanding the data may cost not only money but also life. There are various processing and analytic mechanisms like Hadoop and MapReduce to process the data. But as big data comprises an enormous amount of ...