Distribution-Preserving k-Anonymity
نویسندگان
چکیده
Preserving the privacy of individuals by protecting their sensitive attributes is an important consideration during microdata release. However, it is equally important to preserve the quality or utility of the data for at least some targeted workloads. We propose a novel framework for privacy preservation based on the k-anonymity model that is ideally suited for workloads that require preserving the probability distribution of the quasi-identifier variables in the data. Our framework combines the principles of distribution-preserving quantization and k-member clustering, and we specialize it to two variants that respectively use intra-cluster and Gaussian dithering of cluster centers to achieve distribution preservation. We perform theoretical analysis of the proposed schemes in terms of distribution preservation, and describe their utility in workloads such as covariate shift and transfer learning where such a property is necessary. Using extensive experiments on real-world Medical Expenditure Panel Survey data, we demonstrate the merits of our algorithms over standard k-anonymization for a hallmark health care application where an insurance company wishes to understand the risk in entering a new market. Furthermore, by empirically quantifying the reidentification risk, we also show that the proposed approaches indeed maintain k-anonymity.
منابع مشابه
A Novel Anonymity Algorithm for Privacy Preserving in Publishing Multiple Sensitive Attributes
Publishing the data with multiple sensitive attributes brings us greater challenge than publishing the data with single sensitive attribute in the area of privacy preserving. In this study, we propose a novel privacy preserving model based on k-anonymity called (α, β, k)-anonymity for databases. (α, β, k)anonymity can be used to protect data with multiple sensitive attributes in data publishing...
متن کاملA Survey of Privacy Preserving Data Publishing using Generalization and Suppression
Nowadays, information sharing as an indispensable part appears in our vision, bringing about a mass of discussions about methods and techniques of privacy preserving data publishing which are regarded as strong guarantee to avoid information disclosure and protect individuals’ privacy. Recent work focuses on proposing different anonymity algorithms for varying data publishing scenarios to satis...
متن کاملEnhancing Informativeness in Data Publishing while Preserving Privacy using Coalitional Game Theory
k-Anonymity is one of the most popular conventional techniques for protecting the privacy of an individual. The shortcomings in the process of achieving k-Anonymity are presented and addressed by using Coalitional Game Theory (CGT) [1] and Concept Hierarchy Tree (CHT). The existing system considers information loss as a control parameter and provides anonymity level (k) as output. This paper pr...
متن کاملUtility - Preserving k - Anonymity
As technology advances and more and more person-specific data like health information becomes publicly available, much attention is being given to confidentiality and privacy protection. On one hand, increased availability of information can lead to advantageous knowledge discovery; on the other hand, this information belongs to individuals and their identities must not be disclosed without con...
متن کاملPrivacy-preserving data mining: A feature set partitioning approach
In privacy-preserving data mining (PPDM), a widely used method for achieving data mining goals while preserving privacy is based on k-anonymity. This method, which protects subject-specific sensitive data by anonymizing it before it is released for data mining, demands that every tuple in the released table should be indistinguishable from no fewer than k subjects. The most common approach for ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1711.01514 شماره
صفحات -
تاریخ انتشار 2017