A Polynomial Algorithm for Optimal Microaggregation
Abstract
Microaggregation is a technique used by statistical agencies to limit disclosure of sensitive microdata. Noting that no polynomial-time algorithms are known to microaggregate optimally, Domingo-Ferrer and Mateo-Sanz have presented heuristic methods, based on hierarchical clustering and genetic algorithms, that identify suboptimal solutions. We present an efficient polynomial-time algorithm that solves the univariate microaggregation problem optimally. Optimal partitions are shown to correspond to shortest paths in a network.
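The shortest-path idea can be illustrated with a small sketch. The abstract states only that optimal partitions correspond to shortest paths in a network, so the construction below is an assumption, not necessarily the authors' exact formulation: values are sorted, nodes 0..n mark cut points, an arc (i, j) stands for one group holding the sorted values i+1..j, arcs are allowed only for group sizes from k to 2k-1, and each arc is weighted by the within-group sum of squared errors. The function name `microaggregate` and the helper `sse` are hypothetical.

```python
from typing import List, Tuple


def microaggregate(values: List[float], k: int) -> Tuple[float, List[List[float]]]:
    """Sketch: split sorted values into groups of size k..2k-1 with minimum total SSE.

    Assumed model: nodes 0..n, arc (i, j) = group xs[i:j], weight = SSE of that group.
    The shortest path from node 0 to node n then gives the partition.
    """
    xs = sorted(values)
    n = len(xs)
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and at least k values")

    # Prefix sums of x and x^2 give the SSE of any sorted slice in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for idx, x in enumerate(xs):
        s1[idx + 1] = s1[idx] + x
        s2[idx + 1] = s2[idx] + x * x

    def sse(i: int, j: int) -> float:
        total = s1[j] - s1[i]
        # Clamp at zero to absorb floating-point round-off.
        return max(0.0, (s2[j] - s2[i]) - total * total / (j - i))

    # The network is acyclic (arcs only go forward), so one pass in node
    # order computes shortest distances; no priority queue is needed.
    inf = float("inf")
    dist = [inf] * (n + 1)
    pred = [-1] * (n + 1)
    dist[0] = 0.0
    for j in range(k, n + 1):
        for i in range(max(0, j - 2 * k + 1), j - k + 1):
            if dist[i] == inf:
                continue  # node i cannot be reached with valid group sizes
            cand = dist[i] + sse(i, j)
            if cand < dist[j]:
                dist[j] = cand
                pred[j] = i

    # Walk predecessors back from node n to recover the groups.
    groups: List[List[float]] = []
    j = n
    while j > 0:
        i = pred[j]
        groups.append(xs[i:j])
        j = i
    groups.reverse()
    return dist[n], groups


cost, groups = microaggregate([20, 1, 22, 3, 2, 21, 4], k=3)
print(cost, groups)
```

Under these assumptions each node has at most k incoming arcs, so the pass costs O(nk) after an O(n log n) sort, which is polynomial in the input size.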
Similar resources
Improved Univariate Microaggregation for Integer Values
Privacy during data publishing is an increasing concern of the entities involved. The problem is addressed in the field of statistical disclosure control with the aim of producing protected datasets that are also useful for interested end users such as government agencies and research communities. The problem of producing useful protected datasets is addressed in multiple computational priva...
Repeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well-researched constrained clustering algorithms is called microaggregation. In a microaggreg...
A Polynomial Algorithm for Optimal Univariate Microaggregation
Microaggregation is a technique used by statistical agencies to limit disclosure of sensitive microdata. Noting that no polynomial algorithms are known to microaggregate optimally, Domingo-Ferrer and Mateo-Sanz have presented heuristic microaggregation methods. This paper is the first to present an efficient polynomial algorithm for optimal univariate microaggregation. Optimal partitions are sh...
A novel local search method for microaggregation
In this paper, we propose an effective microaggregation algorithm to produce more useful protected data for publishing. Microaggregation is mapped to a clustering problem with known minimum and maximum group size constraints. In this scheme, the goal is to cluster n records into groups of at least k and at most 2k-1 records, such that the sum of the within-group squ...
A polynomial-time approximation to optimal multivariate microaggregation
Microaggregation is a family of methods for statistical disclosure control (SDC) of microdata (records on individuals and/or companies), that is, for masking microdata so that they can be released without disclosing private information on the underlying individuals. Microaggregation techniques are currently being used by many statistical agencies. The principle of microaggregation is to group o...