Improved Univariate Microaggregation for Integer Values

نویسنده

چکیده مقاله:

Privacy issues during data publishing is an increasing concern of involved entities. The problem is addressed in the field of statistical disclosure control with the aim of producing protected datasets that are also useful for interested end users such as government agencies and research communities. The problem of producing useful protected datasets is addressed in multiple computational privacy models such as $k$-anonymity in which data is clustered into groups of at least $k$ members. Microaggregation is a mechanism to realize $k$-anonymity. The objective is to assign records of a dataset to clusters and replace the original values with their associated cluster centers which are the average of assigned values to minimize information loss in terms of the sum of within group squared errors ($SSE$). While the problem is shown to be NP-hard in general, there is an optimal polynomial-time algorithm for univariate datasets. This paper shows that the assignment of the univariate microaggregation algorithm cannot produce optimal partitions for integer observations where the computed centroids have to be integer values. In other words, the integrality constraint on published quantities has to be addressed within the algorithm steps and the optimal partition cannot be attained using only the results of the general solution. Then, an effective method that considers the constraint is proposed and analyzed which can handle very large numerical volumes. Experimental evaluations confirm that the developed algorithm not only produces more useful datasets but also is more efficient in comparison with the general optimal univariate algorithm.

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Polynomial Algorithm for Optimal Univariate Microaggregation

Microaggregation is a technique used by statistical agencies to limit disclosure of sensitive microdata. Noting that no polynomial algorithms are known to microaggregate optimally, Domingo-Ferrer and Mateo-Sanz have presented heuristic microaggregation methods. This paper is the first to present an efficient polynomial algorithm for optimal univariate microaggregation. Optimal partitions are sh...

متن کامل

Improving the Utility of Differential Privacy via Univariate Microaggregation

Differential privacy is a privacy model for anonymization that offers more robust privacy guarantees than previous models, such as k-anonymity and its extensions. However, it is often disregarded that the utility of differentially private outputs is quite limited, either because of the amount of noise that needs to be added to obtain them or because utility is only preserved for a restricted ty...

متن کامل

Improved Techniques for Factoring Univariate Polynomials

The paper describes improved techniques for factoring univariate polynomials over the integers. The authors modify the usual linear method for lifting modular polynomial factorizations so that efficient early factor detection can be performed. The new lifting method is universally faster than the classical quadratic method, and is faster than a linear method due to Wang, provided we lift suffic...

متن کامل

Improved Algorithms for Univariate Discretization of Continuous Features

In discretization of a continuous variable its numerical value range is divided into a few intervals that are used in classification. For example, Näıve Bayes can benefit from this processing. A commonlyused supervised discretization method is Fayyad and Irani’s recursive entropy-based splitting of a value range. The technique uses mdl as a model selection criterion to decide whether to accept ...

متن کامل

Improved algorithms for solving bivariate systems via Rational Univariate Representations

Given two coprime polynomials P and Q in Z[x, y] of degree bounded by d and bitsize bounded by τ , we address the problem of solving the system {P,Q}. We are interested in certified numerical approximations or, more precisely, isolating boxes of the solutions. We are also interested in computing, as intermediate symbolic objects, rational parameterizations of the solutions, and in particular Ra...

متن کامل

A novel local search method for microaggregation

In this paper, we propose an effective microaggregation algorithm to produce a more useful protected data for publishing. Microaggregation is mapped to a clustering problem with known minimum and maximum group size constraints. In this scheme, the goal is to cluster n records into groups of at least k and at most 2k_1 records, such that the sum of the within-group squ...

متن کامل

منابع من

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}


عنوان ژورنال

دوره 12  شماره 1

صفحات  35- 43

تاریخ انتشار 2020-01-01

با دنبال کردن یک ژورنال هنگامی که شماره جدید این ژورنال منتشر می شود به شما از طریق ایمیل اطلاع داده می شود.

میزبانی شده توسط پلتفرم ابری doprax.com

copyright © 2015-2023