A modification of the Lloyd algorithm for k-anonymous quantization
نویسندگان
چکیده
We address the problem of designing quantizers that cluster data while satisfying a k-anonymity requirement. A general data compression perspective is adopted, which considers both discrete and continuous probability distributions, and corresponding constraints on both cell sizes and quantizer index probabilities. Potential applications of this problem extend well beyond the important case of microdata anonymization, to include also optimized task allocation under workload constraints. Our contribution is twofold. First and most importantly, we present a theoretical analysis showing the optimality conditions which probability-constrained quantizers must satisfy, thereby theoretically characterizing optimal k-anonymous aggregation as a special case. As a second contribution, inspired by our theoretical analysis, we propose an alternating optimization algorithm for the design of this type of quantizers. Our algorithm is conceptually motivated by the popular Lloyd–Max algorithm for quantization design, originally intended for data compression, also known as the k-means method. Experimental results for synthetic and real data, with mean squared error as a distortion measure, confirm that our method outperforms MDAV, a popular fixed-size microaggregation algorithm for statistical disclosure control. This performance improvement is in terms of data utility, for the exact same k-anonymity constraint, but does come at the expense of higher computational sophistication. 2012 Elsevier Inc. All rights reserved.
منابع مشابه
An algorithm for k-anonymous microaggregation and clustering inspired by the design of distortion-optimized quantizers
Article history: Received 27 April 2010 Received in revised form 21 June 2011 Accepted 21 June 2011 Available online 2 July 2011 We present a multidisciplinary solution to the problems of anonymous microaggregation and clustering, illustrated with two applications, namely privacy protection in databases, and private retrieval of location-based information. Our solution is perturbative, is based...
متن کاملDetection of perturbed quantization (PQ) steganography based on empirical matrix
Perturbed Quantization (PQ) steganography scheme is almost undetectable with the current steganalysis methods. We present a new steganalysis method for detection of this data hiding algorithm. We show that the PQ method distorts the dependencies of DCT coefficient values; especially changes much lower than significant bit planes. For steganalysis of PQ, we propose features extraction from the e...
متن کاملNGTSOM: A Novel Data Clustering Algorithm Based on Game Theoretic and Self- Organizing Map
Identifying clusters is an important aspect of data analysis. This paper proposes a noveldata clustering algorithm to increase the clustering accuracy. A novel game theoretic self-organizingmap (NGTSOM ) and neural gas (NG) are used in combination with Competitive Hebbian Learning(CHL) to improve the quality of the map and provide a better vector quantization (VQ) for clusteringdata. Different ...
متن کاملAn efficient algorithm for finding the semi-obnoxious $(k,l)$-core of a tree
In this paper we study finding the $(k,l)$-core problem on a tree which the vertices have positive or negative weights. Let $T=(V,E)$ be a tree. The $(k,l)$-core of $T$ is a subtree with at most $k$ leaves and with a diameter of at most $l$ which the sum of the weighted distances from all vertices to this subtree is minimized. We show that, when the sum of the weights of vertices is negative, t...
متن کاملAn Effective Method for Initialization of Lloyd-Max's Algorithm of Optimal Scalar Quantization for Laplacian Source
In this paper an exact and complete analysis of the Lloyd–Max’s algorithm and its initialization is carried out. An effective method for initialization of Lloyd–Max’s algorithm of optimal scalar quantization for Laplacian source is proposed. The proposed method is very simple method of making an intelligent guess of the starting points for the iterative Lloyd–Max’s algorithm. Namely, the initia...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Inf. Sci.
دوره 222 شماره
صفحات -
تاریخ انتشار 2013