Mining Frequent Patterns Through Microaggregation in Differential Privacy

Authors

  • Z. NI
  • Q. M. LI
  • X. Q. LIU
  • T. LI
  • R. HOU
Abstract

Frequent pattern mining has been widely employed to analyze transaction datasets, but the question of how sensitive information contained in a dataset should be protected remains relatively unanswered. The differential privacy model provides a robust privacy guarantee, but the k-anonymity model provides better dataset utility. In this paper, a synergetic approach is proposed to simultaneously protect privacy and enhance data utility when mining top-k frequent patterns. First, microaggregated data is released, which achieves k-anonymity regardless of the query types the user may be using. Second, top-k frequent patterns are selected based on the microaggregated data using the exponential mechanism. Finally, the true support of each top-k frequent pattern is perturbed by adding Laplace noise.
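
The second and third steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the transactions passed in have already been microaggregated (that step is not shown), that one transaction changes any support count by at most 1, and that the privacy budget epsilon is split evenly between selection and perturbation. All function names and the max_len parameter are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import combinations


def count_supports(transactions, max_len=2):
    """Count the support of every itemset of size 1..max_len."""
    supports = Counter()
    for t in transactions:
        items = sorted(set(t))
        for size in range(1, max_len + 1):
            for pattern in combinations(items, size):
                supports[pattern] += 1
    return supports


def exponential_top_k(supports, k, epsilon, sensitivity=1.0, rng=random):
    """Draw k patterns without replacement via the exponential mechanism.

    Each draw spends epsilon / k; the utility of a pattern is its support.
    """
    eps_per_draw = epsilon / k
    candidates = dict(supports)
    chosen = []
    for _ in range(min(k, len(candidates))):
        patterns = list(candidates)
        top = max(candidates.values())  # shift by the max for numerical stability
        weights = [
            math.exp(eps_per_draw * (candidates[p] - top) / (2 * sensitivity))
            for p in patterns
        ]
        pick = rng.choices(patterns, weights=weights, k=1)[0]
        chosen.append(pick)
        del candidates[pick]
    return chosen


def laplace_noise(scale, rng=random):
    """The difference of two i.i.d. exponentials is Laplace(0, scale)."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_top_k(transactions, k, epsilon, max_len=2, rng=random):
    """Select k patterns, then perturb their true supports with Laplace noise."""
    supports = count_supports(transactions, max_len)
    # Assumption: half of the budget for selection, half for perturbation.
    selected = exponential_top_k(supports, k, epsilon / 2, rng=rng)
    # Releasing len(selected) counts of sensitivity 1 under budget epsilon / 2.
    scale = len(selected) / (epsilon / 2)
    return {p: supports[p] + laplace_noise(scale, rng) for p in selected}
```

Calling private_top_k(transactions, k=3, epsilon=1.0) returns a dictionary that maps each selected pattern (a tuple of items) to its noisy support. With a small epsilon the noise scale is large, so the selected patterns and their counts can differ noticeably from the true top-k.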

Related resources

Mining Frequent Patterns with Differential Privacy

The mining of frequent patterns is a fundamental component in many data mining tasks. A considerable amount of research on this problem has led to a wide range of efficient and scalable algorithms for mining frequent patterns. However, releasing these patterns raises concerns about the privacy of the users participating in the data. Indeed, the information from the patterns can be linked with a...

Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...

A Two-Phase Algorithm for Differentially Private Frequent Subgraph Mining

Mining frequent subgraphs from a collection of input graphs is an important task for exploratory data analysis on graph data. However, if the input graphs contain sensitive information, releasing discovered frequent subgraphs may pose considerable threats to individual privacy. In this paper, we study the problem of frequent subgraph mining (FSM) under the rigorous differential privacy model. W...

Constrained Microaggregation: Adding Constraints for Data Editing

Privacy-preserving data mining and statistical disclosure control have introduced several methods for data perturbation that can be used to ensure the privacy of data respondents. Such methods, for example rank swapping and microaggregation, perturb the data by introducing some kind of noise. Nevertheless, it is usual that data are edited with care after collection to remove inconsistencies, and such...

A Study of Differentially Private Frequent Itemset Mining

Frequent sets play an important role in many data mining tasks that search for interesting patterns in databases, such as association rules, sequences, correlations, episodes, classifiers and clusters. Frequent Itemset Mining (FIM) is one of the most well-known techniques for extracting knowledge from a dataset. In this paper differential privacy aims to get means to increase the accuracy of queries fr...

Publication date: 2015