When Random Sampling Preserves Privacy

Authors

  • Kamalika Chaudhuri
  • Nina Mishra
Abstract

Many organizations, such as the U.S. Census, publicly release samples of the data that they collect about private citizens. These datasets are first anonymized using various techniques, and then a small sample is released so as to enable “do-it-yourself” calculations. This paper investigates the privacy of the second step of this process: sampling. We observe that rare values – values that occur with low frequency in the table – can be problematic from a privacy perspective. To our knowledge, this is the first work that quantitatively examines the relationship between the number of rare values in a table and the privacy in a released random sample. If we require ε-privacy (where the larger ε is, the worse the privacy guarantee) with probability at least 1−δ, we say that a value is rare if it occurs in at most Õ(1/ε) rows of the table (ignoring log factors). If there are no rare values, then we establish a direct connection between the sample size that is safe to release and privacy. Specifically, if we select each row of the table with probability at most ε, then the sample is O(ε)-private with high probability. In the case that there are t rare values, the sample is Õ(εδ/t)-private with probability at least 1−δ.
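The two ingredients of the abstract, independent row sampling and the notion of a rare value, can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy table, and the use of a single value per row are assumptions made for illustration, and the rarity threshold is taken as exactly 1/eps (with eps standing for the privacy parameter), dropping the log factors and constants that the paper's bound hides.

```python
import random
from collections import Counter


def bernoulli_sample(rows, p, seed=None):
    """Keep each row independently with probability p."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < p]


def rare_values(rows, eps):
    """Return values that occur in at most 1/eps rows.

    Assumption: log factors and constants from the paper's bound are
    dropped, so this threshold is only illustrative.
    """
    threshold = 1.0 / eps
    counts = Counter(rows)
    return sorted(v for v, c in counts.items() if c <= threshold)


# Hypothetical table with one categorical value per row.
table = ["A"] * 500 + ["B"] * 480 + ["C"] * 15 + ["D"] * 5
eps = 0.05  # hypothetical privacy parameter; threshold is 20 rows

rare = rare_values(table, eps)          # ["C", "D"]
sample = bernoulli_sample(table, eps, seed=0)
print("rare values:", rare)
print("sample size:", len(sample))
```

In this toy table the values C and D fall under the threshold, so they would count as rare in the sense described above; how that affects the privacy bound depends on constants the sketch does not model.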

Similar resources

Randomness Efficient Fast-Johnson-Lindenstrauss Transform with Applications in Differential Privacy and Compressed Sensing

This paper resolves an open problem raised by Blocki et al. (FOCS 2012): whether other variants of the Johnson-Lindenstrauss transform preserve differential privacy. We prove that a general class of random projection matrices that satisfies the Johnson-Lindenstrauss lemma also preserves differential privacy. This class of random projection matrices requires only n Gaussian samples...

Circulant Matrices and Differential Privacy

This paper resolves an open problem raised by Blocki et al. (FOCS 2012): whether other variants of the Johnson-Lindenstrauss transform preserve differential privacy. We prove that a general class of random projection matrices that satisfies the Johnson-Lindenstrauss lemma also preserves differential privacy. This class of random projection matrices requires only n Gaussian samples...

Provably Private Data Anonymization: Or, k-Anonymity Meets Differential Privacy

Privacy-preserving microdata publishing currently lacks a solid theoretical foundation. Most existing techniques are developed to satisfy syntactic privacy notions such as k-anonymity, which fails to provide strong privacy guarantees. The recently proposed notion of differential privacy has been widely accepted as a sound privacy foundation for statistical query answering. However, no general p...

Protecting location privacy and query privacy: a combined clustering approach

In this paper, a combined clustering algorithm, namely enhanced clustering cloak (ECC), is proposed for protecting location privacy and query privacy. An iterative K-means clustering method is developed to group the user requests into clusters for providing location safety. Meanwhile, a hierarchical clustering method is used to preserve query privacy when creating clusters. ECC provides u...

On Random Additive Perturbation for Privacy Preserving Data Mining

Title of Thesis: On Random Additive Perturbation for Privacy Preserving Data Mining. Author: Souptik Datta, Master of Science, 2004. Thesis directed by: Dr. Hillol Kargupta, Associate Professor, Department of Computer Science and Electrical Engineering. Privacy is becoming an increasingly important issue in many data mining applications. This has triggered the development of many privacy-preserving...


Publication date: 2006