Spherical microaggregation: Anonymizing sparse vector spaces

Authors

  • Daniel Abril
  • Guillermo Navarro-Arribas
  • Vicenç Torra
Abstract

Unstructured texts are a very popular data type, yet they remain largely unexplored in the privacy-preserving data mining field. We consider the problem of providing public information about a set of confidential documents. To that end, we have developed a method to protect a Vector Space Model (VSM) so that it can be made public even if the documents it represents are private. This method is inspired by microaggregation, a popular protection method from statistical disclosure control, and adapted to work with sparse and high-dimensional data sets.

URL: http://www.sciencedirect.com/science/article/pii/S0167404814001679
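The idea of microaggregating a sparse vector space can be illustrated with a minimal sketch. This is not the authors' algorithm: the greedy grouping heuristic, the use of cosine similarity on unit-norm vectors, and all names (`normalize`, `cosine`, `centroid`, `microaggregate`) are assumptions made for illustration only. Documents are represented as sparse dictionaries mapping terms to weights, and each document is replaced by the normalized centroid of a group of at least `k` similar documents.

```python
import math


def normalize(vec):
    """Scale a sparse vector (dict: term -> weight) to unit L2 norm."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm > 0 else {}


def cosine(a, b):
    """Cosine similarity of two unit-norm sparse vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def centroid(vectors):
    """Sum the vectors and project the result back onto the unit sphere."""
    total = {}
    for v in vectors:
        for t, w in v.items():
            total[t] = total.get(t, 0.0) + w
    return normalize(total)


def microaggregate(docs, k):
    """Greedy grouping (illustrative heuristic): each group has >= k members,
    and every member is replaced by its group's normalized centroid."""
    unit = [normalize(d) for d in docs]
    remaining = list(range(len(unit)))
    protected = [None] * len(unit)
    while remaining:
        if len(remaining) < 2 * k:
            group = list(remaining)  # last group absorbs the leftovers
        else:
            seed = remaining[0]
            others = sorted(remaining[1:],
                            key=lambda i: -cosine(unit[seed], unit[i]))
            group = [seed] + others[:k - 1]
        rep = centroid([unit[i] for i in group])
        for i in group:
            protected[i] = rep
        remaining = [i for i in remaining if i not in group]
    return protected
```

In this sketch, every released vector is shared by at least `k` documents (provided the collection holds at least `k` documents), which is the property microaggregation relies on. A real implementation would need a better partitioning strategy and an evaluation of information loss.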

Similar resources

Towards a private vector space model for confidential documents

In this paper we introduce a method to anonymize document vector spaces. These vector spaces can be used to analyze confidential documents without disclosing private information. The method is inspired by microaggregation, a popular technique used in statistical disclosure control.

URL: http://doi.acm.org/10.1145/2480362.2480543

TFRP: An efficient microaggregation algorithm for statistical disclosure control

Recently, the issue of Statistical Disclosure Control (SDC) has attracted much attention. SDC is a very important part of data security dealing with the protection of databases. Microaggregation for SDC is widely used to protect confidentiality in statistical databases released for public use. The basic problem of microaggregation is that similar records are clustered into groups, and ...

Laplacian eigenmodes for spherical spaces

The possibility that our space is multi- rather than singly-connected has gained renewed interest after the discovery of the low power for the first multipoles of the CMB by WMAP. To test the possibility that our space is a multi-connected spherical space, it is necessary to know the eigenmodes of such spaces. Except for lens and prism spaces, and to some extent for dodecahedral space, this r...

The evidence framework applied to sparse kernel logistic regression

In this paper we present a simple hierarchical Bayesian treatment of the sparse kernel logistic regression (KLR) model based on the evidence framework introduced by MacKay. The principal innovation lies in the re-parameterisation of the model such that the usual spherical Gaussian prior over the parameters in the kernel induced feature space also corresponds to a spherical Gaussian prior over t...

Microdata Protection Method Through Microaggregation: A Systematic Approach

Microdata protection in statistical databases has recently become a major societal concern and has been intensively studied in recent years. Statistical Disclosure Control (SDC) is often applied to statistical databases before they are released for public use. Microaggregation for SDC is a family of methods to protect microdata from individual identification. SDC seeks to protect microdata in s...

Journal:
  • Computers & Security

Volume 49

Publication date: 2015