A survey on statistical disclosure control and micro-aggregation techniques for secure statistical databases

Authors

  • Ebaa Fayyoumi
  • B. John Oommen
Abstract

This paper surveys Statistical Disclosure Control (SDC) and MicroAggregation Techniques (MATs), two areas fundamental to the science of secure Statistical DataBases (SDBs). It is written from the perspective of a computer scientist, in the hope that it will serve as useful reference material for researchers and practitioners in the field. The paper first introduces the concept of SDC and describes its application domains and the data types currently used in SDBs. It then focuses on the family of micro-data types in SDBs. At this juncture, we introduce the two relevant measures, Information Loss (IL) and Disclosure Risk (DR), and then survey the methods for resolving the conflicting goals that these metrics represent. Thereafter, the paper summarizes the perturbative and nonperturbative SDC methods for micro-data protection, and focuses on the families of MATs by formally stating the Micro-Aggregation Problem and surveying it comprehensively. In addition to a historical view of the field of MATs, the paper describes a broad selection of more recently reported work. Indeed, we believe that this paper represents a complete overview of the state-of-the-art techniques. Copyright © 2010 John Wiley & Sons, Ltd.
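To make the trade-off between Information Loss and Disclosure Risk concrete, the following is a minimal sketch of univariate micro-aggregation, not the method of any particular paper. It assumes a simple sort-based heuristic with a minimum group size `k`, and it assumes IL is measured as the within-group sum of squares divided by the total sum of squares (SSE/SST), a common choice in this field. The function names `microaggregate` and `information_loss` are hypothetical.

```python
from statistics import mean


def microaggregate(values, k):
    """Replace each value by the mean of its group of at least k sorted neighbours.

    Assumption: a fixed-size, sort-based heuristic for one numeric attribute.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k records")
    # Indices of the records, ordered by attribute value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    masked = [0.0] * len(values)
    for g in range(n_groups):
        start = g * k
        # The last group absorbs the remainder, so its size is between k and 2k - 1.
        end = len(values) if g == n_groups - 1 else start + k
        group = order[start:end]
        centroid = mean(values[i] for i in group)
        for i in group:
            masked[i] = centroid
    return masked


def information_loss(values, masked):
    """SSE / SST: 0 means nothing was lost, 1 means all variation was lost."""
    overall = mean(values)
    sse = sum((v - m) ** 2 for v, m in zip(values, masked))
    sst = sum((v - overall) ** 2 for v in values)
    return sse / sst if sst else 0.0


if __name__ == "__main__":
    incomes = [10, 12, 11, 50, 52, 51, 90]
    released = microaggregate(incomes, k=3)
    print(released)
    print(information_loss(incomes, released))
```

A larger `k` hides each record among more neighbours, which lowers disclosure risk, but it also pulls the released values further from the originals, which raises information loss.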

Similar resources

A Fixed Structure Learning Automaton Micro-aggregation Technique for Secure Statistical Databases

We consider the problem of securing statistical databases and, more specifically, the micro-aggregation technique (MAT), which coalesces the individual records in the micro-data file into groups or classes and, on being queried, reports the aggregated mean of the corresponding group in place of each individual value. This problem is known to be NP-hard and has been tackled using many heuristic s...

Enhancing Micro-Aggregation Technique by Utilizing Dependence-Based Information in Secure Statistical Databases

We consider the Micro-Aggregation Problem (MAP) in secure statistical databases, which involves partitioning a set of individual records in a micro-data file into a number of mutually exclusive and exhaustive groups. This problem, which seeks the best partition of the micro-data file, is known to be NP-hard, and has been tackled using many heuristic solutions. In this paper, we would like to...

Privacy-Preserving Data Mining

Privacy-preserving data mining (PPDM) refers to the area of data mining that seeks to safeguard sensitive information from unsolicited or unsanctioned disclosure. Most traditional data mining techniques analyze and model the data set statistically, in aggregation, while privacy preservation is primarily concerned with protecting against the disclosure of individual data records. This domain separation...

An AI-Based Causal Strategy for Securing Statistical Databases Using Micro-aggregation

Although Artificial Intelligence (AI) techniques have been used in various applications, their use in maintaining security in Statistical DataBases (SDBs) has not been reported. This paper presents results that, to the best of our knowledge, are pioneering, by which concepts from causal networks are used to secure SDBs. We consider the MicroAggregation Problem (MAP) in secure SDBs which involves p...

Measuring the disclosure protection of micro aggregated business microdata: An analysis taking the example of German Structure of Costs Survey

To assess the effectiveness of an anonymisation method with respect to data protection, the disclosure risk associated with the protected data must be evaluated. We consider the scenario where a possible data intruder matches an external database with the entirety of the confidential data. In order to improve his external database, he tries to assign as many correct pairs of records (that is, records ...

Journal:
  • Softw., Pract. Exper.

Volume 40

Publication date: 2010