K-means Clustering Microaggregation for Statistical Disclosure Control

Authors

  • Md Enamul Kabir
  • Abdun Naser Mahmood
  • Abdul K Mustafa
Abstract

This paper presents a K-means clustering technique that addresses a bi-objective problem: minimizing information loss while maintaining k-anonymity. The proposed technique starts with a single cluster and successively partitions the dataset into two or more clusters so that the total information loss across all clusters is minimized while the k-anonymity requirement is satisfied. The structure of the K-means clustering problem is defined and investigated, and an algorithm for solving it is developed. The performance of the K-means clustering algorithm is compared against the most recent microaggregation methods. Experimental results show that the K-means clustering algorithm incurs less information loss than the latest microaggregation methods in all of the test situations.
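The idea in the abstract (start from one cluster, keep partitioning, never let a group fall below k records) can be illustrated with a minimal sketch. This is not the authors' algorithm. It is a simple greedy bisection written under stated assumptions: information loss is taken to be SSE/SST (within-group sum of squared errors over the total sum of squares), a measure commonly used for microaggregation but not named in the abstract, and every function name below is hypothetical.

```python
from typing import List, Sequence

Record = Sequence[float]


def centroid(group: List[Record]) -> List[float]:
    """Attribute-wise mean of a group of records."""
    n = len(group)
    return [sum(r[d] for r in group) / n for d in range(len(group[0]))]


def sse(group: List[Record]) -> float:
    """Sum of squared distances from each record to the group centroid."""
    c = centroid(group)
    return sum((r[d] - c[d]) ** 2 for r in group for d in range(len(c)))


def information_loss(groups: List[List[Record]]) -> float:
    """Assumed measure: within-group SSE divided by the total sum of squares."""
    sst = sse([r for g in groups for r in g])
    return sum(sse(g) for g in groups) / sst if sst > 0 else 0.0


def is_k_anonymous(groups: List[List[Record]], k: int) -> bool:
    """Every group must hold at least k records."""
    return all(len(g) >= k for g in groups)


def split_once(group: List[Record], k: int) -> List[List[Record]]:
    """Split along the widest attribute; both halves keep at least k records."""
    if len(group) < 2 * k:
        return [group]
    dims = range(len(group[0]))
    widest = max(dims, key=lambda d: max(r[d] for r in group) - min(r[d] for r in group))
    ordered = sorted(group, key=lambda r: r[widest])
    # Among the cuts that respect the size constraint, pick the lowest-SSE one.
    cut = min(range(k, len(ordered) - k + 1),
              key=lambda i: sse(ordered[:i]) + sse(ordered[i:]))
    return [ordered[:cut], ordered[cut:]]


def microaggregate(records: List[Record], k: int) -> List[List[Record]]:
    """Start with one cluster and split while a split lowers the total SSE."""
    groups = [list(records)]
    changed = True
    while changed:
        changed = False
        next_groups = []
        for g in groups:
            parts = split_once(g, k)
            if len(parts) == 2 and sse(parts[0]) + sse(parts[1]) < sse(g):
                next_groups.extend(parts)
                changed = True
            else:
                next_groups.append(g)
        groups = next_groups
    return groups


records = [(1.0,), (2.0,), (3.0,), (10.0,), (11.0,), (12.0,)]
groups = microaggregate(records, k=3)
print(len(groups), is_k_anonymous(groups, 3), round(information_loss(groups), 4))
```

In a release step, each record would then be replaced by the centroid of its group. The size constraint is enforced at split time, so the k-anonymity check at the end should always pass for inputs with at least k records.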

Related resources

Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...

Practical Data-Oriented Microaggregation for Statistical Disclosure Control

Microaggregation is a statistical disclosure control technique for microdata disseminated in statistical databases. Raw microdata (i.e., individual records or data vectors) are grouped into small aggregates prior to publication. Each aggregate should contain at least k data vectors to prevent disclosure of individual information, where k is a constant value preset by the data protector. No exa...

Record Ordering Heuristics for Disclosure Control through Microaggregation

Statistical disclosure control (SDC) methods reconcile the need to release information to researchers with the need to protect the privacy of individual records. Microaggregation is an SDC method that protects data subjects by guaranteeing k-anonymity: records are partitioned into groups of size at least k, and actual data values are replaced by the group means so that each record in the group is indi...

Novel Iterative Min-Max Clustering to Minimize Information Loss in Statistical Disclosure Control

In recent years, there has been an alarming increase in online identity theft and attacks using personally identifiable information. The goal of privacy preservation is to de-associate individuals from sensitive or microdata information. Microaggregation techniques seek to protect microdata in such a way that it can be published and mined without providing any private information that can be link...

Microdata Protection Method Through Microaggregation: A Systematic Approach

Microdata protection in statistical databases has recently become a major societal concern and has been intensively studied in recent years. Statistical Disclosure Control (SDC) is often applied to statistical databases before they are released for public use. Microaggregation for SDC is a family of methods to protect microdata from individual identification. SDC seeks to protect microdata in s...

Publication date: 2012