Collective Data-Sanitization for Preventing Sensitive Information Inference Attacks in Social Networks
Abstract
Releasing social network data could seriously breach user privacy. User profiles and friendship relations are inherently private. Unfortunately, data mining techniques make it possible to predict sensitive information that is latent in released data. Sanitizing network data prior to release is therefore necessary. In this paper, we explore how to launch an inference attack that exploits a social network through a mixture of non-sensitive attributes and social relationships. We map this issue to a collective classification problem and propose a collective inference model. In our model, an attacker uses user profiles and social relationships collectively to predict sensitive information about related victims in a released social network dataset. To protect against such attacks, we propose a data sanitization method that collectively manipulates user profiles and friendship relations. The key novel idea is that, besides sanitizing friendship relations, the proposed method can take advantage of various data-manipulating methods. We show on three social network datasets that we can easily reduce the adversary's prediction accuracy on sensitive information while causing a smaller decrease in accuracy on non-sensitive information. To the best of our knowledge, this is the first work that employs collective methods involving various data-manipulating methods and social relationships to protect against inference attacks in social networks.
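The idea of collective inference and link sanitization can be illustrated with a toy sketch. This is not the paper's model: the averaging rule, the `alpha` weight, the function names, and the four-node graph are all assumptions made for illustration. The attacker blends an attribute-only estimate with the average belief of a node's neighbors; sanitization here simply removes selected friendship links before release.

```python
def collective_inference(neighbors, prior, known, alpha=0.5, rounds=10):
    """Estimate P(sensitive label = 1) for every node (illustrative rule only).

    neighbors: dict node -> set of adjacent nodes
    prior:     dict node -> attribute-only estimate in [0, 1]
    known:     dict node -> publicly visible label (0 or 1)
    alpha:     assumed weight of the attribute prior versus the neighbor average
    """
    belief = {n: float(known.get(n, prior[n])) for n in neighbors}
    for _ in range(rounds):
        updated = {}
        for node, adj in neighbors.items():
            if node in known:
                # Visible labels stay fixed.
                updated[node] = float(known[node])
            elif adj:
                # Blend profile-based evidence with link-based evidence.
                link_score = sum(belief[m] for m in adj) / len(adj)
                updated[node] = alpha * prior[node] + (1 - alpha) * link_score
            else:
                updated[node] = prior[node]
        belief = updated
    return belief


def remove_links(neighbors, node, to_drop):
    """Return a copy of the graph without the edges between node and to_drop."""
    pruned = {n: set(adj) for n, adj in neighbors.items()}
    for m in to_drop:
        pruned[node].discard(m)
        pruned[m].discard(node)
    return pruned


# Hypothetical toy data: "v" is the victim whose label is hidden.
graph = {"a": {"b", "c", "v"}, "b": {"a", "v"}, "c": {"a", "v"}, "v": {"a", "b", "c"}}
prior = {"a": 0.5, "b": 0.5, "c": 0.5, "v": 0.4}
known = {"a": 1, "b": 1, "c": 0}

before = collective_inference(graph, prior, known)["v"]
sanitized = remove_links(graph, "v", ["a", "b"])
after = collective_inference(sanitized, prior, known)["v"]
print(before, after)
```

In this toy setting the attacker's belief about `v` falls from above 0.5 to below it once the two revealing links are removed. A fuller sketch would also perturb the attribute prior, since the abstract describes manipulating profiles as well as friendship relations.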
Related Resources
Preventing Private Information Inference Attacks on Social Networks Technical Report UTDCS-03-09
On-line social networks, such as Facebook, are increasingly utilized by many people. These networks allow users to publish details about themselves and connect to their friends. Some of the information revealed inside these networks is meant to be private. Yet it is possible that corporations could use learning algorithms on released data to predict undisclosed private information. In this pape...
A Combined Model of Clustering and Classification Methods for Preserving Privacy in Social Networks against Inference and Neighborhood Attacks
In the last decade, online social networks have gained remarkable attention. Facebook and Google+ are examples of social network services that allow people to create online profiles and share personal information with their friends. These networks publish details about users, while some of the information revealed inside is private. In order to address privacy concerns, many social networks allow use...
Preventing Disclosure of Sensitive Knowledge by Hiding Inference
'Data Mining' is a way of extracting data or uncovering hidden patterns of information from databases. So, there is a need to prevent the "inference rules" from being disclosed such that the more secure data sets cannot be identified from non-sensitive attributes. This can be done through removing/adding certain item sets in the transactions (Sanitization). The purpose is to...
Secure Association Rule Sharing
The sharing of association rules is often beneficial in industry, but requires privacy safeguards. One may decide to disclose only part of the knowledge and conceal strategic patterns which we call restrictive rules. These restrictive rules must be protected before sharing since they are paramount for strategic decisions and need to remain private. To address this challenging problem, we propos...
Data sanitization in association rule mining based on impact factor
Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates the concerns of individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...