Using Search Results to Microaggregate Query Logs Semantically

Authors

  • Arnau Erola
  • Jordi Castellà-Roca
Abstract

Query log anonymization has become an important challenge. A query log contains the users' search history, together with the results they selected and the position of those results in the ranking. These data are used to provide personalized re-ranking of results and to study trends. However, query logs can disclose sensitive information about users. Hence, query logs must undergo an anonymization process that guarantees that: a) no sensitive information can be linked to an identity; b) analysis of the anonymized data produces results similar to those obtained from the original data, i.e. data distortion is minimized. The latest anonymization approaches use microaggregation, a statistical disclosure control technique that provides privacy comparable to k-anonymity while attempting to minimize data distortion. We propose a new method that uses search results to optimize microaggregation, providing more data reliability than existing methods.
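
As a rough illustration of the general microaggregation idea mentioned in the abstract, the sketch below groups numeric values into clusters of at least k similar records and replaces each value with its group mean, so that every published value is shared by at least k records. This is not the method proposed by the authors, which operates on query logs and uses search results; the function name `microaggregate`, the one-dimensional numeric input, and the fixed-size grouping are assumptions made only for this example.

```python
def microaggregate(values, k):
    """Toy univariate microaggregation (illustrative assumption, not the paper's method).

    Sorts the values, splits them into groups of at least k neighbours,
    and replaces every value with the mean of its group.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k values")
    # Sort record indices by value so that each group holds similar records.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    result = [0.0] * len(values)
    for g in range(n_groups):
        start = g * k
        # The last group absorbs the remainder, so no group is smaller than k.
        end = len(values) if g == n_groups - 1 else start + k
        group = order[start:end]
        centroid = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = centroid
    return result


# Hypothetical toy data: seven numeric records, groups of at least three.
anonymized = microaggregate([1, 2, 3, 10, 11, 12, 13], k=3)
```

Grouping sorted neighbours keeps each centroid close to the values it replaces, which is the sense in which microaggregation tries to limit data distortion while still hiding each record among at least k others.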


Similar resources

Extracting Semantically Related Queries By Exploiting User Session Information

This paper presents a simple and very effective collaborative approach to generate semantically related queries to a user query by employing aggregated user session statistics, as captured by search engine query logs. We show empirical evidence that one of the main causes of the temporal correlation between semantically related queries, which was previously reported in the literature, is the fa...


Why Not Use Query Logs As Corpora?

Generally, every Web search engine logs the user sessions. These records, called query logs, contain valuable information about the behaviour of Internet users and their language. There are only a few experiments on mining query logs, but they confirm that query logs are very useful for designing natural language applications in Web retrieval. This paper shows how lexical and semantic informati...


Mining Search Subtopics from Query Logs

Web queries are usually short and ambiguous. Subtopic mining plays an important role in understanding user’s search intent and has attracted many researchers' attention. In this paper, we describe our approach to identify users’ intents from query logs, which is a subtopic mining subtask of the NTCIR-9 Intent task for Chinese. We extract queries that are semantically related to the original que...


Analysis of users’ query reformulation behavior in Web with regard to Wholistic/analytic cognitive styles, Web experience, and search task type

Background and Aim: The basic aim of the present study is to investigate users’ query reformulation behavior with regard to wholistic-analytic cognitive styles, search task type, and experience variables in using the Web. Method: This study is an applied research using survey method. A total of 321 search queries were submitted by 44 users. Data collection tools were Riding’s Cognitive Style A...


Query Log Mining in Search Engines

The Web is a huge read-write information space where many items such as documents, images or other multimedia can be accessed. In this context, several information technologies have been developed to help users to satisfy their searching needs on the Web, and the most used are search engines. Search engines allow users to find Web resources formulating queries (a set of terms) and reviewing a l...


Publication date: 2013