Study on the Combination of Probabilistic and Boolean IR Models for WWW Documents Retrieval

Authors

  • Masaharu Yoshioka
  • Makoto Haraguchi
Abstract

In this paper, we describe the information retrieval (IR) system that we used for the NTCIR-4 Web Task A. First, we introduce our IR system, which is based on the probabilistic IR model. This system is quite similar to the Okapi system, and uses both a word index and a phrase index comprising combinations of two adjacent words. Second, we propose a method for clarifying queries that combines the probabilistic IR model and the Boolean IR model. Since it is not easy to construct a Boolean query that covers all relevant documents, a mechanism for clarifying the Boolean query is required. We therefore propose “appropriate Boolean query reformulation for IR” (ABRIR), which supports Boolean query formation and scores documents by combining the probabilistic and Boolean IR models. Finally, we discuss the effectiveness of the method based on the results of experiments.
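
The abstract names three ingredients: Okapi-style probabilistic scoring, an index over words plus two-adjacent-word phrases, and a Boolean condition that shapes the final ranking. The sketch below illustrates how such pieces could fit together. It is not the authors' ABRIR implementation, whose details are not given here. The function names, the BM25 parameters (`k1`, `b`), and especially the `penalty` factor applied to documents that fail the Boolean condition are assumptions made for illustration only.

```python
import math
from collections import Counter


def terms(text):
    """Return lowercased words followed by two-adjacent-word phrases."""
    words = text.lower().split()
    phrases = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + phrases


def bm25_scores(docs, query, k1=1.2, b=0.75):
    """Score every document against the query with a common BM25 variant."""
    doc_terms = [Counter(terms(d)) for d in docs]
    lengths = [len(d.split()) for d in docs]
    avg_len = sum(lengths) / len(docs)
    n = len(docs)
    scores = []
    for tf, length in zip(doc_terms, lengths):
        score = 0.0
        for t in set(terms(query)):
            df = sum(1 for other in doc_terms if t in other)
            if df == 0 or tf[t] == 0:
                continue
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            norm = tf[t] + k1 * (1 - b + b * length / avg_len)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def boolean_and(doc, required):
    """True if the document contains every required word."""
    words = set(doc.lower().split())
    return all(w in words for w in required)


def rank(docs, query, required, penalty=0.5):
    """Rank by BM25; documents failing the Boolean AND are down-weighted.

    The multiplicative penalty is an illustrative assumption, not ABRIR.
    """
    scored = []
    for i, s in enumerate(bm25_scores(docs, query)):
        if not boolean_and(docs[i], required):
            s *= penalty
        scored.append((s, i))
    return [i for s, i in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

Down-weighting rather than discarding non-matching documents reflects the abstract's observation that a Boolean query rarely covers all relevant documents; a strict filter would be the other obvious design choice.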


Similar resources

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...


Geographic Information Retrieval and the World Wide Web: A Match Made in Electronic Space

This article looks at the access to geographic information through a review of information science theory and its application to the WWW. The two most common retrieval systems are information and data retrieval. A retrieval system has seven elements: retrieval models, indexing, match and retrieval, relevance, order, query languages and query specification. The goal of information retrieval is t...


Automatizing the Assignment of the Submitted Manuscripts to Reviewers: A Systematic Review of Research Texts

Purpose: To systematically review the automation of the assignment of submitted manuscripts to reviewers in order to identify the status of research in this field in terms of types of evidence of expertise, types of retrieval models used, and research gaps, and finally to offer some suggestions for future research. Method: The current research followed the systema...


Logical Imaging and Probabilistic Information Retrieval

In Information Retrieval (IR), probabilistic modelling relates to the use of a retrieval model that ranks documents in decreasing order of their estimated probability of relevance to a user’s information need expressed by a query. In an IR system based on a probabilistic model, the user is always guided to examine first the documents which are the most likely to be relevant to his or her need. ...



Publication date: 2004