Score Distributions in Information Retrieval

نویسندگان

  • Avi Arampatzis
  • Stephen E. Robertson
  • Jaap Kamps
چکیده

We review the history of modeling score distributions, focusing on the mixture of normal-exponential by investigating the theoretical as well as the empirical evidence supporting its use. We discuss previously suggested conditions which valid binary mixture models should satisfy, such as the Recall-Fallout Convexity Hypothesis, and formulate two new hypotheses considering the component distributions under some limiting conditions of parameter values. From all the mixtures suggested in the past, the current theoretical argument points to the two gamma as the most-likely universal model, with the normal-exponential being a usable approximation. Beyond the theoretical contribution, we provide new experimental evidence showing vector space or geometric models, and BM25, as being “friendly” to the normal-exponential, and that the non-convexity problem that the mixture possesses is practically not severe.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Using Models of Score Distributions in Information Retrieval

Empirical modeling of a number of different text search engines shows that the score distributions on a per query basis may be fitted approximately using an exponential distribution for the set of nonrelevant documents and a normal distribution for the set of relevant documents. This model fits not only probabilistic search engines like INQUERY but also vector space search engines like SMART an...

متن کامل

On Score Distributions and Relevance

We discuss the idea of modelling the statistical distributions of scores of documents, classified as relevant or non-relevant. Various specific combinations of standard statistical distributions have been used for this purpose. Some theoretical considerations indicate problems with some of the choices of pairs of distributions. Specifically, we revisit a generalisation of the well-known inverse...

متن کامل

Score Following and Retrieval Based on Chroma and Octave Representation

With the studies of effective representation of music signals and music scores, i.e. chroma and octave features, this work conducts score following and score retrieval. To complement the shortage of chromagram representation, energy distributions in different octaves are used to describe tone height information. By transforming music signals and scores into sequences of feature vectors, score f...

متن کامل

Matching Scores of System Relevance and User-Oriented Relevance in SID, ISC and Google Scholar

Background and Aim: The main aim of Information storage and retrieval systems is keeping and retrieving the related information means providing the related documents with users’ needs or requests. This study aimed to answer this question that how much are the system relevance and User- Oriented relevance are matched in SID, SCI and Google Scholar databases. Method: In this study 15 keywords of ...

متن کامل

Score Standardization for Robust Comparison of Retrieval Systems

Information retrieval systems are evaluated by applying them to standard test collections of documents, topics, and relevance judgements. An evaluation metric is then used to score a system’s output for each topic; these scores are averaged to obtain an overall measure of effectiveness. However, different topics have differing degrees of difficulty and differing variability in scores, leading t...

متن کامل

Measuring the Ability of Score Distributions to Model Relevance

Modelling the score distribution of documents returned from any information retrieval (IR) system is of both theoretical and practical importance. The goal of which is to be able to infer relevant and nonrelevant documents based on their score to some degree of confidence. In this paper, we show how the performance of mixtures of score distributions can be compared using inference of query perf...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009