Integrating the Probabilistic Models BM25/BM25F into Lucene
Abstract
This document describes an implementation of BM25 and BM25F built on the Lucene Java framework. The implementation can be downloaded from [Pérez-Iglesias 08a]. Both models have performed strongly at TREC and are considered state of the art in the IR community. BM25 applies to retrieval over plain-text documents, that is, documents that do not contain fields, while BM25F applies to structured documents.
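As a rough illustration of the difference between the two models, the sketch below scores a single query term under a commonly stated form of BM25 and of BM25F. It is not the downloadable implementation and does not use any Lucene API. The class name `Bm25Sketch`, the method names, and the parameter values are hypothetical, and the exact formula variant (for instance, whether a `(k1 + 1)` factor appears in the numerator) is an assumption.

```java
public class Bm25Sketch {

    // Robertson/Sparck Jones style IDF (assumed variant).
    // It can go negative for terms that occur in more than half of the documents.
    static double idf(long numDocs, long docFreq) {
        return Math.log((numDocs - docFreq + 0.5) / (docFreq + 0.5));
    }

    // BM25 contribution of one term to one plain-text (field-less) document.
    static double bm25Term(double tf, double docLen, double avgDocLen,
                           long numDocs, long docFreq, double k1, double b) {
        // Length normalization: longer-than-average documents are penalized.
        double norm = k1 * ((1.0 - b) + b * docLen / avgDocLen);
        return idf(numDocs, docFreq) * tf / (tf + norm);
    }

    // BM25F contribution of one term to one structured document.
    // Term frequencies are length-normalized and boosted per field, summed,
    // and only then passed through the saturation function once.
    static double bm25fTerm(double[] tf, double[] fieldLen, double[] avgFieldLen,
                            double[] boost, double[] b,
                            long numDocs, long docFreq, double k1) {
        double weight = 0.0;
        for (int c = 0; c < tf.length; c++) {
            double fieldNorm = (1.0 - b[c]) + b[c] * fieldLen[c] / avgFieldLen[c];
            weight += boost[c] * tf[c] / fieldNorm;
        }
        return idf(numDocs, docFreq) * weight / (k1 + weight);
    }

    public static void main(String[] args) {
        // Illustrative values only: 1000 documents, term found in 10 of them.
        double plain = bm25Term(2.0, 120.0, 100.0, 1000, 10, 1.2, 0.75);

        // Two hypothetical fields (title, body); the title is boosted.
        double fielded = bm25fTerm(
                new double[] {1.0, 1.0},      // term frequency per field
                new double[] {8.0, 112.0},    // field lengths
                new double[] {10.0, 90.0},    // average field lengths
                new double[] {3.0, 1.0},      // field boosts
                new double[] {0.5, 0.75},     // per-field b
                1000, 10, 1.2);

        System.out.println("BM25  term score: " + plain);
        System.out.println("BM25F term score: " + fielded);
    }
}
```

With a single field and a boost of 1, the BM25F expression in this sketch reduces to the BM25 one, which is a quick way to sanity-check the two methods against each other.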
Similar resources
The Probabilistic Relevance Framework: BM25 and Beyond
The Probabilistic Relevance Framework (PRF) is a formal framework for document retrieval, grounded in work done in the 1970–1980s, which led to the development of one of the most successful text-retrieval algorithms, BM25. In recent years, research in the PRF has yielded new retrieval models capable of taking into account document meta-data (especially structure and link-graph information). Aga...
A Short Note on Proximity-based Scoring of Documents with Multiple Fields
The BM25 ranking function is one of the most well known query relevance document scoring functions and many variations of it are proposed. The BM25F function is one of its adaptations designed for modeling documents with multiple fields. The Expanded Span method extends a BM25-like function by taking into considerations of the proximity between term occurrences. In this note, we combine these two var...
When Simple is (more than) Good Enough: Effective Semantic Search with (almost) no Semantics
• Baseline retrieval • Flat text representation • Standard retrieval models (LM, BM25) • Fielded representation • Predicates holding title values are put in a separate field • Fielded variants of retrieval models (LMF, BM25F) • Entity importance • Weigh trusted, high-quality sources higher (DBpedia) • Extended preprocessing • Content extraction from URIs ...
A Comparative Study of Probabilistic and Language Models for Information Retrieval
Language models for information retrieval have received much attention in recent years, with many claims being made about their performance. However, previous studies evaluating the language modelling approach for information retrieval used different query sets and heterogeneous collections, which make reported results difficult to compare. This research is a broad-based study that evaluates la...
A Probabilistic Model of Learning Fields in Islamic Economics and Finance
In this paper an epistemological model of learning fields of probabilistic events is formalized. It is used to explain resource allocation governed by pervasive complementarities as the sign of unity of knowledge. Such an episteme is induced epistemologically into interacting, integrating and evolutionary variables representing the problem at hand. The end result is the formalization of a p...
Journal: CoRR
Volume: abs/0911.5046
Publication date: 2009