Similar resources
CS 6740: Advanced Language Technologies, February 4, 2010. Lecture 3: Pivoted Document Length Normalization
In this lecture, we examine the impact of the length of a document on its relevance to queries. We show that document relevance is positively correlated with document length, and see that relevance scores that use the normalization techniques we’ve studied so far (L∞, L1, L2) do not capture this correlation correctly. Finally, we present the “pivoted document length normalization” technique int...
Document Length Normalization
In the previous lecture we discussed pivoted document length normalization [Singhal et al. 96], a simple technique that applies a correction for the observation that document relevance correlates with document length. Through careful empirical verification of previous assumptions, they showed that the seemingly simple normalization term could have a big impact on results. However, in our discus...
Document Normalization Revisited
Cosine Pivoted Document Length Normalization has reached a point of stability where many researchers indiscriminately apply a specific value of 0.2 regardless of the collection. Our efforts, however, demonstrate that applying this specific value without tuning for the document collection degrades average precision by as much as 20%.
From Controlled Document Authoring to Interactive Document Normalization
This paper presents an approach to normalize documents in constrained domains. This approach reuses resources developed for controlled document authoring and is decomposed into three phases. First, candidate content representations for an input document are automatically built. Then, the content representation that best corresponds to the document according to an expert of the class of document...
Ontology Based Pivoted normalization using Vector Based Approach for information Retrieval
Research Scholar, Computer Science and Engineering Department, Lingaya’s University, Faridabad; Associate Professor, Computer Science and Engineering Department, Lingaya’s University, Faridabad. Abstract: The sheer number of documents on the web leaves users in a dilemma. Users get confused about the relevance of documents. Relevance means ...
Journal
Journal title: ACM SIGIR Forum
Year: 2017
ISSN: 0163-5840
DOI: 10.1145/3130348.3130365