Ranked Document Selection

Authors

J. Ian Munro

Gonzalo Navarro

Rahul Shah

Sharma V. Thankachan

Abstract

Let D be a collection of string documents of n characters in total. The top-k document retrieval problem is to preprocess D into a data structure that, given a query (P, k), can return the k documents of D most relevant to pattern P. The relevance of a document d for a pattern P is given by a predefined ranking function w(P, d). Linear-space solutions with optimal query time already exist for this problem. In this paper we consider a novel problem, document selection queries, which aim to report the kth document most relevant to P (instead of reporting all top-k documents). We present a data structure using O(n log^ε n) space, for any constant ε > 0, answering selection queries in time O(log k / log log n), and a linear-space data structure answering queries in time O(log k), given the locus node of P in a (generalized) suffix tree of D. We also prove that it is unlikely that a succinct-space solution for this problem exists with poly-logarithmic query time.

Similar resources

A Multistage Feature Selection Model for Document Classification Using Information Gain and Rough Set

The number of documents is increasing rapidly, so organizing them in digitized form makes text categorization a challenging issue. A major issue for text categorization is its large number of features. Most of the features are noisy, irrelevant and redundant, which may mislead the classifier. Hence, it is most important to reduce the dimensionality of the data to get a smaller subset and prov...

RRLUFF: Ranking function based on Reinforcement Learning using User Feedback and Web Document Features

The principal aim of a search engine is to provide sorted results according to the user’s requirements. To achieve this aim, it employs ranking methods to rank web documents based on their significance and relevance to the user's query. The novelty of this paper is a user-feedback-based ranking algorithm using reinforcement learning. The proposed algorithm is called RRLUFF, in which the rank...

Enhancing Query Formulation for Spoken Document Retrieval

The popularity and ubiquity of multimedia associated with spoken documents has spurred a lot of research interest in spoken document retrieval (SDR) in the recent past. Beyond much effort devoted to developing robust indexing and modeling techniques for representing spoken documents, a recent line of thought targets the improvement of query modeling to better reflect the user’s informati...

Comparing scientific production of prioritized health areas of Iran's comprehensive scientific map with outlook horizon 1404 countries, a scientometric study: brief report

Background: Studying the evolution of medical sciences is important for charting the future path with a view to per capita science production. The purpose of this study was to clarify the status and position of Iran in science production and compare it with four competitor countries of the region for 2025. Methods: This research is conducted using the scientometric method our ranked citati...

Robust Query-Specific Pseudo Feedback Document Selection for Query Expansion

In document retrieval using pseudo relevance feedback, after initial ranking, a fixed number of top-ranked documents are selected as feedback to build a new expansion query model. However, very little attention has been paid to an intuitive but critical fact that the retrieval performance for different queries is sensitive to the selection of different numbers of feedback documents. In this pap...

Publication date: 2014
