Memory Efficient Ranking
Abstract
Fast and effective ranking of a collection of documents with respect to a query requires several structures, including a vocabulary, inverted file entries, arrays of term weights and document lengths, an array of partial similarity accumulators, and address tables for inverted file entries and documents. Of all of these structures, the array of document lengths and the array of accumulators are the components accessed most frequently in a ranked query, and it is crucial to acceptable performance that they be held in main memory. Here we describe an approximate ranking process that makes use of a compact array of in-memory low precision approximations for the lengths. Combined with another simple rule for reducing the memory required by the partial similarity accumulators, the approximation heuristic allows the ranking of large document collections using less than one byte of memory per document, an eight-fold reduction compared with the space required by conventional techniques. Moreover, in our experiments retrieval effectiveness was unaffected by the use of these heuristics.
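The idea of keeping only low-precision approximations of the document lengths in memory can be illustrated with a small sketch. The abstract does not say how the lengths are encoded, so everything below is an assumption made for illustration: the geometric (logarithmic) bucketing, the 6-bit code width, and the names `make_codec`, `encode`, `decode` and `rank` are hypothetical and are not taken from the paper.

```python
import math


def make_codec(min_len, max_len, bits=6):
    """Build a geometric quantizer for document lengths (illustrative assumption).

    Code c stands for the length min_len * base**c, so every code covers the
    same relative range and the relative error is bounded by sqrt(base).
    Requires 0 < min_len < max_len.
    """
    levels = (1 << bits) - 1
    base = (max_len / min_len) ** (1.0 / levels)

    def encode(length):
        # Round to the nearest bucket in log space, then clamp to the code range.
        code = round(math.log(length / min_len) / math.log(base))
        return min(max(code, 0), levels)

    def decode(code):
        return min_len * base ** code

    return encode, decode, base


def rank(accumulators, codes, decode, k):
    """Return the ids of the k documents with the highest normalized score.

    accumulators maps document id -> partial similarity; codes holds one small
    integer per document in place of a full-precision length.
    """
    scored = [(acc / decode(codes[doc]), doc) for doc, acc in accumulators.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc for _, doc in scored[:k]]


# Hypothetical data: exact lengths are needed only once, when the codes are built.
lengths = [3.0, 12.5, 40.0, 7.2, 100.0]
encode, decode, base = make_codec(min(lengths), max(lengths), bits=6)
codes = bytearray(encode(length) for length in lengths)  # one byte per document

accumulators = {0: 1.5, 1: 5.0, 2: 8.0, 4: 9.0}
top = rank(accumulators, codes, decode, k=2)
```

In this sketch each decoded length is within a factor of `sqrt(base)` of the true length, so the normalized scores move by at most a few percent. Codes narrower than 8 bits could also be packed several to a byte, which is one possible way to get below one byte per document; whether the paper does this is not stated in the abstract.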
Similar resources
Ranking efficient DMUs using the infinity norm and virtual inefficient DMU in DEA
In many applications, ranking decision making units (DMUs) is a difficult technical task for decision makers in data envelopment analysis (DEA), especially when there are extremely efficient DMUs. In such cases, many DEA models may assign the same efficiency score to different DMUs. Hence, there is growing interest in ranking techniques. The purpose of this paper is ra...
Ranking efficient DMUs using minimizing distance in DEA
In many applications, ranking decision making units (DMUs) is a difficult technical task for decision makers in data envelopment analysis (DEA), especially when there are extremely efficient DMUs. In such cases, many DEA models may assign the same efficiency score to different DMUs. Hence, there is growing interest in ranking techniques. The main purpose of this paper ...
Ranking Efficient DMUs Using the Ideal point and Norms
This paper presents two simple methods for ranking efficient DMUs in DEA models; they add one virtual DMU as an ideal DMU and use the additive model. Note that the ideal point is used only for comparing efficient DMUs. Although these methods are simple, they are able to rank all efficient DMUs, extreme points and others alike, and they are also capable of ranking t...
Ranking Efficient Decision Making Units in Data Envelopment Analysis based on Changing Reference Set
One of the drawbacks of Data Envelopment Analysis (DEA) is the lack of discrimination among efficient Decision Making Units (DMUs). One method for removing this difficulty, called changing reference set, was proposed by Jahanshahloo et al. (2007). The method has some drawbacks. In this paper a modified method and a new method to overcome these problems are suggested. The main advantage of...
Efficient Algorithms and Data Structures for Massive Data Sets
For many algorithmic problems, traditional algorithms that optimise on the number of instructions executed prove expensive on I/Os. Novel and very different design techniques, when applied to these problems, can produce algorithms that are I/O efficient. This thesis adds to the growing chorus of such results. The computational models we use are the external memory model and the W-Stream model. ...
Journal title:
- Inf. Process. Manage.
Volume 30
Publication date: 1994