The Power of Selecting Key Blocks with Local Pre-ranking for Long Document Information Retrieval

Abstract

On a wide range of natural language processing and information retrieval tasks, transformer-based models, particularly pre-trained language models like BERT, have demonstrated tremendous effectiveness. Due to the quadratic complexity of the self-attention mechanism, however, such models have difficulties processing long documents. Recent works dealing with this issue include truncating long documents, in which case one loses potentially relevant information; segmenting them into several passages, which may lead to missing some information and to high computational complexity when the number of passages is large; or modifying the self-attention mechanism to make it sparser, as in sparse-attention models, at the risk again of missing some information. We follow here a slightly different approach in which one first selects key blocks of a long document by local query-block pre-ranking, and then a few blocks are aggregated to form a short document that can be processed by a model such as BERT. Experiments conducted on standard Information Retrieval datasets demonstrate the effectiveness of the proposed approach.
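
The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed-size word blocks, the term-overlap score used as the local pre-ranker, the choice to keep selected blocks in document order, and the names `split_into_blocks`, `overlap_score`, and `select_key_blocks` are all assumptions made for illustration. The output is the short document that would then be passed to a model such as BERT.

```python
from collections import Counter


def split_into_blocks(text, block_size=8):
    """Split a document into consecutive blocks of at most block_size words."""
    words = text.split()
    return [words[i:i + block_size] for i in range(0, len(words), block_size)]


def overlap_score(query_terms, block):
    """Assumed stand-in pre-ranker: count query-term occurrences in the block."""
    counts = Counter(word.lower() for word in block)
    return sum(counts[term] for term in query_terms)


def select_key_blocks(query, document, block_size=8, top_k=2):
    """Keep the top_k highest-scoring blocks and join them into a short document."""
    query_terms = {term.lower() for term in query.split()}
    blocks = split_into_blocks(document, block_size)
    # Rank block indices by local query-block score (ties keep document order).
    ranked = sorted(
        range(len(blocks)),
        key=lambda i: overlap_score(query_terms, blocks[i]),
        reverse=True,
    )
    # Assumption: restore original document order among the selected blocks.
    kept = sorted(ranked[:top_k])
    return " ".join(" ".join(blocks[i]) for i in kept)


if __name__ == "__main__":
    doc = (
        "the cat sat down neural ranking of documents "
        "the weather is nice long documents need ranking"
    )
    print(select_key_blocks("ranking documents", doc, block_size=4, top_k=2))
```

In practice the overlap score would likely be replaced by a stronger local ranker, and `top_k` and `block_size` would be chosen so that the aggregated text fits the downstream model's input length limit.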

Similar resources

The Use of Appropriate MADM Model for Ranking the Vendors of MCI Equipment Using Fuzzy Approach

Abstract: Nowadays, the science of decision making has received more attention due to the complexity of supplier selection problems. One of the efficient tools in economic and human resources development is the extension of communication networks in developing countries, so the proper selection of suppliers of TC equipment is of great concern. In this study, a ...

Generalized ensemble model for document ranking in information retrieval

A generalized ensemble model (gEnM) for document ranking is proposed in this paper. The gEnM linearly combines basis document retrieval models and tries to retrieve relevant documents at high positions. In order to obtain the optimal linear combination of multiple document retrieval models or rankers, an optimization program is formulated by directly maximizing the mean average precision. Both ...

Document Re-ranking Based on Automatically Acquired Key Terms in Chinese Information Retrieval

For Information Retrieval, users are more concerned about the precision of top ranking documents in most practical situations. In this paper, we propose a method to improve the precision of top N ranking documents by reordering the retrieved documents from the initial retrieval. To reorder documents, we first automatically extract Global Key Terms from document set, then use extracted Global Ke...

Document Re-ranking by Generality in Bio-medical Information Retrieval

Document ranking is well known to be a crucial process in information retrieval (IR). It presents retrieved documents in an order of their estimated degrees of relevance to query. Traditional document ranking methods are based on different measurements of similarity between documents and query. Due to information explosion and the popularity of WWW information retrieval, the increased variety o...

Semi-supervised ranking for document retrieval

Ranking functions are an important component of information retrieval systems. Recently there has been a surge of research in the field of “learning to rank”, which aims at using labeled training data and machine learning algorithms to construct reliable ranking functions. Machine learning methods such as neural networks, support vector machines, and least squares have been successfully applied...

Journal

Journal title: ACM Transactions on Information Systems

Year: 2023

ISSN: 1558-1152, 1558-2868, 1046-8188, 0734-2047

DOI: https://doi.org/10.1145/3568394