Improving query expansion using pseudo-relevant web knowledge for information retrieval
Abstract
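• The web knowledge-based query expansion technique uses the top N pseudo-relevant web pages.
• The top pseudo-relevant web pages returned by the search engines Google, Bing, and DuckDuckGo serve as data sources.
• The proposed weighting techniques properly capture term relationships while weighting the expansion terms.
• Experimental results show an improvement of 25.89% over unexpanded queries on the FIRE dataset.

In the field of information retrieval, query expansion (QE) has long been used as a technique to deal with the fundamental issue of word mismatch between a user's query and the target information. In the context of the relationship between the query and the expanded terms, existing weighting techniques often fail to appropriately capture the term-term relationship and the term-to-whole-query relationship, resulting in low retrieval effectiveness. Our proposed QE approach addresses this by proposing three weighting models based on (1) tf-idf, (2) k-nearest neighbor (kNN) based cosine similarity, and (3) correlation score. Further, to extract the initial set of expanded terms, we use pseudo-relevant web knowledge consisting of the top N web pages returned by three popular search engines, namely Google, Bing, and DuckDuckGo, in response to the original query. Among the three weighting models, tf-idf scores each of the individual terms obtained from the web content, kNN-based cosine similarity scores the expansion terms to obtain the term-term relationship, and the correlation score weighs the selected expansion terms with respect to the whole query. The proposed model, called web knowledge-based query expansion (WKQE), achieves an improvement of 25.89% in Mean Average Precision (MAP) and 30.83% in Geometric Mean Average Precision (GMAP) over unexpanded queries on the FIRE dataset. A comparative analysis of WKQE with other related approaches clearly shows a significant improvement in retrieval performance. We also analyzed the effect of varying the number of pseudo-relevant documents on the retrieval effectiveness of the proposed model.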
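The abstract names tf-idf and cosine similarity as two of the weighting models applied to terms drawn from pseudo-relevant web pages. The following is a minimal Python sketch of that general idea, not the authors' implementation. The function names (`tokenize`, `tfidf_scores`, `term_vector`, `cosine`, `expand`), the smoothed idf variant, the use of per-page occurrence counts as term vectors, and computing idf over the retrieved pages only are all assumptions made for illustration. The kNN selection step and the correlation score are not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of ASCII letters (a deliberate simplification).
    return re.findall(r"[a-z]+", text.lower())


def tfidf_scores(pages):
    """Sum tf * idf for every term over the pseudo-relevant pages."""
    docs = [Counter(tokenize(p)) for p in pages]
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(d.keys())  # document frequency: one count per page
    scores = Counter()
    for d in docs:
        total = sum(d.values())
        for term, count in d.items():
            # Smoothed idf; this particular variant is an assumption.
            idf = math.log((1 + n) / (1 + df[term])) + 1
            scores[term] += (count / total) * idf
    return scores


def term_vector(term, pages):
    # Represent a term by its occurrence count in each page (assumed encoding).
    return [tokenize(p).count(term) for p in pages]


def cosine(u, v):
    # Cosine similarity between two equal-length vectors; 0.0 if either is zero.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand(query, pages, k=3):
    """Return the k highest-scoring terms that are not already in the query."""
    query_terms = set(tokenize(query))
    ranked = tfidf_scores(pages).most_common()
    return [t for t, _ in ranked if t not in query_terms][:k]


if __name__ == "__main__":
    # Toy stand-ins for the text of top-N pages returned by a search engine.
    pages = [
        "query expansion improves retrieval",
        "query expansion adds related terms",
        "retrieval uses terms",
    ]
    print(expand("query expansion", pages, k=2))
    print(cosine(term_vector("query", pages), term_vector("retrieval", pages)))
```

In this sketch, `expand` ranks candidates by tf-idf alone. A fuller pipeline along the lines of the abstract would additionally use `cosine` to relate candidate terms to individual query terms, and then a correlation score against the whole query.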
Similar resources
Query Expansion for Web Information Retrieval
Information retrieval (IR) systems utilize user feedback for generating optimal queries with respect to a particular information need. However, the methods that have been developed in IR for generating these queries do not memorize information gathered from previous search processes, and hence cannot use such information in new search processes. Thus a new search process cannot profit from the...
Improving Query Expansion for Information Retrieval Using Wikipedia
Query expansion (QE) is one of the key technologies to improve retrieval efficiency. Many studies on query expansion with relationships from a single local corpus suffer from two problems resulting in low retrieval performance: term relationships are limited and unlisted query terms have no expansion terms. To address these problems, relationships between terms captured from Wikipedia are superim...
Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology
Due to the growth of the web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...
Using Query-Relevant Documents Pairs for Cross-Lingual Information Retrieval
The world wide web is a natural setting for cross-lingual information retrieval. The European Union is a typical example of a multilingual scenario, where multiple users have to deal with information published in at least 20 languages. Given queries in some source language and a target corpus in another language, the typical approximation consists in translating either the query or the target d...
Knowledge-Based Approaches to Query Expansion in Information Retrieval
Textual information is becoming increasingly available in electronic forms. Users need tools to sift through non-relevant information and retrieve only those pieces relevant to their needs. The traditional methods such as Boolean operators and key terms have somehow reached their limitations. An emerging trend is to combine the traditional information retrieval and artificial intelligence techn...
Journal
Journal title: Pattern Recognition Letters
Year: 2022
ISSN: 1872-7344, 0167-8655
DOI: https://doi.org/10.1016/j.patrec.2022.04.013