Promoting Ranking Diversity for Biomedical Information Retrieval Using Wikipedia
Abstract
Traditional information retrieval models assume that the relevance of a document is independent of the relevance of other documents. In practice, this assumption may not hold: the usefulness of a retrieved document usually depends on the documents ranked before it, since a user may want the top-ranked documents to cover different aspects of their information need rather than deliver redundant information. In this talk, I will discuss how to find relevant documents that cover more distinct aspects of a query. In particular, I will discuss new models derived from survival analysis theory for measuring aspect novelty, and how to use Wikipedia to detect the aspects covered by retrieved documents. I will introduce an aspect filter based on a two-stage model and present a new re-ranking method that combines the novelty and the relevance of a retrieved document at the aspect level. Through extensive experiments on standard large-scale TREC biomedical collections, I will show that the proposed models and methods are effective in promoting ranking diversity for biomedical information retrieval.
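The idea of combining relevance and novelty at the aspect level can be illustrated with a minimal sketch. It is not the survival-analysis model or the two-stage aspect filter from the talk: it assumes aspects have already been detected for each document (for example, as Wikipedia-derived labels), uses a simple linear mix of relevance and the share of not-yet-covered aspects, and the function name `rerank_by_aspect_novelty` and the `weight` parameter are hypothetical.

```python
from typing import Dict, List, Set


def rerank_by_aspect_novelty(
    relevance: Dict[str, float],
    aspects: Dict[str, Set[str]],
    weight: float = 0.5,
) -> List[str]:
    """Greedily order documents by a mix of relevance and aspect novelty.

    Assumption: novelty is the fraction of a document's aspects that no
    higher-ranked document has covered yet (an illustrative stand-in).
    """
    remaining = set(relevance)
    seen: Set[str] = set()      # aspects covered by documents ranked so far
    ranking: List[str] = []

    while remaining:
        def score(doc: str) -> float:
            doc_aspects = aspects.get(doc, set())
            novelty = (
                len(doc_aspects - seen) / len(doc_aspects) if doc_aspects else 0.0
            )
            return weight * relevance[doc] + (1.0 - weight) * novelty

        # Iterating in sorted order makes ties resolve to the smallest id.
        best = max(sorted(remaining), key=score)
        ranking.append(best)
        seen |= aspects.get(best, set())
        remaining.remove(best)

    return ranking


relevance = {"d1": 0.9, "d2": 0.8, "d3": 0.5}
aspects = {"d1": {"a"}, "d2": {"a"}, "d3": {"b"}}
diversified = rerank_by_aspect_novelty(relevance, aspects, weight=0.5)
relevance_only = rerank_by_aspect_novelty(relevance, aspects, weight=1.0)
```

In this toy input, `d2` is more relevant than `d3` but repeats the aspect already covered by `d1`, so the diversified ordering places `d3` ahead of it, while `weight=1.0` falls back to a pure relevance ordering.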
Similar resources
Learning to rank diversified results for biomedical information retrieval from multiple features
BACKGROUND Different from traditional information retrieval (IR), promoting diversity in IR takes consideration of relationship between documents in order to promote novelty and reduce redundancy thus to provide diversified results to satisfy various user intents. Diversity IR in biomedical domain is especially important as biologists sometimes want diversified results pertinent to their query....
KISTI at TREC 2014 Clinical Decision Support Track: Concept-based Document Re-ranking to Biomedical Information Retrieval
With fast development of medical information systems and software, clinical decision support (CDS) systems continue to develop new methods to deal with diverse information coming from heterogeneous sources such as a large volume of electronic medical records (EMRs), patient genomic data, existing genomic pharmaceutical databases, curated disease-specific databases, peer-reviewed research, etc. ...
The Impact of Document Level Ranking on Focused Retrieval
Document retrieval techniques have proven to be competitive methods in the evaluation of focused retrieval. Although focused approaches such as XML element retrieval and passage retrieval allow for locating the relevant text within a document, using the larger context of the whole document often leads to superior document level ranking. In this paper we investigate the impact of using the docum...
Two-dimensional ranking of Wikipedia articles
The Library of Babel, described by Jorge Luis Borges, stores an enormous amount of information. The Library exists ab aeterno. Wikipedia, a free online encyclopaedia, becomes a modern analogue of such a Library. Information retrieval and ranking of Wikipedia articles become the challenge of modern society. While PageRank highlights very well known nodes with many ingoing links, CheiRank highlig...
Publication date: 2010