Pitt at TREC 2005: HARD and Enterprise
نویسندگان
چکیده
The University of Pittsburgh team participated in two tracks for TREC 2005: the High Accuracy Retrieval from Documents (HARD) track and the Enterprise Retrieval track. The goal of Pitt’s HARD study in TREC 2005 was to examine the effectiveness of applying Self Organizing Maps (SOM) as a visual presentation tool and as a clustering tool in the context of HARD tasks, especially its role in clarification form generation. Our experiment results demonstrate that SOM can be used as a clustering tool to generate terms for query expansion based on interactive relevance feedback. It produced significant improvement over the baseline when measured by R-Prec. However, its effectiveness of being a visualization tool for users to make relevance feedback still needs careful examination and further studies. Our goal in this year’s enterprise search track was to study the effect of query expansion based on an expansion corpus in retrieving emails from an email corpus. The expansion corpus consisted of the WWW, People and ESW sub-collections of the W3C test collection. The results indicate that query expansion based on the expansion corpus can achieve significant improvement over the no expansion baselines. However, there is no significant difference to the simpler query expansion approach using blind relevance feedback. Interestingly the terms used in these two query expansion approaches are different, with averagely only 6 term overlap among 20 possible terms. Further study is needed for examining the effect of combining these two approaches.
منابع مشابه
The Lowlands' TREC Experiments 2005
This paper describes our participation to the TREC HARD track (High Accuracy Retrieval of Documents) and the TREC Enterprise track. The main goal of our HARD participation is the development and evaluation of so-called query profiles: Short summaries of the retrieved results that enable the user to perform more focused search, for instance by zooming in on a particular time period. The main goa...
متن کاملPitt at TREC 2006: Identifying Experts via Email Discussions
Identifying experts in a certain domain or a subject area has always been a challenge in various settings including commercial, academia, and governmental institutions. Our interests in this year’s TREC Enterprise track are to utilize the email communications as the basis for identifying experts and their expertise on certain topics. In this report, we presented a method for identifying experts...
متن کاملUniversity of Glasgow at TREC 2005: Experiments in Terabyte and Enterprise Tracks with Terrier
With our participation in TREC 2005, we continue experiments using Terrier, a modular and scalable Information Retrieval (IR) framework, in 4 tasks from the Terabyte and Enterprise tracks. In the Terabyte track, we investigate new Divergence From Randomness weighting models, and a novel query expansion approach that can take into account various Web evidence, namely content, title and anchor te...
متن کاملA Menagerie of Tracks at Maryland: HARD, Enterprise, QA, and Genomics, Oh My!
This year, the University of Maryland participated in four separate tracks: HARD, enterprise, question answering, and genomics. Our HARD experiments involved a trained intermediary who searched for documents on behalf of the user, created clarification forms manually, and exploited user responses accordingly. The aim was to better understand the nature of single-iteration clarification dialogs ...
متن کامل