Ranking the NTCIR Systems Based on Multigrade Relevance
Abstract
At NTCIR-4, new retrieval effectiveness metrics called Q-measure and R-measure were proposed for evaluation based on multigrade relevance. This paper first shows, through a theoretical analysis, that Q-measure inherits both the reliability of noninterpolated Average Precision and the multigrade relevance capability of Average Weighted Precision. It then verifies this claim experimentally by ranking the systems submitted to the NTCIR-3 CLIR Task. The experiments confirm that the Q-measure ranking is very highly correlated with the Average Precision ranking and that Q-measure is more reliable than Average Weighted Precision.
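The relationship among the three metrics named in the abstract can be illustrated with a minimal sketch. It assumes the commonly cited definitions: Average Precision averages count(r)/r over relevant ranks, Average Weighted Precision averages cg(r)/cig(r), and Q-measure averages (cg(r) + count(r)) / (cig(r) + r). The function name `evaluate_run`, the gain values used in the example (3 and 1), and the equal weighting of gain and count are assumptions made for illustration; they are not taken from this page.

```python
from itertools import accumulate


def evaluate_run(run_gains, relevant_gains):
    """Compute AP, AWP and Q-measure for one topic (illustrative sketch).

    run_gains:      gain of the document at each rank of the system output
                    (0 means the document is not relevant).
    relevant_gains: gain of every relevant document known for the topic.
    """
    num_relevant = len(relevant_gains)
    if num_relevant == 0:
        raise ValueError("the topic needs at least one relevant document")

    # Ideal ranking: all relevant documents in decreasing order of gain,
    # padded with zeros so the ideal cumulative gain exists at every rank.
    ideal = sorted(relevant_gains, reverse=True)
    ideal += [0] * max(0, len(run_gains) - num_relevant)
    ideal_cum_gain = list(accumulate(ideal))

    ap = awp = q = 0.0
    count = 0      # relevant documents seen so far
    cum_gain = 0   # cumulative gain of the system output so far
    for rank, gain in enumerate(run_gains, start=1):
        if gain > 0:
            count += 1
            cum_gain += gain
            cig = ideal_cum_gain[rank - 1]
            ap += count / rank
            awp += cum_gain / cig
            q += (cum_gain + count) / (cig + rank)

    return {
        "AP": ap / num_relevant,
        "AWP": awp / num_relevant,
        "Q": q / num_relevant,
    }


# Hypothetical topic with two relevant documents (gains 3 and 1);
# the system returns a nonrelevant document first.
scores = evaluate_run([0, 1, 3], [3, 1])
print(scores)
```

In this sketch the denominator of Q-measure keeps growing with the rank, as it does in Average Precision, whereas the denominator of Average Weighted Precision stops growing once the ideal ranking has run out of relevant documents.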
Similar resources
New Performance Metrics Based on Multigrade Relevance: Their Application to Question Answering
This paper proposes two new Information Retrieval performance metrics based on multigrade relevance, called Q-measure and R-measure, which are akin to Cumulative Gain and Average Weighted Precision but are arguably more reliable. We then show how Q-measure can be applied to Question Answering involving ranked lists of exact answers, and discuss its advantages over Reciprocal Rank through an expe...
Ranking Retrieval Systems without Relevance Assessments: Revisited
We re-examine the problem of ranking retrieval systems without relevance assessments in the context of collaborative evaluation forums such as TREC and NTCIR. The problem was first tackled by Soboroff, Nicholas and Cahan in 2001, using data from TRECs 3-8 [16]. Our long-term goal is to semi-automate repeated evaluation of search engines; our short-term goal is to provide NTCIR participants with...
Toshiba BRIDJE at NTCIR-4 CLIR: Monolingual/Bilingual IR and Flexible Feedback
Toshiba participated in the Monolingual/Bilingual tasks at NTCIR-4 CLIR using our CLIR system called BRIDJE. We submitted 24 runs covering three topic languages (Japanese, English and Chinese) and two document languages (Japanese and English) and achieved the highest performance in the E-J-D, C-J-D, C-J-T, E-E-D, J-E-D, and J-E-T subtasks. We had 12 more runs which we were not allowed to submit due...
Experiments on Cross-language and Patent Retrieval at NTCIR-3 Workshop
The Berkeley group participated in the cross-language retrieval task and the patent retrieval task at the third NTCIR workshop. This paper describes our experiments on cross-language and patent retrieval. We present an automatic relevance feedback procedure for a document ranking formula based on logistic regression, and a procedure for automatically extracting Chinese/Japanese translations of Eng...
HITS' Graph-based System at the NTCIR-9 Cross-lingual Link Discovery Task
This paper presents HITS’ system for the NTCIR-9 cross-lingual link discovery task. We solve the task in three stages: (1) anchor identification and ambiguity reduction, (2) graph-based disambiguation combining different relatedness measures as edge weights for a maximum edge weighted clique algorithm, and (3) supervised relevance ranking. In the file-to-file evaluation with Wikipedia ground-truth...