News Topic Tracking and Re-ranking with Query Expansion Based on Near-Duplicate Detection
Abstract
Increases in digital storage capacity have enabled the creation of large-scale news video archives. To make full use of such an archive, it is necessary to grasp how news stories develop and depend on one another. To address this problem, we investigate methods for tracking and re-ranking news stories. The archive used as a test bed consists of more than 30,000 news stories. This paper proposes a novel scheme for mining topic-related stories through a query-expansion algorithm based on near duplicates, built on top of text. Experiments showed that the query-expansion algorithm based on near-duplicate constraints outperformed traditional methods that use only textual features.
Similar resources
Cross-Lingual Retrieval of Identical News Events by Near-Duplicate Video Segment Detection
Recently, for reusing large quantities of accumulated news video, technology for news topic searching and tracking has become necessary. Moreover, since we need to understand a certain topic from various viewpoints, we focus on identical event detection in various news programs from different countries. Currently, text information is generally used to retrieve news video. However, cross-lingual...
PageRank with Text Similarity and Video Near-Duplicate Constraints for News Story Re-ranking
Pseudo-relevance feedback is a popular and widely accepted query reformulation strategy for document retrieval and re-ranking. However, problems arise when documents assumed to be relevant are actually irrelevant, which causes the focus of the reformulated query to drift. This paper focuses on news story retrieval and re-ranking, and offers a new perspective through the exploratio...
Re-ranking Method Based on Topic Word Pairs
Improving the ranking of relevant documents plays a key role in information retrieval. In this paper, a re-ranking approach based on topic word pairs is proposed to improve precision while preserving recall. Each topic word pair contains two correlated words, one of which is the original query word while the other comes from the documents. The selection is based on Probabilistic Latent ...
Towards Auto-Documentary: Tracking the evolution of news in time
News videos constitute an important source of information for tracking and documenting important events. In these videos, news stories are often accompanied by short video clips that tend to be repeated during the course of the event. Automatic detection of such repetitions is essential for creating auto-documentaries. In this paper, we propose methods for detecting and tracking the evolution o...
Tags Re-ranking Using Multi-level Features in Automatic Image Annotation
Automatic image annotation is a process in which computer systems automatically assign textual tags related to the visual content of a query image. In most cases, inappropriate user-generated tags and images without any tags, both among the challenges in this field, have a negative effect on the query's result. In this paper, a new method is presented for automatic image...