Broadcast News Parsing using Visual Cues: A Robust Face Detection Approach
Abstract
Automatic content-based analysis and indexing of broadcast news recordings and digitized news archives is becoming an important tool for many interactive multimedia services, such as news summarization, browsing, retrieval and news-on-demand (NoD) applications. Existing approaches achieve high performance in such applications but rely heavily on textual cues such as closed-caption tokens and teletext transcripts. In this work we present an efficient technique for temporal segmentation and parsing of news recordings based on visual cues; it can be employed either as a stand-alone application for broadcasts without closed captions or integrated with the audio and textual cues of existing systems. The technique performs robust face detection by means of color segmentation, skin color matching and shape processing, and is able to identify typical news instances such as anchorpersons, reports and outdoor shots.
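As a rough illustration of what skin color matching followed by shape processing can look like, the sketch below classifies pixels by their chroma and then applies a crude bounding-box test. It is not the method of the paper: the abstract does not state the color space, thresholds or shape criteria used. The YCbCr conversion follows the standard full-range BT.601 formulas, while the chroma ranges, the aspect-ratio bounds, the fill threshold and all function names are assumptions chosen for illustration.

```python
def rgb_to_cbcr(r, g, b):
    """Convert an 8-bit RGB pixel to its (Cb, Cr) chroma pair (full-range BT.601)."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def is_skin(r, g, b, cb_range=(77, 127), cr_range=(133, 173)):
    """Skin color matching: accept pixels whose chroma lies in a fixed box.

    The default ranges are commonly quoted illustrative values, not values
    taken from the paper.
    """
    cb, cr = rgb_to_cbcr(r, g, b)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]


def skin_mask(image):
    """Build a boolean mask from an image given as rows of (r, g, b) tuples."""
    return [[is_skin(*pixel) for pixel in row] for row in image]


def face_like(mask, aspect_range=(0.8, 2.0), min_fill=0.5):
    """Crude shape test on the bounding box of all skin pixels.

    A candidate passes if the box's height/width ratio is roughly that of an
    upright face and the box is mostly filled with skin pixels. Both
    criteria are assumptions made for this sketch.
    """
    coords = [(y, x) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return False
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    aspect = height / width
    fill = len(coords) / (height * width)
    return aspect_range[0] <= aspect <= aspect_range[1] and fill >= min_fill


if __name__ == "__main__":
    SKIN, BLUE = (220, 170, 140), (30, 60, 200)
    # 6x4 blue frame with a 3x2 skin-toned block in the middle.
    frame = [[BLUE] * 4 for _ in range(6)]
    for y in range(1, 4):
        for x in range(1, 3):
            frame[y][x] = SKIN
    print(face_like(skin_mask(frame)))
```

A practical system would label connected skin regions separately and test each one, rather than taking a single bounding box over the whole frame as this sketch does.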
Similar resources
Adaptive anchor detection using online trained audio/visual model
An anchor person is the hosting character in broadcast programs. Anchor segments in video often provide the landmarks for detecting content boundaries, so it is important to identify such segments during automatic content-based multimedia indexing. Previous efforts are mostly focused on audio information (e.g. acoustic speaker models) or visual information (e.g. visual anchor model such ...
Improving broadcast news transcription with a precision grammar and discriminative reranking
We propose a new approach to integrating a precision grammar into speech recognition. The approach is based on a novel robust parsing technique and discriminative reranking. By reranking the 100-best output of the LIMSI German broadcast news transcription system, we achieved a significant relative reduction of the word error rate of 9.6%. To our knowledge, this is the first significant improvement f...
Inter-video Similarity for Video Parsing
In this paper we present a method for automatic detection of visual patterns in a given news video format by investigating similarities in a set of videos of that format. The approach aims at reducing the manual effort needed to create models of news broadcast formats for automatic video indexing and retrieval. Our algorithm has only very few parameters and can be run fully unsupervised. It sho...
UCBN: A new audio-visual broadcast news corpus for multimodal speaker verification studies
The performance of face, voice, and multimodal speaker verification systems in complex and non-controlled scenarios is typically lower than that of systems developed in highly controlled environments. With the aim of facilitating the development of robust multimodal speaker recognition systems, a new multimodal (audio-visual) Australian broadcast UCBN (University of Canberra Broadcast News) corpus was...
Broadcast News Story Boundary Detection Using Visual, Audio and Text Features
News video story segmentation is vital for video summarization, story linking, and curation. We present a multimodal segmentation algorithm which fuses video, audio and text cues for story boundary detection. We show that broadcast news closed captioning is a rich and readily available source that improves story boundary detection. Furthermore, we propose an empirical distribution-based feature...