Two-stage Story Segmentation and Detection on Broadcast News Using Genetic Algorithm
Authors
Abstract
This paper proposes a two-stage approach to story segmentation and detection for Mandarin broadcast news. In the first stage, a topic classifier is constructed to identify the topic of the broadcast news within a sliding window and to determine the potential story boundaries. In the second stage, story segmentation is recast as the problem of finding a chromosome (a number sequence) in a search space, and a genetic algorithm is used to determine this chromosome globally; the resulting chromosome represents the final story boundaries. A topic strength measure is defined as the fitness function of the genetic algorithm. To evaluate the proposed approach, word-based and syllable-based story segmentation systems were constructed. Experimental results show that the proposed method outperforms Makhoul's method for segmentation and detection on Mandarin broadcast news, with a missing probability of 32.94% and a false alarm probability of 22.83%.
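The second stage described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's formulation: the chromosome is encoded as a binary mask over the candidate boundaries, the hypothetical `topic_strength` function is a toy stand-in for the paper's topic strength measure (it counts windows that agree with their segment's majority topic and subtracts a penalty per boundary), and the function names, population size, and mutation rate are arbitrary.

```python
import random
from collections import Counter


def topic_strength(labels, boundaries, penalty=1.0):
    """Toy fitness (assumed, not the paper's measure): number of windows that
    agree with the majority topic of their segment, minus a per-boundary penalty."""
    cuts = [0] + sorted(boundaries) + [len(labels)]
    agree = 0
    for start, end in zip(cuts, cuts[1:]):
        segment = labels[start:end]
        if segment:
            agree += Counter(segment).most_common(1)[0][1]
    return agree - penalty * len(boundaries)


def decode(chromosome, candidates):
    """Map a binary chromosome to the candidate boundaries it selects."""
    return [c for bit, c in zip(chromosome, candidates) if bit]


def genetic_segmentation(labels, candidates, pop_size=20, generations=40,
                         mutation_rate=0.1, seed=0):
    """Search for the subset of candidate boundaries with the highest fitness."""
    rng = random.Random(seed)
    n = len(candidates)

    def fitness(chromosome):
        return topic_strength(labels, decode(chromosome, candidates))

    # Start from "no boundaries", "all boundaries", and random chromosomes.
    population = [[0] * n, [1] * n]
    population += [[rng.randint(0, 1) for _ in range(n)]
                   for _ in range(pop_size - 2)]

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = [population[0][:]]          # elitism: keep the current best
        parents = population[:pop_size // 2]   # truncation selection
        while len(next_gen) < pop_size:
            a, b = rng.sample(parents, 2)
            point = rng.randint(1, n - 1) if n > 1 else 0
            child = a[:point] + b[point:]      # one-point crossover
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]         # bit-flip mutation
            next_gen.append(child)
        population = next_gen

    return decode(max(population, key=fitness), candidates)


# Toy input: one classifier label per sliding window, plus the candidate
# boundary positions that the first stage would propose.
labels = ["sports"] * 4 + ["politics"] * 4 + ["weather"] * 3
candidates = [2, 4, 6, 8, 10]
print(genetic_segmentation(labels, candidates))
```

The point of the global search is that boundaries are scored jointly rather than one at a time: in this toy fitness, a candidate is kept only if splitting there raises the total agreement by more than the penalty.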
Related Articles
Broadcast News Story Boundary Detection Using Visual, Audio and Text Features
News video story segmentation is vital for video summarization, story linking, and curation. We present a multimodal segmentation algorithm which fuses video, audio and text cues for story boundary detection. We show that broadcast news closed captioning is a rich and readily available source that improves story boundary detection. Furthermore, we propose an empirical distribution-based feature...
Story Segmentation and Topic Detection in the Broadcast News Domain
In this paper we present algorithms for story segmentation and topic detection. Both algorithms are online algorithms and use a combination of machine learning, statistical natural language processing and information retrieval techniques. The story segmentation algorithm is a two-stage algorithm that uses a decision tree based probabilistic model in the first stage and incorporates aspects of our...
Spoken and Written News Story Segmentation Using Lexical Chains
In this paper we describe a novel approach to lexical chain based segmentation of broadcast news stories. Our segmentation system SeLeCT is evaluated with respect to two other lexical cohesion based segmenters TextTiling and C99. Using the Pk and WindowDiff evaluation metrics we show that SeLeCT outperforms both systems on spoken news transcripts (CNN) while the C99 algorithm performs best on t...
Story Segmentation of Broadcast News in English, Mandarin and Arabic
In this paper, we present results from a Broadcast News story segmentation system developed for the SRI NIGHTINGALE system operating on English, Arabic and Mandarin news shows to provide input to subsequent question-answering processes. Using a rule-induction algorithm with automatically extracted acoustic and lexical features, we report success rates that are competitive with state-of-the-art s...
Story Segmentation and Detection of Commercials in Broadcast News Video
The Informedia Digital Library Project [Wactlar96] allows full content indexing and retrieval of text, audio and video material. Segmentation is an integral process in the Informedia digital video library. The success of the Informedia project hinges on two critical assumptions: that we can extract sufficiently accurate speech recognition transcripts from the broadcast audio and that we can seg...