Modeling Broadcast News Prosody Using Conditional Random Fields for Story Segmentation

Authors

  • Xiaoxuan Wang
  • Lei Xie
  • Bin Ma
  • Eng Siong Chng
  • Haizhou Li
Abstract

This paper proposes modeling broadcast news prosody with conditional random fields (CRFs) for news story segmentation. Broadcast news carries both editorial prosody and speech prosody, each of which conveys structural information essential for story segmentation. We therefore extract prosodic features (pause duration, pitch, intensity, rapidity, speaker change and music) for a sequence of boundary candidates. A linear-chain CRF then labels each candidate as boundary or non-boundary based on these features. The CRF's sequential learning framework captures important inter-label relations and contextual feature interactions. Experiments show that the CRF approach outperforms decision tree (DT), support vector machine (SVM) and maximum entropy (ME) classifiers in prosody-based story segmentation.
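To illustrate why sequence labeling can differ from classifying each boundary candidate on its own, here is a minimal sketch of Viterbi decoding for a two-label linear-chain model. It is not the authors' implementation: the feature set (pause duration, a speaker-change flag and a bias term), the weights, the transition penalty and the function names `emission_scores` and `viterbi` are all hypothetical values chosen for illustration, and no training is shown.

```python
# Minimal sketch of linear-chain decoding over boundary candidates.
# All weights and feature values below are made up for illustration;
# they are NOT taken from the paper.

LABELS = ("non-boundary", "boundary")  # label indices 0 and 1


def emission_scores(features, weights):
    """Score each label for one candidate as a weighted sum of its features."""
    return [sum(w * f for w, f in zip(weights[y], features))
            for y in range(len(LABELS))]


def viterbi(emissions, transitions):
    """Return the highest-scoring label index sequence.

    emissions[t][y]   : score of label y at candidate t
    transitions[p][y] : score of moving from label p to label y
    """
    if not emissions:
        return []
    k = len(LABELS)
    score = [list(emissions[0])]
    back = []
    for t in range(1, len(emissions)):
        row, ptr = [], []
        for y in range(k):
            best_prev = max(range(k),
                            key=lambda p: score[t - 1][p] + transitions[p][y])
            row.append(score[t - 1][best_prev] + transitions[best_prev][y]
                       + emissions[t][y])
            ptr.append(best_prev)
        score.append(row)
        back.append(ptr)
    # Trace back from the best final label.
    y = max(range(k), key=lambda j: score[-1][j])
    path = [y]
    for ptr in reversed(back):
        y = ptr[y]
        path.append(y)
    path.reverse()
    return path


# Hypothetical candidates: [pause duration in seconds, speaker change, bias].
candidates = [
    [0.1, 0.0, 1.0],
    [1.2, 1.0, 1.0],
    [1.6, 0.0, 1.0],
    [0.2, 0.0, 1.0],
]
# Hypothetical weights: row 0 scores "non-boundary", row 1 scores "boundary".
weights = [
    [0.0, 0.0, 0.0],
    [2.0, 1.5, -3.0],
]
# Hypothetical transitions: penalize two story boundaries in a row.
transitions = [
    [0.0, 0.0],
    [0.0, -2.0],
]

emissions = [emission_scores(c, weights) for c in candidates]
independent = [max(range(len(LABELS)), key=lambda y: e[y]) for e in emissions]
sequence = viterbi(emissions, transitions)

print("independent:", [LABELS[y] for y in independent])
print("sequence:   ", [LABELS[y] for y in sequence])
```

With these made-up numbers, scoring each candidate independently marks both the second and third candidates as boundaries, while the sequence decoder keeps only the stronger second one because the transition penalty discourages adjacent boundaries. This is the kind of inter-label relation a linear-chain model can represent and a per-candidate classifier cannot.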

Related papers

Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features

This paper proposes to integrate multi-modal features using conditional random fields (CRF) for broadcast news story segmentation. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness, acoustic features involve pause duration, pitch, speaker change and audio event type, and visual fea...

Discovery and fusion of salient multimodal features toward news story segmentation

In this paper, we present our new results in news video story segmentation and classification in the context of TRECVID video retrieval benchmarking event 2003. We applied and extended the Maximum Entropy statistical model to effectively fuse diverse features from multiple levels and modalities, including visual, audio, and text. We have included various features such as motion, face, music/spe...

Discovery and Fusion of Salient Multi-modal Features towards News Story Segmentation

In this paper, we present our new results in news video story segmentation and classification in the context of TRECVID video retrieval benchmarking event 2003. We applied and extended the Maximum Entropy statistical model to effectively fuse diverse features from multiple levels and modalities, including visual, audio, and text. We have included various features such as motion, face, music/spe...

Broadcast News Story Boundary Detection Using Visual, Audio and Text Features

News video story segmentation is vital for video summarization, story linking, and curation. We present a multimodal segmentation algorithm which fuses video, audio and text cues for story boundary detection. We show that broadcast news closed captioning is a rich and readily available source that improves story boundary detection. Furthermore, we propose an empirical distribution-based feature...

News Video Story Segmentation Using Fusion of Multi-level Multi-modal Features in Trecvid

In this paper, we present our new results in news video story segmentation and classification in the context of TRECVID video retrieval benchmarking event 2003. We applied and extended the Maximum Entropy statistical model to effectively fuse diverse features from multiple levels and modalities, including visual, audio, and text. We have included various features such as motion, face, music/spe...

Publication date: 2010