Multi-View Approach for Speaker Turn Role Labeling in TV Broadcast News Shows
Abstract
This paper addresses speaker role recognition in TV broadcast news shows. Each speaker turn is assigned one of three roles: anchor, reporter, or other. A multi-view approach is proposed that exploits the complementarity of lexical cues obtained from Automatic Speech Recognition output and acoustic cues obtained from speech signal analysis. Early and late fusion are compared. A classification accuracy of 90.1% is obtained on automatically segmented speaker turns for a 6.5-hour test corpus of 14 shows mixing news and conversational speech. Further analyses of the speaker turns labeled "other" suggest promising directions towards finer-grained speaker role characterization.
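The abstract contrasts early and late fusion of the lexical and acoustic views without giving details. The sketch below only illustrates the general distinction and is not the authors' system: the function names, the feature dimensions, the posterior values, and the equal weighting are all assumptions made for the example. Early fusion joins the per-turn feature vectors of both views before a single classifier is trained; late fusion combines the class posteriors produced by one classifier per view.

```python
import numpy as np

# Role inventory taken from the abstract.
ROLES = ["anchor", "reporter", "other"]


def early_fusion(lexical_feats, acoustic_feats):
    """Early fusion: concatenate the two views' feature vectors per turn.

    Both inputs have shape (n_turns, n_features_of_view). The result would
    be fed to a single classifier (not shown here).
    """
    return np.concatenate([lexical_feats, acoustic_feats], axis=1)


def late_fusion(p_lexical, p_acoustic, weight=0.5):
    """Late fusion: weighted average of per-view class posteriors.

    `weight` is the (hypothetical) trust placed in the lexical view;
    the acoustic view receives 1 - weight.
    """
    fused = weight * p_lexical + (1.0 - weight) * p_acoustic
    # Renormalize so each row is again a probability distribution.
    return fused / fused.sum(axis=1, keepdims=True)


def predict_roles(posteriors):
    """Pick the most probable role for each speaker turn."""
    return [ROLES[i] for i in posteriors.argmax(axis=1)]


# Made-up posteriors for two speaker turns (illustrative values only).
p_lex = np.array([[0.6, 0.3, 0.1],
                  [0.2, 0.3, 0.5]])
p_ac = np.array([[0.2, 0.7, 0.1],
                 [0.1, 0.2, 0.7]])

fused_posteriors = late_fusion(p_lex, p_ac)
fused_roles = predict_roles(fused_posteriors)

# Made-up feature matrices: 4 lexical and 3 acoustic features per turn.
joint_feats = early_fusion(np.zeros((2, 4)), np.ones((2, 3)))
```

In this toy case the first turn is labeled differently by the two views, and the fused decision follows whichever view is more confident. How the views are actually weighted or combined in the paper is not stated in the abstract.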
Similar resources
A speaker-role based approach for detecting politicians in TV broadcast news
Politician speaker turn detection in TV Broadcast News shows is addressed in this paper. Politician speech model combines acoustical and lexical cues as well as contextual information, and does not use any specific politician model (person-independent). Politician speaker turn detection is coupled with an automatic role labeling step, which determines the contextual information and the set on w...
Initial Study on Automatic Identification of Speaker Role in Broadcast News Speech
Identifying a speaker’s role (anchor, reporter, or guest speaker) is important for finding the structural information in broadcast news speech. We present an HMM-based approach and a maximum entropy model for speaker role labeling using Mandarin broadcast news speech. The algorithms achieve classification accuracy of about 80% (compared to the baseline of around 50%) using the human transcripti...
The need to create a media block for the convergence of overseas news networks
As a general diplomacy arm of the Islamic Republic of Iran, VoSiMa has extensive activities in international broadcasting of its radio and television programs. These programs are broadcast in different languages, such as English, French, Azeri, Arabic, and ... for regional and transnational audiences. The large volume of the organization's international activities is in the form of news and new...
Speaker Change Detection in Broadcast TV Using Bidirectional Long Short-Term Memory Networks
Speaker change detection is an important step in a speaker diarization system. It aims at finding speaker change points in the audio stream. In this paper, it is treated as a sequence labeling task and addressed by Bidirectional long short term memory networks (Bi-LSTM). The system is trained and evaluated on the Broadcast TV subset from ETAPE database. The result shows that the proposed model ...
UCBN: A new audio-visual broadcast news corpus for multimodal speaker verification studies
The performance of face, voice, and multimodal speaker verification systems in complex and non-controlled scenarios is typically lower than that of systems developed in highly controlled environments. To facilitate the development of robust multi-modal speaker recognition systems, a new multi-modal (audio-visual) Australian broadcast UCBN (University of Canberra Broadcast News) corpus was...