Transcrigal: A Bilingual System for Automatic Indexing of Broadcast News

نویسندگان

  • Carmen García-Mateo
  • Javier Dieguez-Tirado
  • Laura Docío Fernández
  • Antonio Cardenal López
چکیده

This paper describes a Broadcast News (BN) database called Transcrigal-DB. The news shows are mainly in Galician language, although around 11% of data is in Spanish. This database has been constructed for automatic speech recognition (ASR) purposes. A BN-ASR reference system is also described and evaluated on the test partition of Transcrigal-DB. The reference system has been designed having in mind that both languages, Spanish and Galician, may be used. Performance of the reference is improved when language adaptation techniques are taken into consideration.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Language Resources for a Bilingual Automatic Index System of Broadcast News in Basque and Spanish

Automatic Indexing of Broadcast News is a developing research area of great recent interest [1]. This paper describes the development steps for designing an automatic index system of broadcast news for both Basque and Spanish. This application requires of appropriate Language Resources to design all the components of the system. Nowadays, large and well-defined resources can be found in most wi...

متن کامل

Development of Resources for a Bilingual Automatic Index System of Broadcast News in Basque and Spanish

The development of an automatic index system of broadcast news requires appropriate Video and Language Resources (LR) to design all the components of the system. Nowadays, large and well-defined resources can be found in most widely used languages (Informedia), but there is a lot of work to do with respect to minority languages. The main goal of this work is the design of resources in Basque an...

متن کامل

Speaker role based structural classification of broadcast news stories

This paper is concerned with automatic classification of broadcast news stories based on speaker roles such as anchor, reporter and others. The story classification is the first step for many related tasks such as browsing, indexing, and summarising the news broadcast. We use broadcast news audio and its automatic speech recogniser transcripts to implement the classification system. It builds o...

متن کامل

Recognition, indexing and retrieval of british broadcast news with the THISL system

This paper described the THISL spoken document retrieval system for British and North American Broadcast News. The system is based on the ABBOT large vocabulary speech recognizer and a probabilistic text retrieval system. We discuss the development of a realtime British English Broadcast News system, and its integration into a spoken document retrieval system. Detailed evaluation is performed u...

متن کامل

Content-based video indexing of TV broadcast news using hidden Markov models

This paper presents a new approach to content-based video indexing using Hidden Markov Models (HMMs). In this approach one feature vector is calculated for each image of the video sequence. These feature vectors are modeled and classified using HMMs. This approach has many advantages compared to other video indexing approaches. The system has automatic learning capabilities. It is trained by pr...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004