Real-time spoken language identification and recognition for speech-to-speech translation
Abstract
For spoken language systems to operate effectively across multiple languages, it is critical to rapidly apply the correct language-specific speech recognition models. Prior approaches either first identify the language being spoken and then select the appropriate language-specific speech recognition engine, or perform speech recognition in parallel and select the language and recognition hypothesis with maximum likelihood. Both approaches, however, introduce a significant delay before back-end natural language processing can proceed. In this work, we propose a novel method for joint language identification and speech recognition that can operate in near real-time. The proposed approach compares partial hypotheses generated on-the-fly during decoding and makes a classification decision soon after the first full hypothesis has been generated. When applied within our English-Iraqi speech-to-speech translation system, the proposed approach correctly identified the input language with 99.6% accuracy while introducing minimal delay to the end-to-end system.
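The decision strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `PartialHypothesis` type, the `identify_language` function, and the per-frame normalized log-likelihood comparison are all assumptions made for illustration, since the abstract does not state the actual scoring rule. The sketch polls one hypothetical decoder stream per language in lockstep and commits to a language as soon as any decoder emits a full hypothesis.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass
class PartialHypothesis:
    """Hypothetical container for one decoder update (not from the paper)."""
    text: str
    log_likelihood: float  # cumulative score of the best path so far
    num_frames: int        # audio frames consumed so far
    is_final: bool = False  # True once the decoder emits a full hypothesis


def identify_language(
    streams: Dict[str, Iterable[PartialHypothesis]],
) -> Tuple[str, PartialHypothesis]:
    """Pick a language soon after the first full hypothesis appears.

    Assumption: languages are compared by per-frame log-likelihood of
    their most recent (partial or full) hypothesis.
    """
    iterators = {lang: iter(stream) for lang, stream in streams.items()}
    latest: Dict[str, PartialHypothesis] = {}

    while iterators:
        saw_final = False
        # Poll every decoder once per round so scores stay comparable.
        for lang in list(iterators):
            hyp = next(iterators[lang], None)
            if hyp is None:
                del iterators[lang]  # this decoder has no more updates
                continue
            latest[lang] = hyp
            saw_final = saw_final or hyp.is_final
        if saw_final:
            break  # decide now instead of waiting for the other decoders

    if not latest:
        raise ValueError("no hypotheses were produced")

    def score(lang: str) -> float:
        hyp = latest[lang]
        return hyp.log_likelihood / max(hyp.num_frames, 1)

    best = max(latest, key=score)
    return best, latest[best]
```

In this sketch the decision is made in the same polling round in which the first full hypothesis arrives, so the slower decoder never has to finish; a real system would presumably run the decoders concurrently rather than in a polling loop.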
Related resources
Comparison of spectral methods for spoken language identification
Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
NICT/ATR Asian Spoken Language Translation System for Multi-Party Travel Conversation
This paper presents the recent advances in the Asian spoken language translation system developed by the National Institute of Information and Communications Technology/Advanced Telecommunications Research Institute International (NICT/ATR). The system was designed to translate the common spoken utterances of travel conversation from a certain source language into multi-target languages in orde...
Solutions to Problems Inherent in Spoken-language Translation: The ATR-MATRIX Approach
ATR has built a multi-language speech translation system called ATR-MATRIX. It consists of a spoken-language translation subsystem, which is the focus of this paper, together with a highly accurate speech recognition subsystem and a high-definition speech synthesis subsystem. This paper gives a road map of solutions to the problems inherent in spoken-language translation. Spoken-language transla...
Spoken Language Identifier 18 - 551 Final Written Report
There are many applications in industry that have a demand for automatic language recognition. One such application is the use of Language Line Services. As discussed in Marc Zissman's "Comparison of Four Approaches to Automatic Language Identification of Telephone Speech" [9], these services will distinguish the language being spoken by a telephone caller, and redirect the call to a represen...