Spoken Language Identifier: 18-551 Final Written Report
Abstract
Many industrial applications call for automatic language recognition. One such application is Language Line Services. As discussed in Marc Zissman's "Comparison of Four Approaches to Automatic Language Identification of Telephone Speech" [9], these services identify the language spoken by a telephone caller and redirect the call to a representative who is fluent in that language. Most of these services operate manually, with human receptionists attempting to identify the language through trial and error. An automatic system could reduce both the overhead time and the error rates of these services. Another likely application is combining a language recognition system with automatic real-time speech translation. Services such as Babelfish and Google Translate already provide text-to-text translation, and they have recently been adapted for real-time speech-to-speech translation. While the accuracy of these systems continues to improve, a persistent shortcoming is the need to know the input language in advance. In many situations the input language is known, but there are occasions when one wishes to translate speech in an unknown language. A system that identifies the spoken language and then feeds the signal to the appropriate translator could act as a universal translation device, something that so far has existed only in science fiction.
During the development of our project, we first devoted most of our time to reading about previous research and projects on speech processing and recognition. We gained background on speech recognition techniques through discussions with Professor Richard Stern, who has expertise in signal processing and speech manipulation. He pointed us to research papers that he thought would be useful and gave us access to speech databases that we could later use to train and test our system. Beyond these discussions, we developed much of our system breakdown from Lee's chapter on the principles of spoken language recognition in the Springer Handbook of Speech Processing [1]. From this, we decided which algorithms to use and how each could be implemented. We decided against word recognition because of the large amount of data that would have to be stored and the limitations of the DSK, and instead …