Lexical access to large vocabularies for speech recognition
Abstract
A large-vocabulary isolated-word recognition system based on the hypothesize-and-test paradigm is described. The system was devised, however, as a word hypothesizer for a continuous speech understanding system able to answer queries put to a geographical database. Word preselection is achieved by segmenting and classifying the input signal in terms of broad phonetic classes. Because this phonetic code carries little redundancy for lexical access, a lattice of phonetic segments is generated, rather than a single sequence of hypotheses, in order to achieve high performance. The lattice can be organized as a graph, and word hypotheses are obtained by matching this graph against the models of all vocabulary words. A word model is itself a phonetic representation in the form of a graph that accounts for deletion, substitution, and insertion errors. A modified dynamic programming (DP) matching procedure gives an efficient solution to this graph-to-graph matching problem. Hidden Markov models (HMMs) of subword units are used as a more detailed knowledge source in the verification step. The word candidates generated by the previous step are represented as sequences of diphone-like subword units, and the Viterbi algorithm is used to evaluate their likelihood. To reduce storage and computational costs, lexical knowledge is organized in a tree structure in which the initial common subsequences of word descriptions are shared, and a beam-search strategy pursues only the most promising paths. The results show that the two-pass approach reduces complexity by about 73 percent relative to the direct approach, while recognition accuracy remains comparable.
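The tree-structured lexicon and beam search mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the phone symbols, the `LEXICON` entries, and the unit mismatch cost are hypothetical, and the cost stands in for the HMM likelihoods that the paper evaluates with the Viterbi algorithm. The toy search also handles substitutions only, with one tree step per observed unit.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """One node of the lexical tree; an edge label is a subword unit."""
    children: Dict[str, "Node"] = field(default_factory=dict)
    word: Optional[str] = None  # set when a word description ends here


def build_tree(lexicon: Dict[str, List[str]]) -> Node:
    """Insert every word so that common initial subsequences share nodes."""
    root = Node()
    for word, units in lexicon.items():
        node = root
        for unit in units:
            node = node.children.setdefault(unit, Node())
        node.word = word
    return root


def count_nodes(node: Node) -> int:
    """Count tree nodes, including the root."""
    return 1 + sum(count_nodes(child) for child in node.children.values())


def beam_search(root: Node, observed: List[str], beam_width: int,
                mismatch_cost: float = 1.0) -> List[Tuple[float, str]]:
    """Advance one observed unit at a time, keeping the cheapest paths only.

    Assumption: a unit that differs from the observation costs
    `mismatch_cost`; insertions and deletions are not modeled.
    """
    beam: List[Tuple[float, Node]] = [(0.0, root)]
    for obs in observed:
        expanded = []
        for cost, node in beam:
            for unit, child in node.children.items():
                step = 0.0 if unit == obs else mismatch_cost
                expanded.append((cost + step, child))
        expanded.sort(key=lambda path: path[0])
        beam = expanded[:beam_width]  # prune everything outside the beam
    return sorted((cost, node.word) for cost, node in beam
                  if node.word is not None)


# Hypothetical three-word lexicon with made-up phone labels.
LEXICON = {
    "cat": ["k", "ae", "t"],
    "cap": ["k", "ae", "p"],
    "dog": ["d", "ao", "g"],
}
```

With this lexicon the tree stores "k" and "ae" once for both "cat" and "cap", so it needs fewer unit edges than storing each word description separately. Narrowing `beam_width` reduces the number of paths extended at each step, at the risk of pruning the correct word early.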
Similar resources
A New Decoder Design For Large Vocabula
An important problem in large vocabulary speech recognition for agglutinative languages like Turkish is the high out of vocabulary (OOV) rate caused by extensive number of distinct words. Recognition systems using words as the basic lexical elements have difficulty in dealing with such virtually unlimited vocabulary. We propose a new time-synchronous lexical tree decoder design using morphemes ...
Lexical Access in Persian Normal Speakers: Picture Naming, Verbal Fluency and Spontaneous Speech
Objectives: Lexical access is the process by which the basic conceptual, syntactical and morpho-phonological information of words are activated. Most studies of lexical access have focused on picture naming. There is hardly any previous research on other parameters of lexical access such as verbal fluency and analysis of connected speech in Persian normal participants. This study investigates t...
Vocabulary Decomposition for Estonian Open Vocabulary Speech Recognition
Speech recognition in many morphologically rich languages suffers from a very high out-of-vocabulary (OOV) ratio. Earlier work has shown that vocabulary decomposition methods can practically solve this problem for a subset of these languages. This paper compares various vocabulary decomposition approaches to open vocabulary speech recognition, using Estonian speech recognition as a benchmark. C...
Discourse Community Collocations and L2 Writing Content
Taking the position that writing can be an important skill to foster knowledge building pedagogy, this article explores vocabulary as a supportive tool for this purpose. Having this in mind, a compilation of conceptually loaded vocabularies pertaining to seven discourse communities was developed, two of which were given to a group of L2 writers to investigate the implications of phraseology for...
Lexical Access in Connected Speech Recognition
This paper addresses two issues concerning lexical access in connected speech recognition: 1) the nature of the pre-lexical representation used to initiate lexical lookup 2) the points at which lexical look-up is triggered off this representation. The results of an experiment are reported which was designed to evaluate a number of access strategies proposed in the literature in conjunction with...
Journal:
- IEEE Trans. Acoustics, Speech, and Signal Processing
Volume 37
Publication date: 1989