نتایج جستجو برای: segmental hmm
تعداد نتایج: 28370 فیلتر نتایج به سال:
We present our approach to unsupervised training of speech recognizers. Our approach iteratively adjusts sound units that are ptimized for the acoustic domain of interest. We thus enable the use of speech recognizers for applications in speech domains here transcriptions do not exist. The resulting recognizer is a state-of-the-art recognizer on the optimized units. Specifically we ropose buildi...
HMM-based Speech-To-Text (STT) systems are widely deployed not only for dictation tasks but also as the first processing stage of many automatic speech applications such as spoken topic classification. However, the necessity of transcribed data for training the HMMs precludes its use in domains where transcribed speech is difficult to come by because of the specific domain, channel or language....
The segment model (SM) is a family of methods that use the segmental distribution rather than frame-based density (e.g. HMM) to represent the underlying characteristics of the observation sequence. It has been proved to be more precise than HMM. However, their high level of complexity prevents these models from being used in practical systems. In this paper, we propose a framework that can redu...
In this paper, we suggested a Reference Sentence Alignment (RSA) method to segment and label the speech automatically based on the multiple pronunciation phoneme segmental kmeans algorithm and HMM. Furthermore, based on the search path created by this method, information of pitch and energy of speech can be obtained and labeled synchronously. This segmentation and labeling strategy was applied ...
In this paper we present our Hidden Markov Model (HMM)-based, Mandarin Chinese Text-to-Speech (TTS) system. Mandarin Chinese or Putonghua, “the common spoken language”, is a tone language where each of the 400 plus base syllables can have up to 5 different lexical tone patterns. Their segmental and supra-segmental information is first modeled by 3 corresponding HMMs, including: (1) spectral env...
This paper encompasses the approaches of segmental modelling and the use of dynamic features in addressing the constraints of the IID assumption in standard HMM. Phonetic features are introduced which capture the transitional dynamics across a phoneme unit via a DCT transformation of a variable length segment. Alongside this, the use of a hybrid phoneme model is proposed. Classification experim...
One of the goals of text-to-speech (TTS) systems is to produce natural-sounding synthesised speech. Towards this end various natural language processing (NLP) tasks are performed to model the prosodic aspects of the TTS voice. One of the fundamental NLP tasks being used is the part-of-speech (POS) tagging of the words in the text. This paper investigates the effects of POS information on the na...
Incremental speech synthesis aims at delivering the synthetic voice while the sentence is still being typed. One of the main challenges is the online estimation of the target prosody from a partial knowledge of the sentence’s syntactic structure. In the context of HMM-based speech synthesis, this typically results in missing segmental and suprasegmental features, which describe the linguistic c...
This paper introduces the speech synthesis system developed by USTC for Blizzard Challenge 2012. An audiobook speech corpus is adopted as the training data for system construction this year. Similar to our previous systems, the hidden Markov model (HMM) based unit selection and waveform concatenation approach is followed to develop our speech synthesis system using this corpus. Considering the ...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید