Train&align: A new online tool for automatic phonetic alignment
Abstract
Several automatic phonetic alignment tools have been proposed in the literature. They usually rely on pre-trained speaker-independent models to align new corpora. Their drawback is that they cover only a small number of languages and may perform poorly on other speaking styles. This paper presents a new online tool for automatic phonetic alignment. Its distinguishing feature is that it trains the model directly on the corpus to be aligned, which makes it applicable to any language and speaking style. Experiments on three corpora show that it provides results comparable to those of existing tools. It also allows some training parameters to be tuned. The use of tied-state triphones, for example, yields a further improvement of about 1.5% at a 20 ms threshold. A manually aligned part of the corpus can also be used as bootstrap data to improve model quality. Alignment rates were found to increase significantly, by up to 20%, using only 30 seconds of bootstrapping data.
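The abstract reports results "for a 20 ms threshold" without defining the measure. A minimal sketch follows, assuming the common convention that the alignment rate is the fraction of automatic phone boundaries lying within the threshold of the corresponding manual boundaries. The function name `alignment_rate` and the sample boundary times are hypothetical and are not taken from the tool itself.

```python
def alignment_rate(auto_boundaries, manual_boundaries, threshold=0.020):
    """Return the fraction of automatic boundaries within `threshold`
    seconds of the matching manual boundaries.

    Assumption: both lists hold boundary times in seconds for the same
    phone sequence, so boundaries can be compared pairwise by position.
    """
    if len(auto_boundaries) != len(manual_boundaries):
        raise ValueError("boundary lists must have the same length")
    if not manual_boundaries:
        raise ValueError("no boundaries to compare")
    hits = sum(
        1
        for auto, manual in zip(auto_boundaries, manual_boundaries)
        if abs(auto - manual) <= threshold
    )
    return hits / len(manual_boundaries)


# Hypothetical boundary times in seconds (illustrative only).
manual = [0.10, 0.25, 0.40, 0.62]
auto = [0.11, 0.28, 0.41, 0.605]

# Three of the four automatic boundaries fall within 20 ms.
rate = alignment_rate(auto, manual)
print(f"alignment rate at 20 ms: {rate:.2f}")
```

Under this reading, the "1.5%" and "20%" figures in the abstract would be differences in this rate between configurations, though the abstract does not say whether they are absolute or relative.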
Similar resources
EasyAlign: An Automatic Phonetic Alignment Tool Under Praat
We provide a user-friendly automatic phonetic alignment tool for continuous speech, named EasyAlign. It is developed as a plug-in of Praat, the popular speech analysis software, and it is freely available. Its main advantage is that one can easily align speech from an orthographic transcription. It requires a few minor manual steps and the result is a multi-level annotation within a TextGrid co...
EasyAlign: a friendly automatic phonetic alignment tool under Praat
We propose a user-friendly automatic phonetic alignment tool for continuous speech: EasyAlign. It is developed and freely distributed as a plug-in of Praat, the popular speech analysis software. Its main advantage is that one can easily align speech from an orthographic transcription. It requires a few minor manual steps and the result is a multi-level annotation within a TextGrid composed of p...
Toward an Optimum Feature Set and HMM Model Parameters for Automatic Phonetic Alignment of Spontaneous Speech
Many speech segmentation techniques have been proposed to automate phonetic alignment. Most of these techniques, however, require labeled training data and perform well only on read, high-quality speech. Automatic phonetic alignment of lower-quality, varied data with no labeled training data, the subject of this paper, is a much more challenging domain. An HMM-based automatic speech recognizer ...
Automatic Phone Alignment - A Comparison between Speaker-Independent Models and Models Trained on the Corpus to Align
Several automatic phonetic alignment tools have been proposed in the literature. They generally use speaker-independent acoustic models of the language to align new corpora. The problem is that the range of provided models is limited. It does not cover all languages and speaking styles (spontaneous, expressive, etc.). This study investigates the possibility of directly training the statistical ...
Enhanced CORILGA: Introducing the Automatic Phonetic Alignment Tool for Continuous Speech
The Corpus Oral Informatizado da Lingua Galega (CORILGA) project aims at building a corpus of oral language for Galician, primarily designed to study the linguistic variation and change. This project is currently under development and it is periodically enriched with new contributions. The long-term goal is that all the speech recordings will be enriched with phonetic, syllabic, morphosyntactic...