Spoken Book Alignment Using WFSTs
Abstract
The context of this paper is a national project known as IPSOM, whose main goal is to improve access to digitally stored spoken books, used primarily by the visually impaired community, by providing tools for easily detecting and indexing units (words, sentences, topics). The project also aims to broaden the usage of multimedia spoken books (for instance, in didactic applications) by providing multimedia interfaces for access and retrieval. Spoken book alignment is therefore a major task. From a research point of view, one of the most interesting aspects of the IPSOM project is that indexed spoken books provide an invaluable resource for data-driven prosodic modelling and unit selection in the context of text-to-speech synthesis. This motivated performing the alignment not only at the word level but also at the level of sub-word units, and automatically generating multiple pronunciations by applying phonological rules in a WFST (weighted finite-state transducer) framework.
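As a rough illustration of generating multiple pronunciations from phonological rules, the sketch below applies optional weighted rewrite rules to a phone sequence in a single left-to-right pass. It only mimics, in plain Python, what composing a pronunciation with a rule transducer would produce. The rules, phone symbols, costs, and the names `RULES` and `expand` are all hypothetical and are not taken from the paper, which does not give its rule set here.

```python
from math import inf
from typing import Dict, List, Sequence, Tuple

# A rule is (input phones, output phones, cost). Costs act like
# tropical-semiring weights: they add along a path, lower is better.
# ASSUMPTION: these rules are invented for illustration only.
Rule = Tuple[Tuple[str, ...], Tuple[str, ...], float]

RULES: List[Rule] = [
    (("d", "e"), ("d",), 1.0),  # hypothetical optional vowel deletion
    (("s", "s"), ("s",), 0.5),  # hypothetical optional degemination
]


def expand(phones: Sequence[str], rules: List[Rule]) -> List[Tuple[Tuple[str, ...], float]]:
    """Return each reachable pronunciation with its lowest total cost."""
    seq = tuple(phones)
    best: Dict[Tuple[str, ...], float] = {}

    def walk(i: int, out: Tuple[str, ...], cost: float) -> None:
        if i == len(seq):
            # Keep only the cheapest path for each output string.
            if cost < best.get(out, inf):
                best[out] = cost
            return
        # Identity arc: copy the current phone unchanged at no cost.
        walk(i + 1, out + (seq[i],), cost)
        # Rule arcs: every rule is optional and consumes its whole input side.
        for src, dst, weight in rules:
            if seq[i:i + len(src)] == src:
                walk(i + len(src), out + dst, cost + weight)

    walk(0, (), 0.0)
    # Cheapest variants first; ties broken by the phone sequence itself.
    return sorted(best.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    for variant, cost in expand(["d", "e", "s", "s", "a"], RULES):
        print(" ".join(variant), cost)
```

In a real WFST toolkit the same effect would come from composing a linear acceptor for the canonical pronunciation with a rule transducer; the recursion here just enumerates the paths of that composition for a short input.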
Similar resources
Spoken Language Processing Using Weighted Finite State Transducers
The main goal of this paper is to illustrate the advantages of weighted finite state transducers (WFSTs) for spoken language processing, namely in terms of their capacity to efficiently integrate different types of knowledge sources. We shall illustrate their applicability in several areas: large vocabulary continuous speech recognition, automatic alignment using pronunciation modeling rules, g...
A method for structure estimation of weighted finite-state transducers and its application to grapheme-to-phoneme conversion
Weighted finite-state transducers (WFSTs) are widely used as a fundamental data structure in several spoken language processing systems since they can provide a unified representation of many types of probabilistic models. Even though the use of accurate WFSTs is important in many spoken language systems, WFSTs are conventionally obtained by transforming probabilistic models that are not estima...
WFSTDM Builder – Network-based Spoken Dialogue System Builder for Easy Prototyping
This paper introduces a network-based spoken dialogue system development toolkit, WFSTDM Builder, developed by NICT. WFSTDM Builder provides functions to share and edit SLU and scenario so that developers can create a WFST-based spoken dialogue system instantly with this tool. One can test the scenario by accessing the connected servers, such as the ASR, TTS and WFSTDM servers, via not only the tool’...
Aligning and recognizing spoken books in different varieties of Portuguese
This paper presents digital spoken books as a useful diagnostic tool for detecting alignment and recognition problems and for studying the porting of these technologies to different varieties of the same language (Portuguese, in our case). We summarize the main differences between European and Brazilian Portuguese (EP/BP) and describe how they affect the GtoP system. Despite the small siz...
Automatic Alignment of Map Task Dialogs Using WFSTs
The goal of this work is the automatic alignment of a map task dialog corpus collected for European Portuguese. The Coral corpus has been orthographically labeled; however, off-the-shelf alignment techniques do not work because of the large amount of cross-talk and pronunciation variation. This paper addresses these two issues. The cross-talk problem is dealt with by using a pre-processing stage...