Spanish dialects: phonetic transcription
Abstract
It is well known that canonical Spanish, the ‘central’ dialectal variant of Spain known as Castilian, can be transcribed by rule. This paper deals with automatic grapheme-to-phoneme transcription rules for several Spanish dialects of Latin America. Spanish is spoken by more than 300 million people, is widely dispersed geographically compared with other languages, and has historically been influenced by many native languages. The authors extend the Castilian transcription rules to a set of Latin American dialectal variants. Transcriptions are based on SAMPA symbols. The paper identifies sounds that do not appear in Castilian, extends the accepted SAMPA symbols for Spanish (Castilian) to the different dialectal variants, describes the rules needed to implement automatic orthographic-to-phonetic transcription in several dialectal variants of Spanish, and shows some quantitative results on dialectal differences.
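To make the idea of rule-based transcription with dialectal extensions concrete, here is a minimal Python sketch. It is not the paper's rule set: the rule table, the names `BASE_RULES`, `SESEO`, `YEISMO` and `transcribe`, and the choice of dialect features are all assumptions made for illustration. It applies a small subset of Castilian letter-to-sound rules and then swaps phones for two features that are widespread (though not universal) in Latin America: seseo (T becomes s) and yeísmo (L becomes jj). Letters without a rule pass through unchanged, and details such as stress, diphthongs and word-initial trilled r are ignored.

```python
import re

# Illustrative subset of Castilian letter-to-sound rules (NOT the paper's rules).
# Each entry: (regex matched at the current position, SAMPA-style phones emitted).
BASE_RULES = [
    ("ch", ["tS"]),
    ("ll", ["L"]),
    ("rr", ["rr"]),
    ("qu(?=[ei])", ["k"]),   # "qu" before e/i: the u is silent
    ("gu(?=[ei])", ["g"]),   # "gu" before e/i: the u is silent
    ("c(?=[ei])", ["T"]),    # "c" before e/i
    ("g(?=[ei])", ["x"]),    # "g" before e/i
    ("z", ["T"]),
    ("c", ["k"]),
    ("j", ["x"]),
    ("ñ", ["J"]),
    ("v", ["b"]),
    ("h", []),               # silent
]
COMPILED = [(re.compile(pattern), phones) for pattern, phones in BASE_RULES]

# Accented vowels are reduced to plain vowels; stress is not modelled here.
ACCENTS = str.maketrans("áéíóú", "aeiou")

# Hypothetical dialect tables: phone substitutions applied after the base rules.
SESEO = {"T": "s"}
YEISMO = {"L": "jj"}


def transcribe(word, substitutions=None):
    """Return a space-separated phone string for a single word."""
    substitutions = substitutions or {}
    word = word.lower().translate(ACCENTS)
    phones = []
    i = 0
    while i < len(word):
        for pattern, output in COMPILED:
            match = pattern.match(word, i)
            if match:
                phones.extend(output)
                i = match.end()
                break
        else:
            # No rule applies: assume the letter stands for itself.
            phones.append(word[i])
            i += 1
    return " ".join(substitutions.get(p, p) for p in phones)


if __name__ == "__main__":
    dialect = {**SESEO, **YEISMO}
    for w in ("cielo", "calle", "zapato"):
        print(w, "|", transcribe(w), "|", transcribe(w, dialect))
```

Keeping the dialect differences in a separate substitution table, rather than duplicating the whole rule list per dialect, is one possible design; rules that depend on context (for example, aspiration of syllable-final s) would need a richer mechanism than this sketch provides.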
Related resources
Multidialectal Spanish acoustic modeling for speech recognition
In recent years, language resources for speech recognition have been collected for many languages, and specifically for global languages. One of the characteristics of global languages is their wide geographical dispersion and, consequently, their wide phonetic, lexical, and semantic dialectal variability. Even if the collected data is huge, it is difficult to represent dialectal variants...
Data driven multidialectal phone set for Spanish dialects
This paper addresses the use of a data-driven approach to determine a multidialectal phone set for an automatic speech recognition system for Spanish dialects. This approach is based on a decision tree clustering algorithm that tries to cluster contextual units of different dialects. This procedure avoids the definition of a global phonetic inventory and the previous study of similarity of soun...
Incorporating linguistic knowledge into automatic dialect identification of Spanish
Automatic dialect identification, like automatic language identification, has often been approached through the use of phonetic frequencies and phonetic sequence modeling. While such statistical systems perform well on language identification problems, they are less adept at the more difficult problem of automatic dialect identification, particularly on short segments of speech. In this paper ...
Phonetic Unification of Multiple Accents for Spanish and Arabic Languages
Languages like Spanish and Arabic are spoken over a large geographic area. The people that speak these languages develop differences in accent, annotation and phonetic delivery. This leads to difficulty in standardization of languages for education and communication (both text and oral). The problem is addressed by phonetic dictionaries to some extent. They provide the correct pronunciation for...
DIMEx100: A New Phonetic and Speech Corpus for Mexican Spanish
In this paper the phonetic and speech corpus DIMEx100 for Mexican Spanish is presented. We discuss both the linguistic motivation and the computational tools employed for the design, collection and transcription of the corpus. The phonetic transcription methodology is based on recent empirical studies proposing a new basic set of allophones and phonological rules for the dialect of the central ...