Phonological Processing for Urdu Text to Speech System

نویسنده

  • Sarmad Hussain
چکیده

Determining and modeling phonological phenomena is necessary to generate speech from textual input. These phenomena include letter to sound conversion, syllabification, sound change, stress assignment and intonation assignment. This paper presents work on Urdu phonological processes and provides algorithms to convert textual input into phonologically annotated output, required for Urdu text-to-speech system. Current paper builds on earlier work on letter to sound conversion rules and adds details of syllabification, sound change rules and stress assignment algorithm. Intonation assignment module is still under investigation and is not discussed in this paper.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Letter-To-Sound Conversion For Urdu Text-To-Speech System

Urdu is spoken by more than 100 million people across a score countries and is the national language of Pakistan (http://www. ethnologue.com). There is a great need for developing a text-to-speech system for Urdu because this population has low literacy rate and therefore speech interface would greatly assist in providing them access to information. One of the significant parts of a text-to-spe...

متن کامل

AGHAZ: An Expert System Based approach for the Translation of English to Urdu

–Machine Translation (MT ) of English text to its Urdu equivalent is a difficult challenge. Lot of attempts has been made, but a few limited solutions are provided till now. We present a direct approach, using an expert system to translate English text into its equivalent Urdu, using The Unicode Standard, Version 4.0 (ISBN 0-321-18578-1) Range: 0600–06FF. The expert system works with a knowledg...

متن کامل

Multilingual phonological analysis and speech synthesis

We give an overview of multilingual speech synthesis using the IPOX system. The first part discusses work in progress for various languages: Tashlhit Berber, Urdu and Dutch. The second part discusses a multilingual phonological grammar, which can be adapted to a particular language by setting parameters and adding languagespecific details.

متن کامل

A House United: Bridging the Script and Lexical Barrier between Hindi and Urdu

In Computational Linguistics, Hindi and Urdu are not viewed as a monolithic entity and have received separate attention with respect to their text processing. From part-of-speech tagging to machine translation, models are separately trained for both Hindi and Urdu despite the fact that they represent the same language. The reasons mainly are their divergent literary vocabularies and separate or...

متن کامل

A Study in Urdu Corpus Construction

We are interested in contributing a small, publicly available Urdu corpus of written text to the natural language processing community. The Urdu text is stored in the Unicode character set, in its native Arabic script, and marked up according to the Corpus Encoding Standard (CES) XML Document Type Definition (DTD). All the tags and metadata are in English. To date, the corpus is made entirely o...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008