Speech to text engine for Jawi language

Authors

  • Zaini Arifah Othman
  • Nor Aniza Abdullah
  • Zaidi Razak
  • Mohd. Yakub Zulkifli Bin Mohd. Yusoff
Abstract

This paper focuses on the development of a Malay speech to Jawi text engine, which translates speech into a special character set. Jawi is a unique script derived from Arabic but read in the Malay language. Little research can be found on speech technology developed for Jawi, so this work should be useful to researchers who wish to extend its benefits to related ICT applications. The use of Zero Crossing Rate (ZCR) as a robust algorithm for accurate automatic detection of syllable boundaries in the speech signal is discussed. A combination of Linear Predictive Coding (LPC) and an Artificial Neural Network (ANN) trained with backpropagation is used to extract features from the speech signals and classify them. The paper also discusses the use of Jawi Unicode in the final character tagging process to represent each Jawi character present in the spoken word. As no standard list of Jawi Unicode has been published, the existing Jawi Unicode table produced by previous research is further investigated and enhanced to achieve better accuracy in Jawi character-phoneme representation. This list is based on a combination of Traditional Arabic and other scripts. A prototype educational learning tool was also developed to enable school children to recognize and read Jawi text, check their pronunciation, and learn from their mistakes independently.
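The abstract does not say how ZCR is applied, so the following is only a minimal sketch of the general idea, not the authors' method. It computes a per-frame zero crossing rate and flags frames where the rate crosses a threshold as candidate boundaries. The function names, the frame length, the hop size, and the threshold of 0.3 are all illustrative assumptions, and the input is a synthetic signal rather than Malay speech.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(signal, frame_len, hop):
    """ZCR of each frame of frame_len samples, advancing by hop samples."""
    return [
        zero_crossing_rate(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def candidate_boundaries(zcr_values, threshold):
    """Frame indices where the ZCR crosses the threshold in either direction."""
    return [
        i for i in range(1, len(zcr_values))
        if (zcr_values[i - 1] < threshold) != (zcr_values[i] < threshold)
    ]


# Synthetic stand-in for speech (assumption): a 200 Hz tone, which has a low
# ZCR like a voiced sound, followed by a 3000 Hz tone, which has a high ZCR
# like an unvoiced sound. Both are sampled at 8 kHz.
rate = 8000
low = [math.sin(2 * math.pi * 200 * n / rate) for n in range(800)]
high = [math.sin(2 * math.pi * 3000 * n / rate) for n in range(800)]

zcr = frame_zcr(low + high, frame_len=200, hop=200)
boundaries = candidate_boundaries(zcr, threshold=0.3)
print(boundaries)
```

On this synthetic input the only candidate boundary is at frame index 4, where the two tones meet. Real speech would typically need overlapping frames, smoothing, and an energy measure alongside ZCR before boundaries like these could be trusted.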

Similar resources

A Finite State Model for Urdu Nastalique Optical Character Recognition

Finite state technology has long been used to model NLP (Natural Language Processing) applications; in particular, it has been applied very successfully to machine translation and speech recognition systems. Character recognition in cursive scripts or handwritten Latin script has also attracted researchers’ attention, and some research has been done in this area. Optical character recognition is the ...

A multilingual text processing engine for the PAPAGENO text-to-speech synthesis system

Automatic synthesis of speech from arbitrary text requires two basic operations: linguistic analysis of input text and speech waveform generation. The achieved quality of the second stage very much depends on the reliability and richness of information generated in the first stage. In this paper we discuss possibilities and problems of text analysis for multilingual speech synthesis. The langua...

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...

L2 Learners’ Lexical Inferencing: Perceptual Learning Style Preferences, Strategy Use, Density of Text, and Parts of Speech as Possible Predictors

This study was intended first to categorize the L2 learners in terms of their learning style preferences and second to investigate if their learning preferences are related to lexical inferencing. Moreover, strategies used for lexical inferencing and text related issues of text density and parts of speech were studied to determine their moderating effects and the best predictors of lexical infe...

Using Text Surrounding Method to Enhance Retrieval of Online Images by Google Search Engine

Purpose: The current research aimed to compare the effectiveness of various tags and codes for retrieving images from Google. Design/methodology: Selected images with different characteristics in a registered domain were carefully studied. The exception was that special conceptual features were apportioned for each group of images separately. In this regard, each group image surr...

Journal title:
  • Int. Arab J. Inf. Technol.

Volume 11

Publication date: 2014