Morpheme Graph Construction for Speech and Natural Language

Authors

  • Byeongchang Kim
  • Jong-Hyeok Lee
Abstract

This paper describes a method for morphological analysis of continuous spoken Korean that addresses the problem of integrating speech recognition with natural language processing. The method centers on Viterbi search-based morphological analysis, layered on top of speech signal processing and MLP-based phone recognition. The main contribution is a Viterbi search-based morphological analysis technique for speech processing of agglutinative languages. Across several experiments, we obtained an average continuous morpheme recognition performance of 84.4% in a morpheme graph built directly from phone recognition output with an average performance of 75.9%.
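The idea of a Viterbi search over a morpheme graph can be illustrated with a minimal sketch. It is not the authors' implementation: the romanized toy lexicon, the tag set, the probabilities, and the function names `build_morpheme_graph` and `viterbi_best_path` are all assumptions made for illustration. The sketch also assumes an error-free phone string matched exactly against the lexicon, whereas the system described in the abstract works from imperfect phone recognition output.

```python
import math
from collections import defaultdict

# Hypothetical toy lexicon: phone string -> list of (morpheme, tag, log-probability).
LEXICON = {
    "hak": [("hak", "NOUN", math.log(0.4))],
    "kyo": [("kyo", "NOUN", math.log(0.3))],
    "hakkyo": [("hakkyo", "NOUN", math.log(0.6))],
    "e": [("e", "PARTICLE", math.log(0.7))],
    "ka": [("ka", "VERB", math.log(0.5)), ("ka", "PARTICLE", math.log(0.2))],
    "da": [("da", "ENDING", math.log(0.8))],
}

# Hypothetical tag-transition log-probabilities; missing pairs are disallowed.
TRANSITIONS = {
    ("<s>", "NOUN"): math.log(0.6),
    ("<s>", "VERB"): math.log(0.4),
    ("NOUN", "NOUN"): math.log(0.1),
    ("NOUN", "PARTICLE"): math.log(0.7),
    ("PARTICLE", "VERB"): math.log(0.6),
    ("PARTICLE", "NOUN"): math.log(0.3),
    ("VERB", "ENDING"): math.log(0.9),
}


def build_morpheme_graph(phones, lexicon, max_len=8):
    """Return edges[start] = list of (end, morpheme, tag, logp) for every lexicon match."""
    edges = defaultdict(list)
    for start in range(len(phones)):
        for end in range(start + 1, min(len(phones), start + max_len) + 1):
            for morpheme, tag, logp in lexicon.get(phones[start:end], []):
                edges[start].append((end, morpheme, tag, logp))
    return edges


def viterbi_best_path(phones, lexicon, transitions):
    """Return the highest-scoring (morpheme, tag) sequence covering phones, or None."""
    edges = build_morpheme_graph(phones, lexicon)
    # best[(position, tag)] = (score, previous state, morpheme on the incoming edge)
    best = {(0, "<s>"): (0.0, None, None)}
    for pos in range(len(phones)):
        # Snapshot the states ending at this position before extending them.
        states = [state for state in best if state[0] == pos]
        for state in states:
            score, prev_tag = best[state][0], state[1]
            for end, morpheme, tag, logp in edges.get(pos, []):
                trans = transitions.get((prev_tag, tag))
                if trans is None:
                    continue  # tag pair not allowed to connect
                cand = score + trans + logp
                key = (end, tag)
                if key not in best or cand > best[key][0]:
                    best[key] = (cand, state, morpheme)
    finals = [(value[0], key) for key, value in best.items() if key[0] == len(phones)]
    if not finals:
        return None
    _, state = max(finals)
    path = []
    while best[state][1] is not None:  # follow backpointers to the start state
        path.append((best[state][2], state[1]))
        state = best[state][1]
    return list(reversed(path))


print(viterbi_best_path("hakkyoekada", LEXICON, TRANSITIONS))
```

In this sketch the graph holds every lexicon entry that matches a span of the phone string, and the dynamic program keeps one best score per (position, tag) pair, so competing segmentations such as `hak` + `kyo` and `hakkyo` are compared without enumerating all paths.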


Similar resources

A Viterbi-based morphological analysis for speech and natural language integration

This paper presents a statistical/symbolic hybrid morphological analysis, called V-morph, for large-scale speech and natural language integration for Korean. In the V-morph approach, statistical Viterbi-based lexical decoding and symbolic morphological modeling are integrated on top of a connectionist phoneme recognition engine. Linguistic characteristics of Korean are appropriately cons...


A Morpheme-based Part-of-Speech Tagger for Chinese

This paper presents a morpheme-based part-of-speech tagger for Chinese. It consists of two main components, namely a morpheme segmenter to segment each word in a sentence into a sequence of morphemes, based on forward maximum matching, and a lexical tagger to label each morpheme with a proper tag indicating its position pattern in forming a word of a specific class, based on lexicalized hidden ...


Towards better language modeling for Thai LVCSR

One of the difficulties of Thai language modeling is the process of text corpus preparation. Because there is no explicit word boundary marker in written Thai text, word segmentation must be performed prior to training a language model. This paper presents two approaches to language model construction for Thai LVCSR based on pseudo-morpheme merging. The first approach merges pseudo-morphemes us...


Use of high-level linguistic constraints for constructing feature-based phonological model in speech recognition

Modeling phonological units of speech is a critical issue in speech recognition. In this paper, we report our recent development of an overlapping feature-based phonological model which gives long-span contextual dependency. We extend our earlier work by incorporating high-level linguistic constraints in automatic construction of the feature overlapping patterns. The main linguistic information ...


Modeling Cross-morpheme Pro for Korean Large Vocabulary Cont

In this paper, we describe a cross-morpheme pronunciation variation model which is especially useful for constructing morpheme-based pronunciation lexicon for Korean LVCSR. There are a lot of pronunciation variations occurring at morpheme boundaries in continuous speech. Since phonemic context together with morphological category and morpheme boundary information affect Korean pronunciation var...



Publication date: 2007