Paraphrasing Spontaneous Speech Using Weighted Finite-state Transducers

Authors

  • Takaaki Hori
  • Daniel Willett
  • Yasuhiro Minami
Abstract

This paper describes an integrated framework for paraphrasing spontaneous speech into written-style sentences. Most current speech recognition systems try to transcribe every spoken word correctly. However, recognition results for spontaneous speech are usually difficult to understand even when recognition is perfect, because spontaneous speech contains redundant information and its style differs from that of written sentences. In particular, the style of spoken Japanese differs greatly from that of written Japanese. Techniques for paraphrasing recognition results are therefore indispensable for generating captions or minutes from speech. To realize efficient speech paraphrasing, we attempt to translate spontaneous speech directly into written-style sentences using a Weighted Finite-State Transducer (WFST). This approach makes it possible to use all the knowledge sources in a one-pass search strategy, and it reduces search errors because the constraints of the paraphrasing model are applied from the beginning of the search. We conducted experiments on a 20k-word Japanese lecture speech recognition and paraphrasing task. Our approach improved both recognition accuracy and paraphrasing accuracy compared with approaches that perform speech recognition and paraphrasing separately.
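The one-pass idea in the abstract can be illustrated with a toy composition of two weighted transducers. The sketch below is not the authors' system: the English words, the weights, the dictionary-based WFST layout, and the names `compose`, `best_path`, `lattice`, and `paraphraser` are all assumptions made for illustration. It uses the tropical semiring (costs add along a path, the lowest total wins) and assumes no epsilon labels on the side where the two transducers are matched.

```python
import heapq

EPS = ""  # empty output label: the word is deleted from the written-style text


def compose(a, b):
    """Compose two WFSTs in the tropical semiring (arc costs are added).

    Simplifying assumption: no epsilon on a's output side or b's input side.
    """
    start = (a["start"], b["start"])
    arcs, finals = {}, {}
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        qa, qb = state
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for i, mid_a, w1, na in a["arcs"].get(qa, []):
            for mid_b, o, w2, nb in b["arcs"].get(qb, []):
                if mid_a == mid_b:  # a's output must match b's input
                    nxt = (na, nb)
                    arcs.setdefault(state, []).append((i, o, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}


def best_path(t):
    """Dijkstra search (non-negative costs): (cost, input words, output words)."""
    heap = [(0.0, 0, t["start"], (), ())]
    tie = 1  # unique tie-breaker so heap entries never compare states
    done = set()
    while heap:
        cost, _, q, ins, outs = heapq.heappop(heap)
        if q is None:  # super-final marker: cheapest complete path found
            return cost, list(ins), [o for o in outs if o != EPS]
        if q in done:
            continue
        done.add(q)
        if q in t["finals"]:
            heapq.heappush(heap, (cost + t["finals"][q], tie, None, ins, outs))
            tie += 1
        for i, o, w, n in t["arcs"].get(q, []):
            heapq.heappush(heap, (cost + w, tie, n, ins + (i,), outs + (o,)))
            tie += 1
    return None


# Hypothetical recognizer lattice: "sink" is acoustically slightly cheaper.
lattice = {
    "start": 0,
    "finals": {5: 0.0},
    "arcs": {
        0: [("uh", "uh", 0.5, 1)],
        1: [("i", "i", 0.2, 2)],
        2: [("think", "think", 1.2, 3), ("sink", "sink", 1.0, 3)],
        3: [("it", "it", 0.3, 4)],
        4: [("works", "works", 0.4, 5)],
    },
}

# Hypothetical paraphrasing model: drops the filler and penalizes "sink".
paraphraser = {
    "start": 0,
    "finals": {0: 0.0},
    "arcs": {0: [
        ("uh", EPS, 0.1, 0), ("i", "I", 0.0, 0), ("think", "think", 0.0, 0),
        ("sink", "sink", 2.0, 0), ("it", "it", 0.0, 0), ("works", "works", 0.0, 0),
    ]},
}

if __name__ == "__main__":
    print(best_path(lattice))                        # recognizer alone
    print(best_path(compose(lattice, paraphraser)))  # combined search
```

With these toy weights, searching the lattice alone picks the path through "sink", whereas searching the composed transducer picks "think" and emits "I think it works" with the filler removed. This mirrors, in miniature, why applying the paraphrasing constraints during the search can change which hypothesis wins.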

Similar resources

Speech summarization using weighted finite-state transducers

This paper proposes an integrated framework to summarize spontaneous speech into written-style compact sentences. Most current speech recognition systems attempt to transcribe whole spoken words correctly. However, recognition results of spontaneous speech are usually difficult to understand, even if the recognition is perfect, because spontaneous speech includes redundant information, and its ...

Flexible Speech Synthesis Using Weighted Finite State Transducers

Use of Weighted Finite State Transducers in Part of Speech

This paper addresses issues in part of speech disambiguation using finite-state transducers and presents two main contributions to the field. One of them is the use of finite-state machines for part of speech tagging. Linguistic and statistical information is represented in terms of weights on transitions in weighted finite-state transducers. Another contribution is the successful combination of techni...

Unit selection for speech synthesis using splicing costs with weighted finite state transducers

In this paper we describe how unit selection for concatenative speech synthesis can be implemented efficiently for sub-phonetic units using weighted finite state transducers (WFST). We also introduce splicing costs as a measure to indicate which unit boundaries are particularly good or poor joint points. Splicing costs extend the flexibility offered by the unit selection paradigm. Through a per...

Part-of-Speech Tagging Using Parallel Weighted Finite-State Transducers

We use parallel weighted finite-state transducers to implement a part-of-speech tagger, which obtains state-of-the-art accuracy when used to tag the Europarl corpora for Finnish, Swedish and English. Our system consists of a weighted lexicon and a guesser combined with a bigram model factored into two weighted transducers. We use both lemmas and tag sequences in the bigram model, which guarante...

Publication date: 2003