A Parser Coping With Self-Repaired Japanese Utterances And Large Corpus-Based Evaluation
Abstract
Self-repair (Levelt 1988) is the repair of an utterance by the speaker him/herself. A human speaker makes self-repairs very frequently in spontaneous speech. Blackmer and Mitton (1991) reported that self-repairs are made once every 4.8 seconds in dialogues taken from radio talk shows. Self-repair is one kind of "permissible ill-formedness": a human listener can sense the ill-formedness in it but is still able to recognize its intended meaning. Thus your partner does not need to interrupt the dialogue. How would you feel if your partner interrupted the dialogue every 5 seconds to ask "What do you mean?" or the like? You would give up the dialogue or choose to write instead. Speaking without self-repair is the most difficult modality of natural language communication. The goal of our work is to build a dialogue system that copes with self-repaired utterances. In this paper we propose a parser called SERUP (SElf-Repaired Utterance Parser), which plays a major part in understanding a self-repaired utterance. Because our approach is to translate a self-repaired utterance (Ex.1) into a well-formed version that does not contain self-repair (Ex.2) and then parse the well-formed one, we do not need to change the subsequent processes.
Similar resources
Evaluation of a robust parser for spoken Japanese
We implemented a parser designed to handle ill-formedness in Japanese speech. The parser was evaluated by utilizing newly collected speech data, which was obtained from an experiment designed to produce ill-formed data effectively. Introducing the proposed method increased the number of correctly analyzed utterances from 171 to 322, from among 532 utterances in the corpus.
Connectionist and Symbolic Processing in Speech-to-Speech Translation: The JANUS System
We present JANUS, a speech-to-speech translation system that utilizes diverse processing strategies including connectionist learning, traditional AI knowledge representation approaches, dynamic programming, and stochastic techniques. JANUS translates continuously spoken English utterances into Japanese and German speech utterances. The overall system performance on a corpus of conference regist...
Dependency Analysis of Spontaneous Monologue Speech Using Pause and F0 Information: A Preliminary Study
This paper deals with the problem of exploiting prosodic information in syntactic analysis of spontaneous monologue utterances of non-professional speakers. Duration of pauses at phrase boundaries and relative F0 contour features, which improve parsing accuracy of read sentences, were also found to be effective for parsing spontaneous speech. Dependency analysis was performed by the minimum pen...
Emergence of syntactic movements in a distribution-sensitive adaptive parser with multiple working memories
A self-organizing neural model of language comprehension at syntactic and semantic levels has been developed, in which the competence to process movements emerges as the result of de-synchronization among three working memories. The model consists of a self-organizing parser, a self-organizing lexicon, and a clausal working memory. The parser consists of an argument stream and a head stream, ea...
Using a Partially Annotated Corpus to Build a Dependency Parser for Japanese
We explore the use of a partially annotated corpus to build a dependency parser for Japanese. We examine two types of partially annotated corpora. It is found that a parser trained with a corpus that does not have any grammatical tags for words can demonstrate an accuracy of 87.38%, which is comparable to the current state-of-the-art accuracy on the Kyoto University Corpus. In contrast, a parse...