Using Bilingual Chinese-English Word Alignments to Resolve PP-Attachment Ambiguity in English
Abstract
Errors in English parse trees impact the quality of syntax-based MT systems trained using those parses. Frequent sources of error for English parsers include PP-attachment ambiguity, NP-bracketing ambiguity, and coordination ambiguity. Not all ambiguities are preserved across languages. We examine a common type of ambiguity in English that is not preserved in Chinese: given a sequence “VP NP PP”, should the PP be attached to the main verb, or to the object noun phrase? We present a discriminative method for exploiting bilingual Chinese-English word alignments to resolve this ambiguity in English. On a held-out test set of Chinese-English parallel sentences, our method achieves 86.3% accuracy on this PP-attachment disambiguation task, an improvement of 4 percentage points over the accuracy of the baseline Collins parser (82.3%).
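To make the ambiguity concrete, the two competing analyses can be written out as bracketed structures. The sketch below is only an illustration: the function name `attachment_candidates`, the simplified bracket labels, and the example sentence are assumptions made here, not taken from the paper, and the code does not implement the paper's disambiguation method.

```python
def attachment_candidates(verb, noun_phrase, prep_phrase):
    """Return the two bracketings of a verb + NP + PP sequence.

    Hypothetical helper for illustration only; the labels are simplified
    and no part-of-speech tags are included.
    """
    # Verb attachment: the PP is a sister of the verb and the object NP.
    verb_attach = f"(VP {verb} (NP {noun_phrase}) (PP {prep_phrase}))"
    # Noun attachment: the PP sits inside the object NP.
    noun_attach = f"(VP {verb} (NP (NP {noun_phrase}) (PP {prep_phrase})))"
    return {"verb": verb_attach, "noun": noun_attach}


# Illustrative example sentence (not from the paper).
candidates = attachment_candidates("saw", "the man", "with the telescope")
for label, tree in candidates.items():
    print(f"{label}: {tree}")
```

A disambiguator for this task has to pick one of the two entries in `candidates` for each such sequence; the accuracy figures in the abstract measure how often that choice is correct.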
Similar resources
Disambiguation of English PP Attachment using Multilingual Aligned Data
Prepositional phrase attachment (PP attachment) is a major source of ambiguity in English. It poses a substantial challenge to Machine Translation (MT) between English and languages that are not characterized by PP attachment ambiguity. In this paper we present an unsupervised, bilingual, corpus-based approach to the resolution of English PP attachment ambiguity. As data we use aligned linguist...
Prepositional Attachment Disambiguation Using Bilingual Parsing and Alignments
In this paper, we attempt to solve the problem of Prepositional Phrase (PP) attachments in English. The motivation for the work comes from NLP applications like Machine Translation, for which getting the correct attachment of prepositions is crucial. The idea is to correct the PP attachments for a sentence with the help of alignments from parallel data in another language. The novelty of o...
A Preliminary Study of Prosodic Disambiguation by Chinese EFL Learners
This study investigated whether Chinese learners of English as a foreign language (EFL learners hereafter) could use prosodic cues to resolve syntactically ambiguous sentences in English. Eight sentences with three types of syntactic ambiguity were adopted. They were far/near PP attachment, left/right word attachment and wide/narrow scope. In the production experiment, 15 Chinese college students who p...
Word Alignment Based on Bilingual Bracketing
In this paper, an improved word alignment based on bilingual bracketing is described. The explored approaches include using Model-1 conditional probability, a boosting strategy for lexicon probabilities based on importance sampling, applying Parts of Speech to discriminate English words and incorporating information of English base noun phrase. The results of the shared task on French-English, ...
Class Based Sense Definition Model for Word Sense Tagging and Disambiguation
We present an unsupervised learning strategy for word sense disambiguation (WSD) that exploits multiple linguistic resources including a parallel corpus, a bilingual machine readable dictionary, and a thesaurus. The approach is based on Class Based Sense Definition Model (CBSDM) that generates the glosses and translations for a class of word senses. The model can be applied to resolve sense amb...