Base Noun Phrase Translation Using Web Data and the EM Algorithm

Authors

  • Yunbo Cao
  • Hang Li
Abstract

We consider the problem of base noun phrase (base NP) translation and propose a new method for the task. For a given base NP, we first search the web for its translation candidates. We then determine the possible translation(s) from among the candidates using one of two methods that we have developed. In the first method, we employ an ensemble of Naïve Bayesian classifiers constructed with the EM algorithm. In the second method, we use TF-IDF vectors, also constructed with the EM algorithm. Experimental results indicate that the coverage and accuracy of our method are significantly better than those of baseline methods relying on existing technologies.
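As a rough illustration of the second idea (scoring candidates with TF-IDF vectors), the sketch below ranks translation candidates by the cosine similarity between TF-IDF vectors of context words. It is not the authors' implementation: the abstract does not give the details, and the EM step that constructs the vectors is skipped entirely. The sketch assumes that the source-side context words have already been mapped into the target language, and the names `tfidf_vectors`, `cosine`, and `rank_candidates` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn each document (a list of tokens) into a sparse TF-IDF dict."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF so that every weight stays positive.
        vectors.append(
            {t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1.0) for t in tf}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(source_context, candidate_contexts):
    """Rank candidates by similarity of their context to the source context.

    source_context: tokens seen around the base NP, ASSUMED to be already
        mapped into the target language (the EM step is not modelled here).
    candidate_contexts: dict mapping each candidate to its context tokens.
    """
    names = list(candidate_contexts)
    docs = [source_context] + [candidate_contexts[name] for name in names]
    vectors = tfidf_vectors(docs)
    scores = [
        (name, cosine(vectors[0], vec)) for name, vec in zip(names, vectors[1:])
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data, invented for illustration only.
candidates = {
    "cand_a": ["bank", "money", "loan"],
    "cand_b": ["river", "water", "bank"],
}
ranked = rank_candidates(["money", "loan", "interest"], candidates)
```

With this toy data, `cand_a` shares context words with the source and is ranked first, while `cand_b` shares none and scores zero. A real system would gather the context words from web pages returned by the candidate search.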

Similar resources

Mining Interesting Aspects of a Product using Aspect-based Opinion Mining from Product Reviews (RESEARCH NOTE)

As the internet and its applications are growing, E-commerce has become one of its rapid applications. Customers of E-commerce were provided with the opportunity to express their opinion about the product on the web as a text in the form of reviews. In the previous studies, mere founding sentiment from reviews was not helpful to get the exact opinion of the review. In this paper, we have used A...

A Method of Cross-Lingual Question-Answering Based on Machine Translation and Noun Phrase Translation using Web documents - Yokohama National University at NTCIR-6 CLQA

We propose a method of English-Japanese cross lingual question-answering (E-J CLQA) that uses machine translation (MT) and an existing Japanese QA system. We also introduce noun phrase translation using Web documents in order to compensate the insufficiencies in the bilingual dictionary of the MT system. We combine several phrase translation techniques including 1) phrase translation using Wiki...

Investigating Embedded Question Reuse in Question Answering

The investigation presented in this paper is a novel method in question answering (QA) that enables a QA system to gain performance through reuse of information in the answer to one question to answer another related question. Our analysis shows that a pair of question in a general open domain QA can have embedding relation through their mentions of noun phrase expressions. We present methods f...

Determiners and Number in English contrasted with Japanese, as exemplified in Machine Translation

The fact that concepts are grammaticalized differently in different languages is a major problem for translation, especially for machine translation. Two major examples of this are syntactic number, and the use of (in)definite articles (a, some, the). In languages such as English, nouns are marked for number and the choice of article (or of no article) must be made for every noun phrase. In con...

The Role of Lexicalization and Pruning for Base Noun Phrase Grammars

This paper explores the role of lexicalization and pruning of grammars for base noun phrase identification. We modify our original framework (Cardie & Pierce 1998) to extract lexicalized treebank grammars that assign a score to each potential noun phrase based upon both the part-of-speech tag sequence and the word sequence of the phrase. We evaluate the modified framework on the “simple” and “c...

Publication date: 2002