Effective Constituent Projection across Languages

Authors

  • Wenbin Jiang
  • Yajuan Lü
  • Yang Liu
  • Qun Liu
Abstract

We describe an effective constituent projection strategy in which constituent projection is performed on the basis of dependency projection. In particular, a novel measure is proposed to evaluate the candidate projected constituents for a target-language sentence, and a PCFG-style parsing procedure is then used to search for the most probable projected constituent tree. Experiments show that the parser trained on the projected treebank can significantly boost a state-of-the-art supervised parser. When integrated into a tree-based machine translation system, the projected parser leads to translation performance comparable to that of a supervised parser trained on thousands of annotated trees.
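
The search step mentioned in the abstract can be illustrated with a small CKY-style dynamic program. This is only a sketch under stated assumptions, not the authors' procedure: the abstract does not define the measure, so `span_score` is a hypothetical stand-in that returns a log-score for each candidate span, the tree is assumed to be binary and unlabeled, and the toy scores are invented for illustration.

```python
import math
from functools import lru_cache


def best_tree(words, span_score):
    """Return (score, tree) for the highest-scoring binary bracketing.

    span_score(i, j) is a hypothetical stand-in for a candidate-constituent
    measure: it returns a log-score for the span words[i:j].
    The tree score is the sum of the log-scores of all its spans.
    """
    if not words:
        raise ValueError("words must be non-empty")

    @lru_cache(maxsize=None)
    def solve(i, j):
        # A single word is a leaf; it only contributes its own span score.
        if j - i == 1:
            return span_score(i, j), words[i]
        best_score, best_children = -math.inf, None
        # Try every split point and keep the best pair of sub-trees.
        for k in range(i + 1, j):
            left_score, left_tree = solve(i, k)
            right_score, right_tree = solve(k, j)
            total = left_score + right_score
            if total > best_score:
                best_score, best_children = total, (left_tree, right_tree)
        return best_score + span_score(i, j), best_children

    return solve(0, len(words))


# Toy, made-up scores: a few favoured spans, a penalty for all other
# multi-word spans, and a neutral score for single words.
words = ("the", "old", "man", "sleeps")
favoured = {(0, 4): 0.0, (0, 3): -0.1, (1, 3): -0.2}


def toy_score(i, j):
    if j - i == 1:
        return 0.0
    return favoured.get((i, j), -2.0)


score, tree = best_tree(words, toy_score)
print(tree)
```

With these toy scores the search groups "old man" first, then "the old man", then attaches "sleeps". Memoising on `(i, j)` keeps the search cubic in sentence length, as in standard chart parsing.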

Similar resources

Relaxed Cross-lingual Projection of Constituent Syntax

We propose a relaxed correspondence assumption for cross-lingual projection of constituent syntax, which allows a supposed constituent of the target sentence to correspond to an unrestricted treelet in the source parse. Such a relaxed assumption fundamentally tolerates the syntactic non-isomorphism between languages, and enables us to learn the target-language-specific syntactic idiosyncrasy ra...

Bracketing Input for Accurate Parsing

Syntax parsers can benefit from speakers' intuition about constituent structures indicated in the input string in the form of parentheses. Focusing on languages like Korean, whose orthographic convention requires more than one word to be written without spaces, we describe an algorithm for passing the bracketing information across the tagger to the probabilistic CFG parser, together with one fo...

Deterministic Fuzzy Automaton on Subclasses of Fuzzy Regular ω-Languages

In formal language theory, we are mainly interested in the natural language computational aspects of ω-languages. Therefore in this respect it is convenient to consider fuzzy ω-languages. In this paper, we introduce two subclasses of fuzzy regular ω-languages called fuzzy n-local ω-languages and Buchi fuzzy n-local ω-languages, and give some closure properties for those subclasses. We define a ...

Balancing Effort and Information Transmission During Language Acquisition: Evidence From Word Order and Case Marking

Across languages of the world, some grammatical patterns have been argued to be more common than expected by chance. These are sometimes referred to as (statistical) language universals. One such universal is the correlation between constituent order freedom and the presence of a case system in a language. Here, we explore whether this correlation can be explained by a bias to balance productio...

Publication date: 2010