A Framework For Incorporating Alignment Information In Parsing
نویسندگان
چکیده
The standard PCFG approach to parsing is quite successful on certain domains, but is relatively inflexible in the type of feature information we can include in its probabilistic model. In this work, we discuss preliminary work in developing a new probabilistic parsing model that allows us to easily incorporate many different types of features, including crosslingual information. We show how this model can be used to build a successful parser for a small handmade gold-standard corpus of 188 sentences (in 3 languages) from the Europarl corpus.
منابع مشابه
بررسی مقایسهای تأثیر برچسبزنی مقولات دستوری بر تجزیه در پردازش خودکار زبان فارسی
In this paper, the role of Part-of-Speech (POS) tagging for parsing in automatic processing of the Persian language is studied. To this end, the impact of the quality of POS tagging as well as the impact of the quantity of information available in the POS tags on parsing are studied. To reach the goals, three parsing scenarios are proposed and compared. In the first scenario, the parser assigns...
متن کاملAn improved joint model: POS tagging and dependency parsing
Dependency parsing is a way of syntactic parsing and a natural language that automatically analyzes the dependency structure of sentences, and the input for each sentence creates a dependency graph. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...
متن کاملTwo-Level Alignment by Words and Phrases Based on Syntactic Information
As a part of work on alignment of the English and Korean parallel corpus, this paper presents a statistical translation model incorporating linguistic knowledge of syntactic and phrasal information for better translations. For this, we propose three models: First, we incorporate syntactic information such as part of speech into the word-based lexical alignment. Based on this model, we propose t...
متن کاملSyntax, Parsing and Production of Natural Language in a Framework of Information Compression by Multiple Alignment, Unification and Search
This article introduces the idea that information compression by multiple alignment, unification and search (ICMAUS) provides a framework within which natural language syntax may be represented in a simple format and the parsing and production of natural language may be performed in a transparent manner. In this context, multiple alignment has a meaning which is similar to its meaning in bio-in...
متن کاملIt Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment
We reveal a previously unnoticed connection between dependency parsing and statistical machine translation (SMT), by formulating the dependency parsing task as a problem of word alignment. Furthermore, we show that two well known models for these respective tasks (DMV and the IBM models) share common modeling assumptions. This motivates us to develop an alignment-based framework for unsupervise...
متن کامل