Chinese Syntactic Parsing Based on Extended GLR Parsing Algorithm with PCFG*
Abstract
This paper presents an extended GLR parsing algorithm for the grammar PCFG*, which builds on Tomita's GLR parsing algorithm and extends it further. We also define a new grammar, PCFG*, which is based on PCFG and associates not only a probability but also a frequency with each rule. Our syntactic parsing system therefore combines a rule-based approach with a statistical approach. We ran experiments on two tasks, Chinese base noun phrase identification and full syntactic parsing, and we present the results for both tasks and compare them in three ways. The experiments show that the extended GLR parsing algorithm with PCFG* is an efficient parsing method and a straightforward way to combine statistical properties with rules.
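The idea of a rule that carries both a frequency and a probability can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how PCFG* probabilities are obtained, so the sketch assumes the common relative-frequency estimate (count of a rule divided by the count of all rules with the same left-hand side). The names `Rule` and `estimate_rules`, and the toy rule observations, are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A grammar rule annotated with both a raw frequency and a probability."""
    lhs: str
    rhs: tuple
    frequency: int      # how often the rule was observed
    probability: float  # assumed here: frequency / total for the same lhs


def estimate_rules(observed):
    """Build annotated rules from an iterable of (lhs, rhs) observations.

    Assumption: probabilities are relative frequencies per left-hand side.
    The paper's actual estimation method is not given in the abstract.
    """
    counts = Counter(observed)
    lhs_totals = defaultdict(int)
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return [
        Rule(lhs, rhs, n, n / lhs_totals[lhs])
        for (lhs, rhs), n in counts.items()
    ]


# Hypothetical rule observations, e.g. read off a small treebank.
observed = [
    ("NP", ("n",)),
    ("NP", ("n",)),
    ("NP", ("a", "n")),
    ("NP", ("n", "n")),
    ("VP", ("v", "NP")),
]
rules = estimate_rules(observed)
for rule in rules:
    print(rule.lhs, "->", " ".join(rule.rhs), rule.frequency, rule.probability)
```

Keeping the raw frequency next to the probability lets a parser distinguish a rule seen once from a rule seen many times even when both have the same probability, which is one plausible reason to store both values.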
Similar resources
An Approach to Automatic Identification of Chinese Base Noun Phrases
This paper presents an approach to identifying Chinese base noun phrases. The method is based on the GLR parsing algorithm and extends it further. It is a hybrid approach that combines rule-based and statistical methods by using a PCFG system. The experimental results show that the method is simple, feasible, and efficient for base noun phrase identification.
An Effective Framework for Chinese Syntactic Parsing
This paper presents an effective framework for Chinese syntactic parsing, which consists of two parts. The first is a parsing framework based on an improved bottom-up chart parsing algorithm; it integrates the beam search strategy of the N-best algorithm and the heuristic function of the A* algorithm for pruning, and produces multiple parse trees. The second is a novel evaluation mode...
The Parsing Algorithm of Translation Corresponding Tree (TCT) Grammar
In machine translation (MT), parsing is a core step that analyzes and acquires the syntactic information of an input sentence in order to reproduce the corresponding translation in the target language according to the syntactic relationships between the source and target sentences. The parsing process is guided by a set of language formalism, and the design of such algorithm is highly dep...
Empirical Support for Probabilistic GLR Parsing
This paper discusses the effectiveness of a new probabilistic generalized LR model (PGLR) in word-based parsing (morphological and syntactic analysis) tasks, in which we have to consider the word segmentation and multiple part-of-speech problems. Parsing a sentence from the morphological level makes the task much more complex because of the increase of parse ambiguity stemming from word segmenta...
PCFG parsing with CRF tagging for head recognition
This paper presents our work for participation in the 2009 CIPS-Parseval shared task on Chinese syntactic tree parsing, for which we adopt a general PCFG parsing procedure with a conditional random fields (CRF) tagger for head constituent recognition. Our experiments show that an acceptable tagging result is obtained on the basis of a standard PCFG parsing output and a further evaluation on oth...