Influence of Conditional Independence Assumption on Verb Subcategorization Detection
نویسندگان
چکیده
Learning Bayesian Belief Networks from corpora has been applied to the automatic acquisition of verb subcategorization frames for Modern Greek (MG). We are incorporating minimal linguistic resources, i.e morphological tagging and phrase chunking, since a general-purpose syntactic parser for MG is currently unavailable. Comparative experimental results have been evaluated against Naive Bayes classification, which is based on the conditional independence assumption along with two widely used methods, Log-Likelihood (LLR) and Relative Frequencies Threshold (RFT). We have experimented with a balanced corpus in order to assure unbiased behaviour of the training model. Results have depicted that obtaining the inferential dependencies of the training data could lead to a precision improvement of about 4% compared to that of Naive Bayes and 7% compared to LLR and RFT Moreover, we have been able to achieve a precision exceeding 87% on the identification of subcategorization frames which are not known beforehand, while limited training data are proved to endow with satisfactory results.
منابع مشابه
Verb Sense and Verb Subcategorization Probabilities
Roland, Douglas William (Ph.D., Linguistics) Verb Sense and Verb Subcategorization Probabilities Thesis directed by Associate Professor Daniel S. Jurafsky This dissertation investigates a variety of problems in psycholinguistics and computational linguistics caused by the differences in verb subcategorization probabilities found between various corpora and experimental data sets. For psycholing...
متن کاملOptimum Local Decision Rules in a Distributed Detection System with Dependent Observations
The theory of distributed detection is receiving a lot of attention. A common assumption used in previous studies is the conditional independence of the observations. In this paper, the optimization of local decision rules for distributed detection networks with correlated observations is considered. We focus on presenting the detection theory for parallel distributed detection networks with fi...
متن کاملOptimum Local Decision Rules in a Distributed Detection System with Dependent Observations
The theory of distributed detection is receiving a lot of attention. A common assumption used in previous studies is the conditional independence of the observations. In this paper, the optimization of local decision rules for distributed detection networks with correlated observations is considered. We focus on presenting the detection theory for parallel distributed detection networks with fi...
متن کاملSubcategorization acquisition
Manual development of large subcategorised lexicons has proved difficult because predicates change behaviour between sublanguages, domains and over time. Yet access to a comprehensive subcategorization lexicon is vital for successful parsing capable of recovering predicate-argument relations, and probabilistic parsers would greatly benefit from accurate information concerning the relative likel...
متن کاملHow Verb Subcategorization Frequencies Are Affected By Corpus Choice
The probabilistic relation between verbs and their arguments plays an important role in modern statistical parsers and supertaggers, and in psychological theories of language processing. But these probabilities are computed in very different ways by the two sets of researchers. Computational linguists compute verb subcategorization probabilities from large corpora while psycholinguists compute ...
متن کامل