Learning Unification-Based Grammars Using the Spoken English Corpus
Abstract
This paper describes a grammar learning system that combines model-based and data-driven learning within a single framework. Our results from learning grammars using the Spoken English Corpus (SEC) suggest that combined model-based and data-driven learning can produce a more plausible grammar than is the case when using either learning style in isolation.
Similar resources
Can punctuation help learning?
The quality of learnt natural language grammars can be enhanced by exploiting the linguistic devices that comprise a corpus. This paper considers one such device, namely punctuation. After briefly considering the linguistics of punctuation, a model capturing some of these properties is presented. Following this, a series of experiments learning unification-based natural language grammars, using t...
Acquiring Plausible Unification-Based Grammars using Model-Based and Data-Driven Learning
Undergeneration is a problem that undermines successful parsing of unrestricted texts. A popular solution to this problem is automatic grammar correction (or machine learning of grammar). Broadly speaking, grammar correction approaches can be classified as being either data-driven, or model-based. Data-driven learners use data-intensive methods to acquire grammar. They typically use grammar form...
More for less: learning a wide covering grammar from a small training set
This paper describes a grammar learning system which combines model-based and data-driven learning within a single framework. Results from learning grammars with the Spoken English Corpus (SEC) suggest that a combined model-based and data-driven learner can acquire a wide coverage grammar from only a small training corpus. In this paper, we present some results of our grammar learning system. W...
Learning unification-based natural language grammars
Practical text processing systems need wide covering grammars. When parsing unrestricted language, such grammars often fail to generate all of the sentences that humans would judge to be grammatical. This problem undermines successful parsing of the text and is known as undergeneration. There are two main ways of dealing with undergeneration: either by sentence correction, or by grammar correct...