ABL: Alignment-Based Learning
Author
Abstract
This paper introduces a new type of grammar learning algorithm, inspired by string edit distance (Wagner and Fischer, 1974). The algorithm takes a corpus of flat sentences as input and returns a corpus of labelled, bracketed sentences. The method works on pairs of unstructured sentences that have one or more words in common. When two sentences are divided into parts that are the same in both sentences and parts that are different, this information is used to find parts that are interchangeable. These parts are taken as possible constituents of the same type. After this alignment learning step, the selection learning step selects the most probable constituents from all possible constituents. This method was used to bootstrap structure on the ATIS corpus (Marcus et al., 1993) and on the OVIS corpus (Bonnema et al., 1997). While the results are encouraging (we obtained up to 89.25% non-crossing brackets precision), this paper will point out some of the shortcomings of our approach and will suggest possible solutions.
1 Introduction
Unsupervised learning of syntactic structure is one of the hardest problems in NLP. Although people are adept at learning grammatical structure, it is difficult to model this process and therefore it is hard to make a computer learn...
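The alignment learning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes whitespace tokenization and uses Python's standard `difflib.SequenceMatcher` as a stand-in for an edit-distance alignment. The function name `interchangeable_parts` and the example sentences are hypothetical, chosen only for illustration.

```python
from difflib import SequenceMatcher


def interchangeable_parts(sentence_a, sentence_b):
    """Return pairs of differing word spans from two aligned sentences.

    Illustrative assumption: difflib's matching replaces the edit-distance
    alignment named in the abstract, and words are split on whitespace.
    """
    words_a = sentence_a.split()
    words_b = sentence_b.split()
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)

    hypotheses = []
    shares_words = False
    for tag, a_lo, a_hi, b_lo, b_hi in matcher.get_opcodes():
        if tag == "equal":
            # Identical parts provide the shared context.
            shares_words = True
        else:
            # Differing parts are hypothesized to be interchangeable,
            # i.e. possible constituents of the same type. One side may
            # be empty when a span was only inserted or deleted.
            hypotheses.append((words_a[a_lo:a_hi], words_b[b_lo:b_hi]))

    # The method only applies to pairs with at least one word in common.
    return hypotheses if shares_words else []


pairs = interchangeable_parts(
    "What is a family fare",
    "What is the payload of an African Swallow",
)
for left, right in pairs:
    print(" ".join(left), "<->", " ".join(right))
```

The sketch covers only the generation of candidate constituents. The selection learning step, which chooses the most probable constituents among overlapping candidates, is not shown.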
Similar resources
Alignment-Based Learning versus Data-Oriented Parsing
This chapter will briefly describe the Alignment-Based Learning (ABL) framework and relate it to Data-Oriented Parsing from different viewpoints. Firstly, ABL can be used to bootstrap an initial treebank, which can then be used by DOP. Secondly, ABL can be used to enhance the robustness of DOP. Thirdly, a DOP model can be used to disambiguate ambiguous syntactic structures found during the lear...
String Alignment in Grammatical Inference
This thesis is concerned with unsupervised learning of syntactic structure from plain text corpora by aligning sentences. Based on Harris’ (1951) linguistic notion of substitutability, sentences in a plain text corpus can be compared to each other and those parts that have similar context and in addition can be substituted for each other without resulting in ungrammatical sentences are consider...
String Alignment in Grammatical Inference: what Suffix Trees can do
This thesis is concerned with unsupervised learning of syntactic structure from plain text corpora by aligning sentences. Based on Harris’ (1951) linguistic notion of substitutability, sentences in a plain text corpus can be compared to each other and those parts that have similar context and in addition can be substituted for each other without resulting in ungrammatical sentences are consider...
Grammatical Inference Using Suffix Trees
The goal of the Alignment-Based Learning (ABL) grammatical inference framework is to structure plain (natural language) sentences as if they are parsed according to a context-free grammar. The framework produces good results even when simple techniques are used. However, the techniques used so far have computational drawbacks, resulting in limitations with respect to the amount of language data...
Bootstrapping structure into language: alignment-based learning
". . . refined and abstract meanings largely grow out of more concrete meanings." (Bloomfield, 1933) This thesis introduces a new unsupervised learning framework, called Alignment-Based Learning, which is based on the alignment of sentences and Harris’s (1951) notion of substitutability. Instances of the framework can be applied to an untagged, unstructured corpus of natural language sentences, r...
ABL: Alignment-Based Learning
This paper introduces a new type of grammar learning algorithm, inspired by string edit distance (Wagner and Fischer, 1974). The algorithm takes a corpus of flat sentences as input and returns a corpus of labelled, bracketed sentences. The method works on pairs of unstructured sentences that have one or more words in common. When two sentences are divided into parts that are the same in both se...