Learning Cover Context-Free Grammars from Structural Data
Abstract
We consider the problem of learning an unknown context-free grammar when the only knowledge available and of interest to the learner is about its structural descriptions with depth at most l. The goal is to learn a cover context-free grammar (CCFG) with respect to l, that is, a CFG whose structural descriptions with depth at most l agree with those of the unknown CFG. We propose an algorithm, called LA, that efficiently learns a CCFG using two types of queries: structural equivalence and structural membership. We show that LA runs in time polynomial in the number of states of a minimal deterministic finite cover tree automaton (DCTA) with respect to l. This number is often much smaller than the number of states of a minimum deterministic finite tree automaton for the structural descriptions of the unknown grammar.
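The notion of two grammars agreeing on structural descriptions up to a depth bound can be illustrated with a small sketch. This is not the LA algorithm and uses no queries; it only enumerates unlabeled derivation trees by brute force. The function names, the dictionary encoding of grammars, the two toy grammars, and the convention that a terminal leaf has depth 0 are all assumptions made for illustration.

```python
from itertools import product

def bounded_skeletons(grammar, symbol, max_depth):
    """Return the unlabeled derivation trees rooted at `symbol` with depth <= max_depth.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples of symbols).
    A terminal is a bare string leaf; an internal node is a tuple of child skeletons,
    so nonterminal labels are dropped (assumed encoding of a structural description).
    """
    if symbol not in grammar:      # terminal: a leaf of depth 0
        return {symbol}
    if max_depth == 0:             # a nonterminal needs at least one more level
        return set()
    result = set()
    for rhs in grammar[symbol]:
        child_sets = [bounded_skeletons(grammar, s, max_depth - 1) for s in rhs]
        for children in product(*child_sets):
            result.add(tuple(children))
    return result

def agrees_up_to(g1, g2, start, l):
    """True if both grammars have the same skeletons of depth at most l."""
    return bounded_skeletons(g1, start, l) == bounded_skeletons(g2, start, l)

# Hypothetical target grammar: S -> a S b | a b
TARGET = {"S": [("a", "S", "b"), ("a", "b")]}
# Hypothetical candidate that stops nesting after one level: S -> a T b | a b, T -> a b
COVER = {"S": [("a", "T", "b"), ("a", "b")], "T": [("a", "b")]}

if __name__ == "__main__":
    for l in (1, 2, 3):
        print(l, agrees_up_to(TARGET, COVER, "S", l))
```

Under these assumptions, the candidate grammar agrees with the target for l = 2 but not for l = 3, which shows how a grammar can match another on all shallow structural descriptions without being structurally equivalent to it.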
Similar resources
Efficient Learning of Context-Free Grammars from Positive Structural Examples
In this paper, we introduce a new normal form for context-free grammars, called reversible context-free grammars, for the problem of learning context-free grammars from positive-only examples. A context-free grammar $G = (N, \Sigma, P, S)$ is said to be reversible if (1) $A \to \alpha$ and $B \to \alpha$ in $P$ implies $A = B$ and (2) $A \to \alpha B \beta$ and $A \to \alpha C \beta$ in $P$ implies $B = C$. We show that the class of reversible context...
Learning context-free grammars from stochastic structural information
We consider the problem of learning context-free grammars from stochastic structural data. For this purpose, we have developed an algorithm (tlips) which identifies any rational tree set from stochastic samples and approximates the probability distribution of the trees in the language. The procedure identifies equivalent subtrees in the sample and outputs the hypothesis in linear time with the nu...
Implicit Learning of Recursive Context-Free Grammars
Context-free grammars are fundamental for the description of linguistic syntax. However, most artificial grammar learning experiments have explored learning of simpler finite-state grammars, while studies exploring context-free grammars have not assessed awareness and implicitness. This paper explores the implicit learning of context-free grammars employing features of hierarchical organization...
Studying impressive parameters on the performance of Persian probabilistic context free grammar parser
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...
Lambek Calculus Proofs and Tree Automata
We investigate natural deduction proofs of the Lambek calculus from the point of view of tree automata. The main result is that the set of proofs of the Lambek calculus cannot be accepted by a finite tree automaton. The proof is extended to cover the proofs used by grammars based on the Lambek calculus, which typically use only a subset of the set of all proofs. While Lambek grammars can assign...