Learning of Finite Unions of Tree Patterns with Internal Structured Variables from Queries

Authors

  • Satoshi Matsumoto
  • Takayoshi Shoudai
  • Tetsuhiro Miyahara
  • Tomoyuki Uchida
Abstract

We consider the polynomial time learnability of finite unions of ordered tree patterns with internal structured variables, in the query learning model of Angluin (1988). An ordered tree pattern with internal structured variables, called a term tree, is a rooted tree pattern which consists of tree structures with ordered children and internal structured variables. A term tree is suited for representing structural features in semistructured or tree structured data such as HTML/XML files. The language L(t) of a term tree t is the set of all trees which are obtained from t by substituting arbitrary trees for all variables in t. Moreover, for a finite set H of term trees, L(H) = ⋃_{t∈H} L(t). Let H∗, the target of learning, be a finite set of term trees. An oracle for restricted subset queries answers “yes” for an input set H if L(H) ⊆ L(H∗), and answers “no” otherwise. An oracle for equivalence queries returns “yes” for an input set H if L(H) = L(H∗), and otherwise returns a counterexample in (L(H) ∪ L(H∗)) − (L(H) ∩ L(H∗)). We show that any finite union of languages defined by m term trees is exactly identifiable in polynomial time using at most 2mn restricted subset queries and at most m + 1 equivalence queries, where n is the maximum size of counterexamples.
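
The two oracles defined in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm and does not model term trees: as a loudly stated assumption, each "pattern" is a short string whose language is a small finite set of words, so that the subset and equivalence checks can be computed directly. The names `toy_lang`, `union_language`, and `Oracle` are hypothetical and exist only for this illustration.

```python
from typing import Callable, FrozenSet, Iterable, Optional

Language = FrozenSet[str]


def toy_lang(pattern: str) -> Language:
    """Assumed toy language (not a term tree language):
    '*' stands for exactly one letter from 'ab'."""
    words = {""}
    for ch in pattern:
        choices = "ab" if ch == "*" else ch
        words = {w + c for w in words for c in choices}
    return frozenset(words)


def union_language(hypothesis: Iterable[str],
                   lang: Callable[[str], Language]) -> Language:
    """L(H): the union of L(t) over all patterns t in H."""
    result = set()
    for t in hypothesis:
        result |= lang(t)
    return frozenset(result)


class Oracle:
    """Answers queries about a hidden target set H* of patterns."""

    def __init__(self, target: Iterable[str],
                 lang: Callable[[str], Language]) -> None:
        self._lang = lang
        self._target = union_language(target, lang)

    def restricted_subset(self, hypothesis: Iterable[str]) -> bool:
        # "yes" iff L(H) is a subset of L(H*); no counterexample is given.
        return union_language(hypothesis, self._lang) <= self._target

    def equivalence(self, hypothesis: Iterable[str]) -> Optional[str]:
        # None plays the role of "yes"; otherwise return one element of
        # the symmetric difference of L(H) and L(H*) as a counterexample.
        diff = union_language(hypothesis, self._lang) ^ self._target
        return min(diff) if diff else None


# Hidden target H* = {"a*", "*b"}, so L(H*) = {"aa", "ab", "bb"}.
oracle = Oracle({"a*", "*b"}, toy_lang)
subset_yes = oracle.restricted_subset({"ab"})
subset_no = oracle.restricted_subset({"**"})
counterexample = oracle.equivalence({"a*"})
```

In this toy setting the languages are finite, so both queries reduce to set operations. For real term tree languages, which are infinite sets of trees, the oracles are assumed to be given and the learner only counts how many times it calls them.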

Similar resources

Efficient Learning of Ordered Tree Patterns with Internal Structured Variables

We show that some fundamental classes of ordered tree pattern languages are polynomial time inductively inferable from positive data and exactly learnable in polynomial time using queries. We report experimental results on applying our learning method to a collection of tree structured data.

Efficient Learning of Semi-structured Data from Queries

This paper studies the polynomial-time learnability of the classes of ordered gapped tree patterns (OGT) and ordered gapped forests (OGF) under the into-matching semantics in the query learning model of Angluin. The class OGT is a model of semi-structured database query languages, and a generalization of both the class of ordered/unordered tree pattern languages and the class of non-erasing reg...

Efficient Learning of Semi-structured Data from Queries

This paper studies the learning complexity of classes of structured patterns for HTML/XML-trees in the query learning framework of Angluin. We present polynomial time learning algorithms for ordered gapped tree patterns, OGT, and ordered gapped forests, OGF, under the into-matching semantics using equivalence queries and subset queries. As a corollary, the learnability with equivalence and mem...

Exact Learning of Tree Patterns

Tree patterns are natural candidates for representing rules and hypotheses in many tasks such as information extraction and symbolic mathematics. A tree pattern is a tree with labeled nodes where some of the leaves may be labeled with variables, whereas a tree instance has no variables. A tree pattern matches an instance if there is a consistent substitution for the variables that allows a mapp...

Learning Boxes in High Dimension

We present exact learning algorithms that learn several classes of (discrete) boxes in {0, …, ℓ − 1}^n. In particular we learn: (1) The class of unions of O(log n) boxes in time poly(n, log ℓ) (solving an open problem of [16, 12]; in [3] this class is shown to be learnable in time poly(n, ℓ)). (2) The class of unions of disjoint boxes in time poly(n, t, log ℓ), where t is the number of boxes....

Journal title:
  • IEICE Transactions

Volume 91-D, Issue

Pages -

Publication date 2002