Algorithms for Learning Regular Expressions

Author

  • Henning Fernau

Abstract

We describe algorithms that directly infer regular expressions from positive data and characterize the regular language classes that can be learned this way.


Similar resources

Theory and Algorithms for Information Extraction and Classification in Textual Data Mining

Regular expressions can be used as patterns to extract features from semi-structured and narrative text [8]. For example, in police reports a suspect’s height might be recorded as “{CD} feet {CD} inches tall”, where {CD} is the part of speech tag for a numeric value. The result in [1] shows us that regular expressions could have higher performance than explicit expressions in some applications ...
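The height pattern quoted above can be illustrated with a minimal Python sketch. It assumes numeric values are written as digits, so `\d+` stands in for the {CD} tag; a real system would rely on a part-of-speech tagger instead. The names `HEIGHT_PATTERN` and `extract_height` are hypothetical and are not taken from the cited work.

```python
import re

# Assumption: \d+ approximates the {CD} (cardinal number) tag.
# A real pipeline would tag tokens with a part-of-speech tagger first.
HEIGHT_PATTERN = re.compile(r"(\d+) feet (\d+) inches tall")


def extract_height(text):
    """Return (feet, inches) for the first height mention, or None."""
    match = HEIGHT_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(extract_height("The suspect is 5 feet 11 inches tall and wore a red cap."))
```

Heights spelled out in words ("five feet") would not match this digit-only approximation, which is one reason the tag-based pattern in the abstract is more general.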


Learning Regular Languages via Alternating Automata

Nearly all algorithms for learning an unknown regular language, in particular the popular L∗ algorithm, yield deterministic finite automata. It was recently shown that the ideas of L∗ can be extended to yield non-deterministic automata, and that the respective learning algorithm, NL∗, outperforms L∗ on randomly generated regular expressions. We conjectured that this is due to the existential na...


Learning Regular Expressions from Noisy Sequences

The presence of long gaps dramatically increases the difficulty of detecting and characterizing complex events hidden in long sequences. In order to cope with this problem, a learning algorithm based on an abstraction mechanism is proposed: it can infer the general model of complex events from a set of learning sequences. Events are described by means of regular expressions, and the abstraction...


Algorithms for learning regular expressions from positive data

Article history: Received 26 October 2007 Revised 5 December 2008 Available online 24 January 2009


Wiki Vandalysis - Wikipedia Vandalism Analysis - Lab Report for PAN at CLEF 2010

Wikipedia describes itself as the “free encyclopedia that anyone can edit”. Along with the helpful volunteers who contribute by improving the articles, a great number of malicious users abuse the open nature of Wikipedia by vandalizing articles. Deterring and reverting vandalism has become one of the major challenges of Wikipedia as its size grows. Wikipedia editors fight vandalism both manuall...




Publication date: 2005