TextMarker: A Tool for Rule-Based Information Extraction

Authors

  • Peter Kluegl
  • Martin Atzmueller
  • Frank Puppe
Abstract

This paper presents TEXTMARKER, a powerful toolkit for rule-based information extraction. TEXTMARKER is based on UIMA and provides versatile information processing and advanced extraction techniques. We thoroughly describe the system and its capabilities for human-like information processing and rapid prototyping of information extraction applications.

Similar resources

Test-Driven Development of Complex Information Extraction Systems using TextMarker

Information extraction is concerned with the location of specific items in textual documents. Common process models for this task use ad-hoc testing methods against a gold standard. This paper presents an approach for the test-driven development of complex information extraction systems. We propose a process model for test-driven information extraction, and discuss its implementation using the r...

Rule-Based Information Extraction for Structured Data Acquisition using TextMarker

Information extraction is concerned with the location of specific items in (unstructured) textual documents, e.g., being applied for the acquisition of structured data. Then, the acquired data can be applied for mining methods requiring structured input data, in contrast to other text mining methods that utilize a bag-of-words approach. This paper presents a semi-automatic approach for structur...

A Framework for Semi-Automatic Development of Rule-based Information Extraction Applications

For the successful processing and handling of (large scale) document collections, effective information extraction methods are essential. This paper presents a framework for the semi-automatic development of rule-based information extraction applications based on the TEXTMARKER language utilizing machine learning methods. We describe the approach in detail and present the TEXTRULER system as an ...

Integrating the Rule-Based IE Component TextMarker into UIMA

In this paper we describe the integration of the rule-based IE component TEXTMARKER into the UIMA framework. We present a conceptual overview on the TEXTMARKER system before we describe the UIMA integration in detail.

Reliability Measures Measurement under Rule-Based Fuzzy Logic Technique

In reliability theory, the reliability measures contend the very important and depreciative role for any system analysis. Measurement of reliability measures is not easy due to ambiguity and vagueness which exist within reliability parameters. It is also very difficult to incorporate a large amount of uncertainty in well-established methodologies and techniques. However, fuzzy logic provides an...

Publication date: 2009