Heterogeneous Natural Language Processing Tools via Language Processing Chains

Author

  • Diman Karagiozov
Abstract

One of the most recent developments in NLP is the emergence of linguistic annotation meta-systems, which make use of existing processing tools and implement a pipelined architecture. In this paper we describe a system that offers a new perspective on exploiting NLP meta-systems by providing a common processing framework. This framework supports most common NLP tasks by chaining tools that communicate through common formats. To demonstrate how effectively the system manages heterogeneous NLP tools, we developed an English processing chain that pipelines OpenNLP-based and C++ NLP implementations. Furthermore, we conducted experiments to test the stability and measure the performance of the English processing chain. A baseline processing chain for Bulgarian illustrates the system's capability to support and manage processing chains for further languages.
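The idea of chaining tools that exchange data in a common format can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Document` class, the `run_chain` helper, and the two toy tools are hypothetical stand-ins and do not reflect the system's actual formats, tools, or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical common format that every tool in the chain reads and writes."""
    text: str
    annotations: Dict[str, list] = field(default_factory=dict)


# A tool is any callable that takes a Document and returns a Document.
Tool = Callable[[Document], Document]


def sentence_splitter(doc: Document) -> Document:
    # Toy stand-in for a real sentence splitter: split on periods.
    doc.annotations["sentences"] = [
        s.strip() for s in doc.text.split(".") if s.strip()
    ]
    return doc


def tokenizer(doc: Document) -> Document:
    # Toy stand-in for a real tokenizer: consumes the previous tool's output.
    doc.annotations["tokens"] = [s.split() for s in doc.annotations["sentences"]]
    return doc


def run_chain(tools: List[Tool], doc: Document) -> Document:
    # Apply each tool in order; the shared Document is the only interface.
    for tool in tools:
        doc = tool(doc)
    return doc


result = run_chain(
    [sentence_splitter, tokenizer],
    Document("Chains link tools. Tools share a format."),
)
print(result.annotations["tokens"])
```

Because each tool only depends on the shared document structure, a tool written in another language could in principle be wrapped behind the same callable interface, which is the property the abstract attributes to common formats.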

Similar resources

LREC 2008 Workshop Sustainability of Language Resources and Tools for Natural Language Processing

This paper describes the concept and usage of ALPE (Automated Linguistic Processing Environment), a system designed to facilitate the management and deployment of large and dynamic collections of linguistic resources and tools. ALPE can build linguistic processing chains involving the annotation formats and the tools integrated into a hierarchical structure. The particularities and advantages of...

A New Text Mining Method for Extracting User Context Information to Improve the Ranking of Search Engine Results

Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual material increases day by day, so we need useful ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo, and Bing need to read a great many web documents and retrieve the ones most similar to the user ...

ALPE as LT4eL processing chain environment

This paper briefly describes the concept, initial implementation and usage of the ALPE system for natural language processing. A hierarchy connecting annotation schemas, processing tools and resources is used as working environment for the system, which can perform various complex NL processing tasks. ALPE will be used to build linguistic processing chains involving the annotation formats and t...

On the Link between Identity Processing and Learning Styles among Young Language learners

The present study attempted to investigate the probable relationship between Iranian young language learners’ identity processing styles and their learning styles. To this end, 29 advanced learners (23 females and 6 males) were randomly selected from an English language institute. Twenty-nine advanced young language learners were chosen randomly out of all advanced young language learners in t...

A set of Tools for Integrating Linguistic and Non-Linguistic Information

In this position paper we describe the current state of development of an integrated set of tools (called SCHUG) for language processing that supports interaction with disparate sources of information, thus making Natural Language Processing (NLP) and Human Language Technology (HLT) even more relevant for Information Technology (IT) applications. The set of tools is realizing the communication ...

Publication date: 2011