Reductionistic, Tree and Rule Based Tagger for Polish

Authors

  • Maciej Piasecki
  • Grzegorz Godlewski
Abstract

The paper presents an approach to tagging of Polish based on the combination of handmade reduction rules and selecting rules acquired by Induction of Decision Trees. The general open architecture of the tagger is presented, where the overall process of tagging is divided into subsequent steps and the overall problem is reduced to subproblems of ambiguity classes. A special language of constraints and the use of constraints as elements of decision trees are described. The results of the experiments performed with the tagger are also presented.
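
The two-stage idea in the abstract (hand-made reduction rules first, then selecting rules applied per ambiguity class) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the tag names, the single reduction rule, and the hand-coded selector that stands in for an induced decision tree are hypothetical and are not taken from the paper or its constraint language.

```python
from typing import Callable, Dict, FrozenSet, List, Set

# A token is a surface form plus the candidate tags proposed by a
# morphological analyser. The tag names are toy values, not a real tagset.
Token = Dict[str, object]


def make_token(form: str, tags: Set[str]) -> Token:
    return {"form": form, "tags": set(tags)}


def drop_verb_after_prep(sent: List[Token], i: int) -> None:
    """Hypothetical hand-made reduction rule: remove the 'verb' reading
    of a token that directly follows an unambiguous preposition."""
    if i > 0 and sent[i - 1]["tags"] == {"prep"}:
        reduced = sent[i]["tags"] - {"verb"}
        if reduced:  # never remove the last remaining candidate
            sent[i]["tags"] = reduced


def choose_noun_or_adj(sent: List[Token], i: int) -> str:
    """Stand-in for a decision tree induced for the class {noun, adj}:
    prefer 'adj' when the next token can be a noun."""
    nxt = sent[i + 1]["tags"] if i + 1 < len(sent) else set()
    return "adj" if "noun" in nxt else "noun"


REDUCTION_RULES: List[Callable[[List[Token], int], None]] = [drop_verb_after_prep]

# One selector per ambiguity class (the set of tags still in play).
SELECTORS: Dict[FrozenSet[str], Callable[[List[Token], int], str]] = {
    frozenset({"noun", "adj"}): choose_noun_or_adj,
}


def tag(sent: List[Token]) -> List[str]:
    # Step 1: reduction rules shrink the candidate sets.
    for i in range(len(sent)):
        for rule in REDUCTION_RULES:
            rule(sent, i)
    # Step 2: each remaining ambiguity is routed to the selector
    # registered for its ambiguity class.
    result = []
    for i, tok in enumerate(sent):
        tags = tok["tags"]
        if len(tags) == 1:
            result.append(next(iter(tags)))
            continue
        selector = SELECTORS.get(frozenset(tags))
        # Arbitrary deterministic fallback when no selector is registered.
        result.append(selector(sent, i) if selector else sorted(tags)[0])
    return result


sentence = [
    make_token("p", {"prep"}),
    make_token("x", {"noun", "verb", "adj"}),
    make_token("y", {"noun"}),
]
print(tag(sentence))
```

In this sketch the reduction step turns the three-way ambiguity of the second token into the class {noun, adj}, which is then resolved by the selector registered for that class. Keying selectors by ambiguity class is what lets the overall problem be split into smaller subproblems.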

Similar resources

Multiclassifier Approach to Tagging of Polish

The large tagset, the limited size of corpora and the free word order are the main causes for achieving low accuracy of tagging Polish by applying the commonly used techniques based on stochastic modelling. The proposed architecture of the Polish tagger called TaKIPI created the possibility for using different types of classifiers in tagging, but only C4.5 Decision Trees were applied initially....

POLISH TAGGER TaKIPI: RULE BASED CONSTRUCTION AND OPTIMISATION

A large number of different tags, limited corpora and the free word order are the main causes of low accuracy of tagging in Polish (automatic disambiguation of morphological descriptions) by applying commonly used techniques based on stochastic modelling. In the paper the rule-based architecture of the TaKIPI Polish tagger combining handwritten and automatically extracted rules is presented. Th...

A Rule-Based Tagger for Polish Based on Genetic Algorithm

In the paper an approach to the construction of rule-based morphosyntactic tagger for Polish is proposed. The core of the tagger are modules of rules (classification systems), acquired from the IPI PAN corpus by application of Genetic Algorithms. Each module is specialised in making decisions concerning different parts of a tag (a structure of attributes). The acquired rules are combined with l...

Voltage Sag Compensation with DVR in Power Distribution System Based on Improved Cuckoo Search Tree-Fuzzy Rule Based Classifier Algorithm

A new technique is presented to improve the performance of dynamic voltage restorer (DVR) for voltage sag mitigation. This control scheme is based on cuckoo search algorithm with tree fuzzy rule based classifier (CSA-TFRC). CSA is used for optimizing the output of TFRC so the classification output of the network is enhanced. While, the combination of cuckoo search algorithm, fuzzy and decision tree...

FTAG : current status and parsing scheme

As far as electronic syntactic resources go, one can distinguish rule-based versus statistics-based grammars, as well as program-dependent versus reusable grammars. Lexicalized Tree adjoining grammars (LTAGs) have been used to develop reusable wide-coverage rule-based grammars for different languages (cf. Doran et al. 1994, 1998 for English, Abeillé 1991 and Candito 1999 for French). We describe...

Publication date: 2005