Morpho-Syntactic Descriptions in MULTEXT-East - the Case of Serbian

Authors

  • Cvetana Krstev
  • Duško Vitas
  • Tomaž Erjavec
Abstract

Cvetana Krstev,∗ Duško Vitas† and Tomaž Erjavec‡
∗Faculty of Philology, University of Belgrade, Studentski trg 3, 11000 Belgrade, Serbia and Montenegro
†Faculty of Mathematics, University of Belgrade, Studentski trg 16, 11000 Belgrade, Serbia and Montenegro
‡Department of Knowledge Technologies, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia

Similar resources

MULTEXT-East Resources for Serbian

The paper presents the MULTEXT-East language resources for the Serbian language. MULTEXT-East is a multilingual dataset for language engineering research and development. This standardised and linked set of resources covers a large number of mainly Central and Eastern European languages and includes the EAGLES-based morphosyntactic specifications, defining the features that describe word-level s...

Using a Large Set of EAGLES-compliant Morpho-Syntactic Descriptors as a Tagset for Probabilistic Tagging

The paper presents one way of reconciling data sparseness with the requirement of high accuracy tagging in terms of fine-grained tagsets. For lexicon encoding, EAGLES elaborated a set of recommendations aimed at covering multilingual requirements and therefore resulted in a large number of features and possible values. Such an encoding, used for tagging purposes, would lead to very large tagset...

MULTEXT-East Version 3: Multilingual Morphosyntactic Specifications, Lexicons and Corpora

The paper presents the third edition of the MULTEXT-East language resources, a multilingual dataset for language engineering research and development. This standardised and linked set of resources covers a large number of mainly Central and Eastern European languages and includes the EAGLES-based morphosyntactic specifications, defining the features that describe word-level syntactic annotation...

A Description of Morphological Features of Serbian: a Revision using Feature System Declaration

In this paper we discuss some well-known morphological descriptions used in various projects and applications (most notably MULTEXT-East and Unitex) and illustrate the encountered problems on Serbian. We have spotted four groups of problems: the lack of a value for an existing category, the lack of a category, the interdependence of values and categories lacking some description, and the lack o...

Morpho-syntactic Clues for Terminological Processing in Serbian

In this paper we discuss morpho-syntactic clues that can be used to facilitate terminological processing in Serbian. A method (called SRCE) for automatic extraction of multiword terms is presented. The approach incorporates a set of generic morpho-syntactic filters for recognition of term candidates, a method for conflation of morphological variants and a module for foreign word recognition. Mo...

Journal title:
  • Informatica (Slovenia)

Volume 28  Issue

Pages  -

Publication date 2004