Raslan 2009

Authors

  • Petr Sojka
  • Aleš Horák
Abstract

The article describes the process of constructing a spell checker for the Esperanto language and its implementation as a dictionary (i.e. an affix file and a word list) for the Hunspell spell-checking engine. In comparison to existing solutions, the chosen approach takes note of morphologically complex words, which are common in Esperanto due to its agglutinative nature, and applies a set of rules describing allowed morpheme compounds, along with semantic classification of all involved word roots. The result has been tested with a user community and is presently being incorporated into the OpenOffice.org office suite.

Related resources

Problems of Machine Translation Evaluation

In this article we deal with general aspects of machine translation evaluation. We describe several commonly used evaluation methods and discuss their problems and shortcomings. We then outline a few ideas that attempt to solve these problems and that underlie the design of a new method of machine translation evaluation.

Discovering Grammatical Relations in Czech Sentences

The syntactic parser synt, developed at the NLP Centre, Faculty of Informatics, Masaryk University, can provide as one of its possible outputs a list of dependency relations discovered in the analysed sentence. In the paper, we present the result of codification and translation of the (rather technically labeled) dependency relations from synt to linguistically significant relations. The resulting r...

The Saara Framework: An Anaphora Resolution System for Czech

Determining reference and referential links in discourse is one of the biggest and most important challenges in natural language understanding. In particular, computing coreference classes over the set of referring expressions in text is crucial for its further syntactic and semantic processing. We present a system for automatic anaphora resolution that can be used on arbitrary texts in Czech. ...

Semantic Network Integrity Maintenance via Heuristic Semi-Automatic Tests

In this article we discuss issues connected with maintaining the content integrity of a general-purpose semantic network that is under development. Constructing a semantic network from scratch is a long process that usually requires both manual linguistic work and semi-automatic methods to add or translate the data, which must subsequently be reviewed. In this process many systemic and/or languag...

Applying Word Sketches to Russian

The paper describes work on writing a Russian Sketch grammar for the system Sketch Engine. The objective of such a system is to provide lexicographers with sufficient lexical material and tools for getting information about a word’s collocability and to generate lists of the most frequent phrases for a given word, and then to classify them for appropriate syntactic models. The system will give ...

Fast Morphological Analysis of Czech

This paper presents a new Czech morphological analyser which takes advantage of Jan Daciuk’s algorithms for minimal deterministic acyclic finite state automata. The new analyser is six times faster than the current analyser ajka at analysis proper, i.e. returning possible lemmata and tags for a given word form, and for some other related tasks the difference is even bigger.

Publication date: 2009