Raslan 2009
Abstract
The article describes the construction of a spell checker for the Esperanto language and its implementation as a dictionary (i.e. an affix file and a word list) for the Hunspell spell-checking engine. Unlike existing solutions, the chosen approach accounts for morphologically complex words, which are common in Esperanto due to its agglutinative nature, by applying a set of rules that describe allowed morpheme compounds, along with a semantic classification of all word roots involved. The result has been tested with a user community and is currently being incorporated into the OpenOffice.org office suite.
Similar resources
Problems of Machine Translation Evaluation
In this article we deal with general aspects of machine translation evaluation. We describe several commonly used evaluation methods and discuss their problems and shortcomings. We then outline several ideas that address these problems and underlie the design of a new method of machine translation evaluation.
Discovering Grammatical Relations in Czech Sentences
The syntactic parser synt, developed at the NLP Centre, Faculty of Informatics, Masaryk University, can provide as one of its possible outputs a list of dependency relations discovered in the analysed sentence. In the paper, we present the result of codification and translation of the (rather technically labeled) dependency relations from synt to linguistically significant relations. The resulting r...
The Saara Framework: An Anaphora Resolution System for Czech
Determining reference and referential links in discourse is one of the biggest and most important challenges in natural language understanding. In particular, computing coreference classes over the set of referring expressions in text is crucial for its further syntactic and semantic processing. We present a system for automatic anaphora resolution that can be used on arbitrary texts in Czech. ...
Semantic Network Integrity Maintenance via Heuristic Semi-Automatic Tests
In this article we discuss issues connected with maintaining the content integrity of a general-purpose semantic network that is under development. Constructing a semantic network from scratch is a long process that usually requires both linguistic work done by hand and semi-automatic methods to add or translate data, which must subsequently be reviewed. In this process many systemic and/or languag...
Applying Word Sketches to Russian
The paper describes work on writing a Russian Sketch grammar for the system Sketch Engine. The objective of such a system is to provide lexicographers with sufficient lexical material and tools for getting information about a word’s collocability and to generate lists of the most frequent phrases for a given word, and then to classify them for appropriate syntactic models. The system will give ...
Fast Morphological Analysis of Czech
This paper presents a new Czech morphological analyser which takes advantage of Jan Daciuk’s algorithms for minimal deterministic acyclic finite state automata. The new analyser is six times faster than the current analyser ajka at the analysis proper, i.e. returning possible lemmata and tags for a given word form, and for some other related tasks the difference is even greater.