Search results for: lexical entries

Number of results: 45357

2004
Jim Breen

The JMdict project has as its aim the compilation of a multilingual lexical database with Japanese as the pivot language. Using an XML structure designed to cater for a mix of languages and a rich set of lexicographic information, it has reached a size of approximately 100,000 entries, with most entries having translations in English, French and German. The compilation involves information re-u...

1994
Toru Hisamitsu Katsumi Marukawa Yoshihiro Shima Hiromichi Fujisawa Yoshihiko Nitta

OCR error correction using Japanese morphological analysis contains two time-consuming procedures: extraction of candidate words from combinations of candidate characters, and finding the most plausible word sequence in combinations of the candidate words. In this paper an optimal word extraction technique, and the use of lexical entries that are tailored for Japanese verb inflection, are inves...

2013
Roger Evans

In this paper we introduce an approach to lexical description which is sufficiently powerful to support language processing tasks such as part-of-speech tagging or sentence recognition, traditionally considered the province of external algorithmic components. We show how this approach can be implemented in the lexical description language, DATR, and provide examples of modelling extended lexica...

2010
Amitava Das Sivaji Bandyopadhyay

Advances in NLP techniques have led to a great demand for tagging and analysis of the sentiments from unstructured natural language data over the last few years. A typical approach to sentiment analysis is to start with a lexicon of positive and negative words and phrases. In these lexicons, entries are tagged with their prior, out-of-context polarity. Unfortunately all efforts found in literatu...

2008
Violaine Prince Jacques Chauché

This paper describes a solution to lexical transfer as a trade-off between a dictionary and an ontology. It shows its association with a translation tool based on morpho-syntactical parsing of the source language. It is based on the English Roget Thesaurus and its equivalent, the French Larousse Thesaurus, in a computational framework. Both thesauri are transformed into vector spaces, and all mo...

2006
Julia Hockenmaier

We present an algorithm which creates a German CCGbank by translating the syntax graphs in the German Tiger corpus into CCG derivation trees. The resulting corpus contains 46,628 derivations, covering 95% of all complete sentences in Tiger. Lexicons extracted from this corpus contain correct lexical entries for 94% of all known tokens in unseen text.

2008
Marc Kemps-Snijders Claus Zinn Jacquelijn Ringersma Menzo Windhouwer

In this paper, we describe a unifying approach to tackle data heterogeneity issues for lexica and related resources. We present LEXUS, our software that implements the Lexical Markup Framework (LMF) to uniformly describe and manage lexica of different structures. LEXUS also makes use of a central Data Category Registry (DCR) to address terminological issues with regard to linguistic concepts as...

1996
Eneko Agirre German Rigau

This paper presents a method for the resolution of lexical ambiguity of nouns and its automatic evaluation over the Brown Corpus. The method relies on the use of the wide-coverage noun taxonomy of WordNet and the notion of conceptual distance among concepts, captured by a Conceptual Density formula developed for this purpose. This fully automatic method requires no hand coding of lexical entries...

2006
Stefan Schulz Kornél G. Markó Philipp Daumke Udo Hahn Susanne Hanser Percy Nohama Roosewelt L. Andrade Edson José Pacheco Martin Romacker

We present the lexico-semantic foundations underlying a multilingual lexicon the entries of which are constituted by so-called subwords. These subwords reflect semantic atomicity constraints in the medical domain which diverge from canonical lexicological understanding in NLP. We focus here on criteria to identify and delimit reasonable subword units, to group them into functionally adequate sy...

2007
Yannick Marchand Connie R. Adsett Robert I. Damper

• The three lexical databases
• 18,016 words were found in both the Webster’s Pocket Dictionary and the Wordsmyth English Dictionary-Thesaurus.
• These two independent dictionaries, each consisting of 18,016 syllabified entries, are referred to as S&R and Wordsmyth, respectively.
• A third database, Intersection, was derived consisting of the 13,594 words in the two above independent dictionaries wi...
