Search results for: lexicography
Number of results: 2475
In this paper, the authors address the significance and complexity of tokenization, the beginning step of NLP. Notions of word and token are discussed and defined from the viewpoints of lexicography and pragmatic implementation, respectively. Automatic segmentation of Chinese words is presented as an illustration of tokenization. Practical approaches to identification of compound tokens in Engl...
The "user-perspective" has emerged as an all-important criterion in the selection and lexicographical treatment of lexical items in modern dictionary compilation. Starting with one of the early American advocates (Barnhart [1962]), the concept made its way into reference works (Hartmann and James [1998]) and is a recurrent theme of both the British (practical) and German (theoretical) schools o...
We present some variations affecting the association measure and thresholding on a technique for learning Selectional Restrictions from on-line corpora. It uses a wide-coverage noun taxonomy and a statistical measure to generalize the appropriate semantic classes. Evaluation measures for the Selectional Restrictions learning task are discussed. Finally, an experimental evaluation of these varia...
TshwaneLex is the world's only lexicography software suite with which the entire lexicographic process, from initial compilation all the way to final product, may be conducted in the language of one's choice. This is possible thanks to various aspects of internationalisation, localisation and customisation that are built into TshwaneLex. These are discussed by means of examples drawn from a wid...
This paper describes a tool that combines features found in empirical sign language lexicography and in sign language discourse transcription. It supports the user in lexicon building while working on the transcription of a corpus. While it tries to reach a certain level of compatibility with upcoming multimedia annotation tools, it offers a number of unique features considered essential due to...
We describe PAX, the "Portable Audio Concordance System", a proof-of-concept prototype of a multipurpose, multilingual audio concordance toolkit. The primary goal is to support efficient grammar and lexicon construction in the documentation of unwritten languages; languages currently included are Ega, Anyi, and Koulango (Ivory Coast), with additional samples in German and English. The approach ...
This paper introduces PersPred, the first manually elaborated syntactic and semantic database for Persian Complex Predicates (CPs). Besides their theoretical interest, Persian CPs constitute an important challenge in Persian lexicography and for NLP. The first delivery, PersPred 1, contains 700 CPs, for which 22 fields of lexical, syntactic and semantic information are encoded. The semantic cla...