Options for Automatic Creation of Dictionary Definitions from Corpora

Authors

  • Marie Stará
  • Vojtěch Kovář
Abstract

This paper maps the possibilities of using existing corpus tools to acquire definitions for Czech automatically. It compares definitions from the Dictionary of Contemporary Czech (Slovník současné češtiny pro školu a veřejnost) with data acquired using the Thesaurus and Word Sketch tools on the czTenTen12 corpus.

Similar resources

Dictionary of Abstract and Concrete Words of the Russian Language: A Methodology for Creation and Application

The paper describes the first stage of a project on creating an electronic dictionary with numerical estimates of the degree of abstractness and concreteness of Russian words. Our approach is to integrate data obtained from several different sources: text corpora, psycholinguistic experiments, published dictionaries, markers of abstractness (certain suffixes) and a translation of a similar dict...

Empirical Acquisition Of Differentiating Relations From Definitions

This paper describes a new automatic approach for extracting conceptual distinctions from dictionary definitions. A broad-coverage dependency parser is first used to extract the lexical relations from the definitions. Then the relations are disambiguated using associations learned from tagged corpora. This contrasts with earlier approaches using manually developed rules for disambiguation.

Linguistic Technologies Applied Lexicography and Scientific Text Corpora

Applied lexicography is now a special domain of applied linguistics and language engineering, concerned with problem-oriented automated and automatic dictionaries and databases. The modern approach to dictionary creation assumes preliminary work with parallel or comparable text corpora, which serve as a reference database for solving both research and practical lexicographic problems. Pa...

Automatic Discovery of Fuzzy Synsets from Dictionary Definitions

In order to deal with ambiguity in natural language, it is common to organise words, according to their senses, in synsets, which are groups of synonymous words that can be seen as concepts. The manual creation of a broad-coverage synset base is a time-consuming task, so we take advantage of dictionary definitions for extracting synonymy pairs and clustering for identifying synsets. Since word s...

Knowledge-Rich Contexts Discovery

Within large corpora of texts, Knowledge-Rich Contexts (KRCs) are a subset of sentences containing information that would be valuable to a human for the construction of a knowledge base. The entry point to the discovery of KRCs is the automatic identification of Knowledge Patterns (KPs), which are indicative of semantic relations. A machine-readable dictionary serves as our starting point for inve...

Publication date: 2016