Collocation Mining: Exploiting Corpora for Collocation Identification and Representation

Author

  • Brigitte Krenn
Abstract

The work presented provides computational linguistics methods and tools for identifying collocations in arbitrary text, and methods and tools for representing collocations in a relational database that integrates competence information (collocation-type-specific linguistic analysis) and performance information (corpus sentences). The work differs from existing approaches to collocation identification in systematically utilizing collocation-type-specific linguistic information. With respect to collocation representation, the work is the first to systematically and on a large scale combine competence-based descriptions of collocations with actual occurrences in text.
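To make the idea of pairing competence and performance information in a relational database concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names, the collocation type label, and the English example sentences are all hypothetical illustrations; the abstract does not describe the actual schema or data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Competence side: one row per collocation with its linguistic type.
        CREATE TABLE collocation (
            id          INTEGER PRIMARY KEY,
            collocate   TEXT NOT NULL,
            base        TEXT NOT NULL,
            colloc_type TEXT NOT NULL,
            UNIQUE (collocate, base)
        );
        -- Performance side: corpus sentences in which a collocation occurs.
        CREATE TABLE occurrence (
            id             INTEGER PRIMARY KEY,
            collocation_id INTEGER NOT NULL REFERENCES collocation(id),
            sentence       TEXT NOT NULL
        );
        """
    )
    # Invented example entry, for illustration only.
    cur = conn.execute(
        "INSERT INTO collocation (collocate, base, colloc_type) VALUES (?, ?, ?)",
        ("take", "decision", "support-verb construction"),
    )
    collocation_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO occurrence (collocation_id, sentence) VALUES (?, ?)",
        [
            (collocation_id, "The board will take a decision tomorrow."),
            (collocation_id, "No decision was taken at the meeting."),
        ],
    )
    conn.commit()
    return conn


def occurrences_by_type(conn, colloc_type):
    """Return (collocate, base, sentence) rows for one collocation type."""
    rows = conn.execute(
        """
        SELECT c.collocate, c.base, o.sentence
        FROM collocation AS c
        JOIN occurrence AS o ON o.collocation_id = c.id
        WHERE c.colloc_type = ?
        ORDER BY o.id
        """,
        (colloc_type,),
    )
    return rows.fetchall()
```

Keeping the linguistic description and the corpus sentences in separate tables joined by a key is one plausible way to let a single collocation entry point to many attested occurrences; other layouts would serve equally well.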


Related resources

Learning to Order Terms: Supervised Interestingness Measures in Terminology Extraction

Term Extraction, a key data preparation step in Text Mining, extracts the terms, i.e. relevant collocations of words, attached to specific concepts (e.g. genetic-algorithms and decision-trees are terms associated with the concept “Machine Learning”). In this paper, the task of extracting interesting collocations is achieved through a supervised learning algorithm, exploiting a few collocations man...


Domain Collocation Identification

In this paper we present a new method of automatic collocation identification. Collocation is an important relation between words, which is widely used, among others, in information retrieval tasks. Over the last years, many methods of automatic collocation acquisition from text corpora have been proposed. The approach described in this paper differs from the others by focusing on domain colloc...


Statistical Identification of Collocations in Large Corpora for Information Retrieval

The linguistic phenomenon of collocation, the habitual juxtaposition of some words in natural language, has been shown to benefit natural language processing tasks such as information retrieval. This paper examines the utility of several methods for collocation extraction for document retrieval, specifically for queries in question form.


Collocation Translation Acquisition Using Monolingual Corpora

Collocation translation is important for machine translation and many other NLP tasks. Unlike previous methods using bilingual parallel corpora, this paper presents a new method for acquiring collocation translations by making use of monolingual corpora and linguistic knowledge. First, dependency triples are extracted from Chinese and English corpora with dependency parsers. Then, a dependency ...


Spatial association analysis: A literature review

The immense explosion of geographically referenced data calls for efficient discovery of spatial knowledge. Spatial association analysis is a typical data mining approach for discovering spatial knowledge. Association rules are patterns of the form X→Y, where pattern Y is likely to occur when pattern X occurs. One of the most famous patterns, Diapers → Beer, is a typical association rule example. Spa...
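As a small illustration of what "Y is likely to occur when X occurs" can mean for a rule X→Y, the sketch below computes support and confidence, two standard measures from association rule mining. The abstract does not name these measures, and the function names and toy transactions are assumptions made for this example; the spatial variant discussed in the review is not covered here.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, x, y):
    """Estimate of P(Y | X): support of X and Y together over support of X."""
    support_x = support(transactions, x)
    if support_x == 0:
        return 0.0
    return support(transactions, set(x) | set(y)) / support_x


# Invented toy data, for illustration only.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"milk", "bread"},
]

rule_confidence = confidence(baskets, {"diapers"}, {"beer"})
```

In this toy data, diapers appear in three of four baskets and diapers with beer in two of four, so the rule Diapers → Beer has a confidence of two thirds.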



Publication date: 2000