Dictionary Alignment for Context-sensitive Word Glossing

Authors

  • Willy Yap
  • Timothy Baldwin
Abstract

This paper proposes a method for automatically aligning dictionaries in different languages at the sense level (focusing on Japanese and English), based on structural data in the respective dictionaries. The basis of the proposed method is sentence similarity between sense definition sentences, using a bilingual Japanese-to-English dictionary as a pivot during the alignment process. We experiment with various embellishments to the basic method, including term weighting, stemming/lemmatisation, and ontology expansion.
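The core idea in the abstract (pivot through a bilingual dictionary, then compare definition sentences) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy `PIVOT` dictionary, the sense identifiers, the function names, and the choice of unweighted cosine similarity over bags of words are all assumptions made for illustration, and the Japanese definitions are assumed to be pre-tokenised. Term weighting, stemming/lemmatisation, and ontology expansion are left out.

```python
import math
from collections import Counter

# Hypothetical toy pivot dictionary (Japanese word -> English translations).
# A real Japanese-to-English dictionary would be far larger.
PIVOT = {
    "お金": ["money"],
    "預ける": ["deposit", "entrust"],
    "機関": ["institution", "organisation"],
    "川": ["river"],
    "岸": ["bank", "shore"],
}

def translate_bag(ja_tokens, pivot):
    """Map pre-tokenised Japanese definition words to a bag of English words."""
    bag = Counter()
    for tok in ja_tokens:
        for en in pivot.get(tok, []):
            bag[en] += 1
    return bag

def cosine(a, b):
    """Cosine similarity between two bags of words (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def align_sense(ja_tokens, en_senses, pivot):
    """Return the best-matching English sense id and all similarity scores."""
    ja_bag = translate_bag(ja_tokens, pivot)
    scores = {
        sense_id: cosine(ja_bag, Counter(definition.lower().split()))
        for sense_id, definition in en_senses.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical English sense inventory for "bank".
EN_SENSES = {
    "bank.1": "an institution where people deposit money",
    "bank.2": "sloping land beside a river",
}

best_financial, _ = align_sense(["お金", "預ける", "機関"], EN_SENSES, PIVOT)
best_river, _ = align_sense(["川", "岸"], EN_SENSES, PIVOT)
```

In this toy setup the first Japanese definition shares "institution", "deposit", and "money" with `bank.1`, while the second shares only "river" with `bank.2`, so each aligns to a different English sense. Weighting terms (for example by inverse document frequency) would be one way to reduce the influence of common words in longer definitions.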

Similar resources

Automatic interlinear glossing as two-level sequence classification

Interlinear glossing is a type of annotation of morphosyntactic categories and crosslinguistic lexical correspondences that allows linguists to analyse sentences in languages that they do not necessarily speak. Automatising this annotation is necessary in order to provide glossed corpora big enough to be used for quantitative studies. In this paper, we present experiments on the automatic gloss...

The effect of three vocabulary techniques on the Iranian ESP learners’ vocabulary production

The present study aimed to examine the effect of three vocabulary techniques (dictionary use, etymological analysis, and glossing) on the Iranian ESP learners' vocabulary production. Forty-five university students majoring in architecture at Azad University, Anzali branch, participated in this study. They were divided into three groups, and each group was randomly assigned to one kind of treat...

The Sharp Intelligent Dictionary

This paper describes the Sharp Intelligent Dictionary (SID), an English-Japanese glossing system for Japanese readers and learners of English. SID uses a variety of lightweight analysis techniques, a large bilingual dictionary and a prioritised model of collocations to present informed guesses about the best translations of words and expressions in their context.

Word-for-Word Glossing with Contextually Similar Words

Many corpus-based machine translation systems require parallel corpora. In this paper, we present a word-for-word glossing algorithm that requires only a source language corpus. To gloss a word, we first identify its similar words that occurred in the same context in a large corpus. We then determine the gloss by maximizing the similarity between the set of contextually similar words and the di...

Practical Glossing by Prioritised Tiling

We present the design of a practical context-sensitive glosser, incorporating current techniques for lightweight linguistic analysis based on large-scale lexical resources. We outline a general model for ranking the possible translations of the words and expressions that make up a text. This information can be used by a simple resource-bounded algorithm, of complexity O(n log n) in sentence len...

Publication date: 2007