Parallel TreeBanks: Observations for Implication of Equivalent Alignments

Author

  • Oleg Kapanadze
Abstract

Building a parallel treebank presupposes the alignment of linguistic information represented by diverse structures on different layers of a bilingual text. In this paper, we describe our observations on inferring translation equivalents in parallel texts of two structurally diverse languages, German and Georgian. They belong to different language families and consequently have different typological features, manifested in diverse morphological structures and in word and phrase order within a clause. In developing the bilingual German-Georgian treebank, we attempted to cluster the tolerant syntactic structures and to classify conventional phrase translations that could be considered equivalent units for bilingual text alignment.

Similar resources

XML-based Phrase Alignment in Parallel Treebanks

This paper describes the usage of XML for representing cross-language phrase alignments in parallel treebanks. We have developed a TreeAligner as a tool for interactively inserting and correcting such alignments as an independent level of treebank annotation.

A Search Tool for Parallel Treebanks

This paper describes a tool for aligning and searching parallel treebanks. Such treebanks are a new type of parallel corpora that come with syntactic annotation on both languages plus sub-sentential alignment. Our tool allows the visualization of tree pairs and the comfortable annotation of word and phrase alignments. It also allows monolingual and bilingual searches including the specification...

Using the Stockholm TreeAligner

In this paper we present several use cases for the Stockholm TreeAligner, a software tool originally designed for annotating the alignments in a parallel treebank. The tool has been extended and improved to the point that it can now also serve as a general tool for browsing and searching monolingual and parallel treebanks. Among the use cases presented are: building a parallel treebank, browsin...

Poly-GrETEL: Cross-Lingual Example-based Querying of Syntactic Constructions

We present Poly-GrETEL, an online tool which enables syntactic querying in parallel treebanks and which is based on the monolingual GrETEL environment. We provide online access to the Europarl parallel treebank for Dutch and English, allowing users to query the treebank using either an XPath expression or an example sentence in order to look for similar constructions. We provide automatic align...

Exploiting Parallel Treebanks to Improve Phrase-Based Statistical Machine Translation

Given much recent discussion and the shift in focus of the field, it is becoming apparent that the incorporation of syntax is the way forward for the current state-of-the-art in machine translation (MT). Parallel treebanks are a relatively recent innovation and appear to be ideal candidates for MT training material. However, until recently there has been no other means to build them than by han...

Journal:
  • Int. J. Comput. Linguistics Appl.

Volume 7

Publication date: 2016