Yawat: Yet Another Word Alignment Tool
Abstract
Yawat is a tool for the visualization and manipulation of word- and phrase-level alignments of parallel text. Unlike most other tools for manual word alignment, it relies on dynamic markup to visualize alignment relations, that is, markup is shown and hidden depending on the current mouse position. This reduces the visual complexity of the visualization and allows the annotator to focus on one item at a time. For a bird’s-eye view of alignment patterns within a sentence, the tool is also able to display alignments as alignment matrices. In addition, it allows for manual labeling of alignment relations with customizable tag sets. Different text colors are used to indicate which words in a given sentence pair have already been aligned, and which ones still need to be aligned. Tag sets and color schemes can easily be adapted to the needs of specific annotation projects through configuration files. The tool is implemented in JavaScript and designed to run as a web application.
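To make the alignment-matrix view and the aligned/unaligned distinction concrete, here is a minimal sketch in Python. It is not Yawat's code or data format (the tool itself is written in JavaScript). The function names `alignment_matrix` and `unaligned`, the representation of links as (source index, target index) pairs, and the example sentence pair are all assumptions made for illustration.

```python
# Illustrative sketch only: the link representation and function names are
# assumptions, not Yawat's actual data model.

def alignment_matrix(src, tgt, links):
    """Render a text alignment matrix.

    One row per source word, one column per target word;
    'x' marks an aligned pair and '.' marks no link.
    """
    linked = set(links)
    rows = []
    for i, word in enumerate(src):
        cells = "".join("x" if (i, j) in linked else "." for j in range(len(tgt)))
        rows.append(f"{cells} {word}")
    return rows


def unaligned(src, tgt, links):
    """Return the source and target words that have no link yet."""
    src_done = {i for i, _ in links}
    tgt_done = {j for _, j in links}
    return (
        [w for i, w in enumerate(src) if i not in src_done],
        [w for j, w in enumerate(tgt) if j not in tgt_done],
    )


# Hypothetical sentence pair with one word on each side still to be aligned.
src = ["das", "Haus", "ist", "klein"]
tgt = ["the", "house", "is", "small"]
links = [(0, 0), (1, 1), (2, 2)]

for row in alignment_matrix(src, tgt, links):
    print(row)
print(unaligned(src, tgt, links))
```

In an interactive tool, the words returned by `unaligned` are the ones that would be shown in the "still needs alignment" color, while the matrix gives the bird's-eye view of the links made so far.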
Similar resources
Two Tools for Creating and Visualizing Sub-sentential Alignments of Parallel Text
We present two web-based, interactive tools for creating and visualizing sub-sentential alignments of parallel text. Yawat is a tool to support distributed, manual word- and phrase-alignment of parallel text through an intuitive, web-based interface. Kwipc is an interface for displaying words or bilingual word pairs in parallel, word-aligned context. A key element of the tools presented here is t...
Yet Another Symmetrical and Real-time Word Alignment Method: Hierarchical Sub-sentential Alignment using F-measure
Symmetrization of word alignments is a fundamental issue in statistical machine translation (SMT). In this paper, we describe a novel reformulation of the Hierarchical Sub-sentential Alignment (HSSA) method using F-measure. Starting with a soft alignment matrix, we use the F-measure to recursively split the matrix into two soft alignment submatrices. A direction is chosen at the same time on the ...
A Tool for a High-Carat Gold-Standard Word Alignment
In this paper, we describe a tool designed to produce a gold-standard word alignment between a text and its translation with a novel visualization. In addition, the tool is designed to aid the aligners in producing an alignment at a high level of quality and consistency. This tool is presently being used to align the Hebrew Bible with an English translation of it.
An Integrated Tool for Translation-Memory Maintenance
This paper presents an integrated tool for constructing and maintaining translation memory for memory-based machine translation. The tool aims to automate the construction and validation of translation memory at both the word and phrase levels from English-Thai parallel texts. To align English-Thai words and phrases, the crucial problems that must be resolved include multiple-word-expression boundary am...
Using Transliteration of Proper Names from Arabic to Latin Script to Improve English-Arabic Word Alignment
Bilingual lexicons of proper names play a vital role in machine translation and cross-language information retrieval. Word alignment approaches are generally used to construct bilingual lexicons automatically from parallel corpora. Aligning proper names is particularly difficult when the source and target languages of the parallel corpus do not share the same script. We present in ...