Parallel texts: Using translational equivalents in linguistic typology
Abstract
Parallel texts are texts in different languages that can be considered translational equivalents. We introduce the notion of ‘massively parallel text’ for texts that have been translated into very many languages. In this introduction we discuss some massively parallel texts that might be used to investigate linguistic diversity. We then give a short summary of the articles in this issue and close with an outlook on where the investigation of parallel texts might lead us.
Similar resources
Syntactic Sentence Simplification for French
This paper presents a method for the syntactic simplification of French texts. Syntactic simplification aims at making texts easier to understand by simplifying complex syntactic structures that hinder reading. Our approach is based on the study of two parallel corpora (encyclopaedia articles and tales). It aims to identify the linguistic phenomena involved in the manual simplification of Frenc...
Multi-grained alignment of parallel texts with endogenous resources
This paper deals with the spotting of multigrained translation equivalents from parallel corpora. The idea is to contribute to the processing of languages for which few linguistic resources are available. We especially pay attention to the handling of highly inflectional languages. Our approach is endogenous: it does not require external linguistic resources such as stemmers or taggers.
Extraction of Translation Equivalents from Parallel Corpora Using Sense-sensitive Contexts
The paper proposes an unsupervised method to extract translation equivalents from parallel corpora. The strategy we use takes into account the context of words. Given a word of the source language and a particular context, we learn its word translation within an equivalent context. We first extract pairs of similar contexts and, then, we compare the similarity between words appearing in each pa...
Uncovering the differences in linguistic network dynamics of book and social media texts
Complex network studies span a large variety of applications including linguistic networks. To investigate the differences in book and social media texts in terms of linguistic typology, we constructed both sequential and sentence collocation networks of book, Facebook and Twitter texts with undirected and weighted edges. The comparisons are performed using the basic parameters like average deg...
Second Language Acquisition from Aligned Corpora
The paper describes a system for automatically aligning large bilingual corpora and searching them for translation equivalents. This implementation was developed to facilitate our tasks in the GLOSSER #343 Copernicus'94 Joint Research Project, in which the Linguistic Modeling Laboratory was charged in particular with preparing bilingual material. The Gale-Church algorithm is chosen as the alignment procedure for ...