Data in Your Language: The ECI
Authors
Abstract
In this paper we describe the contents and the method of production of the ACL European Corpus Initiative Multilingual Corpus 1 (ECI/MC1). This is a large multilingual electronic text corpus, containing 97 million words in 27 (mainly European) languages. It is available cheaply on CDROM. Most of the texts in the corpus are marked up using a fully-validated SGML document type description based on the Text Encoding Initiative (TEI) guidelines for corpus annotation. It is hoped that this corpus will provide a useful resource for corpus-based computational linguistics.
Similar resources
Citizenship Classes for Bhutanese-Nepali Elders: From Cognitive Deficits to Cultural-Historical Understandings
This article focuses on home-based citizenship classes for Bhutanese-Nepali elders in Central Ohio in the United States. As part of a larger longitudinal study centered in the ethnographic, language socialization, and discourse analytic traditions, the article focuses on data, particularly regular audiovideo recordings, gathered over a five-month period and tracks one student’s progress towards...
TEI-conformant Structural Markup of a Trilingual Parallel Corpus in the ECI Multilingual Corpus 1
In this paper we provide an overview of the ACL European Corpus Initiative (ECI) Multilingual Corpus 1 (ECI/MC1). In particular, we look at one particular subcorpus in the ECI/MC1, the trilingual corpus of International Labour Organisation reports, and discuss the problems involved in TEI-compliant structural markup and preliminary alignment of this large corpus. We discuss gross structural ali...
Engineer-computer interaction for structural monitoring
An increased availability of information technology (IT) is currently influencing decisions to monitor structures more frequently. IT is invariably used to interpret structural behaviour from monitoring data. However, engineers remain frustrated with IT results. Engineers work with incomplete knowledge, problem specific characteristics, and context dependency. Although such conditions require i...
Elements of Feminine Writing in Sepedeh Shamlu’s Novel Sorkhi-e-to az man (“Your Redness Is Mine”)
Feminist criticism concerns the function of specific feminine cultural and ideological constituents in literary works. In its approach to narrative, this kind of criticism follows two methods: first, a study of the attributes of women and their role and personality in the course of the story, and second, a critique focused on female authors. Sorkhi-e-to az man by Sepedeh Shamlu is one the...
Measuring Union and Nonunion Wage Growth: Puzzles in Search of Solutions
This paper presents conflicting evidence on trends in private sector union and nonunion wages. The BLS quarterly Employment Cost Index (ECI), constructed from establishment surveys, uses fixed weights applied to wage changes among matched job quotes. The ECI shows a substantial decrease in wage growth for union relative to nonunion workers. The annual Employer Costs for Employee Compensation (E...