corpora

Search results for: corpora

Number of results: 19685

Language Corpora: The Case for Ghanaian English

Journal: 3L: The Southeast Asian Journal of English Language Studies, 2014

Evaluation of corpora amylacea in refractory epilepsy

Journal: ACTUALIDAD MEDICA, 2020

Tracing thick and thin concepts through corpora

Journal: Language and Cognition, 2023

Philosophers and linguists currently lack the means to reliably identify evaluative concepts and measure their intensity. Using a corpus-based approach, we present a new method to distinguish evaluatively thick and thin adjectives like ‘courageous’ and ‘awful’ from descriptive adjectives like ‘narrow,’ and from value-associated adjectives like ‘sunny.’ Our study suggests that modifiers like ‘truly’ and ‘really’ frequently highlight the dimension of adject...

KOTONOHA and BCCWJ: Development of a Balanced Corpus of Contemporary Written Japanese

2007

Kikuo Maekawa

The National Institute for Japanese Language (NIJL) has launched a long-term language corpus development initiative aimed at developing a super-corpus called KOTONOHA, which consists of a multitude of independent corpora. Among the constituent corpora of KOTONOHA, the one most urgently needed is a large-scale balanced corpus of present-day written Japanese. Construct...

Assessing the Corpus Size vs. Similarity Trade-off for Word Embeddings in Clinical NLP

2016

Kirk Roberts

The proliferation of deep learning methods in natural language processing (NLP) and the large amounts of data they often require stand in stark contrast to the relatively data-poor clinical NLP domain. In particular, large text corpora are necessary to build high-quality word embeddings, yet large corpora that are suitably representative of the target clinical data are often unavailable. This ...

Corpus Effects on the Evaluation of Automated Transliteration Systems

2007

Sarvnaz Karimi Andrew Turpin Falk Scholer

Most current machine transliteration systems are trained on a corpus of known source-target word pairs and are typically evaluated on a similar corpus. In this paper we explore the performance of transliteration systems on corpora that are varied in a controlled way. In particular, we control the number and prior language knowledge of human transliterators used to constr...

Translation Lexicon Estimates from Non-Parallel Corpora Pairs

2007

Markos Mylonakis Khalil Sima’an

The estimation of translation lexicon probabilities from parallel corpora is well studied in statistical machine translation. When parallel corpora are not available, it is still possible to obtain unsupervised estimates from pairs of monolingual, non-parallel corpora. In both cases the standard estimator is the Expectation-Maximization (EM) algorithm, which aims at increasing the likelihood of the sou...

Automatic Parallel Corpora and Bilingual Terminology extraction from Parallel WebSites

2010

José João Almeida Alberto Simões

Nowadays, the notion, importance and significance of parallel corpora are so great that they need no special introduction. Unfortunately, publicly available parallel corpora are somewhat limited in range. There are large corpora on politics or legislation, on medicine and on other specific areas, but corpora for other areas are lacking. Currently there is a huge investment on using the ...

Getting to Know Your Corpus

2012

Adam Kilgarriff

Corpora are not easy to get a handle on. The usual way of getting to grips with text is to read it, but corpora are mostly too big to read (and not designed to be read). We show, with examples, how keyword lists (of one corpus vs. another) are a direct, practical and fascinating way to explore the characteristics of corpora, and of text types. Our method is to classify the top one hundred keywo...

Combining Seemingly Incompatible Corpora for Implicit Semantic Role Labeling

2015

Parvin Sadat Feizabadi Sebastian Padó

Implicit semantic role labeling, the task of retrieving locally unrealized arguments from wider discourse context, is a knowledge-intensive task. At the same time, the annotated corpora that exist are all small and scattered across different annotation frameworks, genres, and classes of predicates. Previous work has treated these corpora as incompatible with one another, and has concentrated on ...
