Search results for: corpora
Number of results: 19685
Philosophers and linguists currently lack the means to reliably identify evaluative concepts and measure their intensity. Using a corpus-based approach, we present a new method to distinguish evaluatively thick and thin adjectives like ‘courageous’ and ‘awful’ from descriptive adjectives like ‘narrow,’ and from value-associated adjectives like ‘sunny.’ Our study suggests that modifiers like ‘truly’ and ‘really’ frequently highlight the evaluative dimension of adject...
The National Institute for Japanese Language (NIJL) has launched a long-term language corpus development initiative aimed at building a super-corpus called KOTONOHA, which consists of a multitude of independent corpora. Among the constituent corpora of KOTONOHA, the one most urgently needed is a large-scale balanced corpus of present-day written Japanese. Construct...
The proliferation of deep learning methods in natural language processing (NLP) and the large amounts of data they often require stand in stark contrast to the relatively data-poor clinical NLP domain. In particular, large text corpora are necessary to build high-quality word embeddings, yet large corpora that are suitably representative of the target clinical data are often unavailable. This ...
Most current machine transliteration systems employ a corpus of known source-target word pairs to train their system, and typically evaluate their systems on a similar corpus. In this paper we explore the performance of transliteration systems on corpora that are varied in a controlled way. In particular, we control the number and prior language knowledge of the human transliterators used to constr...
The estimation of translation lexicon probabilities from parallel corpora is well studied in statistical machine translation. Whenever parallel corpora are not available, it is still possible to obtain unsupervised estimates from pairs of monolingual, non-parallel corpora. In both cases the standard estimator is the Expectation-Maximization (EM) algorithm, which aims at increasing the likelihood of the sou...
Nowadays, the notion, importance and significance of parallel corpora are so great that they need no special introduction. Unfortunately, publicly available parallel corpora are somewhat limited in range. There are large corpora on politics or legislation, on medicine and other specific areas, but corpora for other areas are lacking. Currently there is a huge investment in using the ...
Corpora are not easy to get a handle on. The usual way of getting to grips with text is to read it, but corpora are mostly too big to read (and not designed to be read). We show, with examples, how keyword lists (of one corpus vs. another) are a direct, practical and fascinating way to explore the characteristics of corpora, and of text types. Our method is to classify the top one hundred keywo...
Implicit semantic role labeling, the task of retrieving locally unrealized arguments from wider discourse context, is a knowledge-intensive task. At the same time, the annotated corpora that exist are all small and scattered across different annotation frameworks, genres, and classes of predicates. Previous work has treated these corpora as incompatible with one another, and has concentrated on ...