Search results for: corpus linguistics
Number of results: 98006
Adopting corpus-based empirical approaches to linguistics, this paper has two main goals: the first is to propose a formal methodology for extracting meaningful quantitative characterizations from Chinese corpora; the second is to achieve generalizations about theoretically significant linguistic qualities based on these quantitative data. The quantitative scales discussed include mutual information,...
"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...
Linguistics, English linguistics in particular, has witnessed a remarkable quantitative turn since the 1990s and early 2000s. It was a turn both in scale and in quality, concerning the degree (including the degree of sophistication) to which empirical studies, statistical techniques, and modelling have come to be used to determine linguistic research. Which role corpus probabilistic linguistics, including usage-based approache...
The acquisition of linguistic knowledge, i.e., the identification, extraction, and encoding of linguistic information in a corpus, has been one of the main motivations for data-driven approaches to natural language. Methods have been developed for the acquisition of, for instance, parts of speech, noun compounds, collocations, support verbs, subcategorization frames, phrase structure rules, selec...
Labelling data is one of the most fundamental activities in science, and has underpinned practice, particularly in medicine, for decades, as well as research in corpus linguistics since at least d...
An unsupervised learning method, based on corpus linguistics and special language terminology, is described that can extract time-varying information from text streams. The method is shown to be ‘language-independent’ in that its use leads to sets of regular expressions that can be used to extract the information in typologically distinct languages like English and Arabic. The method uses the i...
Improving methods of automatic deception detection is an important goal of many researchers from a variety of disciplines, including psychology, computational linguistics, and criminology. We present a system to automatically identify deceptive utterances using acoustic-prosodic, lexical, syntactic, and phonotactic features. We train and test our system on the Interspeech 2016 ComParE challenge...
In this paper the Spoken Dutch Corpus Project is presented, a joint Flemish-Dutch undertaking aimed at the compilation and annotation of a 10-million-word corpus of spoken Dutch. Upon completion, the corpus will constitute a valuable resource for research in the fields of computational linguistics and language and speech technology. The paper first gives an overview of the project. It then goes ...