Similar resources
Reconsidering Language Identification for Written Language Resources
The task of identifying the language in which a given document (ranging from a sentence to thousands of pages) is written has been relatively well studied over several decades. Automated approaches to written language identification are used widely throughout research and industrial contexts, over both oral and written source materials. Despite this widespread acceptance, a review of previous r...
Language-dependent Fusion for Language Identification
A novel fusion approach for Language Identification called Language-dependent Fusion (LDF) is presented in this paper. A fusion system is a hybrid system which fuses the results from several individual sub-systems which utilize varied features, models, and/or classifiers. In LDF, instead of applying single fixed weighting coefficients to each sub-system, which happens in conventional approach su...
Unknown language rejection in language identification system
The number of languages in the world is much larger than the number of target languages that current language identification systems can handle. Therefore, we propose here the use of a multilayer perceptron neural network as a means to prevent those unknown language inputs from being misidentified as one of the target languages. We consider not only the target language identification rate but also ...
Language identification with language-independent acoustic models
In this paper we explore the use of language-independent acoustic models for language identification (LID). The phone sequence output by a single language-independent phone recognizer is rescored with language-dependent phonotactic models approximated by phone bigrams. The language-independent phoneme inventory was obtained by Agglomerative Hierarchical Clustering, using a measure of similarity b...
Multilingual native language identification
We present the first study of Native Language Identification (NLI) applied to text written in languages other than English, using data from six languages. NLI is the task of predicting an author’s first language (L1) using only their writings in a second language (L2), with applications in Second Language Acquisition and forensic linguistics. Most research to date has focused on English but the...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2015
ISSN: 2329-9290,2329-9304
DOI: 10.1109/taslp.2015.2419978