Bits_Pilani@INLI-FIRE-2017: Indian Native Language Identification using Deep Learning
Abstract
The task of Native Language Identification involves identifying the prior or first-learnt language of a user from their writing style, or from analysis of their speech and phonetics, in a second language. Such data is abundant on social media sites and in organised datasets from bodies such as the Educational Testing Service (ETS), and it can be exploited to develop language learning systems and for forensic linguistics. In this paper, we propose a deep neural network for the INLI task at FIRE 2017 that uses a hierarchical paragraph encoder with an attention mechanism to identify relevant features among the tendencies and errors a user exhibits in the second language. The task involves six Indian languages as the set of prior/native languages and English as the second language, with text collected from users' social media accounts.
Similar resources
Mangalore-University@INLI-FIRE-2017: Indian Native Language Identification using Support Vector Machines and Ensemble approach
This paper describes the systems submitted by our team for Indian Native Language Identification (INLI) task held in conjunction with FIRE 2017. Native Language Identification (NLI) is an important task that has different applications in different areas such as social-media analysis, authorship identification, second language acquisition and forensic investigation. We submitted two systems usin...
DalTeam@INLI-FIRE-2017: Native Language Identification using SVM with SGD Training
Native Language Identification (NLI), as a variant of Language Identification task, focuses on determining an author’s native language, based on a writing sample in their non-native language. In recent years, the challenging nature of NLI has drawn much attention from the research community. Its application and importance are relevant in many fields, such as personalization of a new language le...
Overview of the INLI PAN at FIRE-2017 Track on Indian Native Language Identification
This overview paper describes the first shared task on Indian Native Language Identification (INLI) that was organized at FIRE 2017. Given a corpus of comments in English from various Facebook newspaper pages, the objective of the task is to identify the native language among the following six Indian languages: Bengali, Hindi, Kannada, Malayalam, Tamil, and Telugu. Altogether, 26 approaches ...
SeerNet@INLI-FIRE-2017: Hierarchical Ensemble for Indian Native Language Identification
Native Language Identification has played an important role in forensics primarily for author profiling and identification. In this work, we discuss our approach to the shared task of Indian Language Identification. The task is primarily to identify the native language of the writer from the given XML file which contains a set of Facebook comments in the English language. We propose a hierarchi...
SSN_NLP@INLI-FIRE-2017: A Neural Network Approach to Indian Native Language Identification
Native Language Identification (NLI) is the process of identifying the native language of non-native speakers based on their speech or writing. It has several applications, namely authorship profiling and identification, forensic analysis, second language identification, and educational applications. English is one of the prominent languages used by most of the non-English people in the world. Th...