Bits_Pilani@INLI-FIRE-2017: Indian Native Language Identification using Deep Learning

Authors

  • Rupal Bhargava
  • Jaspreet Singh
  • Shivangi Arora
  • Yashvardhan Sharma
Abstract

The task of Native Language Identification involves identifying a user's first-learnt (native) language from their writing style in a second language and/or from analysis of their speech and phonetics in that language. Such data is abundant on social media sites and in organised datasets from bodies such as the Educational Testing Service (ETS), and it can be exploited to develop language-learning systems and to support forensic linguistics. In this paper we propose a deep neural network for the INLI task at FIRE 2017 that uses a hierarchical paragraph encoder with an attention mechanism to identify relevant features among the tendencies and errors a user exhibits in the second language. The task involves six Indian languages as the set of native languages and English as the second language, with the text collected from users' social media accounts.
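
As a rough illustration of the idea behind a hierarchical encoder with attention, the sketch below pools word vectors into sentence vectors and sentence vectors into a paragraph vector using dot-product attention. This is not the authors' implementation: the abstract does not give architectural details, so the function names (`attention_pool`, `encode_paragraph`), the fixed context vectors, and the omission of any recurrent layers or learned projections are all assumptions made for illustration.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors: List[Vector], context: Vector) -> Vector:
    """Weighted average of `vectors`; weights come from dot products with `context`.

    In a trained model `context` would be a learned parameter (assumption).
    """
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(context)
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def encode_paragraph(
    paragraph: List[List[Vector]],
    word_context: Vector,
    sentence_context: Vector,
) -> Vector:
    """Two-level pooling: words -> sentence vectors -> one paragraph vector."""
    sentence_vectors = [attention_pool(words, word_context) for words in paragraph]
    return attention_pool(sentence_vectors, sentence_context)
```

In a full classifier, the resulting paragraph vector would typically be passed to a softmax layer over the six native-language classes; that step is left out here.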

Similar resources

Mangalore-University@INLI-FIRE-2017: Indian Native Language Identification using Support Vector Machines and Ensemble approach

This paper describes the systems submitted by our team for Indian Native Language Identification (INLI) task held in conjunction with FIRE 2017. Native Language Identification (NLI) is an important task that has different applications in different areas such as social-media analysis, authorship identification, second language acquisition and forensic investigation. We submitted two systems usin...

DalTeam@INLI-FIRE-2017: Native Language Identification using SVM with SGD Training

Native Language Identification (NLI), as a variant of Language Identification task, focuses on determining an author’s native language, based on a writing sample in their non-native language. In recent years, the challenging nature of NLI has drawn much attention from the research community. Its application and importance are relevant in many fields, such as personalization of a new language le...

Overview of the INLI PAN at FIRE-2017 Track on Indian Native Language Identification

This overview paper describes the first shared task on Indian Native Language Identification (INLI) that was organized at FIRE 2017. Given a corpus with comments in English from various Facebook newspapers pages, the objective of the task is to identify the native language among the following six Indian languages: Bengali, Hindi, Kannada, Malayalam, Tamil, and Telugu. Altogether, 26 approaches ...

SeerNet@INLI-FIRE-2017: Hierarchical Ensemble for Indian Native Language Identification

Native Language Identification has played an important role in forensics primarily for author profiling and identification. In this work, we discuss our approach to the shared task of Indian Language Identification. The task is primarily to identify the native language of the writer from the given XML file which contains a set of Facebook comments in the English language. We propose a hierarchi...

SSN_NLP@INLI-FIRE-2017: A Neural Network Approach to Indian Native Language Identification

Native Language Identification (NLI) is the process of identifying the native language of non-native speakers based on their speech or writing. It has several applications, namely authorship profiling and identification, forensic analysis, second language identification, and educational applications. English is one of the prominent languages used by most of the non-English people in the world. Th...

Publication date: 2017