SSN_NLP@INLI-FIRE-2017: A Neural Network Approach to Indian Native Language Identification

Authors

  • D. Thenmozhi
  • Kawshik Kannan
  • Chandrabose Aravindan
Abstract

Native Language Identification (NLI) is the process of identifying the native language of non-native speakers based on their speech or writing. It has several applications, namely authorship profiling and identification, forensic analysis, second language identification, and educational applications. English is one of the prominent languages used by most non-native speakers in the world. The native language of a non-native English speaker may be easily identified from their English accent. However, identifying the native language from a user's posts and comments written in English is a challenging task. In this paper, we present a neural network approach to identifying the native language of an Indian speaker from the English comments posted in microblogs. Lexical features are extracted from the text posted by the user and are used to build a neural network classifier that identifies the user's native language. We evaluated our approach on the data set provided by the INLI@FIRE2017 shared task.
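The abstract does not specify the lexical features or the network architecture, so the following is only a minimal sketch of the general idea, under stated assumptions: bag-of-words counts stand in for the lexical features, and a single-layer softmax network trained by stochastic gradient descent stands in for the classifier. The function names, hyperparameters, and the four toy comments are all hypothetical; the comments are invented placeholders, make no linguistic claim, and are not taken from the INLI@FIRE2017 data.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens serve as the "lexical features".
    return text.lower().split()


def build_vocab(texts):
    # Map every token seen in training to a column index.
    tokens = sorted({tok for t in texts for tok in tokenize(t)})
    return {tok: i for i, tok in enumerate(tokens)}


def featurize(text, vocab):
    # Bag-of-words count vector; out-of-vocabulary tokens are ignored.
    vec = [0.0] * len(vocab)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocab:
            vec[vocab[tok]] = float(count)
    return vec


def softmax(scores):
    # Numerically stable softmax over a list of class scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(x, weights, biases):
    # One linear unit per class: w . x + b.
    return [sum(wj * xj for wj, xj in zip(w, x)) + b
            for w, b in zip(weights, biases)]


def train(texts, labels, epochs=200, lr=0.5):
    # Single-layer softmax network trained with plain SGD on cross-entropy.
    # epochs and lr are illustrative values, not taken from the paper.
    vocab = build_vocab(texts)
    classes = sorted(set(labels))
    weights = [[0.0] * len(vocab) for _ in classes]
    biases = [0.0] * len(classes)
    features = [featurize(t, vocab) for t in texts]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = softmax(class_scores(x, weights, biases))
            for k, cls in enumerate(classes):
                # Gradient of cross-entropy w.r.t. the class-k score.
                grad = probs[k] - (1.0 if cls == y else 0.0)
                biases[k] -= lr * grad
                for j, xj in enumerate(x):
                    if xj:
                        weights[k][j] -= lr * grad * xj
    return vocab, classes, weights, biases


def predict(text, model):
    # Return the class with the highest predicted probability.
    vocab, classes, weights, biases = model
    probs = softmax(class_scores(featurize(text, vocab), weights, biases))
    return classes[probs.index(max(probs))]


# Invented toy data, for illustration only (not from the shared task).
toy_texts = [
    "kindly do the needful sir",
    "very nice super movie",
    "kindly revert back sir",
    "super nice song",
]
toy_labels = ["Hindi", "Tamil", "Hindi", "Tamil"]

model = train(toy_texts, toy_labels)
print(predict("kindly call sir", model))
```

A real system would need many more comments per language, a held-out evaluation split, and most likely richer features (for example character or word n-grams) and at least one hidden layer; none of those choices are confirmed by the abstract.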


Similar resources

Mangalore-University@INLI-FIRE-2017: Indian Native Language Identification using Support Vector Machines and Ensemble approach

This paper describes the systems submitted by our team for the Indian Native Language Identification (INLI) task held in conjunction with FIRE 2017. Native Language Identification (NLI) is an important task with applications in areas such as social-media analysis, authorship identification, second language acquisition and forensic investigation. We submitted two systems usin...


Bits_Pilani@INLI-FIRE-2017: Indian Native Language Identification using Deep Learning

The task of Native Language Identification involves identifying the prior or first-learnt language of a user based on their writing technique and/or analysis of speech and phonetics in a second language. There is a surplus of such data present on social media sites and organised datasets from bodies like Educational Testing Service (ETS), which can be exploited to develop language learning systems an...


Overview of the INLI PAN at FIRE-2017 Track on Indian Native Language Identification

This overview paper describes the first shared task on Indian Native Language Identification (INLI) that was organized at FIRE 2017. Given a corpus of English comments from the Facebook pages of various newspapers, the objective of the task is to identify the native language among the following six Indian languages: Bengali, Hindi, Kannada, Malayalam, Tamil, and Telugu. Altogether, 26 approaches ...


SeerNet@INLI-FIRE-2017: Hierarchical Ensemble for Indian Native Language Identification

Native Language Identification has played an important role in forensics primarily for author profiling and identification. In this work, we discuss our approach to the shared task of Indian Language Identification. The task is primarily to identify the native language of the writer from the given XML file which contains a set of Facebook comments in the English language. We propose a hierarchi...


Bharathi SSN @ INLI-FIRE-2017: SVM based approach for Indian Native Language Identification

Native Language Identification (NLI) is the task of identifying the native language of a writer or a speaker by analyzing their text. NLI can be important for a number of applications. In forensic linguistics, native language is often used as an important feature for authorship profiling and identification. Nowadays due to the huge usage of social media sites and online interactions, receiving ...




Publication date: 2017