KS_JU@DPIL-FIRE2016: Detecting Paraphrases in Indian Languages Using Multinomial Logistic Regression Model

Author

  • Kamal Sarkar
Abstract

In this work, we describe a system that detects paraphrases in Indian languages, developed for our participation in the shared task on Detecting Paraphrases in Indian Languages (DPIL) organized by the Forum for Information Retrieval Evaluation (FIRE) in 2016. Our paraphrase detection method uses a multinomial logistic regression model trained on a variety of features, which are essentially lexical- and semantic-level similarities between the two sentences in a pair. The performance of the system has been evaluated against the test set released for the FIRE 2016 shared task on DPIL. Our system achieves the highest f-measure, 0.95, on task1 in the Punjabi language. On task1 in the Hindi language, our system achieves an f-measure of 0.90. Of the 11 teams that participated in the shared task, only four participated in all four languages (Hindi, Punjabi, Malayalam and Tamil); the remaining 7 teams each participated in one of the four languages. We participated in both task1 and task2 for all four Indian languages. The overall average performance of our system, taken over task1 and task2 across all four languages, is an F1-score of 0.81, which is the second highest score among the four systems that participated in all four languages.
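The approach summarized above, a multinomial logistic regression over sentence-pair similarity features, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the system's actual design: the two features (`word_jaccard`, `char_ngram_cosine`) are generic lexical similarities chosen here, the abstract's semantic-level features are not reproduced, the three class labels and the toy English sentence pairs are hypothetical, and the model is fitted with plain stochastic gradient descent.

```python
import math

def word_jaccard(a, b):
    """Jaccard overlap between the word sets of two sentences."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def char_ngram_cosine(a, b, n=3):
    """Cosine similarity between character n-gram count vectors."""
    def grams(s):
        s = s.lower()
        counts = {}
        for i in range(len(s) - n + 1):
            counts[s[i:i + n]] = counts.get(s[i:i + n], 0) + 1
        return counts
    ga, gb = grams(a), grams(b)
    dot = sum(v * gb.get(g, 0) for g, v in ga.items())
    norm = (math.sqrt(sum(v * v for v in ga.values()))
            * math.sqrt(sum(v * v for v in gb.values())))
    return dot / norm if norm else 0.0

def features(a, b):
    """Feature vector for a sentence pair: bias term plus two similarities."""
    return [1.0, word_jaccard(a, b), char_ngram_cosine(a, b)]

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, x):
    """Class probabilities for feature vector x under weight matrix W."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])

def train(X, y, num_classes, lr=0.5, epochs=500):
    """Fit multinomial logistic regression with per-sample gradient descent."""
    W = [[0.0] * len(X[0]) for _ in range(num_classes)]
    for _ in range(epochs):
        for x, label in zip(X, y):
            probs = predict_proba(W, x)
            for k in range(num_classes):
                # Gradient of the cross-entropy loss w.r.t. the class-k score.
                err = probs[k] - (1.0 if k == label else 0.0)
                for j, xj in enumerate(x):
                    W[k][j] -= lr * err * xj
    return W

# Hypothetical toy data; labels: 0 = paraphrase, 1 = partial, 2 = not a paraphrase.
pairs = [
    ("the cat sat on the mat", "the cat sat on the mat today", 0),
    ("the cat sat on the mat", "a cat was sitting near the door", 1),
    ("the cat sat on the mat", "stock prices fell sharply", 2),
]
X = [features(a, b) for a, b, _ in pairs]
y = [label for _, _, label in pairs]
W = train(X, y, num_classes=3)
print(predict_proba(W, features("the dog sat on the mat", "the dog sat on a mat")))
```

The same code covers a two-class setup by passing `num_classes=2`; in practice a library implementation of multinomial logistic regression and language-specific preprocessing would replace this hand-rolled trainer.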


Similar resources

DPIL@FIRE2016: Overview of the Shared task on Detecting Paraphrases in Indian language

This paper explains the overview of the shared task "Detecting Paraphrases in Indian Languages" (DPIL) conducted at FIRE 2016. Given a pair of sentences in the same language, participants are asked to detect the semantic equivalence between the sentences. The shared task is proposed for four Indian languages namely Tamil, Malayalam, Hindi and Punjabi. The dataset created for the shared task has...


CUSAT_TEAM@ DPIL-FIRE2016: Detecting Paraphrase in Indian Languages-Malayalam

This paper describes the work done as part of the shared task on Detecting Paraphrases in Indian Languages(DPIL) in Forum for Information Retrieval and Evaluation(FIRE 2016). Paraphrase identification is the task of deciding whether two given text fragments have the same meaning. Our detection system is for Malayalam language and makes use of the cosine similarity measure, an existing state of ...


JU_NLP@DPIL-FIRE2016: Paraphrase Detection in Indian Languages - A Machine Learning Approach

This paper presents our system report on our participation in the shared task on “Detecting Paraphrases in Indian Languages (DPIL)” organized in the “Forum for Information Retrieval Evaluation (FIRE)”2016, in both the tasks (Task1 and Task2) defined in this shared task in four Indian languages (Tamil, Malayalam, Hindi and Punjabi). We made use of different similarity measures and machine transl...


BITS_PILANI@DPIL-FIRE2016: Paraphrase Detection in Hindi Language using Syntactic Features of Phrase

Paraphrasing means expressing or conveying the same meaning or essence of a sentence or text using different words or rearrangement of words. Paraphrase detection is a challenge, especially in Indian languages like Hindi, because it is very essential to understand the semantics of the language. Detecting paraphrases is very relevant in real life because it has a lot of importance in application...


KEC@DPIL-FIRE2016: Detection of Paraphrases in Indian Languages (Tamil)

This paper presents a report on Detecting Paraphrases in Indian Languages (DPIL), in particular the Tamil language, by the team NLP@KEC of Kongu Engineering College. Automatic paraphrase detection is an intellectual task which has immense applications like plagiarism detection, new event detection, etc. Paraphrase is defined as the expression of a given fact in more than one way by means of dif...




Publication date: 2016