An Investigation into the Validity of Some Metrics for Automatically Evaluating Natural Language Generation Systems

Similar resources

An Investigation into the Validity of Some Metrics for Automatically Evaluating Natural Language Generation Systems

There is growing interest in using automatically computed corpus-based evaluation metrics to evaluate Natural Language Generation (NLG) systems, because these are often considerably cheaper than the human-based evaluations which have traditionally been used in NLG. We review previous work on NLG evaluation and on validation of automatic metrics in NLP, and then present the results of two studie...

Relevance of Unsupervised Metrics in Task-Oriented Dialogue for Evaluating Natural Language Generation

Automated metrics such as BLEU are widely used in the machine translation literature. They have also been used recently in the dialogue community for evaluating dialogue response generation. However, previous work in dialogue response generation has shown that these metrics do not correlate strongly with human judgment in the non-task-oriented dialogue setting. Task-oriented dialogue responses ...

An Investigation into Translation of Cultural Concepts by Beginner and Advanced Students Using Think-Aloud Protocols

This research aims to answer questions about the translation problems and strategies that translators apply when translating cultural concepts. To address this issue, qualitative and quantitative studies were conducted on two groups of subjects at Imam Reza International University of Mashhad. These two groups were designated as beginner and advanced translation students (10 students). ...

An Investigation into Iranian Teachers' Consistency and Bias in Evaluation of Students' Writings

While performance-based language assessment has led to increased authenticity and content validity in the practice of writing assessment, the reliability of ratings has become a major issue. Research findings have shown different reactions by native English speaker (NES) and non-native English speaker (NNS) teachers to students’ writings. The focus of this study is on investigating whether i...

An Investigation about the Appropriate Stochastic Modeling Framework for Agricultural Insurance Pricing

Given that agricultural crop insurance in Iran serves a largely supportive role and that reported losses generally exceed the premiums collected, this thesis uses shot noise processes as a suitable model for pricing agricultural crop insurance (rainfed wheat). Based on Agricultural Insurance Fund data on the losses declared for rainfed wheat in the 1388-1389 crop year, in this thesis the net and gross premiums of this crop...

Journal

Journal title: Computational Linguistics

Year: 2009

ISSN: 0891-2017,1530-9312

DOI: 10.1162/coli.2009.35.4.35405