Task-Based Evaluation of NLG Systems: Control vs Real-World Context
Abstract
Currently there is little agreement about, or even discussion of, methodologies for task-based evaluation of NLG systems. I discuss one specific issue in this area, namely the importance of control vs the importance of ecological validity (real-world context), and suggest that perhaps we need to put more emphasis on ecological validity in NLG evaluations.
Similar papers
Workshop on Shared Tasks and Comparative Evaluation in Natural Language Generation
Today’s NLG efforts should be compared against actual human performance, which is fluent and varies randomly and with context. Consequently, evaluations should not be done against a fixed ‘gold standard’ text, and shared task efforts should not assume that they can stipulate the representation of the source content and still let players generate the diversity of texts that the real world calls ...
Evaluation of NLG: Some Analogies and Differences with Machine Translation and Reference Resolution
This short paper first outlines an explanatory model that contrasts the evaluation of systems for which human language appears in their input with systems for which language appears in their output, or in both input and output. The paper then compares metrics for NLG evaluation with those applied to MT systems, and then with the case of reference resolution, which is the reverse task of generat...
Real World Modeling and Nonlinear Control of an Electrohydraulic Driven Clutch
In this paper, a complete model of an electrohydraulic driven dry clutch, along with its performance evaluation, is elucidated. Through precise modeling, a complete nonlinear physical and full-order sketch of the clutch is drawn. The strong nonlinearities present in the system prevent it from being controlled by conventional linear control algorithms, and to compensate the behavior of the sy...
Validating the web-based evaluation of NLG systems
The GIVE Challenge is a recent shared task in which NLG systems are evaluated over the Internet. In this paper, we validate this novel NLG evaluation methodology by comparing the Internet-based results with results we collected in a lab experiment. We find that the results delivered by both methods are consistent, but the Internet-based approach offers the statistical power necessary for more fi...
Reuse and Challenges in Evaluating Language Generation Systems: Position Paper
Although there is an increasing shift towards evaluating Natural Language Generation (NLG) systems, there are still many NLG-specific open issues that hinder effective comparative and quantitative evaluation in this field. The paper starts off by describing a task-based, i.e., black-box evaluation of a hypertext NLG system. Then we examine the problem of glass-box, i.e., module specific, evalua...