Instance-Based Natural Language Generation

Authors

  • Sebastian Varges
  • Chris Mellish
Abstract

We investigate the use of instance-based ranking methods for surface realization in natural language generation. Our approach to instance-based natural language generation (IBNLG) employs two components: a rule system that 'overgenerates' a number of realization candidates from a meaning representation, and an instance-based ranker that scores the candidates according to their similarity to examples taken from a training corpus. We develop an efficient search technique for identifying the optimal candidate based on a novel extension of the A* algorithm. The rule system is produced automatically from a semantically annotated fragment of the Penn Treebank II containing management succession texts. We detail the annotation scheme and grammar induction algorithm and evaluate the efficiency and output of the generator. We also discuss issues such as input coverage (completeness) and fluency that are relevant to surface generation in general.
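
The two-component design described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the slot tags, the toy rules, the two corpus instances, and the bigram Jaccard similarity are hypothetical and are not taken from the paper. It also scores every candidate exhaustively instead of using the A*-based search the authors develop.

```python
# Toy overgenerate-and-rank pipeline. All names and data here are hypothetical.

MEANING = {"PERSON": "Smith", "POST": "chairman", "ORG": "Acme"}

# Hand-written stand-ins for induced rules; uppercase tokens are slots.
RULES = [
    "PERSON was named POST of ORG",
    "ORG named PERSON POST",
    "POST of ORG PERSON was named",
    "PERSON was named POST on DATE",  # needs a slot the input lacks
]

# Delexicalised training instances (slot fillers replaced by slot tags).
CORPUS = [
    "PERSON was named POST of ORG yesterday",
    "ORG appointed PERSON as POST",
]


def overgenerate(meaning, rules):
    """Keep every rule whose slots are all supplied by the meaning representation."""
    return [
        rule for rule in rules
        if all(tok in meaning for tok in rule.split() if tok.isupper())
    ]


def bigrams(sentence):
    """Return the set of adjacent token pairs, with boundary markers."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return set(zip(tokens, tokens[1:]))


def similarity(candidate, instance):
    """Jaccard overlap of bigram sets (an assumed stand-in similarity measure)."""
    a, b = bigrams(candidate), bigrams(instance)
    return len(a & b) / len(a | b)


def rank(candidates, corpus):
    """Score each candidate by its most similar corpus instance, best first."""
    scored = [(max(similarity(c, inst) for inst in corpus), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def realise(candidate, meaning):
    """Replace slot tags with the values from the meaning representation."""
    return " ".join(meaning.get(tok, tok) for tok in candidate.split())


if __name__ == "__main__":
    for score, cand in rank(overgenerate(MEANING, RULES), CORPUS):
        print(f"{score:.3f}  {realise(cand, MEANING)}")
```

Scoring every candidate is only workable for a handful of rules. With a realistic grammar the candidate set grows quickly, which is why a guided search for the best-scoring candidate matters.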

Related resources

Designing a Speech Corpus for Instance-based Spoken Language Generation

In spoken language applications such as conversation systems where not only the speech waveforms but also the content of the speech (the text) need to be generated automatically, a Concept-to-Speech (CTS) system is needed. In this paper, we address several issues on designing a speech corpus to facilitate an instance-based integrated CTS platform. Both the instance-based CTS generation approach...

A Repository of Frame Instance Lexicalizations for Generation

Robust, statistical Natural Language Generation from Web knowledge bases is hindered by the lack of text-aligned resources. We aim to fill this gap by presenting a method for extracting knowledge from natural language text, and encode it in a format based on frame semantics and ready to be distributed in the Linked Open Data space. We run an implementation of such methodology on a collection of...

Instance-based Sentence Boundary Determination by Optimization for Natural Language Generation

This paper describes a novel instance-based sentence boundary determination method for natural language generation that optimizes a set of criteria based on examples in a corpus. Compared to existing sentence boundary determination approaches, our work offers three significant contributions. First, our approach provides a general domain independent framework that effectively addresses sentence b...

Special Track on Applied Natural Language Processing

The track on applied natural language processing is a forum for researchers working in natural language processing (NLP), computational linguistics (CL), and related areas. The rapid pace of development of online materials, most of them in textual form or text combined with other media (visual, audio), has led to a revived interest in tools capable of understanding, organizing and mining those materi...

Journal:
  • Natural Language Engineering

Volume: 16

Publication date: 2001