Choosing the content of textual summaries of large time-series data sets

Authors

  • Jin Yu
  • Ehud Reiter
  • Jim Hunter
  • Chris Mellish
Abstract

Natural Language Generation (NLG) can be used to generate textual summaries of numeric data sets. In this paper we develop an architecture for generating short (a few sentences) summaries of large (100KB or more) time-series data sets. The architecture integrates pattern recognition, pattern abstraction, selection of the most significant patterns, microplanning (especially aggregation), and realisation. We also describe and evaluate SumTime-Turbine, a prototype system which uses this architecture to generate textual summaries of sensor data from gas turbines.
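The five stages named in the abstract can be pictured as a simple pipeline. The following is a minimal Python sketch under stated assumptions, not the SumTime-Turbine implementation: the `Pattern` type, every function name, the median-deviation detector, the gap-based merging, the magnitude-based ranking, and the sentence template are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Pattern:
    channel: str      # sensor name (hypothetical)
    start: int        # index of first sample in the pattern
    end: int          # index of last sample in the pattern
    kind: str         # "spike" or "dip"
    magnitude: float  # largest deviation from the channel median


def recognise(channel, values, threshold):
    """Pattern recognition: runs of samples far from the channel median."""
    base = median(values)
    found, run_start = [], None
    for i, v in enumerate(list(values) + [base]):  # sentinel closes a final run
        deviating = abs(v - base) > threshold
        if deviating and run_start is None:
            run_start = i
        elif not deviating and run_start is not None:
            peak = max(values[run_start:i], key=lambda x: abs(x - base))
            kind = "spike" if peak > base else "dip"
            found.append(Pattern(channel, run_start, i - 1, kind, abs(peak - base)))
            run_start = None
    return found


def abstract(patterns, max_gap=2):
    """Pattern abstraction: merge nearby same-kind patterns on one channel."""
    merged = []
    for p in sorted(patterns, key=lambda p: (p.channel, p.kind, p.start)):
        last = merged[-1] if merged else None
        if (last and last.channel == p.channel and last.kind == p.kind
                and p.start - last.end <= max_gap):
            last.end = max(last.end, p.end)
            last.magnitude = max(last.magnitude, p.magnitude)
        else:
            merged.append(Pattern(p.channel, p.start, p.end, p.kind, p.magnitude))
    return merged


def select(patterns, k=3):
    """Content selection: keep the k patterns with the largest magnitude."""
    return sorted(patterns, key=lambda p: p.magnitude, reverse=True)[:k]


def aggregate(patterns):
    """Microplanning: group overlapping same-kind patterns into one message."""
    groups = []
    for p in sorted(patterns, key=lambda p: p.start):
        for g in groups:
            if g[0].kind == p.kind and p.start <= max(q.end for q in g):
                g.append(p)
                break
        else:
            groups.append([p])
    return groups


def realise(groups):
    """Realisation: fill a fixed sentence template for each group."""
    sentences = []
    for g in groups:
        subject = " and ".join(sorted({p.channel for p in g}))
        start, end = min(p.start for p in g), max(p.end for p in g)
        sentences.append(f"{subject} showed a {g[0].kind} between t={start} and t={end}.")
    return sentences


def summarise(data, threshold, k=3):
    """Run all five stages over a dict mapping channel names to sample lists."""
    patterns = [p for name, vals in data.items() for p in recognise(name, vals, threshold)]
    return " ".join(realise(aggregate(select(abstract(patterns), k))))
```

For example, `summarise({"exhaust_temp": [10, 10, 10, 30, 31, 10, 10, 10], "fuel_flow": [5, 5, 5, 5, 15, 5, 5, 5]}, threshold=5)` reports the two overlapping spikes in a single aggregated sentence rather than one sentence per channel, which is the role aggregation plays in keeping a summary short.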


Similar resources

A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach

In recent years, the advancement of information-gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...


SumTime-Turbine: A Knowledge-Based System to Communicate Gas Turbine Time-Series Data

SumTime-Turbine produces textual summaries of archived time-series data from gas turbines. These summaries should help experts understand large data sets that cannot be visually presented in a single graphical display. SumTime-Turbine is based on pattern detection, knowledge-based temporal abstraction (KBTA), and natural language generation (NLG) technology. A prototype version of the system has...


Summarizing Neonatal Time Series Data

We describe our investigations in generating textual summaries of physiological time series data to aid medical personnel in monitoring babies in neonatal intensive care units. Our studies suggest that summarization is a communicative task that requires data analysis techniques for determining the content of the summary. We describe a prototype system that summarizes physiological time series.


SUMTIME: Observations from KA for Weather Domain

SUMTIME (http://www.csd.abdn.ac.in/research/sumtime) is a research project aimed at developing a generic computational model for producing textual summaries of time series data. This report summarises some of the observations made during the initial knowledge acquisition sessions carried out in the weather forecasting domain. Based on these observations, we describe a two-stage model for conten...


IGR For GR/M76881/01: Generating Summaries of Time-Series Data (SumTime) Background/Context

The goal of the SumTime project was to develop better techniques for generating English summaries of numerical time-series data. The modern world is being flooded with such data. For example, a typical gas turbine has 250 sensors, each sampling once per minute. This produces 200MB of data per day, which a maintenance engineer may have one hour (per day) to attempt to understa...



Journal:
  • Natural Language Engineering

Volume 13

Publication date: 2007