Summarising Unreliable Data
Abstract
Unreliable data is present in datasets, and is either ignored, acknowledged ad hoc, or undetected. This paper discusses data quality issues with a view to a potential framework for dealing with them. Such a framework should be applied within data-to-text systems at the point of text generation rather than as an afterthought. The paper also presents ways to express uncertainty through language, World Health Organisation (WHO) corpus studies, and an experiment that analyses how subjects approached summarising data with data quality issues. This work is still ongoing.
Similar resources
Regression analysis of country effects using multilevel data: A cautionary tale
Cross-national differences in outcomes are often analysed using regression analysis of multilevel country datasets, examples of which include the ECHP, ESS, EU-SILC, EVS, ISSP, and SHARE. We review the regression methods applicable to this data structure, pointing out problems with the assessment of country-level factors that appear not to be widely appreciated, and illustrate our arguments usi...
Resilient Blocks for Summarising Distributed Data
Summarising distributed data is a central routine for parallel programming, lying at the core of widely used frameworks such as the map/reduce paradigm. In the IoT context it is even more crucial, being a privileged means of allowing long-range interactions: in fact, summarising is needed to avoid data explosion in each computational unit. We introduce a new algorithm for dynamic summarising of dis...
Distributed Incremental Least Mean-Square for Parameter Estimation using Heterogeneous Adaptive Networks in Unreliable Measurements
Adaptive networks include a set of nodes with adaptation and learning abilities for modeling various types of self-organized and complex activities encountered in the real world. This paper presents the effect of heterogeneously distributed incremental LMS algorithm with ideal links on the quality of unknown parameter estimation. In heterogeneous adaptive networks, a fraction of the nodes, defi...
Automatic summarising: The state of the art
This paper reviews research on automatic summarising in the last decade. This work has grown, stimulated by technology and by evaluation programmes. The paper uses several frameworks to organise the review, for summarising itself, for the factors affecting summarising, for systems,
A New Architecture for Summarising Time Series Data
This paper presents a new architecture for summarising complex time series data, in which four main components together with a knowledge base and a database are integrated. Based on the architecture, a knowledge-based text generation system has been implemented and its main functions are briefly explained in the context of a sample of data. Evaluation of the system has been done and some conclu...