Interpretable Categorization of Heterogeneous Time Series Data
Abstract
The explanation of heterogeneous multivariate time series data is a central problem in many applications. The problem requires two major data mining challenges to be addressed simultaneously: learning models that are human-interpretable and mining heterogeneous multivariate time series data. The intersection of these two areas is not adequately explored in the existing literature. To address this gap, we propose grammar-based decision trees and an algorithm for learning them. Grammar-based decision trees extend decision trees with a grammar framework. Logical expressions, derived from a context-free grammar, are used for branching in place of simple thresholds on attributes. The added expressivity enables support for a wide range of data types while retaining the interpretability of decision trees. By choosing a grammar based on temporal logic, we show that grammar-based decision trees can be used for the interpretable classification of high-dimensional and heterogeneous time series data. In addition to classification, we show how grammar-based decision trees can also be used for categorization, which combines clustering with the generation of an interpretable explanation for each cluster. We apply grammar-based decision trees to analyze the classic Australian Sign Language dataset, and to categorize and explain near midair collisions in support of the development of a prototype aircraft collision avoidance system.
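As a rough illustration of branching on grammar-derived logical expressions rather than attribute thresholds, the sketch below evaluates a tiny hand-built tree over multivariate traces. It is not the authors' learning algorithm: the two-operator mini-grammar (`G` for "globally", `F` for "eventually"), the variable names `separation` and `climb_rate`, the thresholds, and the labels are all hypothetical, and the tree is written by hand rather than learned from data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

# One multivariate time series: variable name -> list of samples.
Trace = Dict[str, List[float]]

# Hypothetical mini-grammar (an assumption, not taken from the paper):
#   expr ::= G(pred) | F(pred)
#   pred ::= lt(var, c) | gt(var, c)
def lt(var: str, c: float):
    return lambda trace: [x < c for x in trace[var]]

def gt(var: str, c: float):
    return lambda trace: [x > c for x in trace[var]]

def G(pred):  # "globally": the predicate holds at every time step
    return lambda trace: all(pred(trace))

def F(pred):  # "eventually": the predicate holds at some time step
    return lambda trace: any(pred(trace))

@dataclass
class Leaf:
    label: str

@dataclass
class Node:
    description: str                  # human-readable form of the expression
    test: Callable[[Trace], bool]     # expression evaluated on a whole trace
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]

def explain(tree: Union[Node, Leaf], trace: Trace) -> Tuple[str, List[str]]:
    """Return the leaf label and the logical expressions along the path."""
    path = []
    while isinstance(tree, Node):
        holds = tree.test(trace)
        path.append(tree.description if holds else f"not {tree.description}")
        tree = tree.if_true if holds else tree.if_false
    return tree.label, path

def classify(tree: Union[Node, Leaf], trace: Trace) -> str:
    return explain(tree, trace)[0]

# Hand-built example tree (hypothetical variables, thresholds, and labels).
tree = Node(
    "F(separation < 500)", F(lt("separation", 500.0)),
    Node("G(climb_rate > 0)", G(gt("climb_rate", 0.0)), Leaf("A"), Leaf("B")),
    Leaf("C"),
)

t1 = {"separation": [900.0, 400.0, 700.0], "climb_rate": [1.0, 2.0, 3.0]}
t2 = {"separation": [900.0, 400.0, 700.0], "climb_rate": [1.0, -2.0, 3.0]}
t3 = {"separation": [900.0, 800.0, 700.0], "climb_rate": [1.0, 2.0, 3.0]}

for t in (t1, t2, t3):
    print(explain(tree, t))
```

The list of expressions collected along the path is what makes each prediction readable: every leaf comes with the conjunction of temporal statements that led to it, which is the sense in which such a tree could describe a class or cluster.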
Journal: CoRR
Volume: abs/1708.09121
Publication date: 2017