Grouping Multivariate Time Series: A Case Study
Authors
Abstract
We present a case study to demonstrate a process for grouping massive multivariate time series based on nonparametric statistical summaries aided by information visualization. We want a method that allows us to quickly find approximate groups in time series, both to identify typical aggregate behaviors and to find aberrant outliers. We use simple statistical summaries to capture the temporal nature and variability of the time series, as well as the interaction between the various multivariate components. Each individual time series is mapped to a fixed-length vector of summaries. The summary vectors are then clustered using any fast clustering algorithm, such as k-means. Appropriate information visualization techniques are used at every stage to guide the analyst. Because the method is nonparametric, it is customizable and flexible, and it generalizes easily. When choosing the statistical summaries, we can incorporate domain knowledge that may enhance the clustering. We demonstrate the method on a massive real-life telecommunications application.
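The pipeline described in the abstract (map each series to a fixed-length summary vector, then cluster the vectors) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the summaries used, so the median, interquartile range, median absolute first difference, and sign-concordance features below are hypothetical choices, as are the function names `summarize` and `kmeans`. The clustering step is a plain Lloyd-style k-means written with the standard library only.

```python
import random
import statistics
from itertools import combinations
from typing import List, Sequence


def summarize(series: Sequence[Sequence[float]]) -> List[float]:
    """Map one multivariate series (rows = time steps, columns = components)
    to a fixed-length vector. Assumes at least two time steps.
    The chosen summaries are illustrative, not taken from the paper."""
    columns = list(zip(*series))
    features: List[float] = []
    diffs_per_column = []
    for col in columns:
        q1, med, q3 = statistics.quantiles(col, n=4)
        diffs = [b - a for a, b in zip(col, col[1:])]
        diffs_per_column.append(diffs)
        # Level, variability, and a simple temporal summary per component.
        features += [med, q3 - q1, statistics.median(abs(d) for d in diffs)]
    # Interaction between components: share of steps moving in the same direction.
    for da, db in combinations(diffs_per_column, 2):
        same = sum(1 for x, y in zip(da, db) if x * y > 0)
        features.append(same / len(da))
    return features


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors: List[List[float]], k: int,
           iterations: int = 50, seed: int = 0) -> List[int]:
    """Basic Lloyd-style k-means; returns one cluster label per vector."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest center for each vector.
        labels = [min(range(k), key=lambda c: sq_dist(v, centers[c]))
                  for v in vectors]
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(x) / len(members) for x in zip(*members)]
    return labels


def group_series(all_series: Sequence[Sequence[Sequence[float]]], k: int) -> List[int]:
    """Summarize every series, then cluster the summary vectors."""
    return kmeans([summarize(s) for s in all_series], k)
```

Because every series is reduced to the same number of features regardless of its length, series of different lengths can be grouped together. In practice the summary features would probably need rescaling before clustering so that no single feature dominates the distance; that step is omitted here for brevity.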
Similar resources
Evaluation of Univariate, Multivariate and Combined Time Series Model to Prediction and Estimation the Mean Annual Sediment (Case Study: Sistan River)
Erosion, sediment transport, and sediment estimation, together with the damage they cause in rivers, are among the most important topics in river engineering. Correctly modeling and predicting this parameter while accounting for river flow discharge can be very useful for the service life of hydraulic structures and drainage networks. In fact, using the multivariate models and involving the effective other parameter...
Identification of outliers types in multivariate time series using genetic algorithm
Multivariate time series data are often modeled using a vector autoregressive moving average (VARMA) model. However, the presence of outliers can violate the stationarity assumption and may lead to incorrect modeling, biased parameter estimates, and inaccurate prediction. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...
An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering
Multivariate time series (MTS) data are ubiquitous in science and daily life, and measuring their similarity is a core part of the MTS analysis process. Many research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...
Missing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Imputing missing data in multivariate time series is a challenging topic and needs to be carefully considered before learning from or predicting time series. Much research has been done on the use of diffe...
Variable grouping in multivariate time series via correlation
The decomposition of high-dimensional multivariate time series (MTS) into a number of low-dimensional MTS is a useful but challenging task because the number of possible dependencies between variables is likely to be huge. This paper is about a systematic study of the "variable groupings" problem in MTS. In particular, we investigate different methods of utilizing the information regarding corr...