Towards a Faster Symbolic Aggregate Approximation Method
Abstract
Similarity search is one of the main problems in time series data mining. Traditionally, it was tackled by sequentially comparing the given query against every time series in the database and returning all time series that lie within a predetermined threshold of that query. However, the large size and high dimensionality of the time series databases in use today make sequential scanning inefficient. Many representation techniques aim to reduce the dimensionality of time series so that the search can be carried out faster in a lower-dimensional space. The symbolic aggregate approximation (SAX) is one of the most competitive of these methods in the literature. In this paper we present a new method that improves the performance of SAX by adding a second exclusion condition, which increases its exclusion power. The method uses two representations of each time series: the SAX representation and one based on an optimal approximation of the time series. Distances are pre-computed and stored offline, then used online to exclude a wide range of the search space via the two exclusion conditions. Our experiments show that the new method is faster than SAX.
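The pruning idea that the abstract builds on can be illustrated with a minimal sketch of standard SAX: each series is z-normalized, reduced with piecewise aggregate approximation, and discretized into symbols; a lower-bounding distance (MINDIST) between two SAX words then lets a range query skip candidates without computing the full Euclidean distance. This sketch shows only that single, standard exclusion condition. The abstract does not spell out the second condition or the optimal approximation, so neither is reproduced here. All function names, the word length `w`, and the alphabet size `a` are illustrative assumptions, and the sketch assumes that `w` divides the series length.

```python
import math
from statistics import NormalDist


def znorm(ts):
    """Z-normalize a series (zero mean, unit variance)."""
    n = len(ts)
    mean = sum(ts) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in ts) / n)
    return [0.0] * n if std == 0 else [(x - mean) / std for x in ts]


def paa(ts, w):
    """Piecewise aggregate approximation: mean of each of w equal segments.
    Assumes w divides len(ts)."""
    seg = len(ts) // w
    return [sum(ts[i * seg:(i + 1) * seg]) / seg for i in range(w)]


def breakpoints(a):
    """a - 1 breakpoints that cut the standard normal into a equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(i / a) for i in range(1, a)]


def sax_word(ts, w, a):
    """SAX word as a list of symbol indices in 0..a-1."""
    bps = breakpoints(a)
    return [sum(1 for b in bps if v >= b) for v in paa(znorm(ts), w)]


def mindist(word_q, word_c, n, a):
    """MINDIST between two SAX words; lower-bounds the Euclidean distance
    between the two z-normalized series of original length n."""
    bps = breakpoints(a)
    total = 0.0
    for r, c in zip(word_q, word_c):
        if abs(r - c) > 1:  # adjacent or equal symbols contribute zero
            total += (bps[max(r, c) - 1] - bps[min(r, c)]) ** 2
    return math.sqrt(n / len(word_q)) * math.sqrt(total)


def euclidean(x, y):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def range_query(query, database, eps, w=4, a=4):
    """Return indices of series within eps of the query (z-normalized Euclidean).
    In practice the database SAX words would be computed once, offline."""
    q_norm = znorm(query)
    q_word = sax_word(query, w, a)
    hits = []
    for idx, ts in enumerate(database):
        # Exclusion condition: if the lower bound already exceeds eps,
        # the true distance must as well, so skip the candidate.
        if mindist(q_word, sax_word(ts, w, a), len(query), a) > eps:
            continue
        if euclidean(q_norm, znorm(ts)) <= eps:
            hits.append(idx)
    return hits
```

Because MINDIST never exceeds the true distance, the exclusion step cannot discard a genuine match; it only saves full distance computations. A tighter or additional lower bound, as the abstract proposes, would exclude more candidates at the same stage.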
Similar resources
Enhancing the Symbolic Aggregate Approximation Method Using Updated Lookup Tables
Similarity search in time series data mining is a problem that has attracted increasing attention recently. The high dimensionality and large volume of time series databases make sequential scanning inefficient to tackle this problem. There are many representation techniques that aim at reducing the dimensionality of time series so that the search can be handled faster at a lower dimensional sp...
A Symbolic Representation Method to Preserve the Characteristic Slope of Time Series
In recent years many studies have addressed knowledge discovery in time series. Most methods use some technique to transform the raw data into another representation. Symbolic representation approaches have proven effective at speeding up processing and removing noise. The most commonly used algorithm at present is the Symbolic Aggregate Approximation (SAX). However, SAX doesn’t preserve the sl...
Measuring Similarity of Automatically Extracted Melodic Pitch Contours for Audio-based Query by Humming of Polyphonic Music Collections
A study of the melodic similarity of pitch contours automatically obtained from audio files in the context of Query by Humming is presented. Pitch contours are extracted directly from monophonic (query files) and polyphonic (commercial songs) audio files using the state-of-the-art MELODIA algorithm [SG12] for automatic estimation of predominant melodic contours. The contours are then coded using the ...
Using SAX representation for human action recognition
Human action recognition is an important problem in Computer Vision. Although most existing solutions achieve good accuracy, the methods are often overly complex and computationally expensive, which hinders practical application. In this regard, we introduce Symbolic Aggregate Approximation (SAX) to address the problem of human action recognition. Given motion trajectories of referen...
Extended SAX: Extension of Symbolic Aggregate Approximation for Financial Time Series Data Representation
Efficient and accurate similarity search over large time series data sets is an important but non-trivial problem. Many dimensionality reduction techniques have been proposed to represent time series data effectively for such similarity search, including Singular Value Decomposition (SVD), the Discrete Fourier Transform (DFT), the Adaptive Piecewise Consta...