Duality-Based Subsequence Matching in Time-Series Databases

Authors

  • Yang-Sae Moon
  • Kyu-Young Whang
  • Woong-Kee Loh
Abstract

In this paper, we propose a new subsequence matching method, Dual Match, which exploits duality in constructing windows and significantly improves performance. Dual Match divides data sequences into disjoint windows and the query sequence into sliding windows, and thus is the dual of the approach by Faloutsos et al. (FRM for short), which divides data sequences into sliding windows and the query sequence into disjoint windows. We formally prove that our dual approach is correct, i.e., it incurs no false dismissal. We also prove that, given the minimum query length, there is an upper bound on the window size that guarantees the correctness of Dual Match, and we discuss the effect of the window size on performance. FRM causes many false alarms (i.e., candidates that do not qualify) because it stores minimum bounding rectangles rather than the individual points representing windows, in order to avoid the excessive storage space the index would otherwise require. Dual Match solves this problem by directly storing points, but without incurring excessive storage overhead. Experimental results show that, in most cases, Dual Match provides a large improvement in both false alarms and performance over FRM, given the same amount of storage space. In particular, for low selectivities (less than 10^-4), Dual Match improves performance significantly, by up to 430-fold. On the other hand, for high selectivities (more than 10^-2), it shows a very minor degradation (less than 29%). For selectivities in between (10^-4 to 10^-2), Dual Match performs slightly better than FRM. Dual Match is also 4.10 to 25.6 times faster than FRM in building indexes of approximately the same size. Overall, these results indicate that our approach provides a new paradigm in subsequence matching that improves performance significantly in large database applications.
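
The window construction described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the function names are hypothetical, a linear scan stands in for the multidimensional index, and no dimensionality reduction is applied to the windows. The window-size bound used here, floor((L + 1) / 2) for minimum query length L, is an assumption based on the observation that any subsequence of length at least 2w - 1 fully contains at least one disjoint window of size w.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def disjoint_windows(seq, w):
    """Non-overlapping windows of size w (data side); a trailing remainder is dropped."""
    return [(i, seq[i:i + w]) for i in range(0, len(seq) - w + 1, w)]


def sliding_windows(seq, w):
    """All windows of size w at every offset (query side)."""
    return [(i, seq[i:i + w]) for i in range(len(seq) - w + 1)]


def max_window_size(min_query_len):
    """Assumed upper bound on w so that every match contains a whole disjoint window."""
    return (min_query_len + 1) // 2


def candidate_offsets(data, query, w, eps):
    """Start offsets in `data` that may match `query` within `eps`.

    If the whole subsequence is within eps of the query, every aligned
    window pair is too, so filtering on window distance drops no true match.
    """
    cands = set()
    for q_off, q_win in sliding_windows(query, w):
        for d_off, d_win in disjoint_windows(data, w):
            if euclid(q_win, d_win) <= eps:
                start = d_off - q_off
                if 0 <= start <= len(data) - len(query):
                    cands.add(start)
    return sorted(cands)


def subsequence_match(data, query, eps):
    """Filter with windows, then verify each candidate (removes false alarms)."""
    w = max_window_size(len(query))
    n = len(query)
    return [s for s in candidate_offsets(data, query, w, eps)
            if euclid(data[s:s + n], query) <= eps]


def brute_force(data, query, eps):
    """Reference answer: check every offset directly."""
    n = len(query)
    return [s for s in range(len(data) - n + 1)
            if euclid(data[s:s + n], query) <= eps]
```

In this sketch only one point per disjoint window is stored on the data side, which is where the storage saving relative to storing every sliding window comes from; the final verification step discards any candidates that turn out to be false alarms.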

Similar resources

Efficient Time-series Subsequence Matching Using Duality in Constructing Windows

Subsequence matching in time-series databases is an important problem in data mining and has attracted a lot of research interest. It is a problem of finding the data sequences containing subsequences similar to a given query sequence and of finding the offsets of these subsequences in the original data sequences. In this paper, we propose a new approach (Dual Match) to subsequence matching that exp...

Similar Subsequence Search in Time Series Databases

Finding matching subsequences in time series data is an important problem. The classical approach to searching for matching subsequences has been based on the principle of exhaustive search, where all possible candidates are generated and evaluated or all the terms of the time series in a database are examined. As a result most of the subsequence search algorithms are cubic in nature with few algorithm...

Linear Detrending Subsequence Matching in Time-Series Databases

Each time series has its own linear trend, the directionality of the time series, and removing the linear trend is crucial for obtaining more intuitive matching results. Supporting linear detrending in subsequence matching is a challenging problem due to the huge number of possible subsequences. In this paper we define this problem as linear detrending subsequence matching and propose its efficien...

Ranked Subsequence Matching in Time-Series Databases

Existing work on similar sequence matching has focused on either whole matching or range subsequence matching. In this paper, we present novel methods for ranked subsequence matching under time warping, which finds top-k subsequences most similar to a query sequence from data sequences. To the best of our knowledge, this is the first and most sophisticated subsequence matching solution mentione...

A Subsequence Matching with Gaps-Range-Tolerances Framework: A Query-By-Humming Application

We propose a novel subsequence matching framework that allows for gaps in both the query and target sequences, variable matching tolerance levels efficiently tuned for each query and target sequence, and also constrains the maximum match length. Using this framework, a space and time efficient dynamic programming method is developed: given a short query sequence and a large database, our method...

Publication date: 2001