Efficient Similarity Search Techniques with a Real-Time Approximate Analysis in Streaming Database
Abstract
In many applications, such as sensor networks, similarity search is more practical than exact matching for stream processing, where both the queries and the data items change continuously over time. The volume of multiple streams can be very large, since new items are appended continuously. The main idea is to use our proposed techniques to build a small synopsis instead of keeping the original streams, and then to provide approximate answers for many different classes of aggregate queries. In this paper, we present the D-skyline and T-skyline methods, which give almost “true” results in approximate analysis of similarity search queries in streaming environments.
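The synopsis idea described above can be illustrated with a minimal Python sketch. It is not the D-skyline or T-skyline method, whose details are not given here. As a stand-in, it summarizes each stream by one mean per fixed-size block over a sliding window (a piecewise-aggregate summary), then answers an aggregate query and a similarity query from the summaries alone. The names `WindowSynopsis`, `window_mean`, and `approx_distance` are hypothetical, and the sketch assumes both streams are sampled in lockstep so that their blocks line up.

```python
from collections import deque
from math import sqrt


class WindowSynopsis:
    """Hypothetical synopsis: one mean per block of `block` items,
    kept only for the most recent `blocks` completed blocks."""

    def __init__(self, block: int, blocks: int):
        self.block = block
        # deque with maxlen evicts the oldest block mean automatically
        self.means = deque(maxlen=blocks)
        self._sum = 0.0
        self._count = 0

    def append(self, value: float) -> None:
        """Consume one stream item; the raw item is not stored."""
        self._sum += value
        self._count += 1
        if self._count == self.block:
            self.means.append(self._sum / self.block)
            self._sum, self._count = 0.0, 0

    def window_mean(self) -> float:
        """Mean over the completed blocks in the window
        (items of a still-open block are ignored)."""
        if not self.means:
            raise ValueError("no completed block yet")
        return sum(self.means) / len(self.means)


def approx_distance(a: WindowSynopsis, b: WindowSynopsis) -> float:
    """Distance computed from block means only. For aligned blocks this
    never exceeds the Euclidean distance between the raw windows."""
    if a.block != b.block or len(a.means) != len(b.means):
        raise ValueError("synopses must cover the same window layout")
    return sqrt(a.block * sum((x - y) ** 2 for x, y in zip(a.means, b.means)))
```

A query stream and a data stream would each feed their own `WindowSynopsis`. Because `approx_distance` is a lower bound on the true distance, a candidate whose approximate distance already exceeds the search radius can be discarded without the original items.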
Similar resources
Similarity Search in Time Series Databases
In many application domains, data can be represented as a series of values (time series). Examples include stocks, seismic signals, audio, and many more. Similarity search in time series databases is an important research direction. Several methods have been proposed in order to provide algorithms for efficient query processing in the case of static time series of fixed length. Research in this...
Adaptive similarity search in streaming time series with sliding windows
The challenge in a database of evolving time series is to provide efficient algorithms and access methods for query processing, taking into consideration the fact that the database changes continuously as new data become available. Traditional access methods that continuously update the data are considered inappropriate, due to significant update costs. In this paper, we use the IDC-Index (Incr...
Design and Test of the Real-time Text mining dashboard for Twitter
One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are currently being produced at high speed, in large volumes, and in a wide variety of formats. Data with such features are called big data. Extracting, processing, and visualizing this huge amount of data has become one of the concerns of data science scholar...
Similarity Range Queries in Streaming Time Series
Similarity search in time series databases is an important research direction. Several methods have been proposed in order to provide algorithms for efficient query processing in the case of static time series of fixed length. In streaming time series the similarity problem is more complex, since the dynamic nature of streaming data makes these methods inappropriate. In this paper, we propose a ...
Similarity Search and Locality Sensitive Hashing using TCAMs
Similarity search methods are widely used as kernels in various data mining and machine learning applications including those in computational biology, web search/clustering. Nearest neighbor search (NNS) algorithms are often used to retrieve similar entries, given a query. While there exist efficient techniques for exact query lookup using hashing, similarity search using exact nearest neighbo...