Filtration Algorithms for Approximate Order-Preserving Matching
Abstract
The exact order-preserving matching problem is to find all the substrings of a text T which have the same length and relative order as a pattern P. Like string matching, order-preserving matching can be generalized by allowing the match to be approximate. In approximate order-preserving matching, two strings match if they have the same relative order after removing up to k elements in the same positions in both strings. In this paper we present practical solutions for this problem. The methods are based on filtration, and one of them is the first sublinear solution on average. We show by practical experiments that the new solutions are fast and efficient.
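To make the two definitions above concrete, here is a minimal brute-force sketch in Python. It is not the filtration method of the paper, which this abstract does not describe; it only checks the definitions directly and is exponential in k. The function names (`rank_pattern`, `op_equal`, `approx_op_equal`, `find_matches`) are hypothetical, and treating equal values as sharing a rank is an assumption made for this illustration.

```python
from itertools import combinations


def rank_pattern(seq):
    """Return the dense ranks of seq; equal values share a rank (assumption)."""
    ranks = {v: r for r, v in enumerate(sorted(set(seq)))}
    return [ranks[v] for v in seq]


def op_equal(x, y):
    """Exact order-preserving match: same length and same relative order."""
    return len(x) == len(y) and rank_pattern(x) == rank_pattern(y)


def approx_op_equal(x, y, k):
    """Approximate match: x and y have the same relative order after
    removing up to k elements at the same positions in both."""
    if len(x) != len(y):
        return False
    n = len(x)
    for r in range(min(k, n) + 1):
        # Try every set of r positions to drop from both sequences.
        for drop in combinations(range(n), r):
            dropped = set(drop)
            xs = [v for i, v in enumerate(x) if i not in dropped]
            ys = [v for i, v in enumerate(y) if i not in dropped]
            if op_equal(xs, ys):
                return True
    return False


def find_matches(text, pattern, k=0):
    """Start positions of all length-m windows of text that match pattern
    with at most k removals (k=0 gives exact order-preserving matching)."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if approx_op_equal(text[i:i + m], pattern, k)]


text = [10, 20, 15, 30, 5, 25, 40]
pattern = [1, 3, 2]
print(find_matches(text, pattern))       # [0]
print(find_matches(text, pattern, k=1))  # [0, 1, 2, 4]
```

With k = 0 only the window [10, 20, 15] has the shape "low, high, middle" of the pattern. With k = 1 a window also matches if dropping one position from both the window and the pattern leaves two pairs in the same order, which is why positions 1, 2 and 4 are reported as well.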
Related papers
Order-Preserving Matching with Filtration
The problem of order-preserving matching has gained attention lately. The text and the pattern consist of numbers. The task is to find all substrings in the text which have the same relative order as the pattern. The problem has applications in analysis of time series like stock market or weather data. Solutions based on the KMP and BMH algorithms have been presented earlier. We present a new s...
A Fast Order-Preserving Matching with q-neighborhood Filtration Using SIMD Instructions
The order-preserving matching problem is a variant of the pattern matching problem focusing on shapes of sequences instead of values of sequences. Given a text and a pattern, the problem is to output all positions where the pattern and a subsequence in the text are of the same relative order. Chhabra and Tarhio proposed a fast algorithm based on filtration for the order-preserving matching prob...
Adaptive Approximate Record Matching
Typographical data-entry errors and incomplete documents produce imperfect records in real-world databases. These errors generate distinct records which belong to the same entity. The aim of Approximate Record Matching is to find multiple records which belong to an entity. In this paper, an algorithm for Approximate Record Matching is proposed that can be adapted automatically with input error...
Alternative Algorithms for Order-Preserving Matching
The problem of order-preserving matching is to find all substrings in the text which have the same relative order and length as the pattern. Several online solutions and one offline solution have been proposed earlier for the problem. In this paper, we introduce three new solutions based on filtration. The two online solutions rest on the SIMD (Single Instruction Multiple Data) architecture and the offline so...
Faster Filters for Approximate String Matching
We introduce a new filtering method for approximate string matching called the suffix filter. It has some similarity with well-known filtration algorithms, which we call factor filters, and which are among the best practical algorithms for approximate string matching using a text index. Suffix filters are stronger, i.e., produce fewer false matches than factor filters. We demonstrate experiment...