Linear Algorithm for Conservative Degenerate Pattern Matching
Abstract
A degenerate symbol x̃ over an alphabet Σ is a non-empty subset of Σ, and a sequence of such symbols is a degenerate string. A degenerate string is said to be conservative if its number of non-solid symbols is upper-bounded by a fixed positive constant k. We consider here the matching problem of conservative degenerate strings and present the first linear-time algorithm that can find, for given degenerate strings P̃ and T̃ of total length n containing k non-solid symbols in total, the occurrences of P̃ in T̃ in O(nk) time.
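The definitions above can be made concrete with a small sketch. It is a naive quadratic-time baseline, not the O(nk) algorithm announced in the abstract, whose details are not given here. Two things are assumptions and do not come from the source: the bracket notation `a[bc]d` for writing degenerate strings, and the usual convention that two degenerate symbols match when their sets share at least one letter. All function names are hypothetical.

```python
from typing import FrozenSet, List

DegenerateString = List[FrozenSet[str]]


def parse(text: str) -> DegenerateString:
    """Parse bracket notation such as 'a[bc]d' into [{a}, {b, c}, {d}].

    The notation is an illustrative assumption, not taken from the paper.
    """
    result: DegenerateString = []
    i = 0
    while i < len(text):
        if text[i] == "[":
            j = text.index("]", i)  # raises ValueError if the bracket is unclosed
            symbol = frozenset(text[i + 1:j])
            if not symbol:
                raise ValueError("a degenerate symbol must be a non-empty set")
            result.append(symbol)
            i = j + 1
        else:
            result.append(frozenset(text[i]))  # solid symbol: a singleton set
            i += 1
    return result


def non_solid_count(s: DegenerateString) -> int:
    """Count the symbols that contain more than one letter."""
    return sum(1 for symbol in s if len(symbol) > 1)


def is_conservative(pattern: DegenerateString, text: DegenerateString, k: int) -> bool:
    """Check that pattern and text contain at most k non-solid symbols in total."""
    return non_solid_count(pattern) + non_solid_count(text) <= k


def naive_occurrences(pattern: DegenerateString, text: DegenerateString) -> List[int]:
    """Return every start position where the pattern matches the text.

    Assumes two symbols match when their sets intersect. This compares the
    pattern at every position, so it is a baseline for checking results,
    not a linear-time method.
    """
    m, n = len(pattern), len(text)
    return [
        i
        for i in range(n - m + 1)
        if all(pattern[j] & text[i + j] for j in range(m))
    ]


if __name__ == "__main__":
    p = parse("a[bc]")
    t = parse("ab[ac]cad")
    print(naive_occurrences(p, t))
    print(is_conservative(p, t, k=2))
```

A baseline of this kind is mainly useful as a reference against which a faster matcher can be tested on small inputs.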
Similar resources
A New Approach to Pattern Matching in Degenerate DNA/RNA Sequences and Distributed Pattern Matching
In this paper, we consider the pattern matching problem in DNA and RNA sequences where either the pattern or the text can be degenerate, i.e., contain sets of characters. We present an asymptotically faster algorithm for the above problem that works in O(n log m) time, where n and m are the lengths of the text and the pattern, respectively. We also suggest an efficient implementation of our algorithm...
Efficient Pattern Matching in Elastic-Degenerate Strings
In this paper, we extend the notion of gapped strings to elastic-degenerate strings. An elastic-degenerate string can be seen as an ordered collection of k > 1 seeds (substrings/subpatterns) interleaved by elastic-degenerate symbols such that each elastic-degenerate symbol corresponds to a set of two or more variable-length strings. Here, we present an algorithm for solving the pattern matchi...
Pattern Matching in Degenerate DNA/RNA Sequences
In this paper, we consider the pattern matching problem in DNA and RNA sequences where either the pattern or the text can be degenerate, i.e., contain sets of characters. We present an asymptotically faster algorithm for the above problem that works in O(n log m) time, where n and m are the lengths of the text and the pattern, respectively. We also suggest an efficient implementation of our algorithm...
Efficient pattern matching in degenerate strings with the Burrows-Wheeler transform
A degenerate or indeterminate string on an alphabet Σ is a sequence of non-empty subsets of Σ. Given a degenerate string t of length n, we present a new method based on the Burrows–Wheeler transform for searching for a degenerate pattern of length m in t, running in O(mn) time on a constant-size alphabet Σ. Furthermore, it is a hybrid pattern-matching technique that works on both regular and dege...
Parallel Algorithms for Degenerate and Weighted Sequences Derived from High Throughput Sequencing Technologies
Novel high-throughput sequencing technologies have redefined the way genome sequencing is performed. They are able to produce millions of short sequences in a single experiment and at a much lower cost than previous methods. In this paper, we address the problem of efficiently mapping and classifying millions of degenerate and weighted sequences to a reference genome, based on whether they oc...
Journal title:
- Eng. Appl. of AI
Volume 51, Issue
Pages -
Publication date 2016