ESP-index: A compressed index based on edit-sensitive parsing
Similar resources
A Searchable Compressed Edit-Sensitive Parsing
A searchable data structure for edit-sensitive parsing (ESP) is proposed. Given a string S, its ESP tree is equivalent to a context-free grammar G generating just S, which is represented by a DAG. Using succinct data structures for trees and permutations, G is decomposed into two LOUDS bit strings and a single array in (1+ε)n log n + 4n + o(n) bits for any 0 < ε < 1 and the number n of variabl...
Index-compressed vector quantisation based on index mapping
The authors introduce a novel coding technique which significantly improves the performance of the traditional vector quantisation (VQ) schemes at low bit rates. High interblock correlation in natural images results in a high probability that neighbouring image blocks are mapped to small subsets of the VQ codebook, which contains highly correlated codevectors. If, instead of the whole VQ codebo...
A Compressed Text Index on Secondary Memory
We introduce a practical disk-based compressed text index that, when the text is compressible, takes much less space than the suffix array. It provides good I/O times for searching, which in particular improve when the text is compressible. In this aspect our index is unique, as most compressed indexes are slower than their classical counterparts on secondary memory. We analyze our index and sh...
The FM-Index: A Compressed Full-Text Index Based on the BWT
In this talk we address the issue of indexing compressed data both from the theoretical and the practical point of view. We start by introducing the FM-index data structure [2] that supports substring searches and occupies a space which is a function of the entropy of the indexed data. The key feature of the FM-index is that it encapsulates the indexed data (self-index) and achieves the space r...
The Compressed Overlap Index
For analysing text algorithms, for computing superstrings, or for testing random number generators, one needs to compute all overlaps between any pairs of words in a given set. The positions of overlaps of a word onto itself, or of two words, are needed to compute the absence probability of a word in a random text, or the numbers of common words shared by two random texts. In all these contexts...
Journal
Journal title: Journal of Discrete Algorithms
Year: 2013
ISSN: 1570-8667
DOI: 10.1016/j.jda.2012.07.009