Succinct Indexes

Author

  • Meng He
Abstract

This thesis defines and designs succinct indexes for several abstract data types (ADTs). The concept is to design auxiliary data structures that ideally occupy asymptotically less space than the information-theoretic lower bound on the space required to encode the given data, and support an extended set of operations using the basic operators defined in the ADT. As opposed to succinct (integrated data/index) encodings, the main advantage of succinct indexes is that we make assumptions only on the ADT through which the main data is accessed, rather than the way in which the data is encoded. This allows more freedom in the encoding of the main data. In this thesis, we present succinct indexes for various data types, namely strings, binary relations, multi-labeled trees and multi-labeled graphs, as well as succinct text indexes. For strings, binary relations and multi-labeled trees, when the operators in the ADTs are supported in constant time, our results are comparable to previous results, while allowing more flexibility in the encoding of the given data. Using our techniques, we improve several previous results. We design succinct representations for strings and binary relations that are more compact than previous results, while supporting access/rank/select operations efficiently. Our high-order entropy compressed text index provides more efficient support for searches than previous results that occupy essentially the same amount of space. Our succinct representation for labeled trees supports more operations than previous results do. We also design the first succinct representations of labeled graphs. To design succinct indexes, we also have some preliminary results on succinct data structure design. We present a theorem that characterizes a permutation as a suffix array, based on which we design succinct text indexes. 
We design a succinct representation of ordinal trees that supports all the navigational operations supported by various succinct tree representations. This representation also supports two other encoding schemes of ordinal trees as abstract data types. Finally, we design succinct representations of planar triangulations and planar graphs that support the rank/select of edges in counterclockwise order in addition to the operations supported in previous work, and a succinct representation of k-page graphs that supports more efficient navigation than previous results for large values of k.
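
The separation between index and data encoding described in the abstract can be illustrated with a small sketch. This is not the construction from the thesis: the class name `SampledRankIndex`, the `block` parameter, and the per-block count tables are assumptions chosen for readability, and the tables are far larger than a true succinct index would allow. The only point shown is that the index touches the data through a single ADT operator, `access(i)`, so the same index code works over differently encoded strings.

```python
class SampledRankIndex:
    """Illustrative rank/select index that reads the data only via access(i).

    Hypothetical sketch: it stores full symbol counts at every block
    boundary, which is NOT succinct. It only demonstrates the index/ADT split.
    """

    def __init__(self, access, length, block=4):
        self.access = access      # ADT operator: position -> symbol
        self.length = length
        self.block = block
        self.samples = []         # samples[j] = symbol counts in [0, j*block)
        running = {}
        for i in range(length):
            if i % block == 0:
                self.samples.append(dict(running))
            c = access(i)
            running[c] = running.get(c, 0) + 1
        if not self.samples:      # empty string: keep one empty sample
            self.samples.append({})

    def rank(self, symbol, i):
        """Number of occurrences of symbol in positions [0, i)."""
        if not 0 <= i <= self.length:
            raise IndexError(i)
        j = min(i // self.block, len(self.samples) - 1)
        count = self.samples[j].get(symbol, 0)
        # Scan the remainder with access() calls only.
        for p in range(j * self.block, i):
            if self.access(p) == symbol:
                count += 1
        return count

    def select(self, symbol, k):
        """Position of the k-th occurrence of symbol (k >= 1), or -1."""
        if k < 1 or self.rank(symbol, self.length) < k:
            return -1
        lo, hi = 0, self.length
        # Find the smallest i with rank(symbol, i) >= k; the answer is i - 1.
        while lo < hi:
            mid = (lo + hi) // 2
            if self.rank(symbol, mid) >= k:
                hi = mid
            else:
                lo = mid + 1
        return lo - 1


# Two different encodings of the same string, both exposing only access(i).
text = "abracadabra"
packed = text.encode("ascii")

plain_index = SampledRankIndex(lambda i: text[i], len(text))
packed_index = SampledRankIndex(lambda i: chr(packed[i]), len(packed))
```

Both `plain_index` and `packed_index` answer the same queries, for example `rank("a", 11)` and `select("a", 3)`, even though the underlying data is stored differently. That independence from the encoding is the flexibility the abstract attributes to succinct indexes.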

Similar resources

Succinct Indexes for Strings, Binary Relations and Multi-labeled Trees

We define and design succinct indexes for several abstract data types (ADTs). The concept is to design auxiliary data structures that ideally occupy asymptotically less space than the information-theoretic lower bound on the space required to encode the given data, and support an extended set of operations using the basic operators defined in the ADT. The main advantage of succinct indexes as o...

Optimized succinct data structures for massive data

Succinct data structures provide the same functionality as their corresponding traditional data structure in compact space. We improve on functions rank and select, which are the basic building blocks of FM-indexes and other succinct data structures. First, we present a cache-optimal, uncompressed bitvector representation which outperforms all existing approaches. Next, we improve, in both sp...

A Space-Efficient Framework for Dynamic Point Location

Let G be a planar subdivision with n vertices. A succinct geometric index for G is a data structure that occupies o(n) bits beyond the space required to store the coordinates of the vertices of G, while supporting efficient queries. We describe a general framework for converting dynamic data structures for planar point location into succinct geometric indexes, provided that the subdivision G to...

Succinct: Enabling Queries on Compressed Data

Succinct is a data store that enables efficient queries directly on a compressed representation of the input data. Succinct uses a compression technique that allows random access into the input, thus enabling efficient storage and retrieval of data. In addition, Succinct natively supports a wide range of queries including count and search of arbitrary strings, range ...

Succinct Suffix Arrays Based on Run-Length Encoding

A succinct full-text self-index is a data structure built on a text T = t1 t2 ... tn, which takes little space (ideally close to that of the compressed text), permits efficient search for the occurrences of a pattern P = p1 p2 ... pm in T, and is able to reproduce any text substring, so the self-index replaces the text. Several remarkable self-indexes have been developed in recent years. The...

Publication date: 2008