Approximate Queries on Set-valued Attributes
Authors
Abstract
Sets and sequences are commonly used to model complex entities. Attributes containing sets or sequences of elements appear in various application domains, e.g., telecommunication and retail databases, web server log tools, and bioinformatics. However, support for such attributes is usually limited to definition and storage in relational tables. Contemporary database systems support neither indexing nor advanced querying of set or sequence attributes, such as set containment or set similarity queries. In this paper we focus on approximate queries on set and sequence attributes. We present the notion of an approximate query and review the similarity measures proposed so far for such attributes. We introduce a new similarity measure that can be successfully used with sequences. We present the hierarchical bitmap index, a novel and efficient indexing technique for sets, and show how the hierarchical bitmap index framework can be extended to incorporate sequences as well. We conclude with algorithms for efficient approximate query processing using the hierarchical bitmap index.
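To make the notion of an approximate query on a set-valued attribute concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method of the paper: it uses the standard Jaccard coefficient as the similarity measure (the abstract does not say which measures are reviewed or introduced), it scans every row instead of using the hierarchical bitmap index, and the names `jaccard`, `approximate_query`, and the `baskets` data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard coefficient |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def approximate_query(
    table: Dict[int, FrozenSet[str]],
    query: FrozenSet[str],
    threshold: float,
) -> List[Tuple[int, float]]:
    """Return (row id, similarity) pairs with similarity >= threshold, best first.

    Assumption: a plain full scan, which is exactly the cost an index
    on the set-valued attribute would try to avoid.
    """
    scored = [(rid, jaccard(items, query)) for rid, items in table.items()]
    hits = [(rid, sim) for rid, sim in scored if sim >= threshold]
    return sorted(hits, key=lambda h: (-h[1], h[0]))


# Hypothetical retail baskets keyed by transaction id.
baskets = {
    1: frozenset({"milk", "bread", "eggs"}),
    2: frozenset({"milk", "bread"}),
    3: frozenset({"beer", "chips"}),
}

# Find baskets at least 50% similar to the query set.
result = approximate_query(baskets, frozenset({"milk", "bread", "butter"}), 0.5)
for row_id, similarity in result:
    print(row_id, round(similarity, 3))
```

Unlike an exact containment query, which would return nothing here because no basket contains "butter", the threshold query still returns the baskets that overlap the query set closely enough.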
Similar resources
Hierarchical Bitmap Index: An Efficient and Scalable Indexing Technique for Set-Valued Attributes
Set-valued attributes are convenient to model complex objects occurring in the real world. Currently available database systems support the storage of set-valued attributes in relational tables but contain no primitives to query them efficiently. Queries involving set-valued attributes either perform full scans of the source data or make multiple passes over single-value indexes to reduce the n...
Universal Approximation of Interval-valued Fuzzy Systems Based on Interval-valued Implications
It is first proved that multi-input-single-output (MISO) fuzzy systems based on interval-valued $R$- and $S$-implications can approximate any continuous function defined on a compact set to arbitrary accuracy. A formula to compute the lower upper bounds on the number of interval-valued fuzzy sets needed to achieve a pre-specified approximation accuracy for an arbitrary multivariate con...
Index Structures for Databases Containing Data Items with Set-valued Attributes
We introduce two new hash-based index structures to index set-valued attributes. Both are able to support subset and superset queries. Analytical cost models for the new index structures as well as for the two existing index structures, sequential signature file and Russian Doll Tree, are presented and experimentally validated. Using the validated cost model, we express the performance of all fou...
An Indexing Method for Handling Queries on Set-Valued Attributes in Object-Oriented Databases
We propose a signature-based indexing method for object-oriented query handling in this paper. Signature file based access methods initially applied on text for their filtering capability have now been used to handle set-oriented queries in Object-Oriented Data Bases (OODBs). All the proposed techniques use either search methods that take longer retrieval time or tree based intermediate data st...
Parallel Sub-Collection Join Algorithm for High Performance Object-Oriented Databases
In Object-Oriented Databases (OODB), although path expression between classes may exist, it is sometimes necessary to perform an explicit join between two or more classes due to the absence of pointer connections or the need for value matching between objects. Furthermore, since objects are not in a normal form, an attribute of a class may have a collection as a domain. Collection attributes ar...