DataGuide-based Distribution for XML Documents
Abstract
Distribution is a well-known way to increase performance and balance load when optimal resource utilization is needed. Combined with replication, it also improves reliability, accessibility, and fault tolerance. However, when the amount of data is large, maintaining meta-information about the distribution and locating the needed data fragments during query execution become problems. These problems are well understood, but they have received little attention in the context of XML data management. This paper presents research in progress that examines whether meta-information about XML data distribution can be managed by extending an auxiliary index structure called the DataGuide.
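To make the idea concrete, the following is a minimal sketch, not the authors' design, of a DataGuide-like path summary annotated with distribution meta-information. It assumes that each site stores one well-formed XML fragment and that it is enough to record, for every distinct label path, the set of sites that hold it. The names `label_paths`, `build_guide`, `sites_for`, and the site identifiers are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def label_paths(xml_text):
    """Yield the root-to-element label path of every element in a fragment."""
    root = ET.fromstring(xml_text)
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        yield path
        for child in node:
            stack.append((child, path + "/" + child.tag))


def build_guide(fragments):
    """Map each distinct label path to the set of sites that store it.

    `fragments` is a dict {site_name: xml_text}; this layout is an
    assumption made for illustration only.
    """
    guide = defaultdict(set)
    for site, xml_text in fragments.items():
        for path in label_paths(xml_text):
            guide[path].add(site)
    return dict(guide)


def sites_for(guide, path):
    """Return the sorted list of sites holding data reachable by `path`."""
    return sorted(guide.get(path, set()))


# Two hypothetical sites, each holding one fragment of a library document.
fragments = {
    "site_a": "<library><book><title>T1</title></book></library>",
    "site_b": "<library><article><title>T2</title></article></library>",
}
guide = build_guide(fragments)
```

With such a summary, a query processor could consult `sites_for(guide, "/library/book/title")` and contact only the sites that hold matching fragments, instead of broadcasting the query to every site.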
Similar resources
Indexation des documents XML : Un DataGuide annoté avec un index de contenu
Indexing in classical information retrieval offers few tools for handling semi-structured documents: document representations in information retrieval were designed for flat, homogeneous documents and are not suited to treating structure and content simultaneously. Several approaches to indexing semi-structured data were proposed to resolve this new chal...
Content-Aware DataGuides for Indexing Large Collections of XML Documents
XML is well-suited for modelling structured data with textual content. However, most indexing approaches perform structure and content matching independently, combining the retrieved path and keyword occurrences in a third step. This paper shows that retrieval in XML documents can be accelerated significantly by processing text and structure simultaneously during all retrieval phases. To this e...
Content-Aware DataGuides: Interleaving IR and DB Indexing Techniques for Efficient Retrieval of Textual XML Data
Not only since the advent of XML, many applications call for efficient structured document retrieval, challenging both Information Retrieval (IR) and database (DB) research. Most approaches combining indexing techniques from both fields still separate path and content matching, merging the hits in an expensive join. This paper shows that retrieval is significantly accelerated by processing text...
Metaheuristic Clustering of Persian XML Documents Based on Structural and Content Similarity
As the number of XML documents grows, organizing them effectively so that useful information can be retrieved from them is essential. One possible solution is to cluster XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between them. Conventional clustering of text documents using a do...
Data Structures for Maintaining Path Statistics in Distributed XML Stores
The paper describes a distributed XML store model based on the notion of a distributed XML document. A classification of XPath expressions is defined and the notion of a distributed XML document is introduced. A DataGuide-based statistical structure for XML stores is proposed, and two possible approaches to keeping it up to date are discussed. Stability of feedback-based approach ...