Semantic-based Construction of Content and Structure XML Index
Abstract
The Content And Structure (CAS) index for XML data is an important index type that has not been widely researched, even though it plays a significant role, especially in multi-domain applications. Most existing research on XML query optimization focuses on the structure index alone. Few works have utilized the rich semantics of XML data to support CAS indexing and querying. In this paper, we propose two indexes, namely a Structural index and a Content index, whose construction utilizes XML data semantics and schema. These indexes contribute to better CAS query performance. The experiments show that our method improves the performance of CAS queries by reducing CPU time and the total number of scanned elements compared to a standard method.
Similar resources
Metaheuristic Clustering of Persian XML Documents Based on Structural and Content Similarity
Due to the increasing number of XML documents, effectively organizing these documents in order to retrieve useful information from them is essential. A possible solution is to cluster XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between XML documents. Conventional clustering of text documents using a do...
Investigating the Response of Web Search Engines to Metadata Records Based on the Combined Method of Microdata and Linked Data
The purpose of this research was to determine how web search engines react to metadata records created using a combined method of Rich Snippets and Linked Data. 200 metadata records in two groups (100 records with the normal structure as the control group, and 100 records created based on microdata and implemented in RDF/XML as the experimental group) extracted from the information gatewa...
Utilizing the Structure and Data Information for XML Document Clustering
This paper reports on the experiments and results of a clustering approach used in the INEX 2008 Document Mining Challenge. The clustering approach utilizes both the structure and the content information of the XML documents in the Wikipedia collection. The content of the XML documents is measured using the latent semantic kernel (LSK). A well-known problem with the construction of latent seman...
Index and Search XML Documents by Combining Content and Structure
By nesting data, the XML format allows embedding additional semantics, which is not possible using a flat text format. Capturing these semantics will enhance the effectiveness of the searching process in an XML corpus. Many approaches address the XML searching problem. Approaches stemming from the database community concentrate on the data structure. In this case, users have to express their...
Developing a BIM-based Spatial Ontology for Semantic Querying of 3D Property Information
With the growing dominance of complex and multi-level urban structures, current cadastral systems, which are often developed based on 2D representations, are not capable of providing unambiguous spatial information about urban properties. Therefore, the concept of 3D cadastre is proposed to support 3D digital representation of land and properties and facilitate the communication of legal owners...