Structure-Preserving Difference Search for XML Documents
Abstract
Current XML differencing applications usually try to find a minimal sequence of edit operations that transforms one XML document into another (the so-called "edit script"). In our view, this approach often produces increments that are unintuitive for human readers and do not reflect the actual changes. We therefore propose in this article a different approach that tries to maximise the retained structure instead of minimising the edit sequence. Structure is thereby not limited to the usual tree structure of XML; any kind of structural relation can be considered (such as parent-child, ancestor-descendant, sibling, or document order). In our opinion, this approach is very flexible and able to adapt to the user's requirements. It produces more readable results while still retaining a reasonably small edit sequence.
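To make the idea of "retained structure" concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a node matching between the two documents is already given and simply counts how many structural relations of the kinds named in the abstract survive under that matching. The function names `relations` and `retained_structure`, the equal weighting of all relation kinds, and the representation of nodes by their document-order position are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from itertools import combinations


def relations(root):
    """Collect structural relations as (kind, i, j) over document-order positions."""
    order = list(root.iter())                       # elements in document order
    pos = {el: i for i, el in enumerate(order)}
    parent = {child: node for node in order for child in node}
    rels = set()
    for el in order:
        anc = parent.get(el)
        if anc is not None:
            rels.add(("parent-child", pos[anc], pos[el]))
        while anc is not None:                      # walk up to the root
            rels.add(("ancestor-descendant", pos[anc], pos[el]))
            anc = parent.get(anc)
        for a, b in combinations(el, 2):            # pairs of children of el
            rels.add(("sibling", pos[a], pos[b]))
    return rels


def retained_structure(old_root, new_root, matching):
    """Count relations of the old document preserved by `matching`.

    `matching` maps positions in the old document to positions in the new one
    (a hypothetical input; how to find a good matching is not shown here).
    """
    old_rels, new_rels = relations(old_root), relations(new_root)
    kept = 0
    for kind, a, b in old_rels:
        if a not in matching or b not in matching:
            continue
        x, y = matching[a], matching[b]
        if kind == "sibling":
            x, y = min(x, y), max(x, y)             # sibling is treated as unordered
        kept += (kind, x, y) in new_rels
    # Document order: matched pairs that keep their relative order.
    kept += sum(matching[a] < matching[b]
                for a, b in combinations(sorted(matching), 2))
    return kept


old = ET.fromstring("<a><b/><c/></a>")
swapped = ET.fromstring("<a><c/><b/></a>")
same_score = retained_structure(old, old, {0: 0, 1: 1, 2: 2})
swap_score = retained_structure(old, swapped, {0: 0, 1: 2, 2: 1})
print(same_score, swap_score)
```

Under these assumptions, a difference search would prefer the matching with the highest score; swapping two siblings loses only the document-order relation between them, while the parent-child, ancestor-descendant, and sibling relations are retained.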
Related resources
Metaheuristic clustering of Persian XML documents based on structural and content similarity
Due to the increasing number of XML documents, organizing these documents effectively in order to retrieve useful information from them is essential. A possible solution is to cluster XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between them. Conventional clustering of text documents using a do...
Hybrid XML data model architecture for efficient document management
XML is known as a document standard for the representation and exchange of data on the Internet, and is also used as a standard language for the search and reuse of scattered documents on the Internet. The issues related to XML are how to model data for effective and efficient management of semi-structured data and how to actually store the modeled data when implementing an XML contents manageme...
Querying standardized EHRs by a Search Ontology XML extension (SOX)
Motivation: The previously developed Search Ontology (SO) allows domain experts to formally specify domain concepts, search terms associated with a domain, and rules describing domain concepts. So far, Lucene search queries can be generated from information contained in the SO and can be used for querying literature databases or PubMed. However, this is still insufficient, since these queries ar...
Full Text Search in XML Documents
The goal of this paper is to show how XML structure information can be used for full text search in XML documents. Existing products for full text search are investigated regarding their support of XML. The main aspect of this investigation is how the search scope of queries is specified and narrowed by taking advantage of the XML format. Considering the results of this investigation, a suggest...
A Query Expression and Processing Technique for an XML Search Engine
One of the virtues of XML is that it allows complex structures to be easily expressed. This allows XML to be used as an intermediate, neutral, and standard form for representing many types of structured or semistructured documents that arise in a great variety of applications. To support efficient queries against XML data, many query languages have been designed. The query languages require...