Structure-Preserving Difference Search for XML Documents

Authors

  • Erich Schubert
  • Sebastian Schaffert
  • François Bry
Abstract

Current XML differencing applications usually try to find a minimal sequence of edit operations that transforms one XML document into another (the so-called "edit script"). In our view, this approach often produces increments that are unintuitive for human readers and do not reflect the actual changes. We therefore propose in this article a different approach, one that tries to maximise the retained structure instead of minimising the edit sequence. Structure is thereby not limited to the usual tree structure of XML; any kind of structural relation can be considered (such as parent-child, ancestor-descendant, sibling, or document order). In our opinion, this approach is very flexible and able to adapt to the user's requirements. It produces more readable results while still retaining a reasonably small edit sequence.
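The idea of measuring retained structure can be illustrated with a small sketch. This is not the authors' algorithm: it only scores a correspondence that is already known, whereas the approach described in the abstract searches for one. It assumes, purely for illustration, that every element carries a unique `id` attribute that identifies the same node in both versions; the function names `structural_relations` and `retained` are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import combinations


def structural_relations(xml_text):
    """Collect structural relations between elements, keyed by their `id` attribute.

    Assumption for this sketch: each element has a unique `id` attribute.
    Returns a set of (relation_name, first_id, second_id) tuples.
    """
    root = ET.fromstring(xml_text)
    rels = set()

    def walk(node, ancestors):
        nid = node.get("id")
        # Walk ancestors from nearest to farthest; the nearest one is the parent.
        for depth, anc in enumerate(reversed(ancestors)):
            rels.add(("ancestor-descendant", anc, nid))
            if depth == 0:
                rels.add(("parent-child", anc, nid))
        # Record the left-to-right order of every pair of sibling children.
        child_ids = [child.get("id") for child in node]
        for left, right in combinations(child_ids, 2):
            rels.add(("sibling-order", left, right))
        for child in node:
            walk(child, ancestors + [nid])

    walk(root, [])
    return rels


def retained(old_xml, new_xml):
    """Relations that hold in both versions under the id-based correspondence."""
    return structural_relations(old_xml) & structural_relations(new_xml)


OLD = '<a id="1"><b id="2"/><c id="3"/></a>'
NEW = '<a id="1"><c id="3"/><b id="2"/></a>'  # the two children are swapped

kept = retained(OLD, NEW)
lost = structural_relations(OLD) - kept
```

In this example the swap keeps every parent-child and ancestor-descendant relation and loses only the sibling order between elements 2 and 3, so `lost` contains a single relation. A structure-preserving search would compare candidate correspondences by such a count and prefer the one that retains the most relations.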

Similar resources

Metaheuristic clustering of Persian XML documents based on structural and content similarity

Due to the increasing number of XML documents, it is essential to organize these documents effectively in order to retrieve useful information from them. A possible solution is to cluster XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between XML documents. Conventional clustering of text documents using a do...

Hybrid XML data model architecture for efficient document management

XML is known as a document standard for the representation and exchange of data on the Internet, and is also used as a standard language for the search and reuse of scattered documents on the Internet. The issues related to XML are how to model data for effective and efficient management of semi-structured data and how to actually store the modeled data when implementing an XML contents manageme...

Querying standardized EHRs by a Search Ontology XML extension (SOX)

Motivation: The previously developed Search Ontology (SO) allows domain experts to formally specify domain concepts, search terms associated with a domain, and rules describing domain concepts. So far, Lucene search queries can be generated from information contained in the SO and can be used for querying literature databases or PubMed. However, this is still insufficient, since these queries ar...

Full Text Search in XML Documents

The goal of this paper is to show how XML structure information can be used for full text search in XML documents. Existing products for full text search are investigated regarding their support of XML. The main aspect of this investigation is how the search scope of queries is specified and narrowed by taking advantage of the XML format. Considering the results of this investigation, a suggest...

A Query Expression and Processing Technique for an XML Search Engine

One of the virtues of XML is that it allows complex structures to be easily expressed. This allows XML to be used as an intermediate, neutral, and standard form for representing many types of structured or semistructured documents that arise in a great variety of applications. To support efficient queries against XML data, many query languages have been designed. The query languages require...

Publication date: 2005