Efficient parallel compression and decompression for large XML files

Authors

  • Mohammad Ali
  • Minhaj Ahmad Khan
Abstract

eXtensible Markup Language (XML) is increasingly popular and is widely used on the internet for storing and exchanging data. Large XML files create a bottleneck when transferred over a network and also degrade query performance. Efficient compression and decompression mechanisms are therefore applied to XML files. This paper proposes an algorithm for XML compression and decompression. The proposed approach reads an XML file, removes the tags, divides the file into parts, and then compresses each part on a separate core for efficiency. We compare the performance of the proposed algorithm with parallel compression and decompression of XML files using GZIP. The results show that the proposed algorithm performs 24%, 53%, and 72% better than parallel GZIP compression and decompression on Intel Xeon, Intel Core i7, and Intel Core i3 based architectures, respectively.
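The pipeline outlined in the abstract (separate the tags, split the data into parts, compress the parts concurrently) can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. The function names, the regex-based tag splitting, the NUL separator, and the dictionary container are all assumptions made for illustration. A thread pool stands in for the per-core workers; this relies on CPython's `zlib` releasing the global interpreter lock while it compresses.

```python
import re
import zlib
from concurrent.futures import ThreadPoolExecutor

# Assumption: a simple regex is enough to separate markup from text here.
TAG_RE = re.compile(r"<[^>]+>")
SEP = "\x00"  # Assumption: NUL never occurs in the input (it is not a legal XML character).


def split_tags_and_text(xml: str):
    """Return (tags, texts); texts always has len(tags) + 1 entries."""
    return TAG_RE.findall(xml), TAG_RE.split(xml)


def chunk(data: bytes, parts: int) -> list[bytes]:
    """Cut data into at most `parts` pieces of roughly equal size."""
    size = max(1, -(-len(data) // parts))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def compress_parallel(xml: str, workers: int = 4) -> dict[str, list[bytes]]:
    """Compress the tag stream and the text stream chunk by chunk."""
    tags, texts = split_tags_and_text(xml)
    streams = {
        "tags": SEP.join(tags).encode("utf-8"),
        "texts": SEP.join(texts).encode("utf-8"),
    }
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return {
            name: list(pool.map(zlib.compress, chunk(data, workers)))
            for name, data in streams.items()
        }


def decompress_parallel(container: dict[str, list[bytes]], workers: int = 4) -> str:
    """Invert compress_parallel and interleave text and tags again."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        raw = {
            name: b"".join(pool.map(zlib.decompress, parts)).decode("utf-8")
            for name, parts in container.items()
        }
    tags = raw["tags"].split(SEP) if raw["tags"] else []
    texts = raw["texts"].split(SEP)
    out = [texts[0]]
    for tag, text in zip(tags, texts[1:]):
        out.append(tag)
        out.append(text)
    return "".join(out)
```

Calling `decompress_parallel(compress_parallel(xml))` should return the original string, because the same regex matches drive both the tag list and the text split. Compressing chunks independently makes the work parallel but usually costs some compression ratio compared with compressing a whole stream at once.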


Similar resources

Updates of Compressed Dynamic XML Documents

Because of the ever-growing number of applications that send numerous and potentially large XML files over networks, there has been recent interest in efficient updates of XML documents. However, all known approaches deal with uncompressed documents. In this paper, we describe a novel XML compressor, XSAQCT, designed to improve the efficiency of querying and updating XML documents with minimal d...


Algorithm for XML Compression using DTD and Stack

XML is the worldwide standard for data definition and is used extensively for developing SOA-based applications. An SOA-based application contains many different applications that are integrated with each other, and XML documents are used to solve the resulting interoperability problem. XML is widely used for a variety of tasks, including configuration files, protocols, and web services. XML has problem with...


Efficient Trace File Compression Design with Locality and Address Difference

Trace-driven simulation is a simple, fast, and convenient approach to simulate computer architecture for power consumption, throughput, CPU time, and other factors. However, trace-driven simulation requires a massive storage space to save the trace files of benchmark programs. Therefore, an important task is how to design a compression method that reduces the storage space of trace files effici...


XML index compression by DTD subtraction

Whenever XML is used as a format to exchange large amounts of data, or even for data streams, the verbosity of XML is one of the bottlenecks. While compression of XML data seems to be a way out, it is essential for a variety of applications that the compression result can be queried efficiently. Furthermore, for efficient path query evaluation, an index is desired, which usually generates ...


Combining Efficient XML Compression with Query Processing

This paper describes a new XML compression scheme that offers both high compression ratios and short query response time. Its core is a fully reversible transform featuring substitution of every word in an XML document using a semi-dynamic dictionary, effective encoding of dictionary indices, as well as numbers, dates and times found in the document, and grouping data within the same structural...



Journal:
  • Int. Arab J. Inf. Technol.

Volume 13

Publication date: 2016