Efficient parallel compression and decompression for large XML files
نویسندگان
چکیده
eXtensible Markup Language (XML) is gaining popularity and is being used widely on internet for storing and exchanging data. Large XML files when transferred on network create bottleneck and also degrade the query performance. Therefore, efficient mechanisms of compression and decompression are applied to XML files. In this paper, an algorithm for performing XML compression and decompression is suggested. The suggested approach reads an XML file, removes tags, divides the XML file into different parts and then compresses each different part on a separate core for achieving efficiency. We compare performance results of the proposed algorithm with parallel compression and decompression of XML files using GZIP. The performance results show that the suggested algorithm performs 24%, 53% and 72% better than the parallel GZIP compression and decompression on Intel Xeon, Intel core i7 and Intel core i3 based architectures respectively.
منابع مشابه
Updates of Compressed Dynamic XML Documents
Because of the ever-growing number of applications that send numerous and potentially large XML files over networks there has been a recent interest in efficient updates of XML documents. However all known approaches deal with uncompressed documents. In this paper, we describe a novel XML compressor, XSAQCT designed to improve the efficiency of querying and updating XML documents with minimal d...
متن کاملAlgorithm for XML Compression using DTD and Stack
Worldwide standard for data definition is XML. For developing SOA based applications XML is extensively used. SOA based applications contains many different applications which are integrated to each other. For solving the problem of interoperability XML documents are used. XML is widely used for a variety of tasks, including configuration files, protocols, and web services. XML has problem with...
متن کاملEfficient Trace File Compression Design with Locality and Address Difference
Trace-driven simulation is a simple, fast, and convenient approach to simulate computer architecture for power consumption, throughput, CPU time, and other factors. However, trace-driven simulation requires a massive storage space to save the trace files of benchmark programs. Therefore, an important task is how to design a compression method that reduces the storage space of trace files effici...
متن کاملXML index compression by DTD subtraction
Whenever XML is used as format to exchange large amounts of data or even for data streams, the verbose behaviour of XML is one of the bottlenecks. While compression of XML data seems to be a way out, it is essential for a variety of applications that the compression result can be queried efficiently. Furthermore, for efficient path query evaluation, an index is desired, which usually generates ...
متن کاملCombining Efficient XML Compression with Query Processing
This paper describes a new XML compression scheme that offers both high compression ratios and short query response time. Its core is a fully reversible transform featuring substitution of every word in an XML document using a semi-dynamic dictionary, effective encoding of dictionary indices, as well as numbers, dates and times found in the document, and grouping data within the same structural...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Int. Arab J. Inf. Technol.
دوره 13 شماره
صفحات -
تاریخ انتشار 2016