CodePlugin: Plugging Deduplication into Erasure Coding for Cloud Storage

Authors

  • Mengbai Xiao
  • Mohammed Anowarul Hassan
  • Weijun Xiao
  • Qi Wei
  • Songqing Chen
Abstract

Cloud storage systems play a key role in many cloud services. To tolerate multiple simultaneous disk failures while reducing storage overhead, today's cloud storage systems often employ erasure coding schemes. To simplify implementation, existing systems such as Microsoft Azure and EMC Atmos support only file-append operations. However, this design leads to a nontrivial and growing portion of redundant data on cloud storage systems. To reduce the data redundancy caused by users' file updates, and thereby the corresponding encoding and storage cost, we investigate how to efficiently integrate inline deduplication into the general context of the Reed-Solomon (RS) code. For this purpose, we present our initial design of CodePlugin. CodePlugin introduces preprocessing steps before the normal encoding. In these steps, duplicate data is identified and shuffled so that the redundant blocks do not have to be encoded. CodePlugin is applicable to any existing coding scheme, and our preliminary experimental results show that it can improve the encoding throughput by about 20% and reduce the storage cost by about 17.4%.
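The preprocessing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the block size, the function names, the SHA-256 fingerprinting, and the way unique blocks are packed into stripes are all assumptions made for illustration, and a single XOR parity block stands in for a real Reed-Solomon encoder. The sketch only shows the general shape: fingerprint the blocks, keep one copy of each, and hand only the unique blocks to the encoder.

```python
import hashlib
from functools import reduce

BLOCK_SIZE = 4  # hypothetical block size, kept tiny for illustration


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Split data into fixed-size blocks, zero-padding the last one."""
    blocks = [data[i:i + size] for i in range(0, len(data), size)]
    if blocks and len(blocks[-1]) < size:
        blocks[-1] = blocks[-1].ljust(size, b"\0")
    return blocks


def deduplicate(blocks):
    """Return (unique_blocks, index_map).

    index_map[i] is the position in unique_blocks that holds the
    content of original block i.
    """
    seen = {}       # fingerprint -> position in unique
    unique = []
    index_map = []
    for block in blocks:
        fp = hashlib.sha256(block).digest()
        if fp not in seen:
            seen[fp] = len(unique)
            unique.append(block)
        index_map.append(seen[fp])
    return unique, index_map


def xor_parity(blocks):
    """Single XOR parity block: a stand-in for RS encoding, not RS itself."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode_stripes(unique, k):
    """Pack unique blocks into stripes of k data blocks plus one parity block."""
    zero = bytes(len(unique[0])) if unique else b""
    stripes = []
    for i in range(0, len(unique), k):
        stripe = unique[i:i + k]
        stripe = stripe + [zero] * (k - len(stripe))  # pad the last stripe
        stripes.append((stripe, xor_parity(stripe)))
    return stripes


def restore(unique, index_map):
    """Rebuild the (padded) original byte stream from the unique blocks."""
    return b"".join(unique[i] for i in index_map)
```

For example, splitting `b"AAAABBBBAAAACCCCBBBB"` yields five blocks, of which only three are unique, so with `k = 2` the encoder processes two stripes instead of three, and `restore` recovers the original bytes from the unique blocks and the index map.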


Similar resources

InterCloud RAIDer: A Do-It-Yourself Multi-cloud Private Data Backup System

In this paper, we introduce InterCloud RAIDer, which realizes a multi-cloud private data backup system by composing (i) a data deduplication technique to reduce the overall storage overhead, (ii) erasure coding to achieve redundancy at low overhead, which is dispersed across multiple cloud services to realize fault-tolerance against individual service providers, specifically we use non-systemat...


In-line Deduplication for Cloud storage to Reduce Fragmentation by using Historical Knowledge

Recovery and Backup system in which the process involves that copying and archiving of data on different cloud server, so that this data is used to recover the unique data, afterward a loss event. Purpose of backup is to recover data after its loss and to improve data from a past time. In backup systems, the fragments of every data file are physically distributed over multiple servers, which in...


GPU Erasure Coding for Campaign Storage

High-performance computing (HPC) demands high bandwidth and low latency in I/O performance leading to the development of storage systems and I/O software components that strive to provide greater and greater performance. However, capital and energy budgets along with increasing storage capacity requirements have motivated the search for lower cost, large storage systems for HPC. With Burst Buff...


Erasure Coding for Cloud Storage Systems: A Survey

In the current era of cloud computing, data stored in the cloud is being generated at a tremendous speed, and thus the cloud storage system has become one of the key components in cloud computing. By storing a substantial amount of data in commodity disks inside the data center that hosts the cloud, the cloud storage system must consider one question very carefully: how do we store data reliabl...


Solving the Secure Storage Dilemma: An Efficient Scheme for Secure Deduplication with Privacy-Preserving Public Auditing

Existing cloud storage systems receive the data in its plain form and perform conventional (server-side) deduplication mechanisms. However, disclosing the data to the cloud can potentially threaten the security and privacy of users, which is of utmost importance for a real-world cloud storage. This can be solved by secure deduplication mechanisms which enables the user to encrypt the data on th...




Publication date: 2015