A new decentralized periodic replication strategy for dynamic data grids

Authors

  • Hanene Chettaoui
  • Faouzi Ben Charrada
Abstract

Data grids provide a scalable infrastructure for managing storage resources and data files, supporting data-intensive applications that need to access huge amounts of data stored at distributed locations around the world. The size of these data can reach the scale of terabytes or even petabytes in many applications. These applications must meet several main goals, namely efficiently accessing, storing, transferring and analyzing large amounts of data in geographically distributed locations. Replication is a general and simple technique used in data grids to achieve these goals. Its main purposes are to improve data access efficiency, provide high availability, decrease bandwidth consumption, improve fault tolerance and enhance scalability. In this paper, we propose a new classification of replication strategies based on two complementary criteria, together with a survey of the resulting categories of strategies. In addition, we introduce a new decentralized periodic replication strategy for dynamic data grids with limited storage for replicas, called DPRSKP, which stands for Decentralized Periodic Replication Strategy based on Knapsack Problem. This strategy takes into account the changing availability of sites. DPRSKP is based on two polynomial-time algorithms: the first selects the best candidate files for replication, while the second places them in the best locations. The replication problem in DPRSKP is formulated as a Knapsack problem. In addition, DPRSKP extends the well-known LRU and LFU strategies. The simulation experiments were carried out using OptorSim, with a dynamic period rather than a static one. The results show that DPRSKP can effectively improve response time, bandwidth consumption, the number of remote file accesses and the number of local file accesses compared with other replication strategies.
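To illustrate what formulating replica selection as a Knapsack problem can look like, the following is a minimal sketch and not the DPRSKP algorithm itself, whose details the abstract does not give. It assumes each candidate file has a size (the knapsack weight) and a benefit score such as a recent access count (the knapsack value), and that a site has a fixed replica storage capacity. The function name `select_replicas`, the tuple layout and the choice of access count as the value are all hypothetical. The code uses the textbook 0/1 knapsack dynamic program, which runs in time proportional to the number of files times the capacity (pseudo-polynomial), so it should not be read as one of the two polynomial-time algorithms mentioned above.

```python
from typing import List, Tuple

# Hypothetical record layout: (file name, size in storage units, benefit score).
# The benefit score is an assumption, e.g. how often the file was requested
# during the last period; the abstract does not say how DPRSKP scores files.
FileInfo = Tuple[str, int, int]


def select_replicas(files: List[FileInfo], capacity: int) -> List[str]:
    """Pick the subset of files with maximum total benefit that fits in capacity.

    Textbook 0/1 knapsack dynamic program, used here purely as an illustration.
    """
    n = len(files)
    # best[i][c] = maximum benefit using only the first i files with capacity c.
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i, (_, size, value) in enumerate(files, start=1):
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]  # option 1: skip file i
            if size <= c:
                candidate = best[i - 1][c - size] + value  # option 2: take it
                if candidate > best[i][c]:
                    best[i][c] = candidate

    # Walk back through the table to recover which files were taken.
    chosen: List[str] = []
    c = capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            name, size, _ = files[i - 1]
            chosen.append(name)
            c -= size
    chosen.reverse()
    return chosen
```

For example, with the made-up candidates `[("a", 4, 10), ("b", 3, 7), ("c", 2, 6), ("d", 5, 12)]` and a capacity of 7 units, the sketch selects `c` and `d` (total benefit 18) rather than `a` and `b` (total benefit 17). A greedy pass over files sorted by access count alone, in the spirit of LFU, would pick `d` first and still reach the same set here, but in general it does not guarantee the best total benefit under a storage limit.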


Similar resources

Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy

Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources and enables users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years, creates multiple copies of a file and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...


An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity

Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become a hot research topic. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening the distance of file transmission and achieving files from ...


CFS: a new dynamic replication strategy for data grids

Data grids are currently proposed solutions to large scale data management problems including efficient file transfer and replication. Large amounts of data and the world-wide distribution of data stores contribute to the complexity of the data management challenge. Recent architecture proposals and prototypes deal with dynamic replication strategies for a high-performance data grid. This paper...


E2DR: Energy Efficient Data Replication in Data Grid

Abstract: Data grids are an important branch of grid computing which provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems is traditionally focused on performance improvements driven by the demand of client's applications in scientific and business domai...


Improving Job Scheduling Performance with Dynamic Replication Strategy in Data Grids

Dealing with a large amount of data in Data Grids makes the requirement for efficient data access more critical. In this paper, we propose a new approach to the replication problem by organizing the data into the data categories to which it belongs. This organization helps improve the placement strategy for data replication. We studied our approach in combination with the scheduling issue and evalua...



Journal:
  • Scalable Computing: Practice and Experience

Volume 15, Issue

Pages -

Publication date: 2014