Dynamic data replication in LCG 2008

Authors

  • Caitriana Nicholson
  • David G. Cameron
  • A. T. Doyle
  • A. Paul Millar
  • Kurt Stockinger
Abstract

To provide performant access to data from high energy physics experiments such as the Large Hadron Collider (LHC), controlled replication of files among grid sites is required. Dynamic replication in response to jobs may also be useful, and has been investigated using the grid simulator OptorSim. In this paper, results from simulation of the LHC Computing Grid in 2008, in a physics analysis scenario, are presented. These show, first, that dynamic replication does improve job throughput by optimising resource usage; second, that for this complex grid system, simple replication strategies such as LRU and LFU are as effective as more advanced economic models; third, that grid site policies which allow maximum resource sharing are more effective; and lastly, that dynamic replication is particularly effective when data access patterns include some files being accessed more often than others, such as with a Zipf-like distribution.
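
As a rough illustration of the last point, the sketch below models a single site's storage as a fixed-size store with LRU or LFU eviction and compares hit rates under uniform and Zipf-like access. It is not OptorSim and does not reproduce the paper's results; the function names, the store size, the file count, and the exponent `alpha` are all assumptions chosen for illustration.

```python
import random
from collections import Counter, OrderedDict


def zipf_weights(n_files, alpha):
    """Unnormalised Zipf-like weights: the file of rank r gets 1 / r**alpha.
    alpha = 0 gives a uniform distribution."""
    return [1.0 / (rank ** alpha) for rank in range(1, n_files + 1)]


def make_accesses(n_files, n_requests, alpha, seed=0):
    """Draw a reproducible sequence of file requests (hypothetical workload)."""
    rng = random.Random(seed)
    return rng.choices(range(n_files), weights=zipf_weights(n_files, alpha), k=n_requests)


def simulate(policy, accesses, capacity):
    """Return the fraction of requests served from the local store.

    On a miss the file is always replicated locally; if the store is full,
    one replica is deleted first according to `policy` ("lru" or "lfu").
    """
    store = OrderedDict()  # file id -> None, ordered from least to most recently used
    counts = Counter()     # all-time request counts per file, used by LFU
    hits = 0
    for f in accesses:
        counts[f] += 1
        if f in store:
            hits += 1
            store.move_to_end(f)  # mark as most recently used
            continue
        if len(store) >= capacity:
            if policy == "lru":
                store.popitem(last=False)  # drop the least recently used replica
            elif policy == "lfu":
                victim = min(store, key=lambda k: counts[k])  # least requested so far
                del store[victim]
            else:
                raise ValueError(f"unknown policy: {policy}")
        store[f] = None
    return hits / len(accesses)


if __name__ == "__main__":
    for alpha in (0.0, 1.0):
        reqs = make_accesses(n_files=1000, n_requests=20000, alpha=alpha)
        for policy in ("lru", "lfu"):
            print(f"alpha={alpha} {policy}: hit rate {simulate(policy, reqs, capacity=50):.3f}")
```

With uniform access (`alpha = 0`) every file is equally likely, so a small store rarely holds the next requested file; with a skewed distribution the popular files tend to stay resident, which is the intuition behind the abstract's final finding.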

Similar resources

File Management for HEP Data Grids

The next generation of high energy physics experiments, such as the Large Hadron Collider (LHC) at CERN, the European Organization for Nuclear Research, pose a challenge to current data handling methodologies, where data tends to be centralised in a single location. Data grids, including the LHC Computing Grid (LCG), are being developed to meet this challenge by unifying computing and storage r...

Grid Data Management: Simulations of LCG 2008

Simulations have been performed with the grid simulator OptorSim using the expected analysis patterns from the LHC experiments and a realistic model of the LCG at LHC startup, with thousands of user analysis jobs running at over a hundred grid sites. It is shown, first, that dynamic data replication plays a significant role in the overall analysis throughput in terms of optimising job throughpu...

A Survey of Dynamic Replication Strategies for Improving Response Time in Data Grid Environment

Large-scale data management is a critical problem in distributed systems such as the cloud, P2P systems, the World Wide Web (WWW), and Data Grids. One effective solution is the data replication technique, which efficiently reduces the cost of communication and improves data reliability and response time. Various replication methods can be proposed depending on when, where, and how replicas are gener...

Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy

Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources, enabling users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years, creates multiple copies of files and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...

Dynamic Replication based on Firefly Algorithm in Data Grid

In a data grid, reservation is an accepted way to provide scheduling and quality of service. Users need access to data stored across a geographically distributed environment, which can be addressed by replication, an action taken to improve reliability. As a result, users are directed toward the nearest replica to access information. The most important point is to know in which sites and distributed sy...

Journal:
  • Concurrency and Computation: Practice and Experience

Volume 20, Issue

Pages -

Publication date: 2008