The File Mover: high-performance data transfer for the grid
Abstract
Research in many scientific disciplines (e.g., High-Energy Physics, Climate Modeling, and Life Sciences) involves the production and analysis of massive data collections, whose archival, retrieval, and analysis require the coordinated use of high-capacity computing, network, and storage resources. To obtain satisfactory performance, these applications require high-performance, reliable data transfer mechanisms able to minimize the data transfer time that often dominates their execution time. In this paper we present the File Mover, an efficient data transfer system specifically tailored to the needs of data-intensive applications, which exploits the overlay network paradigm to provide better performance than conventional file transfer systems. An extensive experimental evaluation, carried out with a proof-of-concept implementation of the File Mover across a variety of network scenarios, shows that the File Mover can outperform alternative data transfer systems.
Related resources
Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy
Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources that enables users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years, creates multiple copies of a file and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...
E2DR: Energy Efficient Data Replication in Data Grid
Abstract— Data grids are an important branch of grid computing which provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems is traditionally focused on performance improvements driven by the demand of client applications in scientific and business domai...
An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity
Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become one of the hot research topics. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening the distance of file transmission and achieving files from ...
A Framework for Data Management and Transfer in Grid Environments
The main obstacles to grid file management come from the fact that grid file resources are typically stored in heterogeneous and distributed environments and accessed through various protocols. In this paper, we propose a grid file management system called Vega [1][2] Hotfile2 for data-intensive applications in widely distributed systems and grid environments. Widely distributed and heterogeneous...
JPARSS: A Java Parallel Network Package for Grid Computing
The emergence of high-speed wide area networks makes grid computing a reality. However, grid applications that need reliable data transfer still have difficulty achieving optimal TCP performance, because the TCP window size must be tuned to improve bandwidth and reduce latency on a high-speed wide area network. This paper presents a Java package called JPARSS (Java Parallel Secure Stream (So...
Journal title:
- Concurrency and Computation: Practice and Experience
Volume 20
Publication date: 2008