Predicting the Performance of Data Transfer in a Grid Environment
Abstract
In a Grid environment, running a parallel data transfer algorithm or allocating multiple parallel jobs does not by itself guarantee reliable data transfer. Data transfer performance therefore needs to be predicted before parallel processes are allocated to Grid nodes. In this paper, we propose a predictive framework for efficient data transfer. The framework proceeds in phases that provide information about which participating nodes in a computational Grid environment are efficient and reliable. Experimental results show that multivariable predictors are more accurate than univariable predictors. We also observe that the Neural Network prediction technique is more accurate than Multiple Linear Regression and Decision Regression. The proposed ranking factor overcomes the problem of considering fresh participating nodes in data transfer.
Related resources
Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy
Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources, enabling users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years, creates multiple copies of a file and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...
Dynamic Replication based on Firefly Algorithm in Data Grid
In a data grid, reservation is an accepted way to provide scheduling and service quality. Users need access to data stored across a geographically distributed environment, which can be provided through replication, an action taken to reach certainty. As a result, users are directed toward the nearest version to access information. The most important point is to know in which sites and distributed sy...
Sampling-Based Tasks Scheduling in Dynamic Grid Environment
In this paper, we propose a new solution for data mining task scheduling in a Grid environment. First, we propose a sample-based evaluation of application run time. Then, we propose a cost model for predicting the data transfer time on the Grid. Finally, according to the a priori estimation of the application response time and the data transfer time, we propose the method for tasks scheduling in grid enviro...
A New Job Scheduling in Data Grid Environment Based on Data and Computational Resource Availability
A Data Grid is an infrastructure that manages a huge number of data files and provides intensive computational resources across geographically distributed collaborations. The heterogeneity and geographic dispersion of grid resources and applications pose some complex problems, such as job scheduling. Most existing scheduling algorithms in Grids only focus on one kind of Grid jobs which can be data...
A Survey of Dynamic Replication Strategies for Improving Response Time in Data Grid Environment
Large-scale data management is a critical problem in distributed systems such as the cloud, P2P systems, the World Wide Web (WWW), and Data Grids. One effective solution is the data replication technique, which efficiently reduces the cost of communication and improves data reliability and response time. Various replication methods can be proposed depending on when, where, and how replicas are gener...