On Scheduling Algorithms for MapReduce Jobs in Heterogeneous Clouds with Budget Constraints
Abstract
In this paper, we consider task-level scheduling algorithms under budget constraints for a bag of MapReduce jobs on a set of provisioned heterogeneous (virtual) machines in cloud platforms. The heterogeneity is manifested in the popular "pay-as-you-go" charging model, in which service machines with different performance have different service rates. We organize a bag of jobs as a κ-stage workflow and consider the scheduling problem with budget constraints. In particular, given a total monetary budget, we first combine a greedy, locally optimal algorithm with dynamic programming to obtain a globally optimal scheduling algorithm that achieves the minimum scheduling length of the workflow in pseudo-polynomial time. We then extend the idea behind the greedy algorithm to distribute the budget efficiently among the tasks in different stages, reducing the overall scheduling length. Our empirical studies verify the proposed optimal algorithm and show that the greedy algorithm is efficient at minimizing the scheduling length.
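The combination of a per-stage greedy step with dynamic programming over the budget can be illustrated with a minimal sketch. It is not the paper's algorithm, only a simplified model under assumptions that the abstract does not state: prices are integers, tasks within a stage run in parallel (so a stage lasts as long as its slowest task), stages run one after another, and every name below (`Option`, `stage_length`, `min_workflow_length`) is hypothetical.

```python
import math
from typing import List, Tuple

Option = Tuple[float, int]   # assumed model: (run time, integer price) on one machine type
Task = List[Option]          # machine choices available to one task
Stage = List[Task]           # tasks of one stage, assumed to run in parallel


def stage_length(stage: Stage, budget: int) -> float:
    """Greedy step: shortest stage length reachable with `budget` (inf if infeasible)."""
    # Start every task on its cheapest machine (the fastest one among equal prices).
    chosen = [min(task, key=lambda o: (o[1], o[0])) for task in stage]
    spent = sum(price for _, price in chosen)
    if spent > budget:
        return math.inf
    while True:
        # The slowest task determines the stage length, so it is the one to speed up.
        i = max(range(len(stage)), key=lambda j: chosen[j][0])
        time_i, price_i = chosen[i]
        faster = [o for o in stage[i] if o[0] < time_i]
        if not faster:
            break
        upgrade = min(faster, key=lambda o: (o[1], o[0]))
        if spent - price_i + upgrade[1] > budget:
            break  # the cheapest useful upgrade no longer fits in the budget
        spent += upgrade[1] - price_i
        chosen[i] = upgrade
    return max(t for t, _ in chosen)


def min_workflow_length(stages: List[Stage], budget: int) -> float:
    """DP over stages and integer budget levels (pseudo-polynomial in `budget`)."""
    # best[b]: minimum total length of the stages processed so far using at most b.
    best = [0.0] * (budget + 1)
    for stage in stages:
        length_for = [stage_length(stage, s) for s in range(budget + 1)]
        best = [
            min(best[b - s] + length_for[s] for s in range(b + 1))
            for b in range(budget + 1)
        ]
    return best[budget]
```

The table `best` has one entry per integer budget level, which is why the running time in this sketch depends on the numeric value of the budget rather than on its encoding length.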
Similar resources
On Optimal Budget-Driven Scheduling Algorithms for MapReduce Jobs in the Heterogeneous Cloud
In this paper, we consider task-level scheduling algorithms with respect to budget and deadline constraints for a bag of MapReduce jobs on a set of provisioned heterogeneous (virtual) machines in cloud platforms. Heterogeneity is manifested in the "pay-as-you-go" charging model we use, where service machines with different performance have different service rates. We organize the bag of jobs a...
Three Hybrid Metaheuristic Algorithms for Stochastic Flexible Flow Shop Scheduling Problem with Preventive Maintenance and Budget Constraint
The stochastic flexible flow shop scheduling problem (SFFSSP) is one of the main focuses of researchers, owing to the complexity arising from inherent uncertainties and the difficulty of solving such NP-hard problems. Conventionally, in such problems the job processing time on each machine may be uncertain because of the machine's random behaviour. In order to examine such problems more realistically, fi...
Scheduling and Energy Efficiency Improvement Techniques for Hadoop Map-reduce: State of Art and Directions for Future Research
MapReduce has become ubiquitous for processing large data volume jobs. As the number and variety of jobs to be executed across heterogeneous clusters are increasing, so is the complexity of scheduling them efficiently to meet required objectives of performance. This report presents a survey of some of the MapReduce scheduling algorithms proposed for such complex scenarios. A taxonomy is provide...
Real-Time Scheduling of Skewed MapReduce Jobs in Heterogeneous Environments
Supporting real-time jobs on MapReduce systems is particularly challenging due to the heterogeneity of the environment, the load imbalance caused by skewed data blocks, as well as real-time response demands imposed by the applications. In this paper we describe our approach for scheduling real-time, skewed MapReduce jobs in heterogeneous systems. Our approach comprises the following components:...
Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments
The Hadoop MapReduce framework is an important distributed processing model for large-scale data-intensive applications. Current Hadoop and the rack-aware data placement strategy of the existing Hadoop distributed file system assume a homogeneous Hadoop cluster in which each node has the same computing capacity and the same workload is assigned to each node. Default Hadoop d...