Task Assignment and Transaction Clustering Heuristics for Distributed Systems

Authors

  • José Aguilar
  • Erol Gelenbe
Abstract

In this paper, we present and discuss the task assignment problem for distributed systems. We also show how this problem is very similar to that of clustering transactions for load balancing purposes and for their efficient execution in a distributed environment. The formalization of these problems in terms of a graph-theoretic representation of a distributed program, or of a set of related transactions, is given. The cost function which needs to be minimized by an assignment of tasks to processors or of transactions to clusters is detailed, and we survey related work, as well as work on the dynamic load balancing problem. Since the task assignment problem is NP-hard, we present three novel heuristic algorithms that we have tested for solving it and compare them to the well-known greedy heuristic. These novel heuristics use neural networks, genetic algorithms, and simulated annealing. Both the resulting performance and the computational cost for these algorithms are evaluated on a large number of randomly generated program graphs of different sizes.

1. INTRODUCTION

The problem of assigning each task in a parallel program to some processing unit of the system has a major impact on the resulting performance. The problem arises in all areas of parallel and distributed computation, where programs are decomposed into tasks or processes, which must then be assigned to processing units for execution.

In certain systems, this assignment is carried out dynamically at run-time; this gives rise to the load balancing problem [26]. However, in many cases, the user or the system will wish to exert explicit control over the assignment of each task. This paper addresses the latter, which is known as the task assignment problem. This problem is very similar to that of clustering transactions for load balancing purposes and for their efficient execution in a distributed environment. These problems can be formalized in terms of a graph-theoretic representation of a distributed program, or of a set of related transactions. The issue is then to minimize an adequate and meaningful cost function by an assignment of tasks to processors or of transactions to clusters.

The task assignment problem is usually addressed using a graph-theoretic representation of the program. Typically, a distributed program is represented as a collection of tasks, which correspond to nodes in a graph. The arcs of the graph may represent communication between tasks, or precedence relations, or both. Task assignment is then formulated as a problem of partitioning the graph so as to minimize some cost function. Typically, each element (or block) in the partition will represent a set of tasks which will be assigned to the same processor. The cost function may represent a combination of communication costs (which will increase as tasks are dispersed among a larger number of processing units) and computation times (which will typically decrease as the number of tasks included in any block becomes smaller). The assignment is then chosen to minimize this combined cost.
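To make this combined cost concrete, the following toy sketch (ours, not the authors' cost function) stores task execution times as node weights and communication volumes as edge weights, and scores a candidate partition by adding the communication that crosses block boundaries to the load of the most heavily loaded block; the particular weighting and all data are illustrative assumptions.

```python
# Illustrative sketch (not from the paper): evaluate the combined cost of a
# candidate partition of a weighted task graph.

# Task execution times (node weights), indexed by task id.
exec_time = {0: 4.0, 1: 2.0, 2: 3.0, 3: 5.0, 4: 1.0}

# Communication volumes (edge weights) between pairs of tasks.
comm = {(0, 1): 6.0, (1, 2): 2.0, (2, 3): 4.0, (0, 4): 1.0, (3, 4): 3.0}

def combined_cost(partition, alpha=1.0, beta=1.0):
    """Cost of a partition given as a list of blocks (sets of task ids).

    alpha weighs total inter-block communication, beta weighs the load of
    the most heavily loaded block (a proxy for parallel execution time).
    The specific weighting is an assumption for illustration only.
    """
    block_of = {t: b for b, block in enumerate(partition) for t in block}
    cross_comm = sum(c for (i, j), c in comm.items()
                     if block_of[i] != block_of[j])
    max_load = max(sum(exec_time[t] for t in block) for block in partition)
    return alpha * cross_comm + beta * max_load

# Two candidate assignments onto two processors.
print(combined_cost([{0, 1, 4}, {2, 3}]))   # tasks 0, 1, 4 on one processor
print(combined_cost([{0, 2, 4}, {1, 3}]))   # a more communication-heavy split
```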
In the general case, since this problem is NP-hard, approximate heuristics are needed because exact solutions would require excessive execution times when the number of tasks in the program and the number of processing units are large. In the following sections, we first introduce task assignment and briefly discuss the related issue of task scheduling. We also discuss load balancing in order to differentiate it from the work presented here. Then, we formalize the task assignment problem and its related cost functions in a graph-theoretic framework. Finally, we survey other work, and present our own approaches and heuristic algorithms to solve the task assignment problem. The approaches we propose and evaluate in this paper are a heuristic based on the random neural model of Gelenbe [24, 25], a heuristic based on genetic algorithms [5, 27, 53], and the well-known simulated annealing heuristic [2, 5].

2. PROBLEM DEFINITION

Task assignment is simply the choice of a mapping of a set of tasks to a set of processors so as to achieve a predefined goal. This goal is usually represented as some cost function which may consider a combination of several criteria: equitable load sharing between the processors, maximization of the degree of parallelism, minimization of the amount (and delay) of communication between the processors, minimization of the execution time of the program, etc. In order to be of use in achieving a satisfactory solution, the cost function must obviously include the constraints and characteristics of the programs involved (such as task execution times, amount of intertask communication, precedence between tasks), and of the system architecture, including the nature and topology of interconnects between processing units, the speed of the processors, and memory system properties (shared or private to processors, limits in memory size, etc.).

Usually, the task assignment problem will not consider the actual schedule or order in which the tasks are executed. On the other hand, task scheduling has been actively researched over the years and precisely addresses this specific issue [16, 29, 44, 50]. Thus, in the present paper, we will not discuss scheduling issues. The related dynamic load balancing, or dynamic task assignment, problem allocates tasks during program execution [8, 34, 36, 38, 56] and uses task migration to shift the workload in the system among processing units [7, 10, 19, 45, 52]. Dynamic load balancing algorithms use system-state information, and the workload may migrate from one processor to another during run-time. Task migration is a mechanism whereby a process on one machine is moved to another machine; that is, it consists in interrupting the task's execution and in transferring a sufficient amount of information so that the task can be executed in another place. Policies for dynamic load balancing, or dynamic task assignment, will often use the following types of rules [10, 19, 45]; a minimal illustrative sketch of such a policy follows the list:

  • the information rule, which describes how to collect and where to store the information used in making decisions;
  • the transfer rule, which is used to determine when to initiate an attempt to transfer a task and whether or not to transfer a task;
  • the location rule, which chooses the machine to or from which tasks will be transferred;
  • the selection rule, which is used to select a task for transfer.
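Purely as an illustration of how these four rules fit together (the present paper does not pursue dynamic balancing further), the following minimal Python sketch organizes them as methods of a hypothetical policy object; the class name, threshold, and data layout are our assumptions, not an interface taken from the cited works.

```python
# Hypothetical sketch of a dynamic load-balancing policy structured around the
# four rule types listed above; names and thresholds are illustrative.
from dataclasses import dataclass, field

@dataclass
class SimplePolicy:
    threshold: float = 4.0                      # load above which we try to shed work
    loads: dict = field(default_factory=dict)   # information rule: collected state

    def information_rule(self, node, load):
        """Collect and store the load reported by each node."""
        self.loads[node] = load

    def transfer_rule(self, node):
        """Decide whether this node should attempt to transfer a task at all."""
        return self.loads.get(node, 0.0) > self.threshold

    def location_rule(self, sender):
        """Choose the destination: here, the least-loaded known node."""
        candidates = {n: l for n, l in self.loads.items() if n != sender}
        return min(candidates, key=candidates.get) if candidates else None

    def selection_rule(self, tasks):
        """Pick which task to migrate: here, the cheapest one to move."""
        return min(tasks, key=lambda t: t.get("migration_cost", 0.0)) if tasks else None

policy = SimplePolicy()
policy.information_rule("p1", 6.0)
policy.information_rule("p2", 1.5)
if policy.transfer_rule("p1"):
    target = policy.location_rule("p1")
    victim = policy.selection_rule([{"id": 7, "migration_cost": 2.0},
                                    {"id": 9, "migration_cost": 0.5}])
    print(f"migrate task {victim['id']} from p1 to {target}")
```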
Dynamic task assignment is obviously better suited to a processing environment which changes frequently due to variations in workload, or due to unexpected events such as processor slowdowns which may occur when a local load has higher priority, processor or network failures, processor withdrawal when a processor is preempted by a higher-priority job, etc. Dynamic load balancing is itself quite complex, and the redistribution process creates additional overhead that can adversely impact system performance. Krueger and Livny [35] show that while initial task assignment is capable of improving performance, the addition of task reallocation mechanisms can, in many cases, provide considerable additional improvement. In the sequel, we will only deal with the task assignment problem.

2.1. GRAPH-THEORETIC APPROACH TO TASK ASSIGNMENT

The graph-theoretic approach to task assignment models a program as a graph, and then uses graph-theoretic techniques to solve the problem. Each task in the parallel program is modeled as a node in the graph, and communicating tasks are connected by an edge. Both nodes and arcs will, in general, be weighted, the former representing task execution times, while the latter represent communication times or amounts of data being exchanged. Both directed and nondirected graphs may be used to represent programs. A nondirected graph will only represent information exchange and concurrent execution between tasks, while a directed graph will also deal with precedence relations between tasks.

There is a very substantial literature about the graph-theoretic approach to task assignment. Stone et al. [51] use the "max flow-minimal cut" theorem of Ford and Fulkerson [21] to search for an optimal assignment which will minimize the communication cost of a system with two processors. In [40], an extension of this approach to a system with n processors is proposed, by recursively using the same theorem combined with a greedy algorithm to find suboptimal assignments. The approach is augmented to include the interference cost, which reflects the degree of incompatibility between tasks. In [41], the same author describes two heuristic algorithms to find suboptimal assignments of tasks to processors. Both algorithms model the task assignment problem as a graph-partitioning problem and show that an appropriate goal is the minimization of the total interprocessor communication cost while meeting a constraint on the number of tasks assigned to each processor. Shen et al. [49] present a graph-matching approach using the minimax criterion, based on both minimization of interprocessor communication and balance of processor loading. Task assignment is transformed into a type of graph matching, called weak homomorphism; the search for the optimal weak homomorphism corresponds to the optimal task assignment. Both [12] and [54] propose to use the critical-path notion to assign tasks to processors so as to minimize program execution time based on task graph precedence. Ercal et al. [20] present a recursive algorithm based on the Kernighan-Lin bisection heuristic for the effective mapping of the tasks of a parallel program onto a hypercube parallel architecture. Heuristic algorithms are potentially fast, though some (such as those which are based on simulated annealing) can be rather slow. However, they are not guaranteed to yield an optimal solution.
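To make the two-processor case concrete, here is an illustrative sketch (ours, not code from [51]) of the standard max-flow/min-cut construction, using the networkx library; the tasks and cost values are invented. Each task gets an arc from terminal P1 carrying its execution cost on the other processor P2, and an arc to terminal P2 carrying its cost on P1, while task-to-task arcs carry communication costs; a minimum P1-P2 cut then yields a minimum-cost assignment.

```python
# Illustrative sketch (ours) of the two-processor formulation via min-cut,
# using networkx; the task and cost data are invented for the example.
import networkx as nx

# exec_cost[task] = (cost on processor P1, cost on processor P2)
exec_cost = {"a": (2.0, 5.0), "b": (4.0, 1.0), "c": (3.0, 3.0)}
comm_cost = {("a", "b"): 2.0, ("b", "c"): 6.0, ("a", "c"): 1.0}

G = nx.DiGraph()
for task, (on_p1, on_p2) in exec_cost.items():
    # Cutting the arc (P1 -> task) means the task goes to the P2 side, so that
    # arc carries the P2 execution cost; symmetrically for (task -> P2).
    G.add_edge("P1", task, capacity=on_p2)
    G.add_edge(task, "P2", capacity=on_p1)
for (i, j), c in comm_cost.items():
    G.add_edge(i, j, capacity=c)
    G.add_edge(j, i, capacity=c)

cut_value, (p1_side, p2_side) = nx.minimum_cut(G, "P1", "P2")
print("total cost:", cut_value)
print("assigned to P1:", sorted(p1_side - {"P1"}))
print("assigned to P2:", sorted(p2_side - {"P2"}))
```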
2.2. HEURISTIC ALGORITHMS FOR TASK ASSIGNMENT

Much work on approximate algorithms has been devoted to the minimization of the communication time resulting from task assignment [18, 22], while other work has also included the minimization of execution time [49, 14]. Other research also considers the effect of precedence constraints [15]. Task assignment in real-time systems [47] requires that deadlines be taken into consideration; in [28], the execution time and the degree of parallelism are also optimized while also considering finite memory capacity and task delay. In [13], both load balancing and processor capacity constraints are discussed in a real-time context, with the minimization of the communication cost as the goal.

Efe [18] presents an algorithm called "2 module clustering" that finds task pairs to be assigned to the same processor. This procedure is run until all the candidate task pairs are grouped together. The goal is to minimize intergroup communication. An improvement of this approach is proposed in [48]. Chu [14] has proposed a similar approach where the minimization of communication cost and load balancing is accomplished in two phases. Tasks are first fused among a set of processors with a clustering algorithm until the number of processors is equal to the number of groups. Task fusion is realized in such a way that two tasks which communicate with each other are assigned either to the same processor or to neighboring processors. Then, one checks to see whether or not the system satisfies the load balancing constraint; if it does not, then some tasks are reassigned from processors exceeding the tolerated level of load to processors that are below this level, while minimizing the cost of communication. More recently, Chu et al. [15] present a method for optimal task allocation that considers the effects of precedence, intertask communication, and cumulative execution time of each task to search for a minimum-bottleneck assignment.

Bowen et al. [13] propose and evaluate a hierarchical clustering and allocation algorithm, called A divisible, that drastically reduces the interprocess communication cost while respecting lower and upper bounds on the utilization of the processors. This algorithm is also well suited to dynamic task assignment. Baxter et al. [9] present an algorithm for static task allocation called LAST (Localized Allocation of Static Tasks), which successively allocates sets of tasks to processors until a complete mapping is constructed. The next task to be allocated is chosen on the basis of connectivity with the previously allocated tasks, and then assigned to a processor based on the speed with which the assignment can be computed. The overall cost of the mapping is the time required for the system to execute all the tasks. In [43], several algorithms are given to relocate processes when the system configuration changes. These algorithms modify the process and processor cluster trees generated during the original allocation, to reflect these changes. Generally, only a small subtree of the process cluster tree will have to be remapped to another small processor cluster subtree. Wells et al. [55] have developed a parallel task allocation methodology for nonbuffered message-passing environments. The algorithm incorporates a set of list-based heuristics (priority list, etc.) and graph-theoretical procedures (graph precedence layering, graph width, minimum cut graph traversal for partitioning, etc.) designed to balance computational load with communication requirements.
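As a rough illustration of the pairwise-clustering flavour shared by several of the algorithms surveyed above (this is a toy sketch of ours, not Efe's or Chu's actual procedure), the following code repeatedly merges the two groups with the heaviest communication between them until K groups remain; the data and function names are assumptions.

```python
# Toy greedy merging sketch in the spirit of the clustering heuristics
# surveyed above; not a reproduction of any cited algorithm.
from itertools import combinations

def cluster_by_communication(tasks, comm, k):
    """Merge task groups until only k remain, always joining the two groups
    with the largest total communication between them."""
    groups = [{t} for t in tasks]
    while len(groups) > k:
        def between(g1, g2):
            return sum(comm.get((i, j), 0.0) + comm.get((j, i), 0.0)
                       for i in g1 for j in g2)
        g1, g2 = max(combinations(groups, 2), key=lambda pair: between(*pair))
        groups.remove(g1)
        groups.remove(g2)
        groups.append(g1 | g2)
    return groups

comm = {(0, 1): 6.0, (1, 2): 2.0, (2, 3): 4.0, (3, 4): 3.0}
print(cluster_by_communication(range(5), comm, k=2))
```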
In [54], Tantawi and Towsley study a distributed system composed of a set of heterogeneous processors and develop a technique for static optimal probabilistic assignment. Other work on the static task assignment problem includes [32, 33, 37].

There has also been substantial work on dynamic load balancing and process migration. For example, [38, 39, 42] propose algorithms which migrate tasks from overloaded to less loaded processors, while in [38], a gradient procedure for moving tasks is examined. In [42], a drafting algorithm is proposed, based on the idea that a task only has to communicate with a subset of the other tasks. In [11], a model for dynamic task assignment based on "phases" is suggested, where a phase is a complete period of execution of a task, and the idea is to reconsider assignment for every distinct phase of a task. Krueger and Livny [35] study a variety of algorithms combining a local scheduling policy, such as processor sharing, with a global scheduling policy, such as load balancing, to achieve dynamic task allocation. In [26], an adaptive load balancing algorithm is proposed for both tasks and files. The gradient descent paradigm is used to make on-line load balancing decisions for tasks, and balancing is based on redistribution of files so as to maintain an equal file load at all nodes. Other work on dynamic task assignment using process migration includes [7, 8, 10, 19, 30, 36, 52, 56].

3. FRAMEWORK FOR THE TWO PROBLEMS OF TASK ASSIGNMENT AND TRANSACTION CLUSTERING

The task assignment problem addresses the clustering of a set of tasks on a set of processors so as to optimize system performance. Therefore, we first describe the formal environment within which this problem is considered. However, this problem is very similar to another important question in distributed systems: how to cluster a set of transactions so that they will be executed on a set of processors. In fact, one may consider that transactions are simply tasks of a special kind. Thus, we will address here the framework for that problem as well.

3.1. TASK GRAPHS AND TASK ASSIGNMENT

In our study, we consider a distributed system architecture which consists of a collection of K processors with distributed memory, i.e., with sufficient memory at each processor so that any one task can be executed. The processors are fully interconnected via a reliable high-speed network. A parallel program which will be executed in this environment is represented by a task graph [23], denoted by $\Pi = (N, A, e, C)$, where $N = \{1, \ldots, n\}$ is the set of n tasks that compose the program, $A = \{a_{ij}\}$ is the incidence matrix which describes the graph, and $e$, $C$ are the amounts of work related to task execution and to communication between tasks. Thus, $e_i$ defines the amount of work, or code, to be executed in task $i = 1, \ldots, n$, and $C_{ij}$ denotes the amount of information transferred during communication from task $i$ to task $j$, if $a_{ij} = 1$. Clearly, $a_{ij} = 0$ implies that $C_{ij} = 0$. Note that this model may describe precedence between tasks if the graph is directed and acyclic, or it may be used to represent a set of tasks which interact via passage of information when the graph is not directed (in which case we will have $a_{ij} = a_{ji}$ for all $i, j$).
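A direct, illustrative encoding of the tuple $\Pi = (N, A, e, C)$ could look like the sketch below (our construction, not code from the paper); it enforces the two consistency conditions just stated, namely that $a_{ij} = 0$ forces $C_{ij} = 0$ and that a nondirected graph has a symmetric incidence matrix.

```python
# Illustrative encoding (ours) of the task graph Pi = (N, A, e, C);
# tasks are indexed 0..n-1 for convenience.
from dataclasses import dataclass
from typing import List

@dataclass
class TaskGraph:
    A: List[List[int]]      # incidence matrix a_ij (1 if i and j are connected)
    e: List[float]          # e_i: amount of work (code) executed by task i
    C: List[List[float]]    # C_ij: information sent from task i to task j
    directed: bool = False  # directed & acyclic => precedence; nondirected => interaction

    def __post_init__(self):
        n = len(self.e)
        for i in range(n):
            for j in range(n):
                # a_ij = 0 must imply C_ij = 0, as stated in the text.
                if self.A[i][j] == 0:
                    assert self.C[i][j] == 0, f"C[{i}][{j}] set without an arc"
                # For a nondirected graph we must have a_ij = a_ji.
                if not self.directed:
                    assert self.A[i][j] == self.A[j][i], "A must be symmetric"

# A three-task example: task 0 communicates with 1, task 1 with 2 (nondirected).
g = TaskGraph(
    A=[[0, 1, 0], [1, 0, 1], [0, 1, 0]],
    e=[4.0, 2.0, 3.0],
    C=[[0, 5.0, 0], [5.0, 0, 2.0], [0, 2.0, 0]],
)
print(g.e, g.C[0][1])
```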
The task assignment problem at hand is that of assigning the n tasks to K processors. This means that we have to find a partition $(\Pi_1, \ldots, \Pi_K)$ of the set of n tasks in a way which optimizes performance, as expressed by criteria such as:

  • The communication between different processors of the system must be kept to a minimum.
  • The load of the different processors must be balanced.
  • The total effective execution time of the parallel program must be minimized.

3.2. CLUSTERING OF TRANSACTIONS

Now consider once again the task graph model described above, with the following differences in the manner in which it is interpreted. We will consider a transaction graph $\Gamma = (T, P, e, C)$, where $T = \{1, \ldots, n\}$ is a set of n transactions, $P = \{p_{ij}\}$ is the $n \times n$ precedence matrix which describes the transaction graph, i.e., the precedence dependencies between transactions, and the n-vector $e$ and the $n \times n$ matrix $C$ represent, respectively, the amount of work related to transaction execution and the amount of information or data granules shared between transactions. Thus, $e_i$ denotes the amount of work, or code to be executed, associated with transaction $i = 1, \ldots, n$, and $C_{ij}$ denotes the number of information granules which are shared by transactions $i$ and $j$.

Clearly, we will seek an assignment of transactions to processing units so that related transactions are executed on the same processor. Related transactions would be those which share common information or data granules, as well as those which have precedence relations between them. Similarly, we would be seeking to cluster transactions so that those which have such affinities are placed in the same cluster. In the sequel, we shall follow the terminology of task assignment, but will keep in mind that a similar approach can be adopted for transaction clustering and that the same algorithms will apply to both problems.

4. TASK ASSIGNMENT AND THE RELATED COST FUNCTIONS

Contrary to load balancing, task assignment usually refers to decisions which are made before program execution, and which are not changed during program execution [6, 9, 17, 41]. Thus, this approach to distributing processes or tasks to processing units can be applied effectively to programs whose run-time behavior is relatively predictable, since the decision must rely on a priori knowledge of the system and of the programs. There are numerous examples of applications where this approach is useful, including major numerical algorithms such as matrix multiplication or inversion, solution methods for differential systems, as well as large nonnumerical problems such as searching and sorting.

As mentioned previously, task assignment is usually carried out so as to optimize a criterion which describes the "costs" or "benefits." The assignment itself can be characterized by a set of binary variables $\{X_{ip}\}$, where $i$ ranges over the set of tasks and $p$ ranges over the set of processors. Thus, $X_{ip}$ is the binary variable whose value is 1 if task $i$ is assigned to processor $p$, and is 0 otherwise. For an assignment to be valid, each task must be assigned to exactly one processor:

$$X_{ip} \cdot X_{iq} = 0 \quad \text{for all } i \text{ and } p \neq q, \qquad \text{and} \qquad \sum_{p=1}^{K} X_{ip} = 1 \quad \text{for all } i. \tag{1}$$
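The validity condition (1) can be checked mechanically. The short sketch below (illustrative only; the helper names are ours) builds the binary variables $X_{ip}$ from a partition $(\Pi_1, \ldots, \Pi_K)$ and verifies that each task belongs to exactly one block.

```python
# Illustrative check (ours) of the validity condition (1) on the assignment
# variables X_ip derived from a partition (Pi_1, ..., Pi_K).

def assignment_matrix(partition, n):
    """X[i][p] = 1 if task i is in block p of the partition, else 0."""
    X = [[0] * len(partition) for _ in range(n)]
    for p, block in enumerate(partition):
        for i in block:
            X[i][p] = 1
    return X

def is_valid(X):
    """Equation (1): X_ip * X_iq = 0 for p != q, and sum_p X_ip = 1, for all i."""
    for row in X:
        if sum(row) != 1:                      # each task on exactly one processor
            return False
        if any(row[p] * row[q] != 0
               for p in range(len(row)) for q in range(len(row)) if p != q):
            return False
    return True

X = assignment_matrix([{0, 1, 4}, {2, 3}], n=5)
print(is_valid(X))                                         # True
print(is_valid(assignment_matrix([{0, 1}, {1, 2}], n=3)))  # False: task 1 is in two blocks
```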

Journal:
  • Inf. Sci.

Volume 97

Pages 199-219

Publication year 1997