Parallel Fault Tolerant Algorithms for Parabolic Problems
Abstract
With the increasing number of processors available on today's high-performance computing systems, the mean time between failures of these machines is decreasing. The ability of hardware and software components to handle process failures is therefore becoming increasingly important. The objective of this paper is to present a fault-tolerant approach for the implicit forward time integration of parabolic problems using explicit formulas. This technique allows the application to recover from process failures and to reconstruct the lost data of the failed process(es), avoiding the roll-back operation required in most checkpoint-restart schemes. The benchmark used to highlight the new algorithms is the two-dimensional heat equation solved with a first-order implicit Euler scheme.
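For orientation, the benchmark's time stepping can be sketched as follows. This is a minimal serial illustration of one first-order implicit Euler step for the two-dimensional heat equation, not the paper's fault-tolerant algorithm. The function name `implicit_euler_step`, the parameter `r = alpha*dt/h**2`, the fixed (Dirichlet) boundary values, and the Gauss-Seidel inner solver are all assumptions made for this example.

```python
def implicit_euler_step(u, r, sweeps=200, tol=1e-12):
    """Advance the grid u by one implicit Euler step of the 2D heat equation.

    Solves (1 + 4r) * new[i][j] - r * (sum of 4 neighbours of new) = u[i][j]
    on interior points with Gauss-Seidel sweeps; boundary values stay fixed.
    r is the hypothetical mesh ratio alpha * dt / h**2.
    """
    n, m = len(u), len(u[0])
    new = [row[:] for row in u]  # initial guess: the previous time level
    for _ in range(sweeps):
        max_change = 0.0
        for i in range(1, n - 1):
            for j in range(1, m - 1):
                nb = new[i - 1][j] + new[i + 1][j] + new[i][j - 1] + new[i][j + 1]
                val = (u[i][j] + r * nb) / (1.0 + 4.0 * r)
                max_change = max(max_change, abs(val - new[i][j]))
                new[i][j] = val
        if max_change < tol:  # stop once a full sweep no longer changes the grid
            break
    return new


# Usage: a 5x5 grid with a single hot point in the centre and cold boundaries.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 1.0
r = 0.25
stepped = implicit_euler_step(grid, r)
```

Because the system matrix is strictly diagonally dominant, the Gauss-Seidel sweeps converge for any positive `r`, which is why the implicit scheme has no time-step stability limit.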
Similar resources
Fault Tolerant Algorithms for Heat Transfer Problems
With the emergence of new massively parallel systems in the High Performance Computing area allowing scientific simulations to run on thousands of processors, the mean time between failures of large machines is decreasing from several weeks to a few minutes. The ability of hardware and software components to handle these singular events called process failures is therefore getting increasingly ...
Voting Algorithm Based on Adaptive Neuro Fuzzy Inference System for Fault Tolerant Systems
Some applications are critical and must be designed as fault-tolerant systems. A voting algorithm is usually one of the principal elements of a fault-tolerant system. Two kinds of voting algorithm are used in most applications: the majority voting algorithm and the weighted average algorithm. These algorithms have some problems. Majority confronts with the problem of threshold limits and voter of weight...
Efficient algorithms for fault tolerant mobile agent execution
Redundancy is necessary for fault tolerance, but the overhead introduced by redundancy may degrade the system’s performance. In this paper, we propose efficient replication-based algorithms for fault-tolerant mobile agent execution, which enable parallel processing in the agent execution to reduce the overhead caused by redundancy. We also investigate failure detection mechanisms and identify the p...
Efficient Parallel Solution of Nonlinear Parabolic Partial Differential Equations by a Probabilistic Domain Decomposition
Initial- and initial-boundary value problems for nonlinear one-dimensional parabolic partial differential equations are solved numerically by a probabilistic domain decomposition method. This is based on a probabilistic representation of solutions by means of branching stochastic processes. Only a few values of the solution inside the space-time domain are generated by a Monte Carlo method, and an ...