Algorithm-based fault-tolerant programming in scientific computation on multiprocessors

Authors

  • Josef Altmann
  • Ansgar Böhm
Abstract

Efficient parallel algorithms proposed to solve many fundamental problems in scientific computation are sensitive to processor failures. Because of its low costs, algorithm-based fault tolerance is an interesting concept for introducing fault tolerance into existing multiprocessors. To facilitate fault-tolerant programming in scientific computation, we have modified and developed further an existing parallel run-time environment. In this paper the aspect of tuning known error processing techniques to the algorithm-based approach is primarily examined. Design issues for implementation and execution time overhead of a fault-tolerant application in our run-time environment are studied. In contrast to many other environments for parallel fault-tolerant programming, which use the master/slave programming model, our environment enables one to add fault tolerance to existing parallel applications in scientific computation.
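
As a rough illustration of what algorithm-based fault tolerance can look like, the sketch below uses the classic checksum scheme for matrix multiplication (Huang and Abraham, 1984). It is not taken from this paper and does not model the authors' run-time environment; all function names are hypothetical, and integer entries are assumed so that checksums can be compared exactly (floating-point data would need a tolerance).

```python
# Sketch of checksum-based ABFT for C = A * B (illustrative only, not the paper's method).

def with_column_checksum(a):
    """Return a copy of a with an extra row holding the sum of each column."""
    return [row[:] for row in a] + [[sum(col) for col in zip(*a)]]

def with_row_checksum(b):
    """Return a copy of b where each row gains a final entry holding the row sum."""
    return [row + [sum(row)] for row in b]

def matmul(a, b):
    """Plain dense matrix product on lists of lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def find_error(c_full):
    """Locate a single corrupted data element of a full-checksum matrix.

    Returns (row, col) of the faulty element, or None if all checksums agree.
    Raises ValueError if the mismatch pattern is not a single data error.
    """
    n, m = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows = [i for i in range(n) if sum(c_full[i][:m]) != c_full[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c_full[i][j] for i in range(n)) != c_full[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("error pattern is not a single corrupted data element")

def correct_error(c_full, pos):
    """Rebuild the element at pos from its row checksum, in place."""
    i, j = pos
    m = len(c_full[0]) - 1
    others = sum(c_full[i][k] for k in range(m) if k != j)
    c_full[i][j] = c_full[i][m] - others
```

Multiplying the column-checksum form of A by the row-checksum form of B yields a product that carries both row and column checksums, so a single wrong element produced by a faulty processor shows up as exactly one inconsistent row and one inconsistent column and can be recomputed from the checksum.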


Similar resources

Fault Tolerant DNA Computing Based on Digital Microfluidic Biochips

Historically, DNA molecules have been known as the building blocks of life. In 1994, Leonard Adleman introduced a technique to utilize DNA molecules for a new kind of computation. Owing to its massive parallelism, huge storage capacity, and the ability to use DNA molecules inside living tissue, this type of computation is applied in many application areas such as me...


Voting Algorithm Based on Adaptive Neuro Fuzzy Inference System for Fault Tolerant Systems

Some applications are critical and must be designed as fault-tolerant systems. A voting algorithm is usually one of the principal elements of a fault-tolerant system. Two kinds of voting algorithm are used in most applications: the majority voting algorithm and the weighted average algorithm. These algorithms have some problems. Majority voting is confronted with the problem of threshold limits and voter of weight...



Fault-Tolerant Matrix Operations for Networks of Workstations Using Diskless Checkpointing

Networks of workstations (NOWs) offer a cost-effective platform for high-performance, long-running parallel computations. However, these computations must be able to tolerate the changing and often faulty nature of NOW environments. We present high-performance implementations of several fault-tolerant algorithms for distributed scientific computing. The fault-tolerance is based on diskless chec...


Scalable and Fault Tolerant Computation with the Sparse Grid Combination Technique

This paper continues to develop a fault tolerant extension of the sparse grid combination technique recently proposed in [B. Harding and M. Hegland, ANZIAM J., 54 (CTAC2012), pp. C394–C411]. The approach is novel for two reasons, first it provides several levels in which one can exploit parallelism leading towards massively parallel implementations, and second, it provides algorithm-based fault...


Publication date: 1995