Enhancing The Fault-Tolerance of Nonmasking Programs

Authors

  • Sandeep S. Kulkarni
  • Ali Ebnenasir
Abstract

In this paper, we focus on automated techniques to enhance the fault-tolerance of a nonmasking fault-tolerant program to masking. A masking program continually satisfies its specification even if faults occur. By contrast, a nonmasking program merely guarantees that after faults stop occurring, the program recovers to states from which it continually satisfies its specification. Until the recovery is complete, however, a nonmasking program can violate its (safety) specification. Thus, the problem of enhancing fault-tolerance from nonmasking to masking requires that safety be added and recovery be preserved. We focus on this enhancement problem for high atomicity programs (where each process can read all variables) and for distributed programs (where restrictions are imposed on what processes can read and write). We present a sound and complete algorithm for high atomicity programs and a sound algorithm for distributed programs. We also argue that our algorithms are simpler than previous algorithms, where masking fault-tolerance is added to a fault-intolerant program. Hence, these algorithms can partially reap the benefits of automation when the cost of adding masking fault-tolerance to a fault-intolerant program is high. To illustrate these algorithms, we show how the masking fault-tolerant programs for triple modular redundancy and Byzantine agreement can be obtained by enhancing the fault-tolerance of the corresponding nonmasking versions. We also discuss how the derivation of these programs is simplified when we begin with a nonmasking fault-tolerant program.
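The distinction drawn in the abstract can be made concrete on a toy finite-state model. The sketch below is not the paper's algorithm; it only illustrates the two requirements named above (safety added, recovery preserved) under simplifying assumptions: states are integers, a program and a fault class are sets of `(source, target)` transitions, safety is given as a set of forbidden transitions `bad`, and only program transitions are checked against it. All names (`reachable`, `violates_safety`, `recovers`, and the example sets) are hypothetical.

```python
from collections import deque

def reachable(start, transitions):
    """States reachable from the set `start` by following (s, t) pairs."""
    succ = {}
    for s, t in transitions:
        succ.setdefault(s, set()).add(t)
    seen, queue = set(start), deque(start)
    while queue:
        s = queue.popleft()
        for t in succ.get(s, ()):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen

def violates_safety(program, faults, invariant, bad):
    """True if the program can take a forbidden transition from a state
    reached from the invariant by program and fault transitions."""
    span = reachable(invariant, program | faults)
    return any(s in span and (s, t) in bad for (s, t) in program)

def recovers(program, faults, invariant):
    """True if, once faults stop, every state in the fault span is
    guaranteed to reach the invariant (no deadlock or cycle outside it)."""
    span = reachable(invariant, program | faults)
    succ = {}
    for s, t in program:
        succ.setdefault(s, set()).add(t)
    good = set(invariant)
    changed = True
    while changed:
        changed = False
        for s in span - good:
            # s is guaranteed to recover if it can move and every move
            # leads to a state already known to recover.
            if succ.get(s) and succ[s] <= good:
                good.add(s)
                changed = True
    return span <= good

# Toy model (assumed for illustration): state 0 is the invariant,
# a fault moves 0 -> 1, and the step 1 -> 2 is forbidden by safety.
invariant = {0}
faults = {(0, 1)}
bad = {(1, 2)}
nonmasking = {(0, 0), (1, 2), (1, 0), (2, 0)}

# Naive enhancement: drop forbidden transitions, then re-check recovery.
enhanced = nonmasking - bad
```

In this toy model the original program recovers but can violate safety on the way, which matches the abstract's description of nonmasking tolerance. After the forbidden transition is removed, recovery still holds and no forbidden step is reachable, which matches masking tolerance. In general, removing transitions can break recovery (for example by creating a deadlock outside the invariant), which is why the recovery check must be repeated after the removal.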

Similar resources

Designing Masking Fault-tolerance via Nonmasking Fault-tolerance

Masking fault-tolerance guarantees that programs continually satisfy their specification in the presence of faults. By way of contrast, nonmasking fault-tolerance does not guarantee as much: it merely guarantees that when faults stop occurring, program executions converge to states from where programs continually (re)satisfy their specification. In this paper, we show that an effective method to ...

Designing Masking Fault Tolerance via Nonmasking Fault Tolerance

Masking fault-tolerance guarantees that programs continually satisfy their specification in the presence of faults. By way of contrast, nonmasking fault-tolerance does not guarantee as much: it merely guarantees that when faults stop occurring, program executions converge to states from where programs continually (re)satisfy their specification. We present in this paper a component based method ...

Automatic Synthesis of Fault-tolerance

By Ali Ebnenasir. Fault-tolerance is an important property of today’s software systems as we rely on computers in our daily affairs (e.g., medical equipment, transportation systems, etc.). Since it is difficult (if not impossible) to anticipate all classes of faults that perturb a program while designing that program, it is desirable to incrementally add fa...

Feasibility of Stepwise Addition of Multitolerance to High Atomicity Programs

We present sound and (deterministically) complete algorithms for stepwise design of two families of multitolerant programs in a high atomicity program model, where a program can read and write all its variables in an atomic step. We illustrate that if one needs to add failsafe (respectively, nonmasking) fault-tolerance to one class of faults and masking fault-tolerance to another c...

Optimal, Nonmasking Fault-Tolerant Reconfiguration of Trees and Rings

We design two programs that maintain the processes of an arbitrary distributed system in a rooted spanning tree and in a unidirectional ring, respectively, in the presence of fail-stop failures and repairs of both processes and communication channels. Our programs are notable as they (i) are fully distributed, (ii) have optimal time complexity, and (iii) demonstrate two different approaches to d...

Publication date: 2003