Recovery with limited replay: fault-tolerant processes in Linda
Abstract
Research on fault-tolerant distributed systems has focused largely on making data survive various forms of failure, and replica control algorithms for maintaining mutually consistent replicas abound. Comparatively little work, however, has been devoted to making processes recoverable. In domains other than databases and transaction processing, fault-tolerance generally implies both fault-tolerant data and fault-tolerant processes. In environments where cooperation among processes is important, we argue that high availability of processes, in addition to their recoverability, is crucial. Our specific interest is in the Linda tuple space paradigm. In this paper we discuss efficient techniques for making Linda processes recoverable and outline some characteristics of Linda that make it particularly suitable for implementing fault-tolerance. We also propose a simple extension to our recoverable process mechanism that makes processes highly available. Keywords: fault-tolerant processes, high availability, recovery, Linda tuple space, replay, message logging.
Similar resources
Somersault Software Fault-Tolerance
Keywords: software fault-tolerance, process replication failure masking, continuous availability, topology. The ambition of fault-tolerant systems is to provide application-transparent fault-tolerance at the same performance as a non-fault-tolerant system. Somersault is a library for developing distributed fault-tolerant software systems that comes close to achieving both goals. We describe Somersault and...
NT-SwiFT: software implemented fault tolerance on Windows NT
More and more highly available applications are implemented on Windows NT. However, the current version of Windows NT (NT4) does not provide some facilities that are needed to implement these fault-tolerant applications. In this paper, we describe a set of components collectively named NT-SwiFT (Software Implemented Fault Tolerance) which facilitates building fault-tolerant and highly available a...
A New Design of Fault Tolerant Comparator
In this paper we present a new design of a fault-tolerant comparator with a fault-free hot spare. The aim of this design is to achieve a low time and area overhead in fault-tolerant comparators. We use a hot standby technique to keep the system operating normally without interruption, and a dynamic recovery method for fault detection and correction. The circuit is divided into smaller modules...
Fault tolerant system with imperfect coverage, reboot and server vacation
This study is concerned with the performance modeling of a fault-tolerant system consisting of operating units supported by a combination of warm and cold spares. The on-line as well as warm standby units are subject to failures and are sent for repair to a repair facility with a single repairman who is prone to failure. If the failed unit is not detected, the system enters into an unsafe...
Checkpointing and Rollback Recovery Algorithms for Fault Tolerance in MANETs: A Review
Sushant Patial, Department of Computer Science, Himachal Pradesh University, Shimla-5 (email: patialsushant@gmail.com); Jawahar Thakur, Department of Computer Science, Himachal Pradesh University, Shimla-5 (email: jawahar.hpu@gmail.com). Abstract: ...