Search results for: fault recovery

Number of results: 262091

Journal: Computer and Information Science 2009
Idawaty Ahmad Muhammad Fauzan Othman

In this paper, we propose two recovery solutions over the existing error-free utility accrual scheduling algorithm known as the General Utility Accrual Scheduling algorithm (or GUS) (Peng Li, 2004). A robust fault recovery algorithm called Backward Recovery GUS (or BRGUS) works by adapting the time redundancy model, i.e., by re-executing the affected task after its transient error period is over. T...

2013
Rajiv Mahajan

Software almost inevitably contains defects. Do everything possible to reduce the fault rate; use fault-tolerance techniques to deal with software faults. Fault tolerance is the ability of a system to perform its function correctly even in the presence of internal faults. Most ordinary systems lack a fault-tolerant software fix. This paper surveys various software fault tolerance techniques...

2001
Vincenzo De Florio

The structures for the expression of fault-tolerance provisions into the application software are the central topic of this dissertation. Structuring techniques provide means to control complexity, the latter being a relevant factor for the introduction of design faults. This fact and the ever increasing complexity of today’s distributed software justify the need for simple, coherent, and effec...

2002
R. Badrinath R. Gupta N. Shrivastava

Checkpointing and rollback recovery is a simple technique for fault tolerance. The state of a process is saved to a disk file from which the process can recover when a failure occurs. In this paper we describe the implementation of FTOP (Fault Tolerant PVM), a coordinated checkpointing library integrated with PVM. Existing PVM applications require only minor changes for incorporating faul...

2002
Priya Narasimhan

The OMG’s Real-Time CORBA (RT-CORBA) and Fault-Tolerant CORBA (FT-CORBA) specifications make it possible for today’s CORBA implementations to exhibit either real-time or fault tolerance in isolation. While real-time requires a priori knowledge of the system’s temporal operation, fault tolerance necessarily deals with faults that occur unexpectedly, and with possibly unpredictable fault recovery ...

2000
Myron Hecht Xuegao An Bing Zhang Yutao He

This paper describes OFTT (OLE Fault Tolerance Technology), a fault tolerance middleware toolkit running on the Microsoft Windows NT operating system that provides required fault tolerance for networked PCs in the context of industrial process monitoring and control applications. It is based on the Microsoft Component Object Model (COM) and consists of components that perform checkpoint-sa...

2013
A. Wander

This paper summarizes basic concepts and the current state of the art in spacecraft fault detection, isolation and recovery. A gap analysis focuses on drawbacks of classical fault diagnosis methods in deep space applications. Studies that propose innovative techniques are summarized and evaluated briefly with respect to enhanced spacecraft on-board fault diagnosis on system level. Research chal...

2011
Kishor H. Kharbas

KHARBAS, KISHOR H. Failure Detection and Partial Redundancy in HPC. (Under the direction of Dr. Frank Mueller.) To support the ever increasing demand of scientific computations, today’s High Performance Computing (HPC) systems have large numbers of computing elements running in parallel. Petascale computers, which are capable of reaching a performance in excess of one PetaFLOPS (10^15 floating p...

2006
Gunbae Kim IlWoong Kim Ilgweon Kang Sungho Kang

This paper presents a Fault Tolerant Carry Select Adder (FT-CSA), the carry select adder being the most widely used type of adder, based on a self-checking scheme with a modular architecture. The error recovery capability is derived using generic input/output combinations of the carry select adder with a predetermined fault and error set. The experimental results show that the proposed FT-CSA has nearly 50% overhead compared with the typic...

2004
S. J. Upadhyaya W. K. Fuchs

Error recovery capability is examined in processing arrays that employ spare nodes for fault tolerance. Spares can provide fault tolerance to high-performance single-package arrays, where it is not feasible to repair faulty subsystems. The cost of such a fault-tolerance solution, redundant hardware that idles until needed, may not be practical. Manufacturers must be offered hardware solutions t...
