Search results for: fault recovery

Number of results: 262091

Journal: International Journal of Intelligent Information and Database Systems 2021

1997
Myron Hecht

A distributed fault tolerant system for real time process control based on an enhancement of the distributed recovery block is described. Coverage is provided for failures in hardware, system software, networks, and application software. Fault tolerance provisions are introduced at the system level and in application software using an architecture based on the distributed recovery block (DRB). ...

2009
Hasan Sözer

The increasing size and complexity of software systems makes it hard to prevent or remove all possible faults. Faults that remain in the system can eventually lead to a system failure. Fault tolerance techniques are introduced for enabling systems to recover and continue operation when they are subject to faults. Many fault tolerance techniques are available but incorporating them i...

Journal: Optical Switching and Networking 2008
A. Antonino Andrea Bianco A. Bianciotto Vito De Feo Jorge M. Finochietto Roberto Gaudino Fabio Neri

This paper presents the architecture of a Wavelength Division Multiplexing (WDM) optical packet network, called WONDER, that was designed and prototyped in the PhotonLab at Politecnico di Torino, Italy. The design and implementation of the WONDER network aim to assess the effectiveness of optical technologies with respect to electronic ones, trying to identify an optimal mix of the two technolo...

2010
George Bosilca Aurelien Bouteiller Thomas Hérault Pierre Lemarinier Jack J. Dongarra

With the number of computing elements spiraling to hundreds of thousands in modern HPC systems, failures are common events. Few applications are nevertheless fault tolerant; most are in need of a seamless recovery framework. Among the automatic fault tolerant techniques proposed for MPI, message logging is preferable for its scalable recovery. The major challenge for message logging protocols i...

1993
Joanne Bechta Dugan Michael R. Lyu

This paper discusses the modeling and analysis of three major fault-tolerant software system architectures: DRB (Distributed Recovery Blocks), NVP (N-Version Programming) and NSCP (N Self-Checking Programming). In the system-level reliability modeling domain, fault tree analysis techniques and Markov reward modeling techniques are combined to incorporate transient and permanent hardware faults...

1996
Jon B. Weissman David Womack

We present a model for application-level fault tolerance for parallel applications. The objective is to achieve high reliability with minimal impact on the application. Our approach is based on a full replication of all parallel application components in a distributed wide-area environment in which each replica is independently scheduled in a different site. A system architecture for coordinati...

2001
Giang T. Nguyen Ladislav Hluchý Viet D. Tran Margaréta Kotocová

This paper presents a solution for the problem of transparent recovery of asynchronous distributed computation on clusters of workstations when a fault occurs on a node. If the system has fault-tolerant features, it can survive the fault and continue its computations. Performance degradation is unavoidable when hardware redundancies are not available. It is a large advantage if the long-runtim...
