Adaptive Fault Recovery for Networked Reconfigurable Systems
Abstract
The device-level size and complexity of reconfigurable architectures make fault tolerance an important concern in system design. In this paper, we introduce a fully automated fault recovery system for networked systems that contain FPGAs. If a fault is detected that cannot be addressed locally, fault information is transferred to a reconfiguration server. The server recompiles the design to avoid the fault, returns a new FPGA configuration to the remote system, and computation is reinitiated. To illustrate the benefit of this approach, we have implemented a complete fault recovery system that requires no manual intervention. An important part of the system is a timing-driven incremental router for Xilinx Virtex devices. This router is directly interfaced to Xilinx JBits and uses no CAD tools from the standard Xilinx Alliance tool flow. Our completed system has been applied to three benchmark designs and achieves complete fault recovery in up to 12× less time than the standard incremental Xilinx PAR flow.
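The recovery flow described in the abstract can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the names `FaultReport`, `Configuration`, `ReconfigurationServer`, and `recover` are hypothetical, the "recompilation" step merely records which resources to avoid, and no real JBits or Xilinx API is called.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class FaultReport:
    """Hypothetical fault message sent from a remote FPGA node to the server."""
    device_id: str
    design_name: str
    faulty_resource: str  # assumed identifier for a faulty wire or logic block


@dataclass(frozen=True)
class Configuration:
    """Stand-in for a new FPGA configuration; a real system would carry a bitstream."""
    design_name: str
    avoided_resources: frozenset


class ReconfigurationServer:
    """Toy server that remembers faulty resources per (device, design) pair."""

    def __init__(self) -> None:
        self._avoided: Dict[Tuple[str, str], Set[str]] = {}

    def recompile(self, report: FaultReport) -> Configuration:
        # Placeholder for incremental re-routing: we only accumulate the
        # resources that the new configuration must not use.
        key = (report.device_id, report.design_name)
        avoided = self._avoided.setdefault(key, set())
        avoided.add(report.faulty_resource)
        return Configuration(report.design_name, frozenset(avoided))


def recover(
    report: FaultReport,
    try_local_fix: Callable[[FaultReport], bool],
    server: ReconfigurationServer,
) -> Optional[Configuration]:
    """Return None if the fault was handled locally, else a new configuration."""
    if try_local_fix(report):
        return None
    # The fault cannot be addressed locally, so escalate to the server.
    return server.recompile(report)
```

In this sketch the remote node would load the returned `Configuration` and restart computation; a `None` result means the local mechanism was sufficient and the server was never contacted.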
Similar resources
Modeling and Design of Fault-Tolerant and Self-Adaptive Reconfigurable Networked Embedded Systems
Automotive, avionic, or body-area networks are systems that consist of several communicating control units specialized for certain purposes. Typically, different constraints regarding fault tolerance, availability and also flexibility are imposed on these systems. In this article, we will present a novel framework for increasing fault tolerance and flexibility by solving the problem of hardware...
Modeling and Analysis of Distributed Reconfigurable Hardware
The ability to migrate hardware processes in a network of hardware reconfigurable nodes improves the fault tolerance of these networks. The degree of fault tolerance is inherent to such networked systems and can be optimized during design time. Therefore, an efficient way of calculating the degree of fault tolerance is needed. This paper presents an approach based on satisfiability testing whic...
The Chameleon Infrastructure for Adaptive, Software Implemented Fault Tolerance
This paper presents Chameleon, an adaptive software infrastructure for supporting different levels of availability requirements in a heterogeneous networked environment. Chameleon provides dependability through the use of ARMORs—Adaptive, Reconfigurable, and Mobile Objects for Reliability. Three broad classes of ARMORs are defined: Managers, Daemons, and Common ARMORs. Key concepts that support...
On Feasibility of Adaptive Level Hardware Evolution for Emergent Fault Tolerant Communication
A permanent physical fault in communication lines usually leads to a failure. This paper studies the feasibility of evolving self-organized communication to overcome this problem. In this case, a communication protocol may emerge between blocks and can also adapt itself to environmental changes such as physical faults and defects. In spite of faults, blocks may continue to function since ...
Designing a Neuro-Sliding Mode Controller for Networked Control Systems with Packet Dropout
This paper addresses control design in networked control system by considering stochastic packet dropouts in the forward path of the control loop. The packet dropouts are modelled by mutually independent stochastic variables satisfying Bernoulli binary distribution. A sliding mode controller is utilized to overcome the adverse influences of stochastic packet dropouts in networked control system...