Fault-Tolerant Communication with Partitioned Dimension-Order Routers with Complex Faults

نویسندگان

  • Hossam Mahmoud Ahmad Fahmy
  • Salma A. Ghoneim
چکیده

ÐThe current fault-tolerant routing methods require extensive changes to practical routers such as the Cray T3D's dimension-order router to handle faults. In this paper, we propose methods to handle faults in multicomputers with dimension-order routers with simple changes to router structure and logic. Our techniques can be applied to current implementations in which the router is partitioned into multiple modules and no centralized crossbar is used. We consider arbitrarily located faulty blocks and assume only local knowledge of faults. We apply our techniques for torus networks and show that, with as few as four virtual channels per physical channel, deadlockand livelock-free routing can be provided even with multiple faults and multimodule implementation of routers. Our simulations of the proposed technique for 2D tori and mesh indicate that the performance degradation is similar to that seen in the case of cross-bar based designs previously proposed. Index TermsÐCray T3D router, dimension-order router, fault-tolerant routing, multicomputer networks, message routing, torus networks, wormhole routing.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fault-Tolerance with Multimodule Routers

The current multiprocessors such as Cray T D support interprocessor communication using partitioned dimension order routers PDRs In a PDR implemen tation the routing logic and switching hardware is par titioned into multiple modules with each module suit able for implementation as a chip This paper proposes a method to incorporate fault tolerance into such routers with simple changes to the rou...

متن کامل

Communication in Multicomputers with Nonconvex Faults

Enhancingcurrentmulticomputer routers for fault-tolerant routing with modest increase in routing complexity and resource requirements is addressed. The proposed method handles solid faults in meshes, which includes all convex faults and many practical nonconvex faults, for example, faults in the shape of L or T. As examples of the proposed method, adaptive andnonadaptive fault-tolerant routing ...

متن کامل

On Feasibility of Adaptive Level Hardware Evolution for Emergent Fault Tolerant Communication

A permanent physical fault in communication lines usually leads to a failure. The feasibility of evolution of a self organized communication is studied in this paper to defeat this problem. In this case a communication protocol may emerge between blocks and also can adapt itself to environmental changes like physical faults and defects. In spite of faults, blocks may continue to function since ...

متن کامل

CAFT: Cost-aware and Fault-tolerant routing algorithm in 2D mesh Network-on-Chip

By increasing, the complexity of chips and the need to integrating more components into a chip has made network –on- chip known as an important infrastructure for network communications on the system, and is a good alternative to traditional ways and using the bus. By increasing the density of chips, the possibility of failure in the chip network increases and providing correction and fault tol...

متن کامل

Fault Tolerant Deflecting Router with High Fault Coverage for On-chip Network

Continuous scaling of CMOS technology makes it possible to integrate a large number of heterogeneous devices that need to communicate efficiently on a single chip. For this efficient routers are needed to takes place communication between these devices. As the chip scales, the probability of both permanent and transient faults is also increasing, making Fault Tolerance (FT) a key concern in sca...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1999