An efficient fault tolerant mechanism to deal with permanent and transient failures in a network on chip

Authors

  • Muhammad Ali
  • Michael Welzl
  • Sven Hessler
  • Sybille Hellebrand
Abstract

Recent advances in silicon technology are enabling VLSI chips to accommodate billions of transistors, leading toward the integration of hundreds of heterogeneous components on a single chip. However, chip scaling poses grave problems for the current bus-based interconnect architecture, which cannot cope with the growing number of on-chip components. To remedy the inefficiency of buses, researchers have drawn on computer networks as well as parallel computing to find viable solutions for billion-transistor chips. The outcome is a novel and scalable communication paradigm for future Systems on Chip (SoCs), called Network on Chip (NoC). However, as chips scale, the probability of both permanent and transient faults also increases, making Fault Tolerance (FT) a key concern. Alpha particle emissions and Gaussian noise on channels are among the causes of transient faults in the data. In addition, electromigration of conductors, corrosion, or aging may permanently damage on-chip modules or links. This paper proposes a comprehensive solution to deal with both permanent and transient errors affecting VLSI chips. On the one hand, we present an efficient packet retransmission mechanism to deal with packet corruption or loss due to transient faults. On the other hand, we propose a deterministic routing mechanism that routes packets on alternate paths when a communication link or a router suffers a permanent failure.
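The abstract gives no implementation details, so the following is only an illustrative sketch of the two ideas it names, not the authors' design. It assumes a 2D mesh topology, dimension-ordered XY routing with a YX fallback as the deterministic alternate path, and a CRC-32 check as the trigger for retransmission. All function names and the `channel` callback are hypothetical.

```python
import zlib

def xy_path(src, dst):
    """Nodes visited by dimension-ordered XY routing on a 2D mesh (assumed topology)."""
    (x, y), (dx, dy) = src, dst
    path = [(x, y)]
    while x != dx:                      # travel along X first
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:                      # then along Y
        y += 1 if dy > y else -1
        path.append((x, y))
    return path

def yx_path(src, dst):
    """Alternate deterministic path: travel along Y first, then X."""
    swapped = xy_path(src[::-1], dst[::-1])
    return [node[::-1] for node in swapped]

def uses_failed_link(path, failed_links):
    """True if any hop of the path crosses a link marked as permanently failed."""
    return any(frozenset(hop) in failed_links for hop in zip(path, path[1:]))

def route(src, dst, failed_links):
    """Pick the XY path, or the YX path if XY crosses a failed link (None if both do)."""
    for candidate in (xy_path(src, dst), yx_path(src, dst)):
        if not uses_failed_link(candidate, failed_links):
            return candidate
    return None

def make_packet(payload: bytes) -> bytes:
    """Append a CRC-32 so the receiver can detect transient corruption."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def check_packet(packet: bytes):
    """Return the payload if the CRC matches, else None."""
    payload, crc = packet[:-4], packet[-4:]
    return payload if zlib.crc32(payload).to_bytes(4, "big") == crc else None

def send_with_retransmit(payload, channel, max_tries=3):
    """Resend until the receiver sees an intact packet or tries run out.

    `channel` is a hypothetical callable that models the link and may corrupt bytes.
    Returns (payload or None, number of attempts used).
    """
    for attempt in range(1, max_tries + 1):
        received = check_packet(channel(make_packet(payload)))
        if received is not None:
            return received, attempt
    return None, max_tries
```

In this sketch a permanent fault is modeled as an entry in `failed_links` (a set of `frozenset` node pairs), and a transient fault as a `channel` that flips bits on some transmissions. A scheme that tolerates more than one failed link per route would need a richer set of alternate paths than the two tried here.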

Similar resources

Reliability and Performance Evaluation of Fault-aware Routing Methods for Network-on-Chip Architectures (RESEARCH NOTE)

Nowadays, faults and failures are increasing especially in complex systems such as Network-on-Chip (NoC) based Systems-on-a-Chip due to the increasing susceptibility and decreasing feature sizes. On the other hand, fault-tolerant routing algorithms have an evident effect on tolerating permanent faults and improving the reliability of a Network-on-Chip based system. This paper presents reliabili...

Fault Tolerant Deflecting Router with High Fault Coverage for On-chip Network

Continuous scaling of CMOS technology makes it possible to integrate a large number of heterogeneous devices that need to communicate efficiently on a single chip. For this, efficient routers are needed to carry the communication between these devices. As the chip scales, the probability of both permanent and transient faults is also increasing, making Fault Tolerance (FT) a key concern in sca...

CAFT: Cost-aware and Fault-tolerant routing algorithm in 2D mesh Network-on-Chip

The increasing complexity of chips and the need to integrate more components into a chip have made network-on-chip an important infrastructure for on-system communication and a good alternative to traditional bus-based approaches. As chip density increases, the possibility of failure in the on-chip network increases, and providing correction and fault tol...

Graceful deadlock-free fault-tolerant routing algorithm for 3D Network-on-Chip architectures

Three-Dimensional Networks-on-Chip (3D-NoC) has been presented as an auspicious solution merging the high parallelism of Network-on-Chip (NoC) interconnect paradigm with the high-performance and lower interconnect-power of 3-dimensional integration circuits. However, 3D-NoC systems are exposed to a variety of manufacturing and design factors making them vulnerable to different faults that cause...

Efficient Fault-Tolerant Adaptive Routing under an Unconstrained Set of Node and Link Failures for Many-Core Systems-on-Chip

An online fault tolerant routing algorithm for 2D Mesh Networks-on-Chip is presented in this work. It combines an adaptive routing algorithm with neighbor fault-awareness and a new traffic-balancing metric. To be able to cope with runtime permanent and temporary failures that may result in message corruption, message loss or deadlocks, the routing algorithm is enhanced with packet retransmissio...

Journal title:
  • IJHPSA

Volume 1  Issue 

Pages  -

Publication date 2007