Redundancy Based Design and Analysis of ALU Circuit Using CMOS 180nm Process Technology for Fault Tolerant Computing Architectures

Authors

Tejinder Singh

Farzaneh Pashaie

Rajat Kumar

Abstract

As technology enters nanometre dimensions, manufacturing processes are becoming less reliable, which drastically impacts yield. Fault tolerant systems are therefore becoming more important, particularly in safety-critical applications. In this paper, we present the design and analysis of a 4-bit Arithmetic and Logical Unit (ALU) circuit designed using CMOS 180 nm process technology for fault tolerant computing architectures. Because the ALU is a functional block of the Central Processing Unit (CPU) of a computer system, it is highly recommended that the ALU block be fault free or fault tolerant. To obtain high reliability and high uptime of the system, we have used the classical Triple Modular Redundancy (TMR) technique, in which three redundant subsystems are used. We have achieved lower power dissipation with higher reliability of the ALU circuit. The voter logic and fault detection circuits are also designed and reported in this paper.

Keywords: Fault Tolerant ALU Design, Triple Modular Redundancy, 180nm Process Technology, Schematic Design

1. INTRODUCTION

In the past few years, the design of computing architectures and circuits has become remarkably complex and dense, while the role and importance of these systems have increased explosively [1,2]. As the scale of integration has increased from small/medium to large and to today's very large scale, the reliability per fundamental computing function requires dramatic improvement. Because CMOS technology continues to follow Moore's law, implementing more modules per unit chip area has become mandatory. But as we plan to shrink the technology scale further, many complications arise, such as reliability concerns, accuracy parameters, the precision of circuit results and various other circuit-level performance parameters.
As the demand for enhanced functionality increases, the complexity of systems has increased, which has raised the probability of complete system failure. Moreover, our dependence on computing systems has grown to such an extent that it has become impossible to return to less sophisticated mechanisms [1]. High reliability and uninterrupted operation are much more vital in certain applications such as spacecraft navigation, aircraft flight control and landing systems, nuclear power plants, and chemical industries. Failure or malfunction of any equipment or component in such an application can lead to disastrous effects [5]. Our main concern is therefore to have a highly reliable system in critical applications.

In order to achieve high reliability of computing systems, we adopt a fault tolerance mechanism for the Arithmetic and Logic Unit (ALU), which avoids unexpected breakdown of the system. We report the design and analysis of a complete 4-bit ALU circuit. Complementary Metal Oxide Semiconductor (CMOS) 180 nm process technology is used to design the system. The Triple Modular Redundancy (TMR) technique is applied to keep the system output fault free. In this fault tolerant mechanism, a majority voter logic module is designed to select the correct output even in the case of a failure in the system, and a disagreement detector is used to detect the faulty ALU and provide an output to the user in case any fault occurs.

http://dx.doi.org/10.12785/ijcds/040106
Tejinder Singh et al., Redundancy based Design and Analysis of ALU Circuit ..., http://journals.uob.edu.bh

The paper is organized as follows. The theory of fault tolerant systems is given in section 2, followed by the design mechanism of fault tolerant systems in section 3. The various redundancy techniques are described in section 4.
The design of the fault tolerant ALU system is given in detail in section 5, followed by results and discussion in section 6 with simulated results of the fault tolerant behaviour of the system, and finally the conclusion is given in section 7.

2. FAULT TOLERANT SYSTEM

Fault tolerance, or in simple words graceful degradation, is the property of a system that enables it to continue operating properly even when some of its internal components fail. If the quality of operation decreases at all, the decrease is proportional to the severity of the failure, in contrast to a naïvely implemented system in which even a minor failure leads to total breakdown. Fault tolerance or graceful degradation is particularly sought after in life-critical and highly available systems [15]. Several application areas need systems that maintain correct (predictable) functionality in the presence of faults: banking systems, control systems and manufacturing systems.

There are mainly three types of faults [8]:
• Permanent fault: the fault is perpetual and can be caused by physical damage or design errors.
• Intermittent fault: the fault occurs periodically and typically results from unstable device operation.
• Transient fault: the fault is often caused by external disturbances, exists for a finite length of time and is non-recurring in nature.

3. DESIGN MECHANISM

In fault tolerant designs, redundancy is used to provide the information needed to negate the effects of a failure. Basically four types of redundancy are considered: hardware, software, information and temporal (time) redundancy [3]. Hardware redundancy is perhaps the most commonly used form and can be employed in several ways [1].
Hardware redundancy consists in employing several identical circuits to perform the same computation at the same time [9]. Faults can be detected by duplication or masked by triplication, by comparing the redundant outputs through a comparator/voter [4]. Information redundancy involves adding redundant information to the original data, i.e. it is the number of bits used to transmit a message minus the number of bits of actual information in the message. Informally, it is the amount of wasted space used to transmit certain data [1]. Temporal (time) redundancy consists in forcing the system (or a subsystem) to repeat a given operation at a different time and then comparing the results with those of the previous operation to detect faults [4]. Such redundancy can tolerate transient or intermittent errors but not permanent errors, which makes it unsuitable for our study. In software redundancy, error detection and recovery are based on replicating application processes on a single or multiple computers [3].

The use of redundancy is proposed not as a replacement for, but rather as a supplement to, the two cardinal principles of reliable design [6]: one is to use the most reliable components, and the other is to use the least possible complexity consistent with the required system performance.

A. Redundancy

Redundancy serves two functions. Passive redundancy uses excess capacity to reduce the impact of component failures; one common form is the use of high-stress-handling bond wires in ICs. Active redundancy eliminates the performance decline by monitoring the performance of individual devices, and this monitoring is used in voting logic. The voting logic is linked to a switching circuit that automatically reconfigures the components.

4. REDUNDANCY TECHNIQUES

Redundancy is the provision of functional capabilities that would be unnecessary in a fault-free environment. This can consist of backup components which automatically "kick in" should one component fail. For example, large cargo trucks can lose a tire without any major consequences. They have many tires, and no one tire is critical (with the exception of the front tires, which are used to steer). The idea of incorporating redundancy in order to improve the reliability of a system was pioneered by John von Neumann in the 1950s.

Two kinds of redundancy are possible: space redundancy and time redundancy. Space redundancy provides additional components, functions, or data items that are unnecessary for fault-free operation. It is further classified into hardware, software and information redundancy, depending on the type of redundant resources added to the system. In time redundancy, the computation or data transmission is repeated and the result is compared to a stored copy of the previous result.

Int. J. Com. Dig. Sys. 4, No. 1, 53-62 (Jan-2014)

In order to tolerate defects due to manufacturing processes, we have considered the hardware-based redundancy technique [3]. Triple Modular Redundancy (TMR) is commonly used for designing dependable systems to ensure high reliability, availability and data integrity, and it has been extensively used as a building block of fault-tolerant computing architectures. The idea of this technique is simple: a TMR unit consists of three computing modules and a voter logic module. The three modules perform the same computation in parallel and their results are applied to the voter as shown in Fig. 1. If any one of the three modules fails, the other two modules can correct and mask the fault [7]. If the voter fails, however, the complete system can collapse.
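The masking argument above can be made quantitative with the standard textbook reliability expression for TMR. The sketch below is not taken from the paper: it assumes three identical modules that fail independently, each correct with probability `r_module`, in series with a voter of reliability `r_voter`. The function name and parameters are illustrative.

```python
def tmr_reliability(r_module: float, r_voter: float = 1.0) -> float:
    """Probability that a TMR unit delivers a correct output.

    Assumptions (not from the paper): identical modules, independent
    failures, and a voter in series with the 2-of-3 module group.
    """
    if not (0.0 <= r_module <= 1.0 and 0.0 <= r_voter <= 1.0):
        raise ValueError("reliabilities must lie in [0, 1]")
    # At least two of three modules correct:
    # 3*R^2*(1 - R) + R^3 = 3*R^2 - 2*R^3
    two_of_three = 3 * r_module**2 - 2 * r_module**3
    # A failed voter brings the whole unit down, hence the product.
    return r_voter * two_of_three
```

Under these assumptions, a module reliability of 0.9 gives a TMR reliability of about 0.972 with a perfect voter, while any module reliability below 0.5 makes the triplicated unit worse than a single module. The `r_voter` factor shows why the voter is a single point of failure.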
However, in a good TMR system, the voter is much more reliable than the other TMR components. The voting logic compares the outputs of all the modules and passes the majority output: if all three outputs are the same, that value becomes the final output, and if two out of three outputs are the same, the value they share becomes the final output. Consequently, if the two matching outputs are both erroneous, the erroneous value becomes the final output [9].

Figure 1. Block diagram of triple modular redundancy technique. Three similar modules feed a voter logic circuit, which detects the fault.

A. Arithmetic and Logical Unit

The processors found inside modern CPUs and graphics processing units accommodate very powerful and very complex ALUs; a single component may contain a number of ALUs. An ALU loads data from input registers, executes the operation and stores the result into output registers [13]. The design and function of an ALU may vary between processors. For example, some arithmetic and logical units only perform integer calculations, while others are designed to handle floating-point operations. Regardless of the way an ALU is designed, its primary job is to handle integer operations. Therefore, a computer's integer performance is tied directly to the processing speed of the ALU [15]. ALUs can perform the following operations:
• Integer arithmetic operations (addition, subtraction and multiplication)
• Bitwise logic operations (AND, NOT, OR, XOR)
• Bit-shifting operations (shifting a word by a specified number of bits to the left or right).

5. FAULT TOLERANT ALU DESIGN

The fundamental function of a processor is to execute sequences of instructions that are stored in main memory, which is external to the processor or central processing unit of a computer system. The processor also monitors and inspects the other system components, usually via dedicated control signals. For example, the processor directly or indirectly controls input/output (I/O) operations, viz. data transfers between primary memory and I/O devices. The processor contains various registers, which are used for the temporary storage of instructions and operands, and an arithmetic and logical unit (ALU), which executes instructions related to data processing. It is proposed that adding redundancy to the critical components of a CPU may provide a feasible alternative to the multiprocessor architecture. The ALU alone is responsible for all the computational tasks, hence the ALU should be fault tolerant in any case for reliable systems.

A fault-tolerant design enables a system to continue its specified and expected operation, possibly at a degraded or reduced level, rather than failing completely when some part of the system fails to operate. The term fault-tolerant is most commonly used for computing architectures or computing systems that are designed to remain more or less fully operational, perhaps with an increase in response time or a reduction in throughput, in the event of partial failure of the system. That is, the whole system is not stopped by hardware or software problems. An example in another field is a motor vehicle designed so that it remains drivable if one of the tires is punctured. Similarly, a fault-tolerant structure is able to retain its integrity in the presence of damage due to causes such as corrosion, fatigue, manufacturing flaws, or impact.

Triple modular redundancy is basically the replication of a component into three identical systems, where all systems or sub-modules contain the same information and perform the operation at the same time. The outputs of all three sub-modules are then voted ...

Figure 2. Block diagram of fault tolerant system. Three similar ALU circuits provide output to voter logic to select the correct output in case of any fault and disagreement detector detects the faulty ALU module.
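The arrangement of three ALUs, a majority voter and a disagreement detector described in this section can be illustrated with a small behavioural model. This is only a software sketch under stated assumptions, not the paper's CMOS schematic: the opcode names, the subset of operations, the bitwise 2-of-3 voter and the XOR-based fault injection (`fault_masks`) are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

MASK = 0xF  # 4-bit data path, as in the 4-bit ALU described in the text


def alu(op: str, a: int, b: int) -> int:
    """Behavioural 4-bit ALU; the opcode names are hypothetical."""
    a, b = a & MASK, b & MASK
    if op == "ADD":
        result = a + b
    elif op == "SUB":
        result = a - b
    elif op == "AND":
        result = a & b
    elif op == "OR":
        result = a | b
    elif op == "XOR":
        result = a ^ b
    elif op == "NOT":
        result = ~a
    else:
        raise ValueError(f"unsupported opcode: {op}")
    return result & MASK  # drop carry/borrow, keep 4 bits


def majority_vote(x: int, y: int, z: int) -> int:
    """Bitwise 2-of-3 majority: each bit follows at least two inputs."""
    return ((x & y) | (y & z) | (x & z)) & MASK


def disagreement(outputs: Sequence[int], voted: int) -> List[bool]:
    """Flag every module whose output differs from the voted output."""
    return [out != voted for out in outputs]


def tmr_alu(
    op: str, a: int, b: int, fault_masks: Sequence[int] = (0, 0, 0)
) -> Tuple[int, List[bool]]:
    """Run three ALU copies, inject faults, then vote.

    fault_masks is an illustrative fault model: each copy's output is
    XOR-ed with its mask, so a mask of 0 means a healthy module.
    """
    outputs = [alu(op, a, b) ^ (m & MASK) for m in fault_masks]
    voted = majority_vote(*outputs)
    return voted, disagreement(outputs, voted)
```

For example, `tmr_alu("ADD", 9, 5, fault_masks=(0, 0b0100, 0))` still returns the correct sum 14 and flags only the second module, whereas `fault_masks=(1, 1, 0)` makes the two faulty copies agree, so the voter passes their wrong value and the healthy third module is the one flagged. This mirrors the limitation noted earlier for two identical erroneous outputs.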




Publication date: 2014
