System Effects of Single Event Upsets

Author

  • A. M. Finn
Abstract

At the system level, SEUs in processors are controlled by fault-tolerance techniques such as replication and voting, watchdog processors, and tagged data schemes [13,16,30]. SEUs in memory subsystems are controlled by use of error control codes (ECCs) [4,17,21] and a process called scrubbing. The scrubbing process periodically reads each word in the memory. If the number of faulty digits in a word is less than or equal to the number the ECC can correct, then the digits are corrected and the word is written back to memory. If the number of faulty digits exceeds the ECC's capability, the errors cannot be corrected and the memory has failed. Fault-tolerance to memory failures requires either physical redundancy via replication or temporal redundancy via checkpoint rollback schemes. In most aerospace applications physical redundancy is undesirable because mass, volume, and power are at a premium.

The rate at which SEUs are scrubbed from memory affects the performance and reliability of the entire computer system. Infrequent scrubbing leads to an accumulation of faults and increases the probability of exceeding the ECC's capability. Conversely, frequent scrubbing uses memory cycles that might otherwise be used by the operating system or an application program. There is a recognized tradeoff between using ECCs and scrubbing or using lower density, higher power, radiation-hardened semiconductors to achieve reliability [7,32].

Previous analyses of the tradeoffs between the use of simple ECCs, the additional hardware for the ECC, failure due to that additional hardware, and the system impact have been based on simplified analytical models; detailed analytical models are intractable. This paper introduces the idea of Markov modeling for SEU effects. Markov modeling allows extrapolation of chip failure rates to the subsystem and system level, allows more sophisticated tradeoff evaluations, and permits sensitivity analyses.
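The scrubbing pass described above (read each word, correct it if the ECC can, otherwise declare a failure) can be illustrated with a minimal sketch. The code below is not taken from the paper: it assumes a toy extended Hamming (8,4) code that corrects one faulty bit and detects two, and the names `encode`, `scrub_word`, and `scrub_memory` are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (assumed toy ECC)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    bits = [0] * 8                      # index 0 holds the overall parity bit
    bits[3], bits[5], bits[6], bits[7] = d
    bits[1] = bits[3] ^ bits[5] ^ bits[7]   # parity over positions 1, 3, 5, 7
    bits[2] = bits[3] ^ bits[6] ^ bits[7]   # parity over positions 2, 3, 6, 7
    bits[4] = bits[5] ^ bits[6] ^ bits[7]   # parity over positions 4, 5, 6, 7
    bits[0] = sum(bits[1:]) & 1             # makes the total parity even
    return sum(b << i for i, b in enumerate(bits))


def scrub_word(word):
    """Return (word_after_scrub, status); status is 'clean', 'corrected' or 'failed'."""
    syndrome = 0
    for pos in range(1, 8):
        if (word >> pos) & 1:
            syndrome ^= pos             # XOR of the positions of all set bits
    overall = bin(word).count("1") & 1  # 0 for a valid codeword
    if syndrome == 0 and overall == 0:
        return word, "clean"
    if overall == 1:
        # Odd parity: treated as one faulty bit, located at position `syndrome`
        # (0 means the overall parity bit itself). Three or more faulty bits
        # would be miscorrected here; this toy code cannot tell the difference.
        return word ^ (1 << syndrome), "corrected"
    # Even parity with a nonzero syndrome: two faulty bits, beyond the ECC.
    return word, "failed"


def scrub_memory(memory):
    """One scrubbing pass: correct words in place, return addresses that failed."""
    failed = []
    for addr, word in enumerate(memory):
        memory[addr], status = scrub_word(word)
        if status == "failed":
            failed.append(addr)
    return failed


if __name__ == "__main__":
    good = encode(0b1011)
    mem = [good, good ^ (1 << 5), good ^ 0b11]   # clean, one upset, two upsets
    print(scrub_memory(mem), [w == good for w in mem])
```

A word that accumulates two upsets between passes is reported as failed, which is why the scrubbing interval matters in the tradeoff discussed above.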
The remainder of this paper is organized in three parts. Section 2 provides background about the SEU problem, expected SEU failure rates, and SEU control techniques. Section 3 introduces the use of Markov modeling techniques for memory subsystems and develops one model in detail. Section 4 presents the modeling results and generalizes the applicability of the modeling techniques.

Single Event Upsets (SEUs) pose a serious threat to computer reliability and longevity. SEU effects are found at sea level, in airborne avionics, and in space. At the system level, SEUs in processors are controlled by replication and voting, watchdog processors, and tagged data schemes. SEUs in memory subsystems are controlled by periodically scrubbing words protected by an Error Control Code (ECC). The rate of memory scrubbing affects the performance and reliability of the entire computer system. There are tradeoffs between using radiation hardened semiconductors, scrubbing rates, and ECC capabilities. Previous tradeoff analyses have used simplified analytic models. The system effects of SEUs may be evaluated by Markov modeling. Markov modeling has been extensively used for modeling processor redundancy; here it is also used for memory subsystems. A modeling methodology is presented which extrapolates chip transient and permanent failure rates to the system level, allows evaluation of alternative ECCs, and permits sensitivity analyses. The results for an example memory subsystem show that scrubbing effectiveness may be relatively insensitive to scrubbing rate.
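To make the Markov modeling idea concrete, here is a minimal sketch of a three-state continuous-time chain for a single memory word. It is not the model developed in the paper. It assumes a single-error-correcting ECC, independent bit upsets at a constant rate, and scrubbing approximated as an exponential repair with rate `scrub_rate`; the function name, the 39-bit default word size, and the numbers in the usage loop are all illustrative assumptions.

```python
import math


def word_failure_probability(bit_upset_rate, scrub_rate, t, word_bits=39):
    """Probability that a word holds an uncorrectable error by time t.

    Assumed states: 0 faulty bits, 1 faulty bit, failed (absorbing).
      0 -> 1       at rate a = word_bits * bit_upset_rate
      1 -> failed  at rate b = (word_bits - 1) * bit_upset_rate
      1 -> 0       at rate scrub_rate (scrubbing modeled as exponential)
    All rates must use the same time unit as t.
    """
    if bit_upset_rate <= 0 or scrub_rate < 0 or word_bits < 2:
        raise ValueError("need bit_upset_rate > 0, scrub_rate >= 0, word_bits >= 2")
    a = word_bits * bit_upset_rate
    b = (word_bits - 1) * bit_upset_rate
    s = a + b + scrub_rate
    # Roots of r^2 + s*r + a*b = 0 (eigenvalues of the transient states).
    # The discriminant s^2 - 4ab is expanded to avoid cancellation.
    disc = (a - b) ** 2 + scrub_rate * (2 * (a + b) + scrub_rate)
    r2 = -(s + math.sqrt(disc)) / 2
    r1 = a * b / r2                     # product of the roots equals a*b
    c1 = r2 / (r2 - r1)
    c2 = -r1 / (r2 - r1)
    # Reliability is c1*exp(r1*t) + c2*exp(r2*t) with c1 + c2 = 1, so the
    # failure probability can be written with expm1 to keep precision.
    return -(c1 * math.expm1(r1 * t) + c2 * math.expm1(r2 * t))


if __name__ == "__main__":
    upset_rate = 1e-6      # upsets per bit per hour (illustrative value)
    mission = 10_000.0     # hours (illustrative value)
    for scrubs_per_hour in (0.0, 0.01, 1.0, 100.0):
        p = word_failure_probability(upset_rate, scrubs_per_hour, mission)
        print(f"scrub rate {scrubs_per_hour:>6} /h -> word failure probability {p:.3e}")
```

Sweeping `scrub_rate` in this way is one simple form of the sensitivity analysis the abstract mentions; a realistic model would also need permanent chip faults and periodic rather than exponential scrubbing.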

Similar resources

Design of a ‘Single Event Effect’ Mitigation Technique for Reconfigurable Architectures

Commercial off-the-shelf (COTS) reconfigurable System-on-Chip architectures are becoming popular for applications where high dependability, performance and low costs are mandatory constraints, such as space applications. We present a unique SEE (Single Event Effect) mitigation technique based upon Temporal Data Sampling and Weighted Voting for synchronous circuits and configuration ...

Reliability of Programmable Input/Output Pins in the Presence of Configuration Upsets

Field programmable gate arrays (FPGAs) are an attractive alternative for space-based remote sensing applications. However, SRAM-based FPGAs are sensitive to radiation-induced single-event upsets within the configuration memory. Such configuration upsets may change the logic, routing, and operating modes of a user FPGA design. Upsets within the configuration of an I/O block are especially troubles...

Single Event Upsets in Implantable Cardioverter Defibrillators

Single event upsets (SEUs) have been observed in implantable cardiac defibrillators. The incidence of SEUs is well modeled by upset rate calculations attributable to the secondary cosmic ray neutron flux. The effect of recent interpretations of the shape of the heavy ion cross-section curve on neutron burst generation rate calculations is discussed. The model correlates well with clinical experi...

An improved SRAM cell design for tolerating radiation-induced single-event effects

This paper presents an improved design of a radiation-hardened static random access memory (SRAM) cell. The memory cell is designed to be tolerant to transient single-event upsets by taking advantage of the fact that for the same area, the surface mobility of NMOS transistors is greater than that of PMOS transistors. The results show that the proposed design is able to withstand single-event ups...

Radiation Effects on the On-line Monitoring System of a Hadrontherapy Center

Introduction: Today, there is a growing interest in the use of hadrontherapy as an advanced radiotherapy technique. Hadrontherapy is considered a promising tool for cancer treatment, given its high radiobiological effectiveness and high accuracy of dose deposition due to the physical properties of hadrons. However, new radiation modalities of dose delivery and on-line beam monitoring play crucia...

Measurement of Distance-dependent Multiple Upsets of Flip-Flops in 65nm CMOS Process

We measured neutron-induced SEUs (Single Event Upsets) and MCUs (Multiple Cell Upsets) on FFs in a 65 nm bulk CMOS process. Measurement results show that the maximum MCU/SEU ratio is 30.6% and that it decreases exponentially with the distance between latches on FFs.

Publication date: 2003