Reliability-Constrained Processor Performance Optimization via Design Parameter Selection
Abstract
Current high-performance processors are increasingly susceptible to soft errors, for two reasons. First, electrical noise, which typically originates from large power supplies, strong radiation, or high-energy particle strikes [3], may invert logic bits in processor structures and thereby introduce transient faults (equivalently, soft errors) into the system. Second, processor feature size and supply voltage keep shrinking while clock frequency and on-chip transistor density increase rapidly. Together, these factors make current processors extremely vulnerable to soft errors.
The effective soft error rate (SER) is the product of the raw SER and the probability that a soft error produces a visible error in the program output. The former is determined by circuit properties, e.g. the critical charge of a cell, while the latter is characterized at the microarchitectural level by the Architectural Vulnerability Factor (AVF) [2], which was proposed based on the observation that a large number of raw soft errors are masked at the architectural level. A common approach to calculating a processor structure's AVF is Architecturally Correct Execution (ACE) analysis [2]: count the number of bits that are required for correct execution, then divide by the total number of bits in the structure. The AVF is therefore commonly used to estimate a processor's robustness to soft errors.
In this work, we propose a method for selecting processor design parameters for reliability and high performance at the pre-silicon stage. We analyze the design choices optimized for performance, for reliability to soft errors, and for their tradeoff, by exploring a large design space consisting of several key configuration parameters for both single-core and multi-core processors. We propose two techniques for configuring design parameters to optimize processor performance under reliability constraints.
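The definitions of AVF and effective SER given above can be written compactly as follows. This is only a sketch: the symbols are our own notation rather than the paper's, with B_s the total number of bits in structure s and B^ACE_s the number of its bits required for correct execution (in ACE analysis this count is typically averaged over the execution cycles). The numbers in the comments are hypothetical and serve only to illustrate the arithmetic.

```latex
\[
\mathrm{AVF}_s = \frac{B^{\mathrm{ACE}}_s}{B_s},
\qquad
\mathrm{SER}^{\mathrm{eff}}_s = \mathrm{SER}^{\mathrm{raw}}_s \times \mathrm{AVF}_s
\]
% Hypothetical numbers, for illustration only:
% B_s = 1024 bits, B^ACE_s = 256 bits  =>  AVF_s = 256 / 1024 = 0.25
% SER^raw_s = 100 FIT                  =>  SER^eff_s = 100 x 0.25 = 25 FIT
```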
In the first technique, given a small number of sampled simulation results, we characterize the design space using the Patient Rule Induction Method (PRIM) [1] to generate a set of selective rules on key design parameters. Applying these rules to the design space effectively identifies the configurations that optimize a given metric, e.g. the AVF. This technique provides computer architects with useful guidelines for designing reliable processors while achieving high performance.
The second technique for selecting configurations under different optimization goals is heuristic Pareto frontier analysis on a reduced design space, namely Subspace Pareto Frontier Identification (SPFI). A predictive model is trained using the subset of parameters that turn out to be important to the response. We then perform predictions only for the points in a significantly reduced design space to identify the performance optima under different vulnerability constraints. This mechanism avoids exhaustively evaluating the original large design space, thereby avoiding large prediction overheads.
In summary, the main contributions of this paper are as follows:
• Reliable processor design parameter selection: We extract a region of the design space by applying a set of selective rules on the parameters. The identified region is oriented toward optimizing the AVF, so the generated rules serve as guidelines for reliable single-core and multi-core processor design.
• Optimizing holistic reliability: We quantitatively show that reducing the AVF of a single structure may increase the vulnerability of other parts of the core. Similarly, reducing the AVF of one core may affect another core's AVF. This motivates holistic reliability optimization. Our method can also generate rules for global reliability optimization, especially for the shared resources of multi-core processors.
• Balancing reliability and performance: We quantitatively demonstrate that, for some individual structures and for the entire processor, merely minimizing vulnerability significantly degrades performance. Optimizing performance and reliability simultaneously tends to mitigate this imbalance.
• Heuristic reliability-constrained performance optima identification: By selecting a few important design variables, we construct a new, small design space for evaluation, thus avoiding exhaustive prediction over the original large design space. The experimental results demonstrate the accuracy of our method.
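The idea of identifying performance optima under vulnerability constraints can be illustrated with a minimal Python sketch. It is not the paper's SPFI procedure: it assumes that predicted performance and AVF values are already available for every point of a small, reduced design space, and the names `Point`, `POINTS`, `pareto_frontier`, and `best_under_constraint`, as well as all numeric values, are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Point(NamedTuple):
    config: str  # hypothetical label for one design configuration
    ipc: float   # predicted performance (higher is better)
    avf: float   # predicted vulnerability (lower is better)


# Hypothetical predictions for a reduced design space (illustrative only).
POINTS = [
    Point("A", 1.0, 0.10),
    Point("B", 1.4, 0.20),
    Point("C", 1.2, 0.25),
    Point("D", 1.8, 0.35),
    Point("E", 1.7, 0.40),
]


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, ordered by increasing AVF."""
    frontier = []
    best_ipc = float("-inf")
    # Visit points from least to most vulnerable; on AVF ties, fastest first.
    for p in sorted(points, key=lambda q: (q.avf, -q.ipc)):
        # Keep a point only if it beats every less vulnerable point on speed.
        if p.ipc > best_ipc:
            frontier.append(p)
            best_ipc = p.ipc
    return frontier


def best_under_constraint(points: List[Point], max_avf: float) -> Optional[Point]:
    """Return the fastest point whose AVF does not exceed max_avf, if any."""
    feasible = [p for p in points if p.avf <= max_avf]
    return max(feasible, key=lambda q: q.ipc) if feasible else None


if __name__ == "__main__":
    print([p.config for p in pareto_frontier(POINTS)])
    print(best_under_constraint(POINTS, max_avf=0.30))
```

In this toy data set, configurations C and E are dominated (another point is both faster and less vulnerable), so only the remaining points need to be considered when sweeping the vulnerability constraint.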
Similar resources
Design configuration selection for hard-error reliable processors via statistical rules
Lifetime reliability is becoming a first-order concern in processor manufacturing in addition to conventional design goals including performance, power consumption and thermal features since semiconductor technology enters the deep submicron era. This requires computer architects to carefully examine each design option and evaluate its reliability, in order to prolong the lifetime of the target...
Introduction of a New Selection Parameter in Genetic Algorithm for Constrained Reliability Design Problems
In this article we propose to introduce a new selection parameter in Genetic Algorithms (GAs) for a class of constrained reliability design problems. Our work demonstrates two major points. The first one is that the populations are quickly included in the space of the feasible solutions for a sufficiently large selection of parameter value. The second one is that the value of the selection para...
Reliable Software for Unreliable Hardware - A Cross-Layer Approach
2) The Instruction Error Masking Index estimates the probability that an error at an instruction will ultimately be masked until the final program output, i.e. does not become visible at the application output and therefore is denoted as ‘masked’. 3) In case the error is not masked, the Error Propagation Index estimates how many outputs will be affected by the unmasked error. These instruct...
AIAA 99-4084 Parameter Optimization via Genetic Algorithm of Fuzzy Controller for Autonomous Airvehicle
In this paper, an optimal controller for the longitudinal channel of an autonomous helicopter model is designed by blending together two artificial intelligence techniques, genetic algorithms and fuzzy control. An evaluation index that captures the complex, constrained, multiple objective character of the problem was built based on several design requirements expressed in terms of the time resp...
Ore extraction and blending optimization model in poly-metallic open pit mines by chance constrained one-sided goal programming
Determining the sequence of ore extraction is one of the most important problems in annual mine production scheduling. Production scheduling affects mining performance, especially in a poly-metallic open pit mine, given the operational and physical constraints mandated by high levels of reliability in relation to the obtained actual results. One of the important operational con...