Self-Rejuvenation - an Effective Way to High Availability
Abstract
Most computer users know that one of the most effective ways to recover from faults such as system crashes and performance degradation is simply to reset or reboot the computer. Although these methods can be expensive in terms of downtime, they are frequently the most effective. Since commercial users in most cases cannot afford such a restart, they tend to use periodic rejuvenation, i.e., restarting processes, programs, or entire systems preventively at a time appropriate for the given application. We propose self-rejuvenation, i.e., automatic rejuvenation based on failure prediction by means of modeling with Markov chains and Universal Basis Functions (UBF). Once our models have determined the severity of a potential failure, the rejuvenation procedure may begin with a process restart, an application restart, a system reboot, checkpointing, rollback to a recovery point, or fail-over by reloading and recomputing the application on another computer. There is a strong correlation between failures and system dynamics. We show how system models can be constructed by analyzing error logs and ‘fever charts’ derived from system logs. Furthermore, we demonstrate the quality of our models on real-system data. We show that prediction-triggered self-rejuvenation is an effective way to reduce unavailability by an order of magnitude.
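The abstract does not give the details of the prediction models, so the following is only a minimal sketch of the general idea of prediction-triggered rejuvenation using a plain discrete-time Markov chain. The state names, the example log, the probability thresholds, and the function names are all hypothetical, and the UBF part of the approach is not shown.

```python
from collections import Counter

# Hypothetical system states, e.g. derived from error-log patterns.
STATES = ("healthy", "degraded", "failed")


def estimate_transitions(sequence, states=STATES):
    """Estimate transition probabilities from an observed state sequence."""
    counts = Counter(zip(sequence, sequence[1:]))
    matrix = {}
    for src in states:
        total = sum(counts[(src, dst)] for dst in states)
        if total == 0:
            # No outgoing transitions observed: assume the state persists.
            matrix[src] = {dst: 1.0 if dst == src else 0.0 for dst in states}
        else:
            matrix[src] = {dst: counts[(src, dst)] / total for dst in states}
    return matrix


def failure_probability(matrix, start, horizon, failed="failed"):
    """Probability of reaching `failed` within `horizon` steps from `start`.

    The failed state is treated as absorbing for this calculation.
    """
    dist = {state: 0.0 for state in matrix}
    dist[start] = 1.0
    for _ in range(horizon):
        nxt = {state: 0.0 for state in matrix}
        for src, mass in dist.items():
            if src == failed:
                nxt[failed] += mass
                continue
            for dst, prob in matrix[src].items():
                nxt[dst] += mass * prob
        dist = nxt
    return dist[failed]


def choose_action(prob):
    """Map predicted failure probability to an action (thresholds are made up)."""
    if prob >= 0.5:
        return "failover"
    if prob >= 0.2:
        return "reboot"
    if prob >= 0.05:
        return "restart_process"
    return "none"


# Hypothetical state sequence extracted from an error log.
log = ["healthy", "healthy", "degraded", "failed", "healthy",
       "healthy", "healthy", "degraded", "degraded", "failed"]
model = estimate_transitions(log)
risk = failure_probability(model, start="degraded", horizon=1)
print(choose_action(risk))
```

In this sketch the severity of a predicted failure is reduced to a single probability; a real deployment would presumably also weigh the downtime cost of each rejuvenation action.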
Similar resources
Availability Models for Virtualized Systems with Rejuvenation
As one of the core technologies of software rejuvenation, analytical models provide a decision-making basis for implementing rejuvenation. This paper builds analytic models using stochastic reward nets with three different rejuvenation policies: non-rejuvenation, time-based rejuvenation, and time and load-based delay rejuvenation, and presents how the system transits from one state into another. The re...
Availability Analysis and Improvement of Software Rejuvenation Using Virtualization
Availability of business-critical application servers is an issue of paramount importance that has received special attention from industry and academia. To improve the availability of application servers, we have conducted a study of virtualization technology and software rejuvenation that follows a proactive fault-tolerant approach to counteract the software aging problem. We present Mar...
On the Analysis of Software Rejuvenation Policies
Software rejuvenation is a technique for software fault tolerance which involves occasionally stopping the executing software, "cleaning" the "internal state" and restarting. This cleanup is done at desirable times during execution on a preventive basis so that unplanned failures, which result in higher costs compared to planned stopping, are avoided. Since during rejuvenation, the software is ...
A Method for Evaluation of Selected Quality Properties in Rejuvenation Systems using Markov Model
Software fault-tolerance techniques have been widely used in computing systems to achieve a high level of quality. Rejuvenation, a modern software fault-tolerance technique, has attracted a large number of researchers in the software engineering area. Evaluating the effectiveness and feasibility of this technique becomes extremely important in selecting, comparing and applying it in actual software s...
A proactive approach towards always-on availability in broadband cable networks
In this paper, we propose a high availability design of a Cable Modem Termination System (CMTS) cluster system based on the software rejuvenation technique. This proactive system maintenance technique aims to reduce system outages and the associated downtime cost due to the ‘software aging’ phenomenon. Different rejuvenation policies are studied from the perspectives of design, implementat...