Reliability Optimization Models for Fault-Tolerant Distributed Systems
Abstract
This paper presents four models that demonstrate techniques for optimizing software and hardware reliability in fault-tolerant distributed systems. The models help find the optimal system structure from basic reliability and cost information about the available software and hardware components. Each model suits a distinct set of conditions, and all four maximize reliability subject to cost constraints. The Simulated Annealing optimization algorithm is used to demonstrate these techniques because it applies flexibly to various problem types and constraints and is efficient in computation time. It provides satisfactory reliability results while meeting the constraints.
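As a rough illustration of the kind of search the abstract describes, the sketch below applies simulated annealing to a small redundancy-allocation problem: choose how many copies of each component to deploy so that system reliability is maximized without exceeding a cost budget. The abstract does not specify the four models, so everything here is an assumption made for illustration: the series-of-parallel-subsystems structure, the component reliabilities and costs, the budget, the cooling schedule, and all function names are hypothetical and not taken from the paper.

```python
import math
import random

# Hypothetical data (not from the paper): reliability and unit cost of one
# component copy for each of three subsystems connected in series.
RELIABILITY = [0.90, 0.85, 0.95]
COST = [4.0, 3.0, 6.0]
BUDGET = 40.0      # assumed cost constraint
MAX_COPIES = 5     # assumed upper bound on redundancy per subsystem


def system_reliability(copies):
    """Reliability of a series system whose subsystems use parallel redundancy."""
    r = 1.0
    for rel, n in zip(RELIABILITY, copies):
        r *= 1.0 - (1.0 - rel) ** n  # subsystem fails only if all n copies fail
    return r


def total_cost(copies):
    """Total cost of the chosen number of copies."""
    return sum(c * n for c, n in zip(COST, copies))


def anneal(seed=0, steps=5000, t_start=0.1, t_end=1e-4):
    """Simulated annealing over redundancy levels; infeasible moves are rejected."""
    rng = random.Random(seed)
    current = [1] * len(COST)  # one copy per subsystem is feasible for this data
    best = list(current)
    for k in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        # Neighbor move: add or remove one copy in a random subsystem.
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] += rng.choice((-1, 1))
        if not 1 <= candidate[i] <= MAX_COPIES:
            continue
        if total_cost(candidate) > BUDGET:
            continue  # enforce the cost constraint
        delta = system_reliability(candidate) - system_reliability(current)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current = candidate
            if system_reliability(current) > system_reliability(best):
                best = list(current)
    return best, system_reliability(best), total_cost(best)


if __name__ == "__main__":
    copies, reliability, cost = anneal(seed=1)
    print(copies, round(reliability, 6), cost)
```

Rejecting infeasible neighbors outright is one simple way to handle the cost constraint; a penalty term added to the objective is a common alternative and may explore the search space more freely.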
Similar resources
Action Models: A Reliability Modeling Formalism for Fault-Tolerant Distributed Computing Systems
Modern-day computing system design and development is characterized by increasing system complexity and ever-shortening time to market. For modeling techniques to be deployed successfully, they must conveniently deal with complex system models and must be quick and easy for non-specialists to use. In this paper we introduce “action models,” a modeling formalism that tries to achieve the above g...
Reliability and Performance Evaluation of Fault-aware Routing Methods for Network-on-Chip Architectures (RESEARCH NOTE)
Faults and failures are increasing, especially in complex systems such as Network-on-Chip (NoC) based Systems-on-a-Chip, owing to increasing susceptibility and decreasing feature sizes. Fault-tolerant routing algorithms, in turn, have an evident effect on tolerating permanent faults and improving the reliability of a Network-on-Chip based system. This paper presents reliabili...
Dependability Evaluation of Fault Tolerant Architectures in Distributed Industrial Control Systems Using Petri Nets
Modern distributed industrial control systems need improvements in their dependability. In this paper we study different fault tolerant architectures for the nodes of these systems and present three different alternatives in order to develop fault tolerant nodes. Also, in order to evaluate their dependability we present theoretical models of each one, based on Petri nets, and the results obtain...
Reliability and Timeliness Analysis of Fault-tolerant Distributed Publish/Subscribe Systems
The distributed publish/subscribe paradigm is a powerful data dissemination paradigm that offers both scalability and flexibility for time-sensitive applications. However, its high expressiveness makes it difficult to analyze or predict the performance of publish/subscribe systems, such as event delivery probability and end-to-end delivery delay, especially when the publish/subscribe ...
Effective Fault Handling Algorithm for Load Balancing Using Ant Colony Optimization in Cloud Computing
Cloud computing is an emerging technology in distributed computing. It is a collection of interconnected virtual machines that facilitates a pay-per-use model according to the demand and requirements of the user. The primary aim of cloud computing is to provide efficient access to remote and geographically distributed resources without losing the property of reliability. In order to make these virtual m...