Experience with Rules-Based Programming for Distributed, Concurrent, Fault-Tolerant Code
Abstract
This paper describes how a rules-based approach allowed us to solve a broad class of challenging distributed system problems in the RAMCloud storage system. In the rules-based approach, behavior is described with small sections of code that trigger independently based on system state; this provides a clean separation between the deterministic and nondeterministic parts of an algorithm. To simplify the implementation of rules-based modules, we developed a task abstraction for information hiding and complexity management, pools for grouping tasks and minimizing the cost of rule evaluation, and a polling-based asynchronous RPC system. The rules-based approach is a special case of an event-based state machine, but it encourages a cleaner factoring of code.
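The abstract's three ingredients (rules that fire on state, tasks grouped in a pool, and RPCs that are polled rather than awaited) can be illustrated with a minimal Python sketch. Every name here (`FakeRpc`, `ReplicateTask`, `Pool`, `apply_rules`) is hypothetical and chosen for illustration; this is not RAMCloud's API, and the RPC is simulated with a poll counter and scripted outcomes.

```python
class FakeRpc:
    """Hypothetical polling-based RPC: completes after a fixed number of polls."""

    def __init__(self, polls_needed, succeeds):
        self.polls_left = polls_needed
        self.succeeds = succeeds

    def is_ready(self):
        # Each poll advances the simulated network by one step; it never blocks.
        if self.polls_left > 0:
            self.polls_left -= 1
        return self.polls_left == 0


class ReplicateTask:
    """Hypothetical task: get one block acknowledged, retrying on failure."""

    def __init__(self, outcomes):
        self.outcomes = list(outcomes)  # scripted RPC results (demo assumption)
        self.rpc = None
        self.acked = False
        self.attempts = 0

    def is_done(self):
        return self.acked

    def apply_rules(self):
        # Rule 1: not acknowledged and nothing outstanding -> issue an RPC.
        if not self.acked and self.rpc is None:
            self.rpc = FakeRpc(polls_needed=2, succeeds=self.outcomes.pop(0))
            self.attempts += 1
            return
        # Rule 2: the outstanding RPC has finished -> record its result.
        # On failure, the state again matches Rule 1, which retries.
        if self.rpc is not None and self.rpc.is_ready():
            self.acked = self.rpc.succeeds
            self.rpc = None


class Pool:
    """Evaluates rules only for tasks that still have work to do."""

    def __init__(self):
        self.active = []

    def schedule(self, task):
        self.active.append(task)

    def run(self, max_passes=100):
        passes = 0
        while self.active and passes < max_passes:
            for task in list(self.active):
                task.apply_rules()
                if task.is_done():
                    self.active.remove(task)
            passes += 1
        return passes


if __name__ == "__main__":
    task = ReplicateTask(outcomes=[False, True])  # first attempt fails
    pool = Pool()
    pool.schedule(task)
    pool.run()
    print(task.acked, task.attempts)
```

In this sketch the retry needs no dedicated error-handling path: a failed RPC simply leaves the task in a state where the first rule applies again, which is one way to read the abstract's separation of deterministic and nondeterministic parts.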
Similar resources
Toward Common Patterns for Distributed, Concurrent, Fault-Tolerant Code
There are no widely accepted design patterns for writing distributed, concurrent, fault-tolerant code. Each programmer develops her own techniques for writing this type of complex software. The use of a common pattern for fault-tolerant programming has the potential to produce correct code more quickly and increase shared understanding between developers. We describe rules, tasks, and pools, pa...
An approach to fault detection and correction in design of systems using of Turbo codes
We present an approach to the design of fault-tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes in order to obtain flexibility in the design of error-correcting codes. Code-combining techniques are very effective, and turbo codes are one such code. The Algorithm-based fault tolerance techniques that to detect errors rely on the c...
Concurrent C: real-time programming and fault tolerance
Concurrent C is an upward-compatible parallel extension of C which runs on a variety of uniprocessors and multiprocessors. A Concurrent C program consists of a set of processes which execute in parallel and interact with each other by sending messages. Fault-Tolerant (FT) Concurrent C, an extension of Concurrent C, is a tool for writing fault-tolerant distributed programs, based on the replicat...
Implementing Coordinated Exception Handling for Distributed Object-Oriented Systems with AspectJ
Exception handling is a very popular technique for incorporating fault tolerance into software systems. However, its use for structuring concurrent, distributed systems is hindered by the fact that the exception handling models of many mainstream object-oriented programming languages are sequential. In this paper we present an aspect-based framework for incorporating concurrent exception handli...
An occam-pi Implementation of a Verified Distributed Robust Annealing Algorithm
Significant additions have recently been made to the occam concurrent programming language. The new occam-π now supports, among other features, mobile channels, mobile processes, shared channels, and dynamic forking of new concurrent processes. These features should greatly enhance the ability of occam to precisely and easily implement complex concurrent applications. We have recently evaluated...