SPARE: Replicas on Hold

Authors

  • Tobias Distler
  • Ivan Popov
  • Wolfgang Schröder-Preikschat
  • Hans P. Reiser
  • Rüdiger Kapitza
Abstract

Despite numerous improvements in the development and maintenance of software, bugs and security holes exist in today's products, and malicious intrusions happen frequently. While this is a general problem, it applies in particular to web-based services. However, Byzantine fault-tolerant (BFT) replication and proactive recovery offer a powerful combination to tolerate and overcome these kinds of faults, thereby enabling long-term service provision. BFT replication is commonly associated with the overhead of 3f + 1 replicas to handle f faults. Using a trusted component, some previous systems were able to reduce the resource cost to 2f + 1 replicas. In general, adding support for proactive recovery further increases the resource demand. We believe this enormous resource demand is one of the key reasons why BFT replication is not commonly applied and is considered unsuitable for web-based services. In this paper we present SPARE, a cloud-aware approach that harnesses virtualization to reduce the resource demand of BFT replication and to provide efficient support for proactive recovery. In SPARE, we focus on the main source of software bugs and intrusions; that is, the services and their associated execution environments. This approach enables us to restrict replication and request execution to only f + 1 replicas in the fault-free case while rapidly activating up to f additional replicas by utilizing virtualization in case of timing violations and faults. For an instant reaction, we keep spare replicas that are periodically updated in a paused state. In the fault-free case, these passive replicas require far fewer resources than active replicas and aid efficient proactive recovery.
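The replica counts quoted in the abstract can be compared with a few lines of arithmetic. The sketch below is only an illustration of those formulas, not code from the paper: the names ReplicaBudget and replica_budgets are hypothetical, and the SPARE entry assumes the full f passive replicas are kept on hold (the abstract says "up to f").

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaBudget:
    """Replica counts for one replication scheme (illustrative only)."""

    scheme: str
    active: int   # replicas that execute requests in the fault-free case
    passive: int  # paused spare replicas, activated only on demand

    @property
    def total(self) -> int:
        return self.active + self.passive


def replica_budgets(f: int) -> list[ReplicaBudget]:
    """Return the replica counts named in the abstract for tolerating f faults."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return [
        # Commonly cited overhead of BFT replication.
        ReplicaBudget("traditional BFT", active=3 * f + 1, passive=0),
        # Reduction reported for systems that rely on a trusted component.
        ReplicaBudget("BFT with trusted component", active=2 * f + 1, passive=0),
        # SPARE: f + 1 active replicas; assumption: exactly f passive spares.
        ReplicaBudget("SPARE", active=f + 1, passive=f),
    ]


if __name__ == "__main__":
    for budget in replica_budgets(1):
        print(f"{budget.scheme}: {budget.active} active, {budget.passive} passive")
```

For f = 1 this gives 4 active replicas for traditional BFT, 3 with a trusted component, and 2 active plus 1 passive for SPARE. The total for SPARE still reaches 2f + 1, but only f + 1 of those replicas execute requests while no fault or timing violation is detected.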

Similar resources

Practical Byzantine Fault Tolerance Using Fewer than 3f+1 Active Replicas

Byzantine fault tolerant state machine replication (BFT-SMR) is a foundation for implementations of highly reliable services. Existing algorithms for BFT-SMR require at least 3f + 1 active replicas to tolerate f faulty replicas. We show that BFT-SMR can be achieved with fewer than 3f + 1 active replicas, as long as standby spare replicas are available, such that the number of active replicas plus...


Practical Intrusion-tolerance in the Cloud

Byzantine fault tolerant (BFT) replication is commonly associated with the overhead of 3f + 1 replicas to handle f faults. We believe this large resource demand is one of the key reasons why BFT replication is not commonly applied. We present Spare, an approach that harnesses virtualization support as typically found in cloud-computing environments to reduce the resource demand of BFT replicatio...


Meeting Correlated Spare Part Demands with Optimal Transshipments

This paper studies spare part transshipments between two service part facilities whose demands are correlated. Transshipments are used to reduce the severity of part stockouts. Facilities are run by an inventory manager (IM) who minimizes replenishment, transshipment, and inventory costs. We show that the optimal transshipment policy is an inventory hold-back type; if the part inventory at a facil...


History-Based Harvesting of Spare Cycles and Storage in Large-Scale Datacenters

An effective way to increase utilization and reduce costs in datacenters is to co-locate their latency-critical services and batch workloads. In this paper, we describe systems that harvest spare compute cycles and storage space for co-location purposes. The main challenge is minimizing the performance impact on the services, while accounting for their utilization and management patterns. To ov...


1-Safe Algorithms for Symmetric Site Configurations

In order to provide database availability in the presence of node and site failures, traditional 1-safe algorithms disallow primary and hot standby replicas to be located at the same site. This means that the failure of a single primary node must be handled like a failure of the entire primary site. Furthermore, this excludes symmetric site configurations, where the primary replicas are located ...


Publication date: 2011