Search results for: coordinated checkpointing
Number of results: 48092
This short paper deals with parallel scientific applications using non-blocking and periodic coordinated checkpointing to enforce resilience. We provide a model and detailed formulas for total execution time and consumed energy. We characterize the optimal period for both objectives, and we assess the range of time/energy trade-offs to be made by instantiating the model with a set of realistic ...
This report deals with some aspects of distributed recovery. The report is divided into multiple parts, each part introducing a problem and a solution. The intent of this report is to present a medley of preliminary ideas; more detailed treatment may be presented elsewhere. The report deals with the following problems: A single processor failure tolerance scheme based on the distributed recover...
A distributed coordinated checkpointing algorithm is shown. A consistent global checkpoint is a set of states in which no message is recorded as received in one process and as not yet sent in another process. This algorithm obtains a consistent global checkpoint for any checkpoint initiation by any process. Under Chandy and Lamport’s assumption that one consistent global checkpoint is obtained ...
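The consistency condition quoted in this abstract can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the `Message` type, the `is_consistent` function, and the per-process `sent`/`received` sets are hypothetical names introduced only for illustration, and the sketch assumes each local checkpoint records the IDs of the messages it has sent and received.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical message record: an ID plus sender and receiver process names."""
    msg_id: int
    sender: str
    receiver: str


def is_consistent(sent, received, messages):
    """Return True if the global checkpoint contains no orphan message.

    sent[p]     -- set of message IDs recorded as sent in p's local checkpoint
    received[p] -- set of message IDs recorded as received in p's local checkpoint
    A message recorded as received but not recorded as sent violates consistency.
    """
    for m in messages:
        recorded_received = m.msg_id in received.get(m.receiver, set())
        recorded_sent = m.msg_id in sent.get(m.sender, set())
        if recorded_received and not recorded_sent:
            return False
    return True


# Example: message 1 travels from process P to process Q.
messages = [Message(1, "P", "Q")]

# Sent and received are both recorded: consistent.
both_recorded = is_consistent({"P": {1}}, {"Q": {1}}, messages)

# Recorded as sent but not yet received (in transit): still consistent.
in_transit = is_consistent({"P": {1}}, {"Q": set()}, messages)

# Recorded as received but not as sent (orphan): inconsistent.
orphan = is_consistent({"P": set()}, {"Q": {1}}, messages)
```

Under these assumptions, only the third case breaks the condition; a message that is merely in transit does not.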
Incremental checkpointing, which is intended to minimize checkpointing overhead, saves only the modified pages of a process. This means that in incremental checkpointing, the time consumed for checkpointing varies according to the number of modified pages. Thus, efficient checkpointing intervals have to be determined at run time. In this paper, we present an efficient and adapti...
Numerous services and applications have been developed to monitor anomalies or collect various sensing information in large-scale monitoring areas using drones. Nonetheless, interruptions of drone missions occasionally occur due to network errors, low battery levels, or physical defects such as damage to the rotor or propeller. Checkpointing is a technique that periodically saves a system’s state, allowing ...
In this paper, we have presented an efficient non-blocking coordinated checkpointing algorithm for distributed systems. The distinct advantages of the proposed algorithm are the following. It produces a consistent set of checkpoints without the overhead of taking temporary checkpoints; the algorithm also ensures that only a few processes are required to take checkpoints in any execution; ...
This paper considers the reliability of software Distributed Shared Memory systems where the unit of sharing is a persistent read-write object. We present an extended coherence protocol for the causal consistency model, which integrates replication management with independent checkpointing. It uses a novel coordinated burst checkpoint operation in order to replicate consistent checkpoints of shared...
This paper presents an implementation of several consistent recovery protocols at the abstract device level and their performance comparison. We have performed experiments using three NAS Parallel Benchmark applications with class C datasets on state-of-the-art equipment. The interesting result is that causal message logging protocol has the most expensive recovery cost with communication intens...