coordinated checkpointing

Search results for: coordinated checkpointing

Number of results: 48092

Checkpointing Orchestrated Web Services

Journal: International Journal of Computer and Communication Technology 2014


Checkpointing in Oracle

1998

Ashok Joshi William Bridge Juan Loaiza Tirthankar Lahiri

Checkpointing is an important mechanism for limiting crash recovery times. This paper describes a new checkpointing algorithm that was implemented in Oracle 8.0. This algorithm efficiently finds buffers which need to be written for checkpointing and easily scales to very large buffer cache sizes: it has been tested with buffer caches as large as six million buffers. Based on this algorithm, we ha...


Cooperative Checkpointing for Supercomputing Systems

2005

Adam Jamison Oliner José E. Moreira Larry Rudolph Arthur C. Smith

A system-level checkpointing mechanism, with global knowledge of the state and health of the machine, can improve performance and reliability by dynamically deciding when to skip checkpoint requests made by applications. This thesis presents such a technique, called cooperative checkpointing, and models its behavior as an online algorithm. Where C is the checkpoint overhead and I is the request...


Efficient, Language-based Checkpointing for Massively Parallel Programs

2007

Sanjeev Krishnan Laxmikant V. Kale

Checkpointing and restart is an approach to ensuring forward progress of a program in spite of system failures or planned interruptions. We investigate issues in checkpointing and restart of programs running on massively parallel computers. We identify a new set of issues that have to be considered for the MPP platform, based on which we have designed an approach based on the language and run-t...


Comprehensive Low-overhead Process Recovery Based on Quasi-synchronous Checkpointing

1995

D. Manivannan

In this paper, we propose a low-overhead recovery algorithm based on a quasi-synchronous checkpointing algorithm. The checkpointing algorithm preserves process autonomy by allowing them to take checkpoints asynchronously and uses communication-induced checkpoint coordination for the progression of the recovery line which helps bound rollback propagation during a recovery. Thus, it has the easen...


Self-stabilizing Checkpointing Algorithm in Ring Topology

2005

Partha Sarathi Mandal Krishnendu Mukhopadhyaya

If the variables used for the checkpointing algorithm have data faults, the algorithm may fail. In this paper, a self-stabilizing checkpointing algorithm is proposed for handling data faults in a ring network. The proposed algorithm can deal with concurrent initiation of checkpointing and at most one data fault per process. However, several processes may be faulty.


Recomputation Enabled Efficient Checkpointing

Journal: CoRR 2017

Ismail Akturk Ulya R. Karpuzcu

Systematic checkpointing of the machine state makes restart of execution from a safe state possible upon detection of an error. The time and energy overhead of checkpointing, however, grows with the frequency of checkpointing. Amortizing this overhead becomes especially challenging, considering the growth of expected error rates, as checkpointing frequency tends to increase with increasing erro...


An Efficient Recovery Mechanism with Checkpointing Approach for Cluster Federation

2014

Manoj Kumar

Checkpoint and recovery protocols are commonly used in distributed applications for providing fault tolerance. A distributed system may require taking checkpoints from time to time to keep it free of arbitrary failures. In case of failure, the system will roll back to checkpoints where global consistency is preserved. Checkpointing is one of the fault-tolerant techniques to restore faults and to...


Checkpointing Orchestration for Performance Improvement

2010

Hui Jin

Checkpointing is a widely used mechanism for supporting fault tolerance in high performance computing (HPC), but it is notorious for its expensive disk access. Parallel file systems such as Lustre, GPFS, and PVFS are widely deployed on supercomputers to provide fast I/O bandwidth for general data-intensive applications. However, the unique feature of checkpointing makes it impossible to benefit from the ...


Pii: S0950-5849(99)00057-9

2000

S. K. Woo M. H. Kim Y. J. Lee

In main memory databases, fuzzy checkpointing gives less transaction overhead due to its asynchronous backup feature. However, till now, fuzzy checkpointing has considered only physical logging schemes. The size of the physical log records is very large, and hence it incurs space and recovery processing overhead. In this paper, we propose a recovery method based on a hybrid logging scheme, whic...

