Search results for: coordinated checkpointing

Number of results: 48092

Journal: Scalable Computing: Practice and Experience 2016
Eszter Kail Krisztián Karóczkai Péter Kacsuk Miklós Kozlovszky

Smart systems in telemedicine frequently use intelligent sensor devices at large scale. Practitioners can continuously monitor the vital parameters of hundreds of patients in real time. The most important pillars of remote patient monitoring services are communication and data processing. Large-scale data processing is done mainly using workflows. Some workflows run in real time, more compl...

2009
Dimitar Nikolov Urban Ingelsson Virendra Singh Erik Larsson

Due to increased susceptibility to soft errors in recent semiconductor technologies, techniques for detecting and recovering from errors are required. Roll-back Recovery with Checkpointing (RRC) is one well-known technique that copes with soft errors by taking and storing checkpoints during execution of a job. Employing this technique increases the average execution time (AET), i.e. the expect...

Journal: IEEE Transactions on Computers 2017

2008
Camille Coti Thomas Herault Pierre Lemarinier Laurence Pilard Ala Rezmerita Eric Rodriguez Franck Cappello

Nowadays, clusters and grids are made of more and more computing nodes. Multi-process applications are most often programmed through message passing. The increase in the number of processes implies that these applications need to use a fault-tolerant message passing library. In this paper, we present two implementations of fault-tolerant protocols based on MPICH, a blocki...

1998
Jon Howell

Several techniques have been proposed for adding persistence to the Java language environment. This paper describes a system we call icee that works by checkpointing the Java Virtual Machine. We compare the scheme to other persistent Java techniques. Checkpointing offers two unique advantages: first, the implementation is independent of the JVM implementation, and therefore survives JVM updates...

Journal: Future Generation Comp. Syst. 2015
Henri Casanova Yves Robert Frédéric Vivien Dounia Zaidouni

Processor failures in post-petascale parallel computing platforms are common occurrences. The traditional fault-tolerance solution, checkpoint-rollback-recovery, severely limits parallel efficiency. One solution is to replicate application processes so that a processor failure does not necessarily imply an application failure. Process replication, combined with checkpoint-rollback-recovery, has ...

2013
Dilbag Singh Jaswinder Singh

The main objective of this research work is to improve the checkpoint efficiency of integrated multilevel checkpointing algorithms (IMLCA) and prevent checkpointing from becoming the bottleneck of cloud data centers. In order to find an efficient checkpoint interval, checkpointing overheads are also considered in this paper. Traditional checkpointing methods persistently store snapshots of the pr...

1998
Luís Moura Silva João Gabriel Silva

Checkpointing and rollback recovery is a very effective technique to tolerate transient faults and preventive shutdowns. In the past, most of the checkpointing schemes published in the literature were meant to be transparent to the application programmer and were implemented at the operating-system level. In recent years, there has been some work on higher-level forms of checkpointing. In thi...

2014
Steven Chan Karen R. Sollins

This thesis proposes to enable runtime-reconfigurable applications through the use of semantic checkpointing. We view applications here as a collection of inter-connected components, and reconfigurations as the reconstitution of components that make up an application. By checkpointing only values that are deemed to be of semantic significance, application state is maintained across reconfigurat...

1998
Francesco Quaglia Bruno Ciciani Roberto Baldoni

Checkpointing distributed applications involving mobile hosts is an important task to reduce the rollback during a recovery from a failure and to manage voluntary disconnections. In this paper we show the basic characteristics a checkpointing protocol needs to work with mobile hosts, namely, reduction of the number of checkpoints, the use of incremental checkpointing and consistent global check...
