Statistical Simulations on Parallel Computers

Author

  • Hana ŠEVČÍKOVÁ
Abstract

The potential benefits of parallel computing for time-consuming statistical applications are well known, but have not been widely realized in practice, perhaps in part due to associated technical obstacles. This article develops a simple framework for programming statistical simulations using parallel processing, which does not require changing programming language or forgoing the use of standard statistical libraries. The basic idea of using parallel computing for statistical simulation studies is straightforward in principle, and is based on the standard master-slave model. However, there are several technical obstacles that can make it difficult to implement in practice. These include: nonreproducibility of results due to variations in the distribution of random numbers among processes, creation of excessive numbers of slaves, proliferation of slaves with very short lifetimes, and slaves destroyed due to hardware failures. This article proposes solutions for each of these difficulties, and together these solutions constitute an overall parallel computing framework for statistical simulation studies. In an experiment with 15 processors, the methods detailed here led to increases in speed by factors that can actually exceed the maximum expected factor of 15, due to the efficiencies of the proposed problem decomposition methods. Different gains may be achieved with different strategies, depending on the problem decomposition used and heterogeneity of the processors. Fault tolerance is an important feature of the framework. In an experiment with faults, a non-fault-tolerant version of our method took almost twice as long, and did not produce any results, while the fault-tolerant method dealt efficiently with the faults. We conclude that parallel computing can greatly improve the efficiency of statistical computation without greatly increasing programming complexity, and that it deserves wider investigation for such applications. 
Software to implement the proposed framework in R is available from http://www.stat.washington.edu/hana.
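
Two of the obstacles named in the abstract, nonreproducible random numbers and workers lost to failures, can be illustrated with a small sketch. This is not the author's R software. It is a minimal Python illustration under stated assumptions: threads stand in for worker processes, the failure is simulated, and all names (`BASE_SEED`, `replicate_mean`, `run_chunk`, `run_study`) are hypothetical. Each replicate gets its own random stream, keyed by the replicate number rather than by the worker, so the results should not depend on how many workers are used. Replicates are grouped into one chunk per worker to avoid many short-lived tasks, and a chunk whose worker fails is resubmitted.

```python
import random
from concurrent.futures import ThreadPoolExecutor

BASE_SEED = 2004  # hypothetical base seed for the whole study


def replicate_mean(rep_id, n=1000):
    """Run one simulation replicate: the mean of n standard normal draws."""
    # The stream is tied to the replicate, not to the worker that runs it.
    # Distinct integer seeds are an illustration only; they do not guarantee
    # statistically independent streams the way a dedicated parallel RNG does.
    rng = random.Random(BASE_SEED * 1_000_003 + rep_id)
    return sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n


def run_chunk(rep_ids, fail=False):
    """Run a group of replicates on one worker; optionally simulate a crash."""
    if fail:
        raise RuntimeError("simulated worker failure")
    return {r: replicate_mean(r) for r in rep_ids}


def run_study(n_reps, n_workers, fail_first_chunk=False):
    """Distribute replicates over workers and return results in replicate order."""
    # One chunk per worker, so no more workers are created than requested.
    chunks = [list(range(w, n_reps, n_workers)) for w in range(n_workers)]
    results = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = {
            pool.submit(run_chunk, chunk, fail_first_chunk and i == 0): chunk
            for i, chunk in enumerate(chunks)
        }
        for future, chunk in futures.items():
            try:
                results.update(future.result())
            except RuntimeError:
                # The worker was lost: resubmit the same chunk. Because the
                # seeds depend only on replicate numbers, the rerun gives the
                # values the failed worker would have produced.
                results.update(pool.submit(run_chunk, chunk).result())
    return [results[r] for r in range(n_reps)]


if __name__ == "__main__":
    serial = run_study(12, 1)
    parallel = run_study(12, 4)
    recovered = run_study(12, 3, fail_first_chunk=True)
    print(serial == parallel == recovered)
```

Keying the stream to the replicate is one simple way to make results independent of the process layout. A production setup would more likely use a parallel random number generator designed to provide independent streams.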

Similar resources

Random Number Generators: A Survival Guide for Large Scale Simulations

Monte Carlo simulations are an important tool in statistical physics, complex systems science, and many other fields. An increasing number of these simulations is run on parallel systems ranging from multicore desktop computers to supercomputers with thousands of CPUs. This raises the issue of generating large amounts of random numbers in a parallel application. In this lecture we will learn ju...

Parallel Spatial Pyramid Match Kernel Algorithm for Object Recognition using a Cluster of Computers

This paper parallelizes the spatial pyramid match kernel (SPK) implementation. SPK is one of the most usable kernel methods, along with support vector machine classifier, with high accuracy in object recognition. MATLAB parallel computing toolbox has been used to parallelize SPK. In this implementation, MATLAB Message Passing Interface (MPI) functions and features included in the toolbox help u...

Next-Generation Massively Parallel Computers | Massively Parallel Computer for Particle-based Simulations

Subleader: Junichiro Makino, Associate Professor, School of Science, University of Tokyo; Izumi Hachisu, Associate Professor, College of Arts and Sciences, University of Tokyo; Makoto Taiji, Associate Professor, Institute for Statistical Mathematics; Yoko Funato, Research Associate, College of Arts and Sciences, University of Tokyo; Toshiyuki Fukushige, Research Associate, College of Arts and Sciences, U...

Langevin Dynamics Simulations of Macromolecules on Parallel Computers

A parallel algorithm is developed that allows efficient Langevin-dynamics simulations of macromolecular coils, which is the usual structure of synthetic polymers in solution and in bulk. Contrary to usual so-called spatial decomposition algorithms, we map the one-dimensional topology of the chain molecule on the parallel computer. The speedup of the algorithm is measured on different multi-process...

Turbomachinery CFD on Parallel Computers

The role of multistage turbomachinery simulation in the development of propulsion system models is discussed. Particularly, the need for simulations with higher fidelity and faster turnaround time is highlighted. It is shown how such fast simulations can be used in engineering-oriented environments. The use of parallel processing to achieve the required turnaround times is discussed. Current wo...

Publication date: 2004