A Highly Scalable Parallel Implementation of Balancing Domain Decomposition by Constraints
Abstract
In this work we propose a novel approach to parallelizing two-level balancing domain decomposition by constraints preconditioning, based on overlapping fine-grid and coarse-grid duties in time. The global set of MPI tasks is split into those that have fine-grid duties and those that have coarse-grid duties. The computations and communications in the algorithm are then rescheduled and mapped so that the maximum degree of overlap is achieved while the data dependencies among them are preserved. In many ranges of interest, the extra cost associated with the coarse-grid problem can be fully masked by fine-grid computations (which are embarrassingly parallel). Apart from discussing implementation details, the paper also presents a comprehensive set of numerical experiments, including weak scalability analyses with structured and unstructured meshes for the 3D Poisson and linear elasticity problems on a pair of state-of-the-art multicore-based distributed-memory machines. This experimental study reveals remarkable weak scalability in the solution of problems with billions of unknowns on several tens of thousands of computational cores.
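The task split and the masking effect described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the policy of giving the last ranks coarse-grid duties, and the two-number timing model are all assumptions made for illustration, and the model ignores the data dependencies that the real schedule has to preserve.

```python
def split_roles(n_tasks: int, n_coarse: int) -> list[str]:
    """Label each task rank as 'fine' or 'coarse'.

    Hypothetical policy: the last n_coarse ranks take coarse-grid duties.
    In an MPI code, such labels would play the role of the 'color'
    argument passed to MPI_Comm_split.
    """
    if not 0 < n_coarse < n_tasks:
        raise ValueError("need at least one fine and one coarse task")
    return ["fine"] * (n_tasks - n_coarse) + ["coarse"] * n_coarse


def serialized_time(t_fine: float, t_coarse: float) -> float:
    """Cost per step when coarse-grid work waits for fine-grid work."""
    return t_fine + t_coarse


def overlapped_time(t_fine: float, t_coarse: float) -> float:
    """Idealized cost per step when both kinds of work run concurrently
    on disjoint task sets (assumption: perfect overlap)."""
    return max(t_fine, t_coarse)


if __name__ == "__main__":
    roles = split_roles(n_tasks=8, n_coarse=2)
    print(roles)
    # Made-up timings: coarse work is cheaper than fine work here.
    print(serialized_time(3.0, 1.0), overlapped_time(3.0, 1.0))
```

Under this idealized model, the coarse-grid cost disappears from the total whenever it does not exceed the fine-grid cost, which is the sense in which it is "fully masked".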
Similar resources
Towards space-time iterative solvers based on balancing domain decomposition
The usual approach to transient problems is to exploit sequentiality in time, and solve one space problem every time step. This approach has recently been re-considered, motivated by the forthcoming exascale supercomputers with billions of cores. This sequential approach has a clear problem: parallelization cannot be exploited in time. Many key computational engineering problems, e...
Space-time balancing domain decomposition
In this work, we propose two-level space-time domain decomposition preconditioners for parabolic problems discretized using finite elements. They are motivated as an extension to space-time of balancing domain decomposition by constraints preconditioners. The key ingredients to be defined are the sub-assembled space and operator, the coarse degrees of freedom (DOFs) in which we want to enforce ...
Scalable Parallel Benders Decomposition for Stochastic Linear Programming
We develop a scalable parallel implementation of the classical Benders decomposition algorithm for two-stage stochastic linear programs. Using a primal-dual, path-following algorithm for solving the scenario subproblems, we develop a parallel implementation that alleviates the difficulties of load balancing. Furthermore, the dual and primal step calculations can be implemented using a data-paralle...
Modeling Dynamic Load Balancing in Molecular Dynamics to Achieve Scalable Parallel Execution
To achieve scalable parallel performance in Molecular Dynamics Simulation, we have modeled and implemented several dynamic spatial domain decomposition algorithms. The modeling is based upon Valiant’s Bulk Synchronous Parallel architecture model (BSP), which describes supersteps of computation, communication, and synchronization. We have developed prototypes that estimate the differing costs of...
Achieving Scalable Parallel Molecular Dynamics Using Dynamic Spatial Domain Decomposition Techniques
To achieve scalable parallel performance in Molecular Dynamics Simulations, we have modeled and implemented several dynamic spatial domain decomposition algorithms. The modeling is based upon the Bulk Synchronous Parallel architecture model (BSP), which describes supersteps of computation, communication, and synchronization. Using this model, we have developed prototypes that explore the differ...
Journal title:
- SIAM J. Scientific Computing
Volume 36
Publication date: 2014