Performance analysis of the Kahan-enhanced scalar product on current multi- and manycore processors

Authors

  • Johannes Hofmann
  • Dietmar Fey
  • Michael Riedmann
  • Jan Eitzinger
  • Georg Hager
  • Gerhard Wellein
Abstract

We investigate the performance characteristics of a numerically enhanced scalar product (dot) kernel loop that uses the Kahan algorithm to compensate for numerical errors, and describe efficient SIMD-vectorized implementations on recent multi- and manycore processors. Using low-level instruction analysis and the execution-cache-memory (ECM) performance model, we pinpoint the relevant performance bottlenecks for single-core and thread-parallel execution, and predict performance and saturation behavior. We show that the Kahan-enhanced scalar product comes at almost no additional cost compared to the naive (non-Kahan) scalar product if appropriate low-level optimizations, notably SIMD vectorization and unrolling, are applied. The ECM model is extended appropriately to accommodate not only modern Intel multicore chips but also the Intel Xeon Phi "Knights Corner" coprocessor and an IBM POWER8 CPU. This allows us to discuss the impact of processor features on the performance across four modern architectures that are relevant for high-performance computing.
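To make the compensation idea concrete, the following is a minimal scalar sketch of a naive dot product next to a Kahan-compensated one. It is an illustration only, not the SIMD-vectorized implementations the paper describes; the function names `naive_dot` and `kahan_dot` and the test vectors are assumptions chosen for this example.

```python
def naive_dot(x, y):
    """Plain dot product: rounding errors of each addition accumulate."""
    s = 0.0
    for a, b in zip(x, y):
        s += a * b
    return s


def kahan_dot(x, y):
    """Dot product with Kahan (compensated) summation of the products."""
    s = 0.0  # running sum
    c = 0.0  # running compensation: the low-order part lost so far
    for a, b in zip(x, y):
        term = a * b - c       # re-inject the error lost in earlier additions
        t = s + term           # low-order bits of term may be rounded away here
        c = (t - s) - term     # recover (algebraically zero) what was rounded away
        s = t
    return s


# Illustrative input: one large product followed by many tiny ones.
x = [1.0] + [1e-16] * 10
y = [1.0] * 11
naive_result = naive_dot(x, y)
kahan_result = kahan_dot(x, y)
```

With this input, every tiny term is smaller than half the spacing between doubles near 1.0, so the naive loop never moves away from 1.0, while the compensated loop carries the lost parts forward and ends up close to 1 + 1e-15. The extra subtractions per iteration are the additional work whose cost the paper analyzes.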


Similar resources

Performance Analysis of the Kahan-Enhanced Scalar Product on Current Multicore Processors

We investigate the performance characteristics of a numerically enhanced scalar product (dot) kernel loop that uses the Kahan algorithm to compensate for numerical errors, and describe efficient SIMD-vectorized implementations on recent Intel processors. Using low-level instruction analysis and the execution-cache-memory (ECM) performance model we pinpoint the relevant performance bottlenecks f...


Modeling and Performance Evaluation of Multi-Processors Organization with Shared Memories

This paper is primarily concerned with theoretical evaluation of the performance of multiprocessors system. A markovian waiting line model has been developed for various different multi-processors configurations, with shared memory. The system is analysed at the request level rather than job level.


Highly Parallel Multigrid Solvers for Multicore and Manycore Processors

In this paper we present an analysis of parallelization properties and implementation details of the new Algebraic multigrid solvers. Variants of smoothers and multicolor grid partitionings are discussed. Optimizations for modern throughput-oriented processors are considered together with different storage schemes. Finally, comparative performance results for multicore and manycore processors a...


A Methodology for Product Performance Analysis under Effects of Multi-Physical Phenomena

Due to the development of science and technology, the computer has become a useful tool for supporting engineering activities in product design. Many computer aided tools such as CAD/CAM, product data management (PDM), product life cycle assessment (PLA), etc., have been popularly used in industry for reducing product development lead-time and increasing total product quality. However, the nume...


On the energy efficiency and performance of irregular application executions on multicore, NUMA and manycore platforms

Until the last decade, performance of HPC architectures has been almost exclusively quantified by their processing power. However, energy efficiency is being recently considered as important as raw performance and has become a critical aspect to the development of scalable systems. These strict energy constraints guided the development of a new class of so-called light-weight manycore processor...



Journal title:
  • Concurrency and Computation: Practice and Experience

Volume 29  Issue 

Pages  -

Publication date 2017