Load Balancing Approach Based on Limitations and Bottlenecks of Multi-core Architectures on a Beowulf Cluster Compute-Node

Authors

  • Damian Valles
  • David H. Williams
  • Patricia A. Nava
Abstract

This paper studies how to improve the performance and execution time of a single compute node in a Beowulf cluster. We implement a load balancing approach through the Linux scheduler that improves the performance and execution time of the High Performance Linpack (HPL) benchmark. We compare performance and execution time when spawning processes on two cores of a single processor, up to all eight cores across two processors. The results show that the approach improved performance throughput because it created higher L2-cache awareness, with an increased hit rate, while reducing the number of times processes accessed the Front Side Bus (FSB) and Memory Controller Hub (MCH) during execution. Performance and execution time peaked with block sizes of 64 and 128 for different HPL matrix and problem sizes; however, performance throughput decreased for other sizes due to hardware contention in the FSBs and...


Similar resources

Towards Dynamic Load Balancing Using Page Migration and Loop Re-partitioning on Omni/SCASH

Increasingly, large-scale clusters of SMPs are becoming the majority platform in the HPC field. In such a cluster environment, load imbalances can arise for several reasons, and misplacement of data can cause performance bottlenecks. To overcome these problems, dynamic load balancing mechanisms are needed. In this paper, we report our ongoing work on a dynamic load balancing extension to O...


A load balancing parallel method for frequent pattern mining on multi-core cluster

In this paper, we present a new parallel method named SDFEM that enables frequent pattern mining (FPM) on clusters with multiple multi-core compute nodes to provide high performance. SDFEM is distinguished from previous parallel FPM work by three advanced features that provide high mining performance for large-scale data analytics applications. First, SDFEM combines both shared m...


History-Based Adaptive Work Distribution

Exploiting parallelism of increasingly heterogeneous parallel architectures is challenging due to the complexity of parallelism management. To achieve high performance portability whilst preserving high productivity, high-level approaches to parallel programming delegate parallelism management, such as partitioning and work distribution, to the compiler and the run-time system. Random work stea...


Reconfigurable Parallel Sorting and Load Balancing on a Beowulf Cluster: HeteroSort

HeteroSort load balances and sorts within static or dynamic networks using a conceptual torus mesh. We ported HeteroSort to a 16-node Beowulf cluster with a central switch architecture. By capturing global system knowledge in overlapping microregions of nodes, HeteroSort is useful in data-dependent applications such as data information fusion on distributed processors.


Performance Evaluation of Load Sharing Policies with PANTS on a Beowulf Cluster

Powerful, low-cost clusters of personal computers, such as Beowulf clusters, have fueled the potential for widespread distributed computation. While these Beowulf clusters typically have software that facilitates development of distributed applications, there is still a need for effective distributed computation that is transparent to the application programmer. The PANTS Application Node Trans...




Publication date: 2012