Parallel MLEM on Multicore Architectures
Abstract
The efficient use of multicore architectures for sparse matrix-vector multiplication (SpMV) is currently an open challenge. One algorithm that relies on SpMV is the maximum likelihood expectation maximization (MLEM) algorithm. Using MLEM for positron emission tomography (PET) image reconstruction requires a particularly large matrix. We present a new storage scheme for this type of matrix that halves the memory requirements compared to the widely used compressed sparse row (CSR) format. For parallelization, we combine two partitioning techniques: recursive bisection and striping. Our results show good load balancing and cache behavior. We also give speedup measurements on various modern multicore systems.
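To make the role of SpMV in MLEM concrete, here is a minimal serial sketch in Python. It assumes the standard CSR layout (`row_ptr`, `col_idx`, `values`) rather than the new storage scheme mentioned in the abstract, which is not described here, and it does not attempt the recursive bisection or striping partitioning. All function and variable names are illustrative assumptions. Each iteration performs one forward projection (an SpMV with the system matrix) and one back projection (an SpMV with its transpose).

```python
# Illustrative sketch only: plain CSR and a serial loop, not the paper's
# storage scheme or its parallel partitioning. All names are hypothetical.

def csr_spmv(row_ptr, col_idx, values, x):
    """Forward projection: return y = A @ x for a matrix A stored in CSR."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def csr_spmv_transposed(row_ptr, col_idx, values, x, n_cols):
    """Back projection: return y = A.T @ x without building A.T."""
    y = [0.0] * n_cols
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[col_idx[k]] += values[k] * x[i]
    return y


def mlem(row_ptr, col_idx, values, n_cols, measured, iterations):
    """Run the multiplicative MLEM update starting from a uniform image."""
    n_rows = len(row_ptr) - 1
    # Sensitivity: column sums of A, used to normalize each update.
    sens = csr_spmv_transposed(row_ptr, col_idx, values, [1.0] * n_rows, n_cols)
    image = [1.0] * n_cols
    for _ in range(iterations):
        projected = csr_spmv(row_ptr, col_idx, values, image)
        # Ratio of measured to estimated counts; guard against division by zero.
        ratio = [g / q if q > 0.0 else 0.0 for g, q in zip(measured, projected)]
        back = csr_spmv_transposed(row_ptr, col_idx, values, ratio, n_cols)
        image = [f * b / s if s > 0.0 else 0.0
                 for f, b, s in zip(image, back, sens)]
    return image


if __name__ == "__main__":
    # Toy 3x2 system matrix [[1, 0], [0, 2], [1, 1]] in CSR form.
    row_ptr, col_idx, values = [0, 1, 2, 4], [0, 1, 0, 1], [1.0, 2.0, 1.0, 1.0]
    # Noise-free measurements generated from the image [2, 3].
    measured = csr_spmv(row_ptr, col_idx, values, [2.0, 3.0])
    print(mlem(row_ptr, col_idx, values, 2, measured, 100))
```

In this layout the two SpMV calls dominate the run time of every iteration, which is why the storage format and the way rows are partitioned across cores matter for a matrix as large as the one used in PET reconstruction.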
Similar resources
Design of a novel congestion-aware communication mechanism for wireless NoC architecture in multicore systems
Hybrid Wireless Network-on-Chip (WNoC) architecture has emerged as a scalable communication structure to mitigate the deficits of traditional NoC architecture for future multicore systems. The hybrid WNoC architecture provides energy-efficient, high-data-rate, and flexible communication for NoC architectures. In these architectures, each wireless router is shared by a set of processing core...
An Evaluation of Parallel Knapsack Algorithms on Multicore Architectures
Emergence of chip multiprocessor systems has dramatically increased the performance potential of computer systems. Since the amount of exploited parallelism is directly influenced by the selection of the algorithm, algorithmic choice also plays a critical role in achieving high performance on modern architectures. Hence, in the era of multicore computing, it is important to re-evaluate algorith...
Reevaluating Amdahl's law in the multicore era
Microprocessor architecture has entered the multicore era. Recently, Hill and Marty presented a pessimistic view of multicore scalability. Their analysis was based on Amdahl's law (i.e., the fixed-workload condition) and challenged readers to develop better models. In this study, we analyze multicore scalability under fixed-time and memory-bound conditions and from the data access (memory wall) perspecti...
pOSKI: An Extensible Autotuning Framework to Perform Optimized SpMVs on Multicore Architectures
We have developed pOSKI: the Parallel Optimized Sparse Kernel Interface – an autotuning framework to optimize Sparse Matrix Vector Multiply (SpMV) performance on emerging shared memory multicore architectures. Our autotuning methodology extends previous work done in the scientific computing community targeting serial architectures. In addition to previously explored parallel optimizations, we f...
Tackling Real-Time Signal Processing Applications on Shared Memory Multicore Architectures Using XPU
General-purpose shared memory multicore architectures are becoming widely available. They are likely to stand as attractive alternatives to more specialized processing architectures such as FPGA and DSP-based platforms to perform real-time digital signal processing. In this paper, we show how we can ease parallelism expression on shared memory multicore architecture through the XPU high-level p...