A Low-Cost Parallel K-Means VQ Algorithm Using Cluster Computing

نویسندگان

  • Alceu de Souza Britto
  • Paulo Sergio Lopes de Souza
  • Robert Sabourin
  • Simone do Rocio Senger de Souza
  • Díbio Leandro Borges
چکیده

In this paper we propose a parallel approach for the Kmeans Vector Quantization (VQ) algorithm used in a twostage Hidden Markov Model (HMM)-based system for recognizing handwritten numeral strings. With this parallel algorithm, based on the master/slave paradigm, we overcome two drawbacks of the sequential version: a) the time taken to create the codebook; and b) the amount of memory necessary to work with large training databases. Distributing the training samples over the slaves’ local disks reduces the overhead associated with the communication process. In addition, models predicting computation and communication time have been developed. These models are useful to predict the optimal number of slaves taking into account the number of training samples and codebook size.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

K-Means VQ algorithm using a low-cost parallel cluster computing

It is well-known that the time and memory necessary to create a codebook from large training databases have hindered the vector quantization based systems for real applications. To overcome this problem, we present a parallel approach for the K-means Vector Quantization (VQ) algorithm based on master/slave paradigm and lowcost parallel cluster computing. Distributing the training samples over t...

متن کامل

Parallel computing using MPI and OpenMP on self-configured platform, UMZHPC.

Parallel computing is a topic of interest for a broad scientific community since it facilitates many time-consuming algorithms in different application domains.In this paper, we introduce a novel platform for parallel computing by using MPI and OpenMP programming languages based on set of networked PCs. UMZHPC is a free Linux-based parallel computing infrastructure that has been developed to cr...

متن کامل

Algorithmes de classification répartis sur le cloud

The subjects addressed in this thesis are inspired from research problems faced by the Lokad company. These problems are related to the challenge of designing efficient parallelization techniques of clustering algorithms on a Cloud Computing platform. Chapter 2 provides an introduction to the Cloud Computing technologies, especially the ones devoted to intensive computations. Chapter 3 details ...

متن کامل

Parallel Spatial Pyramid Match Kernel Algorithm for Object Recognition using a Cluster of Computers

This paper parallelizes the spatial pyramid match kernel (SPK) implementation. SPK is one of the most usable kernel methods, along with support vector machine classifier, with high accuracy in object recognition. MATLAB parallel computing toolbox has been used to parallelize SPK. In this implementation, MATLAB Message Passing Interface (MPI) functions and features included in the toolbox help u...

متن کامل

Distributed clustering algorithms over a cloud computing platform. (Algorithmes de classification répartis sur le cloud)

The subjects addressed in this thesis are inspired from research problems faced by the Lokad company. These problems are related to the challenge of designing efficient parallelization techniques of clustering algorithms on a Cloud Computing platform. Chapter 2 provides an introduction to the Cloud Computing technologies, especially the ones devoted to intensive computations. Chapter 3 details ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003