Tupleware: A Distributed Tuple Space for the Development and Execution of Array-based Applications in a Cluster Computing Environment

Author

  • Alistair Atkinson
Abstract

This thesis describes Tupleware, an implementation of a distributed tuple space which acts as a scalable and efficient cluster middleware for computationally intensive numerical and scientific applications. Tupleware is based on the Linda coordination language (Gelernter 1985), and incorporates additional techniques such as peer-to-peer communication and exploitation of data locality in order to address the scalability and performance problems commonly encountered by traditional centralised tuple space implementations. Tupleware is implemented in such a way that, while processing is taking place, all communication between cluster nodes is decentralised in a peer-to-peer fashion. Communication events are initiated by a node requesting a tuple which is located on a remote node, and in order to make tuple retrieval as efficient as possible, a tuple search algorithm is used to minimise the number of communication instances required to retrieve a remote tuple. This algorithm is based on the locality of a remote tuple and the success of previous remote tuple requests. As Tupleware is targeted at numerical applications which generally involve the partitioning and processing of 1-D or 2-D arrays, a remote tuple can generally be expected to reside on one of a small number of nodes which are processing neighbouring partitions of the array. Furthermore, unlike some other distributed tuple space implementations, Tupleware does not burden the programmer with any additional complexity due to this distribution. At the application level, the Tupleware middleware behaves exactly like a centralised tuple space, and provides much greater flexibility with regard to where components of a system are executed. The design and implementation of Tupleware are described and placed in the context of other distributed tuple space implementations, along with the specific requirements of the applications that the system caters for.
Finally, Tupleware is evaluated using several numerical and scientific applications, which show that it provides a sufficient level of scalability for a broad range of tasks. The main contribution of this work is the identification of techniques which en-


Similar resources

An Effective Task Scheduling Framework for Cloud Computing using NSGA-II

Cloud computing is a model for convenient, on-demand user access to changeable and configurable computing resources such as networks, servers, storage, applications, and services with minimal resource management and service provider interaction. Task scheduling is regarded as a fundamental issue in cloud computing which aims at distributing the load on the different resources of a distribu...


A Mobile and Fog-based Computing Method to Execute Smart Device Applications in a Secure Environment

With the rapid growth of smart device and Internet of Things applications, the volume of communication and data in networks has increased. Due to network lag and massive demand, centralized, traditional cloud computing architectures cannot meet high user demand and are not suited to executing delay-sensitive and real-time applications. To resolve these challenges, we p...


Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments

The Hadoop MapReduce framework is an important distributed processing model for large-scale data-intensive applications. The current Hadoop and the existing Hadoop distributed file system's rack-aware data placement strategy in MapReduce in the homogeneous Hadoop cluster assume that each node in a cluster has the same computing capacity and that the same workload is assigned to each node. Default Hadoop d...


Data Replication-Based Scheduling in Cloud Computing Environment

High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems such as data grids, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...


JESSICA2: A Distributed Java Virtual Machine with Transparent Thread Migration Support

A distributed Java Virtual Machine (DJVM) spanning multiple cluster nodes can provide a true parallel execution environment for multi-threaded Java applications. Most existing DJVMs suffer from the slow Java execution in interpretive mode and thus may not be efficient enough for solving computation-intensive problems. We present JESSICA2, a new DJVM running in JIT compilation mode that can exec...




Publication date: 2008