Cluster management system design for big data infrastructures

Author

  • Shekhar Gupta
Abstract

ION OF HETEROGENEITY

YARN creates containers on each machine based on its total memory and number of CPU cores. Two machines with different memory sizes will therefore have different numbers of containers. In other words, unlike Hadoop, YARN takes resource heterogeneity into account in the case of memory. However, YARN still does not consider heterogeneity in other resource characteristics, such as CPU speed, I/O, and network bandwidth. For example, two machines may have the same memory and the same number of cores but significantly different CPU speeds. In that case, running the same number of containers on both machines might not be optimal.

FIXED CONTAINER SIZE

YARN creates containers of a fixed size, which administrators configure when the cluster is initialized. In the current implementation of YARN, the container size cannot be changed while applications are running. Each container runs one task; therefore, the resources provided by one container can be used by only one task. However, tasks belonging to different applications may have different resource requirements. Creating ...
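The two limitations described in this excerpt can be illustrated with a toy model. The sketch below is not YARN's scheduler or API; `Node`, `container_count`, and `unused_memory_mb` are hypothetical names, and the machine sizes and task demands are made-up values chosen only for illustration. It assumes that the number of containers on a machine depends only on memory and core count, and that each task occupies exactly one fixed-size container.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical description of a worker machine."""
    memory_mb: int
    cores: int
    cpu_ghz: float  # recorded, but never consulted by container_count


def container_count(node: Node, container_mb: int, container_vcores: int = 1) -> int:
    """Containers that fit when only memory and core count are considered."""
    return min(node.memory_mb // container_mb, node.cores // container_vcores)


def unused_memory_mb(task_demands_mb: list[int], container_mb: int) -> int:
    """Memory left idle when each task occupies one fixed-size container."""
    return sum(container_mb - min(demand, container_mb) for demand in task_demands_mb)


# Illustrative machines: same memory and cores, different CPU speeds.
fast = Node(memory_mb=16384, cores=8, cpu_ghz=3.4)
slow = Node(memory_mb=16384, cores=8, cpu_ghz=1.8)
# A machine with less memory gets fewer containers.
small = Node(memory_mb=8192, cores=8, cpu_ghz=3.4)

print(container_count(fast, 2048))   # 8
print(container_count(slow, 2048))   # 8, although the CPU is much slower
print(container_count(small, 2048))  # 4, memory heterogeneity is reflected

# Tasks with different memory needs all receive the same 2048 MB container.
print(unused_memory_mb([512, 1024, 2048], 2048))  # 2560
```

Under these assumptions, the fast and slow machines receive the same number of containers, and tasks that need less than the fixed container size leave the remainder idle, which is the mismatch the excerpt points to.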

Similar resources

Distributed mining of large scale remote sensing image archives on public computing infrastructures

Earth Observation (EO) mining aims at supporting efficient access to and exploration of petabyte-scale space- and airborne remote sensing archives that are currently expanding at rates of terabytes per day. A significant challenge is performing the analysis required by envisaged applications, such as process mapping for environmental risk management, in reasonable time. In this work, ...

Intelligent Management and Efficient Operation of Big Data

This chapter details how Big Data can be used and implemented in networking and computing infrastructures. Specifically, it addresses three main aspects: the timely extraction of relevant knowledge from heterogeneous and very often unstructured large data sources; the enhancement of the performance of processing and networking (cloud) infrastructures that are the most important foundational pi...

A Design of Pipelined Architecture for on-the-Fly Processing of Big Data Streams

Conventional processing infrastructures have been challenged by the huge demand of stream-based applications. The industry responded by introducing traditional stream processing engines along with emerging technologies. The ongoing paradigm embraces parallel computing as the most suitable proposition. Pipelining and parallelism have been intensively studied in recent years, yet parallel programming ...

Aggregating and Managing Big Realtime Data in the Cloud - Application to Intelligent Transport for Smart Cities

The increasing power of computer hardware and the sophistication of computer software have brought many new possibilities to the information world. On one side, the possibility of analysing massive data sets has brought new insight, knowledge and information. On the other, it has made it possible to massively distribute computing and has opened the way to a new programming paradigm called Service Oriented Computing pa...

Design and Test of the Real-time Text mining dashboard for Twitter

One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are currently being produced at high speed, in large volumes, and in a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing this huge amount of data has today become one of the concerns of data science scholar...

Publication date: 2016