Search results for: mapreduce

Number of results: 3018

Journal: Concurrency and Computation: Practice and Experience 2015
Qutaibah Althebyan, Yaser Jararweh, Qussai Yaseen, Omar AlQudah, Mahmoud Al-Ayyoub

Efficiently scheduling MapReduce tasks is considered one of the major challenges facing MapReduce frameworks. Many algorithms have been introduced to tackle this issue. Most of these algorithms focus on the data locality property for task scheduling. Data locality may cause lower physical resource utilization in non-virtualized clusters and higher power consumption. Virtualized clust...

2013
Jimmy Lin

Hadoop is currently the large-scale data analysis “hammer” of choice, but there exist classes of algorithms that aren’t “nails” in the sense that they are not particularly amenable to the MapReduce programming model. To address this, researchers have proposed MapReduce extensions or alternative programming models in which these algorithms can be elegantly expressed. This article espouses a ...

2014
Yongzhi Wang, Jinpeng Wei, Yucong Duan

MapReduce, a large-scale data processing paradigm, is gaining popularity. However, like other distributed computing frameworks, MapReduce suffers from the integrity assurance vulnerability: malicious workers in the MapReduce cluster could tamper with its computation result and thereby render the overall computation result inaccurate. Existing solutions are effective in defeating the malicious b...

2013
Changchun Zhang, Jing Li, Lei Wu

Data analysis and processing are important tasks in cloud computing. In this field, the MapReduce framework has become an increasingly popular tool for analyzing large-scale data over large clusters. Compared with the parallel relational database, it has the advantages of excellent scalability and good fault tolerance. However, the performance of the join operation using MapReduce is not as good as t...

2014
Byeong Soo Kim, Sun Ju Lee, Tag Gon Kim, Hae Sang Song

Simulation-based experimentation on complex systems is a time-consuming job. Parallel and distributed simulation is one of the methods for reducing simulation time. To simulate and analyze a system with this method, a suitable experimental frame must be designed. Therefore, this paper proposes a MapReduce-based experimental frame for parallel and distributed simulation. Because Hadoop...

2010
Linh T.X. Phan, Zhuoyao Zhang, Boon Thau Loo, Insup Lee

In this paper, we explore the feasibility of enabling the scheduling of mixed hard and soft real-time MapReduce applications. We first present an experimental evaluation of the popular Hadoop MapReduce middleware on the Amazon EC2 cloud. Our evaluation reveals tradeoffs between overall system throughput and execution time predictability, and highlights a number of factors affecting real-...

2013
Jingen Xiang

Cloud computing systems, like MapReduce and Pregel, provide a scalable and fault tolerant environment for running computations at massive scale. However, these systems are designed primarily for data intensive computational tasks, while a large class of problems in scientific computing and business analytics are computationally intensive (i.e., they require a lot of CPU in addition to I/O). In ...

Journal: Parallel Computing 2011
Steven J. Plimpton, Karen D. Devine

We describe a parallel library written with message-passing (MPI) calls that allows algorithms to be expressed in the MapReduce paradigm. This means the calling program does not need to include explicit parallel code, but instead provides “map” and “reduce” functions that operate independently on elements of a data set distributed across processors. The library performs needed data movement bet...

2012
Kento Emoto, Hiroto Imachi

MapReduce, the de facto standard for large-scale data-intensive applications, is a remarkable parallel programming model, allowing for easy parallelization of data-intensive computations over many machines in a cloud. As huge tree data such as XML has achieved the status of the de facto standard for representing structured information, the situation calls for efficient MapReduce programs treati...

2013
Jong Wook Kim

Computing all signature pairs whose bit differences are less than or equal to a given threshold in large signature collections is an important problem in many applications. In this paper, we leverage MapReduce-based parallelization to enable scalable similarity search on the signatures. A roadblock in using the MapReduce framework for this problem, however, is that the cost of merging and ...
