Search results for: mapreduce

Number of results: 3018

2012
Amit Sangroya Damián Serrano Sara Bouchenak

MapReduce is a promising programming model for distributed data processing. Extensive research has been conducted on the scalability of MapReduce, and several systems have been proposed in the literature, ranging from job scheduling to data placement and replication. However, realistic benchmarks are still missing to analyze and compare the effectiveness of these proposals. To date, most MapRed...

2014
Jared Gray Thomas C. Bressoud

In the modern age, our ability to generate large data sets far outpaces our capacity for analyzing them. Google’s proposed solution to this fundamental problem – the MapReduce paradigm and runtime system – has recently gained traction in the scientific and “big data” industries. However, the performance characteristics of MapReduce are not well known. This paper builds on the efforts of prior re...

2010
Siddhartha Jain Matthew Mallozzi

We analyze the possibility of parallelizing the Traveling Salesman Problem over the MapReduce architecture. We present the serial and parallel versions of two algorithms, Tabu Search and Large Neighborhood Search. We compare the best tour length achieved by the serial version with the best achieved by the MapReduce version. We show that Tabu Search and Large Neighborhood Search are not well su...

2016
Sherif Akoush Ripduman Sohan Andy Hopper

Ensuring block-level reliability of MapReduce datasets is expensive due to the spatial overheads of replicating or erasure coding data. As the amount of data processed with MapReduce continues to increase, this cost will increase proportionally. In this paper we introduce Recomputation-Based Reliability in MapReduce (RMR), a system for mitigating the cost of maintaining reliable MapReduce datas...

Journal: Technique et Science Informatiques 2012
Luciana Arantes Alysson Neves Bessani Vinicius V. Cogo Miguel Correia Pedro Costa Jonathan Lejeune M. Piffaretti Olivier Marin Marcelo Pasin Pierre Sens F. Silva Julien Sopena

Byzantine faults are inherent in massive parallel computation, including those based on the MapReduce model. Yet, the current MapReduce framework implementations do not tolerate Byzantine failures. Therefore, it is not possible to verify if the final results of a MapReduce application are correct. We present in this article a MapReduce architecture where tasks are replicated aiming at ensuring ...

2017
ZuKuan Wei Bo Hong JaeHong Kim

The demand for highly parallel data processing platforms has been growing due to an explosion in the number of massive-scale data applications in both academia and industry. MapReduce is one of the most meaningful solutions for big data distributed computing. This paper is based on the work of Hadoop MapReduce. In the face of massive data computing and calculation process, MapReduce genera...

2010
Nikzad Babaii Rizvandi Young Choon Lee Albert Y. Zomaya

In this paper, we propose an approach for predicting the CPU utilization of applications when they are running on MapReduce. Our approach has two key components: a set of application experiments running on MapReduce to profile the CPU utilization of the application on a given platform, and a regression-based model that maps the MapReduce configuration parameters (number of Mappers, number of...

Journal: PVLDB 2012
Jens Dittrich Jorge-Arnulfo Quiané-Ruiz

This tutorial is motivated by the clear need of many organizations, companies, and researchers to deal with big data volumes efficiently. Examples include web analytics applications, scientific applications, and social networks. A popular data processing engine for big data is Hadoop MapReduce. Early versions of Hadoop MapReduce suffered from severe performance problems. Today, this is becoming...

Journal: International Journal of Database Theory and Application 2017

2013
Ross Adelman

MapReduce has been used before to analyze N-body-like data. For example, in [4], a friends-of-friends algorithm was distributed across a MapReduce-like framework. Also, in [5], Pig was used to analyze large amounts of astronomical data. In both of these, the datasets were very large, in the hundreds of GBs and low TBs. These examples give hope that MapReduce can be used effectively on a N-body...
