Search results for: mapreduce
Number of results: 3018
MapReduce is a promising programming model for distributed data processing. Extensive research has been conducted on the scalability of MapReduce, and several systems have been proposed in the literature, ranging from job scheduling to data placement and replication. However, realistic benchmarks are still missing to analyze and compare the effectiveness of these proposals. To date, most MapRed...
In the modern age, our ability to generate large data sets far outpaces our capacity for analyzing them. Google’s proposed solution to this fundamental problem – the MapReduce paradigm and runtime system – has recently gained traction in the scientific and “big data” industries. However, the performance characteristics of MapReduce are not well known. This paper builds on the efforts of prior re...
We analyze the possibility of parallelizing the Traveling Salesman Problem over the MapReduce architecture. We present the serial and parallel versions of two algorithms, Tabu Search and Large Neighborhood Search. We compare the best tour length achieved by the serial version with the best achieved by the MapReduce version. We show that Tabu Search and Large Neighborhood Search are not well su...
Ensuring block-level reliability of MapReduce datasets is expensive due to the spatial overheads of replicating or erasure coding data. As the amount of data processed with MapReduce continues to increase, this cost will increase proportionally. In this paper we introduce Recomputation-Based Reliability in MapReduce (RMR), a system for mitigating the cost of maintaining reliable MapReduce datas...
Byzantine faults are inherent in massive parallel computation, including those based on the MapReduce model. Yet, the current MapReduce framework implementations do not tolerate Byzantine failures. Therefore, it is not possible to verify if the final results of a MapReduce application are correct. We present in this article a MapReduce architecture where tasks are replicated aiming at ensuring ...
The demand for highly parallel data processing platforms was growing due to an explosion in the number of massive-scale data applications in both academia and industry. MapReduce was one of the most meaningful solutions for big data distributed computing. This paper was based on the work of Hadoop MapReduce. In the face of massive data computing and calculation process, MapReduce genera...
In this paper, we propose an approach for predicting the CPU utilization of applications when they are running on MapReduce. Our approach has two key components: a set of application experiments running on MapReduce to profile the CPU utilization of the application on a given platform, and a regression-based model that maps the MapReduce configuration parameters (number of Mappers, number of...
This tutorial is motivated by the clear need of many organizations, companies, and researchers to deal with big data volumes efficiently. Examples include web analytics applications, scientific applications, and social networks. A popular data processing engine for big data is Hadoop MapReduce. Early versions of Hadoop MapReduce suffered from severe performance problems. Today, this is becoming...
MapReduce has been used before to analyze N-body-like data. For example, in [4], a friends-of-friends algorithm was distributed across a MapReduce-like framework. Also, in [5], Pig was used to analyze large amounts of astronomical data. In both of these, the datasets were very large, in the hundreds of GBs and low TBs. These examples give hope that MapReduce can be used effectively on an N-body...