Search results for: mapreduce

Number of results: 3018

2011
Frederick Maier Pascal Hitzler

It has recently been shown that the MapReduce framework for distributed computation can be used effectively for large-scale RDF Schema reasoning, computing the deductive closure of over a billion RDF triples within a reasonable time [23]. Later work has carried this approach over to OWL Horst [22]. In this paper, we provide a MapReduce algorithm for classifying knowledge bases in the descriptio...

Journal: :Big data 2013
Jimmy J. Lin

Hadoop is currently the large-scale data analysis "hammer" of choice, but there exist classes of algorithms that aren't "nails" in the sense that they are not particularly amenable to the MapReduce programming model. To address this, researchers have proposed MapReduce extensions or alternative programming models in which these algorithms can be elegantly expressed. This article espouses a very...

2013
Changchun Zhang Lei Wu Jing Li

The MapReduce framework has been widely used to process and analyze large-scale datasets over large clusters. As an essential problem, the join operation across large clusters has attracted growing attention in recent years due to the adoption of MapReduce. Many strategies have been proposed to improve the efficiency of distributed joins, among which the Bloom filter is a successful one. However, the b...
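As a rough illustration of the general idea mentioned in this abstract (not the strategy proposed by the paper, whose abstract is truncated here), a Bloom filter built over the join keys of the smaller relation can discard non-matching records of the larger relation before the join. The class `BloomFilter`, the function `bloom_join`, and the parameters `num_bits` and `num_hashes` are hypothetical names chosen for this single-process sketch.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; sizes are illustrative assumptions, not tuned values."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True may be a false positive.
        return all(self.bits[pos] for pos in self._positions(key))


def bloom_join(small, large):
    """Join two lists of (key, value) pairs, pruning `large` with a Bloom filter."""
    bf = BloomFilter()
    for key, _ in small:
        bf.add(key)

    # In a distributed setting this pruning would run on the map side,
    # so fewer records of the large relation are shuffled.
    candidates = [(k, v) for k, v in large if bf.might_contain(k)]

    # Exact hash join on the survivors removes any false positives.
    index = {}
    for k, v in small:
        index.setdefault(k, []).append(v)
    return [(k, sv, lv) for k, lv in candidates for sv in index.get(k, [])]
```

The filter never produces false negatives, so the pruning step cannot drop a matching record; false positives only cost some extra work in the final exact join.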

2012
Daniel Q. Duffy John L. Schnase John H. Thompson Shawn M. Freeman Thomas L. Clune

MapReduce is an approach to high-performance analytics that may be useful to data-intensive problems in climate research. It offers an analysis paradigm that uses clusters of computers and combines distributed storage of large data sets with parallel computation. We are particularly interested in the potential of MapReduce to speed up basic operations common to a wide range of analyses. In orde...

2015
Varda C. Dhande B. V. Pawar

This paper presents a survey of data mining, data mining using rough set theory, and data mining using a parallel method for rough set approximation with the MapReduce technique. With the development of information technology, data is growing at a tremendous rate, so big data mining and knowledge discovery have become a new challenge. Rough set theory has been successfully applied in data mining by using MapRe...

2009
Matei Zaharia Dhruba Borthakur Joydeep Sen Sarma Khaled Elmeleegy Scott Shenker Ion Stoica

Sharing a MapReduce cluster between users is attractive because it enables statistical multiplexing (lowering costs) and allows users to share a common large data set. However, we find that traditional scheduling algorithms can perform very poorly in MapReduce due to two aspects of the MapReduce setting: the need for data locality (running computation where the data is) and the dependence betwe...

Journal: :CoRR 2016
Jernej Vicic Andrej Brodnik

The manuscript presents an experiment in implementing a machine translation system in the MapReduce model. The empirical evaluation was done using fully implemented translation systems embedded into the MapReduce programming model. Two machine translation paradigms were studied: shallow-transfer rule-based machine translation and statistical machine translation. The results show that the Map...

2012
Yongzhi Wang Jinpeng Wei Mudhakar Srivatsa

MapReduce [1] is becoming a popular data processing application in cloud environments. However, security issues make many customers reluctant to move their critical computation tasks to the cloud. For instance, [2] points out a real security vulnerability that the cloud service leader Amazon EC2 suffers from: some members of EC2 can create and share Amazon Machine Images (AMIs) with the EC2 community so...

2017
Olivier Beaumont Thomas Lambert Loris Marchal Bastien Thomas

MapReduce is a well-known framework for distributing data-processing computations on parallel clusters. In MapReduce, a large computation is broken into small tasks that run in parallel on multiple machines, and it scales easily to very large clusters of inexpensive commodity computers. Before the Map phase, the original dataset is first split into chunks that are replicated (a constant number of ...
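The structure described in this abstract (split the dataset into chunks, run one map task per chunk, then reduce) can be sketched in a single process as follows. This is a minimal word-count illustration, not code from the paper: the function names and `chunk_size` are assumptions made for the example, the map tasks run sequentially rather than on a cluster, and chunk replication is not modeled.

```python
from collections import defaultdict


def split_into_chunks(records, chunk_size):
    """Split the input dataset into fixed-size chunks, one per map task."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def map_task(chunk):
    """Map phase: emit a (word, 1) pair for every word in the chunk."""
    return [(word, 1) for line in chunk for word in line.split()]


def shuffle(mapped_outputs):
    """Group intermediate values by key between the map and reduce phases."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_task(key, values):
    """Reduce phase: combine all values for one key."""
    return key, sum(values)


def word_count(records, chunk_size=2):
    chunks = split_into_chunks(records, chunk_size)
    # Each map task depends only on its own chunk, which is what lets a
    # real framework schedule the tasks on different machines.
    mapped = [map_task(chunk) for chunk in chunks]
    grouped = shuffle(mapped)
    return dict(reduce_task(k, v) for k, v in grouped.items())
```

For example, `word_count(["a b a", "b c", "a"])` processes two chunks and returns the per-word totals as a dictionary.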

Journal: :CoRR 2011
Herodotos Herodotou

Hadoop MapReduce is now a popular choice for performing large-scale data analytics. This technical report describes a detailed set of mathematical performance models for describing the execution of a MapReduce job on Hadoop. The models describe dataflow and cost information at the fine granularity of phases within the map and reduce tasks of a job execution. The models can be used to estimate t...
