Search results for: Apache Spark

Number of results: 18089

2015
Mohit Sewak Sachchidanand Singh

Apache Spark is an execution engine that, besides working as an isolated, distributed, in-memory computing engine, also offers close integration with Hadoop’s distributed file system (HDFS). Apache Spark's underlying appeal is in providing a unified framework to create sophisticated applications involving workloads. It unifies multiple workloads, handles unstructured data very well and has easy-to...

Journal: CoRR 2017
Ahsan Javed Awan Mats Brorsson Vladimir Vlassov Eduard Ayguadé

While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. There is also a renewed interest in Near Data Computing (NDC) due to technological advancement in the last decade. However, it is not known if ...

Journal: Software Testing, Verification & Reliability 2022

Summary: This paper proposes TRANSMUT‐Spark for automating mutation testing of big data processing code within Spark programs. Apache Spark is an engine for analytics/processing that hides the inherent complexity of parallel programming. Nonetheless, programmers must cleverly combine built‐in functions within programs and guide the engine to use the right management strategies to exploit the computational resources required and avoid sub...

2017
Bilal Akil Ying Zhou Uwe Röhm

Distributed data processing platforms for cloud computing are important tools for large-scale data analytics. Apache Hadoop MapReduce has become the de facto standard in this space, though its programming interface is relatively low-level, requiring many implementation steps even for simple analysis tasks. This has led to the development of advanced dataflow oriented platforms, most prominently...

Journal: Proceedings of the Institute for System Programming of RAS 2015

2016
Kun Liu Jan Boehm Christian Alis

Change detection has long been a challenging problem although a lot of research has been conducted in different fields such as remote sensing and photogrammetry, computer vision, and robotics. In this paper, we blend voxel grid and Apache Spark together to propose an efficient method to address the problem in the context of big data. Voxel grid is a regular geometry representation consisting of...

2018
Matteo Cossu Michael Färber Georg Lausen

The rapidly growing size of RDF graphs in recent years necessitates distributed storage and parallel processing strategies. To obtain efficient query processing using computer clusters a wide variety of different approaches have been proposed. Related to the approach presented in the current paper are systems built on top of Hadoop HDFS, for example using Apache Accumulo or using Apache Spark. ...
