Stream Processing with Bigdata by SSS-MapReduce
Authors
Abstract
We propose a MapReduce-based stream processing system, called SSS, which is capable of processing streams together with large-scale static data. Unlike existing stream processing systems, which can work only on relatively small in-memory datasets, SSS can process incoming streamed data while consulting stored data. SSS processes streamed data with continuous Mappers and Reducers that are periodically invoked by the system. It also supports a merge operation on two sets of data, which enables stream data processing against large static data. This poster presents an overview of SSS stream processing and preliminary evaluation results.
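The processing model described in the abstract can be illustrated with a small, single-process sketch. This is not the SSS API: the function names `run_batch`, `merge`, and `process_stream`, the batch representation, and the inner-join semantics of the merge are all assumptions made for illustration. Each batch stands in for one periodic invocation of the continuous Mapper and Reducer, and the merge step joins the reduced stream output with a static key-value set.

```python
from collections import defaultdict


def run_batch(records, mapper, reducer):
    """Map every record to (key, value) pairs, group by key, reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def merge(stream_kv, static_kv, combine):
    """Join two key-value sets on key (assumed here: keep keys present in both)."""
    return {
        key: combine(value, static_kv[key])
        for key, value in stream_kv.items()
        if key in static_kv
    }


def process_stream(batches, mapper, reducer, static_kv, combine):
    """Run the map/reduce pair once per batch, standing in for periodic invocation."""
    for batch in batches:
        reduced = run_batch(batch, mapper, reducer)
        yield merge(reduced, static_kv, combine)


# Hypothetical data: a static price table and two batches of streamed item names.
static_prices = {"apple": 2, "pear": 3}
batches = [["apple", "pear", "apple"], ["pear", "kiwi"]]

results = list(
    process_stream(
        batches,
        mapper=lambda item: [(item, 1)],            # emit a count of 1 per item
        reducer=lambda key, counts: sum(counts),    # total count per item
        static_kv=static_prices,
        combine=lambda count, price: count * price  # revenue per item
    )
)
print(results)
```

In this sketch the static data is a plain dictionary held in memory; the point of the merge in the abstract is that the stored side can be large, so a real system would keep it in distributed storage rather than in a local dictionary.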
Similar resources
On the usability of Hadoop MapReduce, Apache Spark & Apache Flink for data science
Distributed data processing platforms for cloud computing are important tools for large-scale data analytics. Apache Hadoop MapReduce has become the de facto standard in this space, though its programming interface is relatively low-level, requiring many implementation steps even for simple analysis tasks. This has led to the development of advanced dataflow oriented platforms, most prominently...
A Study of Data Management Technology for Handling Big Data
The amount of data is increasing daily. Data requires storage and effective processing for information retrieval. These both are challenge in case of the BigData due its velocity, variety and volume. It requires different management and efficient information retrieval schemes. There are different techniques available for the management of the Bigdata. The distribution of the storage and the pro...
The Prototype for Implementation of Security Issue in Big Data Application using Hadoop Server
A large amount of data can be referred as BigData. A vast size of data requires special kind of methodology to process and store. BigData research consortium team developed a distributed server known as Hadoop Server, to divide and partition large data into multiple pieces for fast and efficient processing. Hadoop is an open source solution developed by Google Corporation for large data process...
A Modified Key Partitioning for BigData Using MapReduce in Hadoop
Corresponding author: Gothai Ekambaram, Department of CSE, Kongu Engineering College, Erode 638052, Tamilnadu, India. Abstract: In the period of BigData, massive amounts of structured and unstructured data are being created every day by a multitude of ever-present sources. BigData is complicated to work with and needs extremely parallel software executing on a huge number...
H2RDF+: High-performance distributed joins over large-scale RDF graphs
The proliferation of data in RDF format calls for efficient and scalable solutions for their management. While scalability in the era of big data is a hard requirement, modern systems fail to adapt based on the complexity of the query. Current approaches do not scale well when faced with substantially complex, non-selective joins, resulting in exponential growth of execution times. In this work...