Building a Replicated Logging System with Apache Kafka

Authors

Guozhang Wang

Joel Koshy

Sriram Subramanian

Kartik Paramasivam

Mammad Zadeh

Neha Narkhede

Jun Rao

Jay Kreps

Joe Stein

Abstract

Apache Kafka is a scalable publish-subscribe messaging system whose core architecture is a distributed commit log. It was originally built at LinkedIn as a centralized event pipelining platform for online data integration tasks. Over the past years of developing and operating Kafka, we have extended its log-structured architecture into a replicated logging backbone for a much wider range of applications in distributed environments. In this abstract, we describe our design and engineering experience in replicating Kafka logs for various distributed data-driven systems at LinkedIn, including source-of-truth data storage and stream processing.

Related resources

Kafka, Samza and the Unix Philosophy of Distributed Data

Apache Kafka is a scalable message broker, and Apache Samza is a stream processing framework built upon Kafka. They are widely used as infrastructure for implementing personalized online services and real-time predictive analytics. Besides providing high throughput and low latency, Kafka and Samza are designed with operational robustness and long-term maintenance of applications in mind. In thi...

Letter from the Editor-in-Chief Delayed

Real-time Text Analytics Pipeline Using Open-source Big Data Tools

Real-time text processing systems are required in many domains to quickly identify patterns, trends, sentiments, and insights. Nowadays, social networks, e-commerce stores, blogs, scientific experiments, and server logs are main sources generating huge text data. However, to process huge text data in real time requires building a data processing pipeline. The main challenge in building such pip...

‘Like waking up in a Franz Kafka novel’: Service users' experiences of the child protection system when domestic violence and acrimonious separations are involved

Processing IoT Data with Cloud Computing for Smart Cities

A smart city requires the intelligent management of infrastructure like the Internet of Things (IoT) devices in order to provide smart services that improve the quality of human life. To obtain the information needed to implement smart city services, stream reasoning is used to intelligently process the big data stream constantly generated from IoT devices. However, there are constraints associ...

Journal:

PVLDB

Volume 8, Issue

Pages -

Publication date: 2015
