Apache Kafka

Introduction

Apache Kafka® is a distributed streaming platform.

A streaming platform has three key capabilities:

  • Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system.

  • Store streams of records in a fault-tolerant durable way.

  • Process streams of records as they occur. Kafka is generally used for two broad classes of applications:

  • Building real-time streaming data pipelines that reliably get data between systems or applications

  • Building real-time streaming applications that transform or react to the streams of data

    To understand how Kafka does these things, let's dive in and explore Kafka's capabilities from the bottom up.

First a few concepts:

  • Kafka is run as a cluster on one or more servers that can span multiple datacenters.

  • The Kafka cluster stores streams of records in categories called topics.

  • Each record consists of a key, a value, and a timestamp.

Kafka Adoption Stories

References

Apache Kafka and Microservices

Apache Kafka and Blockchain

Last updated