Kafka Architecture

January 9, 2025

🚀 What’s New in This 2025 Update

Major Updates and Changes

Kafka 4.0.0 Architecture - Complete removal of ZooKeeper dependency
KRaft Mode - Kafka’s native consensus protocol now mandatory
Performance Enhancements - Faster rebalancing and failover
New Consumer Group Protocol - KIP-848 is generally available; consumers opt in with group.protocol=consumer
Cloud-Native Features - Docker images and BYOC support
Java Requirements - Java 17 for brokers, Java 11+ for clients

Deprecated Features

❌ ZooKeeper - Completely removed in Kafka 4.0.0
❌ Legacy Wire Formats - Pre-0.10.x formats no longer supported
❌ Java 8 - No longer supported
❌ --zookeeper CLI flags - Removed from all admin tools

Ready to master Apache Kafka’s revolutionary architecture? Let’s dive into the distributed streaming platform that powers real-time data at scale.

mindmap
  root((Kafka 4.0 Architecture))
    Core Components
      Records
      Topics
      Partitions
      Brokers
    Data Flow
      Producers
      Consumers
      Consumer Groups
    Storage
      Logs
      Segments
      Compaction
    Distributed System
      KRaft Consensus
      Replication
      Clusters
    Modern Features
      Cloud-Native
      Streaming
      Connect

Kafka's architecture is built from Records, Topics, Consumers, Producers, Brokers, Logs, Partitions, and Clusters. A Record holds an optional key, a value, headers, and a timestamp, and is immutable once written. Think of a Kafka Topic as a named stream of records (for example "orders" or "user-signups"): your data highway. Topic names may contain only ASCII letters, digits, ".", "_", and "-", so a leading "/" is not allowed. Each topic has a Log, its on-disk storage, which is split into partitions, and each partition is stored as a sequence of segment files. The Producer API streams data in, while the Consumer API streams data out. Brokers (Kafka servers) form a cluster, and a well-provisioned cluster can handle millions of messages per second.
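To make these building blocks concrete, here is a minimal in-memory sketch in Python. It is not Kafka's storage code: the names Record, PartitionLog, and Topic are illustrative assumptions, and a plain list stands in for the on-disk segment files.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Record:
    value: bytes
    key: Optional[bytes] = None                  # the key is optional
    headers: Tuple[Tuple[str, bytes], ...] = ()  # (name, value) pairs
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))


class PartitionLog:
    """Append-only log; the list index plays the role of the offset."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def append(self, record: Record) -> int:
        self._records.append(record)
        return len(self._records) - 1  # offset assigned to the new record

    def read(self, offset: int, max_records: int = 10) -> List[Record]:
        return self._records[offset:offset + max_records]


class Topic:
    """A named stream whose log is split into independent partitions."""

    def __init__(self, name: str, num_partitions: int) -> None:
        self.name = name
        self.partitions = [PartitionLog() for _ in range(num_partitions)]
```

Appending the first record to a partition of a hypothetical "orders" topic returns offset 0, the next append returns 1, and so on. Each partition keeps its own independent offset sequence.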

Cloudurable provides Kafka training, Kafka consulting, Kafka support and helps setting up Kafka clusters in AWS.

Topics, Producers and Consumers

Kafka Architecture - Topics, Producers and Consumers Diagram

flowchart TB
  subgraph Kafka Cluster
    direction TB
    T1[Topic: Orders]
    T2[Topic: User Signups]
    T3[Topic: Payments]
    B1[Broker 1]
    B2[Broker 2]
    B3[Broker 3]
    T1 --> B1
    T2 --> B2
    T3 --> B3
  end
  P1[Producer 1] -->|Writes| T1
  P2[Producer 2] -->|Writes| T2
  P3[Producer 3] -->|Writes| T3
  C1[Consumer Group A] -->|Reads| T1
  C2[Consumer Group B] -->|Reads| T2
  C3[Consumer Group C] -->|Reads| T3
  classDef default fill:#bbdefb,stroke:#1976d2,stroke-width:1px,color:#333333
  class P1,P2,P3,C1,C2,C3,T1,T2,T3,B1,B2,B3 default

Step-by-Step Explanation:

Producers write messages to specific topics
Topics distribute across multiple brokers for scalability
Consumer groups read from topics independently
Each broker handles multiple topic partitions
Data flows from producers through topics to consumers

Core Kafka with KRaft

Kafka Architecture - Core Kafka Diagram

KRaft: The New Heart of Kafka

Kafka 4.0 eliminates ZooKeeper entirely. KRaft (Kafka Raft) handles:

Leader election for the controller quorum and for partitions
Metadata management across the cluster
Configuration storage and propagation
Broker registration and liveness tracking as brokers join and leave

This architectural shift delivers faster failover and simpler operations: without ZooKeeper there is one less system to monitor, secure, and scale.
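Because KRaft is based on Raft, the controller quorum needs a majority of its voters to be available. The arithmetic below is a small sketch of that majority rule; the function names are illustrative and not part of any Kafka API.

```python
def quorum_size(voters: int) -> int:
    """Smallest majority of a voter set."""
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """Voters that can be lost while a majority remains."""
    return voters - quorum_size(voters)


for n in (1, 3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Three controllers tolerate one failure and five tolerate two. Four controllers still tolerate only one, which is why odd quorum sizes are the usual choice.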

Producer, Consumer, Topic Details

Producers write to Topics. Consumers read from Topics. Simple? The magic happens underneath.

Each topic’s log structure enables:

Sequential writes - Blazing fast disk I/O
Parallel reads - Multiple consumers process simultaneously
Configurable retention - Time or size-based cleanup
Offset tracking - Consumers control their position

Topic logs split into partitions spread across nodes. Consumers in a group coordinate to process partitions in parallel. Kafka replicates partitions for fault tolerance.
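Offset tracking and retention can be sketched together. The Python below is a simplified model under stated assumptions: RetainedLog and Consumer are hypothetical names, retention is applied per record (Kafka deletes whole segments), and a read below the oldest retained offset simply jumps to the earliest record (a real consumer applies its auto.offset.reset policy in that case).

```python
from typing import List, Tuple


class RetainedLog:
    """Hypothetical single-partition log with time-based retention."""

    def __init__(self, retention_ms: int) -> None:
        self.retention_ms = retention_ms
        self.base_offset = 0                      # offset of the oldest retained record
        self.entries: List[Tuple[int, str]] = []  # (timestamp_ms, value)

    def append(self, timestamp_ms: int, value: str) -> int:
        self.entries.append((timestamp_ms, value))
        return self.base_offset + len(self.entries) - 1

    def enforce_retention(self, now_ms: int) -> None:
        # Drop from the head of the log; offsets of surviving records never change.
        while self.entries and now_ms - self.entries[0][0] > self.retention_ms:
            self.entries.pop(0)
            self.base_offset += 1

    def read(self, offset: int, max_records: int = 100) -> List[Tuple[int, str]]:
        start = max(offset, self.base_offset)  # simplification: skip to earliest
        index = start - self.base_offset
        batch = self.entries[index:index + max_records]
        return [(start + i, value) for i, (_, value) in enumerate(batch)]


class Consumer:
    """The consumer, not the log, owns the read position."""

    def __init__(self, log: RetainedLog) -> None:
        self.log = log
        self.position = 0

    def poll(self, max_records: int = 100) -> List[str]:
        batch = self.log.read(self.position, max_records)
        if batch:
            self.position = batch[-1][0] + 1  # next offset to fetch
        return [value for _, value in batch]

    def seek(self, offset: int) -> None:
        self.position = offset  # rewind or skip ahead at will
```

Because the position lives in the consumer, two consumers can read the same log at different speeds, and either one can rewind with seek() without affecting the other.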

Topic Partition, Consumer Groups, and Offsets

Kafka Architecture: Topic Partition, Consumer group, Offset and Producers Diagram

classDiagram
  class Topic {
    +name: string
    +partitions: Partition[]
    +replicationFactor: number
    +retentionMs: long
    +addPartition(): void
    +getPartition(id: number): Partition
  }
  
  class Partition {
    +id: number
    +leader: Broker
    +replicas: Broker[]
    +logSegments: Segment[]
    +highWatermark: long
    +appendRecord(record: Record): void
  }
  
  class ConsumerGroup {
    +groupId: string
    +members: Consumer[]
    +coordinator: Broker
    +protocol: string
    +rebalance(): void
    +commitOffset(partition: number, offset: long): void
  }
  
  class Producer {
    +clientId: string
    +acks: string
    +compressionType: string
    +send(record: Record): Future
    +flush(): void
  }
  
  Topic "1" *-- "many" Partition
  ConsumerGroup "many" -- "many" Partition : reads from
  Producer "many" -- "many" Partition : writes to

Step-by-Step Explanation:

Topics contain multiple partitions for parallelism
Each partition has a leader broker and replicas
Consumer groups coordinate partition assignment
Producers can write to multiple partitions
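The new KIP-848 protocol moves partition assignment to the group coordinator, so rebalances happen incrementally instead of pausing the whole group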
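The core rule behind consumer groups is that each partition is owned by exactly one member of the group at a time. Here is a minimal round-robin sketch of that rule in Python. It is not one of Kafka's built-in assignors, and assign_partitions is a hypothetical helper.

```python
from typing import Dict, List


def assign_partitions(partitions: List[int], members: List[str]) -> Dict[str, List[int]]:
    """Give every partition to exactly one group member, round-robin."""
    ordered = sorted(members)
    assignment: Dict[str, List[int]] = {member: [] for member in ordered}
    if not ordered:
        return assignment
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


print(assign_partitions([0, 1, 2, 3, 4, 5], ["c1", "c2"]))
print(assign_partitions([0, 1], ["c1", "c2", "c3"]))
```

The second call shows why partition count caps parallelism: with two partitions and three consumers, one consumer receives nothing and sits idle.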

Scale and Speed

Ever wondered how Kafka handles billions of messages? The secret sauce:

Write Performance:

Sequential disk writes can sustain hundreds of MB/second, depending on hardware
Partitioning enables parallel writes across brokers
Zero-copy transfer minimizes CPU overhead

Read Performance:

Consumer groups parallelize processing
Batch fetching reduces network overhead
Smart caching leverages OS page cache

Horizontal Scaling:

Add brokers to increase capacity
Partition count determines parallelism
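New partitions are placed across brokers automatically; existing partitions move to new brokers only after a reassignment (for example with kafka-reassign-partitions.sh)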
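Partitioning by key is what spreads writes across brokers while keeping records with the same key in order. The sketch below illustrates the idea in Python with one loud assumption: it uses zlib.crc32 as a stand-in hash, whereas Kafka's Java client hashes keys with murmur2, so the partition numbers here will not match a real cluster.

```python
import zlib
from collections import Counter
from typing import Iterable


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition: same key, same partition (stand-in hash)."""
    return zlib.crc32(key) % num_partitions


def spread(keys: Iterable[bytes], num_partitions: int) -> Counter:
    """Count how many keys land on each partition."""
    return Counter(partition_for(key, num_partitions) for key in keys)


keys = [f"user-{i}".encode() for i in range(1000)]
print(spread(keys, 6))
```

Note that the mapping depends on the partition count. Adding partitions to an existing topic changes where some keys land, which is one reason to choose the partition count carefully up front.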

Kafka Brokers

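A Kafka cluster can run on a single broker for development, typically starts at 3 brokers in production, and can grow to 1,000+ brokers. Each broker:

Has a unique ID
Stores partition replicas
Handles client connections
Fetches cluster metadata from the KRaft controller quorum (only nodes with the controller role vote in the consensus)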

Bootstrap any client by connecting to any broker—the cluster topology propagates automatically.

Cluster, Failover, and ISRs

stateDiagram-v2
  [*] --> Healthy: All replicas in sync
  Healthy --> LeaderFailure: Leader broker crashes
  LeaderFailure --> ElectingLeader: KRaft controller
  ElectingLeader --> NewLeader: ISR promoted
  NewLeader --> Recovering: Sync replicas
  Recovering --> Healthy: All replicas caught up
  
  state Healthy {
    [*] --> Normal
    Normal --> Writing: Producer sends
    Writing --> Replicating: To ISRs
    Replicating --> Normal: Acknowledged
  }
  
  classDef healthy fill:#c8e6c9,stroke:#43a047
  classDef failure fill:#ffcdd2,stroke:#e53935
  classDef election fill:#bbdefb,stroke:#1976d2
  classDef recovering fill:#fff9c4,stroke:#f9a825
  
  class Healthy healthy
  class LeaderFailure failure
  class ElectingLeader,NewLeader election
  class Recovering recovering

Step-by-Step Explanation:

Healthy state: Leader handles writes, replicates to ISRs
Leader failure triggers automatic failover
The active KRaft controller elects a new partition leader from the ISRs
New leader accepts writes immediately
Cluster recovers to full replication
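The election step can be sketched as a small function. This is a simplified Python model, not the controller's actual code: elect_leader and its parameters are hypothetical, and it assumes the controller prefers the first replica in assignment order that is both alive and in the ISR.

```python
from typing import List, Optional, Set


def elect_leader(
    replicas: List[int],
    isr: Set[int],
    live_brokers: Set[int],
    allow_unclean: bool = False,
) -> Optional[int]:
    """Pick a new partition leader, or None if the partition must stay offline."""
    # Clean election: only a live, in-sync replica may lead.
    for broker in replicas:
        if broker in isr and broker in live_brokers:
            return broker
    # Unclean election: any live replica may lead, possibly losing records.
    if allow_unclean:
        for broker in replicas:
            if broker in live_brokers:
                return broker
    return None


# Broker 1 (the old leader) crashed; brokers 2 and 3 are alive and in sync.
print(elect_leader([1, 2, 3], isr={1, 2, 3}, live_brokers={2, 3}))  # 2
```

If no in-sync replica is alive and unclean election is off, the function returns None: the partition stays unavailable rather than risk losing acknowledged records.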

Kafka achieves high availability through:

Replication Factor - Set to 3+ for production
ISRs (In-Sync Replicas) - Replicas caught up with leader
Min ISRs (min.insync.replicas) - Minimum number of in-sync replicas required before an acks=all write is accepted
Unclean Leader Election - Trade-off between availability and consistency
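How these settings interact is easier to see in code. The Python below is a minimal model, assuming two rules: an acks=all write is rejected when the ISR is smaller than min.insync.replicas, and the high watermark is the lowest log end offset among the in-sync replicas. The function and exception names are illustrative, not Kafka APIs.

```python
from typing import Dict, Set


class NotEnoughReplicas(Exception):
    """Raised when an acks=all write cannot meet the min ISR requirement."""


def check_write_allowed(acks: str, isr_size: int, min_insync_replicas: int) -> None:
    # The min ISR check applies only to producers that use acks=all.
    if acks == "all" and isr_size < min_insync_replicas:
        raise NotEnoughReplicas(
            f"ISR size {isr_size} is below min.insync.replicas={min_insync_replicas}"
        )


def high_watermark(log_end_offsets: Dict[int, int], isr: Set[int]) -> int:
    """Consumers may read only below this offset: every ISR member has the data."""
    return min(log_end_offsets[broker] for broker in isr)


def tolerated_broker_failures(replication_factor: int, min_insync_replicas: int) -> int:
    """Replicas that can fail while acks=all writes are still accepted."""
    return replication_factor - min_insync_replicas


# Broker 3 has fallen behind and dropped out of the ISR.
print(high_watermark({1: 10, 2: 8, 3: 5}, isr={1, 2}))                      # 8
print(tolerated_broker_failures(replication_factor=3, min_insync_replicas=2))  # 1
```

With a replication factor of 3 and min.insync.replicas=2, one broker can fail without blocking acks=all producers. A second failure makes writes fail rather than silently reducing durability.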

Failover vs. Disaster Recovery

Failover (Automatic):

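Replication handles rack/AZ failures when replicas are spread across racks or AZs (set broker.rack on each broker)
RF=3 survives a single AZ outage if each replica lives in a different AZ
Fast leader election with KRaft, typically much quicker than in ZooKeeper-based clusters
No loss of acknowledged writes with acks=all, min.insync.replicas=2, and unclean.leader.election.enable=false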

Disaster Recovery (Manual):

MirrorMaker 2.0 for cross-region replication
Active-passive or active-active setups
RPO/RTO depends on replication lag
Automated failover possible with additional tooling

Modern Kafka Architecture with KRaft

flowchart TB
  subgraph "Kafka 4.0 Cluster"
    direction TB
    subgraph "KRaft Controllers"
      KC1[Controller 1<br>Leader]
      KC2[Controller 2<br>Follower]
      KC3[Controller 3<br>Follower]
      KC1 -.->|Raft Consensus| KC2
      KC1 -.->|Raft Consensus| KC3
    end
    
    subgraph "Kafka Brokers"
      B1[Broker 1]
      B2[Broker 2]
      B3[Broker 3]
      B4[Broker 4]
    end
    
    subgraph "Metadata"
      MD[Cluster Metadata<br>Topics, Partitions, ACLs]
    end
    
    KC1 -->|Manages| MD
    MD -->|Propagates to| B1
    MD -->|Propagates to| B2
    MD -->|Propagates to| B3
    MD -->|Propagates to| B4
  end
  
  P[Producers] -->|Write| B1
  P -->|Write| B2
  C[Consumers] -->|Read| B3
  C -->|Read| B4
  
  classDef controller fill:#e1bee7,stroke:#8e24aa,stroke-width:2px,color:#333333
  classDef broker fill:#bbdefb,stroke:#1976d2,stroke-width:1px,color:#333333
  classDef client fill:#c8e6c9,stroke:#43a047,stroke-width:1px,color:#333333
  
  class KC1,KC2,KC3 controller
  class B1,B2,B3,B4 broker
  class P,C client

Step-by-Step Explanation:

KRaft controllers form a Raft consensus group
Controller leader manages all metadata
Metadata propagates to all brokers
Brokers handle client requests independently
No external coordination service needed
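Metadata propagation in KRaft works like log replication: the controller quorum appends changes to a metadata log, and each broker replays the records it has not yet seen. The sketch below models that idea in Python. The class names and the record types ("create_topic", "delete_topic") are hypothetical and far simpler than Kafka's real metadata records.

```python
from typing import Dict, List


class MetadataLog:
    """Ordered log of cluster changes, owned by the active controller."""

    def __init__(self) -> None:
        self.records: List[dict] = []

    def append(self, record: dict) -> int:
        self.records.append(record)
        return len(self.records)  # new end offset of the log


class BrokerMetadataCache:
    """A broker's local view, built by replaying the metadata log in order."""

    def __init__(self) -> None:
        self.next_offset = 0              # first record not yet applied
        self.topics: Dict[str, int] = {}  # topic name -> partition count

    def catch_up(self, log: MetadataLog) -> None:
        for record in log.records[self.next_offset:]:
            self._apply(record)
        self.next_offset = len(log.records)

    def _apply(self, record: dict) -> None:
        if record["type"] == "create_topic":
            self.topics[record["name"]] = record["partitions"]
        elif record["type"] == "delete_topic":
            self.topics.pop(record["name"], None)


log = MetadataLog()
log.append({"type": "create_topic", "name": "orders", "partitions": 3})
broker = BrokerMetadataCache()
broker.catch_up(log)
print(broker.topics)  # {'orders': 3}
```

A broker that restarts or falls behind only needs to resume from its last applied offset, so every broker converges on the same view without an external coordination service.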

Kafka Topics Architecture

Ready to explore deeper? Continue reading about Kafka Topics Architecture to understand partitions, parallel processing, and advanced configurations.

