January 9, 2025
🚀 What’s New in This 2025 Update
Major Updates and Changes
- KIP-848 Protocol - Revolutionary consumer rebalancing without global pauses
- Elimination of Rebalance Downtime - Consumers continue processing during rebalances
- Queue Semantics - Native point-to-point messaging (early access)
- KRaft-Based Coordination - Simplified group management without ZooKeeper
- Metadata Rebootstrap - Automatic recovery from metadata failures
- Enhanced Scalability - Support for larger consumer groups
Deprecated Features
- ❌ ZooKeeper-based coordination - Completely removed
- ❌ Legacy rebalance protocols - Replaced by KIP-848
- ❌ Pre-2.1 client protocols - No longer supported
- ❌ Old consumer group management tools - Updated for KRaft
Ready to build resilient, high-performance consumer applications? Let’s explore how Kafka 4.0 revolutionizes consumer architecture.
Kafka Consumer Architecture - Consumer Groups and Subscriptions
mindmap
  root((Kafka Consumer Architecture))
    Consumer Groups
      Group Coordination
      Offset Management
      Load Balancing
      KIP-848 Protocol
    Processing Models
      At-least-once
      Exactly-once
      Queue Semantics
    Failover & Recovery
      Partition Reassignment
      Consumer Heartbeats
      Automatic Recovery
    Advanced Features
      Threading Models
      Batch Processing
      Metadata Bootstrap
      Performance Tuning
This deep dive into Kafka consumer architecture builds on Kafka Architecture, Kafka Topic Architecture, and Kafka Producer Architecture.
Modern Kafka consumers leverage the revolutionary KIP-848 protocol for seamless rebalancing, enabling massive scale without sacrificing availability.
Cloudurable provides Kafka training, Kafka consulting, Kafka support and helps set up Kafka clusters in AWS.
Kafka Consumer Groups
Consumer groups transform Kafka from a simple messaging system into a distributed processing powerhouse. Group consumers by:
- Business function - Orders processing, analytics, alerting
- Processing type - Real-time, batch, streaming
- Destination system - Databases, data lakes, microservices
Each consumer group maintains independent progress through topics, enabling multiple applications to process the same data stream at their own pace.
Core Concepts:
- Unique Group ID - Identifies the consumer group
- Subscription Model - Groups subscribe to one or more topics
- Per-Partition Offsets - Each group tracks progress independently
- Exclusive Consumption - Only one consumer per partition within a group
- Load Balancing - Automatic distribution of partitions
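The core concepts above come together in the consumer's configuration. A minimal sketch of the settings a consumer needs to join a group (the class name, broker address, and group name are illustrative; the config keys and deserializer classes are the standard Kafka client ones):

```java
import java.util.Properties;

class GroupConfig {
    // Build the minimal configuration a consumer needs to join a group.
    static Properties forGroup(String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("group.id", groupId);                   // unique group ID
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("auto.offset.reset", "earliest");       // where a brand-new group starts
        return props;
    }
}
```

Every consumer that shares the same `group.id` joins the same group and splits the topic's partitions with its peers.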
Kafka Consumer Groups Architecture
flowchart TB
subgraph Topic["Topic: Orders"]
P0[Partition 0]
P1[Partition 1]
P2[Partition 2]
P3[Partition 3]
end
subgraph CG1["Consumer Group: Analytics"]
C1A[Consumer 1A]
C2A[Consumer 2A]
C1A --> P0
C1A --> P1
C2A --> P2
C2A --> P3
end
subgraph CG2["Consumer Group: Billing"]
C1B[Consumer 1B]
C2B[Consumer 2B]
C3B[Consumer 3B]
C1B --> P0
C2B --> P1
C2B --> P2
C3B --> P3
end
subgraph Offsets["Offset Tracking"]
O1[Analytics: P0=1000, P1=1200...]
O2[Billing: P0=800, P1=950...]
end
classDef partition fill:#e1bee7,stroke:#8e24aa,stroke-width:1px,color:#333333
classDef consumer fill:#bbdefb,stroke:#1976d2,stroke-width:1px,color:#333333
classDef offset fill:#fff9c4,stroke:#f9a825,stroke-width:1px,color:#333333
class P0,P1,P2,P3 partition
class C1A,C2A,C1B,C2B,C3B consumer
class O1,O2 offset
Step-by-Step Explanation:
- Topic has 4 partitions distributed across brokers
- Analytics group uses 2 consumers processing 2 partitions each
- Billing group uses 3 consumers with uneven distribution
- Each group maintains independent offset tracking
- Groups can process at different speeds
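The independent offset tracking shown in the diagram can be modeled as a map of group-to-partition positions. This is a conceptual sketch only, not the client API; the class and method names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

class OffsetTracker {
    // groupId -> (partition -> committed offset); conceptual model of
    // how each group's progress is kept separate from every other group's
    private final Map<String, Map<Integer, Long>> offsets = new HashMap<>();

    void commit(String groupId, int partition, long offset) {
        offsets.computeIfAbsent(groupId, g -> new HashMap<>())
               .put(partition, offset);
    }

    long position(String groupId, int partition) {
        return offsets.getOrDefault(groupId, Map.of())
                      .getOrDefault(partition, 0L);
    }
}
```

Committing an offset for Analytics never affects Billing's position, which is why the two groups in the diagram can read the same partitions at different speeds.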
Revolutionary KIP-848: Zero-Downtime Rebalancing
The game-changing KIP-848 protocol eliminates the stop-the-world rebalancing that plagued earlier Kafka versions:
stateDiagram-v2
[*] --> Stable: All partitions assigned
state Traditional {
StableOld --> Rebalancing: Member change
Rebalancing --> StopProcessing: All consumers pause
StopProcessing --> Reassign: New assignments
Reassign --> Resume: All restart
Resume --> StableOld
}
state "KIP-848 Protocol" as KIP848 {
StableNew --> Incremental: Member change
Incremental --> ContinueProcessing: Others keep working
ContinueProcessing --> TargetedReassign: Only affected partitions
TargetedReassign --> StableNew: Seamless transition
}
note right of Traditional
Old: Full stop for any change
Downtime proportional to group size
end note
note right of KIP848
New: Incremental changes only
Near-zero downtime
end note
classDef traditional fill:#ffcdd2,stroke:#e53935
classDef modern fill:#c8e6c9,stroke:#43a047
class Traditional traditional
class KIP848 modern
Step-by-Step Explanation:
- Traditional protocol stops all consumers for any membership change
- KIP-848 allows unaffected consumers to continue processing
- Only partitions that need reassignment are moved
- Result: Massive reduction in rebalancing impact
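Clients opt into KIP-848 through the `group.protocol` consumer setting, assuming a 4.0+ broker. A minimal sketch (the class name is illustrative):

```java
import java.util.Properties;

class Kip848Config {
    // Opt a consumer into the KIP-848 rebalance protocol.
    static Properties rebalanceSettings() {
        Properties props = new Properties();
        props.put("group.protocol", "consumer"); // "classic" selects the old protocol
        // Under KIP-848, session timeout and heartbeat interval tuning
        // moves to broker-side group configs, so the client side stays minimal.
        return props;
    }
}
```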
Consumer Load Sharing
Kafka automatically distributes partitions across consumers for optimal load balancing:
classDiagram
class ConsumerGroup {
+groupId: string
+members: Consumer[]
+coordinator: Broker
+protocol: "KIP-848"
+sessionTimeout: int
+heartbeatInterval: int
+assignPartitions(): void
+rebalance(): void
}
class Consumer {
+consumerId: string
+assignedPartitions: Partition[]
+lastHeartbeat: timestamp
+commitOffset(partition, offset): void
+poll(): Records[]
}
class Partition {
+topicName: string
+partitionId: int
+currentOffset: long
+highWatermark: long
+leader: Broker
}
class GroupCoordinator {
+managedGroups: ConsumerGroup[]
+handleJoinGroup(): void
+handleLeaveGroup(): void
+handleHeartbeat(): void
+triggerRebalance(): void
}
ConsumerGroup "1" *-- "many" Consumer
Consumer "1" *-- "many" Partition
GroupCoordinator "1" *-- "many" ConsumerGroup
Step-by-Step Explanation:
- ConsumerGroup manages member consumers
- GroupCoordinator handles membership changes
- Consumers get exclusive access to assigned partitions
- Dynamic rebalancing maintains even distribution
Key behaviors:
- New consumer joins → Partitions redistribute for balance
- Consumer fails → Its partitions transfer to survivors
- Dynamic membership → Kafka protocol handles changes automatically
- Fair distribution → Each consumer gets approximately equal load
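The fair-distribution behavior can be illustrated with a simplified round-robin assignor. This is a conceptual sketch; Kafka's real assignors also account for stickiness, subscriptions, and rack awareness:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RoundRobinAssignor {
    // Deal partitions to consumers round-robin; with more consumers than
    // partitions, the extras simply receive an empty assignment (hot standby).
    static Map<String, List<Integer>> assign(List<String> consumers, int partitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String c : consumers) assignment.put(c, new ArrayList<>());
        for (int p = 0; p < partitions; p++) {
            assignment.get(consumers.get(p % consumers.size())).add(p);
        }
        return assignment;
    }
}
```

With 4 partitions and 3 consumers, one consumer handles two partitions; with 5 consumers, one sits idle as a standby.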
Kafka Consumer Failover
Modern Kafka ensures reliable message processing through sophisticated failover:
Offset Commit Strategies:
- Auto-commit - Periodic automatic offset commits
- Manual commit - Application controls when to commit
- Transactional - Exactly-once semantics with transactions
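The three strategies correspond to standard consumer configuration keys. A minimal sketch (the class and method names are illustrative; the config keys and values are the standard ones):

```java
import java.util.Properties;

class CommitStrategies {
    static Properties autoCommit() {
        Properties props = new Properties();
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", "5000"); // commit every 5 seconds
        return props;
    }

    static Properties manualCommit() {
        Properties props = new Properties();
        props.put("enable.auto.commit", "false"); // app calls commitSync()/commitAsync()
        return props;
    }

    static Properties transactional() {
        Properties props = new Properties();
        props.put("enable.auto.commit", "false");
        props.put("isolation.level", "read_committed"); // only see committed txn records
        return props;
    }
}
```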
Failure Scenarios:
- Consumer crashes before commit:
  - Next consumer starts from last committed offset
  - Some messages reprocessed (at-least-once)
  - Make processing idempotent
- Consumer crashes after processing and commit:
  - Offset already committed
  - No message loss or duplication
  - Clean failover
- Network partition:
  - Session timeout triggers reassignment
  - Healthy consumers take over partitions
  - Processing continues with minimal disruption
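Because the crash-before-commit scenario implies at-least-once delivery, processing must tolerate redelivery. One common approach, sketched here, is deduplicating on a stable record ID (the ID scheme and class shape are up to the application):

```java
import java.util.HashSet;
import java.util.Set;

class IdempotentProcessor {
    // Remember processed record IDs so a redelivered message is applied once.
    private final Set<String> seen = new HashSet<>();
    private long total = 0;

    // Returns true if the record was applied, false if it was a duplicate.
    boolean process(String recordId, long amount) {
        if (!seen.add(recordId)) return false; // already processed: skip
        total += amount;
        return true;
    }

    long total() { return total; }
}
```

In production the seen-set usually lives in a database or cache keyed by a business ID, so it survives consumer restarts.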
Advanced Offset Management
Kafka 4.0 stores offsets in the internal __consumer_offsets topic, with several enhancements:
flowchart LR
subgraph Consumer["Consumer Application"]
Process[Process Records]
Commit[Commit Offset]
end
subgraph Kafka["Kafka Cluster"]
OP[Offset Partition<br>Compacted Log]
Coordinator[Group Coordinator]
end
Process -->|1. Process batch| Commit
Commit -->|2. Send commit| Coordinator
Coordinator -->|3. Write offset| OP
Coordinator -->|4. Acknowledge| Commit
style Consumer fill:#bbdefb,stroke:#1976d2,stroke-width:1px,color:#333333
style Kafka fill:#e1bee7,stroke:#8e24aa,stroke-width:1px,color:#333333
Features:
- Log compaction - Only latest offset per partition retained
- Replication - Offset data replicated for durability
- Fast recovery - Consumers resume from exact position
- Group metadata - Stores consumer group configuration
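Log compaction's effect on offset storage can be modeled as a map that keeps only the latest value per key. This is a conceptual sketch, not Kafka's actual storage code; in __consumer_offsets the key is effectively (group, topic, partition) and the value is the committed offset:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CompactedOffsetLog {
    // A compacted log retains only the newest value per key,
    // so recovery reads exactly one entry per (group, partition).
    private final Map<String, Long> latest = new LinkedHashMap<>();

    void append(String groupAndPartition, long offset) {
        latest.put(groupAndPartition, offset); // older entries compacted away
    }

    long recover(String groupAndPartition) {
        return latest.getOrDefault(groupAndPartition, 0L);
    }
}
```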
Consumer Visibility: High Watermark
Understanding what consumers can see is crucial:
flowchart TB
subgraph Partition["Topic Partition"]
M1[Message 1<br>Offset: 100]
M2[Message 2<br>Offset: 101]
M3[Message 3<br>Offset: 102]
HW[High Watermark<br>Offset: 102]
M4[Message 4<br>Offset: 103<br>⚠️ Unreplicated]
LEO[Log End Offset<br>Offset: 104]
end
subgraph Replicas["Replication Status"]
R1[Replica 1: 103 ✓]
R2[Replica 2: 102 ✓]
R3[Replica 3: 102 ✓]
end
Consumer -->|Can read| M1
Consumer -->|Can read| M2
Consumer -->|Can read| M3
Consumer -.->|Cannot read| M4
style M1 fill:#c8e6c9,stroke:#43a047,stroke-width:1px,color:#333333
style M2 fill:#c8e6c9,stroke:#43a047,stroke-width:1px,color:#333333
style M3 fill:#c8e6c9,stroke:#43a047,stroke-width:1px,color:#333333
style M4 fill:#ffcdd2,stroke:#e53935,stroke-width:1px,color:#333333
style HW fill:#fff9c4,stroke:#f9a825,stroke-width:2px,color:#333333
Key Points:
- High Watermark = Last fully replicated offset
- Consumers read up to High Watermark only
- Guarantees data durability before consumption
- Log End Offset = Where producers write next
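Following the diagram's convention (High Watermark = last fully replicated offset), the visibility rule can be sketched as a minimum over the in-sync replicas. Note that Kafka internally tracks the high watermark as the *next* offset to replicate; this sketch uses the simpler last-offset convention to match the figure:

```java
import java.util.Collections;
import java.util.List;

class Watermark {
    // The slowest in-sync replica bounds what is fully replicated.
    static long highWatermark(List<Long> lastReplicatedOffsets) {
        return Collections.min(lastReplicatedOffsets);
    }

    // Consumers may read a record only once replication has covered it.
    static boolean visibleToConsumers(long offset, long highWatermark) {
        return offset <= highWatermark;
    }
}
```

With replicas at offsets 103, 102, and 102 as in the diagram, the high watermark is 102: message 3 (offset 102) is readable, message 4 (offset 103) is not yet.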
Consumer to Partition Cardinality
Optimal partition assignment ensures maximum parallelism:
Kafka Architecture: Consumer Group to Partitions
Assignment Rules:
- One consumer per partition within a group
- Extra consumers remain idle as hot standbys
- Fewer consumers than partitions → Some handle multiple
- Automatic rebalancing maintains even distribution
Best Practices:
- Set partition count ≥ max expected consumers
- Use consistent partition counts across topics
- Monitor consumer lag per partition
- Plan for peak load scenarios
Multi-threaded Kafka Consumers
Threading Models
1. Thread per Consumer (Recommended)
// Each thread runs one consumer
import java.time.Duration;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

class ConsumerThread implements Runnable {
    private final KafkaConsumer<String, String> consumer;
    private volatile boolean closed = false;

    ConsumerThread(KafkaConsumer<String, String> consumer) {
        this.consumer = consumer;
    }

    public void run() {
        while (!closed) {
            ConsumerRecords<String, String> records =
                consumer.poll(Duration.ofMillis(100));
            processRecords(records);  // application-defined handler
            consumer.commitSync();    // commit only after processing succeeds
        }
    }
}
Benefits:
- Simple offset management
- Clear failure boundaries
- Kafka handles partition assignment
- Easy to scale horizontally
2. Consumer with Worker Threads
// One consumer, multiple processing threads
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

class MultiThreadedConsumer {
    private final KafkaConsumer<String, String> consumer;
    private final ExecutorService executor;
    private volatile boolean closed = false;

    MultiThreadedConsumer(KafkaConsumer<String, String> consumer,
                          ExecutorService executor) {
        this.consumer = consumer;
        this.executor = executor;
    }

    public void run() {
        while (!closed) {
            ConsumerRecords<String, String> records =
                consumer.poll(Duration.ofMillis(100));
            // Submit records to worker threads
            for (ConsumerRecord<String, String> record : records) {
                executor.submit(() -> processRecord(record));
            }
            // Complex: must track completion before committing, or a crash
            // could commit offsets for records that were never processed
            waitForCompletion();
            consumer.commitSync();
        }
    }
}
Challenges:
- Complex offset management
- Ordering guarantees difficult
- Error handling complexity
- Use only for CPU-intensive processing
Native Queue Semantics (KIP-932)
Kafka 4.0 introduces early-access queue semantics for point-to-point messaging:
flowchart LR
subgraph Traditional["Pub/Sub Model"]
PT[Topic] --> CG1[Group 1]
PT --> CG2[Group 2]
PT --> CG3[Group 3]
end
subgraph Queue["Queue Semantics"]
Q[Queue] --> C1[Consumer 1]
Q -.->|Exclusive| C2[Consumer 2]
Q -.->|Exclusive| C3[Consumer 3]
end
note1[All groups get all messages]
note2[Each message to exactly one consumer]
Traditional -.-> note1
Queue -.-> note2
style Traditional fill:#bbdefb,stroke:#1976d2,stroke-width:1px,color:#333333
style Queue fill:#e1bee7,stroke:#8e24aa,stroke-width:1px,color:#333333
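The queue model's "each message to exactly one consumer" dispatch can be sketched with a shared queue. This is a conceptual model only; the actual KIP-932 share-group API is still early access and differs in detail:

```java
import java.util.ArrayDeque;
import java.util.Deque;

class QueueDispatch {
    // Queue semantics: once a consumer acquires a message,
    // it becomes invisible to every other consumer.
    private final Deque<String> queue = new ArrayDeque<>();

    void publish(String message) { queue.addLast(message); }

    // Returns the next available message, or null if the queue is empty.
    String acquire() { return queue.pollFirst(); }
}
```

Contrast this with pub/sub above, where every group receives its own copy of every message.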
Kafka Consumer Architecture Review
What is a consumer group?
A logical grouping of consumers that coordinate to consume a topic. Each group maintains independent offsets and represents a unique subscription to the topic’s data.
Does each consumer group have its own offset?
Yes. Every consumer group tracks its own offset for each partition, enabling multiple independent consumers of the same data.
When can a consumer see a record?
Only after the record is fully replicated to all in-sync replicas (ISRs) and the High Watermark advances past that offset.
What happens if there are more consumers than partitions?
Extra consumers remain idle as hot standbys, ready to take over if active consumers fail. Design partition count based on expected parallelism.
What happens with multiple consumer threads?
Each thread can manage exclusive partitions (recommended) or share partition processing (complex). Thread-per-consumer provides simplest operation and clearest failure boundaries.
Best Practices for 2025
- Embrace KIP-848 - Enjoy zero-downtime rebalancing
- Plan partition counts - Set based on max parallelism needs
- Use idempotent processing - Handle at-least-once delivery
- Monitor consumer lag - Track performance per partition
- Implement health checks - Detect stuck consumers early
- Tune session timeouts - Balance failure detection vs stability
- Consider queue semantics - Evaluate for point-to-point use cases
- Test failure scenarios - Verify failover behavior
- Use thread-per-consumer - Simplify offset management
- Leverage metrics - Monitor rebalance frequency and duration
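Several of these practices reduce to consumer configuration under the classic protocol. The values below are illustrative starting points, not universal recommendations (and under KIP-848 the session/heartbeat settings move to the broker side):

```java
import java.util.Properties;

class TunedConsumerConfig {
    static Properties props() {
        Properties props = new Properties();
        props.put("session.timeout.ms", "45000");     // failure-detection window
        props.put("heartbeat.interval.ms", "15000");  // commonly ~1/3 of session timeout
        props.put("max.poll.interval.ms", "300000");  // max gap between poll() calls
        props.put("max.poll.records", "500");         // cap batch size per poll()
        return props;
    }
}
```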
Related Content
- What is Kafka?
- Kafka Architecture
- Kafka Topic Architecture
- Kafka Consumer Architecture
- Kafka Producer Architecture
- Kafka Low Level Design
- Kafka and Schema Registry
- Kafka Ecosystem
- Kafka vs. JMS
- Kafka versus Kinesis
- Kafka Command Line Tutorial
- Kafka Failover Tutorial
- Kafka Producer Java Example
- Kafka Consumer Java Example
About Cloudurable
Transform your Kafka deployment with expert guidance. Cloudurable provides Kafka training, Kafka consulting, Kafka support and helps set up Kafka clusters in AWS.
Check out our new GoLang course. We provide onsite, instructor-led Go Lang training.