January 9, 2025
🚀 What’s New in This 2025 Update
Major Updates and Changes
- KRaft Mode - No more ZooKeeper! Using Kafka’s native consensus
- Kafka 4.0 - Latest version with simplified operations
- Updated Commands - Modern CLI syntax and options
- Cloud Options - Quick start with Docker and cloud services
- Performance Tools - New monitoring and testing utilities
Key Improvements Since 2017
- ✅ Single Process - Kafka runs standalone with KRaft
- ✅ Faster Startup - Seconds instead of minutes
- ✅ Simpler Config - Fewer configuration files
- ✅ Better Defaults - Production-ready out of the box
Ready to get hands-on with Kafka? Let’s start from the command line and see Kafka 4.0 in action!
Getting Started with Kafka Tutorial
If you’re not sure what Kafka is, start with “What is Kafka?”.
flowchart LR
subgraph Traditional["Kafka < 4.0"]
ZK[1. Start ZooKeeper]
KB[2. Start Kafka Broker]
ZK --> KB
end
subgraph Modern["Kafka 4.0+"]
KRAFT[1. Start Kafka with KRaft]
end
Traditional -->|Simplified| Modern
style Traditional fill:#ffebee,stroke:#e53935,stroke-width:2px
style Modern fill:#e8f5e9,stroke:#43a047,stroke-width:2px
In this tutorial, we’ll use Apache Kafka 4.0 with KRaft mode - no ZooKeeper required! We’ll create topics, send messages via producers, and consume messages, all from the command line.
Prerequisites and Setup
Download and Install Kafka 4.0
# Download Kafka 4.0
wget https://downloads.apache.org/kafka/4.0.0/kafka_2.13-4.0.0.tgz
# Extract and setup
tar -xzf kafka_2.13-4.0.0.tgz
mv kafka_2.13-4.0.0 ~/kafka-training/kafka
# Create working directory
mkdir -p ~/kafka-training/lab1
cd ~/kafka-training
Requirements:
- Java 17+ (required for Kafka 4.0 brokers)
- Java 11+ (minimum for clients)
- 4GB RAM minimum
- Linux, macOS, or WSL2 on Windows
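Before downloading, it’s worth confirming your JVM meets the requirement. This small helper (a convenience sketch, not part of the Kafka distribution) parses the major version out of `java -version` output, handling both legacy `1.8.0_x`-style and modern `17.0.x`-style version strings:

```shell
#!/usr/bin/env bash
# major_java_version: extract the major Java version from `java -version` output.
# Handles both the legacy scheme ("1.8.0_392" -> 8) and the modern one ("17.0.9" -> 17).
major_java_version() {
  local raw="$1" ver
  ver=$(printf '%s\n' "$raw" | awk -F '"' '/version/ {print $2}')
  case "$ver" in
    1.*) printf '%s\n' "${ver#1.}" | cut -d. -f1 ;;
    *)   printf '%s\n' "$ver" | cut -d. -f1 ;;
  esac
}

# Example usage against the locally installed JVM (requires java on PATH):
# major_java_version "$(java -version 2>&1)"
```

If this prints 17 or higher, you are ready to run a Kafka 4.0 broker.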
Starting Kafka with KRaft Mode
Step 1: Generate Cluster ID
First, we need to generate a unique cluster ID for our Kafka instance:
~/kafka-training/generate-cluster-id.sh
#!/usr/bin/env bash
cd ~/kafka-training
# Generate a cluster ID
KAFKA_CLUSTER_ID="$(kafka/bin/kafka-storage.sh random-uuid)"
echo "Cluster ID: $KAFKA_CLUSTER_ID"
# Save it for later use
echo "$KAFKA_CLUSTER_ID" > cluster.id
Step 2: Format Storage
Format the Kafka storage with the cluster ID:
~/kafka-training/format-storage.sh
#!/usr/bin/env bash
cd ~/kafka-training
# Read the cluster ID
KAFKA_CLUSTER_ID=$(cat cluster.id)
# Format the storage
kafka/bin/kafka-storage.sh format \
-t "$KAFKA_CLUSTER_ID" \
-c kafka/config/server.properties
Step 3: Start Kafka Server
Now start Kafka in KRaft mode:
~/kafka-training/run-kafka-kraft.sh
#!/usr/bin/env bash
cd ~/kafka-training
# Start Kafka with KRaft
kafka/bin/kafka-server-start.sh \
kafka/config/server.properties
Run the scripts
# Make scripts executable
chmod +x generate-cluster-id.sh format-storage.sh run-kafka-kraft.sh
# Generate cluster ID
./generate-cluster-id.sh
# Format storage
./format-storage.sh
# Start Kafka
./run-kafka-kraft.sh
🎉 That’s it! Kafka is now running with KRaft - no ZooKeeper needed!
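To confirm the broker is actually accepting connections, you can probe it from another terminal. This uses `kafka-broker-api-versions.sh` from the distribution you just installed and assumes the default listener on localhost:9092:

```shell
#!/usr/bin/env bash
cd ~/kafka-training
# Ask the broker which API versions it supports -- a simple liveness check.
# If this prints broker metadata, Kafka is up and reachable.
kafka/bin/kafka-broker-api-versions.sh \
--bootstrap-server localhost:9092
```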
Quick Start with Docker (Alternative)
For an even quicker start, use Docker:
# Run Kafka 4.0 with Docker
docker run -d \
--name kafka \
-p 9092:9092 \
-e KAFKA_NODE_ID=1 \
-e KAFKA_PROCESS_ROLES='broker,controller' \
-e KAFKA_CONTROLLER_QUORUM_VOTERS='1@kafka:29093' \
-e KAFKA_LISTENERS='PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:29093' \
-e KAFKA_ADVERTISED_LISTENERS='PLAINTEXT://localhost:9092' \
-e KAFKA_CONTROLLER_LISTENER_NAMES='CONTROLLER' \
-e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP='CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT' \
-e KAFKA_LOG_DIRS='/tmp/kraft-logs' \
-e CLUSTER_ID='MkU3OEVBNTcwNTJENDM2Qk' \
apache/kafka:4.0.0
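To verify the container came up cleanly, check its logs and run the bundled CLI tools inside it. The `/opt/kafka` path below is where the `apache/kafka` image ships the distribution; adjust it if your image differs:

```shell
# Tail the container logs to confirm the broker finished starting
docker logs kafka | tail -n 20

# Run the CLI tools bundled inside the container
docker exec kafka /opt/kafka/bin/kafka-topics.sh \
--list \
--bootstrap-server localhost:9092
```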
Creating Kafka Topics
Understanding Modern Topic Creation
flowchart TB
subgraph TopicCreation["Topic Creation Process"]
CLI[kafka-topics CLI]
KRAFT[KRaft Controller]
META[(Metadata Log)]
PART[Partition Assignment]
end
CLI -->|Create Request| KRAFT
KRAFT -->|Write Metadata| META
KRAFT -->|Assign Leaders| PART
style TopicCreation fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
Create a Topic
Let’s create a topic with 13 partitions for parallel processing:
~/kafka-training/lab1/create-topic.sh
#!/usr/bin/env bash
cd ~/kafka-training
# Create a topic with KRaft (no ZooKeeper!)
kafka/bin/kafka-topics.sh --create \
--bootstrap-server localhost:9092 \
--replication-factor 1 \
--partitions 13 \
--topic my-topic
# Create a compacted topic for state management
kafka/bin/kafka-topics.sh --create \
--bootstrap-server localhost:9092 \
--replication-factor 1 \
--partitions 1 \
--topic user-profiles \
--config cleanup.policy=compact \
--config min.cleanable.dirty.ratio=0.01 \
--config segment.ms=100
Run create-topic.sh
cd ~/kafka-training/lab1
chmod +x create-topic.sh
./create-topic.sh
# Output:
Created topic my-topic.
Created topic user-profiles.
List Topics
View all topics in the cluster:
~/kafka-training/lab1/list-topics.sh
#!/usr/bin/env bash
cd ~/kafka-training
# List all topics
kafka/bin/kafka-topics.sh --list \
--bootstrap-server localhost:9092
# Get detailed topic information
kafka/bin/kafka-topics.sh --describe \
--bootstrap-server localhost:9092 \
--topic my-topic
Run list-topics.sh
./list-topics.sh
# Output:
my-topic
user-profiles
__consumer_offsets
Producing Messages
Kafka Console Producer
sequenceDiagram
participant User
participant Producer
participant Broker
participant Partition
User->>Producer: Type message
Producer->>Broker: Send to topic
Broker->>Partition: Assign partition
Partition-->>Producer: Acknowledge
Producer-->>User: Confirmed
~/kafka-training/lab1/start-producer-console.sh
#!/usr/bin/env bash
cd ~/kafka-training
# Modern producer with additional options
kafka/bin/kafka-console-producer.sh \
--bootstrap-server localhost:9092 \
--topic my-topic \
--property "parse.key=true" \
--property "key.separator=:"
# Simple producer (no keys)
# kafka/bin/kafka-console-producer.sh \
# --bootstrap-server localhost:9092 \
# --topic my-topic
Run the producer and send messages
./start-producer-console.sh
# Type messages (format: key:value)
user1:This is message 1
user2:This is message 2
user1:This is message 3
user3:Message 4
user1:Message 5
# Press Ctrl+C to exit
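You don’t have to type messages interactively: the console producer reads stdin, so you can pipe keyed messages in from a script. A sketch assuming the broker and `my-topic` from above:

```shell
#!/usr/bin/env bash
cd ~/kafka-training
# Pipe keyed messages into the topic non-interactively (format: key:value)
printf '%s\n' \
"user4:Scripted message 1" \
"user4:Scripted message 2" | \
kafka/bin/kafka-console-producer.sh \
--bootstrap-server localhost:9092 \
--topic my-topic \
--property "parse.key=true" \
--property "key.separator=:"
```

This is handy for seeding test data or replaying a file of messages with `cat messages.txt | ...`.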
Producer with Performance Metrics
~/kafka-training/lab1/producer-with-metrics.sh
#!/usr/bin/env bash
cd ~/kafka-training
# Producer with metrics reporting
kafka/bin/kafka-console-producer.sh \
--bootstrap-server localhost:9092 \
--topic my-topic \
--producer-property acks=all \
--producer-property retries=3 \
--producer-property enable.idempotence=true \
--property print.timestamp=true \
--property print.key=true
Consuming Messages
Basic Consumer
~/kafka-training/lab1/start-consumer-console.sh
#!/usr/bin/env bash
cd ~/kafka-training
# Consumer reading from beginning
kafka/bin/kafka-console-consumer.sh \
--bootstrap-server localhost:9092 \
--topic my-topic \
--from-beginning \
--property print.key=true \
--property print.timestamp=true \
--property print.partition=true
Run consumer in another terminal
./start-consumer-console.sh
# Output:
CreateTime:1704825600000 Partition:2 user1 This is message 1
CreateTime:1704825601000 Partition:5 user2 This is message 2
CreateTime:1704825602000 Partition:2 user1 This is message 3
CreateTime:1704825603000 Partition:8 user3 Message 4
CreateTime:1704825604000 Partition:2 user1 Message 5
Consumer Groups
Create multiple consumers in a group for load balancing:
~/kafka-training/lab1/start-consumer-group.sh
#!/usr/bin/env bash
cd ~/kafka-training
# Consumer as part of a group
kafka/bin/kafka-console-consumer.sh \
--bootstrap-server localhost:9092 \
--topic my-topic \
--group my-consumer-group \
--property print.key=true \
--property print.value=true \
--property print.partition=true \
--property print.offset=true
Run this script in 3 terminals to see load balancing in action!
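If you’d rather not juggle terminals, here is a rough sketch that launches three group members in the background for 30 seconds each, so you can watch partitions get split among them. It relies on the `timeout` utility (GNU coreutils):

```shell
#!/usr/bin/env bash
cd ~/kafka-training
# Start 3 consumers in the same group; the group coordinator
# divides the topic's partitions among the members.
for i in 1 2 3; do
timeout 30 kafka/bin/kafka-console-consumer.sh \
--bootstrap-server localhost:9092 \
--topic my-topic \
--group my-consumer-group \
--property print.partition=true &
done
wait
```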
Advanced Command Line Tools
1. Performance Testing
~/kafka-training/lab1/performance-test.sh
#!/usr/bin/env bash
cd ~/kafka-training
# Producer performance test
kafka/bin/kafka-producer-perf-test.sh \
--topic my-topic \
--num-records 100000 \
--record-size 1000 \
--throughput 10000 \
--producer-props bootstrap.servers=localhost:9092
# Consumer performance test
kafka/bin/kafka-consumer-perf-test.sh \
--bootstrap-server localhost:9092 \
--topic my-topic \
--messages 100000 \
--threads 1
2. Consumer Group Management
~/kafka-training/lab1/manage-consumer-groups.sh
#!/usr/bin/env bash
cd ~/kafka-training
# List consumer groups
kafka/bin/kafka-consumer-groups.sh \
--bootstrap-server localhost:9092 \
--list
# Describe consumer group
kafka/bin/kafka-consumer-groups.sh \
--bootstrap-server localhost:9092 \
--describe \
--group my-consumer-group
# Reset consumer group offset
kafka/bin/kafka-consumer-groups.sh \
--bootstrap-server localhost:9092 \
--group my-consumer-group \
--reset-offsets \
--to-earliest \
--topic my-topic \
--dry-run
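Note that `--dry-run` above only previews the new offsets. Once you are happy with the plan, rerun the command with `--execute` — and make sure no consumers in the group are running, or the reset will be rejected:

```shell
#!/usr/bin/env bash
cd ~/kafka-training
# Apply the offset reset for real (the group must be inactive)
kafka/bin/kafka-consumer-groups.sh \
--bootstrap-server localhost:9092 \
--group my-consumer-group \
--reset-offsets \
--to-earliest \
--topic my-topic \
--execute
```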
3. Topic Management
~/kafka-training/lab1/manage-topics.sh
#!/usr/bin/env bash
cd ~/kafka-training
# Alter topic configuration
kafka/bin/kafka-configs.sh \
--bootstrap-server localhost:9092 \
--entity-type topics \
--entity-name my-topic \
--alter \
--add-config retention.ms=604800000
# Delete topic (be careful!)
# kafka/bin/kafka-topics.sh \
# --bootstrap-server localhost:9092 \
# --delete \
# --topic my-topic
Monitoring and Debugging
View Log Segments
# Check Kafka logs
ls -la /tmp/kraft-combined-logs/my-topic-0/
# View log segment files
kafka/bin/kafka-dump-log.sh \
--files /tmp/kraft-combined-logs/my-topic-0/00000000000000000000.log \
--print-data-log
JMX Metrics
# Export JMX metrics
export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=9999 \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false"
# Use jconsole or other JMX tools to connect to localhost:9999
Common Patterns and Best Practices
Message Ordering
flowchart TB
subgraph Ordering["Message Ordering Guarantees"]
P1[Partition 1<br>user1 messages<br>Ordered ✓]
P2[Partition 2<br>user2 messages<br>Ordered ✓]
P3[Partition 3<br>user3 messages<br>Ordered ✓]
C[Single Consumer<br>Mixed Order]
end
P1 --> C
P2 --> C
P3 --> C
style P1 fill:#e8f5e9,stroke:#43a047,stroke-width:1px
style P2 fill:#e3f2fd,stroke:#1976d2,stroke-width:1px
style P3 fill:#fff9c4,stroke:#f9a825,stroke-width:1px
Messages are ordered within a partition, not across partitions. Use message keys to ensure related messages go to the same partition.
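You can see the key-to-partition mapping for yourself: for keyed messages, the default partitioner hashes the key (murmur2) modulo the partition count, so the same key always lands on the same partition. A quick demo using the topic from earlier:

```shell
#!/usr/bin/env bash
cd ~/kafka-training
# Send several messages with the same key...
printf 'order42:created\norder42:paid\norder42:shipped\n' | \
kafka/bin/kafka-console-producer.sh \
--bootstrap-server localhost:9092 \
--topic my-topic \
--property "parse.key=true" \
--property "key.separator=:"

# ...then consume with partition info: all three messages for key
# "order42" show the same partition number, in send order.
kafka/bin/kafka-console-consumer.sh \
--bootstrap-server localhost:9092 \
--topic my-topic \
--from-beginning \
--property print.key=true \
--property print.partition=true \
--timeout-ms 10000
```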
Production Configurations
# Production-ready topic
kafka/bin/kafka-topics.sh --create \
--bootstrap-server localhost:9092 \
--replication-factor 3 \
--partitions 50 \
--topic production-events \
--config min.insync.replicas=2 \
--config retention.ms=604800000 \
--config compression.type=snappy
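Those millisecond values are easy to misread; a quick bit of shell arithmetic confirms that `retention.ms=604800000` is 7 days:

```shell
# Convert retention.ms to days: ms -> s -> min -> h -> days
ms=604800000
days=$(( ms / 1000 / 60 / 60 / 24 ))
echo "retention: $days days"
```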
Troubleshooting
Common Issues and Solutions
| Issue | Solution |
|---|---|
| Broker not available | Check if Kafka is running: `jps \| grep Kafka` |
| Topic already exists | List topics first, or use `--if-not-exists` |
| Messages out of order | Use key-based partitioning |
| Consumer lag | Add more consumers or partitions |
| Connection refused | Verify listeners configuration |
Useful Debug Commands
# Check Kafka process
ps aux | grep kafka
# View Kafka logs
tail -f ~/kafka-training/kafka/logs/server.log
# Test connectivity
nc -zv localhost 9092
# Check disk usage
df -h /tmp/kraft-combined-logs/
Review Questions
What’s different in Kafka 4.0?
No ZooKeeper! Kafka uses KRaft for consensus, making it simpler to deploy and manage.
What tool creates topics?
kafka-topics.sh
with --bootstrap-server
(not --zookeeper
)
How do you ensure message ordering?
Use message keys - messages with the same key go to the same partition.
What’s a consumer group?
A group of consumers that share the work of reading from topic partitions.
How do you check consumer lag?
Use kafka-consumer-groups.sh --describe
Next Steps
Now that you’ve mastered the command line basics:
- Try the Java APIs
- Learn about Architecture
- Explore Advanced Features
Summary
Congratulations! You’ve successfully:
- ✅ Started Kafka 4.0 with KRaft (no ZooKeeper!)
- ✅ Created and managed topics
- ✅ Produced and consumed messages
- ✅ Explored consumer groups
- ✅ Used performance testing tools
Kafka’s command-line tools provide powerful capabilities for development, testing, and operations. With KRaft mode, getting started is simpler than ever!
Related Content
- What is Kafka?
- Kafka Architecture
- Kafka Topic Architecture
- Kafka Consumer Architecture
- Kafka Producer Architecture
- Kafka Ecosystem
- Kafka vs. JMS
- Kafka Failover Tutorial
- Kafka Producer Java Example
- Kafka Consumer Java Example
About Cloudurable
We hope you enjoyed this tutorial. Please provide feedback. Cloudurable provides Kafka training, Kafka consulting, Kafka support and helps setting up Kafka clusters in AWS.
Check out our new GoLang course. We provide instructor-led onsite Go training.