Kafka Tutorial: Using Kafka from the Command Line - 2025 Edition

January 9, 2025


🚀 What’s New in This 2025 Update

Major Updates and Changes

  • KRaft Mode - No more ZooKeeper! Using Kafka’s native consensus
  • Kafka 4.0 - Latest version with simplified operations
  • Updated Commands - Modern CLI syntax and options
  • Cloud Options - Quick start with Docker and cloud services
  • Performance Tools - New monitoring and testing utilities

Key Improvements Since 2017

  • ✅ Single Process - Kafka runs standalone with KRaft
  • ✅ Faster Startup - Seconds instead of minutes
  • ✅ Simpler Config - Fewer configuration files
  • ✅ Better Defaults - Production-ready out of the box

Ready to get hands-on with Kafka? Let’s start from the command line and see Kafka 4.0 in action!

Getting Started with Kafka Tutorial

If you’re not sure what Kafka is, start with “What is Kafka?”.

flowchart LR
  subgraph Traditional["Kafka < 4.0"]
    ZK[1. Start ZooKeeper]
    KB[2. Start Kafka Broker]
    ZK --> KB
  end
  
  subgraph Modern["Kafka 4.0+"]
    KRAFT[1. Start Kafka with KRaft]
  end
  
  Traditional -->|Simplified| Modern
  
  style Traditional fill:#ffebee,stroke:#e53935,stroke-width:2px
  style Modern fill:#e8f5e9,stroke:#43a047,stroke-width:2px

In this tutorial, we’ll use Apache Kafka 4.0 with KRaft mode - no ZooKeeper required! We’ll create topics, send messages via producers, and consume messages, all from the command line.

Prerequisites and Setup

Download and Install Kafka 4.0

# Download Kafka 4.0
wget https://downloads.apache.org/kafka/4.0.0/kafka_2.13-4.0.0.tgz

# Extract and setup
tar -xzf kafka_2.13-4.0.0.tgz
mv kafka_2.13-4.0.0 ~/kafka-training/kafka

# Create working directory
mkdir -p ~/kafka-training/lab1
cd ~/kafka-training

Requirements:

  • Java 17+ (required for Kafka 4.0 brokers)
  • Java 11+ (minimum for clients)
  • 4GB RAM minimum
  • Linux, macOS, or WSL2 on Windows
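The Java requirement is easy to check up front. Here is a minimal pre-flight sketch; the `java_major_version` helper is our own illustration, not a Kafka tool:

```shell
#!/usr/bin/env bash
# Pre-flight check: Kafka 4.0 brokers need Java 17+.

java_major_version() {
    # Map a version string to its major number:
    # "17.0.9" -> 17, "11.0.21" -> 11, legacy "1.8.0_392" -> 8
    local ver="$1"
    if [[ "$ver" == 1.* ]]; then
        echo "${ver#1.}" | cut -d. -f1
    else
        echo "$ver" | cut -d. -f1
    fi
}

# `java -version` prints to stderr, hence the 2>&1
ver="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
major="$(java_major_version "$ver")"

if [ "${major:-0}" -ge 17 ]; then
    echo "Java $ver is fine for Kafka 4.0 brokers"
else
    echo "Java '$ver' is too old (or missing) - install Java 17+" >&2
fi
```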

Cloudurable provides Kafka training, Kafka consulting, Kafka support, and helps set up Kafka clusters in AWS.

Starting Kafka with KRaft Mode

Step 1: Generate Cluster ID

First, we need to generate a unique cluster ID for our Kafka instance:

~/kafka-training/generate-cluster-id.sh

#!/usr/bin/env bash
cd ~/kafka-training

# Generate a cluster ID
KAFKA_CLUSTER_ID="$(kafka/bin/kafka-storage.sh random-uuid)"
echo "Cluster ID: $KAFKA_CLUSTER_ID"

# Save it for later use
echo "$KAFKA_CLUSTER_ID" > cluster.id

Step 2: Format Storage

Format the Kafka storage with the cluster ID:

~/kafka-training/format-storage.sh

#!/usr/bin/env bash
cd ~/kafka-training

# Read the cluster ID
KAFKA_CLUSTER_ID=$(cat cluster.id)

# Format the storage (in Kafka 4.0 the KRaft config is the default
# config/server.properties, and --standalone bootstraps a single-node quorum)
kafka/bin/kafka-storage.sh format \
    --standalone \
    -t "$KAFKA_CLUSTER_ID" \
    -c kafka/config/server.properties

Step 3: Start Kafka Server

Now start Kafka in KRaft mode:

~/kafka-training/run-kafka-kraft.sh

#!/usr/bin/env bash
cd ~/kafka-training

# Start Kafka with KRaft (4.0 ships the KRaft config as config/server.properties)
kafka/bin/kafka-server-start.sh \
    kafka/config/server.properties

Run the scripts

# Make scripts executable
chmod +x generate-cluster-id.sh format-storage.sh run-kafka-kraft.sh

# Generate cluster ID
./generate-cluster-id.sh

# Format storage
./format-storage.sh

# Start Kafka
./run-kafka-kraft.sh

🎉 That’s it! Kafka is now running with KRaft - no ZooKeeper needed!
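Before moving on, verify the broker is actually reachable. This sketch probes the port with bash’s /dev/tcp, then asks the broker for its supported API versions (kafka-broker-api-versions.sh is a standard Kafka tool; the `port_open` helper is ours):

```shell
#!/usr/bin/env bash
KAFKA_BIN=~/kafka-training/kafka/bin

# Returns 0 if something is listening on host:port (bash /dev/tcp probe)
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if port_open localhost 9092; then
    # A real metadata request - a response proves the broker is healthy
    "$KAFKA_BIN/kafka-broker-api-versions.sh" \
        --bootstrap-server localhost:9092 | head -n 5
else
    echo "Nothing listening on localhost:9092 - is Kafka running?"
fi
```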

Quick Start with Docker (Alternative)

For an even quicker start, use Docker:

# Run Kafka 4.0 with Docker
docker run -d \
  --name kafka \
  -p 9092:9092 \
  -e KAFKA_NODE_ID=1 \
  -e KAFKA_PROCESS_ROLES='broker,controller' \
  -e KAFKA_CONTROLLER_QUORUM_VOTERS='1@localhost:29093' \
  -e KAFKA_LISTENERS='PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:29093' \
  -e KAFKA_ADVERTISED_LISTENERS='PLAINTEXT://localhost:9092' \
  -e KAFKA_CONTROLLER_LISTENER_NAMES='CONTROLLER' \
  -e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP='CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT' \
  -e KAFKA_LOG_DIRS='/tmp/kraft-logs' \
  -e CLUSTER_ID='MkU3OEVBNTcwNTJENDM2Qk' \
  apache/kafka:4.0.0
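Once the container is up, you can run the same CLI tools from inside it. A sketch, assuming the apache/kafka image (which ships the scripts under /opt/kafka/bin) and the container name kafka used above:

```shell
#!/usr/bin/env bash

# Returns 0 only if Docker is present and a container named "kafka" is running
kafka_container_running() {
    command -v docker >/dev/null 2>&1 && \
        docker ps --format '{{.Names}}' 2>/dev/null | grep -qx kafka
}

if kafka_container_running; then
    # The apache/kafka image ships the CLI tools under /opt/kafka/bin
    docker exec kafka /opt/kafka/bin/kafka-topics.sh \
        --bootstrap-server localhost:9092 --list

    # Tail the broker log to confirm a clean KRaft startup
    docker logs kafka 2>&1 | tail -n 20
else
    echo "Docker or the 'kafka' container is not available - skipping"
fi
```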

Creating Kafka Topics

Understanding Modern Topic Creation

flowchart TB
  subgraph TopicCreation["Topic Creation Process"]
    CLI[kafka-topics CLI]
    KRAFT[KRaft Controller]
    META[(Metadata Log)]
    PART[Partition Assignment]
  end
  
  CLI -->|Create Request| KRAFT
  KRAFT -->|Write Metadata| META
  KRAFT -->|Assign Leaders| PART
  
  style TopicCreation fill:#e3f2fd,stroke:#1976d2,stroke-width:2px

Create a Topic

Let’s create a topic with 13 partitions for parallel processing:

~/kafka-training/lab1/create-topic.sh

#!/usr/bin/env bash
cd ~/kafka-training

# Create a topic with KRaft (no ZooKeeper!)
kafka/bin/kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --replication-factor 1 \
  --partitions 13 \
  --topic my-topic

# Create a compacted topic for state management
# (min.cleanable.dirty.ratio and segment.ms are set aggressively low here
#  so compaction is visible in a demo - don't use these values in production)
kafka/bin/kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --replication-factor 1 \
  --partitions 1 \
  --topic user-profiles \
  --config cleanup.policy=compact \
  --config min.cleanable.dirty.ratio=0.01 \
  --config segment.ms=100

Run create-topic.sh

cd ~/kafka-training/lab1
chmod +x create-topic.sh
./create-topic.sh

# Output:
Created topic my-topic.
Created topic user-profiles.
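You can confirm the compaction settings took effect by describing the topic’s overridden configs. A guarded sketch (the `describe_topic_configs` wrapper and the `KAFKA_BIN` path are our assumptions; kafka-configs.sh is the standard tool):

```shell
#!/usr/bin/env bash
KAFKA_BIN=~/kafka-training/kafka/bin

# Print a topic's non-default (dynamically set) configs
describe_topic_configs() {
    "$KAFKA_BIN/kafka-configs.sh" \
        --bootstrap-server localhost:9092 \
        --entity-type topics \
        --entity-name "$1" \
        --describe
}

if [ -x "$KAFKA_BIN/kafka-configs.sh" ]; then
    describe_topic_configs user-profiles   # expect cleanup.policy=compact, etc.
else
    echo "Kafka not found at $KAFKA_BIN - adjust KAFKA_BIN"
fi
```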

List Topics

View all topics in the cluster:

~/kafka-training/lab1/list-topics.sh

#!/usr/bin/env bash
cd ~/kafka-training

# List all topics
kafka/bin/kafka-topics.sh --list \
    --bootstrap-server localhost:9092

# Get detailed topic information
kafka/bin/kafka-topics.sh --describe \
    --bootstrap-server localhost:9092 \
    --topic my-topic

Run list-topics.sh

./list-topics.sh

# Output:
my-topic
user-profiles
__consumer_offsets

Producing Messages

Kafka Console Producer

sequenceDiagram
  participant User
  participant Producer
  participant Broker
  participant Partition
  
  User->>Producer: Type message
  Producer->>Broker: Send to topic
  Broker->>Partition: Assign partition
  Partition-->>Producer: Acknowledge
  Producer-->>User: Confirmed

~/kafka-training/lab1/start-producer-console.sh

#!/usr/bin/env bash
cd ~/kafka-training

# Modern producer with additional options
kafka/bin/kafka-console-producer.sh \
    --bootstrap-server localhost:9092 \
    --topic my-topic \
    --property "parse.key=true" \
    --property "key.separator=:"

# Simple producer (no keys)
# kafka/bin/kafka-console-producer.sh \
#     --bootstrap-server localhost:9092 \
#     --topic my-topic

Run the producer and send messages

./start-producer-console.sh

# Type messages (format: key:value)
user1:This is message 1
user2:This is message 2
user1:This is message 3
user3:Message 4
user1:Message 5

# Press Ctrl+C to exit
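Interactive typing is handy for demos, but you can also pipe a file of key:value lines straight into the console producer. A sketch, assuming the same install path as above:

```shell
#!/usr/bin/env bash
KAFKA_BIN=~/kafka-training/kafka/bin

# Write a batch of keyed messages (one key:value per line)
cat > /tmp/messages.txt <<'EOF'
user1:This is message 1
user2:This is message 2
user1:This is message 3
EOF

# Pipe the file into the producer instead of typing interactively
if [ -x "$KAFKA_BIN/kafka-console-producer.sh" ]; then
    "$KAFKA_BIN/kafka-console-producer.sh" \
        --bootstrap-server localhost:9092 \
        --topic my-topic \
        --property "parse.key=true" \
        --property "key.separator=:" < /tmp/messages.txt
else
    echo "Kafka not found at $KAFKA_BIN - adjust KAFKA_BIN"
fi
```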

Producer with Performance Metrics

~/kafka-training/lab1/producer-with-metrics.sh

#!/usr/bin/env bash
cd ~/kafka-training

# Producer with metrics reporting
kafka/bin/kafka-console-producer.sh \
    --bootstrap-server localhost:9092 \
    --topic my-topic \
    --producer-property acks=all \
    --producer-property retries=3 \
    --producer-property enable.idempotence=true \
    --property print.timestamp=true \
    --property print.key=true

Consuming Messages

Basic Consumer

~/kafka-training/lab1/start-consumer-console.sh

#!/usr/bin/env bash
cd ~/kafka-training

# Consumer reading from beginning
kafka/bin/kafka-console-consumer.sh \
    --bootstrap-server localhost:9092 \
    --topic my-topic \
    --from-beginning \
    --property print.key=true \
    --property print.timestamp=true \
    --property print.partition=true

Run consumer in another terminal

./start-consumer-console.sh

# Output:
CreateTime:1704825600000	Partition:2	user1	This is message 1
CreateTime:1704825601000	Partition:5	user2	This is message 2
CreateTime:1704825602000	Partition:2	user1	This is message 3
CreateTime:1704825603000	Partition:8	user3	Message 4
CreateTime:1704825604000	Partition:2	user1	Message 5
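For debugging it is often useful to read just a few records, or a single partition. `--max-messages`, `--partition`, and `--offset` are standard console-consumer flags; the guard around the calls is our addition:

```shell
#!/usr/bin/env bash
KAFKA_BIN=~/kafka-training/kafka/bin

if [ -x "$KAFKA_BIN/kafka-console-consumer.sh" ]; then
    # Read only the first 5 records, then exit
    "$KAFKA_BIN/kafka-console-consumer.sh" \
        --bootstrap-server localhost:9092 \
        --topic my-topic \
        --from-beginning \
        --max-messages 5

    # Read partition 2 starting at offset 0
    "$KAFKA_BIN/kafka-console-consumer.sh" \
        --bootstrap-server localhost:9092 \
        --topic my-topic \
        --partition 2 \
        --offset 0 \
        --max-messages 5
else
    echo "Kafka not found at $KAFKA_BIN - adjust KAFKA_BIN"
fi
```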

Consumer Groups

Create multiple consumers in a group for load balancing:

~/kafka-training/lab1/start-consumer-group.sh

#!/usr/bin/env bash
cd ~/kafka-training

# Consumer as part of a group
kafka/bin/kafka-console-consumer.sh \
    --bootstrap-server localhost:9092 \
    --topic my-topic \
    --group my-consumer-group \
    --property print.key=true \
    --property print.value=true \
    --property print.partition=true \
    --property print.offset=true

Run this script in 3 terminals to see load balancing in action!

Advanced Command Line Tools

1. Performance Testing

~/kafka-training/lab1/performance-test.sh

#!/usr/bin/env bash
cd ~/kafka-training

# Producer performance test
kafka/bin/kafka-producer-perf-test.sh \
    --topic my-topic \
    --num-records 100000 \
    --record-size 1000 \
    --throughput 10000 \
    --producer-props bootstrap.servers=localhost:9092

# Consumer performance test
kafka/bin/kafka-consumer-perf-test.sh \
    --bootstrap-server localhost:9092 \
    --topic my-topic \
    --messages 100000 \
    --threads 1
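To find the broker’s ceiling rather than test at a fixed rate, set `--throughput -1` (no throttle); `--print-metrics` dumps the producer’s internal metrics at the end. A guarded sketch using the same install path as above:

```shell
#!/usr/bin/env bash
KAFKA_BIN=~/kafka-training/kafka/bin

if [ -x "$KAFKA_BIN/kafka-producer-perf-test.sh" ]; then
    # Unthrottled run - measures maximum sustainable throughput
    "$KAFKA_BIN/kafka-producer-perf-test.sh" \
        --topic my-topic \
        --num-records 100000 \
        --record-size 1000 \
        --throughput -1 \
        --print-metrics \
        --producer-props bootstrap.servers=localhost:9092 acks=all
else
    echo "Kafka not found at $KAFKA_BIN - adjust KAFKA_BIN"
fi
```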

2. Consumer Group Management

~/kafka-training/lab1/manage-consumer-groups.sh

#!/usr/bin/env bash
cd ~/kafka-training

# List consumer groups
kafka/bin/kafka-consumer-groups.sh \
    --bootstrap-server localhost:9092 \
    --list

# Describe consumer group
kafka/bin/kafka-consumer-groups.sh \
    --bootstrap-server localhost:9092 \
    --describe \
    --group my-consumer-group

# Reset consumer group offset
kafka/bin/kafka-consumer-groups.sh \
    --bootstrap-server localhost:9092 \
    --group my-consumer-group \
    --reset-offsets \
    --to-earliest \
    --topic my-topic \
    --dry-run
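The --dry-run above only previews the new offsets. To actually move them, stop all consumers in the group and rerun with --execute; --to-datetime and --shift-by are other common reset strategies. A guarded sketch:

```shell
#!/usr/bin/env bash
KAFKA_BIN=~/kafka-training/kafka/bin

if [ -x "$KAFKA_BIN/kafka-consumer-groups.sh" ]; then
    # Apply the reset (all consumers in the group must be stopped first)
    "$KAFKA_BIN/kafka-consumer-groups.sh" \
        --bootstrap-server localhost:9092 \
        --group my-consumer-group \
        --reset-offsets \
        --to-datetime 2025-01-01T00:00:00.000 \
        --topic my-topic \
        --execute
else
    echo "Kafka not found at $KAFKA_BIN - adjust KAFKA_BIN"
fi
```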

3. Topic Management

~/kafka-training/lab1/manage-topics.sh

#!/usr/bin/env bash
cd ~/kafka-training

# Alter topic configuration
kafka/bin/kafka-configs.sh \
    --bootstrap-server localhost:9092 \
    --entity-type topics \
    --entity-name my-topic \
    --alter \
    --add-config retention.ms=604800000

# Delete topic (be careful!)
# kafka/bin/kafka-topics.sh \
#     --bootstrap-server localhost:9092 \
#     --delete \
#     --topic my-topic

Monitoring and Debugging

View Log Segments

# Check Kafka logs
ls -la /tmp/kraft-combined-logs/my-topic-0/

# View log segment files
kafka/bin/kafka-dump-log.sh \
    --files /tmp/kraft-combined-logs/my-topic-0/00000000000000000000.log \
    --print-data-log

JMX Metrics

# Enable remote JMX before starting Kafka (development only - no authentication!)
export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote \
    -Dcom.sun.management.jmxremote.port=9999 \
    -Dcom.sun.management.jmxremote.authenticate=false \
    -Dcom.sun.management.jmxremote.ssl=false"

# Use jconsole or other JMX tools to connect to localhost:9999

Common Patterns and Best Practices

Message Ordering

flowchart TB
  subgraph Ordering["Message Ordering Guarantees"]
    P1[Partition 1<br>user1 messages<br>Ordered ✓]
    P2[Partition 2<br>user2 messages<br>Ordered ✓]
    P3[Partition 3<br>user3 messages<br>Ordered ✓]
    
    C[Single Consumer<br>Mixed Order]
  end
  
  P1 --> C
  P2 --> C
  P3 --> C
  
  style P1 fill:#e8f5e9,stroke:#43a047,stroke-width:1px
  style P2 fill:#e3f2fd,stroke:#1976d2,stroke-width:1px
  style P3 fill:#fff9c4,stroke:#f9a825,stroke-width:1px

Messages are ordered within a partition, not across partitions. Use message keys to ensure related messages go to the same partition.
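You can see this for yourself: produce a few records that share a key, then consume with print.partition enabled. Every record with the same key should report the same partition number. A sketch reusing the my-topic setup from earlier (the guard is our addition):

```shell
#!/usr/bin/env bash
KAFKA_BIN=~/kafka-training/kafka/bin

if [ -x "$KAFKA_BIN/kafka-console-producer.sh" ]; then
    # Produce three records that all share the key "order-42"...
    printf 'order-42:created\norder-42:paid\norder-42:shipped\n' | \
        "$KAFKA_BIN/kafka-console-producer.sh" \
            --bootstrap-server localhost:9092 \
            --topic my-topic \
            --property "parse.key=true" \
            --property "key.separator=:"

    # ...then confirm they all land in the same partition
    "$KAFKA_BIN/kafka-console-consumer.sh" \
        --bootstrap-server localhost:9092 \
        --topic my-topic \
        --from-beginning \
        --property print.key=true \
        --property print.partition=true \
        --timeout-ms 10000
else
    echo "Kafka not found at $KAFKA_BIN - adjust KAFKA_BIN"
fi
```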

Production Configurations

# Production-ready topic
kafka/bin/kafka-topics.sh --create \
    --bootstrap-server localhost:9092 \
    --replication-factor 3 \
    --partitions 50 \
    --topic production-events \
    --config min.insync.replicas=2 \
    --config retention.ms=604800000 \
    --config compression.type=snappy

Troubleshooting

Common Issues and Solutions

Issue                   Solution
---------------------   ---------------------------------------------
Broker not available    Check if Kafka is running: jps | grep Kafka
Topic already exists    List topics first, or use --if-not-exists
Messages out of order   Use key-based partitioning
Consumer lag            Add more consumers or partitions
Connection refused      Verify the listeners configuration

Useful Debug Commands

# Check Kafka process
ps aux | grep kafka

# View Kafka logs
tail -f ~/kafka-training/kafka/logs/server.log

# Test connectivity
nc -zv localhost 9092

# Check disk usage
df -h /tmp/kraft-combined-logs/

Review Questions

What’s different in Kafka 4.0?

No ZooKeeper! Kafka uses KRaft for consensus, making it simpler to deploy and manage.

What tool creates topics?

kafka-topics.sh with --bootstrap-server (the old --zookeeper flag was removed in Kafka 4.0)

How do you ensure message ordering?

Use message keys - messages with the same key go to the same partition.

What’s a consumer group?

A group of consumers that share the work of reading from topic partitions.

How do you check consumer lag?

Use kafka-consumer-groups.sh --describe --group <group> and read the LAG column.

Next Steps

Now that you’ve mastered the command line basics:

  1. Try the Java APIs

  2. Learn about Architecture

  3. Explore Advanced Features

Summary

Congratulations! You’ve successfully:

  • ✅ Started Kafka 4.0 with KRaft (no ZooKeeper!)
  • ✅ Created and managed topics
  • ✅ Produced and consumed messages
  • ✅ Explored consumer groups
  • ✅ Used performance testing tools

Kafka’s command-line tools provide powerful capabilities for development, testing, and operations. With KRaft mode, getting started is simpler than ever!

About Cloudurable

We hope you enjoyed this tutorial. Please provide feedback. Cloudurable provides Kafka training, Kafka consulting, Kafka support, and helps set up Kafka clusters in AWS.

Check out our new GoLang course. We provide onsite, instructor-led Go training.
