January 9, 2025
🚀 What’s New in This 2025 Update
Major Changes Since 2020
- ZooKeeper Dependency Reduced - Kafka's ZooKeeper mode is deprecated in favor of KRaft; etcd and Consul cover coordination and configuration use cases
- Operator Patterns - Automated lifecycle management for complex stateful apps
- CSI Drivers - Advanced storage operations, snapshots, and backup integration
- PVC Cleanup - Automated volume management for StatefulSets
- Security Enhanced - RBAC, network policies, and secret management
- Cloud-Native Ready - Mature operators for Redis, PostgreSQL, MongoDB
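The PVC cleanup item above corresponds to the StatefulSet field persistentVolumeClaimRetentionPolicy. The following is a minimal sketch only: the name `cache`, the image, and the sizes are placeholders chosen for illustration, and the policy values are one reasonable combination rather than a recommendation.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cache                    # hypothetical name
spec:
  serviceName: cache             # assumes a headless Service named "cache"
  replicas: 3
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Delete          # remove PVCs when the StatefulSet is deleted
    whenScaled: Retain           # keep PVCs of pods removed by a scale-down
  selector:
    matchLabels:
      app: cache
  template:
    metadata:
      labels:
        app: cache
    spec:
      containers:
        - name: cache
          image: redis:7.2-alpine
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

Both fields default to Retain, so clusters that omit the block keep the older behavior of leaving volumes behind.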
Key Improvements
- ✅ Better Performance - Improved memory management and resource optimization
- ✅ Enhanced Security - Encryption, RBAC, and least privilege patterns
- ✅ Automated Operations - Operators handle complex lifecycle management
- ✅ Modern Storage - CSI drivers with snapshot and backup capabilities
Modern StatefulSet Patterns
StatefulSets remain the foundation for running stateful applications in Kubernetes. However, the ecosystem has evolved significantly with better tooling, operators, and modern alternatives to legacy technologies like ZooKeeper.
Modern Alternatives to ZooKeeper
KRaft (Kafka Raft Metadata Mode)
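KRaft replaces ZooKeeper with a Raft-based metadata quorum built into Kafka itself. It has been production-ready since Kafka 3.3, ZooKeeper mode was deprecated in 3.5, and Kafka 4.0 drops ZooKeeper mode entirely.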
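apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka-kraft
spec:
  serviceName: kafka-headless
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka
          image: apache/kafka:3.8.0
          env:
            # node.id must be an integer, so use the pod's ordinal index
            # (a label set by the StatefulSet controller) instead of
            # metadata.name, which would yield "kafka-kraft-0".
            - name: KAFKA_NODE_ID
              valueFrom:
                fieldRef:
                  fieldPath: metadata.labels['apps.kubernetes.io/pod-index']
            - name: KAFKA_PROCESS_ROLES
              value: "controller,broker"
            # Voter IDs and hostnames must match each pod's node.id and
            # DNS name (<statefulset>-<ordinal>.<service>).
            - name: KAFKA_CONTROLLER_QUORUM_VOTERS
              value: "0@kafka-kraft-0.kafka-headless:9093,1@kafka-kraft-1.kafka-headless:9093,2@kafka-kraft-2.kafka-headless:9093"
            # Listener settings are omitted here for brevity; a working
            # cluster also needs them.
          ports:
            - containerPort: 9092
              name: kafka
            - containerPort: 9093
              name: controller
          volumeMounts:
            - name: data
              mountPath: /var/lib/kafka
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi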
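The serviceName: kafka-headless field above assumes that a headless Service of that name already exists; the StatefulSet does not create it. A minimal sketch of that Service follows. Setting publishNotReadyAddresses is an assumption made here so that controllers can resolve each other before any pod reports ready.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kafka-headless
spec:
  clusterIP: None                  # headless: one DNS record per pod
  publishNotReadyAddresses: true   # assumption: let peers resolve during bootstrap
  selector:
    app: kafka
  ports:
    - name: kafka
      port: 9092
    - name: controller
      port: 9093
```

With this in place, each pod gets a stable DNS name of the form <pod-name>.kafka-headless, which is what the quorum voter list relies on.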
etcd for Distributed Configuration
etcd provides a distributed key-value store with strong consistency.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: etcd
spec:
  serviceName: etcd
  replicas: 3
  selector:
    matchLabels:
      app: etcd
  template:
    metadata:
      labels:
        app: etcd
    spec:
      containers:
        - name: etcd
          image: quay.io/coreos/etcd:v3.5.12
          command:
            - etcd
            - --name=$(POD_NAME)
            - --data-dir=/var/lib/etcd
            - --initial-advertise-peer-urls=http://$(POD_NAME).etcd:2380
            - --listen-peer-urls=http://0.0.0.0:2380
            - --advertise-client-urls=http://$(POD_NAME).etcd:2379
            - --listen-client-urls=http://0.0.0.0:2379
            - --initial-cluster=etcd-0=http://etcd-0.etcd:2380,etcd-1=http://etcd-1.etcd:2380,etcd-2=http://etcd-2.etcd:2380
            - --initial-cluster-state=new
            - --initial-cluster-token=etcd-cluster
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          ports:
            - containerPort: 2379
              name: client
            - containerPort: 2380
              name: peer
          volumeMounts:
            - name: data
              mountPath: /var/lib/etcd
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
Modern StatefulSet Best Practices
1. Resource Management
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7.2-alpine
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          ports:
            - containerPort: 6379
              name: redis
          volumeMounts:
            - name: data
              mountPath: /data
            - name: config
              mountPath: /etc/redis
      volumes:
        - name: config
          configMap:
            name: redis-config
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: "fast-ssd"
        resources:
          requests:
            storage: 5Gi
2. Pod Disruption Budgets
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: redis-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: redis
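With three replicas, minAvailable: 2 permits one voluntary eviction at a time. The same intent can be written with maxUnavailable, which keeps working if the replica count changes. The sketch below is an alternative to the budget above, not an addition to it (two budgets selecting the same pods cause evictions to be refused). The unhealthyPodEvictionPolicy setting is an optional choice made here so that node drains are not blocked by pods that are already unhealthy.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: redis-pdb
spec:
  maxUnavailable: 1                        # at most one pod evicted at a time
  unhealthyPodEvictionPolicy: AlwaysAllow  # optional: unhealthy pods may always be evicted
  selector:
    matchLabels:
      app: redis
```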
3. Modern Anti-Affinity Rules
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgresql
spec:
  serviceName: postgresql
  replicas: 3
  selector:
    matchLabels:
      app: postgresql
  template:
    metadata:
      labels:
        app: postgresql
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: never place two replicas on the same node.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: postgresql
              topologyKey: kubernetes.io/hostname
        podAffinity:
          # Soft rule: prefer keeping replicas in the same zone, trading
          # zone-level resilience for lower replication latency.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: postgresql
                topologyKey: topology.kubernetes.io/zone
      containers:
        - name: postgresql
          image: postgres:16
          env:
            - name: POSTGRES_DB
              value: mydb
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: postgresql-secret
                  key: username
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgresql-secret
                  key: password
            # Use a subdirectory so initdb does not trip over the
            # volume's lost+found directory.
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          ports:
            - containerPort: 5432
              name: postgresql
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
            - name: config
              mountPath: /etc/postgresql
      volumes:
        - name: config
          configMap:
            name: postgresql-config
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: "high-performance"
        resources:
          requests:
            storage: 20Gi
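The zone-level podAffinity above pulls replicas toward one zone. If the goal is instead to spread replicas across zones, topology spread constraints express that more directly. The fragment below is a sketch of that alternative; it would sit under spec.template.spec of the postgresql StatefulSet, and the choice of ScheduleAnyway (soft) over DoNotSchedule (hard) is an assumption.

```yaml
# Fragment of spec.template.spec, not a complete manifest
topologySpreadConstraints:
  - maxSkew: 1                                # zones may differ by at most one replica
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway         # soft constraint; DoNotSchedule makes it hard
    labelSelector:
      matchLabels:
        app: postgresql
```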
CSI Drivers and Storage Management
Volume Snapshots for Backup
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: postgresql-snapshot
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: data-postgresql-0
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-snapclass
driver: ebs.csi.aws.com
deletionPolicy: Delete
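A snapshot is only useful if it can be restored. One way to do that is to create a new PVC whose dataSource points at the snapshot, sketched below. The claim name is hypothetical, and the storage class is assumed to be backed by the same CSI driver as the snapshot class.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-postgresql-restore        # hypothetical name
spec:
  storageClassName: high-performance   # assumed to use the same CSI driver
  dataSource:
    name: postgresql-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi                    # must be at least the size of the source volume
```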
Automated Backup CronJob
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgresql-backup
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              image: postgres:16
              command:
                - /bin/bash
                - -c
                - |
                  pg_dump -h postgresql-0.postgresql -U $POSTGRES_USER -d mydb > /backup/backup-$(date +%Y%m%d-%H%M%S).sql
              env:
                - name: POSTGRES_USER
                  valueFrom:
                    secretKeyRef:
                      name: postgresql-secret
                      key: username
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: postgresql-secret
                      key: password
              volumeMounts:
                - name: backup-storage
                  mountPath: /backup
          volumes:
            - name: backup-storage
              persistentVolumeClaim:
                claimName: backup-pvc
          restartPolicy: OnFailure
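The CronJob above runs with default scheduling behavior, which allows overlapping runs and unbounded retries. The fragment below sketches a few standard CronJob and Job fields that could be merged into it; the specific numbers are illustrative assumptions, not tuned values.

```yaml
# Fragment to merge into the postgresql-backup CronJob, not a complete manifest
spec:
  schedule: "0 2 * * *"
  concurrencyPolicy: Forbid          # skip a run if the previous one is still going
  startingDeadlineSeconds: 3600      # give up on a run missed by more than an hour
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      backoffLimit: 2                # retry a failed backup at most twice
      activeDeadlineSeconds: 1800    # stop a backup that runs longer than 30 minutes
```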
Operator Patterns
Redis Operator Example
# Illustrative custom resource: the API group and field names depend on
# the operator that defines the RedisCluster kind.
apiVersion: databases.cloudurable.com/v1alpha1
kind: RedisCluster
metadata:
  name: redis-cluster
spec:
  replicas: 3
  version: "7.2"
  resources:
    requests:
      memory: "1Gi"
      cpu: "500m"
    limits:
      memory: "2Gi"
      cpu: "1000m"
  storage:
    size: "10Gi"
    storageClassName: "fast-ssd"
  backup:
    schedule: "0 3 * * *"
    retention: "7d"
  monitoring:
    enabled: true
    serviceMonitor: true
  security:
    authSecret: "redis-auth"
    tls:
      enabled: true
      certManager: true
PostgreSQL Operator Configuration
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgresql-cluster
spec:
  instances: 3
  postgresql:
    parameters:
      max_connections: "200"
    # CloudNativePG manages shared_preload_libraries itself, so it is set
    # through this dedicated field rather than under parameters.
    shared_preload_libraries:
      - pg_stat_statements
  bootstrap:
    initdb:
      database: myapp
      owner: myapp
      secret:
        name: postgresql-credentials
  storage:
    size: 50Gi
    storageClass: fast-ssd
  backup:
    retentionPolicy: "30d"
    barmanObjectStore:
      destinationPath: "s3://my-backup-bucket/postgresql"
      s3Credentials:
        accessKeyId:
          name: backup-credentials
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: backup-credentials
          key: SECRET_ACCESS_KEY
        region:
          name: backup-credentials
          key: REGION
  monitoring:
    enablePodMonitor: true
  # The Cluster resource exposes its own affinity fields instead of the
  # raw pod affinity structure; verify names against your operator version.
  affinity:
    enablePodAntiAffinity: true
    podAntiAffinityType: required
    topologyKey: kubernetes.io/hostname
Security Best Practices
Network Policies
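apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: postgresql-network-policy
spec:
  podSelector:
    matchLabels:
      app: postgresql
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        # The two entries are ORed: any pod in a namespace labeled
        # name=application, or web-server pods in this namespace.
        - namespaceSelector:
            matchLabels:
              name: application
        - podSelector:
            matchLabels:
              app: web-server
      ports:
        - protocol: TCP
          port: 5432
  egress:
    # Allow DNS lookups only. kubernetes.io/metadata.name is set
    # automatically on every namespace; a plain "name" label is not.
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: TCP
          port: 53
        - protocol: UDP
          port: 53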
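An allow-list policy like the one above only restricts the pods it selects. A common companion, sketched here under the assumption that the whole namespace should be closed by default, is a default-deny policy that selects every pod and allows no ingress; specific policies then open individual paths.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress   # hypothetical name
spec:
  podSelector: {}              # empty selector matches every pod in the namespace
  policyTypes:
    - Ingress                  # no ingress rules listed, so all ingress is denied
```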
Secret Management
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: postgresql-secret
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: postgresql-credentials
    creationPolicy: Owner
  data:
    - secretKey: username
      remoteRef:
        key: database/postgresql
        property: username
    - secretKey: password
      remoteRef:
        key: database/postgresql
        property: password
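The ExternalSecret above refers to a SecretStore named vault-backend that is not defined in this guide. A rough sketch of what it might look like for a Vault KV v2 backend follows. The server URL, mount path, and token Secret are all placeholders, and token authentication is used only because it is the shortest to show; check the fields against the External Secrets Operator version you run.

```yaml
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-backend
spec:
  provider:
    vault:
      server: "https://vault.example.com:8200"   # placeholder URL
      path: "secret"                             # placeholder KV mount path
      version: "v2"
      auth:
        tokenSecretRef:
          name: vault-token                      # hypothetical Secret holding a Vault token
          key: token
```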
Modern Monitoring Stack
ServiceMonitor for Prometheus
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: redis-metrics
spec:
  selector:
    matchLabels:
      app: redis
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
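A ServiceMonitor selects Services, not pods, and matches endpoints by port name. The one above therefore needs a Service labeled app: redis with a port called metrics. The sketch below assumes a metrics exporter sidecar listening on port 9121 in each Redis pod; Redis itself does not serve Prometheus metrics, and that sidecar is not part of the StatefulSet shown earlier.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: redis-metrics
  labels:
    app: redis            # matched by the ServiceMonitor selector
spec:
  selector:
    app: redis
  ports:
    - name: metrics       # matched by the ServiceMonitor endpoint port
      port: 9121          # assumed exporter port
      targetPort: 9121
```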
Grafana Dashboard ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-dashboard
  labels:
    grafana_dashboard: "1"
data:
  dashboard.json: |
    {
      "dashboard": {
        "title": "Redis StatefulSet Dashboard",
        "panels": [
          {
            "title": "Redis Memory Usage",
            "type": "graph",
            "targets": [
              {
                "expr": "redis_memory_used_bytes{pod=~\"redis-.*\"}"
              }
            ]
          },
          {
            "title": "Redis Commands Per Second",
            "type": "graph",
            "targets": [
              {
                "expr": "rate(redis_commands_total{pod=~\"redis-.*\"}[5m])"
              }
            ]
          }
        ]
      }
    }
Performance Optimization
Resource Quotas and Limits
apiVersion: v1
kind: ResourceQuota
metadata:
  name: stateful-quota
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    persistentvolumeclaims: "10"
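Once a quota covers CPU and memory requests and limits, pods that omit those values are rejected in that namespace. A LimitRange can supply defaults so that such pods are still admitted. The sketch below uses a hypothetical name and illustrative values.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: stateful-defaults   # hypothetical name
spec:
  limits:
    - type: Container
      defaultRequest:       # applied when a container sets no requests
        cpu: 250m
        memory: 512Mi
      default:              # applied when a container sets no limits
        cpu: 500m
        memory: 1Gi
```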
Priority Classes
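apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: stateful-high-priority
value: 1000
globalDefault: false
description: "High priority for stateful applications"
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: critical-database
spec:
  # serviceName, selector, and labels are required fields; the values
  # here are placeholders. Env and volumes are omitted for brevity.
  serviceName: critical-database
  selector:
    matchLabels:
      app: critical-database
  template:
    metadata:
      labels:
        app: critical-database
    spec:
      priorityClassName: stateful-high-priority
      containers:
        - name: database
          image: postgres:16
          resources:
            requests:
              memory: "2Gi"
              cpu: "1000m"
            limits:
              memory: "4Gi"
              cpu: "2000m"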
Cloud-Native Examples
Redis Cluster with Operator
# Install Redis Operator
kubectl apply -f https://raw.githubusercontent.com/ot-container-kit/redis-operator/master/config/manager/manager.yaml

# Create Redis Cluster
kubectl apply -f - <<EOF
apiVersion: redis.redis.opstreelabs.in/v1beta1
kind: RedisCluster
metadata:
  name: redis-cluster
spec:
  clusterSize: 3
  clusterVersion: v7.2.0
  persistenceEnabled: true
  storage:
    volumeClaimTemplate:
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 200m
      memory: 256Mi
EOF
MongoDB with Operator
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: mongodb-cluster
spec:
  members: 3
  type: ReplicaSet
  version: "7.0.0"
  security:
    authentication:
      modes: ["SCRAM"]
  users:
    - name: admin
      db: admin
      passwordSecretRef:
        name: mongodb-admin-password
      roles:
        - name: clusterAdmin
          db: admin
        - name: userAdminAnyDatabase
          db: admin
      scramCredentialsSecretName: mongodb-scram
  statefulSet:
    spec:
      template:
        spec:
          containers:
            - name: mongod
              resources:
                requests:
                  cpu: 500m
                  memory: 1Gi
                limits:
                  cpu: 1000m
                  memory: 2Gi
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi
Troubleshooting Modern StatefulSets
Common Issues and Solutions
# Check StatefulSet status
kubectl get statefulset -o wide
# Inspect pod events
kubectl describe pod <pod-name>
# Check PVC status
kubectl get pvc
# View pod logs
kubectl logs <pod-name> -f
# Check resource usage
kubectl top pods
# Exec into pod for debugging (use /bin/sh on Alpine-based images, which do not ship bash)
kubectl exec -it <pod-name> -- /bin/bash
# Check operator logs
kubectl logs -l app=redis-operator -n redis-operator-system
# Validate network policies
kubectl exec -it <pod-name> -- nc -zv <service-name> <port>
Health Checks
livenessProbe:
  exec:
    command:
      - redis-cli
      - ping
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3
readinessProbe:
  exec:
    command:
      - redis-cli
      - ping
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 1
  failureThreshold: 3
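Stateful workloads can take a long time to start, for example while loading a large dataset from disk, and a fixed initialDelaySeconds on the liveness probe is a guess at that time. A startup probe is one way to avoid the guess: liveness and readiness checks are held off until it succeeds. The fragment below is a sketch that belongs next to the probes above in the container spec; the five-minute budget is an assumption.

```yaml
# Fragment of a container spec, not a complete manifest
startupProbe:
  exec:
    command:
      - redis-cli
      - ping
  periodSeconds: 5
  failureThreshold: 60   # up to 60 x 5s = 5 minutes to start before a restart
```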
Summary
Modern StatefulSet patterns in 2025 emphasize:
- Operator-Driven Management - Use operators for complex lifecycle management
- Advanced Storage - Leverage CSI drivers for snapshots and backup
- Security First - Implement network policies, RBAC, and secret management
- Observability - Comprehensive monitoring with Prometheus and Grafana
- Cloud-Native - Embrace managed services and cloud integrations
- Performance - Optimize resources and use priority classes
- Automation - Automate backups, scaling, and recovery
These patterns provide the foundation for running reliable, scalable, and secure stateful applications in modern Kubernetes environments.
Last updated: January 2025 for Kubernetes 1.32+ and modern operator patterns