January 9, 2025
What’s New in 2025
Key Updates and Changes
- Cassandra 5.0: Vector search, SAI indexes, unified compaction strategy
- Container-First: Docker and Kubernetes have replaced most Vagrant workflows
- Cloud-Native: Multi-cloud deployment with infrastructure as code
- ARM Support: Native ARM64 support for Apple Silicon and AWS Graviton
- Observability: Enhanced monitoring with OpenTelemetry and Prometheus
Major Platform Evolution
- Docker Compose: Simplified multi-container orchestration
- Kubernetes: Production-ready Cassandra operators
- Testcontainers: Integration testing with ephemeral containers
- Colima/Podman: Docker alternatives for development
- GitOps: Infrastructure managed through Git workflows
The modern approach to Cassandra cluster development has evolved significantly since 2017. While Vagrant remains useful for certain scenarios, container-based development has become the standard for 2025.
Modern Cassandra Development Approaches
Container-First Development (Recommended)
Docker Compose has largely replaced Vagrant for local development:
# docker-compose.yml
version: '3.8'
services:
  cassandra-node1:
    image: cassandra:5.0
    container_name: cassandra-node1
    environment:
      - CASSANDRA_SEEDS=cassandra-node1,cassandra-node2,cassandra-node3
      - CASSANDRA_CLUSTER_NAME=test-cluster
      - CASSANDRA_DC=datacenter1
      - CASSANDRA_RACK=rack1
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
    volumes:
      - cassandra-data1:/var/lib/cassandra
    ports:
      - "9042:9042"
    networks:
      - cassandra-network
  cassandra-node2:
    image: cassandra:5.0
    container_name: cassandra-node2
    environment:
      - CASSANDRA_SEEDS=cassandra-node1,cassandra-node2,cassandra-node3
      - CASSANDRA_CLUSTER_NAME=test-cluster
      - CASSANDRA_DC=datacenter1
      - CASSANDRA_RACK=rack2
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
    volumes:
      - cassandra-data2:/var/lib/cassandra
    networks:
      - cassandra-network
    depends_on:
      - cassandra-node1
  cassandra-node3:
    image: cassandra:5.0
    container_name: cassandra-node3
    environment:
      - CASSANDRA_SEEDS=cassandra-node1,cassandra-node2,cassandra-node3
      - CASSANDRA_CLUSTER_NAME=test-cluster
      - CASSANDRA_DC=datacenter1
      - CASSANDRA_RACK=rack3
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
    volumes:
      - cassandra-data3:/var/lib/cassandra
    networks:
      - cassandra-network
    depends_on:
      - cassandra-node1
networks:
  cassandra-network:
    driver: bridge
volumes:
  cassandra-data1:
  cassandra-data2:
  cassandra-data3:
Quick Start Commands
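# Start the cluster
docker compose up -d
# Check cluster status
docker exec cassandra-node1 nodetool status
# Access CQL shell
docker exec -it cassandra-node1 cqlsh
# Scale cluster (add more nodes)
# Note: --scale does not work for a service that sets container_name, and
# replicas of one service would share its named volume. To add a node, define
# a cassandra-node4 service with its own volume in docker-compose.yml, then run:
docker compose up -d cassandra-node4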
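A freshly started cluster is only usable once every node reports state UN (Up/Normal) in `nodetool status`. The following is a minimal Python sketch of that check. The function names and the sample output are illustrative assumptions written for this example (the host IDs are placeholders, not real output from this cluster).

```python
def count_up_normal(status_output: str) -> int:
    """Count the nodes that nodetool reports as Up/Normal (state code UN)."""
    count = 0
    for line in status_output.splitlines():
        fields = line.split()
        # Node rows start with a two-letter state code such as UN, DN or UJ.
        if fields and fields[0] == "UN":
            count += 1
    return count


def cluster_ready(status_output: str, expected_nodes: int) -> bool:
    """Return True once at least expected_nodes nodes are Up/Normal."""
    return count_up_normal(status_output) >= expected_nodes


# Hand-written sample in the shape of `nodetool status` output (placeholder IDs).
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens  Owns (effective)  Host ID    Rack
UN  172.18.0.2  104.3 KiB  16      64.7%             aaaa-1111  rack1
UN  172.18.0.3  98.1 KiB   16      67.2%             bbbb-2222  rack2
UJ  172.18.0.4  30.5 KiB   16      ?                 cccc-3333  rack3
"""

print(count_up_normal(SAMPLE))   # 2 (the third node is still joining)
print(cluster_ready(SAMPLE, 3))  # False
```

In practice the text would come from `subprocess.run(["docker", "exec", "cassandra-node1", "nodetool", "status"], capture_output=True, text=True).stdout`, polled until `cluster_ready` returns True.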
Kubernetes Deployment (Production Ready)
For production-like local development, use Kubernetes:
# cassandra-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: cassandra
---
# cassandra-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: cassandra
  namespace: cassandra
spec:
  clusterIP: None
  selector:
    app: cassandra
  ports:
    - port: 9042
      name: cql
---
# cassandra-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  namespace: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
        - name: cassandra
          image: cassandra:5.0
          ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 9042
              name: cql
          resources:
            limits:
              cpu: "2"
              memory: 4Gi
            requests:
              cpu: "1"
              memory: 2Gi
          env:
            - name: CASSANDRA_SEEDS
              value: "cassandra-0.cassandra.cassandra.svc.cluster.local"
            - name: CASSANDRA_CLUSTER_NAME
              value: "test-cluster"
            - name: CASSANDRA_DC
              value: "datacenter1"
            - name: CASSANDRA_RACK
              value: "rack1"
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - name: cassandra-data
              mountPath: /var/lib/cassandra
  volumeClaimTemplates:
    - metadata:
        name: cassandra-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
Deploy to Kubernetes
# Create namespace and deploy
kubectl apply -f cassandra-namespace.yaml
kubectl apply -f cassandra-service.yaml
kubectl apply -f cassandra-statefulset.yaml
# Check status
kubectl get pods -n cassandra
kubectl exec -it cassandra-0 -n cassandra -- nodetool status
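The `CASSANDRA_SEEDS` value in the StatefulSet follows the stable DNS pattern that a headless Service gives each pod: `<pod>.<service>.<namespace>.svc.<cluster-domain>`. As a small sketch, the helper below (hypothetical names, and assuming the default `cluster.local` cluster domain) builds those names so a seed list can be derived instead of typed by hand.

```python
from __future__ import annotations


def pod_dns_names(statefulset: str, service: str, namespace: str,
                  replicas: int, cluster_domain: str = "cluster.local") -> list[str]:
    """Return the per-pod DNS names of a StatefulSet behind a headless Service."""
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


def seeds_value(names: list[str], seed_count: int) -> str:
    """Join the first seed_count pod names into a CASSANDRA_SEEDS string."""
    return ",".join(names[:seed_count])


names = pod_dns_names("cassandra", "cassandra", "cassandra", replicas=3)
print(seeds_value(names, 1))  # cassandra-0.cassandra.cassandra.svc.cluster.local
```

Using only the first pod as a seed matches the manifest above; passing `seed_count=2` would add `cassandra-1` as a second seed.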
Testcontainers for Integration Testing
Modern development uses Testcontainers for integration testing:
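// CassandraIntegrationTest.java
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertTrue;

import com.datastax.oss.driver.api.core.CqlSession;
import java.net.InetSocketAddress;
import org.junit.jupiter.api.MethodOrderer.OrderAnnotation;
import org.junit.jupiter.api.Order;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.TestMethodOrder;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.DynamicPropertyRegistry;
import org.springframework.test.context.DynamicPropertySource;
import org.testcontainers.containers.CassandraContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;

@Testcontainers // required: starts and stops the fields annotated with @Container
@SpringBootTest
@TestMethodOrder(OrderAnnotation.class)
class CassandraIntegrationTest {
    @Container
    static final CassandraContainer<?> cassandra = new CassandraContainer<>("cassandra:5.0")
            .withExposedPorts(9042)
            .withInitScript("init-schema.cql")
            .withReuse(true);

    @DynamicPropertySource
    static void configureProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.data.cassandra.contact-points", cassandra::getHost);
        registry.add("spring.data.cassandra.port", cassandra::getFirstMappedPort);
        registry.add("spring.data.cassandra.local-datacenter", () -> "datacenter1");
    }

    @Test
    @Order(1)
    void testConnection() {
        assertTrue(cassandra.isRunning());
        // Port 9042 is mapped to a random host port, so check that a mapping exists
        assertTrue(cassandra.getFirstMappedPort() > 0);
    }

    @Test
    @Order(2)
    void testCRUDOperations() {
        // Test your Cassandra operations; try-with-resources closes the session
        try (CqlSession session = CqlSession.builder()
                .addContactPoint(new InetSocketAddress(cassandra.getHost(), cassandra.getFirstMappedPort()))
                .withLocalDatacenter("datacenter1")
                .build()) {
            // Execute test queries
            assertNotNull(session.execute("SELECT * FROM system.local").one());
        }
    }
}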
Modern Vagrant Alternative (When Containers Won’t Work)
For scenarios requiring full VM isolation:
# Vagrantfile for 2025
Vagrant.configure("2") do |config|
  config.vm.box = "almalinux/9"
  # Use libvirt or VMware instead of VirtualBox
  config.vm.provider "libvirt" do |libvirt|
    libvirt.memory = 4096
    libvirt.cpus = 4
    libvirt.nested = true
  end
  # Modern provisioning with Ansible
  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "provision/cassandra.yml"
    ansible.extra_vars = {
      # Full release number (for example 5.0.2); the download URL has no bare "5.0" path
      cassandra_version: "5.0.2",
      cluster_name: "test-cluster",
      enable_vector_search: true
    }
  end
  # Define nodes with modern approach
  (1..3).each do |i|
    config.vm.define "cassandra-node#{i}" do |node|
      node.vm.network "private_network", ip: "192.168.50.#{i+10}"
      node.vm.hostname = "cassandra-node#{i}"
      # Use cloud-init for configuration
      node.vm.provision "shell", inline: <<-SHELL
        cloud-init clean
        /opt/cassandra/bin/cassandra-cloud \
          -cluster-name test-cluster \
          -client-address 192.168.50.#{i+10} \
          -cluster-address 192.168.50.#{i+10} \
          -cluster-seeds 192.168.50.11,192.168.50.12,192.168.50.13 \
          -enable-vector-search true
      SHELL
    end
  end
end
Cloud-Native Configuration Management
Modern Ansible Playbook
# provision/cassandra.yml
---
- name: Configure Cassandra 5.0 Cluster
  hosts: all
  become: yes
  vars:
    # Must be a full release number (for example 5.0.2); the download URL has no bare "5.0" path.
    # Superseded releases move to archive.apache.org/dist/cassandra/.
    cassandra_version: "5.0.2"
    cluster_name: "test-cluster"
    enable_vector_search: true
  tasks:
    - name: Install OpenJDK 17
      package:
        name: java-17-openjdk
        state: present
    - name: Download Cassandra 5.0
      get_url:
        url: "https://downloads.apache.org/cassandra/{{ cassandra_version }}/apache-cassandra-{{ cassandra_version }}-bin.tar.gz"
        dest: /tmp/cassandra.tar.gz
    - name: Extract Cassandra
      unarchive:
        src: /tmp/cassandra.tar.gz
        dest: /opt/
        remote_src: yes
        creates: /opt/apache-cassandra-{{ cassandra_version }}
    - name: Create Cassandra symlink
      file:
        src: /opt/apache-cassandra-{{ cassandra_version }}
        dest: /opt/cassandra
        state: link
    - name: Configure Cassandra
      template:
        src: cassandra.yaml.j2
        dest: /opt/cassandra/conf/cassandra.yaml
    - name: Create systemd service
      template:
        src: cassandra.service.j2
        dest: /etc/systemd/system/cassandra.service
    - name: Enable and start Cassandra
      systemd:
        name: cassandra
        state: started
        enabled: yes
        daemon_reload: yes
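The playbook renders a `cassandra.yaml.j2` template that the article does not show. As a rough sketch of what such a template has to produce, the Python below fills in only the cluster-specific keys (`cluster_name`, `listen_address`, `rpc_address`, `endpoint_snitch`, `seed_provider`), using the `192.168.50.(i+10)` addressing from the Vagrantfile. The function names are hypothetical, and a real template would start from the `cassandra.yaml` shipped with the release rather than from this fragment.

```python
from __future__ import annotations


def node_ip(index: int, subnet: str = "192.168.50", offset: int = 10) -> str:
    """Mirror the Vagrantfile addressing: node i gets <subnet>.<i + offset>."""
    return f"{subnet}.{index + offset}"


def render_cluster_settings(cluster_name: str, listen_address: str,
                            seeds: list[str]) -> str:
    """Render the per-node, cluster-specific part of cassandra.yaml."""
    seed_list = ",".join(seeds)
    return (
        f"cluster_name: '{cluster_name}'\n"
        f"listen_address: {listen_address}\n"
        f"rpc_address: {listen_address}\n"
        "endpoint_snitch: GossipingPropertyFileSnitch\n"
        "seed_provider:\n"
        "  - class_name: org.apache.cassandra.locator.SimpleSeedProvider\n"
        "    parameters:\n"
        f"      - seeds: \"{seed_list}\"\n"
    )


seeds = [node_ip(i) for i in (1, 2, 3)]
print(render_cluster_settings("test-cluster", node_ip(2), seeds))
```

Every node gets the same seed list but its own `listen_address`, which is the one value the template must vary per host.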
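Modern systemd Service
# templates/cassandra.service.j2
[Unit]
Description=Apache Cassandra 5.0
After=network.target
[Service]
# Cassandra does not send sd_notify readiness messages, so Type=notify would
# time out on start; Type=simple is used instead
Type=simple
# The cassandra user and group must already exist on the host
User=cassandra
Group=cassandra
ExecStart=/opt/cassandra/bin/cassandra -f
# No ExecReload: the JVM treats SIGHUP as a shutdown request, not a reload
KillMode=process
Restart=always
RestartSec=10
# Security enhancements
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
# cassandra.yaml and the logging config must point data and logs at these paths
ReadWritePaths=/var/lib/cassandra /var/log/cassandra
# Resource limits
LimitNOFILE=100000
LimitMEMLOCK=infinity
LimitAS=infinity
[Install]
WantedBy=multi-user.target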
Monitoring and Observability (2025)
Prometheus Integration
# docker-compose.monitoring.yml
version: '3.8'
services:
  cassandra-exporter:
    image: instaclustr/cassandra-exporter:0.9.10
    environment:
      - CONFIG_FILE=/etc/cassandra-exporter/config.yml
    ports:
      - "9500:9500"
    depends_on:
      - cassandra-node1
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana-storage:/var/lib/grafana
volumes:
  grafana-storage:
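The Prometheus service mounts a `./prometheus.yml` that is not shown. A minimal sketch of generating one is below, assuming the exporter serves metrics on the default `/metrics` path at `cassandra-exporter:9500` (the service name and port from the Compose file). The function name, job name, and scrape interval are illustrative choices, not values from the article.

```python
from __future__ import annotations


def prometheus_config(targets: list[str], job_name: str = "cassandra",
                      scrape_interval: str = "15s") -> str:
    """Build a minimal prometheus.yml with one static scrape job."""
    lines = [
        "global:",
        f"  scrape_interval: {scrape_interval}",
        "scrape_configs:",
        f"  - job_name: {job_name}",
        "    static_configs:",
        "      - targets:",
    ]
    # One list entry per exporter endpoint (host:port).
    lines += [f"          - '{target}'" for target in targets]
    return "\n".join(lines) + "\n"


config_text = prometheus_config(["cassandra-exporter:9500"])
print(config_text)
```

Writing `config_text` to `prometheus.yml` next to the Compose file gives Prometheus a single job to scrape; more exporters can be added by extending the target list.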
OpenTelemetry Configuration
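# cassandra.yaml additions for 2025
diagnostic_events_enabled: true
# Full query logging has no boolean switch in cassandra.yaml. Set
# full_query_logging_options there, or run: nodetool enablefullquerylog --path <dir>
# OpenTelemetry configuration
# JVM flags are not cassandra.yaml settings. Put them in conf/jvm-server.options
# (one flag per line) or pass them through JVM_EXTRA_OPTS:
-javaagent:/opt/cassandra/agents/opentelemetry-javaagent.jar
-Dotel.service.name=cassandra
# 4318 is the OTLP/HTTP port (4317 for gRPC); 14250 is Jaeger's own collector port, not OTLP
-Dotel.exporter.otlp.endpoint=http://jaeger:4318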
Security Best Practices (2025)
Container Security
# Dockerfile for secure Cassandra image
FROM cassandra:5.0
# The official image already defines the cassandra user and group, so no
# groupadd/useradd step is needed (repeating it would fail the build)
# Security hardening
RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*
# Set secure permissions
RUN chown -R cassandra:cassandra /var/lib/cassandra /var/log/cassandra \
    && chmod 700 /var/lib/cassandra
USER cassandra
EXPOSE 9042 7000 7001 7199
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD cqlsh -e "SELECT now() FROM system.local"
Network Security
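# Network policies for Kubernetes
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: cassandra-network-policy
  namespace: cassandra
spec:
  podSelector:
    matchLabels:
      app: cassandra
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Only pods labeled app: cassandra may connect; add another "from" entry for client applications
    - from:
        - podSelector:
            matchLabels:
              app: cassandra
      ports:
        - protocol: TCP
          port: 9042
        - protocol: TCP
          port: 7000
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: cassandra
      ports:
        - protocol: TCP
          port: 7000
    # Allow DNS lookups so the seed hostname can be resolved
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53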
Performance Testing and Validation
Modern Load Testing
# Using cassandra-stress with modern parameters
# In the official image the tool lives under tools/bin, which is not on PATH
docker exec cassandra-node1 /opt/cassandra/tools/bin/cassandra-stress write \
    n=1000000 \
    -mode native cql3 \
    -rate threads=50 \
    -node cassandra-node1,cassandra-node2,cassandra-node3
# Vector search testing (Cassandra 5.0)
docker exec cassandra-node1 /opt/cassandra/tools/bin/cassandra-stress user \
    profile=vector-search.yaml \
    n=100000 \
    -rate threads=10
Automated Testing Pipeline
# .github/workflows/cassandra-test.yml
name: Cassandra Integration Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up JDK 17
        uses: actions/setup-java@v4
        with:
          java-version: '17'
          distribution: 'temurin'
      - name: Start Cassandra with Docker Compose
        # Hosted runners ship Compose v2, invoked as "docker compose"
        run: |
          docker compose up -d
          ./scripts/wait-for-cassandra.sh
      - name: Run integration tests
        run: |
          ./mvnw test -Dspring.profiles.active=integration
      - name: Collect logs
        if: failure()
        run: |
          docker compose logs > cassandra-logs.txt
      - name: Upload logs
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: cassandra-logs
          path: cassandra-logs.txt
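The workflow calls `./scripts/wait-for-cassandra.sh`, which the article does not include. One possible shape for that wait step, sketched here in Python rather than shell, is a retry loop that polls the CQL port until it accepts TCP connections. The function names and the retry defaults are assumptions for illustration.

```python
import socket
import time


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host: str, port: int, attempts: int = 30,
                  delay: float = 5.0) -> bool:
    """Poll host:port up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if port_open(host, port):
            return True
        # Skip the sleep after the final failed attempt.
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A CI step could call `wait_for_port("localhost", 9042)` and fail the job when it returns False. An open port only shows that the native transport is listening, so a stricter check would follow it with a trivial CQL query.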
We hope this blog post on modern Cassandra cluster setup is useful. We find it essential for current DevOps practices. We also provide Cassandra consulting and Kafka consulting to get you set up quickly in AWS with CloudFormation and CloudWatch. Check out our Cassandra training and Kafka training. Cloudurable specializes in AWS DevOps Automation for Cassandra and Kafka.
Migration from Legacy Vagrant Setup
Step-by-Step Migration
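# 1. Export existing Vagrant cluster data
vagrant ssh node0 -c "nodetool snapshot"
# Write the archive to the /vagrant synced folder so it is visible on the host,
# and store paths relative to /var/lib/cassandra so step 3 restores them in place
vagrant ssh node0 -c "sudo tar -czf /vagrant/cassandra-backup.tar.gz -C /var/lib/cassandra data"
# 2. Convert to Docker Compose
docker compose up -d
# 3. Restore data to new cluster
# Note: this copies raw data directories, system tables included, so it only suits
# a throwaway test cluster. For real data, load per-table snapshots with sstableloader.
docker cp ./cassandra-backup.tar.gz cassandra-node1:/tmp/
docker exec cassandra-node1 tar -xzf /tmp/cassandra-backup.tar.gz -C /var/lib/cassandra/
# 4. Restart cluster
docker compose restart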
Configuration Comparison
Feature | Legacy Vagrant | Modern Docker | Kubernetes |
---|---|---|---|
Startup Time | 5-10 minutes | 30-60 seconds | 1-2 minutes |
Resource Usage | High (full VMs) | Low (containers) | Medium (pods) |
Networking | Complex NAT | Simple bridge | Cluster DNS and Services |
Persistence | VM disk | Volumes | PVCs |
Scaling | Manual | Add a service per node | StatefulSet replicas |
Cloud Deployment Integration
AWS EKS Integration
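# cassandra-cluster-eks.yaml
# Note: HelmChart (helm.cattle.io/v1) is the K3s/RKE2 helm-controller resource and
# is not installed on EKS by default. Install that controller first, or pass the
# same values to "helm install" with the Bitnami chart.
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: cassandra
  namespace: cassandra
spec:
  chart: cassandra
  repo: https://charts.bitnami.com/bitnami
  valuesContent: |-
    cluster:
      name: test-cluster
      datacenter: us-west-2
      seedCount: 3
    replicaCount: 3
    resources:
      requests:
        memory: 4Gi
        cpu: 2
      limits:
        memory: 8Gi
        cpu: 4
    persistence:
      enabled: true
      # Assumes a StorageClass named gp3 has been created in the cluster
      storageClass: gp3
      size: 100Gi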
Multi-Cloud Deployment
# terraform/main.tf
module "cassandra_cluster" {
  source = "./modules/cassandra"
  providers = {
    aws = aws.us-west-2
    gcp = google.us-central1
  }
  cluster_name       = "multi-cloud-cluster"
  replication_factor = 3
  aws_config = {
    instance_type      = "r6g.2xlarge"
    availability_zones = ["us-west-2a", "us-west-2b", "us-west-2c"]
  }
  gcp_config = {
    machine_type = "n2-highmem-8"
    zones        = ["us-central1-a", "us-central1-b", "us-central1-c"]
  }
}
Best Practices Summary
Development Environment Selection
- Small projects: Docker Compose for simplicity
- Microservices: Kubernetes for production parity
- Integration testing: Testcontainers for ephemeral clusters
- CI/CD: GitHub Actions with container-based testing
- Legacy systems: Modern Vagrant with updated provisioning
Performance Optimization
- Use SSD storage: Even in development environments
- Allocate sufficient memory: 4GB minimum per node
- Enable JVM tuning: G1GC with modern settings
- Monitor resource usage: Prometheus + Grafana
- Test with realistic data: Use cassandra-stress
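To see why 4GB per node is a sensible floor, it helps to look at how Cassandra picks a heap size when none is set. The sketch below reproduces the heuristic long used by `cassandra-env.sh`: take the larger of (half of RAM, capped at 1024 MB) and (a quarter of RAM, capped at 8192 MB). It is an assumption that your Cassandra 5.0 build still uses exactly these caps, so check the script shipped with your release. The function name is hypothetical.

```python
def default_max_heap_mb(system_memory_mb: int) -> int:
    """Approximate the automatic max heap size chosen by cassandra-env.sh."""
    half_capped = min(system_memory_mb // 2, 1024)     # 1/2 of RAM, at most 1 GB
    quarter_capped = min(system_memory_mb // 4, 8192)  # 1/4 of RAM, at most 8 GB
    return max(half_capped, quarter_capped)


for memory_mb in (2048, 4096, 16384, 65536):
    print(memory_mb, default_max_heap_mb(memory_mb))
```

Under this rule a 4GB node gets a 1GB heap, leaving the rest for off-heap structures and the page cache, while a 2GB node gets the same 1GB heap with far less headroom.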
Security Considerations
- Container security: Non-root users, minimal images
- Network isolation: Proper network policies
- Secrets management: Kubernetes secrets or Vault
- TLS everywhere: Client and internode encryption
- Regular updates: Keep base images current
The evolution from Vagrant to container-based development represents a significant improvement in developer productivity and operational consistency. While Vagrant remains useful for specific use cases, the modern approach emphasizes containers, orchestration, and cloud-native practices.
Cloudurable provides Cassandra training, Cassandra consulting, Cassandra support and helps setting up Cassandra clusters in AWS.
About Cloudurable™
Cloudurable™ streamlines DevOps and DBA work for the Cassandra Database running on AWS. It provides AMIs, CloudWatch monitoring, CloudFormation templates, and monitoring tools to support the Cassandra Database in production on Amazon AWS.
We also teach advanced Cassandra Database courses that cover how to develop for Cassandra, perform DBA tasks, and support and deploy Cassandra to production in AWS EC2.
More info
Please take some time to read the Advantage of using Cloudurable™ for Amazon Cassandra deployments.
Cloudurable provides:
- Subscription Cassandra Database support to streamline DevOps and DBA tasks (Support subscription pricing for the Cassandra Database and Kafka in AWS)
- Quickstart Mentoring Consulting for Developers and DevOps
- Architectural Analysis Consulting
- Training and mentoring for the Cassandra Database and Kafka
- We specialize in AWS Cassandra deployments for organizations that are setting up Cassandra as a Service.
Authors
Written by R. Hightower and JP Azar.
Feedback
We hope you enjoyed this article. Please provide [feedback](https://cloudurable.com/contact/index.html).