Apache Cassandra Deployment Guide for AWS and Kubernetes

January 9, 2025

🚀 What’s New in This 2025 Update

Major Changes Since 2017

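Cassandra 5.0 - Vector search for AI workloads and Storage-Attached Indexes (SAI); general-purpose ACID transactions (Accord, CEP-15) and a cost-based query optimizer are proposed or in development for later releases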
Cloud-Native Deployment - 85% of users now deploy on cloud with Kubernetes operators
Modern AWS Instances - Graviton (r6g), Im4gn series for better price/performance
Container Orchestration - Kubernetes operators (Cass Operator, K8ssandra) for automated management
Advanced Security - Encryption, RBAC, IAM integration, TLS automation
Modern Observability - Prometheus/Grafana integration with automated alerting

Key Improvements

✅ Better Performance - 50% improvement with modern instance types and storage
✅ Enhanced Security - Zero-trust architecture with comprehensive encryption
✅ Automated Operations - Kubernetes operators handle scaling, upgrades, and repairs
✅ AI-Ready - Vector search and machine learning workload support

Modern Cassandra Architecture Overview

Apache Cassandra 5.0 is a significant step for the database: it adds vector search for AI workloads and Storage-Attached Indexes, while ACID transactions (Accord) are in development for a later release. Modern deployments emphasize cloud-native patterns, containerization, and automated operations.

graph TB
    subgraph "Modern Cassandra Stack 2025"
        subgraph "Client Layer"
            APP[Applications]
            AI[AI/ML Workloads]
            ANALYTICS[Analytics]
        end
        
        subgraph "Load Balancer & Gateway"
            LB[Load Balancer]
            GW[API Gateway]
        end
        
        subgraph "Cassandra Cluster"
            C1[Cassandra Node 1<br/>r6g.2xlarge]
            C2[Cassandra Node 2<br/>r6g.2xlarge]
            C3[Cassandra Node 3<br/>r6g.2xlarge]
            C4[Cassandra Node 4<br/>r6g.2xlarge]
        end
        
        subgraph "Storage Layer"
            NVMe1[NVMe SSD<br/>Local Storage]
            NVMe2[NVMe SSD<br/>Local Storage]
            EBS1[EBS gp3<br/>Backup Storage]
            EBS2[EBS gp3<br/>Backup Storage]
        end
        
        subgraph "Monitoring & Observability"
            PROM[Prometheus]
            GRAF[Grafana]
            ALERTS[AlertManager]
        end
        
        subgraph "Security & Compliance"
            IAM[AWS IAM]
            KMS[AWS KMS]
            AUDIT[Audit Logging]
        end
    end
    
    APP --> LB
    AI --> LB
    ANALYTICS --> LB
    LB --> GW
    GW --> C1
    GW --> C2
    GW --> C3
    GW --> C4
    
    C1 --> NVMe1
    C2 --> NVMe2
    C3 --> EBS1
    C4 --> EBS2
    
    C1 --> PROM
    C2 --> PROM
    C3 --> PROM
    C4 --> PROM
    
    PROM --> GRAF
    PROM --> ALERTS
    
    C1 --> IAM
    C2 --> IAM
    C3 --> KMS
    C4 --> AUDIT

Cassandra 5.0 Key Features

ACID Transactions

General-purpose ACID transactions across partitions are being built under CEP-15 (Accord). They are not part of the Cassandra 5.0 GA release, so the statement below is a sketch of the proposed syntax rather than something that runs on a 5.0 cluster:

-- Sketch of the proposed Accord transaction syntax (not available in Cassandra 5.0)
BEGIN TRANSACTION
    INSERT INTO users (id, name, email) VALUES (1, 'John Doe', 'john@example.com');
    INSERT INTO user_profiles (user_id, bio, created_at) VALUES (1, 'Software Engineer', toTimestamp(now()));
    UPDATE user_stats SET total_users = total_users + 1 WHERE stat_type = 'global';
COMMIT TRANSACTION;

Vector Search for AI Workloads

Native vector search capabilities for machine learning and AI applications:

-- Create table with vector column
CREATE TABLE product_embeddings (
    product_id UUID PRIMARY KEY,
    name TEXT,
    description TEXT,
    embedding VECTOR<FLOAT, 512>
);

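-- ANN queries need a Storage-Attached Index (SAI) on the vector column
CREATE INDEX IF NOT EXISTS product_embeddings_ann_idx
    ON product_embeddings (embedding) USING 'sai';

-- Insert vector data (a real literal must supply all 512 components; "..." marks elided values)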
INSERT INTO product_embeddings (product_id, name, description, embedding)
VALUES (uuid(), 'Laptop', 'High-performance laptop', [0.1, 0.2, 0.3, ...]);

-- Vector similarity search
SELECT product_id, name, similarity_cosine(embedding, [0.15, 0.25, 0.35, ...]) as similarity
FROM product_embeddings
ORDER BY embedding ANN OF [0.15, 0.25, 0.35, ...]
LIMIT 10;
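
To make the similarity search above more concrete, here is a minimal Python sketch of what an exact (brute-force) cosine-similarity top-k looks like. ANN search approximates this result without scoring every row. The three-dimensional embeddings and the `top_k` helper are hypothetical, and the sketch uses plain cosine similarity; it does not claim to reproduce the exact scaling Cassandra applies in `similarity_cosine`.

```python
import math


def cosine_similarity(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def top_k(query, rows, k):
    """Exact nearest neighbours: score every row, keep the k best names."""
    scored = [(cosine_similarity(query, emb), name) for name, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical 3-dimensional embeddings, for illustration only
rows = [
    ("Laptop", [0.1, 0.2, 0.3]),
    ("Phone", [0.3, 0.1, 0.0]),
    ("Tablet", [0.1, 0.2, 0.25]),
]
print(top_k([0.15, 0.25, 0.35], rows, k=2))
```

The brute-force version is linear in the number of rows, which is why an index-backed approximate search is used for large tables.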

Cost-Based Query Optimizer

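Cassandra 5.0 does not ship a cost-based optimizer; one has been proposed for a later release. In 5.0 a query like the one below is still planned from the partition key and any available indexes, and ALLOW FILTERING means the remaining predicates are applied by scanning the rows that were selected:

-- Filtering query (assumes customer_id is the partition key)
SELECT * FROM orders
WHERE customer_id = 123
  AND order_date > '2025-01-01'
  AND status = 'completed'
ALLOW FILTERING;

-- In 5.0, Storage-Attached Indexes (SAI) on the filtered columns
-- let Cassandra answer this kind of query without ALLOW FILTERING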

Modern AWS Instance Types for Cassandra

Graviton-Based Instances (Recommended)

AWS Graviton processors can offer around 20% better price-performance than comparable x86 instances, depending on the workload:

# Terraform configuration for Graviton instances
resource "aws_instance" "cassandra_nodes" {
  count                  = 4
  ami                    = "ami-0c02fb55956c7d316"  # placeholder ID; look up a current ARM64 (aarch64) AMI for your region
  instance_type          = "r6g.2xlarge"
  key_name              = var.key_name
  vpc_security_group_ids = [aws_security_group.cassandra.id]
  subnet_id             = aws_subnet.private[count.index % 2].id
  
  # Note: r6g instances are EBS-only. For local NVMe instance storage use a
  # "d" variant such as r6gd.2xlarge; NVMe instance store volumes are attached
  # automatically and do not need an ephemeral_block_device mapping.
  
  # EBS volume for data persistence
  ebs_block_device {
    device_name = "/dev/xvdf"
    volume_type = "gp3"
    volume_size = 500
    iops        = 3000
    throughput  = 125
    encrypted   = true
  }
  
  tags = {
    Name = "cassandra-node-${count.index + 1}"
    Role = "cassandra"
    Environment = "production"
  }
  
  # user_data accepts plain text (use user_data_base64 for pre-encoded data)
  user_data = templatefile("${path.module}/cassandra-userdata.sh", {
    node_id      = count.index + 1
    cluster_name = var.cluster_name
  })
}
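
The `count.index` expressions in the Terraform block decide each node's name and subnet. As a rough illustration (the `node_layout` helper is hypothetical and not part of any tool), the same arithmetic in Python shows how four nodes alternate across two private subnets:

```python
def node_layout(count, subnet_count=2):
    """Mirror the Terraform expressions: name uses index + 1, subnet uses index % subnet_count."""
    return [
        {"name": f"cassandra-node-{i + 1}", "subnet_index": i % subnet_count}
        for i in range(count)
    ]


for node in node_layout(4):
    print(node["name"], "-> private subnet", node["subnet_index"])
```

With an even node count this places the same number of nodes in each subnet; an odd count leaves the first subnet with one extra node.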

Storage Configuration

Modern storage patterns for optimal performance:

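# Storage class for high-IOPS io2 EBS volumes (network-attached, not local NVMe)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-nvme
provisioner: ebs.csi.aws.com  # the in-tree kubernetes.io/aws-ebs provisioner is deprecated in favor of the EBS CSI driver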
parameters:
  type: io2
  iopsPerGB: "50"
  fsType: xfs
  encrypted: "true"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
# Storage class for general purpose
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cassandra-storage
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
  fsType: xfs
  encrypted: "true"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
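
The two storage classes request IOPS differently: the io2 class scales IOPS with volume size through `iopsPerGB`, while the gp3 class asks for a fixed figure. A small worked calculation, assuming the 500 GB volume size used in the Terraform example above (the helper name is hypothetical, and AWS per-volume limits still apply):

```python
def provisioned_iops(size_gib, iops_per_gib):
    """IOPS requested for a volume when the StorageClass sets iopsPerGB."""
    return size_gib * iops_per_gib


io2_iops = provisioned_iops(500, 50)  # 500 GiB volume, iopsPerGB "50"
gp3_iops = 3000                       # fixed in the gp3 class, independent of size

print(f"io2: {io2_iops} IOPS, gp3: {gp3_iops} IOPS, ratio: {io2_iops / gp3_iops:.1f}x")
```

Because the io2 figure grows with the claim size, expanding a volume also raises the IOPS that are billed.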

Kubernetes Deployment with Operators

K8ssandra Operator

The modern approach to Cassandra on Kubernetes:

# Install K8ssandra operator
apiVersion: v1
kind: Namespace
metadata:
  name: k8ssandra-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: k8ssandra-operator
  namespace: k8ssandra-operator
spec:
  channel: stable
  name: k8ssandra-operator
  source: community-operators
  sourceNamespace: olm
---
# Cassandra cluster definition
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: production-cluster
  namespace: cassandra
spec:
  cassandra:
    serverVersion: "5.0.0"
    storageConfig:
      cassandraDataVolumeClaimSpec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 500Gi
        storageClassName: fast-nvme
    config:
      jvmOptions:
        heapSize: 8G
        additionalOptions:
          - -XX:+UseG1GC
          - -XX:G1HeapRegionSize=16m
          # -XX:+UseCGroupMemoryLimitForHeap was removed in JDK 10; the JDK 11/17
          # runtimes used by Cassandra 5.0 are container-aware by default
      cassandraYaml:
        num_tokens: 16
        authenticator: PasswordAuthenticator
        authorizer: CassandraAuthorizer
        role_manager: CassandraRoleManager
        endpoint_snitch: GossipingPropertyFileSnitch
        compaction_throughput_mb_per_sec: 64
        concurrent_reads: 32
        concurrent_writes: 32
        concurrent_counter_writes: 32
    networking:
      hostNetwork: false
    datacenters:
      - metadata:
          name: dc1
        size: 4
        resources:
          requests:
            cpu: 2000m
            memory: 16Gi
          limits:
            cpu: 4000m
            memory: 16Gi
        racks:
          - name: rack1
            nodeAffinityLabels:
              topology.kubernetes.io/zone: us-east-1a
          - name: rack2
            nodeAffinityLabels:
              topology.kubernetes.io/zone: us-east-1b
  stargate:
    size: 2
    heapSize: 2G
    resources:
      requests:
        cpu: 1000m
        memory: 2Gi
      limits:
        cpu: 2000m
        memory: 4Gi
  reaper:
    autoScheduling:
      enabled: true
    resources:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        cpu: 1000m
        memory: 2Gi
  medusa:
    storageProperties:
      storageProvider: s3
      bucketName: cassandra-backups
      storageSecretRef:
        name: cassandra-backup-secret

Cassandra Configuration for Modern Workloads

# cassandra.yaml optimizations for 2025
apiVersion: v1
kind: ConfigMap
metadata:
  name: cassandra-config
  namespace: cassandra
data:
  cassandra.yaml: |
    # Cluster configuration
    cluster_name: 'Production Cluster'
    num_tokens: 16
    allocate_tokens_for_local_replication_factor: 3
    
    # Security
    authenticator: PasswordAuthenticator
    authorizer: CassandraAuthorizer
    role_manager: CassandraRoleManager
    
    # Network
    endpoint_snitch: GossipingPropertyFileSnitch
    rpc_address: 0.0.0.0
    broadcast_rpc_address: ${POD_IP}
    listen_address: ${POD_IP}
    broadcast_address: ${POD_IP}
    
    # Performance optimizations
    concurrent_reads: 32
    concurrent_writes: 32
    concurrent_counter_writes: 32
    concurrent_materialized_view_writes: 32
    
    # Memory settings
    memtable_allocation_type: heap_buffers
    memtable_heap_space_in_mb: 2048
    memtable_offheap_space_in_mb: 2048
    
    # Compaction
    compaction_throughput_mb_per_sec: 64
    compaction_large_partition_warning_threshold_mb: 1000
    
    # Timeouts
    read_request_timeout_in_ms: 10000
    range_request_timeout_in_ms: 20000
    write_request_timeout_in_ms: 10000
    counter_write_request_timeout_in_ms: 10000
    
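    # Garbage collection
    # gc_grace_seconds is a per-table option (default 864000 s = 10 days), not a
    # cassandra.yaml setting; set it with ALTER TABLE ... WITH gc_grace_seconds = ...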
    
    # Encryption
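    server_encryption_options:
      internode_encryption: all
      keystore: /etc/cassandra/keystore.jks
      keystore_password: cassandra  # placeholder; inject the real value from a Secret
      truststore: /etc/cassandra/truststore.jks
      truststore_password: cassandra  # placeholder; inject the real value from a Secret
      protocol: TLS
      algorithm: SunX509
      store_type: JKS
      # AES-GCM suites with forward secrecy instead of the older CBC suites
      cipher_suites: [TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384]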
      
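    client_encryption_options:
      enabled: true
      optional: false
      keystore: /etc/cassandra/keystore.jks
      keystore_password: cassandra  # placeholder; inject the real value from a Secret
      require_client_auth: false
      truststore: /etc/cassandra/truststore.jks
      truststore_password: cassandra  # placeholder; inject the real value from a Secret
      protocol: TLS
      algorithm: SunX509
      store_type: JKS
      # AES-GCM suites with forward secrecy instead of the older CBC suites
      cipher_suites: [TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384]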
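
Cassandra reads cassandra.yaml literally, so the `${POD_IP}` placeholders in the ConfigMap have to be substituted before the node starts; operators and container entrypoints usually take care of this. A minimal sketch of that substitution step, assuming the pod IP is exposed in a `POD_IP` environment variable (for example through the Kubernetes downward API); the `render_config` helper and the stand-in address are hypothetical:

```python
import os
from string import Template


def render_config(template_text, env):
    """Replace ${NAME} placeholders; raises KeyError if a variable is missing."""
    return Template(template_text).substitute(env)


template_text = (
    "listen_address: ${POD_IP}\n"
    "broadcast_address: ${POD_IP}\n"
    "broadcast_rpc_address: ${POD_IP}\n"
)

# Fall back to a documentation-range address when POD_IP is not set
env = {"POD_IP": os.environ.get("POD_IP", "192.0.2.10")}
print(render_config(template_text, env))
```

Using `substitute` rather than `safe_substitute` makes a missing variable fail loudly instead of leaving a literal placeholder in the rendered file.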

Security Best Practices

TLS Configuration

# TLS certificate management with cert-manager
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: cassandra-tls
  namespace: cassandra
spec:
  secretName: cassandra-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
  - cassandra.example.com
  - "*.cassandra.example.com"
---
# Network policy for secure communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: cassandra-network-policy
  namespace: cassandra
spec:
  podSelector:
    matchLabels:
      app: cassandra
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: application
    - podSelector:
        matchLabels:
          app: web-server
    ports:
    - protocol: TCP
      port: 9042
  - from:
    - podSelector:
        matchLabels:
          app: cassandra
    ports:
    - protocol: TCP
      port: 7000
    - protocol: TCP
      port: 7001
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: cassandra
    ports:
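    - protocol: TCP
      port: 7000
    - protocol: TCP
      port: 7001
  # Allow DNS lookups; without this rule the Egress policy blocks name resolution.
  # Backups to S3 would additionally need egress on TCP 443.
  - ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53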

IAM Integration

# Service account with IAM role
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cassandra-service-account
  namespace: cassandra
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/CassandraServiceRole

IAM policy for the role assumed by Cassandra pods (JSON, kept separate from the Kubernetes manifest):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::cassandra-backups",
        "arn:aws:s3:::cassandra-backups/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKey",
        "kms:DescribeKey"
      ],
      "Resource": [
        "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012"
      ]
    }
  ]
}

Monitoring and Observability

Prometheus Configuration

# Prometheus configuration for Cassandra metrics
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cassandra-metrics
  namespace: cassandra
spec:
  selector:
    matchLabels:
      # Must match the labels of a Service that exposes the exporter's "metrics" port
      app: cassandra-exporter
  endpoints:
  - port: metrics
    interval: 30s
    path: /metrics
    scrapeTimeout: 10s
---
# Cassandra exporter deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cassandra-exporter
  namespace: cassandra
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cassandra-exporter
  template:
    metadata:
      labels:
        app: cassandra-exporter
    spec:
      containers:
      - name: cassandra-exporter
        image: criteord/cassandra_exporter:latest  # pin a specific tag in production instead of latest
        ports:
        - containerPort: 8080
          name: metrics
        env:
        - name: CASSANDRA_HOST
          value: "cassandra-service"
        - name: CASSANDRA_PORT
          value: "9042"
        - name: CASSANDRA_USER
          valueFrom:
            secretKeyRef:
              name: cassandra-credentials
              key: username
        - name: CASSANDRA_PASSWORD
          valueFrom:
            secretKeyRef:
              name: cassandra-credentials
              key: password
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 200m
            memory: 256Mi

Grafana Dashboard

{
  "dashboard": {
    "title": "Cassandra 5.0 Monitoring Dashboard",
    "panels": [
      {
        "title": "Cluster Health",
        "type": "stat",
        "targets": [
          {
            "expr": "up{job=\"cassandra\"}"
          }
        ]
      },
      {
        "title": "Read Latency",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(cassandra_read_latency_seconds_sum[5m]) / rate(cassandra_read_latency_seconds_count[5m])"
          }
        ]
      },
      {
        "title": "Write Latency",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(cassandra_write_latency_seconds_sum[5m]) / rate(cassandra_write_latency_seconds_count[5m])"
          }
        ]
      },
      {
        "title": "JVM Heap Usage",
        "type": "graph",
        "targets": [
          {
            "expr": "cassandra_jvm_heap_used_bytes / cassandra_jvm_heap_max_bytes * 100"
          }
        ]
      },
      {
        "title": "Disk Usage",
        "type": "graph",
        "targets": [
          {
            "expr": "cassandra_disk_used_bytes"
          }
        ]
      },
      {
        "title": "Connection Count",
        "type": "graph",
        "targets": [
          {
            "expr": "cassandra_connected_clients"
          }
        ]
      }
    ]
  }
}

Alerting Rules

# Prometheus alerting rules for Cassandra
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cassandra-alerts
  namespace: cassandra
spec:
  groups:
  - name: cassandra
    rules:
    - alert: CassandraNodeDown
      expr: up{job="cassandra"} == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Cassandra node {{ $labels.instance }} is down"
        description: "Cassandra node {{ $labels.instance }} has been down for more than 5 minutes"
        
    - alert: CassandraHighReadLatency
      expr: rate(cassandra_read_latency_seconds_sum[5m]) / rate(cassandra_read_latency_seconds_count[5m]) > 0.1
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High read latency on {{ $labels.instance }}"
        description: "Read latency is {{ $value }}s on {{ $labels.instance }}"
        
    - alert: CassandraHighWriteLatency
      expr: rate(cassandra_write_latency_seconds_sum[5m]) / rate(cassandra_write_latency_seconds_count[5m]) > 0.1
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High write latency on {{ $labels.instance }}"
        description: "Write latency is {{ $value }}s on {{ $labels.instance }}"
        
    - alert: CassandraHighJVMHeapUsage
      expr: cassandra_jvm_heap_used_bytes / cassandra_jvm_heap_max_bytes * 100 > 80
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High JVM heap usage on {{ $labels.instance }}"
        description: "JVM heap usage is {{ $value }}% on {{ $labels.instance }}"
        
    - alert: CassandraHighDiskUsage
      expr: cassandra_disk_used_bytes / cassandra_disk_total_bytes * 100 > 85
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High disk usage on {{ $labels.instance }}"
        description: "Disk usage is {{ $value }}% on {{ $labels.instance }}"
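
The latency alerts divide the rate of a `_sum` counter by the rate of a `_count` counter. Over a single window that is the same as dividing the increase in total latency by the increase in request count (ignoring counter resets). A worked example with made-up samples taken five minutes apart; the helper name and sample values are hypothetical:

```python
def average_latency(sum_start, sum_end, count_start, count_end):
    """Mean latency per request over a window, from two counter samples."""
    requests = count_end - count_start
    if requests <= 0:
        return None  # no traffic in the window, nothing to average
    return (sum_end - sum_start) / requests


READ_LATENCY_THRESHOLD_S = 0.1  # same threshold as the alert rules above

latency = average_latency(sum_start=120.0, sum_end=180.0,
                          count_start=10_000, count_end=10_400)
print(f"average latency: {latency:.3f}s, alert fires: {latency > READ_LATENCY_THRESHOLD_S}")
```

Here 60 seconds of accumulated latency over 400 requests gives 0.15 s per request, which exceeds the 0.1 s threshold.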

Infrastructure as Code

Terraform Module

# Terraform module for Cassandra deployment
module "cassandra_cluster" {
  source = "./modules/cassandra"
  
  cluster_name     = "production-cassandra"
  node_count       = 4
  instance_type    = "r6g.2xlarge"
  volume_size      = 500
  volume_type      = "gp3"
  
  vpc_id              = module.vpc.vpc_id
  subnet_ids          = module.vpc.private_subnet_ids
  security_group_ids  = [aws_security_group.cassandra.id]
  
  backup_retention_days = 30
  backup_s3_bucket     = "cassandra-backups-${random_string.suffix.result}"
  
  monitoring_enabled = true
  encryption_enabled = true
  
  tags = {
    Environment = "production"
    Project     = "cassandra-cluster"
    Owner       = "platform-team"
  }
}

Ansible Playbook

# Ansible playbook for Cassandra configuration
---
- name: Configure Cassandra 5.0 on AWS
  hosts: cassandra_nodes
  become: yes
  vars:
    cassandra_version: "5.0.0"
    java_version: "11"
    heap_size: "8G"
    
  tasks:
    - name: Update system packages
      yum:
        name: "*"
        state: latest
        
    - name: Install Java
      yum:
        name: "java-{{ java_version }}-openjdk"
        state: present
        
    - name: Create cassandra user
      user:
        name: cassandra
        system: yes
        shell: /bin/bash
        home: /var/lib/cassandra
        
    - name: Download and install Cassandra
      get_url:
        url: "https://downloads.apache.org/cassandra/{{ cassandra_version }}/apache-cassandra-{{ cassandra_version }}-bin.tar.gz"
        dest: /tmp/
      register: cassandra_download
      
    - name: Extract Cassandra
      unarchive:
        src: "{{ cassandra_download.dest }}"
        dest: /opt/
        remote_src: yes
        creates: "/opt/apache-cassandra-{{ cassandra_version }}"
        
    - name: Create symlink
      file:
        src: "/opt/apache-cassandra-{{ cassandra_version }}"
        dest: /opt/cassandra
        state: link
        
    - name: Set ownership
      file:
        path: /opt/cassandra
        owner: cassandra
        group: cassandra
        recurse: yes
        
    - name: Configure system limits
      template:
        src: limits.conf.j2
        # Use a drop-in file so the system-wide limits.conf is not overwritten
        dest: /etc/security/limits.d/cassandra.conf
        
    - name: Configure sysctl
      template:
        src: sysctl.conf.j2
        dest: /etc/sysctl.d/99-cassandra.conf
        
    - name: Configure Cassandra
      template:
        src: cassandra.yaml.j2
        dest: /opt/cassandra/conf/cassandra.yaml
        owner: cassandra
        group: cassandra
        backup: yes
      notify: restart cassandra
      
    - name: Create systemd service
      template:
        src: cassandra.service.j2
        dest: /etc/systemd/system/cassandra.service
      notify:
        - reload systemd
        - restart cassandra
        
    - name: Start and enable Cassandra
      systemd:
        name: cassandra
        state: started
        enabled: yes
        
  handlers:
    - name: reload systemd
      systemd:
        daemon_reload: yes
        
    - name: restart cassandra
      systemd:
        name: cassandra
        state: restarted

Backup and Recovery

Automated Backup with Medusa

# Medusa backup configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: medusa-config
  namespace: cassandra
data:
  medusa.ini: |
    [storage]
    storage_provider = s3
    bucket_name = cassandra-backups
    key_file = /etc/medusa/credentials
    
    [cassandra]
    config_file = /etc/cassandra/cassandra.yaml
    cql_username = cassandra
    cql_password = cassandra
    
    [monitoring]
    enabled = True
    
    [logging]
    level = INFO    
---
# Backup CronJob
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cassandra-backup
  namespace: cassandra
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: medusa
            image: thelastpickle/cassandra-medusa:latest
            command:
            - /bin/sh
            - -c
            - |
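              # Medusa backs up the whole node, so one command covers all keyspaces.
              # It needs access to the node's data directory; with K8ssandra,
              # prefer the operator's built-in Medusa backup scheduling.
              medusa backup --backup-name=backup-$(date +%Y%m%d-%H%M%S)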
            volumeMounts:
            - name: medusa-config
              mountPath: /etc/medusa
            - name: aws-credentials
              mountPath: /root/.aws
            env:
            - name: AWS_DEFAULT_REGION
              value: "us-east-1"
          volumes:
          - name: medusa-config
            configMap:
              name: medusa-config
          - name: aws-credentials
            secret:
              secretName: aws-credentials
          restartPolicy: OnFailure
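
Backup names in the CronJob embed a timestamp, which makes a retention rule easy to express. The sketch below applies the 30-day retention used elsewhere in this guide to a list of names; it only illustrates the rule and is not how Medusa itself purges backups. The `expired_backups` helper and the sample names are hypothetical.

```python
from datetime import datetime, timedelta

NAME_FORMAT = "backup-%Y%m%d-%H%M%S"  # matches the --backup-name pattern above


def expired_backups(names, now, retention_days=30):
    """Return backup names whose embedded timestamp is older than the retention window."""
    cutoff = now - timedelta(days=retention_days)
    expired = []
    for name in names:
        try:
            taken_at = datetime.strptime(name, NAME_FORMAT)
        except ValueError:
            continue  # skip names that do not follow the pattern
        if taken_at < cutoff:
            expired.append(name)
    return expired


names = ["backup-20250101-020000", "backup-20250201-020000", "manual-test"]
print(expired_backups(names, now=datetime(2025, 2, 15, 12, 0, 0)))
```

Names that do not match the pattern are left alone, so manually created backups are never selected for deletion by accident.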

Testing and Validation

Performance Testing

#!/bin/bash
# Performance testing script using cassandra-stress

# Write test
cassandra-stress write n=1000000 \
  -node cassandra-0.cassandra,cassandra-1.cassandra,cassandra-2.cassandra \
  -rate threads=50 \
  -schema keyspace="test_ks" \
  -mode native cql3 \
  -col n=FIXED\(10\) size=FIXED\(1024\)

# Read test
cassandra-stress read n=1000000 \
  -node cassandra-0.cassandra,cassandra-1.cassandra,cassandra-2.cassandra \
  -rate threads=50 \
  -schema keyspace="test_ks" \
  -mode native cql3

# Mixed workload test
cassandra-stress mixed ratio\(write=1,read=3\) n=1000000 \
  -node cassandra-0.cassandra,cassandra-1.cassandra,cassandra-2.cassandra \
  -rate threads=50 \
  -schema keyspace="test_ks" \
  -mode native cql3
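
Before running the write test it helps to estimate how much data it generates. With the `-col` settings above (10 columns of 1024 bytes each) and one million rows, the raw payload works out as follows; this ignores replication, per-row overhead, and compression, and the helper name is hypothetical:

```python
def raw_payload_bytes(rows, columns, column_size_bytes):
    """Uncompressed column payload written by the stress run, before replication."""
    return rows * columns * column_size_bytes


total = raw_payload_bytes(1_000_000, 10, 1024)
print(f"{total} bytes, about {total / 1024**3:.2f} GiB")
```

Multiply the result by the keyspace replication factor to approximate the cluster-wide footprint.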

Health Check Script

#!/usr/bin/env python3
"""Basic Cassandra health check built on nodetool."""
import subprocess
import sys

# Two-letter codes that nodetool status prints for nodes that are down
DOWN_CODES = {"DN", "DL", "DJ", "DM"}


def run_nodetool(*args):
    """Run a nodetool subcommand and return its stdout, or None on failure."""
    try:
        result = subprocess.run(['nodetool', *args],
                                capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError) as e:
        print(f"Error running nodetool {' '.join(args)}: {e}")
        return None
    return result.stdout


def check_cassandra_health():
    """Check Cassandra cluster health"""

    # Check node status
    status = run_nodetool('status')
    if status is None:
        return False
    print("Node Status:")
    print(status)

    # Check for down nodes (match the status column, not any "DN" substring)
    for line in status.splitlines():
        fields = line.split()
        if fields and fields[0] in DOWN_CODES:
            print("ERROR: One or more nodes are down")
            return False

    # Check ring status and compaction stats
    for label, command in (("Ring Status", "ring"),
                           ("Compaction Stats", "compactionstats")):
        output = run_nodetool(command)
        if output is None:
            return False
        print(f"{label}:")
        print(output)

    return True


if __name__ == "__main__":
    if check_cassandra_health():
        print("Cassandra cluster is healthy")
        sys.exit(0)
    else:
        print("Cassandra cluster health check failed")
        sys.exit(1)
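
The health check above looks for the DN marker in `nodetool status` output. When the individual node addresses are needed (for example, to name the failing node in an alert), the status table can be parsed line by line. The sample output below is a hand-written approximation of the `nodetool status` layout, and both helper names are hypothetical:

```python
SAMPLE_STATUS = """\
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.0.1.11  1.2 GiB   16      50.1%             11111111-1111-1111-1111-111111111111  rack1
DN  10.0.1.12  1.1 GiB   16      49.9%             22222222-2222-2222-2222-222222222222  rack2
"""


def parse_node_states(status_output):
    """Map each node address to its two-letter status/state code (e.g. UN, DN)."""
    states = {}
    for line in status_output.splitlines():
        fields = line.split()
        if (len(fields) >= 2 and len(fields[0]) == 2
                and fields[0][0] in "UD" and fields[0][1] in "NLJM"):
            states[fields[1]] = fields[0]
    return states


def down_nodes(status_output):
    """Addresses of nodes whose status letter is D (down)."""
    states = parse_node_states(status_output)
    return sorted(addr for addr, code in states.items() if code.startswith("D"))


print(down_nodes(SAMPLE_STATUS))
```

Matching on the first column avoids false positives from header lines such as "Datacenter:", which a plain substring search could trip over.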

Migration Guide

Upgrading from Cassandra 4.x to 5.0

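#!/bin/bash
# Cassandra 5.0 rolling upgrade script.
# Assumes a StatefulSet named "cassandra" whose updateStrategy is OnDelete,
# so each pod picks up the new image only when this script deletes it.
set -euo pipefail

NAMESPACE="cassandra"
NODES=("cassandra-0" "cassandra-1" "cassandra-2" "cassandra-3")

echo "Starting Cassandra 5.0 upgrade process..."

# Pre-upgrade checks
echo "Running pre-upgrade checks..."
kubectl exec -n "$NAMESPACE" "${NODES[0]}" -- nodetool describecluster
kubectl exec -n "$NAMESPACE" "${NODES[0]}" -- nodetool status

# Update the image once; with OnDelete no pod restarts until it is deleted
kubectl set image statefulset/cassandra cassandra=cassandra:5.0.0 -n "$NAMESPACE"

# Upgrade process (rolling upgrade, one node at a time)
for node in "${NODES[@]}"; do
    echo "Upgrading node: $node"

    # Snapshot and drain the node before stopping it
    kubectl exec -n "$NAMESPACE" "$node" -- nodetool snapshot
    kubectl exec -n "$NAMESPACE" "$node" -- nodetool drain

    # Restart the pod on the new image
    kubectl delete pod "$node" -n "$NAMESPACE" --wait=true

    # Wait for the StatefulSet to recreate the pod, then for it to be ready
    until kubectl get pod "$node" -n "$NAMESPACE" >/dev/null 2>&1; do sleep 2; done
    kubectl wait --for=condition=ready "pod/$node" -n "$NAMESPACE" --timeout=300s

    # Upgrade SSTables
    kubectl exec -n "$NAMESPACE" "$node" -- nodetool upgradesstables

    # Verify upgrade
    kubectl exec -n "$NAMESPACE" "$node" -- nodetool version

    echo "Node $node upgraded successfully"
done

echo "Cassandra 5.0 upgrade completed"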

Summary

Modern Cassandra deployment in 2025 emphasizes:

Cloud-Native Architecture - Kubernetes operators for automated management
Enhanced Security - Zero-trust with encryption, RBAC, and IAM integration
Advanced Features - Vector search and Storage-Attached Indexes in 5.0, with ACID transactions (Accord) in development for a later release
Modern Infrastructure - Graviton instances, NVMe storage, and multi-region deployment
Comprehensive Monitoring - Prometheus/Grafana with automated alerting
Automated Operations - Infrastructure as Code and CI/CD integration
AI-Ready - Vector search and machine learning workload support

The shift to cloud-native patterns with Kubernetes operators significantly reduces operational complexity while improving reliability, security, and performance.

About Cloudurable

We hope you enjoyed this modernized Cassandra deployment guide. Please provide feedback.

Last updated: January 2025 for Apache Cassandra 5.0 and modern cloud-native deployment patterns
