AWS Cassandra Cluster Tutorial 5 (2025): Modern Cassandra Deployment with CDK, EKS, and Infrastructure as Code

January 9, 2025

This Cassandra tutorial is designed for developers and DevOps/SRE teams who want to deploy production-ready Cassandra clusters in AWS using modern practices and tools available in 2025.

What’s New in 2025

The landscape of deploying Cassandra on AWS has evolved significantly:

  1. AWS CDK v2 has become the standard for infrastructure as code, offering type-safe infrastructure definitions
  2. Kubernetes operators like K8ssandra provide production-ready Cassandra deployments
  3. AWS Graviton3 processors offer up to 40% better price-performance than comparable x86 instances for Cassandra workloads
  4. Container-based deployments are now the norm, with EKS Anywhere for hybrid deployments
  5. Service mesh integration with AWS App Mesh provides advanced traffic management
  6. AWS Systems Manager replaces bastion hosts for secure access
  7. GitOps workflows with AWS CodeCommit and FluxCD for infrastructure management

Cloudurable provides Cassandra training, Cassandra consulting, and Cassandra support, and helps set up Cassandra clusters in AWS.

Overview

This article covers modern approaches to deploying Cassandra on AWS:

  • AWS CDK v2 for infrastructure as code
  • EKS deployment with K8ssandra operator
  • EC2 deployment with modern automation
  • AWS Systems Manager for secure access (replacing bastion hosts)
  • GitOps workflows for continuous deployment
  • Cost optimization with Graviton3 and Spot instances
  • Multi-region deployment patterns
  • Observability with AWS Distro for OpenTelemetry

Prerequisites

# Install AWS CDK v2
npm install -g aws-cdk@latest

# Install kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

# Install eksctl
curl --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin

# Install Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
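
Before moving on, it is worth confirming that each tool is installed and on your PATH; the version flags below are the standard ones for each CLI:

# Verify the toolchain
cdk --version
kubectl version --client
eksctl version
helm version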

Modern Infrastructure with AWS CDK v2

Let’s create a modern VPC setup using CDK v2 instead of hand-written CloudFormation templates:

Initialize CDK Project

mkdir cassandra-cluster-cdk && cd cassandra-cluster-cdk
cdk init app --language typescript
npm install aws-cdk-lib constructs

CDK Stack for Cassandra Infrastructure

// lib/cassandra-cluster-stack.ts
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as eks from 'aws-cdk-lib/aws-eks';
import * as iam from 'aws-cdk-lib/aws-iam';
import { Construct } from 'constructs';

export class CassandraClusterStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Create VPC with modern best practices
    const vpc = new ec2.Vpc(this, 'CassandraVPC', {
      maxAzs: 3,
      ipAddresses: ec2.IpAddresses.cidr('10.0.0.0/16'),
      natGateways: 1, // Cost optimization
      subnetConfiguration: [
        {
          cidrMask: 24,
          name: 'Public',
          subnetType: ec2.SubnetType.PUBLIC,
        },
        {
          cidrMask: 24,
          name: 'Private',
          subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS,
        },
        {
          cidrMask: 24,
          name: 'Isolated',
          subnetType: ec2.SubnetType.PRIVATE_ISOLATED,
        }
      ],
      enableDnsHostnames: true,
      enableDnsSupport: true,
    });

    // Add VPC Flow Logs for security
    vpc.addFlowLog('VPCFlowLog', {
      destination: ec2.FlowLogDestination.toCloudWatchLogs(),
      trafficType: ec2.FlowLogTrafficType.ALL,
    });

    // Create EKS Cluster for Kubernetes deployment
    const cluster = new eks.Cluster(this, 'CassandraEKSCluster', {
      vpc,
      version: eks.KubernetesVersion.V1_28,
      defaultCapacity: 0, // We'll add our own node groups
      clusterName: 'cassandra-cluster',
      mastersRole: new iam.Role(this, 'MastersRole', {
        assumedBy: new iam.AccountRootPrincipal(),
      }),
    });

    // Add Graviton3-based node group for better price-performance
    const nodeGroup = cluster.addNodegroupCapacity('CassandraNodes', {
      instanceTypes: [
        new ec2.InstanceType('m7g.2xlarge'), // Graviton3
        new ec2.InstanceType('r7g.2xlarge'), // Graviton3 memory-optimized
      ],
      minSize: 3,
      maxSize: 9,
      desiredSize: 3,
      diskSize: 100,
      subnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
      taints: [
        {
          key: 'workload',
          value: 'cassandra',
          effect: eks.TaintEffect.NO_SCHEDULE,
        },
      ],
    });

    // Add Spot instances for cost savings (non-seed nodes)
    const spotNodeGroup = cluster.addNodegroupCapacity('CassandraSpotNodes', {
      instanceTypes: [
        new ec2.InstanceType('m6i.2xlarge'),
        new ec2.InstanceType('m5.2xlarge'),
      ],
      minSize: 0,
      maxSize: 6,
      desiredSize: 0,
      capacityType: eks.CapacityType.SPOT,
      subnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
    });

    // Security group for Cassandra nodes
    const cassandraSG = new ec2.SecurityGroup(this, 'CassandraSG', {
      vpc,
      description: 'Security group for Cassandra nodes',
      allowAllOutbound: true,
    });

    // Cassandra ports
    cassandraSG.addIngressRule(
      ec2.Peer.ipv4(vpc.vpcCidrBlock),
      ec2.Port.tcp(9042), // CQL
      'CQL port'
    );
    cassandraSG.addIngressRule(
      ec2.Peer.ipv4(vpc.vpcCidrBlock),
      ec2.Port.tcp(7000), // Inter-node
      'Inter-node communication'
    );
    cassandraSG.addIngressRule(
      ec2.Peer.ipv4(vpc.vpcCidrBlock),
      ec2.Port.tcp(7001), // SSL inter-node
      'SSL inter-node communication'
    );

    // Enable AWS Systems Manager for secure access (no bastion needed)
    const ssmRole = new iam.Role(this, 'SSMRole', {
      assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
      managedPolicies: [
        iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonSSMManagedInstanceCore'),
      ],
    });

    // Output important values
    new cdk.CfnOutput(this, 'VPCId', { value: vpc.vpcId });
    new cdk.CfnOutput(this, 'ClusterName', { value: cluster.clusterName });
    new cdk.CfnOutput(this, 'ClusterEndpoint', { value: cluster.clusterEndpoint });
  }
}
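
With the stack defined, a typical workflow is to bootstrap the target environment once and then deploy. The account ID and region below are placeholders, and the stack name assumes the class above is instantiated with the id CassandraClusterStack in your CDK app:

# One-time per account/region
cdk bootstrap aws://123456789012/us-east-1

# Preview the changes, then deploy
cdk diff
cdk deploy CassandraClusterStack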

Deploy K8ssandra on EKS

K8ssandra provides a production-ready Kubernetes operator for Cassandra:

Install K8ssandra Operator

# Add K8ssandra Helm repository
helm repo add k8ssandra https://helm.k8ssandra.io/stable
helm repo update

# Install cert-manager (required by K8ssandra)
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml

# Install K8ssandra operator
helm install k8ssandra-operator k8ssandra/k8ssandra-operator -n k8ssandra-operator --create-namespace
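
Before creating a cluster, it helps to confirm the operator came up cleanly; a quick sketch (exact pod names vary by chart version):

# Check that the operator pods are running
kubectl get pods -n k8ssandra-operator

# Confirm the K8ssandra CRDs were registered
kubectl get crds | grep k8ssandra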

Deploy Cassandra Cluster with K8ssandra

# k8ssandra-cluster.yaml
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: cassandra-prod
  namespace: cassandra
spec:
  cassandra:
    serverVersion: "4.1.3"
    datacenters:
      - metadata:
          name: dc1
        size: 3
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: gp3
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 500Gi
        config:
          jvmOptions:
            heapSize: 16Gi
            heapNewGenSize: 4Gi
            gc: G1GC
        resources:
          requests:
            cpu: 4
            memory: 32Gi
          limits:
            cpu: 8
            memory: 32Gi
        racks:
          - name: rack1
            nodeAffinityLabels:
              topology.kubernetes.io/zone: us-east-1a
          - name: rack2
            nodeAffinityLabels:
              topology.kubernetes.io/zone: us-east-1b
          - name: rack3
            nodeAffinityLabels:
              topology.kubernetes.io/zone: us-east-1c
        tolerations:
          - key: workload
            value: cassandra
            effect: NoSchedule
    telemetry:
      prometheus:
        enabled: true
      mcac:
        enabled: true
  reaper:
    enabled: true
    autoScheduling:
      enabled: true
  medusa:
    enabled: true
    storageProperties:
      storageProvider: s3
      bucketName: cassandra-backups-2025
      region: us-east-1
      storageSecretRef:
        name: medusa-bucket-creds
  stargate:
    enabled: true
    size: 2
    heapSize: 2Gi
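
To apply this manifest you also need the cassandra namespace and the medusa-bucket-creds secret it references. The sketch below assumes Medusa reads an AWS-credentials-style file; check the K8ssandra docs for the exact secret format your version expects:

kubectl create namespace cassandra

# Hypothetical local credentials file for the backup bucket
kubectl create secret generic medusa-bucket-creds \
  -n cassandra --from-file=credentials=./medusa-s3-credentials

kubectl apply -f k8ssandra-cluster.yaml

# Watch the datacenter come up
kubectl get k8ssandracluster,pods -n cassandra -w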

Modern EC2 Deployment with User Data and Systems Manager

For teams preferring EC2 deployment, here’s a modern approach:

CDK for EC2-based Cassandra

// lib/cassandra-ec2-stack.ts
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as iam from 'aws-cdk-lib/aws-iam';
import * as autoscaling from 'aws-cdk-lib/aws-autoscaling';
import * as elbv2 from 'aws-cdk-lib/aws-elasticloadbalancingv2';
import { Construct } from 'constructs';

export interface CassandraEC2Props extends cdk.StackProps {
  vpc: ec2.IVpc;
}

export class CassandraEC2Stack extends cdk.Stack {
  constructor(scope: Construct, id: string, props: CassandraEC2Props) {
    super(scope, id, props);

    const { vpc } = props;

    // Create launch template for Cassandra nodes
    const launchTemplate = new ec2.LaunchTemplate(this, 'CassandraLaunchTemplate', {
      instanceType: new ec2.InstanceType('m7g.2xlarge'), // Graviton3
      machineImage: new ec2.AmazonLinuxImage({
        generation: ec2.AmazonLinuxGeneration.AMAZON_LINUX_2023,
        cpuType: ec2.AmazonLinuxCpuType.ARM_64,
      }),
      role: new iam.Role(this, 'CassandraRole', {
        assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
        managedPolicies: [
          iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonSSMManagedInstanceCore'),
          iam.ManagedPolicy.fromAwsManagedPolicyName('CloudWatchAgentServerPolicy'),
        ],
        inlinePolicies: {
          CassandraPolicy: new iam.PolicyDocument({
            statements: [
              new iam.PolicyStatement({
                actions: ['ec2:DescribeInstances', 'ec2:DescribeTags'],
                resources: ['*'],
              }),
              new iam.PolicyStatement({
                actions: ['s3:GetObject', 's3:PutObject'],
                resources: ['arn:aws:s3:::cassandra-config-2025/*'],
              }),
            ],
          }),
        },
      }),
      userData: ec2.UserData.custom(`#!/bin/bash
set -e

# Install dependencies
dnf update -y
dnf install -y docker amazon-cloudwatch-agent

# Install Cassandra using Docker
systemctl start docker
systemctl enable docker

# Pull and run Cassandra container
docker pull cassandra:4.1.3

# Get instance metadata
INSTANCE_ID=$(ec2-metadata --instance-id | cut -d " " -f 2)
AZ=$(ec2-metadata --availability-zone | cut -d " " -f 2)
PRIVATE_IP=$(ec2-metadata --local-ipv4 | cut -d " " -f 2)
REGION=$(echo "$AZ" | sed 's/[a-z]$//')
export AWS_DEFAULT_REGION="$REGION"

# Configure Cassandra based on tags and metadata
aws ec2 describe-tags --filters "Name=resource-id,Values=$INSTANCE_ID" \
  --query "Tags[?Key=='cassandra:seed'].Value" --output text > /tmp/is_seed

IS_SEED=$(cat /tmp/is_seed)
if [ "$IS_SEED" = "true" ]; then
  SEEDS="$PRIVATE_IP"
else
  # Discover seed nodes
  SEEDS=$(aws ec2 describe-instances \
    --filters "Name=tag:cassandra:seed,Values=true" \
              "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].PrivateIpAddress' \
    --output text | tr '\t' ',')
fi

# Run Cassandra with proper configuration
docker run -d \
  --name cassandra \
  --restart always \
  --network host \
  -e CASSANDRA_CLUSTER_NAME=production \
  -e CASSANDRA_SEEDS="$SEEDS" \
  -e CASSANDRA_ENDPOINT_SNITCH=Ec2Snitch \
  -e CASSANDRA_DC=us-east \
  -e CASSANDRA_RACK="$AZ" \
  -e JVM_OPTS="-Xms16G -Xmx16G" \
  -v /data/cassandra:/var/lib/cassandra \
  cassandra:4.1.3

# Configure CloudWatch monitoring
cat > /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json << EOF
{
  "metrics": {
    "namespace": "Cassandra",
    "metrics_collected": {
      "cpu": {
        "measurement": [
          "cpu_usage_idle",
          "cpu_usage_iowait"
        ],
        "metrics_collection_interval": 60
      },
      "disk": {
        "measurement": [
          "used_percent"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "/data/cassandra"
        ]
      },
      "mem": {
        "measurement": [
          "mem_used_percent"
        ],
        "metrics_collection_interval": 60
      }
    }
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/cassandra/system.log",
            "log_group_name": "/aws/cassandra/system",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  }
}
EOF

/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config -m ec2 -s \
  -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json
`),
    });

    // Create Auto Scaling Groups for different node types
    const seedGroup = new autoscaling.AutoScalingGroup(this, 'SeedNodes', {
      vpc,
      launchTemplate,
      minCapacity: 3,
      maxCapacity: 3,
      desiredCapacity: 3,
      vpcSubnets: {
        subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS,
        onePerAz: true,
      },
    });

    // Tag seed nodes
    cdk.Tags.of(seedGroup).add('cassandra:seed', 'true');
    cdk.Tags.of(seedGroup).add('cassandra:role', 'seed');

    // Non-seed nodes with Spot instances for cost savings
    const dataGroup = new autoscaling.AutoScalingGroup(this, 'DataNodes', {
      vpc,
      mixedInstancesPolicy: {
        instancesDistribution: {
          onDemandPercentageAboveBaseCapacity: 0,
          spotAllocationStrategy: autoscaling.SpotAllocationStrategy.CAPACITY_OPTIMIZED,
        },
        launchTemplate,
        launchTemplateOverrides: [
          { instanceType: new ec2.InstanceType('m6i.2xlarge') },
          { instanceType: new ec2.InstanceType('m5.2xlarge') },
          { instanceType: new ec2.InstanceType('m5a.2xlarge') },
        ],
      },
      minCapacity: 0,
      maxCapacity: 20,
      desiredCapacity: 6,
      vpcSubnets: {
        subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS,
      },
    });

    cdk.Tags.of(dataGroup).add('cassandra:seed', 'false');
    cdk.Tags.of(dataGroup).add('cassandra:role', 'data');

    // CQL is a plain TCP protocol, so expose it through a Network Load Balancer
    const nlb = new elbv2.NetworkLoadBalancer(this, 'CassandraNLB', {
      vpc,
      internetFacing: false,
      vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
    });

    const listener = nlb.addListener('CQLListener', {
      port: 9042,
      protocol: elbv2.Protocol.TCP,
    });

    listener.addTargets('CassandraTargets', {
      port: 9042,
      protocol: elbv2.Protocol.TCP,
      targets: [seedGroup, dataGroup],
      healthCheck: {
        enabled: true,
        protocol: elbv2.Protocol.TCP,
      },
    });
  }
}
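
Deploying and sanity-checking the EC2 stack might look like the following; the stack id and tag names come from the code above, while the CDK app wiring that passes in the VPC is assumed:

cdk deploy CassandraEC2Stack

# List the seed nodes launched by the Auto Scaling group
aws ec2 describe-instances \
  --filters "Name=tag:cassandra:role,Values=seed" "Name=instance-state-name,Values=running" \
  --query 'Reservations[].Instances[].[InstanceId,PrivateIpAddress,Placement.AvailabilityZone]' \
  --output table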

GitOps Deployment with FluxCD

Modern deployments use GitOps for continuous deployment:

# flux-system/cassandra-source.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: cassandra-config
  namespace: flux-system
spec:
  interval: 1m
  ref:
    branch: main
  url: https://github.com/yourorg/cassandra-config

---
# flux-system/cassandra-kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cassandra-cluster
  namespace: flux-system
spec:
  interval: 10m
  path: ./clusters/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: cassandra-config
  postBuild:
    substitute:
      cluster_name: "production"
      datacenter: "us-east-1"

Secure Access with AWS Systems Manager

No more bastion hosts! Use Systems Manager Session Manager:

# Connect to an instance
aws ssm start-session --target i-1234567890abcdef0

# Port forwarding for CQL access
aws ssm start-session \
  --target i-1234567890abcdef0 \
  --document-name AWS-StartPortForwardingSession \
  --parameters '{"portNumber":["9042"],"localPortNumber":["9042"]}'

# Connect with cqlsh through the tunnel
cqlsh localhost 9042
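
Systems Manager can also run one-off commands across the fleet without opening a session. A sketch that checks ring status on the seed nodes, targeting the tags set in the CDK stack above:

aws ssm send-command \
  --document-name "AWS-RunShellScript" \
  --targets "Key=tag:cassandra:role,Values=seed" \
  --parameters 'commands=["docker exec cassandra nodetool status"]'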

Multi-Region Deployment

// lib/multi-region-cassandra.ts
import * as cdk from 'aws-cdk-lib';
import * as elbv2 from 'aws-cdk-lib/aws-elasticloadbalancingv2';
import * as globalaccelerator from 'aws-cdk-lib/aws-globalaccelerator';
import * as ga_endpoints from 'aws-cdk-lib/aws-globalaccelerator-endpoints';
import { Construct } from 'constructs';

export interface MultiRegionCassandraProps extends cdk.StackProps {
  // NLBs fronting the per-region clusters (passed from other stacks or imported by ARN)
  nlbUsEast1: elbv2.INetworkLoadBalancer;
  nlbEuWest1: elbv2.INetworkLoadBalancer;
}

export class MultiRegionCassandraStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props: MultiRegionCassandraProps) {
    super(scope, id, props);

    // Create Global Accelerator for multi-region access
    const accelerator = new globalaccelerator.Accelerator(this, 'CassandraGA', {
      ipAddressType: globalaccelerator.IpAddressType.IPV4,
    });

    const listener = accelerator.addListener('Listener', {
      portRanges: [{ fromPort: 9042, toPort: 9042 }],
      protocol: globalaccelerator.ConnectionProtocol.TCP,
    });

    // Add endpoints from multiple regions
    listener.addEndpointGroup('USEast1', {
      region: 'us-east-1',
      endpoints: [new ga_endpoints.NetworkLoadBalancerEndpoint(props.nlbUsEast1)],
    });

    listener.addEndpointGroup('EUWest1', {
      region: 'eu-west-1',
      endpoints: [new ga_endpoints.NetworkLoadBalancerEndpoint(props.nlbEuWest1)],
    });
  }
}
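
Once the accelerator is deployed, clients connect through its static DNS name. Note that the Global Accelerator API is served from us-west-2 regardless of where your workloads run:

# Find the accelerator's DNS name (the GA API lives in us-west-2)
aws globalaccelerator list-accelerators \
  --region us-west-2 \
  --query 'Accelerators[].{Name:Name,DnsName:DnsName}' --output table

# Point cqlsh (or your driver contact points) at that DNS name
cqlsh <accelerator-dns-name> 9042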

Observability with AWS Distro for OpenTelemetry

# otel-collector-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-config
data:
  config.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
      prometheus:
        config:
          scrape_configs:
            - job_name: 'cassandra'
              kubernetes_sd_configs:
                - role: pod
              relabel_configs:
                - source_labels: [__meta_kubernetes_pod_label_app]
                  action: keep
                  regex: cassandra
    processors:
      batch:
        timeout: 30s
        send_batch_size: 500
    exporters:
      awsprometheusremotewrite:
        endpoint: ${AWS_PROMETHEUS_ENDPOINT}
        aws_auth:
          region: us-east-1
          service: aps
      awsxray:
        region: us-east-1
    service:
      pipelines:
        metrics:
          receivers: [prometheus]
          processors: [batch]
          exporters: [awsprometheusremotewrite]
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [awsxray]    
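
One way to run this configuration on EKS is through the ADOT managed add-on (cert-manager, installed earlier for K8ssandra, is a prerequisite); the cluster name below matches the CDK stack above, and IAM permissions for Amazon Managed Prometheus are assumed to be in place:

kubectl apply -f otel-collector-config.yaml

# Install the AWS Distro for OpenTelemetry add-on on the EKS cluster
aws eks create-addon --cluster-name cassandra-cluster --addon-name adot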

Cost Optimization Strategies

  1. Use Graviton3 instances - up to 40% better price-performance than comparable x86 instances
  2. Implement data tiering - Move cold data to S3 with Apache Iceberg
  3. Use Spot instances for non-seed nodes
  4. Right-size instances based on actual workload
  5. Enable compression - LZ4 for better performance
  6. Use GP3 volumes - Better cost per IOPS than GP2 (see the StorageClass sketch below)
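
For the gp3 recommendation, a StorageClass like the sketch below backs the storageClassName: gp3 used in the K8ssandra manifest. It assumes the EBS CSI driver add-on is installed, and the IOPS and throughput values are illustrative:

kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "6000"
  throughput: "250"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
EOF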

Security Best Practices for 2025

  1. Encryption everywhere:

    • TLS 1.3 for client-to-node
    • mTLS for node-to-node
    • EBS encryption with CMK
    • S3 encryption for backups
  2. Identity and Access:

    • IAM Roles for service accounts (IRSA)
    • AWS SSO for human access
    • No SSH keys or bastion hosts
  3. Network Security:

    • VPC endpoints for AWS services
    • AWS WAF for API protection
    • Network policies in Kubernetes (see the example below)
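
For the Kubernetes network-policy item, a minimal sketch might look like the following; the pod label is the one cass-operator applies to Cassandra pods (verify it for your version), and the client namespace label is an assumption:

kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: cassandra-cql-access
  namespace: cassandra
spec:
  podSelector:
    matchLabels:
      cassandra.datastax.com/cluster: cassandra-prod   # label set by cass-operator (verify for your version)
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              cassandra-client: "true"                  # hypothetical label on client namespaces
      ports:
        - protocol: TCP
          port: 9042
EOF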

Conclusion

In 2025, deploying Cassandra on AWS has become more sophisticated with:

  • Infrastructure as Code using CDK v2
  • Kubernetes operators for easier management
  • GitOps for continuous deployment
  • Better cost optimization with Graviton3 and Spot
  • Enhanced security with Systems Manager and IRSA
  • Comprehensive observability with OpenTelemetry

The days of manual EC2 provisioning and bastion hosts are behind us. Modern Cassandra deployments leverage the full AWS ecosystem for a more reliable, secure, and cost-effective solution.

More about Cloudurable™

Cloudurable specializes in AWS DevOps automation for Cassandra, Kafka, and modern data platforms. We provide training, consulting, and implementation services to help organizations succeed with distributed systems in the cloud.

Feedback

We hope you found this updated guide helpful. Please provide feedback.