AWS Cassandra

AWS Cassandra Cluster Tutorial 5 (2025): Modern Cassandra Deployment with CDK, EKS, and Infrastructure as Code

Cassandra Cluster Tutorial 5 (2025) - Modern AWS Cassandra Deployment with CDK, EKS, and Infrastructure as Code

This Cassandra tutorial is designed for developers and DevOps/SRE teams who want to deploy production-ready Cassandra clusters in AWS using modern practices and tools available in 2025.

What’s New in 2025

The landscape of deploying Cassandra on AWS has evolved significantly:

  1. AWS CDK v2 has become the standard for infrastructure as code, offering type-safe infrastructure definitions
  2. Kubernetes operators like K8ssandra provide production-ready Cassandra deployments
  3. AWS Graviton3 processors offer up to 40% better price-performance than comparable x86 instances for Cassandra workloads
  4. Container-based deployments are now the norm, with EKS Anywhere for hybrid deployments
  5. Service mesh integration with AWS App Mesh provides advanced traffic management
  6. AWS Systems Manager replaces bastion hosts for secure access
  7. GitOps workflows with AWS CodeCommit and FluxCD for infrastructure management

Cloudurable provides Cassandra training, Cassandra consulting, and Cassandra support, and helps set up Cassandra clusters in AWS.

Cassandra 5.0 AWS CPU Requirements: Graviton4, ZGC, and Performance Optimization

What’s New in 2025

Key Updates and Changes

  • Cassandra 5.0: Enhanced CPU utilization with improved compaction and streaming
  • Graviton4 Processors: Up to 40% better performance for database workloads than Graviton3
  • ZGC Integration: Low-latency garbage collection for improved response times
  • Instance Types: New I8g, R8g, C8g families optimized for Cassandra workloads
  • Compaction Improvements: Better concurrent compactor defaults and tuning

Major Performance Enhancements

  • Unified Compaction: Reduced CPU overhead in Cassandra 5.0
  • Vector Search: CPU-intensive operations requiring additional cores
  • Streaming Performance: Improved parallel processing for data migration
  • Memory Management: Better allocation strategies reducing CPU pressure
  • ARM Optimization: Native ARM64 support for Graviton processors

Cassandra 5.0 CPU Requirements in AWS Cloud

Cassandra 5.0 is highly concurrent and can utilize as many CPU cores as available when configured correctly. Understanding CPU requirements is crucial for optimal performance on AWS EC2 instances.

Cassandra 5.0 AWS Storage Requirements: GP3, I4g Instances, and Performance Optimization

What’s New in 2025

Key Updates and Changes

  • EBS GP3 Volumes: 20% cost savings over GP2 with independent IOPS/throughput scaling
  • I4g Instances: Graviton2-powered with up to 15TB NVMe, up to 15% better compute performance
  • I4i vs I4g: 45-60% lower cost per TB with Im4gn/Is4gen families
  • Unified Compaction: Cassandra 5.0 reduces storage overhead and improves I/O patterns
  • EBS Optimization: Enhanced throughput up to 80 Gbps on latest instance types

Storage Performance Improvements

  • GP3 Baseline: 3,000 IOPS and 125 MiB/s regardless of volume size
  • GP3 Maximum: Up to 16,000 IOPS and 1,000 MiB/s (4x the GP2 maximum throughput of 250 MiB/s)
  • NVMe Performance: I4g delivers up to 7.6 million IOPS per instance
  • EBS Elastic Volumes: Live migration between volume types without downtime
  • Storage Classes: New archive and deep archive tiers for long-term retention
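
The GP3 baseline and maximum figures above can be turned into a quick sanity check before provisioning a volume. The following is a minimal sketch, assuming only the limits quoted in this list; the helper name `gp3_extra_provisioning` is hypothetical and not part of any AWS SDK.

```python
# GP3 figures quoted in the list above.
GP3_BASELINE_IOPS = 3_000
GP3_BASELINE_THROUGHPUT_MIBS = 125
GP3_MAX_IOPS = 16_000
GP3_MAX_THROUGHPUT_MIBS = 1_000


def gp3_extra_provisioning(iops: int, throughput_mibs: int) -> dict:
    """Return how much must be provisioned above the GP3 baseline.

    Hypothetical helper for illustration only. Raises ValueError when the
    request falls outside the baseline-to-maximum range quoted above.
    """
    if not GP3_BASELINE_IOPS <= iops <= GP3_MAX_IOPS:
        raise ValueError(
            f"IOPS must be between {GP3_BASELINE_IOPS} and {GP3_MAX_IOPS}"
        )
    if not GP3_BASELINE_THROUGHPUT_MIBS <= throughput_mibs <= GP3_MAX_THROUGHPUT_MIBS:
        raise ValueError(
            f"Throughput must be between {GP3_BASELINE_THROUGHPUT_MIBS} "
            f"and {GP3_MAX_THROUGHPUT_MIBS} MiB/s"
        )
    return {
        "extra_iops": iops - GP3_BASELINE_IOPS,
        "extra_throughput_mibs": throughput_mibs - GP3_BASELINE_THROUGHPUT_MIBS,
    }


if __name__ == "__main__":
    # A volume tuned for a read-heavy node: 6,000 IOPS and 250 MiB/s.
    print(gp3_extra_provisioning(6_000, 250))
```

Because the baseline applies regardless of volume size, only the amounts returned here would be extra provisioning on top of the volume itself.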

Cassandra 5.0 AWS Storage Requirements

Cassandra 5.0 performs extensive sequential disk I/O for commit logs and SSTable writes, while requiring random I/O for read operations. The enhanced Unified Compaction strategy in Cassandra 5.0 provides more predictable I/O patterns and reduced storage overhead.

Cassandra AWS System Memory Guidelines 2025: Optimizing for Modern Hardware and Workloads

System Memory Guidelines for Cassandra AWS - 2025 Edition

What’s New in 2025

The Cassandra memory landscape has evolved significantly:

  1. Modern JVMs - Java 21 LTS with ZGC or Shenandoah GC offers low-pause garbage collection, with sub-millisecond pauses under ZGC
  2. AWS Graviton3 - ARM-based processors with DDR5 memory provide 50% more memory bandwidth than DDR4-based Graviton2
  3. Larger heap sizes - Modern GCs handle 100GB+ heaps efficiently
  4. Container deployments - Memory management in Kubernetes requires different approaches
  5. Persistent memory - Intel Optane (now discontinued) and similar technologies blur the line between RAM and storage
  6. Tiered storage - Hot data in memory, warm in NVMe, cold in S3
  7. Vector search workloads - New memory requirements for AI/ML applications

Cloud DevOps 2025: Packer, Ansible, SSH and AWS/EC2

What’s New in 2025

Key Updates and Changes

  • New EC2 Instance Types: M7i, C7i, and R7i families now available with up to 15% better price-performance
  • Packer Updates: Version 1.11 with predictable plugin loading and HCP integration
  • Ansible Best Practices: Enhanced aws_ec2 plugin with improved security and performance features
  • EBS Volume Evolution: GP3 volumes now standard, offering 20% cost savings over GP2
  • HashiCorp Updates: Terraform AWS Provider 6.0 with multi-region support
  • Security Enhancements: AWS Verified Access for SSH/RDP, enhanced IAM with ECR Policy v2

Deprecated Features and Migration Notes

  • GP2 to GP3 Migration: GP2 volumes should be migrated to GP3 for cost savings
  • EC2 Dynamic Inventory: Old ec2.py script deprecated in favor of aws_ec2 plugin
  • Instance Types: Consider upgrading from M6i to M7i instances for better performance
  • Packer AWS Builder: Continue using amazon-ebs builder with updated authentication methods

Cloud DevOps: Using Packer, Ansible/SSH and AWS command line tools to create and manage EC2 Cassandra instances in AWS.

This article is useful for developers and DevOps/DBA staff who want to create AWS AMI images and manage those EC2 instances with Ansible. Although this article is part of a series about setting up the Cassandra Database images and doing DevOps/DBA with Cassandra clusters, the topics we cover apply to AWS DevOps in general - even if you don’t use Cassandra at all.

AWS Cassandra Cluster Tutorial 5: Setting up Cassandra Cluster in AWS/EC2

Cassandra Cluster Tutorial 5 - Cassandra AWS Cluster with CloudFormation, bastion host, Ansible and the aws-command line

This Cassandra tutorial is useful for developers and DevOps/DBA staff who want to launch a Cassandra cluster in AWS.

The cassandra-image project has been using Vagrant and Ansible to set up a Cassandra cluster for local testing. Then we used Packer, Ansible and EC2: in the last tutorial, we used Packer to create AWS images. In this tutorial, we will use CloudFormation to create a VPC, subnets, security groups and more to launch a Cassandra cluster in EC2 using the AWS AMI image we created with Packer in the last article. The next two tutorials after this one will set up Cassandra to work in multiple AZs and multiple regions using custom snitches for Cassandra.

Cassandra AWS System Memory Guidelines

System Memory Guidelines for Cassandra AWS

Basic guidelines for AWS Cassandra

Do not use less than 8GB of memory for the JVM. The more RAM the better. Use G1GC. Writes are first stored in memory (in memtables) and then flushed to disk sequentially as SSTables. The larger the SSTable, the less scanning is needed while reading and determining whether a key is in an SSTable using a bloom filter. In the EC2 world this equates to at least an m4.xlarge (16GB of memory), because you also need some memory for the OS, specifically the IO buffers. The i2.xlarge and d2.xlarge are the smallest in their families and exceed the minimum memory requirement (and then some).
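
The guideline above can be expressed as a small pre-flight check. This is only a sketch: the 8GB heap floor comes from the text, while the helper name `check_memory_plan` and the default `os_reserve_gb` of 4GB are assumptions made for illustration, not figures from this guide.

```python
MIN_HEAP_GB = 8.0  # floor from the guideline above


def check_memory_plan(instance_ram_gb: float, heap_gb: float,
                      os_reserve_gb: float = 4.0) -> list:
    """Return a list of problems with a planned heap size (empty if none).

    Hypothetical helper. os_reserve_gb is an assumed amount of RAM to leave
    for the OS and its IO buffers; tune it for your own workload.
    """
    problems = []
    if heap_gb < MIN_HEAP_GB:
        problems.append(f"heap {heap_gb}GB is below the {MIN_HEAP_GB}GB minimum")
    if instance_ram_gb - heap_gb < os_reserve_gb:
        problems.append(
            f"only {instance_ram_gb - heap_gb}GB left for the OS, "
            f"wanted at least {os_reserve_gb}GB"
        )
    return problems


if __name__ == "__main__":
    # m4.xlarge (16GB of memory) with an 8GB heap.
    print(check_memory_plan(16, 8))
    # An 8GB instance cannot hold an 8GB heap and still serve the OS.
    print(check_memory_plan(8, 8))
```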

Cassandra AWS CPU Guidelines

Cassandra CPU requirements in AWS Cloud

Cassandra is highly concurrent. Cassandra nodes can use as many CPU cores as are available if configured correctly.

What are vCPUs and ECUs?

An Amazon EC2 vCPU is a hyper thread, often referred to as a virtual core. Think of it as a physical thread of execution. It is able to run one thread at a time (which of course could be swapped out).

An Amazon ECU (EC2 Compute Unit) is a legacy relative measure that AWS used to publish. One ECU was roughly the CPU capacity of a 1.0-1.2 GHz 2007-era Opteron or Xeon processor, the kind of chip used on the earliest incarnations of EC2. 50 ECUs would be like 50 of those chips from a bygone era. Ignore ECUs.

Cassandra AWS Storage Requirements

Cassandra does a lot of sequential disk I/O for the commit log and for writing out SSTables. You still need random I/O for read operations. The more read operations that are cache misses, the more IOPS your EBS volumes need.

Cassandra writes to four areas

  • commit logs
  • SSTable
  • an index file
  • a bloom filter

Consider EC2 instance store instead of EBS for Cassandra

AWS provides two kinds of storage: EC2 instance-local storage, called instance storage, which is not available with all EC2 instance types, and Elastic Block Store (EBS). Instance storage does not have to go over a SAN or network; instead it uses the local hardware bus. Instance storage is right there on the server you are renting. The downsides of EC2 instance storage are the expense and that it is not as flexible as EBS. Due to historic problems with EBS, instance storage used to be the only real option for running Cassandra in AWS. EBS has a reputation for degrading performance over time. Some of this has likely been fixed with enhanced EBS, but instance storage is more reliable.

Cloud DevOps: Packer, Ansible, SSH and AWS/EC2

Cloud DevOps: Using Packer, Ansible/SSH and AWS command line tools to create and manage EC2 Cassandra instances in AWS.

This article is useful for developers and DevOps/DBA staff who want to create AWS AMI images and manage those EC2 instances with Ansible. Although this article is part of a series about setting up the Cassandra Database images and doing DevOps/DBA with Cassandra clusters, the topics we cover apply to AWS DevOps in general - even if you don’t use Cassandra at all.

Notes on Cassandra OS setup and optimizations for deploying in EC2/AWS

Disk concerns

These are important concepts for developers and DevOps who are responsible for developing Cassandra based applications and services.

Cassandra writes to four areas

  • commit logs
  • SSTable
  • an index file
  • a bloom filter

The compaction process of SSTable data makes heavy use of the disk. LeveledCompactionStrategy may need 10 to 20% overhead. SizeTieredCompactionStrategy, in the worst case, needs 50% overhead to perform compaction. Keep this in mind while sizing disks. If you have a high-update use case, LeveledCompactionStrategy is the best solution if you want to limit the total disk size used at any point in time and to optimize reads, as a row will be spread across fewer (up to ten times fewer) SSTables. LeveledCompactionStrategy requires more IO and processing time for compactions. If in doubt, use LeveledCompactionStrategy.
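
To make the sizing advice concrete, here is a minimal sketch that converts an expected data size into a disk size for each strategy. It assumes "overhead" means the fraction of the disk that must stay free for compaction, which is one reading of the figures above; the helper name `required_disk_gb` is hypothetical.

```python
# Fraction of the disk kept free for compaction, from the paragraph above.
COMPACTION_OVERHEAD = {
    "LeveledCompactionStrategy": 0.20,     # upper end of the 10 to 20% range
    "SizeTieredCompactionStrategy": 0.50,  # worst case
}


def required_disk_gb(data_gb: float, strategy: str) -> float:
    """Return the disk size needed so that `data_gb` of SSTable data still
    leaves the strategy's compaction overhead free.

    Assumption: overhead is a fraction of total disk, so
    disk = data / (1 - overhead).
    """
    overhead = COMPACTION_OVERHEAD[strategy]
    return data_gb / (1.0 - overhead)


if __name__ == "__main__":
    for name in COMPACTION_OVERHEAD:
        print(f"{name}: {required_disk_gb(500, name):.0f} GB for 500 GB of data")
```

Under this reading, 500 GB of data calls for about 625 GB of disk with LeveledCompactionStrategy and 1,000 GB with SizeTieredCompactionStrategy.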
