January 30, 2017
Notes on Cassandra OS setup and optimizations for deploying in EC2/AWS
These are important concepts for developers and DevOps engineers who are responsible for developing Cassandra-based applications and services.
Cassandra writes to four areas:
- commit logs
- SSTables
- an index file
- a bloom filter
The compaction process of SSTable data makes heavy use of the disk.
LeveledCompactionStrategy may need 10 to 20% overhead.
SizeTieredCompactionStrategy's worst case is 50% overhead needed to perform compaction.
Keep this in mind while sizing disks. If you have a high-update use case, LeveledCompactionStrategy is the best solution if you want to limit the total disk space used at any point in time and to optimize reads, as a row will be spread across fewer (up to ten times fewer) SSTables.
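As a rough illustration of the sizing arithmetic, here is a minimal Python sketch. It assumes the overhead figures quoted above mean the share of the disk that must stay free for compaction; that interpretation, the function name `required_disk_gb`, and the 400 GB data size are assumptions for illustration only.

```python
# Assumed worst-case free-space reserve per strategy, in percent of the disk.
# 20 is the upper end of the "10 to 20%" range quoted above; 50 is the
# SizeTieredCompactionStrategy worst case.
COMPACTION_RESERVE_PCT = {
    "LeveledCompactionStrategy": 20,
    "SizeTieredCompactionStrategy": 50,
}


def required_disk_gb(data_gb, strategy):
    """Return the disk size (GB) that keeps the reserve free after storing data_gb."""
    reserve_pct = COMPACTION_RESERVE_PCT[strategy]
    # The data must fit in the (100 - reserve)% of the disk that is not reserved.
    return data_gb * 100 / (100 - reserve_pct)


for name in COMPACTION_RESERVE_PCT:
    size = required_disk_gb(400, name)
    print(f"{name}: {size:.0f} GB of disk for 400 GB of data")
```

Under these assumptions the same 400 GB of data needs noticeably more disk with SizeTieredCompactionStrategy, which is the trade-off to weigh against the extra IO that LeveledCompactionStrategy spends on compaction.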
LeveledCompactionStrategy requires more IO and processing time for compactions. If in doubt, use
If you are using magnetic disks, it makes sense to put the commit logs on a separate disk where possible.
SSTables are written in streams but are read using random access if the data is not found in cache.
SSTables can benefit from SSD drives due to random access.
Magnetic disks in EC2 have greater throughput but fewer IOPS, which is good for SSTable compaction but not good for random reads. If in doubt, use SSD volumes.
If you use RAID, RAID 0, which focuses on speed, is sufficient for Cassandra because Cassandra has replication and data safety built in. With Cassandra 3.x, you should use JBOD (just a bunch of disks) instead of RAID 0 for throughput.
XFS is the preferred file system since it has very few size constraints (sudo mkfs.xfs -K /dev/xvdb) and excels at writing in parallel. If you need data-at-rest encryption, use encrypted EBS volumes if running in EC2, and use dm-crypt if not.
Until recently, using Cassandra with AWS EBS was not a good idea. The latest generation of EBS-optimized instances offers good performance, for many use cases rivaling instance storage. It is the best price for performance. If in doubt, start with EBS-optimized instances.
One still has to keep an eye out for issues with EBS like poor throughput, performance degrading over time, and instances not cleanly dying. This is where system monitoring comes into play and one reason we are building these images which can be monitored using Amazon CloudWatch.
The EC2 instances we use tend to be from the M4 family and the I2 family. M4 is AWS EC2's newest generation of general-purpose instances with EBS-optimized storage, whilst the I2 family includes fast SSD-backed instance storage optimized for very high random I/O performance. I2s provide high IOPS at a low cost. In benchmarks of tiny reads/writes, i2 instances beat m4 instances with 8x the read speed. For medium reads/writes, m4 instances (EBS-optimized) are equivalent, but at one-eighth the cost of i2s. There have been some reports of EBS storage degrading over time. But given the 8x cost difference, and with some monitoring and replication, you could automate the retirement of EC2 instances whose optimized EBS storage is degrading.
An advantage of the M4 family is the ability to use EBS to create snapshots and easily spin up new instances by attaching an EBS volume to a new instance. If you are not sure, start with
You can consider the D2 family of EC2 instances for mostly write operations. The D2 family offers the highest throughput for the cost. If you are keeping a lot of log-based data or even approaching big data use cases, this might be a great option for high throughput (mostly writes and mostly batch reads).
You need at least 4 cores, but prefer 8 cores for a production machine. Compaction, compression, and key lookup based on bloom filters all need CPU resources. The m4.xlarge falls a bit behind for this as it has only 4 vCPUs. The m4.2xlarge has 8 vCPUs, which should be able to handle most production loads nicely. The i2.xlarge (for high random read) and d2.xlarge (for high writes and long sequential reads) are also a little light on CPU power. Consider the d2.2xlarge for production workloads, as it has 8 vCPUs.
Do not use less than 8GB of memory for the JVM. The more RAM the better. Writes are first stored in memory (in a memtable) and then written to disk sequentially as an SSTable. The larger the SSTable, the less scanning needs to be done while reading and determining if a key is in an SSTable using a bloom filter. In the EC2 world this equates to an m4.xlarge (16GB of memory), since you also need some memory for the OS, specifically the IO buffers. The d2.xlarge instances are the smallest in their family and exceed the minimum memory requirement (and then some).
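The CPU and memory guidance above can be turned into a quick sanity check. The following Python sketch is only an illustration: the function name `assess_instance` and the 4 GB allowance for the OS and IO buffers are assumptions, not figures from these notes.

```python
MIN_VCPUS = 4          # "at least 4 cores"
PREFERRED_VCPUS = 8    # "prefer 8 cores for a production machine"
MIN_JVM_HEAP_GB = 8    # "do not use less than 8GB of memory for the JVM"


def assess_instance(vcpus, memory_gb, os_reserve_gb=4):
    """Return a list of warnings for an instance with the given vCPUs and RAM.

    os_reserve_gb is a hypothetical allowance for the OS and IO buffers.
    """
    warnings = []
    if vcpus < MIN_VCPUS:
        warnings.append("too few vCPUs")
    elif vcpus < PREFERRED_VCPUS:
        warnings.append("light on CPU for production")
    if memory_gb - os_reserve_gb < MIN_JVM_HEAP_GB:
        warnings.append("not enough memory for an 8 GB JVM plus the OS")
    return warnings


# 4 vCPUs and 16 GB are the m4.xlarge figures quoted in the text above.
print("m4.xlarge:", assess_instance(4, 16))
# A hypothetical undersized machine, for contrast.
print("small box:", assess_instance(2, 8))
```

With these thresholds the m4.xlarge passes the memory check but is flagged as light on CPU, which matches the assessment in the text.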
AWS EC2 has placement groups and enhanced networking, which allow high-speed throughput for clustered software like Cassandra. This is where things get tricky in EC2. Networking is important to Cassandra due to replication of data. However, in most deployments an AZ is treated like a rack, and Cassandra tries to store replica data on nodes that are in a different rack (in EC2’s case, a different AZ). EC2 placement groups and enhanced networking only work within a single AZ. Thus the most common Cassandra cluster network setup would not benefit from enhanced networking (placement groups) at all. If you replicate higher than 2, then some replication will happen within the same AZ, and placement groups (enhanced networking) could speed that up. Go ahead and use enhanced networking.
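To see why a higher replication factor puts some replicas in the same AZ, here is a simplified Python sketch of rack-aware placement. It is not Cassandra's actual placement code: the two-pass walk, the four-node ring, and the names `place_replicas` and `crowded_azs` are assumptions made for illustration, with two AZs standing in for two racks.

```python
from collections import Counter


def place_replicas(ring, start, rf):
    """Pick rf replicas walking the ring clockwise from index start.

    ring is a list of (node, az) tuples in token order. Nodes in AZs that
    do not hold a replica yet are preferred; remaining slots are then
    filled in ring order.
    """
    order = [ring[(start + i) % len(ring)] for i in range(len(ring))]
    chosen = []
    used_azs = set()
    for node, az in order:            # pass 1: at most one replica per AZ
        if len(chosen) < rf and az not in used_azs:
            chosen.append((node, az))
            used_azs.add(az)
    for entry in order:               # pass 2: fill any slots still open
        if len(chosen) < rf and entry not in chosen:
            chosen.append(entry)
    return chosen


def crowded_azs(replicas):
    """Return the AZs that hold more than one replica, sorted by name."""
    counts = Counter(az for _, az in replicas)
    return sorted(az for az, n in counts.items() if n > 1)


ring = [("n1", "us-east-1a"), ("n2", "us-east-1b"),
        ("n3", "us-east-1a"), ("n4", "us-east-1b")]
for rf in (2, 3):
    replicas = place_replicas(ring, 0, rf)
    print(rf, [node for node, _ in replicas], crowded_azs(replicas))
```

In this toy ring, a replication factor of 2 keeps every replica in its own AZ, while a factor of 3 forces two replicas into us-east-1a. That same-AZ traffic is the part a placement group could help with.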
Our images use CentOS 7, a Linux variant based on RHEL. Most enterprises use RHEL and so will be familiar with CentOS. Amazon Linux is based on CentOS but has not kept up with advances in systemd support.
The limits.conf file should be configured as follows.
    #<domain> <type> <item> <value>
    root soft nofile 32768
    root hard nofile 32768
    root soft memlock unlimited
    root hard memlock unlimited
    root soft as unlimited
    root hard as unlimited
    root hard nproc 32768
    root soft nproc 32768
    * soft nofile 32768
    * hard nofile 32768
    * soft memlock unlimited
    * hard memlock unlimited
    * soft as unlimited
    * hard as unlimited
    * soft nproc 32768
    * hard nproc 32768
The key here is the increase in the number of open files and processes.
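One way to confirm that the limits actually apply to a running process is to query them from Python's standard `resource` module. This is a minimal sketch for Linux. The helper names `limit_ok` and `check_limits` are made up for illustration, and the check covers only the soft limits of the process that runs it.

```python
import resource

UNLIMITED = resource.RLIM_INFINITY

# Targets taken from the limits.conf entries above.
WANTED = {
    "nofile": (resource.RLIMIT_NOFILE, 32768),
    "nproc": (resource.RLIMIT_NPROC, 32768),
    "memlock": (resource.RLIMIT_MEMLOCK, UNLIMITED),
    "as": (resource.RLIMIT_AS, UNLIMITED),
}


def limit_ok(actual, wanted):
    """True if an actual limit is at least as permissive as the wanted one."""
    if actual == UNLIMITED:
        return True
    if wanted == UNLIMITED:
        return False
    return actual >= wanted


def check_limits():
    """Map each limits.conf item to True/False using this process's soft limits."""
    results = {}
    for item, (res, wanted) in WANTED.items():
        soft, _hard = resource.getrlimit(res)
        results[item] = limit_ok(soft, wanted)
    return results
```

Calling `check_limits()` from a shell started by the same user that runs Cassandra gives a quick view of which settings took effect. Limits are applied at login, so a fresh session is needed after editing the file.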
Next, let’s set up sysctl.conf.
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.core.rmem_default = 16777216
    net.core.wmem_default = 16777216
    net.core.optmem_max = 40960
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    vm.max_map_count = 1048575
    xen.independent_wallclock = 1
The key here is to give the TCP stack large buffers.
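To check whether the running kernel matches these settings, the values can be compared with the files under /proc/sys. The sketch below is one possible approach for Linux. The helper names are hypothetical, and a key that does not exist on a given kernel is reported with a live value of None.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines into a dict, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        # Collapse whitespace so "4096  87380" and "4096\t87380" compare equal.
        settings[key.strip()] = " ".join(value.split())
    return settings


def proc_path(key):
    """Map a sysctl key to its file under /proc/sys."""
    return "/proc/sys/" + key.replace(".", "/")


def live_value(key):
    """Read the running kernel's value, or None if the key is absent."""
    try:
        with open(proc_path(key)) as handle:
            return " ".join(handle.read().split())
    except OSError:
        return None


def mismatches(wanted):
    """Return {key: (wanted, live)} for every setting that differs."""
    result = {}
    for key, value in wanted.items():
        live = live_value(key)
        if live != value:
            result[key] = (value, live)
    return result
```

For example, `mismatches(parse_sysctl(open("/etc/sysctl.conf").read()))` lists the settings that have not been applied yet, assuming the file lives at that conventional path.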
We also have a special script to turn off transparent_hugepage, which works around a bug that causes the Cassandra Database to use a lot of CPU even when under no load.
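The script itself is not shown in these notes. As a stand-in, here is a minimal Python sketch of how the setting can be inspected and turned off through the kernel's sysfs controls on Linux. The function names are made up for illustration, and writing to the controls requires root.

```python
THP_DIR = "/sys/kernel/mm/transparent_hugepage"


def active_mode(contents):
    """Return the bracketed (active) choice from e.g. 'always madvise [never]'."""
    for word in contents.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def thp_enabled_mode():
    """Read the active transparent_hugepage mode, or None if unavailable."""
    try:
        with open(THP_DIR + "/enabled") as handle:
            return active_mode(handle.read())
    except OSError:
        return None


def disable_thp():
    """Write 'never' to the enabled and defrag controls (requires root)."""
    for name in ("enabled", "defrag"):
        with open(f"{THP_DIR}/{name}", "w") as handle:
            handle.write("never")
```

A change made this way does not survive a reboot, so it would need to run at startup, for example from an init script or systemd unit.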
More to come
A few things we are working on are base AMI, Docker, and Vagrant images that preconfigure the Cassandra Database with the correct OS and Cassandra settings. You can see where we apply some of these notes in creating images. We are also working on cloud config ergonomics for Cassandra that use metadata from EC2, Linux, etc. to configure Cassandra based on the type of server (or EC2 instance type). Check out more at our blog.