Cassandra CPU requirements in AWS Cloud
Cassandra is highly concurrent. Cassandra nodes can use as many CPU cores as are available if configured correctly.
What are vCPUs and ECUs?
An Amazon EC2 vCPU is a hyperthread, often referred to as a virtual core. Think of it as a physical thread of execution. It can run one thread at a time (which, of course, could be swapped out).
Cassandra AWS Storage Requirements
Cassandra does a lot of sequential disk I/O for the commit log and for writing out SSTables. You still need random I/O for read operations. The more read operations that are cache misses, the more IOPS your EBS volumes need.
Cassandra writes to four areas: commit logs, SSTables, an index file, and a bloom filter.
Consider EC2 instance store instead of EBS for Cassandra
AWS provides two storage options: EC2 instance local storage, called instance storage, which is not available with all EC2 instance types, and Elastic Block Store (EBS).
What is Cassandra?
Cassandra is a linearly scalable, open source NoSQL database. Cassandra uses a log-structured merge-tree, which makes it one of the best NoSQL options for high-throughput writes. Cassandra delivers continuous availability with operational simplicity. Unlike many other NoSQL solutions, Cassandra is a master-less, peer-to-peer, distributed clustered store. Each node learns about the cluster network topology via the gossip protocol. Cloudurable provides Cassandra training, Cassandra consulting, and Cassandra support, and helps with setting up Cassandra clusters in AWS.
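To make the log-structured merge-tree idea in the excerpt above a little more concrete, here is a minimal sketch in Python. It is a toy illustration, not Cassandra's implementation: the class name `TinyLSM` and the `memtable_limit` threshold are hypothetical, everything lives in memory, and compaction, bloom filters, and index files are left out. It only shows why writes are cheap (an append plus an in-memory update) and why reads may have to consult several sorted runs.

```python
import bisect


class TinyLSM:
    """Toy log-structured merge-tree: a memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.commit_log = []  # append-only record of every write
        self.memtable = {}    # most recent writes, held in memory
        self.sstables = []    # sorted (key, value) lists, oldest first

    def put(self, key, value):
        # A write is a sequential append plus an in-memory update.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as one immutable, sorted run.
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            i = bisect.bisect_left(table, (key,))
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None
```

For example, after `db = TinyLSM()` followed by `db.put("a", 1)`, `db.put("b", 2)`, and `db.put("a", 3)`, the first two writes have been flushed to a sorted run and `db.get("a")` returns the newer value from the memtable.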
Cassandra Cluster Tutorial 3: Part 2 of 2
Setting up Ansible and SSH for our Cassandra Database Cluster for DevOps/DBA Tasks
This tutorial series centers on how to perform DevOps/DBA tasks with the Cassandra Database. As we mentioned before, Ansible and ssh are essential tools for common DevOps/DBA tasks when working with Cassandra clusters. Please read part 1 before reading part 2. In part 1, we set up Ansible for our Cassandra Database Cluster to automate common DevOps/DBA tasks.
Cassandra Cluster Tutorial 3: Part 1 of 2
Setting up Ansible/SSH for our Cassandra Database Cluster for DevOps/DBA Tasks
Ansible and ssh are essential tools for common DevOps/DBA tasks like managing backups, performing rolling upgrades of the Cassandra cluster in AWS/EC2, and much more. An excellent aspect of Ansible is that it uses ssh, so you do not have to install an agent to use it. This article series centers on how to perform DevOps/DBA tasks with the Cassandra Database.
Cloud DevOps: Using Packer, Ansible/SSH and AWS command line tools to create and manage EC2 Cassandra instances in AWS
This article is useful for developers and DevOps/DBA staff who want to create AWS AMI images and manage the resulting EC2 instances with Ansible. Although this article is part of a series about setting up Cassandra Database images and doing DevOps/DBA work with Cassandra clusters, the topics we cover apply to AWS DevOps in general - even if you don’t use Cassandra at all.
Cassandra Cluster Tutorial: Setting up Ansible for our Cassandra Database Cluster to do DevOps tasks
Cassandra Tutorial: Setting up Ansible for our Cassandra Database Cluster for DevOps/DBA tasks
Ansible is an essential DevOps/DBA tool for managing backups and rolling upgrades of the Cassandra cluster in AWS/EC2. An excellent aspect of Ansible is that it uses ssh, so you do not have to install an agent to use it. This article series centers on how to perform DevOps/DBA tasks with the Cassandra Database. However, Ansible is useful well beyond Cassandra, so this article is relevant to any DevOps/DBA or developer who needs to manage groups of instances, boxes, or hosts, whether they are on-prem bare metal, dev boxes, or in the cloud.
Introduction to BigData Analytics with Apache Spark Part 1
By Fadi Maalouli and R.H.
Spark Overview
Apache Spark, an open source cluster computing system, is growing fast. Apache Spark has a growing ecosystem of libraries and frameworks that enable advanced data analytics. Apache Spark’s rapid success is due to its power and ease of use. It is more productive and has a faster runtime than typical MapReduce-based BigData analytics. Apache Spark provides in-memory, distributed computing.
Analytics with Apache Spark Tutorial Part 2: Spark SQL
Using Spark SQL from Python and Java
Combining Cassandra and Spark
By Fadi Maalouli and R.H.
Spark, a very powerful tool for real-time analytics, is very popular. In the first part of this series on Spark we introduced Spark. We covered Spark’s history, and explained RDDs (which are used to partition data in the Spark cluster). We also covered the Apache Spark Ecosystem.
In this part of the Spark tutorial (part 3), we will introduce two important components of Spark’s ecosystem: Spark Streaming and MLlib.
Spark Streaming
By Fadi Maalouli and R.H.
Spark Streaming is a real-time processing tool that has a high-level API, is fault tolerant, and is easy to integrate with SQL DataFrames and GraphX. At a high level, Spark Streaming works by running receivers that receive data from sources such as S3, Cassandra, and Kafka. It divides this data into blocks and pushes the blocks into Spark, which then works with them as RDDs to produce your results.
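The block-by-block flow described above can be sketched in plain Python. This is only an illustration of the idea and does not use the Spark API: the helper names `blocks` and `word_counts` are hypothetical, and blocks are cut by record count here for simplicity, whereas Spark Streaming groups incoming data by time interval.

```python
from itertools import islice


def blocks(records, block_size):
    """Split an incoming stream of records into fixed-size blocks."""
    it = iter(records)
    while True:
        block = list(islice(it, block_size))
        if not block:
            return
        yield block


def word_counts(block):
    """Process one block as a small batch: count the words it contains."""
    counts = {}
    for line in block:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts
```

Each block yielded by `blocks(stream, block_size)` can then be handed to `word_counts` (or any other batch computation), which mirrors how a stream is turned into a sequence of small batch jobs.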