Part 2 Setting up Ansible and ssh for Cassandra Database Cluster DevOps

March 11, 2017

                                                                           

Cassandra Cluster Tutorial 3: Part 2 of 2

Setting up Ansible and SSH for our Cassandra Database Cluster for DevOps/DBA Tasks

This tutorial series centers on DevOps/DBA tasks with the Cassandra Database. As we mentioned before, Ansible and ssh are essential tools for common DevOps/DBA tasks when working with Cassandra clusters. Please read part 1 before reading part 2.

In part 1, we set up Ansible for our Cassandra Database Cluster to automate common DevOps/DBA tasks. As part of this setup, we created an ssh key and then set up our instances with this key so we could use ssh, scp, and most importantly ansible. We also created an ansible playbook to install keys on our Cassandra nodes from a bastion host that we set up with Vagrant.

Let’s get to it.

For users who did not read any of the first articles on setting up the Cassandra Cluster

If you have not done so already, navigate to the project root dir (which is ~/github/cassandra-image on my dev box) and download the binaries. The source code is in the Cassandra Image project on GitHub.

Running setup scripts

## cd ~; mkdir github; cd github; git clone https://github.com/cloudurable/cassandra-image
$ cd ~/github/cassandra-image
$ pwd
~/github/cassandra-image
## Setup keys
$ bin/setupkeys-cassandra-security.sh
## Download binaries
$ bin/prepare_binaries.sh
## Bring Vagrant cluster up
$ vagrant up

Even if you read the first article, note that bin/prepare_binaries.sh is something we added after the first two articles. It downloads the binaries needed for provisioning and checksums the files, which are then installed as part of the provisioning process.

Where do you go if you have a problem or get stuck?

We set up a Google group for this project and set of articles. If you just can’t get something to work or you are getting an error message, please report it there. Between the mailing list and the GitHub issues, we can help with most questions and problems.

Running ansible commands from bastion

Let’s log into bastion and run ansible commands against the cassandra nodes.

Working with ansible from bastion and using ssh-agent

$ vagrant ssh bastion

So we don’t have to keep passing our private key on every command, let’s start up an ssh-agent and add the key to it with ssh-add ~/.ssh/test_rsa.

The ssh-agent is a utility to hold private keys used for public key authentication (RSA, DSA, ECDSA, Ed25519) so you don’t have to keep passing the keys around. The ssh-agent is usually started in the beginning of a login session. Other programs (scp, ssh, ansible) are started as clients to the ssh-agent utility.

Mastering ssh is essential for DevOps and needed for ansible. –R. Hightower

First set up ssh-agent and add keys to it with ssh-add.

Start ssh-agent and add keys

$ ssh-agent bash
$ ssh-add ~/.ssh/test_rsa

Now that the agent is running and our keys are added, we can use ansible without passing it the RSA private key.
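The `ssh-agent bash` form above starts a new shell under the agent. As an alternative, here is a minimal sketch that reuses an agent if one is already reachable and otherwise starts one in the current shell with `eval "$(ssh-agent -s)"`. The helper names `agent_is_usable` and `ensure_agent_with_key` and the `KEY_PATH` variable are hypothetical and not part of the project. The sketch relies on the documented exit codes of `ssh-add -l`.

```shell
#!/usr/bin/env bash
# Sketch: reuse a reachable ssh-agent or start one in the current shell,
# then load the tutorial's test key. Helper names are hypothetical.
# Source this file (". ./agent.sh") so the agent variables stay in your shell.

KEY_PATH="${KEY_PATH:-$HOME/.ssh/test_rsa}"

agent_is_usable() {
    # ssh-add -l exits 0 (keys listed), 1 (agent reachable, no keys),
    # or 2 (no agent reachable). Only 0 and 1 mean the agent can be reused.
    ssh-add -l >/dev/null 2>&1
    rc=$?
    [ "$rc" -eq 0 ] || [ "$rc" -eq 1 ]
}

ensure_agent_with_key() {
    if ! agent_is_usable; then
        # ssh-agent -s prints Bourne-style export statements for eval.
        eval "$(ssh-agent -s)" >/dev/null
    fi
    # Add the key only when its file is present.
    if [ -f "$KEY_PATH" ]; then
        ssh-add "$KEY_PATH"
    else
        echo "key not found: $KEY_PATH" >&2
        return 1
    fi
}
```

After sourcing the file, calling `ensure_agent_with_key` should leave the test key loaded, and `ssh-add -l` lists what the agent holds.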

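Let’s verify connectivity by pinging these machines: first node0, then all of the nodes.

Let’s use the ansible ping module to ping the node0 server. Note that this is not an ICMP ping; the module logs in over ssh and checks that Ansible can run Python on the remote host, so a pong confirms that our key setup works end to end.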

Ansible Ping the Cassandra Database node

$ ansible node0 -m ping

Output

node0 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

To learn more about DevOps with ansible see this video on Ansible introduction. It covers a lot of the basics of ansible.

Now let’s ping all of the nodes.

Ansible Ping all Cassandra Database Cluster nodes

$ ansible nodes  -m ping

Output

node0 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
node2 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
node1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Looks like bastion can run ansible against all of the servers.

Cloudurable specializes in AWS DevOps Automation for Cassandra and Kafka

We hope this web page on ansible bastion setup for Cassandra helps you. We also provide Cassandra consulting and Kafka consulting to get you set up fast in AWS with CloudFormation and CloudWatch. Support us by checking out our Cassandra training and Kafka training.

Setting up my MacOSX to run Ansible against Cassandra Database Cluster nodes

The script ~/github/cassandra-image/bin/setupkeys-cassandra-security.sh copies the test cluster key for ssh (secure shell) over to ~/.ssh/ (cp "$PWD/resources/server/certs/"* ~/.ssh). It is run from the project root folder, which is ~/github/cassandra-image on my box.

Move to the directory where you checked out the project.

cd ~/github/cassandra-image

In this folder are an ansible.cfg file and an inventory.ini file for local dev. Before you use these, first modify your /etc/hosts file to add entries for the bastion, node0, node1, and node2 servers.

Add bastion, node0, etc. to /etc/hosts

$ cat /etc/hosts

### Used for ansible/ vagrant
192.168.50.20  bastion
192.168.50.4  node0
192.168.50.5  node1
192.168.50.6  node2
192.168.50.7  node3
192.168.50.8  node4
192.168.50.9  node5
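The addresses above follow the rule used in the Vagrantfile later in this article: node i gets 192.168.50.(i + 4) and bastion gets 192.168.50.20. As a small sketch, the entries can be generated instead of typed. The function name `hosts_entries` is hypothetical. The listing above also has entries for node3 through node5, which presumably only matter if you raise the node count.

```shell
#!/usr/bin/env bash
# Sketch: print /etc/hosts lines for bastion plus N Cassandra nodes.
# Mirrors the Vagrantfile: node i gets 192.168.50.(i + 4), bastion gets .20.
# hosts_entries is a hypothetical helper name.

hosts_entries() {
    node_count="${1:-3}"
    echo "192.168.50.20  bastion"
    i=0
    while [ "$i" -lt "$node_count" ]; do
        echo "192.168.50.$((i + 4))  node$i"
        i=$((i + 1))
    done
}

# Print the entries for the three-node test cluster.
hosts_entries 3
```

To append them, you could pipe the output through `sudo tee -a /etc/hosts`, after checking that the entries are not already there.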

We can use ssh-keyscan just like we did before to add these hosts to our known_hosts file.

Add keys to known_hosts to avoid prompts

$ ssh-keyscan node0 node1 node2  >> ~/.ssh/known_hosts
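Running the ssh-keyscan line more than once appends duplicate entries. Here is a hedged sketch of an idempotent variant: it asks `ssh-keygen -F` whether a host is already known and only scans when it is not. The helper name `add_known_host` and the `KNOWN_HOSTS` variable are hypothetical. Trusting whatever key the scan returns is reasonable for a local Vagrant cluster, but it is an assumption you should revisit for real servers.

```shell
#!/usr/bin/env bash
# Sketch: add a host key to known_hosts only when the host is not listed yet.
# add_known_host and KNOWN_HOSTS are hypothetical names for this example.

KNOWN_HOSTS="${KNOWN_HOSTS:-$HOME/.ssh/known_hosts}"

add_known_host() {
    host="$1"
    # ssh-keygen -F looks the host up; a zero exit status means it was found.
    if ssh-keygen -F "$host" -f "$KNOWN_HOSTS" >/dev/null 2>&1; then
        echo "$host: already present"
        return 0
    fi
    # Scan first, then append only if the scan returned a key.
    scanned="$(ssh-keyscan "$host" 2>/dev/null)"
    if [ -n "$scanned" ]; then
        printf '%s\n' "$scanned" >> "$KNOWN_HOSTS"
        echo "$host: added"
    else
        echo "$host: no key returned" >&2
        return 1
    fi
}
```

A loop such as `for h in node0 node1 node2; do add_known_host "$h"; done` then covers the test cluster and is safe to rerun.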

Then just like before we can start up an ssh-agent and add our keys.

Start ssh-agent and add keys

$ ssh-agent bash
$ ssh-add ~/.ssh/test_rsa

Notice that the ansible.cfg and inventory.ini files are a bit different than on our bastion server because we have to add the user name.

Notice the ansible.cfg file and inventory.ini file in the project dir

$ cd ~/github/cassandra-image

$ cat ansible.cfg
[defaults]
hostfile = inventory.ini

$ cat inventory.ini
[nodes]
node0 ansible_user=vagrant
node1 ansible_user=vagrant
node2 ansible_user=vagrant

Ansible picks up the ansible.cfg in the current directory, which in turn points it at inventory.ini.
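If you change the number of nodes, both files can be regenerated rather than edited by hand. The sketch below writes them into a scratch directory. The function name `write_ansible_files` and its parameters are hypothetical. It uses the `inventory` key, which is the spelling documented by current Ansible releases; the project file shown above uses the older `hostfile` key, so treat the difference as something to check against your Ansible version.

```shell
#!/usr/bin/env bash
# Sketch: generate ansible.cfg and inventory.ini for N nodes.
# write_ansible_files is a hypothetical helper, not part of the project.

write_ansible_files() {
    target_dir="$1"
    node_count="${2:-3}"
    remote_user="${3:-vagrant}"

    # "inventory" is the current key name; the project file uses "hostfile".
    printf '[defaults]\ninventory = inventory.ini\n' > "$target_dir/ansible.cfg"

    {
        echo "[nodes]"
        i=0
        while [ "$i" -lt "$node_count" ]; do
            echo "node$i ansible_user=$remote_user"
            i=$((i + 1))
        done
    } > "$target_dir/inventory.ini"
}

# Write into a scratch directory so nothing in the project is overwritten.
demo_dir="$(mktemp -d)"
write_ansible_files "$demo_dir" 3
cat "$demo_dir/inventory.ini"
```

Running ansible from inside that directory should then resolve the nodes group in the same way as the hand-written files.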

From the project directory, you should be able to ping node0 and all of the nodes just like before.

Ping node0 with ansible.

Ansible Ping Cassandra Database node

$ ansible node0 -m ping

Output

node0 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Ping all of the Cassandra nodes with ansible.

Ansible Ping All Cassandra Database Cluster nodes

$ ansible nodes  -m ping

Output

node0 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
node2 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
node1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

In one of the next tutorials, we cover how to set up ~/.ssh/config so you don’t have to remember to use ssh-agent.
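As a preview, and only as a rough sketch of what such a config could look like for this test cluster, the snippet below writes a candidate config to a scratch file rather than to ~/.ssh/config. The host list, user and key path come from this article; the scratch-file approach and the variable name `ssh_config_file` are choices made for this example. The later tutorial may set things up differently.

```shell
#!/usr/bin/env bash
# Sketch: a candidate ssh config for the Vagrant test cluster, written to a
# scratch file so your real ~/.ssh/config is left untouched.

ssh_config_file="$(mktemp)"

cat > "$ssh_config_file" <<'EOF'
Host bastion node0 node1 node2
    User vagrant
    IdentityFile ~/.ssh/test_rsa
    IdentitiesOnly yes
EOF

echo "wrote $ssh_config_file"
```

You can try it without changing anything permanent by running `ssh -F "$ssh_config_file" node0`. Once you are happy with it, the same stanza can go into ~/.ssh/config.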

Using ansible to run nodetool on Cassandra Cluster nodes

You may recall from the first Cassandra tutorial in this series that we would log into the servers (vagrant ssh node0) and then check that they could see the other nodes with nodetool describecluster. We can run this command against all three servers (from bastion or from our dev laptop) with ansible.

Let’s use ansible to run describecluster against all of the nodes.

Ansible running nodetool describecluster against all Cassandra Cluster nodes

$ ansible nodes -a "/opt/cassandra/bin/nodetool describecluster"

This command allows us to check the status of every Cassandra node quickly.
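One thing worth checking in the describecluster output is the list of schema versions: a healthy cluster normally reports a single version shared by all nodes. The sketch below counts the version lines. The helper name `count_schema_versions` is hypothetical, and the sample text is illustrative rather than captured from this cluster.

```shell
#!/usr/bin/env bash
# Sketch: count distinct schema versions in nodetool describecluster output.
# count_schema_versions is a hypothetical helper; the sample text below is
# illustrative and was not captured from the tutorial's cluster.

count_schema_versions() {
    # Each schema version is printed as "<uuid>: [ip, ip, ...]".
    grep -Ec '^[[:space:]]*[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}: \['
}

sample_output='Cluster Information:
    Name: test
    Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        86afa796-d883-3932-aa73-6b017cef0d19: [192.168.50.4, 192.168.50.5, 192.168.50.6]'

versions="$(printf '%s\n' "$sample_output" | count_schema_versions)"
if [ "$versions" -eq 1 ]; then
    echo "schema agreement: 1 version"
else
    echo "schema disagreement: $versions versions"
fi
```

Against the real cluster you could pipe a single node's output into the helper, for example `ansible node0 -a "/opt/cassandra/bin/nodetool describecluster" | count_schema_versions`. One node is enough because each node reports its view of the whole cluster; piping the output of all nodes would count the same version several times.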

Let’s say that we wanted to update a schema or do a rolling restart of our Cassandra cluster nodes, which could be a very common task. Perhaps before the update, we want to decommission the node and back things up. To do this sort of automation, we could create an Ansible playbook.

Ansible playbooks are more powerful than ad-hoc task execution and are especially useful for managing a Cassandra cluster.

Ansible playbooks allow for configuration management and multi-machine deployment to manage complex tasks like a rolling Cassandra upgrade or Cassandra schema updates or perhaps a weekly backup of Cassandra nodes.

Ansible playbooks are declarative configurations that orchestrate many steps into a single repeatable task. This automation replaces a lot of manually ordered process and supports an immutable infrastructure (especially when combined with HashiCorp Packer, HashiCorp Terraform and AWS CloudFormation).
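To make the rolling-restart idea concrete, here is a rough sketch that writes a hypothetical playbook to a scratch directory. It is not part of the project. It assumes Cassandra runs as a service named cassandra, that each node can resolve its own inventory name, and that clients connect on the default native transport port 9042. The `serial: 1` setting is what makes the restart rolling, because Ansible finishes all tasks on one node before starting the next.

```shell
#!/usr/bin/env bash
# Sketch: write a hypothetical rolling-restart playbook to a scratch dir.
# The service name and the wait_for host are assumptions, not project facts.

playbook_dir="$(mktemp -d)"

cat > "$playbook_dir/rolling-restart.yml" <<'EOF'
---
- hosts: nodes
  gather_facts: no
  remote_user: vagrant
  become: yes
  serial: 1            # finish one node before starting the next

  tasks:

  - name: Flush memtables and stop accepting connections on this node
    command: /opt/cassandra/bin/nodetool drain

  - name: Restart the Cassandra service (service name is an assumption)
    service:
      name: cassandra
      state: restarted

  - name: Wait for the CQL native transport port before moving on
    wait_for:
      host: "{{ inventory_hostname }}"
      port: 9042
      timeout: 300
EOF

echo "wrote $playbook_dir/rolling-restart.yml"
```

Before running anything like this against a cluster you care about, `ansible-playbook --syntax-check` on the generated file and a trial run against the Vagrant nodes are sensible first steps.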

Cloudurable provides Cassandra training, Cassandra consulting, Cassandra support and helps setting up Cassandra clusters in AWS.

Our describe-cluster playbook for Cassandra Database Cluster nodes

Creating a complicated playbook is beyond the scope of this article. But let’s create a simple playbook and execute it. This playbook will run the nodetool describecluster on each Cassandra node.

Here is our playbook that runs Cassandra nodetool describecluster on each Cassandra node in our cluster.

playbooks/describe-cluster.yml - simple ansible playbook that runs Cassandra nodetool describecluster

---
- hosts: nodes
  gather_facts: no
  remote_user: vagrant

  tasks:

  - name: Run NodeTool Describe Cluster command against each Cassandra Cluster node
    command: /opt/cassandra/bin/nodetool describecluster

To run this, we use ansible-playbook as follows.

Running describe-cluster playbook to describe our Cassandra cluster

$ ansible-playbook playbooks/describe-cluster.yml --verbose
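The --verbose flag is what makes the command output visible here. As an alternative sketch, a playbook can capture the output with `register` and print it with the `debug` module, so no flag is needed. The file name describe-cluster-report.yml and the variable name `describe_result` are hypothetical; the snippet writes the playbook to a scratch directory rather than into the project.

```shell
#!/usr/bin/env bash
# Sketch: a variant of the describe-cluster playbook that prints the command
# output itself. File and variable names are hypothetical.

playbook_dir="$(mktemp -d)"

cat > "$playbook_dir/describe-cluster-report.yml" <<'EOF'
---
- hosts: nodes
  gather_facts: no
  remote_user: vagrant

  tasks:

  - name: Run nodetool describecluster on each Cassandra node
    command: /opt/cassandra/bin/nodetool describecluster
    register: describe_result
    changed_when: false      # read-only command, so never report "changed"

  - name: Show the command output per node
    debug:
      var: describe_result.stdout_lines
EOF

echo "wrote $playbook_dir/describe-cluster-report.yml"
```

Setting `changed_when: false` keeps the play recap honest, since describecluster does not modify the node.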

Between this article and the last, we modified our Vagrantfile quite a bit. It now uses a for loop to create the Cassandra nodes, and it uses ansible provisioning.

For completeness, here is our updated Vagrantfile.

Complete code listing of Vagrantfile that sets up our DevOps/DBA Cassandra Database Cluster

# -*- mode: ruby -*-
# vi: set ft=ruby :

numCassandraNodes = 3

Vagrant.configure("2") do |config|


  config.vm.box = "centos/7"


  # Define Cassandra Nodes
  (0..numCassandraNodes-1).each do |i|

        port_number = i + 4
        ip_address = "192.168.50.#{port_number}"
        seed_addresses = "192.168.50.4,192.168.50.5,192.168.50.6"
        config.vm.define "node#{i}" do |node|
            node.vm.network "private_network", ip: ip_address
            node.vm.provider "virtualbox" do |vb|
                   vb.memory = "2048"
                   vb.cpus = 4
            end


            node.vm.provision "shell", inline: <<-SHELL

                sudo /vagrant/scripts/000-vagrant-provision.sh



                sudo /opt/cloudurable/bin/cassandra-cloud -cluster-name test \
                -client-address     #{ip_address} \
                -cluster-address    #{ip_address} \
                -cluster-seeds      #{seed_addresses}

            SHELL

            node.vm.provision "ansible" do |ansible|
                  ansible.playbook = "playbooks/ssh-addkey.yml"
            end
        end
  end


  # Define Bastion Node
  config.vm.define "bastion" do |node|
            node.vm.network "private_network", ip: "192.168.50.20"
            node.vm.provider "virtualbox" do |vb|
                   vb.memory = "256"
                   vb.cpus = 1
            end


            node.vm.provision "shell", inline: <<-SHELL
                yum install -y epel-release
                yum update -y
                yum install -y  ansible

                mkdir -p /home/vagrant/resources
                cp -r /vagrant/resources/* /home/vagrant/resources/

                mkdir -p ~/resources
                cp -r /vagrant/resources/* ~/resources/

                mkdir  -p  /home/vagrant/.ssh/
                cp /vagrant/resources/server/certs/*  /home/vagrant/.ssh/

                sudo  /vagrant/scripts/002-hosts.sh

                ssh-keyscan node0 node1 node2  >> /home/vagrant/.ssh/known_hosts


                mkdir -p ~/playbooks
                cp -r /vagrant/playbooks/* ~/playbooks/
                sudo cp /vagrant/resources/home/inventory.ini /etc/ansible/hosts
                chown -R vagrant:vagrant /home/vagrant
            SHELL


  end



  #
  # View the documentation for the provider you are using for more
  # information on available options.

  # Define a Vagrant Push strategy for pushing to Atlas. Other push strategies
  # such as FTP and Heroku are also available. See the documentation at
  # https://docs.vagrantup.com/v2/push/atlas.html for more information.
  config.push.define "atlas" do |push|
     push.app = "cloudurable/cassandra"
  end


end

It has come a long way.

Part 1 & 2 Conclusion

In part 1, we set up Ansible for our Cassandra Database Cluster to automate common DevOps/DBA tasks. We created an ssh key and then set up our instances with this key so we could use ssh, scp, and ansible. We set up a bastion server with Vagrant. We used an ansible playbook (ssh-addkey.yml) from Vagrant to install our test cluster key on each server.

In part 2, we ran ansible ping against a single server. We ran ansible ping against many servers (nodes). We set up our local dev machine with ansible.cfg and inventory.ini so we could run ansible commands directly against node0 and nodes. We ran nodetool describecluster against all of the nodes from our dev machine. Lastly, we created a very simple playbook that can run nodetool describecluster. Ansible is a very powerful tool that can help you manage a cluster of Cassandra instances. In later articles, we will use Ansible to create more complex playbooks like backing up Cassandra nodes to S3.

In the next Cassandra cluster tutorial, we cover AWS Cassandra

The next tutorial picks up where this one left off. It covers AWS Cassandra, cloud DevOps, and using Packer, Ansible/SSH and the AWS command line tools to create and manage EC2 Cassandra instances in AWS. It is useful for developers and DevOps/DBA staff who want to create AWS AMI images and manage those EC2 instances with Ansible.

Check out our new GoLang course. We provide onsite Go Lang training which is instructor led.

The AWS Cassandra tutorial covers:

  • Creating images (EC2 AMIs) with Packer
  • Using Packer from Ansible to provision an image (AWS AMI)
  • Installing systemd services that depend on other services and will auto-restart on failure
  • AWS command line tools to launch an EC2 instance
  • Setting up ansible to manage our EC2 instance (ansible uses ssh)
  • Setting up an ssh-agent and adding ssh identities (ssh-add)
  • Setting up ssh using ~/.ssh/config so we don’t have to pass credentials around
  • Using ansible dynamic inventory with EC2
  • AWS command line tools to manage DNS entries with Route 53

If you are doing DevOps with AWS, Ansible dynamic inventory management with EC2 is awesome. Also mastering ssh config is a must. You should also master the AWS command line tools to automate common tasks. This next article explores all of those topics.

About Cloudurable™

Cloudurable™: streamline DevOps/DBA for Cassandra running on AWS. Cloudurable™ provides AMIs, CloudWatch Monitoring, CloudFormation templates and monitoring tools to support Cassandra in production running in EC2. We also teach advanced Cassandra courses which teach how one could develop, support and deploy Cassandra to production in AWS EC2 for Developers and DevOps/DBA.

Follow Cloudurable™ at our LinkedIn page, Facebook page, Google plus or Twitter.

More info

Please take some time to read the Advantages of using Cloudurable™.

Authors

Written by R. Hightower and JP Azar.

Feedback


We hope you enjoyed this article. Please provide feedback.


                                                                           
