Introduction to BigData Analytics with Apache Spark Part 1 By Fadi Maalouli and R.H. Spark Overview Apache Spark, an open source cluster computing system, is growing fast. Apache Spark has a growing ecosystem of libraries and framework to enable advanced data analytics. Apache Spark’s rapid success is due to its power and and ease-of-use. It is more productive and has faster runtime than the typical MapReduce BigData based analytics. Apache Spark provides in-memory, distributed computing.
Analytics with Apache Spark Tutorial Part 2 : Spark SQL Using Spark SQL from Python and Java Combining Cassandra and Spark By Fadi Maalouli and R.H. Spark, a very powerful tool for real-time analytics, is very popular. In the first part of this series on Spark we introduced Spark. We covered Spark’s history, and explained RDDs (which are used to partition data in the Spark cluster). We also covered the Apache Spark Ecosystem.
In this part of Spark’s tutorial (part 3), we will introduce two important components of Spark’s Ecosystem: Spark Streaming and MLlib. Display - Edit Spark Streaming By Fadi Maalouli and R.H. Spark Streaming is a real-time processing tool, that has a high level API, is fault tolerant, and is easy to integrate with SQL DataFrames and GraphX. On a high level Spark Streaming works by running receivers that receive data from for example S3, Cassandra, Kafka etc… and it divides these data into blocks, then pushes these blocks into Spark, then Spark will work with these blocks of data as RDDs, from here you get your results.
Setting up client and cluster SSL transport for a Cassandra cluster This articles is a Cassandra tutorial on Cassandra setup for SSL and CQL clients, as well as installing Cassandra with SSL configured on a series of Linux servers. Cassandra allows you to secure the client transport (CQL) as well as the cluster transport (storage transport). SSL/TLS have some overhead. This is especially true in the JVM world which is not as performant for handling SSL/TLS unless you are using Netty/OpenSSl integration.
Setting up a Cassandra cluster with cassandra image and cassandra cloud project with Vagrant for DevOps
The cassandra-image project creates CentOS Cassandra Database images for docker, virtualbox/vagrant and AWS/EC2 using best practices for Cassandra OS setup. It is nice to use vagrant and/or docker for local development. At this time it is hard to develop systemd services using Docker so we use Vagrant. Since we do a lot of that, we like to use Vagrant. Vagrant is important for developers and DevOps not to mention Cassandra DBAs.
Apache Spark Training
AWS Cassandra Database Support
Kafka Support Pricing
Cassandra Database Support Pricing
Advantages of using Cloudurable™
Cloudurable™| Guide to AWS Cassandra Deploy
Cloudurable™| AWS Cassandra Guidelines and Notes
Free guide to deploying Cassandra on AWS
Kafka Tutorial PDF
ElasticSearch / ELK Consulting
InfluxDB/TICK Training TICK Consulting