Blogs

What is Kafka? Kafka's growth is exploding: more than one third of all Fortune 500 companies use Kafka. These companies include the top ten travel companies, 7 of the top ten banks, 8 of the top ten insurance companies, 9 of the top ten telecom companies, and many more. LinkedIn, Microsoft and Netflix each process "four-comma" message counts a day with Kafka (1,000,000,000,000). Kafka is used for real-time streams of data, to collect big data, to do real-time analysis, or both.

Kafka, Avro Serialization and the Schema Registry

in Schema Registry

May 9, 2017

Kafka Tutorial: Kafka, Avro Serialization and the Schema Registry. The Confluent Schema Registry stores Avro schemas for Kafka producers and consumers. It provides a RESTful interface for managing Avro schemas and stores a versioned history of them. The Confluent Schema Registry also supports checking schema compatibility for Kafka. Cloudurable provides Kafka training, Kafka consulting, and Kafka support, and helps with setting up Kafka clusters in AWS.
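To make the idea of a schema compatibility check concrete, here is a minimal sketch. It is not the Schema Registry's implementation or API. It only illustrates one Avro schema-resolution rule: a field added in a newer schema needs a default if the new schema is to read data written with the old one. The `Employee` schemas and both function names are made up for illustration.

```python
# Hypothetical schemas, written as the Python equivalent of Avro's JSON form.
SCHEMA_V1 = {
    "type": "record",
    "name": "Employee",
    "fields": [
        {"name": "firstName", "type": "string"},
        {"name": "lastName", "type": "string"},
    ],
}

# Version 2 adds an optional field WITH a default value.
SCHEMA_V2 = {
    "type": "record",
    "name": "Employee",
    "fields": SCHEMA_V1["fields"]
    + [{"name": "email", "type": ["null", "string"], "default": None}],
}

# A variant that adds the same field WITHOUT a default value.
SCHEMA_V2_NO_DEFAULT = {
    "type": "record",
    "name": "Employee",
    "fields": SCHEMA_V1["fields"] + [{"name": "email", "type": "string"}],
}


def added_fields_without_default(old_schema, new_schema):
    """Return the names of fields added in new_schema that lack a default."""
    old_names = {field["name"] for field in old_schema["fields"]}
    return [
        field["name"]
        for field in new_schema["fields"]
        if field["name"] not in old_names and "default" not in field
    ]


def can_read_old_data(old_schema, new_schema):
    """Simplified check: every added field must carry a default."""
    return not added_fields_without_default(old_schema, new_schema)


if __name__ == "__main__":
    print(can_read_old_data(SCHEMA_V1, SCHEMA_V2))
    print(added_fields_without_default(SCHEMA_V1, SCHEMA_V2_NO_DEFAULT))
```

A real compatibility check also has to consider type changes, removed fields, and aliases, which this sketch deliberately ignores.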

Understanding Apache Avro: Avro Introduction for Big Data and Data Streaming Architectures

in avro

May 8, 2017

Avro Introduction for Big Data and Data Streaming Architectures. Apache Avro™ is a data serialization system. Avro provides data structures, a binary data format, a container file format for storing persistent data, and RPC capabilities. Avro does not require code generation and integrates well with JavaScript, Python, Ruby, C, C#, C++ and Java. Avro is used in the Hadoop ecosystem as well as by Kafka. Avro is similar to Thrift, Protocol Buffers, JSON, etc.

Kinesis vs. Kafka

in Kinesis

May 5, 2017

Kinesis vs. Kafka. Kinesis works with streaming data: stock prices, game data (scores from a game), social network data, geospatial data (like Uber data about where you are), and IoT sensors. Kafka works with streaming data too. Kinesis Streams is like Kafka Core. Kinesis Analytics is like Kafka Streams. A Kinesis Shard is like a Kafka Partition. They are similar and get used in similar use cases. By default, data is stored in Kinesis for 24 hours, and you can increase that to up to 7 days.
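One way to picture why a shard resembles a partition: in both systems a record's key decides which slot of the stream it lands in, so records with the same key stay together. The sketch below is only an illustration of key-based routing. It hashes the key with MD5 and takes the result modulo the slot count, which is not the exact algorithm of either system. The function name `route` and the sample keys are assumptions.

```python
import hashlib


def route(key: str, slots: int) -> int:
    """Map a record key to one of `slots` shards/partitions (illustrative only)."""
    if slots <= 0:
        raise ValueError("slots must be positive")
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Interpret the 128-bit digest as an integer, then fold it into the slot range.
    return int.from_bytes(digest, "big") % slots


if __name__ == "__main__":
    # The same key always maps to the same slot, which preserves per-key ordering.
    for ticker in ["AAPL", "GOOG", "AAPL", "MSFT"]:
        print(ticker, "->", route(ticker, 4))
```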

Kafka Broker Startup Scripts

in Kafka Training

April 27, 2017

Running a Kafka Broker. Starting brokers in Kafka is pretty straightforward; here are some simple quick start instructions. But as developers, we want to do at least a little more than just the basics. For instance, my first needs were to start multiple brokers on the same machine and to enable JMX. Out of the box, you can simply rely on the supplied server.properties. Each broker needs a unique id and a unique port.
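As a rough sketch of what "a unique id and a unique port" means for several brokers on one machine, the snippet below generates per-broker property overrides. The property names `broker.id`, `listeners` and `log.dirs` and the `JMX_PORT` environment variable are standard Kafka settings. The base ports, the log directory root, and the helper names are assumptions chosen for illustration, and giving each broker its own log directory is an addition of this sketch rather than something the excerpt states.

```python
def broker_overrides(count, base_port=9092, base_jmx_port=9999,
                     log_root="/tmp/kafka-logs"):
    """Build per-broker settings so ids, ports and log dirs never collide."""
    brokers = []
    for i in range(count):
        properties = {
            "broker.id": str(i),                           # unique broker id
            "listeners": f"PLAINTEXT://:{base_port + i}",  # unique client port
            "log.dirs": f"{log_root}-{i}",                 # assumed: separate data dir
        }
        # JMX is enabled by exporting JMX_PORT before starting the broker.
        env = {"JMX_PORT": str(base_jmx_port + i)}
        brokers.append({"properties": properties, "env": env})
    return brokers


def to_properties_text(properties):
    """Render a dict as key=value lines, the format server.properties uses."""
    return "\n".join(f"{key}={value}" for key, value in properties.items()) + "\n"


if __name__ == "__main__":
    for broker in broker_overrides(3):
        print("# env:", broker["env"])
        print(to_properties_text(broker["properties"]))
```

Each rendered block could be appended to a copy of the supplied server.properties, one copy per broker.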

Kafka Tutorial with Examples

in Kafka Training

April 17, 2017

Kafka Tutorial. A tutorial for the Kafka streaming platform. It covers Kafka architecture with some small examples from the command line. Then we expand on this with a multi-server example. Lastly, we add some simple Java client examples for a Kafka Producer and a Kafka Consumer. We have started to expand on the Java examples to correlate with the design discussion of Kafka. We have also expanded on the Kafka design section and added references.

AWS Cassandra Cluster Tutorial 5: Setting up Cassandra Cluster in AWS/EC2

in Cassandra

April 8, 2017

Cassandra Cluster Tutorial 5 - Cassandra AWS Cluster with CloudFormation, bastion host, Ansible and the AWS command line. This Cassandra tutorial is useful for developers and DevOps/DBA staff who want to launch a Cassandra cluster in AWS. The cassandra-image project has been using Vagrant and Ansible to set up a Cassandra cluster for local testing. Then we used Packer, Ansible and EC2. We used Packer to create AWS images in the last tutorial.

Configuring metricsd to setup a disk alarm

in metricsd

April 7, 2017

What is MetricsD? MetricsD is a golang program that gathers metrics from an AWS EC2 instance and reports these metrics to places such as AWS / CloudWatch. Metrics collected include disk space, CPU activity, memory allocation, and Cassandra KPIs. MetricsD is most often run as a systemd process. The Disk Gatherer reports disk state information to AWS / CloudWatch, sets alarms in AWS / CloudWatch, or sends emails.

Cassandra AWS System Memory Guidelines

in Cassandra

March 15, 2017

System Memory Guidelines for Cassandra AWS. Basic guidelines for AWS Cassandra: Do not use less than 8GB of memory for the JVM. The more RAM the better. Use G1GC. SSTables are first stored in memory and then written to disk sequentially. The larger the SSTable, the less scanning needs to be done while reading and determining, with a bloom filter, whether a key is in an SSTable. In the EC2 world this equates to an m4.

AWS Cassandra: Cassandra, NUMA and EC2

in Cassandra

March 14, 2017

AWS Cassandra and NUMA. The i3.8xlarge, c4.8xlarge, m4.10xlarge, and above EC2 instance types use more than one CPU, which means NUMA controls are available. A good read on this is Al Tobey's blog post: "The quickest way to tell if a machine is NUMA is to run 'numactl --hardware'." (Al Tobey, blog post on Cassandra tuning.) NUMA stands for Non-Uniform Memory Architecture. Modern x86 CPUs contain an integrated memory controller.
