Kafka Tutorial

Kafka Architecture: Producers

Kafka Producer Architecture - Picking the partition of records This article covers some lower level details of Kafka producer architecture. It is a continuation of the Kafka Architecture and Kafka Topic Architecture articles. This article covers Kafka Producer Architecture with a discussion of how a partition is chosen, producer cadence, and partitioning strategies. Kafka Producers Kafka producers send records to topics. The records are sometimes referred to as messages.

Continue reading

Kafka Topic Architecture

Kafka Topic Architecture - Replication, Failover and Parallel Processing This article covers some lower level details of Kafka topic architecture. It is a continuation of the Kafka Architecture article. This article covers Kafka Topic’s Architecture with a discussion of how partitions are used for fail-over and parallel processing. Kafka Topics, Logs, Partitions Recall that a Kafka topic is a named stream of records. Kafka stores topics in logs. A topic log is broken up into partitions.

Continue reading

Kafka vs. JMS

Kafka vs JMS, SQS, RabbitMQ Messaging Is Kafka a queue or a publish and subscribe system? Yes. It can be both. Kafka is like a queue for consumer groups, which we cover later. Basically, Kafka is a queue system per consumer group so it can do load balancing like JMS, RabbitMQ, etc. Kafka is like topics in JMS, RabbitMQ, and other MOM systems for multiple consumer groups. Kafka has topics and producers publish to the topics and the subscribers (Consumer Groups) read from the topics.

Continue reading

Kafka Architecture

If you are not sure what Kafka is, see What is Kafka?. Kafka Architecture Kafka consists of Records, Topics, Consumers, Producers, Brokers, Logs, Partitions, and Clusters. Records can have key (optional), value and timestamp. Kafka Records are immutable. A Kafka Topic is a stream of records ("/orders", "/user-signups"). You can think of a Topic as a feed name. A topic has a Log which is the topic’s storage on disk.

Continue reading

The Kafka Ecosystem - Kafka Core, Kafka Streams, Kafka Connect, Kafka REST Proxy, and the Schema Registry

The Kafka Ecosystem - Kafka Core, Kafka Streams, Kafka Connect, Kafka REST Proxy, and the Schema Registry The core of Kafka is the brokers, topics, logs, partitions, and cluster. The core also consists of related tools like MirrorMaker. The aforementioned is Kafka as it exists in Apache. The Kafka ecosystem consists of Kafka Core, Kafka Streams, Kafka Connect, Kafka REST Proxy, and the Schema Registry. Most of the additional pieces of the Kafka ecosystem comes from Confluent and is not part of Apache.

Continue reading

What is Apache Kafka?

What is Kafka? Kafka’s growth is exploding, more than 1⁄3 of all Fortune 500 companies use Kafka. These companies includes the top ten travel companies, 7 of top ten banks, 8 of top ten insurance companies, 9 of top ten telecom companies, and much more. LinkedIn, Microsoft and Netflix process four comma messages a day with Kafka (1,000,000,000,000). Kafka is used for real-time streams of data, used to collect big data or to do real time analysis or both).

Continue reading

Kafka, Avro Serialization and the Schema Registry

Kafka Tutorial: Kafka, Avro Serialization and the Schema Registry Confluent Schema Registry stores Avro Schemas for Kafka producers and consumers. The Schema Registry and provides RESTful interface for managing Avro schemas It allows the storage of a history of schemas which are versioned. the Confluent Schema Registry supports checking schema compatibility for Kafka. Cloudurable provides Kafka training, Kafka consulting, Kafka support and helps setting up Kafka clusters in AWS.

Continue reading

Understanding Apache Avro: Avro Introduction for Big Data and Data Streaming Architectures

Avro Introduction for Big Data and Data Streaming Architectures Apache Avro™ is a data serialization system. Avro provides data structures, binary data format, container file format to store persistent data, and provides RPC capabilities. Avro does not require code generation to use and integrates well with JavaScript, Python, Ruby, C, C#, C++ and Java. Avro gets used in the Hadoop ecosystem as well as by Kafka. Avro is similar to Thrift, Protocol Buffers, JSON, etc.

Continue reading

                                                                           

Apache Spark Training
Kafka Tutorial
Akka Consulting
Cassandra Training
AWS Cassandra Database Support
Kafka Support Pricing
Cassandra Database Support Pricing
Non-stop Cassandra
Watchdog
Advantages of using Cloudurable™
Cassandra Consulting
Cloudurable™| Guide to AWS Cassandra Deploy
Cloudurable™| AWS Cassandra Guidelines and Notes
Free guide to deploying Cassandra on AWS
Kafka Training
Kafka Consulting
DynamoDB Training
DynamoDB Consulting
Kinesis Training
Kinesis Consulting
Kafka Tutorial PDF
Kubernetes Security Training
Redis Consulting
Redis Training
ElasticSearch / ELK Consulting
ElasticSearch Training
InfluxDB/TICK Training TICK Consulting