Kafka Architecture

Kafka Tutorial

This comprehensive Kafka tutorial covers Kafka architecture and design. It includes example Java Kafka producers and Kafka consumers, and it also covers Avro and the Schema Registry.

Kafka Training - Onsite, Instructor-led

Training for DevOps, Architects and Developers

This Kafka course teaches the basics of the Apache Kafka distributed streaming platform, one of the most powerful and widely used reliable streaming platforms. Kafka is fault tolerant and highly scalable, and it is used for log aggregation, stream processing, event sourcing and commit logs. LinkedIn, Yahoo, Twitter, Square, Uber, Box, PayPal, Etsy and more use Kafka for stream processing, online messaging, data collection for big data, in-memory computing (where Kafka provides a distributed commit log) and much more.

Continue reading

Kafka Architecture: Log Compaction

This post picks up where our series on Kafka architecture left off. The series includes Kafka topics architecture, Kafka producer architecture, Kafka consumer architecture and Kafka ecosystem architecture.

This article is heavily inspired by the log compaction part of the design section in the Kafka documentation. You can think of it as the CliffsNotes for Kafka's log compaction design.

Kafka can delete older records based on the time or size of a log. Kafka also supports log compaction, which compacts records by key. With log compaction, Kafka keeps the latest version of the record for each key and deletes the older versions during a compaction pass.
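
The idea of keeping only the latest record per key can be sketched in a few lines of plain Java. This is a minimal in-memory illustration, not Kafka's implementation: the class `LogCompactionSketch`, the `Rec` type, and the `compact` method are hypothetical names made up for this example, and real compaction details such as which part of the log is eligible and how deletes are handled are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LogCompactionSketch {

    /** Hypothetical stand-in for a keyed record stored at a log offset. */
    static final class Rec {
        final long offset;
        final String key;
        final String value;

        Rec(long offset, String key, String value) {
            this.offset = offset;
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return offset + ":" + key + "=" + value;
        }
    }

    /** Keeps only the newest record for each key, in offset order. */
    static List<Rec> compact(List<Rec> log) {
        Map<String, Rec> latest = new LinkedHashMap<>();
        for (Rec r : log) {
            // Remove the older version first so the re-inserted key
            // moves to the end and the result stays ordered by offset.
            latest.remove(r.key);
            latest.put(r.key, r);
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<Rec> log = new ArrayList<>();
        log.add(new Rec(0, "user-1", "a@example.com"));
        log.add(new Rec(1, "user-2", "b@example.com"));
        log.add(new Rec(2, "user-1", "c@example.com"));
        // prints [1:user-2=b@example.com, 2:user-1=c@example.com]
        System.out.println(compact(log));
    }
}
```

In this sketch the surviving records keep the offsets they were written with, so a reader of the compacted log still sees records in their original order, only with the superseded versions missing.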

Continue reading

Kafka Architecture: Low Level

If you are not sure what Kafka is, see "What is Kafka?"

Kafka Architecture: Low-Level Design

This post picks up where our series on Kafka architecture left off. The series includes Kafka topics architecture, Kafka producer architecture, Kafka consumer architecture and Kafka ecosystem architecture.

This article is heavily inspired by the design section of the Kafka documentation. You can think of it as the CliffsNotes.


Kafka Design Motivation

LinkedIn engineering built Kafka to support real-time analytics. Kafka was designed to feed analytics systems that did real-time processing of streams. LinkedIn developed Kafka as a unified platform for real-time handling of streaming data feeds. The goal behind Kafka was to build a high-throughput streaming data platform that supports high-volume event streams like log aggregation, user activity, etc.

Continue reading

Kafka Architecture: Consumers

Kafka Consumer Architecture - Consumer Groups and subscriptions

This article covers some lower-level details of Kafka consumer architecture. It is a continuation of the Kafka Architecture, Kafka Topic Architecture, and Kafka Producer Architecture articles.

This article covers Kafka Consumer Architecture with a discussion of consumer groups, how record processing is shared among the consumers in a group, and failover for Kafka consumers.

Cloudurable provides Kafka training, Kafka consulting, and Kafka support, and helps set up Kafka clusters in AWS.

Continue reading

Kafka Architecture: Producers

Kafka Producer Architecture - Picking the partition of records

This article covers some lower-level details of Kafka producer architecture. It is a continuation of the Kafka Architecture and Kafka Topic Architecture articles.

This article covers Kafka Producer Architecture with a discussion of how a partition is chosen, producer cadence, and partitioning strategies.

Kafka Producers

Kafka producers send records to topics. The records are sometimes referred to as messages. For each record, the producer picks which partition of the topic to send it to. The producer can send records round-robin. The producer could also implement a priority system by sending records to certain partitions based on the priority of the record.
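
The partition choices described above can be sketched in plain Java without a Kafka dependency. Everything here is a hypothetical illustration: the class `PartitionChoiceSketch` and its methods are not part of the Kafka producer API, `String.hashCode` is only a stand-in for whatever hash a real partitioner uses, and reserving partition 0 for high-priority records is just one assumed way to build a priority scheme.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class PartitionChoiceSketch {
    private final int numPartitions;
    private final AtomicInteger counter = new AtomicInteger();

    PartitionChoiceSketch(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
    }

    /** Round-robin: spread records evenly across all partitions. */
    int roundRobin() {
        // floorMod keeps the result non-negative even if the counter overflows.
        return Math.floorMod(counter.getAndIncrement(), numPartitions);
    }

    /** Key-based: the same key always maps to the same partition. */
    int byKey(String key) {
        // String.hashCode is a stand-in hash chosen for this sketch.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /**
     * Priority-based (assumed scheme): partition 0 is reserved for
     * high-priority records, the rest share the remaining partitions.
     */
    int byPriority(boolean highPriority) {
        if (highPriority || numPartitions == 1) {
            return 0;
        }
        return 1 + Math.floorMod(counter.getAndIncrement(), numPartitions - 1);
    }

    public static void main(String[] args) {
        PartitionChoiceSketch chooser = new PartitionChoiceSketch(3);
        System.out.println("round-robin: " + chooser.roundRobin());
        System.out.println("by key:      " + chooser.byKey("order-42"));
        System.out.println("priority:    " + chooser.byPriority(true));
    }
}
```

A dedicated high-priority partition only helps if some consumer reads that partition ahead of the others, so a scheme like this has to be paired with a matching consumer setup.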

Continue reading

Kafka Topic Architecture

Kafka Topic Architecture - Replication, Failover and Parallel Processing

This article covers some lower-level details of Kafka topic architecture. It is a continuation of the Kafka Architecture article.

This article covers Kafka topic architecture with a discussion of how partitions are used for failover and parallel processing.


Kafka Topics, Logs, Partitions

Recall that a Kafka topic is a named stream of records. Kafka stores topics in logs. A topic log is broken up into partitions. Kafka spreads a log's partitions across multiple servers or disks. Think of a topic as a category, stream name or feed.

Continue reading

Kafka Architecture

If you are not sure what Kafka is, see "What is Kafka?"

Kafka consists of Records, Topics, Consumers, Producers, Brokers, Logs, Partitions, and Clusters. A Record has an optional key, a value and a timestamp. Kafka Records are immutable. A Kafka Topic is a stream of records (for example "/orders" or "/user-signups"). You can think of a Topic as a feed name. A Topic has a Log, which is the Topic's storage on disk. A Topic Log is broken up into partitions and segments. The Kafka Producer API is used to produce streams of data records. The Kafka Consumer API is used to consume a stream of records from Kafka. A Broker is a Kafka server that runs in a Kafka Cluster. Kafka Brokers form a cluster: the Kafka Cluster consists of many Kafka Brokers on many servers. The term Broker is sometimes used more loosely to refer to a logical system, or to Kafka as a whole.
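
As a rough mental model, the relationships between records, topics and partitions can be written down as a few plain Java types. This is only an in-memory sketch under stated assumptions: `TopicModelSketch`, `DataRecord`, `Partition` and `Topic` are hypothetical names invented here, not classes from the Kafka client library, and segments, brokers and on-disk storage are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

final class TopicModelSketch {

    /** Immutable record: optional key, value and timestamp. */
    static final class DataRecord {
        private final String key; // null when the record has no key
        final String value;
        final long timestamp;

        DataRecord(String key, String value, long timestamp) {
            this.key = key;
            this.value = value;
            this.timestamp = timestamp;
        }

        Optional<String> key() {
            return Optional.ofNullable(key);
        }
    }

    /** One partition of a topic log: an append-only list of records. */
    static final class Partition {
        private final List<DataRecord> log = new ArrayList<>();

        /** Appends a record and returns its position in this partition. */
        long append(DataRecord record) {
            log.add(record);
            return log.size() - 1;
        }

        List<DataRecord> records() {
            return Collections.unmodifiableList(log);
        }
    }

    /** A named topic whose log is broken up into partitions. */
    static final class Topic {
        final String name;
        final List<Partition> partitions;

        Topic(String name, int partitionCount) {
            List<Partition> parts = new ArrayList<>();
            for (int i = 0; i < partitionCount; i++) {
                parts.add(new Partition());
            }
            this.name = name;
            this.partitions = Collections.unmodifiableList(parts);
        }
    }

    public static void main(String[] args) {
        Topic orders = new Topic("orders", 2);
        long position = orders.partitions.get(0)
                .append(new DataRecord("order-1", "created", System.currentTimeMillis()));
        System.out.println(orders.name + " partition 0, position " + position);
    }
}
```

The position returned by `append` plays the role that Kafka calls an offset: each partition numbers its own records independently of the other partitions.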

Continue reading

The Kafka Ecosystem - Kafka Core, Kafka Streams, Kafka Connect, Kafka REST Proxy, and the Schema Registry

The core of Kafka is the brokers, topics, logs, partitions, and cluster. The core also includes related tools like MirrorMaker. Together, these make up Kafka as it exists in Apache.

The Kafka ecosystem consists of Kafka Core, Kafka Streams, Kafka Connect, Kafka REST Proxy, and the Schema Registry. Most of the additional pieces of the Kafka ecosystem come from Confluent and are not part of Apache.

Continue reading

What is Apache Kafka?

What is Kafka?

Kafka’s growth is exploding: more than one-third of all Fortune 500 companies use Kafka. These companies include the top ten travel companies, 7 of the top ten banks, 8 of the top ten insurance companies, 9 of the top ten telecom companies, and many more. LinkedIn, Microsoft and Netflix process "four-comma" message volumes a day with Kafka (1,000,000,000,000, or one trillion). Kafka is used for real-time streams of data, whether to collect big data, to do real-time analysis, or both. Kafka is used with in-memory microservices to provide durability, and it can be used to feed events to CEP (complex event processing) systems and IoT/IFTTT-style automation systems.

Continue reading
