Apache Kafka – What Is It?

The Kafka project, created by LinkedIn and adopted by the Apache Software Foundation in 2012, is a publish-subscribe distributed messaging system. This post provides an overview of Kafka, focusing on the ideas of producers, topics, brokers, and consumers.
Introduction to Kafka
Kafka is a high-throughput, partitioned, scalable log system written in Scala. It was originally created by LinkedIn to handle live feeds of activity data at very large scale, and was later open-sourced so that other organizations could adopt it as well. As in other messaging systems, messages are written to and read from a server, but Kafka’s clustered design makes this much faster.
Kafka is a “publish-subscribe distributed messaging system” rather than a “queue” system: a message sent by a producer is broadcast to every subscribed consumer group rather than delivered to a single consumer.
Architecture of Kafka
Now that we have reviewed its history, let’s look at the architecture of Kafka. These are the fundamental terms associated with Kafka architecture: producer, broker, topic, consumer, and so on.
Producer:
Different producers such as applications, DBMSs, and NoSQL stores write data into the Kafka cluster. The Kafka cluster is made up of many “brokers”; in layman’s terms, each “broker” is a “server”. Each message can be assigned a key, which ensures that all messages with the same key reach the same partition. The producer keeps sending messages to the Kafka cluster without waiting for an acknowledgement. This asynchronous way of producing and appending messages is what gives Kafka its incredible speed, a necessity in today’s social media world.
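For illustration, here is a minimal Java producer sketch that sends a keyed message without waiting for broker acknowledgement (acks=0). The broker address localhost:9092, the topic name activity-feed, and the key user-42 are placeholder values for this example, not details from the post.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        // acks=0 means "fire and forget": the producer does not wait for acknowledgement.
        props.put("acks", "0");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Messages that share the key "user-42" always land in the same partition.
            producer.send(new ProducerRecord<>("activity-feed", "user-42", "clicked-home-page"));
        }
    }
}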
Topic:
Messages of the same type are grouped into a “topic”. A topic is similar to a file or folder structure: messages are published to a topic, and each topic is divided into one or more partitions.
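As a rough sketch, a topic can also be created from code with Kafka’s AdminClient. The broker address, the topic name activity-feed, and the choice of 3 partitions with replication factor 2 are assumptions made for illustration only.

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // A topic with 3 partitions; replication factor 2 assumes at least two brokers.
            NewTopic topic = new NewTopic("activity-feed", 3, (short) 2);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}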
Brokers:
Kafka’s “broker”, as it is called, is very similar to a traditional message broker. It holds the messages written by the producer until they are consumed by the consumer.
A Kafka cluster has many “brokers”, or “servers”. Each broker holds one or more partitions, and, as mentioned, each partition belongs to a topic. The messages received by the brokers are stored on them for “n” days, a retention period that can be configured. Once the “n” days expire, the messages are discarded; Kafka does not verify whether each consumer has read them.
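The retention period is a configuration setting. As one hedged example, the sketch below uses the AdminClient to set retention.ms to 7 days on the hypothetical activity-feed topic; the broker address is again an assumption.

import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class SetRetention {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topicResource =
                new ConfigResource(ConfigResource.Type.TOPIC, "activity-feed");
            // Keep messages for 7 days (in milliseconds); older segments are deleted.
            AlterConfigOp op = new AlterConfigOp(
                new ConfigEntry("retention.ms", String.valueOf(7L * 24 * 60 * 60 * 1000)),
                AlterConfigOp.OpType.SET);
            Collection<AlterConfigOp> ops = Collections.singletonList(op);
            Map<ConfigResource, Collection<AlterConfigOp>> updates =
                Collections.singletonMap(topicResource, ops);
            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}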
Consumer:
Consumers read messages from the Kafka brokers after the producers have published them. Each “consumer” (or “consumer group”) subscribes to one or more “topics” and reads from the “partitions” of those topics. If one broker goes down, the other brokers take over its partitions so that everything keeps running smoothly.
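For illustration, here is a minimal Java consumer that joins a hypothetical consumer group (activity-readers), subscribes to the assumed activity-feed topic, and prints whatever it reads. Consumers that share the same group.id split the topic’s partitions between them.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "activity-readers");        // hypothetical consumer group
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("auto.offset.reset", "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("activity-feed"));
            while (true) {
                // Poll the brokers and print each record's partition, key, and value.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d key=%s value=%s%n",
                            record.partition(), record.key(), record.value());
                }
            }
        }
    }
}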
ZooKeeper:
ZooKeeper is responsible for coordinating all the components of the Kafka cluster. The producer hands each message to the partition’s “leader” broker, which replicates it onto the other brokers. Kafka has been adopted by many organizations, including LinkedIn, Yahoo!, Twitter, Pinterest, and Tumblr.
This post provided an overview of Kafka and its architecture. As time passes, Kafka will be adopted by even more organizations. More information is available at kafka.apache.org.
