How to install Apache Kafka on Ubuntu 18.04

Introduction

Apache Kafka is a popular distributed message broker designed to handle large volumes of real-time data efficiently. A Kafka cluster is not only highly scalable and fault-tolerant, but also offers much higher throughput than other message brokers such as ActiveMQ and RabbitMQ. Although it is commonly used as a publish/subscribe messaging system, many organizations also use it for log aggregation because it provides persistent storage for published messages.

A publish/subscribe messaging system allows one or more producers to publish messages without regard to the number of consumers or how the consumers will process the messages. Subscribed clients are notified automatically about updates and the creation of new messages. This is more efficient and scalable than systems in which clients poll periodically to determine whether new messages are available.

In this tutorial, you will install and use Apache Kafka 1.1.0 on Ubuntu 18.04.

Prerequisites

To follow along, you will need:

- An Ubuntu 18.04 server with a non-root user that has sudo privileges.
- At least 4 GB of RAM on the server; with less memory, the Kafka service may fail to start.
- OpenJDK 8 installed on the server, since Kafka is written in Java and requires a JVM.

Step 1 - Create a User for Kafka

Since Kafka handles requests over a network, you should create a dedicated user for it. This minimizes damage to your Ubuntu machine should the Kafka server be compromised. We will create a dedicated kafka user in this step, but you should create a different non-root user to perform other tasks on this server once you have finished setting up Kafka.

Log in as a non-root sudo user and use the following useradd command to create a user named kafka:

sudo useradd kafka -m

The -m flag ensures that a home directory will be created for the user. This home directory, /home/kafka, will act as our workspace directory for executing the commands in the sections that follow.

Use passwd to set a password for the new user:

sudo passwd kafka

Add the kafka user to the sudo group with the adduser command, so that it has the privileges required to install Kafka's dependencies:

sudo adduser kafka sudo

Your kafka user is now ready. Log into the account using su:

su -l kafka

Now that we have created the Kafka-specific user, we can move on to downloading and extracting the Kafka binaries.

Step 2 - Download and Extract the Kafka Binaries

Let's download and extract the Kafka binaries into dedicated folders in our kafka user's home directory.

First, create a directory named Downloads in /home/kafka to store your downloads:

mkdir ~/Downloads

Use curl to download the Kafka binaries:

curl "http://www-eu.apache.org/dist/kafka/1.1.0/kafka_2.12-1.1.0.tgz"-o ~/Downloads/kafka.tgz

Create a directory called kafka and change to this directory. This will be the base directory for Kafka installation:

mkdir ~/kafka && cd ~/kafka

Use the following tar command to extract the downloaded archive:

tar -xvzf ~/Downloads/kafka.tgz --strip 1

We specify the --strip 1 flag to ensure that the archive's contents are extracted into ~/kafka/ itself and not into another directory inside it (such as ~/kafka/kafka_2.12-1.1.0/).
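If you want to confirm that the extraction worked, list the contents of the installation directory; you should see Kafka's bin, config, and libs directories, among others:

ls ~/kafka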

Now that we have downloaded and extracted the binaries successfully, we can move on to configuring Kafka to allow topic deletion.

Step 3 - Configure the Kafka Server

Kafka's default behavior will not allow us to delete a topic, the category, group, or feed name to which messages can be published. To modify this, let's edit the configuration file.

The configuration options of Kafka are specified in server.properties. Open this file with nano or another editor of your choice:

nano ~/kafka/config/server.properties

Let's add a setting that allows us to delete Kafka topics. Add the following to the bottom of the file:

delete.topic.enable = true

Save the file and exit nano. Now that we have configured Kafka, we can move on to creating systemd unit files so that we can run and enable it at startup.
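Before moving on, you can optionally confirm that the setting was saved by printing the last line of the file (tail is used here purely as a sanity check; it should print the line you just added):

tail -n 1 ~/kafka/config/server.properties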

Step 4 - Create Systemd Unit Files and Start the Kafka Server

In this section, we will create [systemd unit files](https://www.digitalocean.com/community/tutorials/understanding-systemd-units-and-unit-files) for the Kafka service. This will help us perform common service actions such as starting, stopping, and restarting Kafka in a manner consistent with other Linux services.

ZooKeeper is a service that Kafka uses to manage its cluster state and configurations. It is commonly used in many distributed systems as an integral component. If you would like to learn more about it, visit the official ZooKeeper docs.

Create a unit file for zookeeper:

sudo nano /etc/systemd/system/zookeeper.service

Enter the following unit definition in the file:

[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/kafka/bin/zookeeper-server-start.sh /home/kafka/kafka/config/zookeeper.properties
ExecStop=/home/kafka/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

The [Unit] section specifies that ZooKeeper requires the network and the filesystem to be ready before it can start.

The [Service] section specifies that systemd should use the zookeeper-server-start.sh and zookeeper-server-stop.sh shell scripts for starting and stopping the service. It also specifies that ZooKeeper should be restarted automatically if it exits abnormally.
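With this unit file in place, you can already manage ZooKeeper like any other service. For example, you could start it manually and inspect its state (the kafka unit we create next will normally start it for you through its dependency, so this is optional):

sudo systemctl start zookeeper
sudo systemctl status zookeeper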

Next, create the systemd service file for kafka:

sudo nano /etc/systemd/system/kafka.service

Enter the following unit definition in the file:

[Unit]
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/home/kafka/kafka/bin/kafka-server-start.sh /home/kafka/kafka/config/server.properties > /home/kafka/kafka/kafka.log 2>&1'
ExecStop=/home/kafka/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

The [Unit] section specifies that this unit file depends on zookeeper.service. This ensures that zookeeper gets started automatically when the kafka service starts.

The [Service] section specifies that systemd should use the kafka-server-start.sh and kafka-server-stop.sh shell scripts for starting and stopping the service. It also specifies that Kafka should be restarted automatically if it exits abnormally.
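One caveat worth noting: if you later edit either of these unit files, systemd will not pick up the changes until you reload its configuration:

sudo systemctl daemon-reload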

Now that the unit has been defined, use the following command to start Kafka:

sudo systemctl start kafka

To make sure that the server has started successfully, check the journal logs for the kafka unit:

journalctl -u kafka

You should see output similar to the following:

Jul 17 18:38:59 kafka-ubuntu systemd[1]: Started kafka.service.

You now have a Kafka server listening on port 9092.
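If you would like to verify this yourself, you can check that a process is listening on that port (ss ships with Ubuntu as part of iproute2; the grep simply filters the output down to Kafka's port):

sudo ss -tlnp | grep 9092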

Although we have started the kafka service, if we were to reboot our server, it would not be started automatically. To enable kafka on server boot, run:

sudo systemctl enable kafka

Now that we have started and enabled the service, let's check the installation.

Step 5 - Test the Installation

Let's publish and consume a "Hello World" message to make sure the Kafka server is behaving correctly. Publishing messages in Kafka requires a producer, which enables the publication of records and data to topics, and a consumer, which reads messages and data from topics.

First, create a topic named TutorialTopic by typing:

~/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic TutorialTopic

You can create a producer from the command line using the kafka-console-producer.sh script. It expects the Kafka server's hostname, a port, and a topic name as arguments.

Type the following to publish the string "Hello, World" to the TutorialTopic topic:

echo "Hello, World"|~/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092--topic TutorialTopic >/dev/null

Next, you can create a Kafka consumer using the kafka-console-consumer.sh script. It expects the Kafka server's hostname and port, along with a topic name, as arguments.

The following command consumes messages from TutorialTopic. Note the use of the --from-beginning flag, which allows the consumption of messages that were published before the consumer was started:

~/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic TutorialTopic --from-beginning

If there are no configuration issues, you should see Hello, World in your terminal:

Hello, World

The script will continue to run, waiting for more messages to be published to the topic. Feel free to open a new terminal and start a producer to publish a few more messages, as shown below; you should be able to see them all in the consumer's output.
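For example, you can run the producer script interactively in the new terminal; this is the same kafka-console-producer.sh invocation as before, just without the echo pipe. Each line you type is published to the topic as a separate message, and CTRL+C exits the producer:

~/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic TutorialTopic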

When you are done testing, press CTRL+C to stop the consumer script. Now that we have tested the installation, let's move on to installing KafkaT.

Step 6 - Install KafkaT (Optional)

KafkaT is a tool from Airbnb that makes it easier to view details about your Kafka cluster and to perform certain administrative tasks from the command line. Because it is a Ruby gem, you will need Ruby to use it. You will also need the build-essential package to build the other gems it depends on. Install them using apt:

sudo apt install ruby ruby-dev build-essential

You can now install KafkaT using the gem command:

sudo gem install kafkat

KafkaT uses the .kafkatcfg configuration file to determine the installation and log directory of the Kafka server. It should also have an entry pointing KafkaT to your ZooKeeper instance.

Create a new file named .kafkatcfg:

nano ~/.kafkatcfg

Add the following lines to specify the required information about the Kafka server and Zookeeper instance:

{" kafka_path":"~/kafka","log_path":"/tmp/kafka-logs","zk_path":"localhost:2181"}

You can now use KafkaT. First, you can use it to view detailed information about all Kafka partitions:

kafkat partitions

You will see the following output:

Topic                 Partition   Leader      Replicas        ISRs
TutorialTopic         0           0           [0]             [0]
__consumer_offsets    0           0           [0]             [0]
...

You will see TutorialTopic, as well as __consumer_offsets, an internal topic used by Kafka for storing client-related information. You can safely ignore lines starting with __consumer_offsets.

To learn more about KafkaT, refer to its GitHub repository.

Step 7 - Set Up a Multi-Node Cluster (Optional)

If you want to create a multi-broker cluster using more Ubuntu 18.04 machines, you should repeat Step 1, Step 4, and Step 5 on each of the new machines. Additionally, you should make the following changes in the server.properties file on each of them:

- The value of the broker.id property should be changed so that it is unique throughout the cluster.
- The value of the zookeeper.connect property should be changed so that all nodes point to the same ZooKeeper instance.

If you want to have multiple ZooKeeper instances for your cluster, the value of the zookeeper.connect property on each node should be an identical, comma-separated string listing the IP addresses and port numbers of all the ZooKeeper instances.
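As an illustration, the server.properties file on a second broker in such a cluster might contain entries like the following. The IP addresses shown are placeholders from the documentation range, not values taken from this tutorial:

broker.id=1
zookeeper.connect=203.0.113.10:2181,203.0.113.11:2181,203.0.113.12:2181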

Step 8 - Restrict the Kafka User

Now that all of the installations are done, you can remove the kafka user's admin privileges. Before you do so, log out and log back in as any other non-root sudo user. If you are still running the same shell session you started this tutorial with, simply type exit.

Remove the kafka user from the sudo group:

sudo deluser kafka sudo

To further improve your Kafka server's security, lock the kafka user's password using the passwd command. This makes sure that nobody can directly log into the server using this account:

sudo passwd kafka -l
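If you would like to confirm that the account is locked, you can query its status with passwd; an L in the second field of the output indicates a locked password:

sudo passwd -S kafka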

At this point, only root or a sudo user can log in as kafka by typing the following command:

sudo su - kafka

In the future, if you want to unlock it, use passwd with the -u option:

sudo passwd kafka -u

You have now successfully restricted the admin rights of the kafka user.

Conclusion

You now have Apache Kafka running securely on your Ubuntu server. You can make use of it in your projects by creating Kafka producers and consumers using Kafka clients, which are available for most programming languages. To learn more about Kafka, you can also consult its documentation.



Reference: "How To Install Apache Kafka on Ubuntu 18.04"
