Kafka Monitoring

A comprehensive Kafka monitoring plan should collect metrics from the following components:

- Kafka brokers
- ZooKeeper, since Kafka relies on it to maintain its state
- Producers and consumers in the general sense, which includes Kafka Connect clusters

Kafka brokers, ZooKeeper, and the Java clients (producer/consumer) expose metrics via JMX (Java Management Extensions) and can be configured to report stats back to Prometheus using the JMX exporter maintained by the Prometheus project. There are also a number of community-maintained exporters to explore, and some of them can be used in addition to the JMX exporter. To monitor Kafka, for example, the JMX exporter is often used to provide broker-level metrics, while community exporters claim to provide more accurate cluster-level metrics (e.g. Kafka exporter, Kafka Zookeeper Exporter by CloudFlare, and others). Alternatively, you can consider writing your own custom exporter.

What to monitor

Kafka and ZooKeeper each make a long list of metrics available. The easiest way to see the available metrics is to fire up jconsole and point it at a running Kafka client or a Kafka or ZooKeeper server; this allows browsing all metrics exposed over JMX. But you are still left to figure out which ones you want to actively monitor and which ones you want to be alerted on.

A simple way to get started is to begin with Grafana's sample dashboards for the Prometheus exporters you chose, then modify them as you learn more about the available metrics and your environment. The "Monitoring Kafka metrics" article by DataDog and "How to monitor Kafka" by Server Density provide guidance on key Kafka and ZooKeeper metrics, the reasoning for why you should care about them, and suggested thresholds to trigger alerts.

In this other example, we can see a particular consumer group status.
Docker Compose with Kafka Lag Exporter + Grafana + Prometheus

In this post I want to tell you what it is about and how it can be useful for troubleshooting when you don't have much visibility into your Kafka consumer groups.

The main idea was to have a Docker Compose file with Kafka Lag Exporter, Prometheus and Grafana together, so that it is quick and easy to get a dashboard for analyzing the consumer groups of a Kafka deployment. This is particularly useful when you don't have enough monitoring on your Kafka yet. Without the Docker Compose file, we would have to install and configure each part separately, which could be cumbersome when faced with an incident that needs to be resolved quickly. It aims to provide a quick installation for troubleshooting, not a final installation for permanent monitoring. It's perfect if you are facing an issue in production and need more visibility into what is happening internally in Kafka.

Consumer group lag in seconds and offsets

In this graph we can see the time between the last commit and the current time, also known as lag in seconds, broken down by consumer group. This example was taken from a Kafka Connect deployment that commits every 30 minutes if everything goes well. Analyzing it, we can say that some consumer groups, for some reason, fail on commit.
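The reasoning behind that graph can be sketched in a few lines: compute the time since each group's last commit and flag the groups that have gone noticeably longer than the expected 30-minute interval. This is an illustrative sketch only. The function names, the group names, and the 1.5x tolerance are assumptions, and the timestamps would in practice come from your monitoring data rather than a hand-written dictionary.

```python
from typing import Dict, List


def seconds_since_last_commit(
    last_commit_ts: Dict[str, float], now: float
) -> Dict[str, float]:
    """Return, per consumer group, the seconds elapsed since its last commit."""
    return {group: max(0.0, now - ts) for group, ts in last_commit_ts.items()}


def groups_missing_commits(
    last_commit_ts: Dict[str, float],
    now: float,
    commit_interval_s: float = 30 * 60,  # the 30-minute commit cadence
    tolerance: float = 1.5,              # assumed slack before flagging a group
) -> List[str]:
    """List groups whose last commit is older than interval * tolerance."""
    threshold = commit_interval_s * tolerance
    elapsed = seconds_since_last_commit(last_commit_ts, now)
    return sorted(group for group, secs in elapsed.items() if secs > threshold)


# Usage with hypothetical group names and Unix timestamps in seconds.
now = 10_000.0
last_commits = {
    "connect-a": now - 600,    # committed 10 minutes ago
    "connect-b": now - 4_000,  # no commit for over an hour
}
late_groups = groups_missing_commits(last_commits, now)
```

With these inputs only `connect-b` exceeds the 45-minute threshold, which matches the pattern described above where some groups stop committing on schedule.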