Question
I have a Kafka setup that includes a JMX exporter to Prometheus. I'm looking for a metric that gives the offset lag per topic and group ID. I'm running Kafka 2.2.0.
Some resources online point to a metric called kafka.consumer, but I have no such metric in my setup.
From my jmxterminal:
$>domains
#following domains are available
JMImplementation
com.sun.management
java.lang
java.nio
java.util.logging
jdk.management.jfr
kafka
kafka.cluster
kafka.controller
kafka.coordinator.group
kafka.coordinator.transaction
kafka.log
kafka.network
kafka.server
kafka.utils
I am, however, able to see the data I need by using the following command:
root@kafka-0:/kafka# bin/kafka-consumer-groups.sh --describe --group benchmark_consumer_group --bootstrap-server localhost:9092
Consumer group 'benchmark_consumer_group' has no active members.
TOPIC                PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG       CONSUMER-ID  HOST  CLIENT-ID
benchmark_topic_10B  2          2795128         54223220        51428092  -            -     -
benchmark_topic_10B  9          4               4               0         -            -     -
benchmark_topic_10B  6          7               7               0         -            -     -
benchmark_topic_10B  7          5               5               0         -            -     -
benchmark_topic_10B  0          2834028         54224939        51390911  -            -     -
benchmark_topic_10B  1          15342331        54222342        38880011  -            -     -
benchmark_topic_10B  4          5               5               0         -            -     -
benchmark_topic_10B  5          6               6               0         -            -     -
benchmark_topic_10B  8          8               8               0         -            -     -
benchmark_topic_10B  3          4               4               0         -            -     -
But that does not help, since I need to track it from a metric. Also, this command takes about 25 seconds to execute, which makes it unreasonable to use as a source for metrics.
My guess is that the metric kafka.consumer does not exist in version 2.2.0 and was replaced with another one. However, I can't find any up-to-date resources online on how and where to get that metric.
Answer 1:
The kafka.consumer JMX metrics are only present on the consumer processes themselves, not on the Kafka broker processes. Note that you would not get the kafka.consumer metrics from consumers that use a client library other than the Java one.
Currently, the Kafka broker itself does not expose any JMX metrics for consumer lag. Other solutions are commonly used for monitoring consumer lag, such as Burrow by LinkedIn. There are also a few open source projects, such as kafka9.offsets, that expose consumer lag metrics via JMX, but they may not have been updated to work with the latest Kafka.
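As a rough illustration of what such external tools compute: lag per partition is the log-end offset minus the group's committed offset, the same subtraction visible in the kafka-consumer-groups.sh output in the question. A minimal Python sketch, assuming both offset maps have already been fetched by some client; the function names and the kafka_consumer_group_lag metric name are hypothetical and not part of Kafka or of any particular exporter:

```python
from typing import Dict, Tuple

TopicPartition = Tuple[str, int]


def compute_lag(committed: Dict[TopicPartition, int],
                log_end: Dict[TopicPartition, int]) -> Dict[TopicPartition, int]:
    """Lag per partition = log-end offset minus committed offset (never negative)."""
    lag = {}
    for tp, end_offset in log_end.items():
        # Partitions without a committed offset are skipped in this sketch.
        if tp in committed:
            lag[tp] = max(end_offset - committed[tp], 0)
    return lag


def to_prometheus(group: str, lag: Dict[TopicPartition, int]) -> str:
    """Render the lags in the Prometheus text exposition format."""
    lines = ["# TYPE kafka_consumer_group_lag gauge"]  # hypothetical metric name
    for (topic, partition), value in sorted(lag.items()):
        lines.append(
            f'kafka_consumer_group_lag{{group="{group}",topic="{topic}",'
            f'partition="{partition}"}} {value}'
        )
    return "\n".join(lines) + "\n"


# Offsets copied from three partitions in the question's output.
committed = {("benchmark_topic_10B", 0): 2834028,
             ("benchmark_topic_10B", 1): 15342331,
             ("benchmark_topic_10B", 2): 2795128}
log_end = {("benchmark_topic_10B", 0): 54224939,
           ("benchmark_topic_10B", 1): 54222342,
           ("benchmark_topic_10B", 2): 54223220}

lag = compute_lag(committed, log_end)
print(to_prometheus("benchmark_consumer_group", lag))
```

How the committed and log-end offsets are obtained is left open here; Burrow-style tools read the committed offsets from the __consumer_offsets topic rather than shelling out to the command-line script.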
Answer 2:
You can give Kafka Minion ( https://github.com/cloudworkz/kafka-minion ) a try. Although Kafka Minion works internally much like Burrow (it consumes the __consumer_offsets topic to obtain consumer group offsets), it has several advantages for your use case.
Advantages of Kafka Minion over Burrow for your case:
- Has native Prometheus support (no additional deployment is necessary just to expose metrics to Prometheus)
- Has a sample Grafana dashboard
- Has additional metrics (such as the last commit timestamp for a consumergroup:topic:partition combination, commit rates, info about the cleanup policy, a listing of all consumer groups for a given topic, etc.)
- No ZooKeeper dependency (which also means that consumers that still commit offsets to ZooKeeper are not supported)
- High Availability support (!!). Burrow has the problem that it always exposes metrics, and these are wrong when it has only just started consuming the __consumer_offsets topic. Therefore you cannot run it in an HA mode. This is a problem when you want to set up alerts based on consumer group lags
- Kafka Minion does not support multiple clusters, which reduces complexity both in the code and for end users. You can still deploy one Kafka Minion instance per cluster
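The high-availability point in the list comes down to not reporting lag until the __consumer_offsets topic has been read up to where it ended at startup. A minimal Python sketch of such a readiness gate, assuming the end offsets at startup are known; the ReadinessGate class and its methods are hypothetical and are not taken from Kafka Minion's or Burrow's code:

```python
from typing import Dict


class ReadinessGate:
    """Reports ready only once every partition has been consumed up to
    the end offset that was recorded when the exporter started."""

    def __init__(self, end_offsets_at_start: Dict[int, int]) -> None:
        # End offset = offset the next produced message would get,
        # so an end offset of 0 means the partition was empty.
        self._pending = {p: end for p, end in end_offsets_at_start.items() if end > 0}

    def mark_consumed(self, partition: int, offset: int) -> None:
        """Record that the message at `offset` has been processed."""
        end = self._pending.get(partition)
        if end is not None and offset + 1 >= end:
            del self._pending[partition]

    def is_ready(self) -> bool:
        """True once no partition is still waiting to catch up."""
        return not self._pending


gate = ReadinessGate({0: 3, 1: 0})
gate.mark_consumed(0, 0)
print(gate.is_ready())  # False: partition 0 is still behind
gate.mark_consumed(0, 2)
print(gate.is_ready())  # True: every partition has caught up
```

Under this assumption, an exporter could fail its health check until is_ready() returns true, so a freshly started instance never serves partial lag values to alerting.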
Disclaimer: I am the author of Kafka Minion, and I am still looking for more feedback from other users. I intend to actively maintain and develop the exporter for my projects, the company I am working for and for the community.
To answer your question about what you are seeing with the kafka-consumer-groups.sh shell script: this won't work, as it cannot report lags for inactive consumers, which is a bit counterproductive.
Source: https://stackoverflow.com/questions/55559905/how-to-monitor-consumer-lag-in-kafka-via-jmx