How to dynamically add consumers to a Kafka consumer group

落花浮王杯 submitted on 2020-07-09 15:00:49

Question


How do I know when I need to scale the consumers in a consumer group? What should trigger scaling when there is a fast producer?


Answer 1:


When you create a topic in Kafka, you must specify the number of partitions and the replication factor.

Say there is a topic called TEST with 10 partitions. To consume its data in parallel, create a consumer group with 10 consumers; each consumer then reads from its own partition.

Here is the catch: if the topic has 10 partitions and the consumer group has 12 consumers, two consumers remain idle until one of the active consumers dies.

If the topic has 10 partitions and the consumer group has 8 consumers, then 6 consumers each read from one partition (one consumer -> one partition), while the remaining two consumers each read from two partitions (one consumer -> two partitions). In other words, those two consumers cover four partitions between them.
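The two cases above can be checked with a small sketch. It assumes a plain round-robin spread; `assign_partitions` and the consumer names are hypothetical, and this is not Kafka's actual assignor, which may hand the extra partitions to different group members.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Spread partition ids over consumers round-robin (illustrative only)."""
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# 10 partitions, 12 consumers: two consumers end up with nothing to read.
over = assign_partitions(10, [f"consumer-{i}" for i in range(12)])
idle = [c for c, parts in over.items() if not parts]

# 10 partitions, 8 consumers: six consumers own one partition, two own two.
under = assign_partitions(10, [f"consumer-{i}" for i in range(8)])
sizes = sorted(len(parts) for parts in under.values())
```

Here `idle` holds two consumer names and `sizes` is six ones followed by two twos, matching the counts described above.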

So the first thing to decide is the number of partitions for your Kafka topic: more partitions means more parallelism.

Whenever a consumer is added to or removed from the consumer group, Kafka takes care of the rebalancing.




Answer 2:


Actually, auto-scaling is not a good idea, because in Kafka message order is guaranteed only within a partition.

From Kafka docs:

  • Messages sent by a producer to a particular topic partition will be appended in the order they are sent. That is, if a record M1 is sent by the same producer as a record M2, and M1 is sent first, then M1 will have a lower offset than M2 and appear earlier in the log.
  • A consumer instance sees records in the order they are stored in the log.

If you add more partitions, and more consumers to match the new partition count, you can no longer satisfy the ordering guarantee for messages.

Suppose you have 10 partitions and a message whose key is 102. That message will be sent to partition 102 % 10 = 2.

But if you increase the number of partitions to 15, for instance, messages with the same key (102) will be sent to a different partition: 102 % 15 = 12.

As you can see, with this approach it is impossible to guarantee the ordering of messages with the same key.

Note: By default Kafka actually uses murmur2(record.key()) % num partitions. The calculations above are just an example.
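The arithmetic above can be reproduced with a short sketch. It uses the same simplified `key % partitions` rule as the example, not Kafka's murmur2 hash; `partition_for` is a hypothetical helper.

```python
def partition_for(key: int, num_partitions: int) -> int:
    """Simplified partitioner from the example above: key modulo partition count."""
    return key % num_partitions


before = partition_for(102, 10)  # 2
after = partition_for(102, 15)   # 12

# Records with key 102 written before the resize sit in partition 2, later
# ones land in partition 12, so a consumer can no longer rely on seeing all
# records for that key in order.
key_moved = before != after
```

The same effect applies to the real hash-based partitioner: the hash of a key stays the same, but the result of the modulo changes once the partition count changes.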




Answer 3:


One straightforward approach is to track the consumer lag (computed as the difference between each partition's log end offset and the group's committed offset). If the lag measured over the last n samples keeps increasing, you can scale up, and vice versa. You may have to consider some edge cases: for example, if consumers have gone down, the lag will increase and the auto-scaling function might spawn more threads/machines.
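A minimal sketch of that idea follows. It assumes the end offsets and committed offsets have already been fetched into plain dictionaries keyed by partition id; `total_lag`, `scaling_decision`, the `expected_consumers` parameter, and the returned labels are all hypothetical names chosen for illustration. Treating missing consumers as "investigate" rather than scaling up is one possible way to handle the edge case mentioned above, not an established rule.

```python
from typing import Dict, List


def total_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> int:
    """Sum over partitions of (log end offset - committed offset), floored at 0."""
    return sum(max(end - committed.get(p, 0), 0) for p, end in end_offsets.items())


def scaling_decision(lag_samples: List[int], active_consumers: int,
                     expected_consumers: int, num_partitions: int) -> str:
    """Decide what to do from the last n lag samples (oldest first)."""
    if active_consumers < expected_consumers:
        # Assumption: lag is probably rising because consumers died,
        # so adding capacity blindly would be the wrong reaction.
        return "investigate"
    if len(lag_samples) < 2:
        return "hold"
    pairs = list(zip(lag_samples, lag_samples[1:]))
    rising = all(later > earlier for earlier, later in pairs)
    falling = all(later < earlier for earlier, later in pairs)
    # Consumers beyond the partition count would only sit idle.
    if rising and active_consumers < num_partitions:
        return "scale_up"
    if falling and active_consumers > 1:
        return "scale_down"
    return "hold"


lag_now = total_lag({0: 100, 1: 50}, {0: 90, 1: 50})
decision = scaling_decision([10, 20, 30], active_consumers=4,
                            expected_consumers=4, num_partitions=10)
```

With the sample values above, `lag_now` is 10 and `decision` is `"scale_up"`. The cap at `num_partitions` reflects the point from Answer 1 that extra consumers beyond the partition count stay idle.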



Source: https://stackoverflow.com/questions/60550839/how-to-dynamically-add-consumers-in-consumer-group-kafka
