Apache Kafka message consumption when partitions outnumber consumers

小鲜肉 2020-12-28 17:23

If I'm running a Kafka cluster with more partitions than my lone consumer group has consumers, are there any guarantees made on ordering of messages, or on-time delivery o

1 answer
  •  有刺的猬
    2020-12-28 18:22

    Ordering guarantees

    Kafka provides ordering guarantees only within a partition. In your example, Message 2 might be consumed before Message 1, after Message 1, or after Message 3. That depends only on the performance of the consumer. More information on this is available in the documentation: https://kafka.apache.org/documentation.html#introduction ('Consumers' and 'Guarantees' topics).
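
    The per-partition guarantee can be illustrated with a small simulation. This is only a sketch in plain Python, not Kafka client code: the partition count, the `partition_for` mapping and the random read order are assumptions made for the example.

```python
import random
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical topic layout


def partition_for(key: str) -> int:
    # Stand-in for a partitioner: any deterministic key -> partition mapping will do.
    return sum(key.encode()) % NUM_PARTITIONS


def produce(messages):
    """Append (key, seq) messages to per-partition logs in send order."""
    logs = defaultdict(list)
    for key, seq in messages:
        logs[partition_for(key)].append((key, seq))
    return logs


def consume(logs, rng):
    """A single consumer that reads the next message from a randomly chosen partition."""
    cursors = {p: 0 for p in logs}
    seen = []
    while any(cursors[p] < len(logs[p]) for p in logs):
        p = rng.choice([p for p in logs if cursors[p] < len(logs[p])])
        seen.append((p, logs[p][cursors[p]]))
        cursors[p] += 1
    return seen


messages = [(f"user-{i % 4}", i) for i in range(12)]
logs = produce(messages)
seen = consume(logs, random.Random(0))

# Group what the consumer saw by partition, keeping arrival order.
per_partition = defaultdict(list)
for p, (_, seq) in seen:
    per_partition[p].append(seq)

# Order holds inside every partition, whatever the interleaving across partitions was.
print(all(seqs == sorted(seqs) for seqs in per_partition.values()))
```

    The global sequence in `seen` may be shuffled relative to send order, but each partition's own sequence never is.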

    Slow consumption

    The Kafka broker is not aware of the consumers. It stores messages in log segments until the corresponding log segment is deleted. Consumers may attach to the broker at any moment and start consuming from the oldest log segment. Message retention is controlled by two configuration properties: log.retention.hours (time-based) and log.retention.bytes (size-based), with possible overrides per topic. More on this in the documentation: https://kafka.apache.org/documentation.html#brokerconfigs.

    To answer your question: if the consumer falls behind the producer, it has some time to catch up (1 week by default). If it doesn't, some unconsumed messages will be deleted permanently.
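
    To get a feel for how much time "some time" is, here is a rough back-of-the-envelope estimate. It assumes a simplified model that is not how Kafka actually schedules deletion: constant rates, a consumer that starts fully caught up, and each message removed exactly when it reaches the retention age (in reality whole log segments are deleted, so data usually survives somewhat longer). The function name and the rates are made up for the example.

```python
def seconds_until_data_loss(produce_rate, consume_rate, retention_seconds):
    """Estimate when a lagging consumer first reaches an already-deleted message.

    Rates are in messages per second. At time T the consumer is reading the
    message produced at T * consume_rate / produce_rate, so that message's age
    is T * (produce_rate - consume_rate) / produce_rate. Loss starts when this
    age equals the retention period. Returns None if the consumer keeps up.
    """
    if consume_rate >= produce_rate:
        return None
    return retention_seconds * produce_rate / (produce_rate - consume_rate)


ONE_WEEK = 7 * 24 * 3600  # default retention of 168 hours, in seconds

# Hypothetical load: producer at 1000 msg/s, consumer at 900 msg/s.
t = seconds_until_data_loss(1000, 900, ONE_WEEK)
print(t / 86400)  # 70.0 days of sustained overload before messages are missed
```

    Under this model a consumer that is 10% slower than the producer lasts ten retention periods before it starts losing messages; the closer the two rates, the longer the grace period.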

    Consuming multiple partitions

    The high-level consumer creates several KafkaStream objects, each providing data from one or more partitions. How you consume these streams is up to you: in separate threads, round robin, etc. It is also possible to read the timestamps of messages and merge the streams into a single stream, restoring message order.
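
    The timestamp merge mentioned above could look roughly like the following. This is a sketch in plain Python rather than the Kafka consumer API: the `Message` type and its fields are hypothetical, and it assumes every message carries a timestamp that never decreases within its own partition.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Message(NamedTuple):
    timestamp: int  # hypothetical timestamp carried by each message
    partition: int
    payload: str


def merge_by_timestamp(*streams: Iterable[Message]) -> Iterator[Message]:
    """Lazily merge per-partition streams into one stream ordered by timestamp.

    Each input stream must already be sorted by timestamp.
    """
    return heapq.merge(*streams, key=lambda m: m.timestamp)


# Two partitions, each ordered internally.
p0 = [Message(1, 0, "a"), Message(4, 0, "d")]
p1 = [Message(2, 1, "b"), Message(3, 1, "c"), Message(5, 1, "e")]

merged = list(merge_by_timestamp(p0, p1))
print([m.payload for m in merged])  # ['a', 'b', 'c', 'd', 'e']
```

    Note that a merge like this can only emit a message once it has seen the next message from every stream, so with live streams the slowest partition sets the pace.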
