If I'm running a Kafka cluster with more partitions than my lone consumer group has consumers, are there any guarantees made on ordering of messages, or on-time delivery o
Kafka provides ordering guarantees only within a partition. In your example, Message 2 might be consumed before Message 1, after Message 1, or after Message 3. That depends only on the performance of the consumer. More information on this is available in the documentation: https://kafka.apache.org/documentation.html#introduction (the 'Consumers' and 'Guarantees' sections).
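To make the per-partition guarantee concrete, here is a small simulation in plain Python. It does not use a Kafka client; the partitioner, the function names, and the random interleaving are all assumptions made up for illustration (Kafka's real default partitioner hashes keys differently). It only shows the shape of the guarantee: records from one partition always come out in the order they were appended, while the interleaving across partitions is arbitrary.

```python
import random
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    # Toy partitioner (assumption): not Kafka's actual hashing scheme.
    return sum(key.encode()) % num_partitions


def produce(messages, num_partitions):
    """Append each (key, value) pair to the log of the partition chosen for its key."""
    logs = defaultdict(list)
    for key, value in messages:
        logs[partition_for(key, num_partitions)].append((key, value))
    return logs


def consume_interleaved(logs, rng):
    """Read every partition to the end, picking the next partition at random.

    Returns a list of (partition, record) in the order the consumer saw them.
    """
    offsets = {p: 0 for p in logs}
    consumed = []
    while True:
        pending = [p for p in sorted(logs) if offsets[p] < len(logs[p])]
        if not pending:
            return consumed
        p = rng.choice(pending)
        consumed.append((p, logs[p][offsets[p]]))
        offsets[p] += 1  # offsets only move forward within a partition


messages = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
logs = produce(messages, num_partitions=3)
consumed = consume_interleaved(logs, random.Random(0))
```

Whatever seed you pass, filtering consumed down to a single partition gives back that partition's log unchanged; only the global order varies.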
The Kafka broker is not aware of the consumers. It stores messages in log segments until the corresponding log segment is deleted. Consumers may attach to the broker at any moment and start consuming from the oldest log segment. Minimum message retention is controlled by two configuration properties: log.retention.hours and log.retention.bytes (with possible overrides per topic). More on this in the documentation: https://kafka.apache.org/documentation.html#brokerconfigs.
To answer your question: if the consumer becomes slower than the producer, it has some time to catch up (1 week by default). If it doesn't, some unconsumed messages will be deleted permanently.
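As a rough back-of-the-envelope check on how much "some time" is, the sketch below estimates when a lagging consumer starts losing data. It is a simplified model, not anything Kafka computes for you: it assumes constant produce and consume rates, both starting from an empty log, time-based retention only, and it ignores that deletion happens per log segment (so real messages may live somewhat longer). The function name and the rates are hypothetical.

```python
RETENTION_SECONDS = 168 * 3600  # one week, the default time-based retention


def seconds_until_loss(produce_rate, consume_rate,
                       retention_seconds=RETENTION_SECONDS):
    """Wall-clock seconds until the oldest unconsumed message exceeds retention.

    At time w the consumer has read consume_rate * w messages, and the next
    one was produced at consume_rate * w / produce_rate, so its age is
    w * (1 - consume_rate / produce_rate). Loss starts once that age reaches
    retention_seconds.
    """
    if produce_rate <= 0 or consume_rate <= 0:
        raise ValueError("rates must be positive")
    if consume_rate >= produce_rate:
        return float("inf")  # the consumer keeps up, nothing expires unread
    return retention_seconds * produce_rate / (produce_rate - consume_rate)


# Producer at 1000 msg/s, consumer at 900 msg/s (made-up numbers).
days = seconds_until_loss(1000, 900) / 86400
```

Under these assumptions, a consumer that is 10% slower than the producer falls one full retention window behind after ten retention windows, i.e. about 70 days with the one-week default.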
The high-level consumer creates several KafkaStream objects, each providing data from one or more partitions. How you consume these streams is up to you: in separate threads, round robin, etc. It is also possible to fetch the timestamps of messages and merge the streams into a single stream, restoring message order.
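The timestamp merge mentioned above can be sketched as a k-way merge. This is plain Python rather than the KafkaStream API, and the Record type and its fields are hypothetical. It assumes every message carries a producer-assigned timestamp and that timestamps are non-decreasing within each partition, which is what lets a simple merge restore the overall order.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    timestamp: int  # hypothetical producer-assigned timestamp
    partition: int
    payload: str


def merge_by_timestamp(*streams: Iterable[Record]) -> Iterator[Record]:
    """Lazily merge per-partition streams into one stream ordered by timestamp.

    Each input must already be sorted by timestamp (assumption).
    """
    return heapq.merge(*streams, key=lambda r: r.timestamp)


# Two partitions holding four messages that were produced in order m1..m4.
p0 = [Record(1, 0, "m1"), Record(4, 0, "m4")]
p1 = [Record(2, 1, "m2"), Record(3, 1, "m3")]
merged = list(merge_by_timestamp(p0, p1))
```

One caveat with live streams: a merge like this needs the next record from every input before it can emit anything, so a partition that is idle stalls the whole merged stream until it produces a message or you apply some timeout of your own.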