I created a Kafka topic with the below properties:
min.cleanable.dirty.ratio=0.01,delete.retention.ms=100,segment.ms=100,cleanup.policy=compact
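For reference, here is a minimal AdminClient sketch that creates such a topic (the broker address localhost:9092 and the topic name test-compacted are assumptions, not taken from the question):

import java.util.{Collections, Properties}
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.admin.{AdminClient, AdminClientConfig, NewTopic}

object CreateCompactedTopic extends App {
  val props = new Properties()
  props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092") // assumed broker address
  val admin = AdminClient.create(props)

  // topic-level configs from the question
  val configs = Map(
    "cleanup.policy"            -> "compact",
    "min.cleanable.dirty.ratio" -> "0.01",
    "delete.retention.ms"       -> "100",
    "segment.ms"                -> "100"
  ).asJava

  val topic = new NewTopic("test-compacted", 1, 1.toShort).configs(configs)
  admin.createTopics(Collections.singleton(topic)).all().get()
  admin.close()
}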
Tombstone records are preserved longer by design. The reason is that brokers don't track consumers. Assume a consumer goes offline for some time after reading the first record. While the consumer is down, log compaction kicks in. If log compaction deleted the tombstone record, the consumer would never learn that the record was deleted. If the consumer implements a cache, the record might never be removed from that cache. Thus, tombstones are preserved longer to allow offline consumers to receive all tombstones for local cleanup.
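To illustrate why that matters, here is a rough sketch of a cache-building consumer (topic name, group id, and broker address are made up). A tombstone, i.e. a null value, removes the key from the local cache; if compaction dropped the tombstone while this consumer was offline, the stale entry would never be removed:

import java.time.Duration
import java.util.{Collections, Properties}
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.serialization.StringDeserializer

object CacheBuildingConsumer extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("group.id", "cache-builder")
  props.put("key.deserializer", classOf[StringDeserializer].getName)
  props.put("value.deserializer", classOf[StringDeserializer].getName)

  val consumer = new KafkaConsumer[String, String](props)
  consumer.subscribe(Collections.singleton("test-compacted"))

  val cache = scala.collection.mutable.Map.empty[String, String]
  while (true) {
    for (record <- consumer.poll(Duration.ofMillis(500)).asScala) {
      if (record.value == null) cache.remove(record.key) // tombstone: delete locally
      else cache.update(record.key, record.value)
    }
  }
}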
Tombstones will be deleted only after delete.retention.ms (default value is 1 day) has elapsed. Note: this is a topic-level configuration and there is no broker-level configuration for it. Thus, you need to set the config per topic if you want to change it.
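For instance, reusing the admin client from the sketch above, the per-topic override could be applied to an existing topic like this (the topic name is hypothetical, and 100 ms is just the value from the question, not a recommendation):

import java.util.Collections
import org.apache.kafka.clients.admin.{AlterConfigOp, ConfigEntry}
import org.apache.kafka.common.config.ConfigResource

// set delete.retention.ms as a per-topic override
val resource = new ConfigResource(ConfigResource.Type.TOPIC, "test-compacted")
val ops: java.util.Collection[AlterConfigOp] = Collections.singleton(
  new AlterConfigOp(new ConfigEntry("delete.retention.ms", "100"), AlterConfigOp.OpType.SET))
admin.incrementalAlterConfigs(Collections.singletonMap(resource, ops)).all().get()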
A compacted topic has two portions:
1) Cleaned portion: the portion of the Kafka log that has been cleaned by the Kafka cleaner at least once.
2) Dirty portion: the portion of the Kafka log that has not been cleaned by the Kafka cleaner even once so far. Kafka maintains a dirty offset; all messages with offset >= dirty offset belong to the dirty portion.
Note: the Kafka cleaner cleans all segments (irrespective of whether a segment is in the cleaned or dirty portion) and re-copies them every time the dirty ratio reaches min.cleanable.dirty.ratio.
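For intuition, the dirty ratio is just the dirty bytes divided by the total (cleaned + dirty) bytes; a rough sketch of the trigger check (not the broker's actual code) could look like this:

// cleaning is triggered once the dirty ratio reaches min.cleanable.dirty.ratio
def dirtyRatio(cleanedBytes: Long, dirtyBytes: Long): Double =
  dirtyBytes.toDouble / (cleanedBytes + dirtyBytes)

val minCleanableDirtyRatio = 0.01 // from the topic config above

// example: 1 MB already cleaned, 50 KB of new (dirty) data
val shouldClean =
  dirtyRatio(cleanedBytes = 1024 * 1024, dirtyBytes = 50 * 1024) >= minCleanableDirtyRatio
// shouldClean == true, because 50 KB / (1 MB + 50 KB) ≈ 0.047 >= 0.01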
Tombstones are deleted segment-wise. The tombstones in a segment are deleted only if the segment satisfies both conditions below:
1) The segment should be in the cleaned portion of the log.
2) The last modified time of the segment should be <= (last modified time of the segment containing a message with offset = (dirty offset - 1)) - delete.retention.ms.
It is difficult to elaborate on the second point, but in simple terms it implies that the segment size (in the cleaned portion) should be equal to log.segment.bytes / segment.bytes (1 GB by default). For a segment in the cleaned portion to reach 1 GB, you need to produce a large number of messages with distinct keys. But you produced only 4 messages, 3 of which have the same key. That is why the tombstones are not deleted in the segment containing the 1111:null message (the segment doesn't satisfy the second point mentioned above).
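As a rough illustration of the two conditions (a sketch with made-up timestamps, mirroring the deleteHorizonMs check from the source code linked below):

// deleteHorizonMs = lastModified(segment holding offset dirtyOffset - 1) - delete.retention.ms
val deleteRetentionMs   = 100L              // from the topic config above
val lastModDirtyMinus1  = 1700000000000L    // hypothetical last-modified time of that segment
val deleteHorizonMs     = lastModDirtyMinus1 - deleteRetentionMs

// condition 1: the segment is in the cleaned portion (assumed true here)
// condition 2: the segment's own last-modified time is at or before the horizon
val segmentLastModified = 1699999999000L    // hypothetical
val tombstonesDroppable = segmentLastModified <= deleteHorizonMs // true in this made-up case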
You have two options to get the tombstones deleted with only 4 messages:
Source Code (Extra Reading): https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogCleaner.scala
try {
  // clean segments into the new destination segment
  for (old <- segments) {
    val retainDeletes = old.lastModified > deleteHorizonMs
    info("Cleaning segment %s in log %s (largest timestamp %s) into %s, %s deletes."
      .format(old.baseOffset, log.name, new Date(old.largestTimestamp), cleaned.baseOffset, if(retainDeletes) "retaining" else "discarding"))
    cleanInto(log.topicPartition, old, cleaned, map, retainDeletes, log.config.maxMessageSize, stats)
  }
The algorithm for removing the tombstone in a compacted topic is supposed to be the following: a tombstone is never removed while it is still in the dirty portion of the log, and once it is in the cleaned portion, its removal is further delayed by delete.retention.ms.
It's possible that the tombstones are still in the dirty portion of the log and hence not cleared. Producing a few more messages with different keys should push the tombstones into the cleaned portion of the log so that they get deleted.
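For example, a minimal producer sketch (topic name and keys are hypothetical) that sends a handful of records with distinct keys to push the tombstone's segment into the cleaned portion:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

object PushTombstonesIntoCleanedPortion extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("key.serializer", classOf[StringSerializer].getName)
  props.put("value.serializer", classOf[StringSerializer].getName)

  val producer = new KafkaProducer[String, String](props)
  // a few records with distinct keys so new segments roll and the cleaner runs again
  for (i <- 1 to 10) {
    producer.send(new ProducerRecord("test-compacted", s"filler-key-$i", s"value-$i"))
  }
  producer.flush()
  producer.close()
}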