I have a question about Kafka topic cleanup policies and their interaction with log.retention....
For example, if I set cleanup.policy to compact, compaction will only
Log segments can be deleted or compacted, or both, to manage their size. The topic-level configuration cleanup.policy
determines the way the log segments for the topic are managed.
Log cleanup by compaction
If the topic-level configuration cleanup.policy is set to compact, the log for the topic is compacted periodically in the background by the log cleaner.
In a compacted topic, the log only needs to contain the most recent message for each key, while earlier messages can be discarded.
There is no need to set log.retention to -1 or any other value. Your topics will be compacted, and old messages are removed only according to the compaction rules, never by retention.
Note that only inactive (closed) log segments can be compacted; the active segment is never compacted.
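To make the compaction rule concrete, here is a minimal Python sketch of the idea. It is a simplified model, not Kafka's actual log cleaner: the `Record` tuple and the `compact` function are hypothetical names made up for illustration, and tombstones and cleaner thresholds are ignored. It keeps only the latest record per key within the closed segments and leaves the active segment untouched.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape for illustration: (offset, key, value).
Record = Tuple[int, str, str]


def compact(closed: List[Record], active: List[Record]) -> List[Record]:
    """Keep the latest record per key in the closed segments only.

    Assumes records are listed in increasing offset order.
    """
    # Highest offset seen for each key within the closed segments.
    latest: Dict[str, int] = {}
    for offset, key, _ in closed:
        latest[key] = offset

    # Drop closed records that were superseded by a later closed record.
    kept = [rec for rec in closed if latest[rec[1]] == rec[0]]

    # The active segment is never compacted, so it is appended as-is.
    return kept + active


closed = [(0, "k1", "a"), (1, "k2", "b"), (2, "k1", "c")]
active = [(3, "k2", "d")]
print(compact(closed, active))
```

In this model the record at offset 1 survives even though the active segment holds a newer value for k2, because only the closed segments are cleaned.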
Log cleanup by using both
You can specify both delete and compact values for the cleanup.policy configuration at the same time. In this case, the log is compacted, but the cleanup process also follows the retention time or size limit settings.
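A rough Python sketch of how the combined policy behaves, again as a simplified model rather than Kafka's implementation. The `cleanup` function, the `Record` shape and the timestamps are hypothetical, and only the time-based limit is modelled (not the size limit). Closed segments whose newest record is older than the retention window are deleted first, and the remaining closed segments are then compacted.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape: (offset, key, value, timestamp_ms).
Record = Tuple[int, str, str, int]
Segment = List[Record]


def cleanup(segments: List[Segment], now_ms: int, retention_ms: int) -> List[Record]:
    """Model cleanup.policy=compact,delete; the last segment is the active one."""
    *closed, active = segments
    cutoff = now_ms - retention_ms

    # Delete step: drop closed segments whose newest record is past retention.
    retained = [seg for seg in closed if seg and max(r[3] for r in seg) >= cutoff]

    # Compact step: keep only the latest record per key in what is left.
    latest: Dict[str, int] = {}
    for seg in retained:
        for offset, key, _, _ in seg:
            latest[key] = offset
    kept = [r for seg in retained for r in seg if latest[r[1]] == r[0]]

    return kept + active


segments = [
    [(0, "k1", "a", 1000), (1, "k2", "b", 2000)],  # old closed segment
    [(2, "k1", "c", 9000), (3, "k3", "d", 9500)],  # recent closed segment
    [(4, "k2", "e", 10000)],                       # active segment
]
print(cleanup(segments, now_ms=10000, retention_ms=5000))
```

Here the whole first segment is removed by retention, including the record for k2 that compaction alone would have kept. That is the practical difference from a compact-only topic.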
I suggest going through the following links:
https://ibm.github.io/event-streams/installing/capacity-planning/
https://kafka.apache.org/documentation/#compaction
https://cwiki.apache.org/confluence/display/KAFKA/KIP-71%3A+Enable+log+compaction+and+deletion+to+co-exist