Kafka optimal retention and deletion policy

前端未结

关注

 1  1040

I am fairly new to kafka so forgive me if this question is trivial. I have a very simple setup for purposes of timing tests as follows:

Machine A -> writes to topic

相关标签:

1条回答

暗喜

2020-12-28 21:41
Apache Kafka uses Log data structure to manage its messages. Log data structure is basically an ordered set of Segments whereas a Segment is a collection of messages. Apache Kafka provides retention at Segment level instead of at Message level. Hence, Kafka keeps on removing Segments from its end as these violate retention policies.

Apache Kafka provides us with the following retention policies -
1. Time Based Retention
Under this policy, we configure the maximum time a Segment (hence messages) can live for. Once a Segment has spanned configured retention time, it is marked for deletion or compaction depending on configured cleanup policy. Default retention time for Segments is 7 days.

Here are the parameters (in decreasing order of priority) that you can set in your Kafka broker properties file:

Configures retention time in milliseconds

log.retention.ms=1680000

Used if log.retention.ms is not set

log.retention.minutes=1680

Used if log.retention.minutes is not set

log.retention.hours=168
1. Size based Retention
In this policy, we configure the maximum size of a Log data structure for a Topic partition. Once Log size reaches this size, it starts removing Segments from its end. This policy is not popular as this does not provide good visibility about message expiry. However it can come handy in a scenario where we need to control the size of a Log due to limited disk space.

Here are the parameters that you can set in your Kafka broker properties file:

Configures maximum size of a Log

log.retention.bytes=104857600

So according to your use case you should configure log.retention.bytes so that your disk should not get full.
0 讨论(0)
发布评论:

提交评论
- 加载中...