Kafka optimal retention and deletion policy


I am fairly new to Kafka, so forgive me if this question is trivial. I have a very simple setup for timing tests, as follows:

Machine A -> writes to topic

1 Answer
  • 2020-12-28 21:41

    Apache Kafka uses a log data structure to manage its messages. A log is an ordered sequence of segments, where a segment is a collection of messages. Kafka applies retention at the segment level rather than at the message level, so it keeps deleting whole segments, oldest first, once they violate the retention policy.
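
    As an illustration, each topic partition maps to a directory on disk, and each segment to a set of files named after the segment's base offset. The data directory and topic name below are assumptions, not part of the question; segments roll into new files based on settings such as log.segment.bytes, which bounds how promptly retention can remove data:

    $ ls /var/lib/kafka/data/my-topic-0/      # log.dirs path and topic name are assumptions
    00000000000000000000.log                  # oldest segment: the messages themselves
    00000000000000000000.index                # offset index for that segment
    00000000000000000000.timeindex            # timestamp index used by time-based retention
    00000000000003145728.log                  # a newer segment; file name = its base offset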

    Apache Kafka provides the following retention policies:

    1. Time-Based Retention

    Under this policy, we configure the maximum time a segment (and hence its messages) may live. Once a segment is older than the configured retention time, it is marked for deletion or compaction, depending on the configured cleanup policy (log.cleanup.policy). The default retention time for segments is 7 days (168 hours).

    Here are the parameters (in decreasing order of priority) that you can set in your Kafka broker properties file:

    # Configures retention time in milliseconds
    log.retention.ms=1680000

    # Used if log.retention.ms is not set
    log.retention.minutes=1680

    # Used if log.retention.minutes is not set
    log.retention.hours=168
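
    These broker-level settings apply to every topic by default. As a minimal sketch of overriding retention for a single topic, you can use the kafka-configs.sh tool that ships with Kafka together with the topic-level retention.ms property; the broker address and the topic name my-topic are assumptions:

    # Set a 24-hour retention for one topic (topic name and address are assumptions)
    kafka-configs.sh --bootstrap-server localhost:9092 \
      --entity-type topics --entity-name my-topic \
      --alter --add-config retention.ms=86400000

    # Verify the per-topic override
    kafka-configs.sh --bootstrap-server localhost:9092 \
      --entity-type topics --entity-name my-topic --describe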

    2. Size-Based Retention

    In this policy, we configure the maximum size of the log for a topic partition. Once the log reaches this size, Kafka starts removing the oldest segments. This policy is less popular because it gives no clear guarantee about when a given message expires. However, it comes in handy when we need to bound the size of a log because of limited disk space.

    Here are the parameters that you can set in your Kafka broker properties file:

    # Configures the maximum size of a log (enforced per partition)
    log.retention.bytes=104857600

    So, for your use case, you should configure log.retention.bytes so that your disk does not get full. Keep in mind that it is enforced per partition, not per topic.
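
    As a rough sizing sketch, the same limit can also be set per topic via the topic-level retention.bytes property; the broker address, topic name, partition count, and replication factor below are assumptions:

    # Topic-level counterpart of log.retention.bytes
    kafka-configs.sh --bootstrap-server localhost:9092 \
      --entity-type topics --entity-name my-topic \
      --alter --add-config retention.bytes=104857600

    # Rough disk math: retention.bytes applies per partition, so with 6 partitions
    # and replication factor 2 this caps at about 104857600 * 6 * 2 bytes ~= 1.2 GiB
    # across the cluster, plus the active (not yet rolled) segments.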
