How to read data using Kafka Consumer API from beginning?

清歌不尽 2020-12-05 02:06

Can anyone tell me how to read messages with the Kafka Consumer API from the beginning of the topic every time I run the consumer?

10 answers
  • 2020-12-05 02:23

    Another option is to leave your Consumer code simple and steer the offset management from outside using the command line tool kafka-consumer-groups that comes with Kafka.

    Each time, before starting the consumer, you would call

    bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
     --execute --reset-offsets \
     --group myConsumerGroup \
     --topic myTopic \
     --to-earliest
    

    Depending on your requirement you can reset the offsets for each partition of the topic with that tool. The help function or documentation explain the options:

    --reset-offsets also has the following scenarios to choose from (at least one scenario must be selected):
    
    --to-datetime <String: datetime> : Reset offsets to offsets from datetime. Format: 'YYYY-MM-DDTHH:mm:SS.sss'
    --to-earliest : Reset offsets to earliest offset.
    --to-latest : Reset offsets to latest offset.
    --shift-by <Long: number-of-offsets> : Reset offsets shifting current offset by 'n', where 'n' can be positive or negative.
    --from-file : Reset offsets to values defined in CSV file.
    --to-current : Resets offsets to current offset.
    --by-duration <String: duration> : Reset offsets to offset by duration from current timestamp. Format: 'PnDTnHnMnS'
    --to-offset : Reset offsets to a specific offset.
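
The 'PnDTnHnMnS' format listed for --by-duration is the same ISO-8601 duration syntax that java.time.Duration.parse accepts, so a value can be checked in plain Java before passing it to the tool. The sketch below assumes the tool subtracts the duration from the current time to pick the reset point; the class and method names are made up for illustration.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper: shows which point in time a --by-duration value refers to,
// assuming the reset point is "now minus the duration".
class ByDurationExample {
    static Instant resetPoint(Instant now, String isoDuration) {
        // Duration.parse throws DateTimeParseException on a malformed value
        Duration d = Duration.parse(isoDuration);
        return now.minus(d);
    }
}
```

For example, ByDurationExample.resetPoint(Instant.now(), "PT1H30M") gives the instant one and a half hours ago, and "P1DT2H" means 26 hours.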
    
  • 2020-12-05 02:26

    One option is to use a unique group id each time you start the consumer. A new group has no committed offsets, so Kafka falls back to auto.offset.reset; with that set to "earliest" (the default is "latest"), you receive the messages in the topic from the beginning. Do something like this when you set the properties for your KafkaConsumer:

    properties.put(ConsumerConfig.GROUP_ID_CONFIG, UUID.randomUUID().toString());
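
A fuller version of that configuration might look like the sketch below. It uses the plain string config keys, so it needs nothing beyond java.util; the class name, method name, and the choice of String deserializers are assumptions for illustration, not part of the original answer.

```java
import java.util.Properties;
import java.util.UUID;

// Hypothetical helper that builds consumer properties for "read from the beginning on every run".
class FromBeginningConfig {
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        // A fresh group id means there are no committed offsets for this consumer
        props.put("group.id", UUID.randomUUID().toString());
        // With no committed offsets, start at the earliest available offset
        props.put("auto.offset.reset", "earliest");
        // Committing offsets is pointless for a throwaway group
        props.put("enable.auto.commit", "false");
        // Assumed: keys and values are plain strings
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }
}
```

The result can be passed straight to new KafkaConsumer<>(props). Note that every run leaves behind a new, unused consumer group on the broker.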
    

    The other option is to use consumer.seekToBeginning(consumer.assignment()), but this will not work until the consumer has joined the group and been assigned partitions, which only happens once it calls the poll method; before that, consumer.assignment() is empty. So call poll(), then seekToBeginning(), and then poll() again if you want all the records from the start. It's a little hacky, but this seems to be the most reliable way to do it as of the 0.9 release.

    // Before the first poll() the consumer has not joined the group, so
    // consumer.assignment() is empty and seekToBeginning() has nothing to seek.
    consumer.poll(0);
    // The consumer is now "alive" and has its partitions assigned
    consumer.seekToBeginning(consumer.assignment());
    // Now consume from the start
    ConsumerRecords<String, String> records = consumer.poll(0);
    
  • 2020-12-05 02:27

    When using the old high-level consumer, set props.put("auto.offset.reset", "smallest"); when creating the ConsumerConfig. With the newer KafkaConsumer, the equivalent value is "earliest".

  • 2020-12-05 02:34

    1) https://stackoverflow.com/a/17084401/3821653

    2) http://mail-archives.apache.org/mod_mbox/kafka-users/201403.mbox/%3CCAOG_4QYz2ynH45a8kXb8qw7xw4vDRRwNqMn5j9ERFxJ8RfKGCg@mail.gmail.com%3E

    To reset the consumer group, you can delete the group's path in ZooKeeper (this applies to the old consumer, which keeps its offsets in ZooKeeper):

     import kafka.utils.ZkUtils;
     // Replace the placeholders with your ZooKeeper address and consumer group id
     ZkUtils.maybeDeletePath("<zkhost:zkport>", "/consumers/<group.id>");
    