kafka-python - How do I commit a partition?

Posted by ε祈祈猫儿з on 2019-12-04 10:46:45

So it looks like I may have figured it out; funny how that happens when you write down your questions. This seems to work:

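# note: partitions_for_topic() returns the set of partition ids for the topic
meta = consumer.partitions_for_topic(topic)
options = {}
# the key must be a TopicPartition; the committed offset is the NEXT offset to read
options[partition] = OffsetAndMetadata(message.offset + 1, meta)
consumer.commit(options)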

More testing is needed, but I will update this if anything changes.
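
The "last processed offset + 1" rule above can be sketched for several partitions at once. This is only an illustration under stated assumptions: `Record`, `next_offsets` and `commit_processed` are hypothetical names that do not appear in the thread, `Record` is a stand-in for the messages the consumer yields, and `commit_processed` assumes the two-field `OffsetAndMetadata(offset, metadata)` form used in this thread (check the signature in your installed kafka-python version).

```python
from collections import namedtuple

# Hypothetical stand-in for the records kafka-python yields; only the three
# fields used here, so the helper below can be tried without a broker.
Record = namedtuple("Record", ["topic", "partition", "offset"])


def next_offsets(records):
    """Map (topic, partition) -> offset to commit (highest processed offset + 1)."""
    result = {}
    for record in records:
        key = (record.topic, record.partition)
        candidate = record.offset + 1  # committed offset = next message to read
        if candidate > result.get(key, -1):
            result[key] = candidate
    return result


def commit_processed(consumer, records):
    """Commit the offsets computed above on a real KafkaConsumer (not called here)."""
    # Imported lazily so next_offsets() runs without kafka-python installed.
    from kafka import TopicPartition
    from kafka.structs import OffsetAndMetadata

    # Assumption: two-field OffsetAndMetadata, as used elsewhere in this thread.
    payload = {
        TopicPartition(topic, partition): OffsetAndMetadata(offset, None)
        for (topic, partition), offset in next_offsets(records).items()
    }
    consumer.commit(payload)


print(next_offsets([Record("t", 0, 41), Record("t", 1, 7), Record("t", 0, 42)]))
# prints {('t', 0): 43, ('t', 1): 8}
```

Keeping the offset arithmetic in a plain function makes it easy to check without a running Kafka cluster; only `commit_processed` touches the consumer.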

It is not necessary to use the metadata. Look at this example:

from kafka import TopicPartition
from kafka.structs import OffsetAndMetadata
...
topic = 'your_topic'
partition = 0
tp = TopicPartition(topic, partition)
kafkaConsumer = createKafkaConsumer()  # the poster's own helper; it is not defined in this answer
kafkaConsumer.assign([tp])
offset = 15394125
kafkaConsumer.commit({
    tp: OffsetAndMetadata(offset, None)
})

Hope this helps.

from kafka import KafkaConsumer
from kafka import TopicPartition

TOPIC = "test_topic"
PARTITION = 0

consumer = KafkaConsumer(
    group_id=TOPIC,
    auto_offset_reset="earliest",
    bootstrap_servers="localhost:9092",
    request_timeout_ms=100000,
    session_timeout_ms=99000,
    max_poll_records=100,
)
# TopicPartition arguments: topic, partition
topic_partition = TopicPartition(TOPIC, PARTITION)
consumer.assign([topic_partition])
# seek arguments: TopicPartition, offset; 1660000 is the offset to start reading from
consumer.seek(topic_partition, 1660000)
for message in consumer:
    # do something
    pass

  1. This assigns only one partition and sets the offset for that partition. If there is more than one partition, you need to assign each of them and then set its offset.
  2. aalmeida88's answer works for me only in some situations. It did give me the idea to use seek, which also seems to be a useful method.
  3. Another thing to note: when you assign partitions yourself, it seems that kafka manager cannot get the consumer info. This might be because, when you assign partitions, the info is stored in kafka instead of zookeeper, so kafka manager may not see it. Hope it helps!
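
For point 1, a minimal sketch of assigning several partitions and seeking each one might look like this. The helper names `assign_and_seek` and `example_usage`, the second partition and its offset `20` are assumptions made for illustration; only `assign` and `seek` come from the answer above.

```python
def assign_and_seek(consumer, start_offsets):
    """Assign every partition in start_offsets, then seek each to its offset.

    start_offsets maps TopicPartition -> offset to start reading from.
    """
    partitions = list(start_offsets)
    consumer.assign(partitions)        # manual assignment of all partitions at once
    for tp, offset in start_offsets.items():
        consumer.seek(tp, offset)      # the next fetch for tp starts at this offset
    return partitions


def example_usage():
    # Not called here: needs kafka-python installed and a reachable broker.
    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(bootstrap_servers="localhost:9092", group_id="test_topic")
    assign_and_seek(consumer, {
        TopicPartition("test_topic", 0): 1660000,
        TopicPartition("test_topic", 1): 20,  # hypothetical second partition and offset
    })
    return consumer
```

All partitions go into a single `assign` call because each call to `assign` replaces the previous assignment.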

---edit-----

I found a better way to do it.

topic_partition = TopicPartition(TOPIC,
                                 message.partition)
consumer.seek(topic_partition, offset_value)
consumer.commit()

This takes the partition from the message received from kafka, which saves the step of assigning the partition manually. That is convenient when the offsets of more than one partition (not uncommon) need to be set in the program.

PS: to ensure that each partition is set only once, set a flag as appropriate for your application.
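
One possible shape for that flag is a set of partitions that have already been handled. This is a sketch, not the poster's code: `seek_once` and `already_sought` are hypothetical names, and the consumer is expected to provide the same `seek` method used above.

```python
def seek_once(consumer, topic_partition, offset, already_sought):
    """Seek topic_partition to offset only the first time it is seen.

    already_sought is a set owned by the caller; it plays the role of the flag.
    Returns True if a seek was performed, False if the partition was already set.
    """
    if topic_partition in already_sought:
        return False
    consumer.seek(topic_partition, offset)
    already_sought.add(topic_partition)
    return True


# Inside the consume loop this could be used as (sketch):
#   tp = TopicPartition(TOPIC, message.partition)
#   if seek_once(consumer, tp, offset_value, already_sought):
#       consumer.commit()
```

A set keyed by partition keeps one flag per partition, so partitions that show up later in the stream still get their offset set exactly once.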

You just need to call consumer.commit():

from kafka import KafkaConsumer
from kafka import TopicPartition

KAFKA_TOPIC_NAME = 'KAFKA_TOPIC_NAME'
KAFKA_CONSUMER_GROUP = 'KAFKA_CONSUMER_GROUP'
consumer = KafkaConsumer(
    KAFKA_TOPIC_NAME,
    bootstrap_servers=['localhost:9092'],
    auto_offset_reset='earliest',
    enable_auto_commit=False,
    group_id=KAFKA_CONSUMER_GROUP
)
for message in consumer:
    print(message.value)
    consumer.commit()    # <--- This is what we need
    # Optionally, to check that everything went well
    print('New Kafka offset: %s' % consumer.committed(TopicPartition(KAFKA_TOPIC_NAME, message.partition)))