Kafka Streams State Store Unrecoverable from Change Log Topic

Submitted by 孤街浪徒 on 2020-06-16 03:37:26

Question


When our Kafka Streams application attempts to recover state from the changelog topic, our RocksDB state store directory grows continually (10 GB+) until we run out of disk space, and the state is never actually recovered.

How I can reproduce it (a rough sketch of the topology follows these steps):

  1. I start up our application with a brand new changelog topic.
  2. I push a few hundred thousand records through. I note my RocksDB state store is around 100 MB.
  3. I gracefully shut down the application and restart it.
  4. I see the restore consumers logging that they are rebuilding the state store from the beginning. I then watch my RocksDB state store directory size increase until I run out of disk space (tens of GB).
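For reference, here is a rough sketch of the kind of topology that exercises this restore path: an aggregation backed by a persistent (RocksDB) key-value store whose changelog is replayed on restart. The application id, topic name, store name, serdes, and state directory below are assumptions for illustration, not our actual code.

```java
// Hypothetical minimal topology resembling the setup described above.
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Materialized;

public class RestoreRepro {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "restore-repro");      // prefixes the changelog topic name
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.STATE_DIR_CONFIG, "/tmp/kafka-streams");      // RocksDB files live under this dir
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic")                    // assumed input topic
               .groupByKey()
               .reduce((oldVal, newVal) -> newVal,       // keep only the latest value per key
                       Materialized.as("my-store"));     // persistent store -> "restore-repro-my-store-changelog"

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));  // graceful shutdown (step 3)
    }
}
```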

How does a RocksDB state store that is only a few hundred MB during normal operation grow to more than 10 GB while recovering from the changelog topic? Is there some compression/compaction that happens during normal operation but not during recovery? Is my changelog topic not set up properly? (We have to create the topic ahead of time due to security requirements; cleanup.policy is set to compact.)
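Since we have to pre-create the changelog topic ourselves, one way to do so is with the Kafka AdminClient, following the Kafka Streams naming convention `<application.id>-<store name>-changelog` and setting cleanup.policy=compact. This is only a sketch: the topic name, partition count (which has to match the partition count of the input topic feeding the store), replication factor, and the extra compaction tunings are assumptions, not our actual setup.

```java
// Sketch: pre-create the changelog topic with log compaction enabled.
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateChangelog {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // Topic name, partitions, replication factor, and tunings are illustrative assumptions.
            NewTopic changelog = new NewTopic("restore-repro-my-store-changelog", 4, (short) 3)
                    .configs(Map.of(
                            "cleanup.policy", "compact",           // required for non-windowed store changelogs
                            "segment.bytes", "104857600",          // smaller segments become eligible for compaction sooner
                            "min.cleanable.dirty.ratio", "0.1"));  // compact more aggressively
            admin.createTopics(List.of(changelog)).all().get();
        }
    }
}
```

Note that the log cleaner never compacts the active segment, so a topic with large segments and a high dirty ratio can retain many more updates per key than the store itself holds.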

I will note that we have relatively few keys compared to the number of records we pass into our Streams application; most records are updates to existing keys.

Source: https://stackoverflow.com/questions/56726224/kafka-streams-state-store-unrecoverable-from-change-log-topic
