Apache Flink: Where do State Backends keep the state?

馋奶兔 提交于 2019-12-09 14:16:20

问题


I got a statement below:

"Depending on your state backend, Flink can also manage the state for the application, meaning Flink deals with the memory management (possibly spilling to disk if necessary) to allow applications to hold very large state."

https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/state/state_backends.html

Does it mean that only when the state backends is configured to RocksDBStateBackend, the state would keep in memory and possibly spilling to disk if necessary?

However if configured to MemoryStateBackend or FsStateBackend, the state only keep in memory and would never be spilled to disk.


回答1:


Yes in general you are right. Only with RocksDBStateBackend there will be spilling data to disk.

In case of both MemoryStateBackend and FsStateBackend the state is always kept in TaskManagers memory and thus must fit in there. The difference between those two backends is the way they checkpoint data.

  • In case of MemoryStateBackend the checkpoint data is sent to JobManager and kept also in memory there.

  • The FsStateBackend stores data upon checkpoint in FileSystem and sends only small metadata to JobManager (or in HA scenario stores in metadata folder)

Therefore for any production use-cases the RocksDBStateBackend is highly encouraged. More in-depth information you can find here.



来源:https://stackoverflow.com/questions/50548776/apache-flink-where-do-state-backends-keep-the-state

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!