Flink taskmanager out of memory and memory configuration

Question

We are using Flink streaming to run a few jobs on a single cluster. Our jobs use RocksDB to hold state. The cluster is configured with a single JobManager and 3 TaskManagers on 3 separate VMs. Each TM is configured to run with 14GB of RAM. The JM is configured to run with 1GB.

We are experiencing 2 memory-related issues:

- When running the TaskManager with an 8GB heap allocation, the TM ran out of heap memory and we got a heap out-of-memory exception. Our solution was to increase the heap size to 14GB. This seems to have solved the issue, as we no longer crash due to running out of heap memory.
- Still, after increasing the heap size to 14GB (per TM process), the OS runs out of memory and kills the TM process. RES memory rises over time and reaches ~20GB per TM process.

1. How can we predict the maximum total amount of physical memory required, and the heap size to configure?

2. Given our memory issues, is it reasonable to use non-default values for Flink managed memory? What would the guideline be in that case?

Further details: Each VM is configured with 4 CPUs and 24GB of RAM. We are using Flink version 1.3.2.

Answer 1:

The total amount of required physical and heap memory is quite difficult to compute since it strongly depends on your user code, your job's topology and which state backend you use.
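Even without an exact formula, the figures from the question can be turned into a rough budget. The sketch below is plain Java with made-up names (MemoryHeadroom, nonHeapBytes, headroomBytes). The 2 GiB reserved for the operating system is an assumed value, and the calculation treats the heap as fully resident, which may overstate the heap's share of RES.

```java
// Hypothetical helper, not part of Flink: a rough memory budget for one TaskManager VM.
class MemoryHeadroom {
    static final long GIB = 1024L * 1024L * 1024L;

    /** Memory the process uses outside the Java heap, inferred from the observed RES. */
    static long nonHeapBytes(long observedResidentBytes, long heapBytes) {
        return observedResidentBytes - heapBytes;
    }

    /** Memory left on the host for native allocations once heap and OS reserve are subtracted. */
    static long headroomBytes(long hostBytes, long heapBytes, long reservedForOsBytes) {
        return hostBytes - heapBytes - reservedForOsBytes;
    }

    public static void main(String[] args) {
        long host = 24L * GIB;      // VM size from the question
        long heap = 14L * GIB;      // configured TM heap from the question
        long resident = 20L * GIB;  // observed RES from the question
        long osReserve = 2L * GIB;  // assumption: memory kept free for the OS

        System.out.println("non-heap usage (GiB): " + nonHeapBytes(resident, heap) / GIB);
        System.out.println("native headroom (GiB): " + headroomBytes(host, heap, osReserve) / GIB);
    }
}
```

With the question's numbers, the process already uses several GiB outside the heap, so a 14GB heap leaves little margin on a 24GB VM once the OS and other processes are accounted for. A smaller heap leaves more room for native memory.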

As a rule of thumb, if you experience OOM and are still using the FsStateBackend (the file system state backend) or the MemoryStateBackend, then you should switch to the RocksDBStateBackend, because it can gracefully spill to disk if the state grows too big.

If you are still experiencing OOM exceptions as you have described, then you should check whether your user code keeps references to state objects or otherwise generates large objects which cannot be garbage collected. If this is the case, then you should try to refactor your code to rely on Flink's state abstraction, because with RocksDB the state can go out of core.

RocksDB itself needs native memory, which adds to Flink's memory footprint. How much depends on the block cache size, indexes, bloom filters and memtables. You can find out more about these things and how to configure them here.
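To get a feel for the order of magnitude, these components can be added up per RocksDB column family. The following is a back-of-the-envelope sketch, not a formula taken from Flink or RocksDB. It assumes that each registered state maps to its own column family with its own block cache and memtables, and the class and parameter names are made up for illustration. The sample values (8 MiB block cache, 64 MiB write buffer, 2 write buffers) are assumed to match RocksDB's defaults and should be checked against your version. The count of 20 column families is arbitrary.

```java
// Hypothetical helper, not part of Flink or RocksDB: rough native memory estimate.
class RocksDbMemoryEstimate {
    static final long MIB = 1024L * 1024L;

    /**
     * Rough estimate of RocksDB native memory on one TaskManager, in bytes.
     * Per column family: block cache + all memtables + index/filter blocks.
     */
    static long estimateNativeBytes(int columnFamilies,
                                    long blockCacheBytes,
                                    long writeBufferBytes,
                                    int maxWriteBuffers,
                                    long indexAndFilterBytes) {
        long memtableBytes = writeBufferBytes * maxWriteBuffers;
        long perColumnFamily = blockCacheBytes + memtableBytes + indexAndFilterBytes;
        return columnFamilies * perColumnFamily;
    }

    public static void main(String[] args) {
        int columnFamilies = 20;        // assumption: states x parallel instances on this TM
        long blockCache = 8L * MIB;     // assumed default block cache size
        long writeBuffer = 64L * MIB;   // assumed default memtable size
        int maxWriteBuffers = 2;        // assumed default number of memtables
        long indexAndFilter = 0L;       // unknown here; grows with the amount of state

        long total = estimateNativeBytes(
                columnFamilies, blockCache, writeBuffer, maxWriteBuffers, indexAndFilter);
        System.out.println("estimated RocksDB native memory (MiB): " + total / MIB);
    }
}
```

The estimate scales linearly with the number of column families, so jobs with many states or a high slot count per TaskManager need proportionally more native memory on top of the heap.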

Last but not least, you should not activate taskmanager.memory.preallocate when running streaming jobs, because streaming jobs currently don't use managed memory. By activating preallocation, you would allocate memory for Flink's managed memory, which reduces the available heap space.

Answer 2:

Using RocksDBStateBackend can lead to significant off-heap/direct memory consumption, up to the available memory on the host. Normally that is not a problem when the task manager process is the only big memory consumer. However, if there are other processes with dynamically changing memory allocations, the host can run out of memory. I came across this post because I was looking for a way to cap the RocksDBStateBackend memory usage. As of Flink 1.5, there are alternative option sets available here. It appears, though, that these can only be activated programmatically, not via flink-conf.yaml.

Source: https://stackoverflow.com/questions/50812837/flink-taskmanager-out-of-memory-and-memory-configuration

Tags

apache-flink

flink-streaming