What would be the best way to debug memory issues in a Dataflow job?
My job was failing with a GC OOM error, but when I profile it locally I cannot reproduce the exact behavior.
Use the pipeline options `--dumpHeapOnOOM` and `--saveHeapDumpsToGcsPath` (see docs). With these set, a worker that hits an OutOfMemoryError will write a heap dump and upload it to the given GCS path, so you can inspect it afterwards.
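For reference, a launch command with these flags might look like the following sketch. The main class, project, region, and bucket names are placeholders, not values from your job:

```shell
# Sketch: launching a Java Dataflow job with heap-dump-on-OOM enabled.
# --dumpHeapOnOOM makes a worker write a heap dump when it hits an OutOfMemoryError;
# --saveHeapDumpsToGcsPath uploads that dump to the given GCS location.
# com.example.MyPipeline, my-project, us-central1, and gs://my-bucket are placeholders.
mvn compile exec:java \
  -Dexec.mainClass=com.example.MyPipeline \
  -Dexec.args="--runner=DataflowRunner \
    --project=my-project \
    --region=us-central1 \
    --dumpHeapOnOOM \
    --saveHeapDumpsToGcsPath=gs://my-bucket/heap-dumps"
```

After the worker OOMs, download the dump from the bucket and open it in a heap analyzer such as Eclipse MAT.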
This will only help if one of your workers actually OOMs. If the worker isn't OOMing but you still observe high memory usage, you can instead run `jmap -dump:live,format=b,file=<path> <pid>` against the harness process on the worker to capture a heap dump at runtime.
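The runtime-dump route might look like this sketch. The VM name, zone, and PID are placeholders; also note that `jmap` generally has to run as the same user as the target JVM, and if your worker runs the harness inside a container you would need to exec into that container first:

```shell
# Sketch: taking a heap dump from a live Dataflow worker (names are placeholders).

# 1. SSH into the worker VM (find its name in the Compute Engine console).
gcloud compute ssh my-dataflow-worker --zone=us-central1-a

# 2. On the worker, find the PID of the Java harness process.
jps -l            # or: ps aux | grep java

# 3. Dump the heap of that process to a file (replace 12345 with the actual PID).
jmap -dump:live,format=b,file=/tmp/heap.hprof 12345

# 4. Copy the dump off the VM and analyze it locally, e.g. with Eclipse MAT.
gcloud compute scp my-dataflow-worker:/tmp/heap.hprof . --zone=us-central1-a
```

Be aware that `jmap -dump:live` triggers a full GC and briefly pauses the JVM, so expect a short stall in the worker while the dump is written.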