Question
I ran a job first with parallelism 1 and then with parallelism 3. With parallelism 1, the Kafka source was reading records at a rate of ~500 records per second. With parallelism 3, the throughput was divided among the three parallel instances, each reading ~150 records per second. Note that the source is publishing records at a much higher rate (~1000 records per second).
Is this expected? I would expect the throughput to increase with parallelism, but it stays the same. I checked the Backpressure status on the source, and it was High.
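To make the numbers above concrete, here is a rough sketch of the arithmetic. It sums the per-instance rates and compares them against a simple "minimum of all stages" model. The function names and the `downstream_capacity` value of 500 records per second are assumptions chosen for illustration, not measurements from the job. The model itself is only a guess at why the total stays flat while backpressure is High.

```python
def aggregate_throughput(per_instance_rate: float, parallelism: int) -> float:
    """Total records/s read across all parallel source instances."""
    return per_instance_rate * parallelism


def effective_throughput(
    publish_rate: float,
    source_capacity_per_instance: float,
    parallelism: int,
    downstream_capacity: float,
) -> float:
    """Simplified model: end-to-end rate is capped by the slowest stage."""
    source_capacity = source_capacity_per_instance * parallelism
    return min(publish_rate, source_capacity, downstream_capacity)


# Observed values from the question.
total_p1 = aggregate_throughput(500, 1)
total_p3 = aggregate_throughput(150, 3)
print(total_p1, total_p3)

# Hypothetical: a downstream stage that can only absorb 500 records/s.
# Under this assumption, adding source instances does not raise the total.
model_p1 = effective_throughput(1000, 500, 1, downstream_capacity=500)
model_p3 = effective_throughput(1000, 500, 3, downstream_capacity=500)
print(model_p1, model_p3)
```

Under this assumed model the total is the same for parallelism 1 and 3, which roughly matches the observed ~500 versus ~450 records per second in aggregate.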
Screenshots for reference: Parallelism 1 and Parallelism 3.
Source: https://stackoverflow.com/questions/56601740/effect-of-increasing-parallelism-on-throughput