ForkJoinPool stalls during invokeAll/join

核能气质少年 提交于 2019-11-30 09:59:08

You are dead right about join(). I wrote this article two years ago that points out the problem with join().

As I said there, the framework cannot execute newly submitted requests until it finishes the earlier ones. And each WorkThread cannot steal until it's current request finishes which results in the wait().

The additional threads you see are "continuation threads." Since join() eventually issues a wait(), these threads are needed so the entire framework doesn't stall.

Since nobody else seems to understand my question I try to explain what I found after some nights of debugging:

The current implementation of ForkJoinTasks works well if all fork/join calls are strictly paired. Illustrating a fork by an opening bracket and join by a closing one a perfect binary fork join pattern may look like this:

{([][]) ([][])} {([][]) ([][])}

If you use invokeAll() you may also submit list of subtasks like this:

{([][][][]) ([][][][]) ([][][][])}

What I did however looks like this pattern:

{([) ([)} ... ]]

You may argue this looks ill or is a misuse of the fork-join framework. But the only constraint is, that the tasks completion dependencies are acyclic, else you may run into a deadlock. As long as my [] tasks are not dependent on the () tasks, I don't see any problem with it. The offending ]]'s just express that I do not wait for them explicitly; they may finish some day, it does not matter to me (at that point).

Indeed the current implementation is able to execute my interlocked tasks, but only by spawning additional helper threads which is quite inefficient.

The flaw seems to be the current implementation of join(): joining an ) expects to see its corresponding ( on top of its execution queue, but it finds a [ and is perplexed. Instead of simply executing [] to get rid of it, the current thread suspends (calling wait()) until someone else comes around to execute the unexpected task. This causes a drastic performance break down.

My primary intend was to put additional work onto the queue to prevent the worker thread from suspending if the queue runs empty. Unfortunately the opposite happens :-(

You’re not using this framework for the very narrow purpose for which it was intended.

The framework started life as the experiment in the 2000 research paper. It’s been modified since then but the basic design, fork-and-join on large arrays, remains the same. The basic purpose is to teach undergraduates how to walk down the leaves of a balanced tree. When people use it for other than simple array-processing weird things happen. What it is doing in Java7 is beyond me; which is the purpose of the article.

The problems only get worse in Java8. There it’s the engine to drive all stream parallel work. Have a read in part two of that article. The lambda interest lists are filled with reports of thread stalls, stack overflow, and out of memory errors.

You use it at your own risk when you don’t use it for pure recursive decomposition of large data structures. Even then, the excessive threads it creates can cause havoc. I’m not going to pursue this discussion any further.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!