ForkJoinPool scheduling vs ExecutorService

人走茶凉 提交于 2020-01-01 11:32:11

问题


I'm slightly confused by the internal scheduling mechanism of the ExecutorService and the ForkJoinPool.

I understood the ExecutorService scheduling is done this way.

A bunch of tasks are queued. Once a thread is available it will handle the first available task and so forth.

Meanwhile, a ForkJoinPool is presented as distinct because it uses a work-stealing algorithm. If I understand correctly it means a thread can steal some tasks from another thread.

Yet, I don't really understand the difference with the mechanism implemented in the ExecutorService. From my understanding, both mechanisms should allow to reduce as most as possible the idle time of each threads.

I would understand if in the case of an ExecutorService, each thread would have its own queue. Yet, it is not the case as the queue is shared by the different threads of the pool...

Any clarification would be more than welcome!


回答1:


Suppose you have a very big array of ints and you want to add all of them. With an ExecutorService you might say: let's divide that array into chunks of let's say number of threads / 4. So if you have an array of 160 elements (and you have 4 CPU's), you create 120 / 4 / 4 = 10, so you would create 16 chunks each holding 10 ints. Create runnables/callables and submit those to an executor service (and of course think of a way to merge those results once they are done).

Now your hopes are that each of the CPU's will take 4 of those tasks and work on them. Now let's also suppose that some of the numbers are very complicated to add (of course not, but bear with me), it could turn out that 3 threads/CPUs are done with their work while one of them is busy only with the first chunk. No one wants that, of course, but could happen. The bad thing now is that you can't do anything about it.

What ForkJoinPool does instead is say provide me with how you want to split your task and the implementation for the minimal workload I have to do and I'll take care of the rest. In the Stream API this is done with Spliterators; mainly with two methods trySplit (that either returns null meaning nothing can be split more or a new Spliterator - meaning a new chunk) and forEachRemaning that will process elements once you can't split your task anymore. And this is where work stealing will help you.

You say how are your chunks are computed (usually split in half) and what to do when you can't split anymore. ForkJoinPool will dispatch the first chunk to all threads and when some of them are free - they are done with their work, they can query other queues from other threads and see if they have work. If they notice that there are chunks in some other threads queues, they will take them, split them on their own and work on those. It can even turn out that they don't do the entire work on that chunks on their own - some other thread can now query this thread's queue and notice that there is still work to do and so on... This is far better as now, when those 3 threads are free they can pick up some other work to do - and all of them are busy.


This example is a bit simplified, but is not very far from reality. It's just that you need to have a lot more chunks than CPU's/threads for work stealing to work; thus usually trySplit has to have a smart implementation and you need lots of elements in the source of your stream.



来源:https://stackoverflow.com/questions/52717900/forkjoinpool-scheduling-vs-executorservice

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!