问题
Can someone explain what Fork/Join is?
回答1:
Fork Join is a new framework that has an easier to use API for a parallel, divide and conquer algorithm.
Say you have a long running task that, for this instance, has a complicated algorithm. You would want to fork the large tasks and now work on those two tasks. Now lets say that that those two tasks are still too big, you would fork each into two tasks (at this point there are four).
You would continue this until each task is at an acceptable size and invoke the algorithm. It is important to know the invocation of each task is done in parallel. When the task is completed it is joined with the other task it was forked with and consolidate the results.
This will continue until all tasks have been joined and one task is returned.
回答2:
Check these links -
http://www.artima.com/forums/flat.jsp?forum=276&thread=219155
http://lambda-the-ultimate.org/node/3521
http://www.ibm.com/developerworks/java/library/j-jtp11137.html
回答3:
In addition to what was already said, fork/join utilizes work stealing - threads that run out of things to do can steal tasks from other threads that are still busy. And here is an example that can help you to understand how FJ can be used:
public class SumCounter extends RecursiveTask<Long> {
private final Node node;
public SumCounter(Node node) {
this.node = node;
}
@Override
protected Long compute() {
long sum = node.getValue();
List<ValueSumCounter> subTasks = new LinkedList<>();
for(Node child : node.getChildren()) {
SumCounter task = new SumCounter(child);
task.fork(); // run asynchronously
subTasks.add(task);
}
for(SumCounter task : subTasks) {
sum += task.join(); // wait for the result
}
return sum;
}
public static void main(String[] args) {
Node root = getRootNode();
new ForkJoinPool().invoke(new SumCounter(root));
}
}
回答4:
this is a very good resource to understand Fork and Join :
Fork and Join: Java Can Excel at Painless Parallel Programming Too! By Julien Ponge
回答5:
Say you have a collection of things that need to be processed. You have a number of threads that can grab subsets of this collection and process them. They all do this concurrently (the fork part) then wait on the last one to finish (the join part) before returning.
回答6:
Let me just put my two cents to make basic understanding of Fork/Join simple.
What is Fork/Join?
The fork/join framework is an implementation of the ExecutorService interface
that helps you take advantage of multiple processors. It is designed for work
that can be broken into smaller pieces recursively. The goal is to use all the
available processing power to enhance the performance of your application.
- This framework is very useful for modeling divide-and-conquer problems. This approach is suitable for tasks that can be divided recursively and computed on a smaller scale; the computed results are then combined.
- The framework is an implementation of the ExecutorService interface and provides an easy-to-use concurrent platform in order to exploit multiple processors.
Term useful for this framework
- Forking : Dividing the task into smaller tasks is forking.
- Joining : Merging the results from the smaller tasks is joining
As with any ExecutorService implementation, the fork/join framework distributes tasks to worker threads in a thread pool. The fork/join framework is distinct because it uses a work-stealing algorithm. Worker threads that run out of things to do can steal tasks from other threads that are still busy.
The Fork/Join algorithm is designed as follows:
- split tasks
- fork the tasks
- join the tasks
- compose the results
doRecursiveTask(input){ if( task is small enough to handled by a thread){ compute the small task; if there is result to return, then do so }else{ divide the task i.e, fork() into two parts call compute on first task, join on second task, combine both results and return } }
What is work-stealing algorithm?
Each worker thread in the Fork/Join framework has a work queue, which is implemented using a Deque. Each time a new task (or subtask) is created, it is pushed to the head of its own queue. When a task completes a task and executes a join with another task that is not completed yet, it works smart. The thread pops a new task from the head of its queue and starts executing rather than sleeping (in order to wait for another task to complete). In fact, if the queue of a thread is empty, then the thread pops a task from the tail of the queue belonging to another thread. This is nothing but a work-stealing algorithm. More detail is here
回答7:
I will answer what is Fork Join parallelism. This is one of the parallel design pattern widely used in many systems to achieve concurrency. I will explain this design pattern using a example.
For example lets say we have program which executes sequence of tasks :
A -> B -> C -> D. Here A,B,C,D are tasks.
- A takes 8 seconds
- B takes 4 seconds
- C takes 6 seconds
- D takes 7 seconds
So it will take total of 8+4+6+7 = 25 seconds for this program execution.
Now you found out that tasks A,B,C are independent and D is depend on A,B,C tasks' results. Now you may get a feeling that instead of waiting for A to finish we can start the execution of B simultaneously. Same for Task C can start the task simultaneously with A and B. What you can do is: You can invoke 3 new threads by your main thread and assign them A,B,C tasks and wait for the results before start the execution of task D. If your machine has multiple cores then these threads can run in parallel.
Now the execution time taken for the program is :
max(time_taken_A,_B,_C) + time_taken_D + threading_overhead_time
which is almost equal to = 8 + 7 + k = 15 + k;
In fork join parallelism we can offload tasks with a new thread only if these tasks are independent. Other wise we will face race conditions. If you have a program where one task is waiting for another task execution but this is not dependent on its results then you can offload these two tasks with new threads using fork join parallelism and you can get performance boost. But always think about the threading over head. If your tasks are very light weighted then using these parallel patterns will decrease your performance because of the thread creation, context switching overheads.
来源:https://stackoverflow.com/questions/3524634/java-7-fork-join-framework