I'm trying to understand why we need all parts of the standard sample code:
a `par` b `pseq` a+b
Why won't the following be sufficient?
Ok. I think the following paper answers my question: http://community.haskell.org/~simonmar/papers/threadscope.pdf
In summary, the problem with

a `par` b `par` a+b

and

a `par` a+b

is the lack of ordering of evaluation. In both versions, the main thread gets to work on `a` (or sometimes `b`) immediately, causing the sparks to "fizzle" away, since there is no longer any need to start a thread to evaluate what the main thread has already begun evaluating.
The original version,

a `par` b `pseq` a+b

ensures the main thread works on `b` before `a+b` (or else it would have started evaluating `a` instead), thus giving the spark for `a` a chance to materialize into a thread for parallel evaluation.
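As a concrete illustration, here is a minimal sketch of the pattern applied to naive Fibonacci (using GHC.Conc, which exports `par` and `pseq` from base; Control.Parallel from the parallel package re-exports the same functions — the function name `pfib` is mine):

```haskell
import GHC.Conc (par, pseq)

-- Naive parallel Fibonacci: spark one subproblem, force the other
-- in the current thread, then combine the results.
pfib :: Int -> Int
pfib n
  | n < 2     = n
  | otherwise = a `par` (b `pseq` (a + b))
  where
    a = pfib (n - 1)  -- sparked for possible parallel evaluation
    b = pfib (n - 2)  -- evaluated by the current thread first

main :: IO ()
main = print (pfib 25)  -- prints 75025
```

Compiled with `ghc -threaded` and run with `+RTS -N`, the spark for `a` can be converted into real parallel work; without the threaded runtime it simply fizzles and the result is still correct.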
a `par` b `par` a+b

creates sparks for both `a` and `b`, but `a+b` is reached immediately, so one of the sparks fizzles (i.e., it is evaluated in the main thread). The problem with this is efficiency: we created an unnecessary spark. If you're using this pattern to implement parallel divide and conquer, the overhead will limit your speedup.
a `par` a+b

seems better because it creates only a single spark. However, attempting to evaluate `a` before `b` will fizzle the spark for `a`, and since `b` has no spark, this results in sequential evaluation of `a+b`. Switching the order to `b+a` might appear to solve this, but writing the operands in that order does not enforce evaluation order, and Haskell could still evaluate it as `a+b`.
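To make the comparison concrete, the three variants can be written side by side over arbitrary (lazy) arguments — a sketch with function names of my choosing, and the spark behaviour from the discussion summarised in comments:

```haskell
import GHC.Conc (par, pseq)

-- The standard pattern: spark `a`, force `b` here, then combine.
-- pseq guarantees a+b is not demanded until b is done, so the spark
-- for `a` has time to be picked up by another capability.
sumParPseq :: Int -> Int -> Int
sumParPseq a b = a `par` (b `pseq` (a + b))

-- Sparks both operands but demands a+b immediately; whichever operand
-- the main thread evaluates first fizzles its spark (wasted overhead).
sumParPar :: Int -> Int -> Int
sumParPar a b = a `par` (b `par` (a + b))

-- A single spark for `a`, but a+b may demand `a` first, fizzling the
-- spark and degenerating into sequential evaluation.
sumParOnly :: Int -> Int -> Int
sumParOnly a b = a `par` (a + b)
```

All three return the same value; they differ only in how much useful parallelism the sparks can yield.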
So, we write

a `par` b `pseq` a+b

to force evaluation of `b` in the main thread before we attempt to evaluate `a+b`. This gives the spark for `a` a chance to materialise before we try evaluating `a+b`, and we haven't created any unnecessary sparks.
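To check whether sparks are actually converted into parallel work or fizzle, GHC's runtime can report spark statistics; a sketch of the invocation (the file name Main.hs is an assumption):

```shell
# Compile with optimisations and the threaded runtime
ghc -O2 -threaded Main.hs

# Run on two capabilities and print RTS statistics; the output
# includes a SPARKS line with converted / fizzled / GC'd counts
./Main +RTS -N2 -s
```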
a `par` b `par` a+b

will indeed evaluate a and b in parallel and return a+b. However, it gives no ordering guarantee, which is why the standard version uses pseq: in a `par` b `pseq` a+b, the pseq ensures b is evaluated before a+b is demanded, so the spark for a has time to run in parallel.
See this link for more details on that topic.