The documentation of the seq function says the following:
A note on evaluation order: the expression seq a b does not guarantee that a will be evaluated before b.
The answers so far have focused on the seq versus pseq performance issues, but I think you originally wanted to know which of the two you should use.
The short answer is: while both should generate nearly identically performing code in practice (at least when proper optimization flags are turned on), the primitive seq, and not pseq, is the correct choice for your situation. Using pseq is non-idiomatic, confusing, and potentially counterproductive from a performance standpoint, and your reason for using it is based on a flawed understanding of what its order-of-evaluation guarantee means and what it implies with respect to performance. While there are no guarantees about performance across different sets of compiler flags (much less across other compilers), if you ever run into a situation where the seq version of the above code runs significantly slower than the pseq version using "production quality" optimization flags with the GHC compiler, you should consider it a GHC bug and file a bug report.
The long answer is, of course, longer...
First, let's be clear that seq and pseq are semantically identical in the sense that they both satisfy the equations:
seq _|_ b = _|_
seq a b = b -- if a is not _|_
pseq _|_ b = _|_
pseq a b = b -- if a is not _|_
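These equations can be checked with a small sketch. It assumes GHC, imports pseq from GHC.Conc in base (the parallel package's Control.Parallel also exports it), and uses a thrown exception as a stand-in for _|_, since genuine non-termination cannot be observed. The helper isBottom is a hypothetical name introduced for this illustration.

```haskell
import Control.Exception (SomeException, evaluate, try)
import GHC.Conc (pseq)

-- Hypothetical helper: True when forcing the argument to WHNF throws.
-- An exception stands in for _|_ here; an infinite loop would not be caught.
isBottom :: a -> IO Bool
isBottom x = do
  r <- try (evaluate x)
  pure $ case r of
    Left e  -> const True (e :: SomeException)
    Right _ -> False

main :: IO ()
main = do
  -- seq _|_ b = _|_  and  pseq _|_ b = _|_
  isBottom (seq  (undefined :: Int) 'b') >>= print
  isBottom (pseq (undefined :: Int) 'b') >>= print
  -- seq a b = b  and  pseq a b = b  when a is defined
  print (seq (1 :: Int) 'b', pseq (1 :: Int) 'b')
```

Both isBottom checks should print True and the last line should print ('b','b'), so the two functions are indistinguishable at this level.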
This is really the only thing that either of them guarantees semantically. Since the definition of the Haskell language (as given in the Haskell Report, say) makes -- at best -- only semantic guarantees and does not deal with performance or implementation, there's no reason to choose one over the other for reasons of guaranteed performance across different compilers or compiler flags.
Furthermore, in your particular seq-based version of the function sum, it's not too difficult to see that there is no situation in which seq is called with an undefined first argument but a defined second argument (assuming the use of a standard numeric type), so you aren't even using the semantic properties of seq. You could re-define seq as seq a b = b and have exactly the same semantics. Of course, you know this -- that's why your first version didn't use seq. Instead, you're using seq for an incidental performance side-effect, so we're out of the realm of semantic guarantees and back in the realm of specific GHC compiler implementation and performance characteristics (where there aren't really any guarantees to speak of).
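For concreteness, here is a minimal sketch of the kind of accumulator loop under discussion. It is not the code from the question; the names sumLazy and sumSeq are hypothetical. The two functions return the same value for every input, which is the point: seq changes only how the accumulator is evaluated along the way.

```haskell
-- Lazy accumulator: without optimization this can build a chain of
-- unevaluated (+) thunks before anything is added.
sumLazy :: [Integer] -> Integer
sumLazy = go 0
  where
    go acc []     = acc
    go acc (x:xs) = go (acc + x) xs

-- Same loop, but each new accumulator is forced to WHNF with seq
-- before the recursive call's result is returned.
sumSeq :: [Integer] -> Integer
sumSeq = go 0
  where
    go acc []     = acc
    go acc (x:xs) = let acc' = acc + x in acc' `seq` go acc' xs

main :: IO ()
main = do
  print (sumLazy [1 .. 1000000])
  print (sumSeq  [1 .. 1000000])
```

Both lines print 500000500000. Any difference shows up only in memory use and run time, and with optimization turned on GHC's strictness analysis may well erase it.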
Second, that brings us to the intended purpose of seq. It is rarely used for its semantic properties because those properties aren't very useful. Who would want a computation seq a b to return b except that it should fail to terminate if some unrelated expression a fails to terminate? (The exceptions -- no pun intended -- would be things like handling exceptions, where you might use seq, or deepseq, which is based on seq, to force evaluation of a non-terminating expression in either an uncontrolled or controlled way, before starting evaluation of another expression.)
Instead, seq a b is intended to force evaluation of a to weak head normal form (WHNF) before returning the result of b to prevent accumulation of thunks. The idea is, if you have an expression b which builds a thunk that could potentially accumulate on top of another unevaluated thunk represented by a, you can prevent that accumulation by using seq a b. The "guarantee" is a weak one: GHC guarantees that it understands you don't want a to remain an unevaluated thunk when seq a b's value is demanded. Technically, it doesn't guarantee that a will be "evaluated before" b, whatever that means, but you don't need that guarantee. Worrying that, without this guarantee, GHC might evaluate the recursive call first and still accumulate thunks is as ridiculous as worrying that pseq a b might evaluate its first argument, then wait 15 minutes (just to make absolutely sure the first argument has been evaluated!), before evaluating its second.
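As a reminder of how little "weak head normal form" demands, the following sketch shows seq stopping at the outermost constructor. The names spineOnly and pairOnly are hypothetical and exist only for this illustration.

```haskell
-- The outermost (:) cell is all that seq forces; the undefined elements
-- are never touched, and length only walks the spine.
spineOnly :: Int
spineOnly = xs `seq` length xs
  where
    xs = [undefined, undefined] :: [Int]

-- A tuple constructor is already in WHNF, so its undefined first
-- component is left alone.
pairOnly :: Int
pairOnly = p `seq` snd p
  where
    p = (undefined, 42) :: (Int, Int)

main :: IO ()
main = do
  print spineOnly
  print pairOnly
```

This prints 2 and 42. For a numeric accumulator, WHNF happens to be the fully evaluated number, which is why seq is enough in the sum case.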
This is a situation where you should trust GHC to do the right thing. It may seem to you that the only way to realize the performance benefit of seq a b is for a to be evaluated to WHNF before evaluation of b starts, but it is conceivable that there are optimizations in this or other situations that technically start evaluating b (or even fully evaluate b to WHNF) while leaving a unevaluated for a short time to improve performance while still preserving the semantics of seq a b. By using pseq instead, you may prevent GHC from making such optimizations. (In your sum program situation, there undoubtedly is no such optimization, but in a more complex use of seq, there might be.)
Third, it's important to understand what pseq is actually for. It was first described in Marlow 2009 in the context of parallel programming. Suppose we want to parallelize two expensive computations foo and bar and then combine (say, add) their results:
foo `par` (bar `seq` foo+bar) -- parens redundant but included for clarity
The intention here is that -- when this expression's value is demanded -- it creates a spark to compute foo in parallel and then, via the seq expression, starts evaluating bar to WHNF (i.e., its numeric value, say) before finally evaluating foo+bar, which will wait on the spark for foo before adding and returning the results.
Here, it's conceivable that GHC will recognize that for a specific numeric type, (1) foo+bar automatically fails to terminate if bar does, satisfying the formal semantic guarantee of seq; and (2) evaluating foo+bar to WHNF will automatically force evaluation of bar to WHNF, preventing any thunk accumulation and so satisfying the informal implementation guarantee of seq. In this situation, GHC may feel free to optimize the seq away to yield:
foo `par` foo+bar
particularly if it feels that it would be more performant to start evaluation of foo+bar before finishing evaluating bar to WHNF.
What GHC isn't smart enough to realize is that, if evaluation of foo in foo+bar starts before the foo spark is scheduled, the spark will fizzle, and no parallel execution will occur.
It's really only in this case, where you need to explicitly delay demanding the value of a sparked expression to allow an opportunity for it to be scheduled before the main thread "catches up", that you need the extra guarantee of pseq and are willing to have GHC forgo additional optimization opportunities permitted by the weaker guarantee of seq:
foo `par` (bar `pseq` foo+bar)
Here, pseq will prevent GHC from introducing any optimization that might allow foo+bar to start evaluating (potentially fizzling the foo spark) before bar is in WHNF (which, we hope, allows enough time for the spark to be scheduled).
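The par/pseq pattern above can be turned into a complete program along the following lines. This is a sketch under stated assumptions: fib is a deliberately naive stand-in for an expensive computation, parSum is a hypothetical name, and par and pseq are imported from GHC.Conc in base (Control.Parallel from the parallel package exports the same functions).

```haskell
import GHC.Conc (par, pseq)

-- Naive Fibonacci, used only as a stand-in for an expensive computation.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Spark foo, evaluate bar to WHNF on the current thread, and only then
-- demand foo + bar, giving the spark a chance to be picked up.
parSum :: Int -> Int -> Integer
parSum m n = foo `par` (bar `pseq` (foo + bar))
  where
    foo = fib m
    bar = fib n

main :: IO ()
main = print (parSum 25 24)
```

The program prints 121393 whether or not the spark is ever run in parallel. To give it a chance to run on a second core, compile with -threaded and run with +RTS -N.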
The upshot is that, if you're using pseq for anything other than parallel programming, you're using it wrong. (Well, maybe there are some weird situations, but...) If all you want to do is force strict evaluation and/or thunk evaluation to improve performance in non-parallel code, using seq (or $!, which is defined in terms of seq, or Haskell strict data types, which are defined in terms of $!) is the correct approach.
(Or, if @Kindaro's benchmarks are to be believed, maybe merciless benchmarking with specific compiler versions and flags is the correct approach.)
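The seq-based alternatives mentioned above might look like the following sketch. The names sumBang, Stats, and mean are hypothetical, and foldl' from Data.List is used so that each intermediate Stats value is forced to WHNF, which in turn forces its strict fields.

```haskell
import Data.List (foldl')

-- Accumulator loop using ($!): the new accumulator is forced to WHNF
-- before go is applied to it.
sumBang :: [Integer] -> Integer
sumBang = go 0
  where
    go acc []     = acc
    go acc (x:xs) = (go $! acc + x) xs

-- Strict fields: both components are evaluated whenever a Stats value
-- is itself evaluated to WHNF.
data Stats = Stats !Integer !Int

-- Running total and count in one pass; the result is NaN for an empty list.
mean :: [Integer] -> Double
mean xs = fromIntegral total / fromIntegral count
  where
    Stats total count = foldl' step (Stats 0 0) xs
    step (Stats t c) x = Stats (t + x) (c + 1)

main :: IO ()
main = do
  print (sumBang [1 .. 1000000])
  print (mean [1 .. 4])
```

This prints 500000500000 and 2.5. With a lazy pair in place of Stats, foldl' would force only the pair constructor at each step and the two components could still pile up as thunks.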