I wanted to test foldl vs. foldr. From what I've seen, you should use foldl over foldr whenever you can, due to tail recursion optimization.
This makes sense. However, after running my benchmark, the foldr version came out considerably faster than the foldl version, which is the opposite of what I expected.
Well, let me rewrite your functions in a way where the difference should be obvious:
a :: a -> [a] -> [a]
a = (:)
b :: [b] -> b -> [b]
b = flip (:)
You can see that b is more complex than a. To be precise, a needs one reduction step for a value to be calculated, but b needs two. That accounts for the time difference you are measuring: in the second example twice as many reductions must be performed.
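To spell out the extra step (my expansion, using the standard definition flip f x y = f y x):
a x xs = x : xs            -- a is (:) itself: one step
b xs x = flip (:) xs x     -- first unfold flip...
       = x : xs            -- ...then apply (:): two steps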
//edit: But the time complexity is the same, so I wouldn't worry about it much.
EDIT: Upon looking at this problem again, I think all the current explanations are somewhat insufficient, so I've written this longer explanation.
The difference is in how foldl and foldr apply their reduction function. Looking at the foldr case, we can expand it as
foldr (\x -> [x] ++ ) [] [0..10000]
[0] ++ foldr a [] [1..10000]
[0] ++ ([1] ++ foldr a [] [2..10000])
...
This list is processed by sum, which consumes it as follows:
sum = foldl' (+) 0
foldl' (+) 0 ([0] ++ ([1] ++ ... ++ [10000]))
foldl' (+) 0 (0 : [1] ++ ... ++ [10000]) -- get head of list from '++' definition
foldl' (+) 0 ([1] ++ [2] ++ ... ++ [10000]) -- add accumulator and head of list
foldl' (+) 0 (1 : [2] ++ ... ++ [10000])
foldl' (+) 1 ([2] ++ ... ++ [10000])
...
I've left out the details of the list concatenation, but this is how the reduction proceeds. The important part is that everything gets processed in order to minimize list traversals. The foldr only traverses the list once, the concatenations don't require continuous list traversals, and sum finally consumes the list in one pass. Critically, the head of the list is available from foldr immediately to sum, so sum can begin working immediately and values can be gc'd as they are generated. With fusion frameworks such as vector, even the intermediate lists will likely be fused away.
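As an aside (my example, not part of the code above), with the vector package the producer and the sum typically fuse into a single loop, so not even an intermediate vector is allocated:
import qualified Data.Vector.Unboxed as V

-- With stream fusion, enumFromTo and sum are usually combined into one
-- loop, so no intermediate structure is materialised.
fusedSum :: Int
fusedSum = V.sum (V.enumFromTo 0 10000)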
Contrast this to the foldl function:
b xs = (++ xs) . (\y -> [y])
foldl b [] [0..10000]
foldl b ( [0] ++ [] ) [1..10000]
foldl b ( [1] ++ ([0] ++ []) ) [2..10000]
foldl b ( [2] ++ ([1] ++ ([0] ++ [])) ) [3..10000]
...
Note that now the head of the list isn't available until foldl has finished. This means that the entire list must be constructed in memory before sum can begin to work. This is much less efficient overall. Running the two versions with +RTS -s shows miserable garbage collection performance from the foldl version.
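If you want to reproduce that, here is a self-contained sketch (my reconstruction of the two variants; names and list size are illustrative):
module Main where

import System.Environment (getArgs)

-- build the list with foldr: the head is available immediately
sumFoldr :: Int
sumFoldr = sum (foldr (\x acc -> [x] ++ acc) [] [0..100000])

-- build the list with foldl: the head is only available at the end
sumFoldl :: Int
sumFoldl = sum (foldl (\acc x -> [x] ++ acc) [] [0..100000])

main :: IO ()
main = do
  args <- getArgs
  print (if args == ["foldl"] then sumFoldl else sumFoldr)

-- Compile with `ghc -O2 -rtsopts Main.hs`, then compare the output of
-- `./Main foldr +RTS -s` against `./Main foldl +RTS -s`.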
This is also a case where foldl' will not help. The added strictness of foldl' doesn't change the way the intermediate list is created. The head of the list remains unavailable until foldl' has finished, so the result will still be slower than with foldr.
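Roughly why (my simplified expansion; the real accumulator keeps ++ thunks in its tails, but the shape is the same):
-- foldl' forces the accumulator to weak head normal form at each step,
-- but the head of the final result is whatever gets prepended last, so
-- sum still cannot start until the whole input has been traversed:
--   foldl' b [] [0..3]
--     ~ foldl' b (0 : [])          [1..3]
--     ~ foldl' b (1 : 0 : [])      [2..3]
--     ~ foldl' b (2 : 1 : 0 : [])  [3]
--     ~ 3 : 2 : 1 : 0 : []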
I use the following rule to determine the best choice of fold:
- If the fold is a reduction to a single value, use foldl' (e.g. this will be the only/final traversal).
- Otherwise use foldr.
- Don't use foldl.
In most cases foldr is the best fold function because the traversal direction is optimal for lazy evaluation of lists. It's also the only one capable of processing infinite lists. The extra strictness of foldl' can make it faster in some cases, but this depends on how you'll use that structure and how lazy it is.
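Two small examples of that rule of thumb (mine, purely illustrative):
import Data.List (foldl')

-- Reduction to a single strict value: foldl' keeps the accumulator
-- evaluated, so no thunk chain builds up.
total :: [Int] -> Int
total = foldl' (+) 0

-- Lazy construction that works even on an infinite list: foldr only
-- forces as much of the input as the consumer demands.
firstBelowTen :: [Int]
firstBelowTen = foldr (\x acc -> if x < 10 then x : acc else []) [] [0..]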
The problem is that tail recursion optimization is a memory optimization, not an execution time optimization!
Tail recursion optimization avoids the need to remember values for each recursive call.
So, in terms of memory, foldl is in fact "good" and foldr is "bad".
For example, considering the definitions of foldr and foldl:
foldl f z [] = z
foldl f z (x:xs) = foldl f (z `f` x) xs
foldr f z [] = z
foldr f z (x:xs) = x `f` (foldr f z xs)
This is how the expression foldl (+) 0 [1,2,3] is evaluated:
foldl (+) 0 [1, 2, 3]
foldl (+) (0+1) [2, 3]
foldl (+) ((0+1)+2) [3]
foldl (+) (((0+1)+2)+3) [ ]
(((0+1)+2)+3)
((1+2)+3)
(3+3)
6
Note that foldl doesn't remember the values 0, 1, 2, ..., but passes the whole expression (((0+1)+2)+3) as an argument lazily and doesn't evaluate it until the last call of foldl, where it reaches the base case and returns the value passed as the second parameter (z), which hasn't been evaluated yet.
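For comparison, this is roughly how the stricter foldl' (from Data.List) is defined; it forces the accumulator at each step, so the expression (((0+1)+2)+3) never builds up unevaluated:
-- (If your Prelude already exports foldl', add `import Prelude hiding (foldl')`
-- to compile this definition yourself.)
foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f z []     = z
foldl' f z (x:xs) = let z' = z `f` x
                    in z' `seq` foldl' f z' xs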
On the other hand, this is how foldr works:
foldr (+) 0 [1, 2, 3]
1 + (foldr (+) 0 [2, 3])
1 + (2 + (foldr (+) 0 [3]))
1 + (2 + (3 + (foldr (+) 0 [])))
1 + (2 + (3 + 0))
1 + (2 + 3)
1 + 5
6
The important difference here is that foldl evaluates the whole expression in the last call, avoiding the need to come back to remembered values, whereas foldr does not: foldr remembers one integer for each call and performs an addition in each call.
It is important to bear in mind that foldr and foldl are not always equivalent. For instance, try to compute these expressions in Hugs:
foldr (&&) True (False:(repeat True))
foldl (&&) True (False:(repeat True))
foldr and foldl are equivalent only under certain conditions, described here.
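Sketching what happens (my expansion): the foldr version terminates, while the foldl version never does:
-- foldr (&&) True (False : repeat True)
--   = False && foldr (&&) True (repeat True)
--   = False          -- (&&) never looks at its second argument here
--
-- foldl (&&) True (False : repeat True)
--   = foldl (&&) (True && False) (repeat True)
--   = ...            -- keeps recursing; the end of the list is never reached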
(sorry for my bad English)
I don't think anyone's actually given the real answer on this one yet, unless I'm missing something (which may well be true, in which case downvotes are welcome).
I think the biggest difference in this case is that foldr builds the list like this:
[0] ++ ([1] ++ ([2] ++ (... ++ [1000000])))
Whereas foldl builds the list like this:
((((([0] ++ [1]) ++ [2]) ++ ...) ++ [999998]) ++ [999999]) ++ [1000000]
The difference is subtle, but notice that in the foldr version ++ always has only one list element as its left argument. With the foldl version, there are up to 999999 elements in ++'s left argument (on average around 500000), but only one element in the right argument.
However, ++ takes time proportional to the size of its left argument, as it has to look through the entire left argument list to the end and then repoint that last element to the first element of the right argument (at best; perhaps it actually needs to do a copy). The right argument list is unchanged, so it doesn't matter how big it is.
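You can see why from the standard definition of ++ (roughly as in the Prelude): it walks and rebuilds the left list, and simply reuses the right list as the final tail.
-- (To compile this yourself you'd need `import Prelude hiding ((++))`.)
(++) :: [a] -> [a] -> [a]
[]     ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)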
That's why the foldl version is much slower. It's got nothing to do with laziness, in my opinion.
For a, the [0..100000] list needs to be expanded right away so that foldr can start with the last element. Then, as it folds things together, the intermediate results are
[100000]
[99999, 100000]
[99998, 99999, 100000]
...
[0..100000] -- i.e., the original list
Because nobody is allowed to change this list value (Haskell is a pure functional language), the compiler is free to reuse the value. The intermediate values, like [99999, 100000], can even be simply pointers into the expanded [0..100000] list instead of separate lists.
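To sketch the kind of reuse meant here (my illustration), each intermediate result built with (:) keeps the previous one as its tail, so nothing has to be copied:
step1, step2, step3 :: [Int]
step1 = 100000 : []        -- [100000]
step2 = 99999 : step1      -- [99999, 100000], sharing step1 as its tail
step3 = 99998 : step2      -- [99998, 99999, 100000], sharing step2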
For b, look at the intermediate values:
[0]
[0, 1]
[0, 1, 2]
...
[0, 1, ..., 99999]
[0.. 100000]
None of those intermediate lists can be reused, because if you changed the end of a list you would change any other values that point to it. So you're creating a bunch of extra lists that take time to build in memory, and in this case you spend a lot more time allocating and filling in these intermediate lists.
Since you're just making a copy of the list, a runs faster because it starts by expanding the full list and then just keeps moving a pointer from the back of the list to the front.
Neither foldl nor foldr is tail optimized. It is only foldl'.
But in your case, using ++ with foldl' is not a good idea, because successive evaluations of ++ will traverse the growing accumulator again and again.
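To illustrate (my sketch, assuming the accumulator ends up on the left of ++, as in the left-nested build shown in one of the other answers):
-- foldl' (\acc x -> acc ++ [x]) [] [0..3]
--   = ((([] ++ [0]) ++ [1]) ++ [2]) ++ [3]
-- Each application of ++ has to walk the whole accumulator built so
-- far, so the total work grows quadratically with the length of the list.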