问题
I'm currently on chapter 4 of Real World Haskell, and I'm trying to wrap my head around implementing foldl in terms of foldr.
(Here's their code:)
myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl f z xs = foldr step id xs z
where step x g a = g (f a x)
I thought I'd try to implement zip
using the same technique, but I don't seem to be making any progress. Is it even possible?
回答1:
zip2 xs ys = foldr step done xs ys
where done ys = []
step x zipsfn [] = []
step x zipsfn (y:ys) = (x, y) : (zipsfn ys)
How this works: (foldr step done xs) returns a function that consumes ys; so we go down the xs list building up a nested composition of functions that will each be applied to the corresponding part of ys.
How to come up with it: I started with the general idea (from similar examples seen before), wrote
zip2 xs ys = foldr step done xs ys
then filled in each of the following lines in turn with what it had to be to make the types and values come out right. It was easiest to consider the simplest cases first before the harder ones.
The first line could be written more simply as
zip2 = foldr step done
as mattiast showed.
回答2:
The answer had already been given here, but not an (illustrative) derivation. So even after all these years, perhaps it's worth adding it.
It is actually quite simple. First,
foldr f z xs = foldr f z [x1,x2,x3,...,xn] = f x1 (foldr f z [x2,x3,...,xn]) = ... = f x1 (f x2 (f x3 (... (f xn z) ...)))
hence by eta-expansion,
foldr f z xs ys = foldr f z [x1,x2,x3,...,xn] ys = f x1 (foldr f z [x2,x3,...,xn]) ys = ... = f x1 (f x2 (f x3 (... (f xn z) ...))) ys
As is apparent here, if f
is non-forcing in its 2nd argument, it gets to work first on x1
and ys
, f x1
r1
ys
where r1 =
(f x2 (f x3 (... (f xn z) ...)))
= foldr f z [x2,x3,...,xn]
.
So, using
f x1 r1 [] = [] f x1 r1 (y1:ys1) = (x1,y1) : r1 ys1
we arrange for passage of information left-to-right along the list, by calling r1
with the rest of the input list ys1
, foldr f z [x2,x3,...,xn]
ys1 = f x2
r2
ys1
, as the next step. And that's that.
When ys
is shorter than xs
(or the same length), the []
case for f
fires and the processing stops. But if ys
is longer than xs
then f
's []
case won't fire and we'll get to the final f xn
z
(yn:ysn)
application,
f xn z (yn:ysn) = (xn,yn) : z ysn
Since we've reached the end of xs
, the zip
processing must stop:
z _ = []
And this means the definition z = const []
should be used:
zip xs ys = foldr f (const []) xs ys
where
f x r [] = []
f x r (y:ys) = (x,y) : r ys
From the standpoint of f
, r
plays the role of a success continuation, which f
calls when the processing is to continue, after having emitted the pair (x,y)
.
So r
is "what is done with more ys
when there are more x
s", and z = const []
, the nil
-case in foldr
, is "what is done with ys
when there are no more x
s". Or f
can stop by itself, returning []
when ys
is exhausted.
Notice how ys
is used as a kind of accumulating value, which is passed from left to right along the list xs
, from one invocation of f
to the next ("accumulating" step being, here, stripping a head element from it).
Naturally this corresponds to the left fold, where an accumulating step is "applying the function", with z = id
returning the final accumulated value when "there are no more x
s":
foldl f a xs =~ foldr (\x r a-> r (f a x)) id xs a
Similarly, for finite lists,
foldr f a xs =~ foldl (\r x a-> r (f x a)) id xs a
And since the combining function gets to decide whether to continue or not, it is now possible to have left fold that can stop early:
foldlWhile t f a xs = foldr cons id xs a
where
cons x r a = if t x then r (f a x) else a
or a skipping left fold, foldlWhen t ...
, with
cons x r a = if t x then r (f a x) else r a
etc.
回答3:
I found a way using quite similar method to yours:
myzip = foldr step (const []) :: [a] -> [b] -> [(a,b)]
where step a f (b:bs) = (a,b):(f bs)
step a f [] = []
回答4:
For the non-native Haskellers here, I've written a Scheme version of this algorithm to make it clearer what's actually happening:
> (define (zip lista listb)
((foldr (lambda (el func)
(lambda (a)
(if (empty? a)
empty
(cons (cons el (first a)) (func (rest a))))))
(lambda (a) empty)
lista) listb))
> (zip '(1 2 3 4) '(5 6 7 8))
(list (cons 1 5) (cons 2 6) (cons 3 7) (cons 4 8))
The foldr
results in a function which, when applied to a list, will return the zip of the list folded over with the list given to the function. The Haskell hides the inner lambda
because of lazy evaluation.
To break it down further:
Take zip on input: '(1 2 3) The foldr func gets called with
el->3, func->(lambda (a) empty)
This expands to:
(lambda (a) (cons (cons el (first a)) (func (rest a))))
(lambda (a) (cons (cons 3 (first a)) ((lambda (a) empty) (rest a))))
If we were to return this now, we'd have a function which takes a list of one element and returns the pair (3 element):
> (define f (lambda (a) (cons (cons 3 (first a)) ((lambda (a) empty) (rest a)))))
> (f (list 9))
(list (cons 3 9))
Continuing, foldr now calls func with
el->3, func->f ;using f for shorthand
(lambda (a) (cons (cons el (first a)) (func (rest a))))
(lambda (a) (cons (cons 2 (first a)) (f (rest a))))
This is a func which takes a list with two elements, now, and zips them with (list 2 3)
:
> (define g (lambda (a) (cons (cons 2 (first a)) (f (rest a)))))
> (g (list 9 1))
(list (cons 2 9) (cons 3 1))
What's happening?
(lambda (a) (cons (cons 2 (first a)) (f (rest a))))
a
, in this case, is (list 9 1)
(cons (cons 2 (first (list 9 1))) (f (rest (list 9 1))))
(cons (cons 2 9) (f (list 1)))
And, as you recall, f
zips its argument with 3
.
And this continues etc...
回答5:
The problem with all these solutions for zip
is that they only fold over one list or the other, which can be a problem if both of them are "good producers", in the parlance of list fusion. What you actually need is a solution that folds over both lists. Fortunately, there is a paper about exactly that, called "Coroutining Folds with Hyperfunctions".
You need an auxiliary type, a hyperfunction, which is basically a function that takes another hyperfunction as its argument.
newtype H a b = H { invoke :: H b a -> b }
The hyperfunctions used here basically act like a "stack" of ordinary functions.
push :: (a -> b) -> H a b -> H a b
push f q = H $ \k -> f $ invoke k q
You also need a way to put two hyperfunctions together, end to end.
(.#.) :: H b c -> H a b -> H a c
f .#. g = H $ \k -> invoke f $ g .#. k
This is related to push
by the law:
(push f x) .#. (push g y) = push (f . g) (x .#. y)
This turns out to be an associative operator, and this is the identity:
self :: H a a
self = H $ \k -> invoke k self
You also need something that disregards everything else on the "stack" and returns a specific value:
base :: b -> H a b
base b = H $ const b
And finally, you need a way to get a value out of a hyperfunction:
run :: H a a -> a
run q = invoke q self
run
strings all of the push
ed functions together, end to end, until it hits a base
or loops infinitely.
So now you can fold both lists into hyperfunctions, using functions that pass information from one to the other, and assemble the final value.
zip xs ys = run $ foldr (\x h -> push (first x) h) (base []) xs .#. foldr (\y h -> push (second y) h) (base Nothing) ys where
first _ Nothing = []
first x (Just (y, xys)) = (x, y):xys
second y xys = Just (y, xys)
The reason why folding over both lists matters is because of something GHC does called list fusion, which is talked about in the GHC.Base module, but probably should be much more well-known. Being a good list producer and using build
with foldr
can prevent lots of useless production and immediate consumption of list elements, and can expose further optimizations.
回答6:
I tried to understand this elegant solution myself, so I tried to derive the types and evaluation myself. So, we need to write a function:
zip xs ys = foldr step done xs ys
Here we need to derive step
and done
, whatever they are. Recall foldr's type, instantiated to lists:
foldr :: (a -> state -> state) -> state -> [a] -> state
However our foldr
invocation must be instantiated to something like below, because we must accept not one, but two list arguments:
foldr :: (a -> ? -> ?) -> ? -> [a] -> [b] -> [(a,b)]
Because ->
is right-associative, this is equivalent to:
foldr :: (a -> ? -> ?) -> ? -> [a] -> ([b] -> [(a,b)])
Our ([b] -> [(a,b)])
corresponds to state
type variable in the original foldr
type signature, therefore we must replace every occurrence of state
with it:
foldr :: (a -> ([b] -> [(a,b)]) -> ([b] -> [(a,b)]))
-> ([b] -> [(a,b)])
-> [a]
-> ([b] -> [(a,b)])
This means that arguments that we pass to foldr
must have the following types:
step :: a -> ([b] -> [(a,b)]) -> [b] -> [(a,b)]
done :: [b] -> [(a,b)]
xs :: [a]
ys :: [b]
Recall that foldr (+) 0 [1,2,3]
expands to:
1 + (2 + (3 + 0))
Therefore if xs = [1,2,3]
and ys = [4,5,6,7]
, our foldr
invocation would expand to:
1 `step` (2 `step` (3 `step` done)) $ [4,5,6,7]
This means that our 1 `step` (2 `step` (3 `step` done))
construct must create a recursive function that would go through [4,5,6,7]
and zip up the elements. (Keep in mind, that if one of the original lists is longer, the excess values are thrown away). IOW, our construct must have the type [b] -> [(a,b)]
.
3 `step` done
is our base case, where done
is an initial value, like 0
in foldr (+) 0 [1..3]
. We don't want to zip anything after 3, because 3 is the final value of xs
, so we must terminate the recursion. How do you terminate the recursion over list in the base case? You return empty list []
. But recall done
type signature:
done :: [b] -> [(a,b)]
Therefore we can't return just []
, we must return a function that would ignore whatever it receives. Therefore use const:
done = const [] -- this is equivalent to done = \_ -> []
Now let's start figuring out what step
should be. It combines a value of type a
with a function of type [b] -> [(a,b)]
and returns a function of type [b] -> [(a,b)]
.
In 3 `step` done
, we know that the result value that would later go to our zipped list must be (3,6)
(knowing from original xs
and ys
). Therefore 3 `step` done
must evaluate into:
\(y:ys) -> (3,y) : done ys
Remember, we must return a function, inside which we somehow zip up the elements, the above code is what makes sense and typechecks.
Now that we assumed how exactly step
should evaluate, let's continue the evaluation. Here's how all reduction steps in our foldr
evaluation look like:
3 `step` done -- becomes
(\(y:ys) -> (3,y) : done ys)
2 `step` (\(y:ys) -> (3,y) : done ys) -- becomes
(\(y:ys) -> (2,y) : (\(y:ys) -> (3,y) : done ys) ys)
1 `step` (\(y:ys) -> (2,y) : (\(y:ys) -> (3,y) : done ys) ys) -- becomes
(\(y:ys) -> (1,y) : (\(y:ys) -> (2,y) : (\(y:ys) -> (3,y) : done ys) ys) ys)
The evaluation gives rise to this implementation of step (note that we account for ys
running out of elements early by returning an empty list):
step x f = \[] -> []
step x f = \(y:ys) -> (x,y) : f ys
Thus, the full function zip
is implemented as follows:
zip :: [a] -> [b] -> [(a,b)]
zip xs ys = foldr step done xs ys
where done = const []
step x f [] = []
step x f (y:ys) = (x,y) : f ys
P.S.: If you are inspired by elegance of folds, read Writing foldl using foldr and then Graham Hutton's A tutorial on the universality and expressiveness of fold.
回答7:
A simple approach:
lZip, rZip :: Foldable t => [b] -> t a -> [(a, b)]
-- implement zip using fold?
lZip xs ys = reverse.fst $ foldl f ([],xs) ys
where f (zs, (y:ys)) x = ((x,y):zs, ys)
-- Or;
rZip xs ys = fst $ foldr f ([],reverse xs) ys
where f x (zs, (y:ys)) = ((x,y):zs, ys)
来源:https://stackoverflow.com/questions/235148/implement-zip-using-foldr