How to implement delete with foldr in Haskell

無奈伤痛 2021-02-19 04:13

I've been studying folds for the past few days. I can implement simple functions with them, like length, concat and filter. What I'm stu

3 answers
  • 2021-02-19 04:52

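    Here is a simple delete implemented with foldr. Note that it removes every element equal to a, whereas Data.List.delete removes only the first match:

    delete :: (Eq a) => a -> [a] -> [a]
    delete a xs = foldr (\x acc -> if x == a then acc else x : acc) [] xs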
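
    A quick comparison sketch against Data.List.delete, which removes only the first match, while the fold above removes every match. The name deleteAll is made up for this illustration; it is not a library function.

```haskell
import Data.List (delete)

-- Hypothetical name: the plain foldr version from this answer. It drops
-- every element equal to a, so it behaves like filter (/= a).
deleteAll :: Eq a => a -> [a] -> [a]
deleteAll a = foldr (\x acc -> if x == a then acc else x : acc) []

-- Both functions applied to a list with two matching elements.
removedEverywhere, removedFirstOnly :: [Int]
removedEverywhere = deleteAll 2 [1, 2, 3, 2]  -- [1,3]
removedFirstOnly  = delete 2 [1, 2, 3, 2]     -- [1,3,2]
```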
    
  • 2021-02-19 05:08

    delete doesn't operate on the entire list evenly. The structure of the computation isn't just considering the whole list one element at a time. It differs after it hits the element it's looking for. This tells you it can't be implemented as just a foldr. There will have to be some sort of post-processing involved.

    When that happens, the general pattern is that you build a pair of values and just take one of them at completion of the foldr. That's probably what you did when you imitated Hutton's dropWhile, though I'm not sure since you didn't include code. Something like this?

    delete :: Eq a => a -> [a] -> [a]
    delete a = snd . foldr (\x (xs1, xs2) -> if x == a then (x:xs1, xs1) else (x:xs1, x:xs2)) ([], [])
    

    The main idea is that xs1 is always going to be the full tail of the list, and xs2 is the result of the delete over the tail of the list. Since you only want to remove the first element that matches, you don't want to use the result of delete over the tail when you do match the value you're searching for, you just want to return the rest of the list unchanged - which fortunately is what's always going to be in xs1.
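
    To see the two components side by side, here is a small sketch that runs the same fold but skips the final snd. The helper name deletePair is hypothetical and exists only for this illustration.

```haskell
-- Hypothetical helper: the same fold as in delete, without the final snd,
-- so both components of the pair are visible.
deletePair :: Eq a => a -> [a] -> ([a], [a])
deletePair a = foldr step ([], [])
  where
    step x (xs1, xs2)
      | x == a    = (x : xs1, xs1)      -- drop x, return the untouched tail
      | otherwise = (x : xs1, x : xs2)  -- keep x, keep the tail's delete result

-- First component: the input rebuilt unchanged.
-- Second component: the input with the first 2 removed.
example :: ([Int], [Int])
example = deletePair 2 [1, 2, 3, 2]  -- ([1,2,3,2],[1,3,2])
```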

    And yeah, that doesn't work on infinite lists - but only for one very specific reason. The lambda is too strict. foldr only works on infinite lists when the function it is provided doesn't always force evaluation of its second argument, and that lambda does always force evaluation of its second argument in the pattern match on the pair. Switching to an irrefutable pattern match fixes that, by allowing the lambda to produce a constructor before ever examining its second argument.

    delete :: Eq a => a -> [a] -> [a]
    delete a = snd . foldr (\x ~(xs1, xs2) -> if x == a then (x:xs1, xs1) else (x:xs1, x:xs2)) ([], [])
    

    That's not the only way to get that result. Using a let-binding or fst and snd as accessors on the tuple would also do the job. But it is the change with the smallest diff.
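
    As a rough sketch of the accessor alternative, the version below never pattern-matches on the pair and uses fst and snd instead. The name deleteLazy is made up here to keep it apart from the definitions above.

```haskell
-- Hypothetical name: a variant that never pattern-matches on the pair.
-- fst and snd are applied lazily, inside the lists being built.
deleteLazy :: Eq a => a -> [a] -> [a]
deleteLazy a = snd . foldr step ([], [])
  where
    step x rest
      | x == a    = (x : fst rest, fst rest)
      | otherwise = (x : fst rest, x : snd rest)

-- Works on an infinite list: the guard only compares x with a, and the
-- pair is returned before rest is inspected.
firstFive :: [Int]
firstFive = take 5 (deleteLazy 3 [1 ..])  -- [1,2,4,5,6]
```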

    The most important takeaway here is to be very careful with handling the second argument to the reducing function you pass to foldr. You want to defer examining the second argument whenever possible, so that the foldr can stream lazily in as many cases as possible.

    If you look at that lambda, you see that the branch taken is chosen before doing anything with the second argument to the reducing function. Furthermore, you'll see that most of the time, the reducing function produces a list constructor in both halves of the result tuple before it ever needs to evaluate the second argument. Since those list constructors are what make it out of delete, they are what matter for streaming - so long as you don't let the pair get in the way. And making the pattern-match on the pair irrefutable is what keeps it out of the way.

    As a bonus example of the streaming properties of foldr, consider my favorite example:

    dropWhileEnd :: (a -> Bool) -> [a] -> [a]
    dropWhileEnd p = foldr (\x xs -> if p x && null xs then [] else x:xs) []
    

    It streams - as much as it can. If you figure out exactly when and why it does and doesn't stream, you'll understand pretty much every detail of the streaming structure of foldr.
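
    A few probes that may help with that exercise, offered as a sketch rather than a full analysis. The definition is repeated so the block stands alone; the example names are hypothetical.

```haskell
-- Same definition as above, repeated so the block is self-contained.
dropWhileEnd :: (a -> Bool) -> [a] -> [a]
dropWhileEnd p = foldr (\x xs -> if p x && null xs then [] else x : xs) []

-- Finite input: the trailing even numbers are dropped.
trimmed :: [Int]
trimmed = dropWhileEnd even [1, 2, 3, 4, 6]  -- [1,2,3]

-- Infinite input where the predicate never holds: p x is False, (&&)
-- short-circuits, and x : xs is produced without looking at xs.
streamsOdd :: [Int]
streamsOdd = take 3 (dropWhileEnd even [1, 3 ..])  -- [1,3,5]

-- Infinite input where the predicate holds for every other element:
-- null xs only needs the next cons cell, which the odd element supplies.
streamsMixed :: [Int]
streamsMixed = take 4 (dropWhileEnd even (cycle [2, 1]))  -- [2,1,2,1]
```

    By contrast, dropWhileEnd even (repeat 2) never produces anything: every element satisfies the predicate, so each null xs check demands the next step, without end.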

  • 2021-02-19 05:10

    delete is a modal search. It has two different modes of operation - whether it's already found the result or not. You can use foldr to construct a function that passes the state down the line as each element is checked. So in the case of delete, the state can be a simple Bool. It's not exactly the best type, but it will do.

    Once you have identified the state type, you can start working on the foldr construction. I'm going to walk through figuring it out the way I did. I'll be enabling ScopedTypeVariables just so I can annotate the type of subexpressions better. Once you know the state type, you know you want foldr to generate a function taking a value of that type, and returning a value of the desired final type. That's enough to start sketching things.

    {-# LANGUAGE ScopedTypeVariables #-}
    
    delete :: forall a. Eq a => a -> [a] -> [a]
    delete a xs = foldr f undefined xs undefined
      where
        f :: a -> (Bool -> [a]) -> (Bool -> [a])
        f x g = undefined
    

    It's a start. The exact meaning of g is a little bit tricky here. It's actually the function for processing the rest of the list. It's accurate to look at it as a continuation, in fact. It absolutely represents performing the rest of the folding, with whatever state you choose to pass along. Given that, it's time to figure out what to put in some of those undefined places.

    {-# LANGUAGE ScopedTypeVariables #-}
    
    delete :: forall a. Eq a => a -> [a] -> [a]
    delete a xs = foldr f undefined xs undefined
      where
        f :: a -> (Bool -> [a]) -> (Bool -> [a])
        f x g found | x == a && not found = g True
                    | otherwise           = x : g found
    

    That seems relatively straightforward. If the current element is the one being searched for, and it hasn't yet been found, don't output it, and continue with the state set to True, indicating it's been found. Otherwise, output the current value and continue with the current state. This just leaves the rest of the arguments to foldr. The last one is the initial state. The other one is the state function for an empty list. Ok, those aren't too bad either.

    {-# LANGUAGE ScopedTypeVariables #-}
    
    delete :: forall a. Eq a => a -> [a] -> [a]
    delete a xs = foldr f (const []) xs False
      where
        f :: a -> (Bool -> [a]) -> (Bool -> [a])
        f x g found | x == a && not found = g True
                    | otherwise           = x : g found
    

    No matter what the state is, produce an empty list when an empty list is encountered. And the initial state is that the element being searched for has not yet been found.
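
    Since Bool was described as not exactly the best state type, here is one possible sketch with a dedicated type. The names Mode, Searching, Done and deleteFirst are all invented for this example.

```haskell
-- Hypothetical state type: a more descriptive stand-in for the Bool.
data Mode = Searching | Done

-- Same fold as above, with the Bool state replaced by Mode.
deleteFirst :: Eq a => a -> [a] -> [a]
deleteFirst a xs = foldr step (const []) xs Searching
  where
    -- Still searching and x matches: drop x and switch modes.
    step x rest Searching
      | x == a = rest Done
    -- Every other case: keep x and pass the mode along unchanged.
    step x rest mode = x : rest mode

-- Only the first match is removed, and infinite lists still stream.
finiteExample, infiniteExample :: [Int]
finiteExample   = deleteFirst 2 [1, 2, 3, 2]       -- [1,3,2]
infiniteExample = take 5 (deleteFirst 3 [1 ..])    -- [1,2,4,5,6]
```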

    This technique is also applicable in other cases. For instance, foldl can be written as a foldr this way. If you look at foldl as a function that repeatedly transforms an initial accumulator, you can guess that's the function being produced - how to transform the initial value.

    {-# LANGUAGE ScopedTypeVariables #-}
    
    foldl :: forall a b. (a -> b -> a) -> a -> [b] -> a
    foldl f z xs = foldr g id xs z
      where
        g :: b -> (a -> a) -> (a -> a)
        g x cont acc = undefined
    

    The base cases aren't too tricky to find when the problem is defined as manipulating the initial accumulator, named z there. The empty list is the identity transformation, id, and the value passed to the created function is z.

    The implementation of g is trickier. It can't just be done blindly on types, because there are two different implementations that use all the expected values and type-check. This is a case where types aren't enough, and you need to consider the meanings of the functions available.

    Let's start with an inventory of the values that seem like they should be used, and their types. The things that seem like they must need to be used in the body of g are f :: a -> b -> a, x :: b, cont :: (a -> a), and acc :: a. f will obviously take x as its second argument, but there's a question of the appropriate place to use cont. To figure out where it goes, remember that it represents the transformation function returned by processing the rest of the list, and that foldl processes the current element and then passes the result of that processing to the rest of the list.

    {-# LANGUAGE ScopedTypeVariables #-}
    
    foldl :: forall a b. (a -> b -> a) -> a -> [b] -> a
    foldl f z xs = foldr g id xs z
      where
        g :: b -> (a -> a) -> (a -> a)
        g x cont acc = cont $ f acc x
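
    To illustrate the two type-correct candidates mentioned earlier, the sketch below writes both out. The names foldlViaFoldr and otherCandidate are hypothetical, chosen to avoid clashing with the Prelude's foldl.

```haskell
-- Apply f to the accumulator first, then hand the result to the rest of
-- the list: this is foldl.
foldlViaFoldr :: (a -> b -> a) -> a -> [b] -> a
foldlViaFoldr f z xs = foldr (\x cont acc -> cont (f acc x)) id xs z

-- Let the rest of the list transform the accumulator first, then apply f.
-- This type-checks too, but it combines the elements from the right.
otherCandidate :: (a -> b -> a) -> a -> [b] -> a
otherCandidate f z xs = foldr (\x cont acc -> f (cont acc) x) id xs z

-- Consing onto the accumulator makes the processing order visible.
leftToRight, rightToLeft :: [Int]
leftToRight = foldlViaFoldr (flip (:)) [] [1, 2, 3]   -- [3,2,1]
rightToLeft = otherCandidate (flip (:)) [] [1, 2, 3]  -- [1,2,3]
```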
    

    This also suggests that foldl' can be written this way with only one tiny change:

    {-# LANGUAGE ScopedTypeVariables #-}
    
    foldl' :: forall a b. (a -> b -> a) -> a -> [b] -> a
    foldl' f z xs = foldr g id xs z
      where
        g :: b -> (a -> a) -> (a -> a)
        g x cont acc = cont $! f acc x
    

    The difference is that ($!) is used to suggest evaluation of f acc x before it's passed to cont. (I say "suggest" because there are some edge cases where ($!) doesn't force evaluation even as far as WHNF.)
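
    One way to observe the difference is a step function that ignores the accumulator, fed a list containing undefined. This is only a sketch: lazyFold, strictFold, keepLast and strictThrows are made-up names, and the exception check relies on Control.Exception from base.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical names, to avoid clashing with the library folds.
lazyFold, strictFold :: (a -> b -> a) -> a -> [b] -> a
lazyFold   f z xs = foldr (\x cont acc -> cont $  f acc x) id xs z
strictFold f z xs = foldr (\x cont acc -> cont $! f acc x) id xs z

-- A step function that ignores the accumulator and returns the element.
keepLast :: Int -> Int -> Int
keepLast _ x = x

-- The lazy fold never evaluates the intermediate undefined accumulator.
lazyResult :: Int
lazyResult = lazyFold keepLast 0 [undefined, 5]  -- 5

-- The strict fold evaluates each intermediate accumulator, so the same
-- input raises an exception; this reports whether one was caught.
strictThrows :: IO Bool
strictThrows = do
  r <- try (evaluate (strictFold keepLast 0 [undefined, 5])) :: IO (Either SomeException Int)
  pure (either (const True) (const False) r)
```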
