What's the benefit of conduit's leftovers?

野性不改 2021-02-05 14:20

I'm trying to understand the differences between conduit and pipes. Unlike pipes, conduit has the concept of leftovers. What are leftovers u

2 Answers
  • 2021-02-05 15:10

    I'll answer for pipes. The short answer to your question is that the upcoming pipes-parse library will have support for leftovers as part of a more general parsing framework. I find that almost every case where people want leftovers they actually want a parser, which is why I frame the leftovers problem as a subset of parsing. You can find the current draft of the library here.

    However, if you want to understand how pipes-parse gets it to work, the simplest possible way to implement leftovers is to just use StateP to store the pushback buffer. This requires only defining the following two functions:

    import Control.Proxy
    import Control.Proxy.Trans.State
    
    draw :: (Monad m, Proxy p) => StateP [a] p () a b' b m a
    draw = do
        s <- get
        case s of
            []   -> request ()
            a:as -> do
                put as
                return a
    
    unDraw :: (Monad m, Proxy p) => a -> StateP [a] p () a b' b m ()
    unDraw a = do
        as <- get
        put (a:as)
    

    draw first consults the pushback buffer to see if there are any stored elements, popping one element off the stack if available. If the buffer is empty, it instead requests a new element from upstream. Of course, there's no point having a buffer if we can't push anything back, so we also define unDraw to push an element onto the stack to save for later.
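
    As a rough, library-free illustration of the same pushback idea (a sketch only, not the pipes-parse implementation), the buffer and the upstream can be modelled as a pair of lists. The `Source` type and the pure `draw`/`unDraw` below are hypothetical stand-ins for the `StateP` versions above:

    ```haskell
    -- Hypothetical model: (pushback buffer, remaining upstream elements).
    type Source a = ([a], [a])

    -- Take the next element, preferring the pushback buffer over upstream.
    draw :: Source a -> Maybe (a, Source a)
    draw (b : bs, up)     = Just (b, (bs, up))
    draw ([],     u : up) = Just (u, ([], up))
    draw ([],     [])     = Nothing

    -- Push an element back so that the next draw returns it first.
    unDraw :: a -> Source a -> Source a
    unDraw a (buf, up) = (a : buf, up)

    main :: IO ()
    main = do
        let s0 = ([], [1, 2, 3 :: Int])
        case draw s0 of
            Nothing      -> putStrLn "empty"
            Just (x, s1) -> do
                print x                               -- 1
                print (fmap fst (draw (unDraw x s1))) -- Just 1 (pushed back)
                print (fmap fst (draw s1))            -- Just 2 (no pushback)
    ```

    The pushed-back element is seen again before anything new is pulled from upstream, which is exactly the behaviour the stack-based buffer gives you.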

    Edit: Oops, I forgot to include an example of when leftovers are useful. Like Michael says, takeWhile and dropWhile are typical cases that need leftovers. Here's the drawWhile function (analogous to what Michael calls takeWhile):

    drawWhile :: (Monad m, Proxy p) => (a -> Bool) -> StateP [a] p () a b' b m [a]
    drawWhile pred = go
      where
        go = do
            a <- draw
            if pred a
            then do
                as <- go
                return (a:as)
            else do
                unDraw a
                return []
    

    Now imagine that your producer was:

    producer () = do
        respond 1
        respond 3
        respond 4
        respond 6
    

    ... and you hooked that up to a consumer that used:

    consumer () = do
        odds  <- drawWhile odd
        evens <- drawWhile even
        return (odds, evens)
    

    If the first drawWhile odd didn't push back the final element it drew, you would drop the 4, and it would never reach the second call, drawWhile even.
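
    To make the dropped element concrete, here is a small sketch that models the stream as a plain list (an assumption for illustration; it does not use pipes). The names `drawWhilePush` and `drawWhileDrop` are hypothetical: the first leaves the failing element at the front of the remaining input, the second consumes it.

    ```haskell
    -- With pushback: the first element that fails the predicate stays in the rest.
    drawWhilePush :: (a -> Bool) -> [a] -> ([a], [a])
    drawWhilePush = span

    -- Without pushback: the first failing element is consumed and lost.
    drawWhileDrop :: (a -> Bool) -> [a] -> ([a], [a])
    drawWhileDrop p xs = case span p xs of
        (taken, _ : rest) -> (taken, rest)
        (taken, [])       -> (taken, [])

    main :: IO ()
    main = do
        let input = [1, 3, 4, 6 :: Int]

        -- Pushback keeps the 4 for the second pass.
        let (odds, rest)  = drawWhilePush odd input
            (evens, _)    = drawWhilePush even rest
        print (odds, evens)    -- ([1,3],[4,6])

        -- No pushback: the 4 is drawn by the first pass and then discarded.
        let (odds', rest') = drawWhileDrop odd input
            (evens', _)    = drawWhileDrop even rest'
        print (odds', evens')  -- ([1,3],[6])
    ```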

  • 2021-02-05 15:24

    Gabriel's point that leftovers are always part of parsing is interesting. I'm not sure I would agree, but that may just depend on the definition of parsing.

    There is a large category of use cases that require leftovers. Parsing is certainly one: any time a parse requires some kind of lookahead, you'll need leftovers. One example of this is in the markdown package's getIndented function, which isolates all of the upcoming lines with a certain indentation level, leaving the rest of the lines to be processed later.

    But a much more mundane set of examples lives in conduit itself. Any time you're dealing with packed data (like ByteString or Text), you'll need to read a chunk, analyze it somehow, use leftover to push back the extra, and then do something with the original content. Perhaps the simplest example of this is dropWhile.
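
    A minimal sketch of that chunked pattern, assuming the current conduit API (`ConduitT`, `.|`, and the `Conduit` convenience module), which postdates the 1.0 interface discussed here. `dropWhileBS` is a hypothetical helper written for illustration, not conduit's own implementation:

    ```haskell
    import Conduit (ConduitT, await, leftover, runConduitPure, sinkList, yieldMany, (.|))
    import qualified Data.ByteString.Char8 as B

    -- Skip leading characters that satisfy p, one chunk at a time. When a chunk
    -- contains a non-matching character, the unconsumed tail of that chunk is
    -- pushed back with leftover so the next consumer sees it.
    dropWhileBS :: Monad m => (Char -> Bool) -> ConduitT B.ByteString o m ()
    dropWhileBS p = loop
      where
        loop = do
            mchunk <- await
            case mchunk of
                Nothing    -> return ()          -- upstream is exhausted
                Just chunk ->
                    let rest = B.dropWhile p chunk
                    in if B.null rest
                           then loop             -- whole chunk matched, keep going
                           else leftover rest    -- push back the unconsumed tail

    main :: IO ()
    main = print $ runConduitPure $
        yieldMany (map B.pack ["  ", " he", "llo"])
            .| (dropWhileBS (== ' ') >> sinkList)
    -- prints ["he","llo"]
    ```

    Without `leftover`, the `"he"` part of the second chunk would have to be either discarded or returned through some side channel, because the chunk boundary does not line up with the point where the predicate stops matching.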

    In fact, I consider leftover to be such a core, basic feature of a streaming library that the new 1.0 interface for conduit doesn't even give users the option of disabling leftovers. I know of very few real-world use cases that don't need it in one way or another.
