I'm trying to understand the differences between conduit and pipes. Unlike pipes, conduit has the concept of leftovers. What are leftovers u
I'll answer for pipes. The short answer to your question is that the upcoming pipes-parse library will have support for leftovers as part of a more general parsing framework. I find that almost every case where people want leftovers they actually want a parser, which is why I frame the leftovers problem as a subset of parsing. You can find the current draft of the library here.
However, if you want to understand how pipes-parse gets it to work, the simplest possible way to implement leftovers is to just use StateP to store the pushback buffer. This requires only defining the following two functions:
import Control.Proxy
import Control.Proxy.Trans.State

draw :: (Monad m, Proxy p) => StateP [a] p () a b' b m a
draw = do
    s <- get
    case s of
        []   -> request ()
        a:as -> do
            put as
            return a

unDraw :: (Monad m, Proxy p) => a -> StateP [a] p () a b' b m ()
unDraw a = do
    as <- get
    put (a:as)
draw first consults the pushback buffer to see if there are any stored elements, popping one element off the stack if available. If the buffer is empty, it instead requests a new element from upstream. Of course, there's no point having a buffer if we can't push anything back, so we also define unDraw to push an element onto the stack to save for later.
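To see the same idea without any library machinery, here is a minimal pure sketch that uses only base. It is not the pipes or pipes-parse API: the Source type and the names drawS and unDrawS are made up for illustration, and upstream is modelled as a plain list rather than a proxy you send requests to.

```haskell
-- A pure, library-free model of the pushback idea (an assumption for
-- illustration only; this is not the pipes or pipes-parse API).

-- | Hypothetical stream state: a pushback buffer plus the remaining upstream.
data Source a = Source
    { pushback :: [a]  -- elements pushed back, most recent first
    , upstream :: [a]  -- elements that have not been requested yet
    }

-- | Take the next element, preferring the pushback buffer over upstream.
drawS :: Source a -> Maybe (a, Source a)
drawS (Source (b:bs) us) = Just (b, Source bs us)
drawS (Source [] (u:us)) = Just (u, Source [] us)
drawS (Source [] [])     = Nothing

-- | Push an element back so that the next drawS returns it again.
unDrawS :: a -> Source a -> Source a
unDrawS a (Source bs us) = Source (a:bs) us
```

Drawing from Source [] [1, 2, 3], pushing the result back with unDrawS, and drawing again yields 1 both times, which is the behaviour the StateP version gets from its stored list.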
Edit: Oops, I forgot to include a useful example of when leftovers are useful. Like Michael says, takeWhile and dropWhile are useful cases of leftovers. Here's the drawWhile function (analogous to what Michael calls takeWhile):
drawWhile :: (Monad m, Proxy p) => (a -> Bool) -> StateP [a] p () a b' b m [a]
drawWhile pred = go
  where
    go = do
        a <- draw
        if pred a
            then do
                as <- go
                return (a:as)
            else do
                unDraw a
                return []
Now imagine that your producer was:
producer () = do
    respond 1
    respond 3
    respond 4
    respond 6
... and you hooked that up to a consumer that used:
consumer () = do
    odds  <- drawWhile odd   -- collects the leading odd numbers
    evens <- drawWhile even  -- continues from the element pushed back above
If the first drawWhile odd didn't push back the final element it drew, then you would drop the 4, which wouldn't get correctly passed on to the second drawWhile even statement.
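The difference is easy to check by hand with a pure list model. The sketch below is an assumption-laden illustration using only base, not pipes code: drawWhileKeep, drawWhileDrop and runBoth are hypothetical names, and the input list stands in for the producer above.

```haskell
-- Pure list model of the example above (illustration only, not pipes code).

-- | With pushback: the first element that fails the predicate stays in the stream.
drawWhileKeep :: (a -> Bool) -> [a] -> ([a], [a])
drawWhileKeep p (x:xs)
    | p x       = let (ys, rest) = drawWhileKeep p xs in (x:ys, rest)
    | otherwise = ([], x:xs)  -- push x back for the next reader
drawWhileKeep _ [] = ([], [])

-- | Without pushback: the failing element has already been consumed and is lost.
drawWhileDrop :: (a -> Bool) -> [a] -> ([a], [a])
drawWhileDrop p (x:xs)
    | p x       = let (ys, rest) = drawWhileDrop p xs in (x:ys, rest)
    | otherwise = ([], xs)    -- x is silently dropped
drawWhileDrop _ [] = ([], [])

-- | Read the leading odd numbers, then the even numbers that follow.
runBoth :: ((Int -> Bool) -> [Int] -> ([Int], [Int])) -> [Int] -> ([Int], [Int])
runBoth dw input =
    let (odds, rest) = dw odd input
        (evens, _)   = dw even rest
    in  (odds, evens)

-- runBoth drawWhileKeep [1, 3, 4, 6] == ([1, 3], [4, 6])
-- runBoth drawWhileDrop [1, 3, 4, 6] == ([1, 3], [6])
```

In the version without pushback the 4 never reaches the second reader, which is exactly the dropped element described above.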
Gabriel's point that leftovers are always part of parsing is interesting. I'm not sure I would agree, but that may just depend on the definition of parsing.
There is a large category of use cases which require leftovers. Parsing is certainly one: any time a parse requires some kind of lookahead, you'll need leftovers. One example of this is in the markdown package's getIndented function, which isolates all of the upcoming lines with a certain indentation level, leaving the rest of the lines to be processed later.
But a much more mundane set of examples lives in conduit itself. Any time you're dealing with packed data (like ByteString or Text), you'll need to read a chunk, analyze it somehow, use leftover to push back the extra, and then do something with the original content. Perhaps the simplest example of this is dropWhile.
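To make the packed-data case concrete, here is a rough pure model of dropWhile over a chunked stream. It is only a sketch under stated assumptions: it does not use conduit's actual await and leftover, the stream is modelled as a list of strict ByteString chunks, and dropWhileChunks is a hypothetical name. Putting the unconsumed tail back at the front of the list plays the role of leftover.

```haskell
import qualified Data.ByteString.Char8 as B

-- | Drop leading characters that satisfy the predicate from a chunked stream.
-- Illustration only: a list of chunks stands in for the stream, and consing
-- the unconsumed tail back onto it stands in for conduit's leftover.
dropWhileChunks :: (Char -> Bool) -> [B.ByteString] -> [B.ByteString]
dropWhileChunks _ [] = []
dropWhileChunks p (c:cs)
    | B.null rest = dropWhileChunks p cs  -- whole chunk dropped, keep reading
    | otherwise   = rest : cs             -- "leftover": push the tail back
  where
    rest = B.dropWhile p c
```

For example, dropping spaces from the chunks "  ", " ab", "c " leaves "ab", "c ". The predicate stops matching in the middle of the second chunk, so there is no way to stop cleanly on a chunk boundary without handing the remainder back.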
In fact, I consider leftover to be such a core, basic feature of a streaming library that the new 1.0 interface for conduit doesn't even expose the option to users of disabling leftovers. I know of very few real-world use cases that don't need it in one way or another.