Lazy decoding of a list with Data.Binary

臣服心动 2020-12-20 17:51

I am lazily encoding lists using this code (taken from this SO question):

import Data.Binary

newtype Stream a = Stream { unstream :: [a] }

instance Binary          


        
1 answer
  • 2020-12-20 18:16

    It is not lazy because the Get monad is a strict state monad in binary-0.5.0.2 to 0.5.1.1. It was a lazy state monad before that, and in binary-0.6.* it became a continuation monad; I haven't analysed the strictness implications of that change. The relevant definitions:

    -- | The parse state
    data S = S {-# UNPACK #-} !B.ByteString  -- current chunk
               L.ByteString                  -- the rest of the input
               {-# UNPACK #-} !Int64         -- bytes read
    
    -- | The Get monad is just a State monad carrying around the input ByteString
    -- We treat it as a strict state monad. 
    newtype Get a = Get { unGet :: S -> (# a, S #) }
    
    -- Definition directly from Control.Monad.State.Strict
    instance Monad Get where
        return a  = Get $ \s -> (# a, s #)
        {-# INLINE return #-}
    
        m >>= k   = Get $ \s -> case unGet m s of
                                 (# a, s' #) -> unGet (k a) s'
        {-# INLINE (>>=) #-}
    

    Thus the final recursive step

    get >>= \x ->
    get >>= \(Stream xs) ->
    return (Stream (x:xs))
    

    forces the entire Stream to be read before it can be returned.
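
    To see the effect in isolation, here is a toy model (not the binary source): a state monad written as plain functions, once with a lazy bind and once with a strict bind. The names St, unit, bindLazy, bindStrict, tick, countLazy and countStrict are made up for this illustration.

```haskell
-- Toy model only: a state "monad" as plain functions, not the binary source.
type St s a = s -> (a, s)

unit :: a -> St s a
unit a s = (a, s)

-- Lazy bind: the let pattern is irrefutable, so running m is deferred
-- until either a or s' is actually demanded.
bindLazy :: St s a -> (a -> St s b) -> St s b
bindLazy m k s = let (a, s') = m s in k a s'

-- Strict bind: the case forces m to produce its result pair
-- before the continuation k is entered.
bindStrict :: St s a -> (a -> St s b) -> St s b
bindStrict m k s = case m s of
                     (a, s') -> k a s'

-- Stands in for a 'get' that consumes one item of input.
tick :: St Int Int
tick n = (n, n + 1)

-- Same shape as the recursive Stream decoder: one element, then the rest.
countLazy :: St Int [Int]
countLazy =
    tick `bindLazy` \x ->
    countLazy `bindLazy` \xs ->
    unit (x : xs)

-- Identical except for the bind. Running it never returns, because the
-- recursive call must finish before the first element can be handed back.
countStrict :: St Int [Int]
countStrict =
    tick `bindStrict` \x ->
    countStrict `bindStrict` \xs ->
    unit (x : xs)

-- The lazy version yields a prefix of the infinite result.
firstThree :: [Int]
firstThree = take 3 (fst (countLazy 0))
```

    With the lazy bind, firstThree evaluates to [0,1,2]. Replacing countLazy with countStrict in firstThree makes the evaluation diverge, which is the same behaviour as the recursive get above.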

    I don't think it's possible to lazily decode a Stream in the Get monad (so a fortiori not with the Binary instance). But you can write a lazy decoding function using runGetState:

    -- | Run the Get monad applies a 'get'-based parser on the input
    -- ByteString. Additional to the result of get it returns the number of
    -- consumed bytes and the rest of the input.
    runGetState :: Get a -> L.ByteString -> Int64 -> (a, L.ByteString, Int64)
    runGetState m str off =
        case unGet m (mkState str off) of
          (# a, ~(S s ss newOff) #) -> (a, s `join` ss, newOff)
    

    First write a Get parser that returns a Maybe a,

    getMaybe :: Binary a => Get (Maybe a)
    getMaybe = do
        t <- getWord8
        case t of
          0 -> return Nothing
          _ -> fmap Just get
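
    As a small check of what getMaybe expects on the wire, the sketch below runs it on two hand-built inputs with runGet. The element type Word8 and the names endMarker and oneElement are only chosen for the illustration.

```haskell
import Data.Binary (Binary, Get, get, getWord8)
import Data.Binary.Get (runGet)
import qualified Data.ByteString.Lazy as L
import Data.Word (Word8)

-- Same parser as above: a tag byte, then an element unless the tag is 0.
getMaybe :: Binary a => Get (Maybe a)
getMaybe = do
    t <- getWord8
    case t of
      0 -> return Nothing
      _ -> fmap Just get

-- Tag 0: end of the list, no element follows.
endMarker :: Maybe Word8
endMarker = runGet getMaybe (L.pack [0])

-- Nonzero tag followed by one element byte.
oneElement :: Maybe Word8
oneElement = runGet getMaybe (L.pack [1, 42])
```

    Here endMarker is Nothing and oneElement is Just 42.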
    

    then use that to make a function of type (ByteString,Int64) -> Maybe (a,(ByteString,Int64)):

    -- ByteString here is the lazy one (Data.ByteString.Lazy); Int64 comes from Data.Int
    step :: Binary a => (ByteString,Int64) -> Maybe (a,(ByteString,Int64))
    step (xs,offset) = case runGetState getMaybe xs offset of
                         (Just v, ys, newOffset) -> Just (v,(ys,newOffset))
                         _                       -> Nothing
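
    Note that runGetState is marked deprecated in later binary releases. A possible variant of step, assuming a binary version that exports runGetOrFail from Data.Binary.Get, is sketched below. The name stepOrFail is hypothetical, the offset is not threaded through, and a parse failure ends the list instead of raising an error.

```haskell
import Data.Binary (Binary, Get, get, getWord8)
import Data.Binary.Get (runGetOrFail)
import qualified Data.ByteString.Lazy as L
import Data.List (unfoldr)
import Data.Word (Word8)

-- Same parser as above: a tag byte, then an element unless the tag is 0.
getMaybe :: Binary a => Get (Maybe a)
getMaybe = do
    t <- getWord8
    case t of
      0 -> return Nothing
      _ -> fmap Just get

-- Hypothetical variant of step: the state is just the remaining input.
-- The end marker, truncated input and malformed input all end the list.
stepOrFail :: Binary a => L.ByteString -> Maybe (a, L.ByteString)
stepOrFail xs = case runGetOrFail getMaybe xs of
                  Right (ys, _, Just v) -> Just (v, ys)
                  _                     -> Nothing

-- Two tagged elements followed by the end marker.
decoded :: [Word8]
decoded = unfoldr stepOrFail (L.pack [1, 10, 1, 20, 0])

-- Input cut off after a tag byte: decoding stops at the last complete element.
truncated :: [Word8]
truncated = unfoldr stepOrFail (L.pack [1, 10, 1])
```

    Here decoded is [10,20] and truncated is [10]. Whether silently stopping on bad input is acceptable depends on the application; returning the error from the Left case would be the stricter choice.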
    

    and then you can use Data.List.unfoldr to lazily decode a list,

    lazyDecodeList :: Binary a => ByteString -> [a]
    lazyDecodeList xs = unfoldr step (xs,0)
    

    and wrap that in a Stream

    lazyDecodeStream :: Binary a => ByteString -> Stream a
    lazyDecodeStream = Stream . lazyDecodeList
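
    Putting the pieces together, a round trip might look like the sketch below. The encoder encodeTagged is hypothetical: it assumes the wire format that getMaybe reads, a tag byte 1 before each element and a 0 at the end. It also assumes a binary version that still exports runGetState.

```haskell
import Data.Binary (Binary, Get, get, put, getWord8, putWord8)
import Data.Binary.Get (runGetState)  -- deprecated in later binary releases
import Data.Binary.Put (runPut)
import Data.ByteString.Lazy (ByteString)
import Data.Int (Int64)
import Data.List (unfoldr)

-- Hypothetical encoder (an assumption, not from the answer):
-- tag byte 1 before each element, tag byte 0 to end the list.
encodeTagged :: Binary a => [a] -> ByteString
encodeTagged xs = runPut (mapM_ (\x -> putWord8 1 >> put x) xs >> putWord8 0)

getMaybe :: Binary a => Get (Maybe a)
getMaybe = do
    t <- getWord8
    case t of
      0 -> return Nothing
      _ -> fmap Just get

-- Decode one element and return it with the remaining input and offset.
step :: Binary a => (ByteString, Int64) -> Maybe (a, (ByteString, Int64))
step (xs, offset) = case runGetState getMaybe xs offset of
                      (Just v, ys, newOffset) -> Just (v, (ys, newOffset))
                      _                       -> Nothing

-- unfoldr only runs step when the next list cell is demanded.
lazyDecodeList :: Binary a => ByteString -> [a]
lazyDecodeList xs = unfoldr step (xs, 0)

roundTrip :: [Int]
roundTrip = lazyDecodeList (encodeTagged [1, 2, 3 :: Int])
```

    Here roundTrip is [1,2,3]. Because unfoldr produces each cell on demand, taking a prefix of the decoded list should only run the parser for the elements actually consumed.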
    