(Generically) Build Parsers from custom data types?

梦如初夏 2021-01-19 10:54

I'm working on a network streaming client that needs to talk to the server. The server encodes the responses in bytestrings, for example, "1\NULJohn\NULTeddy\NUL501\NU

2 answers
  • 2021-01-19 11:40

    For these kinds of problems I turn to generics-sop instead of using GHC.Generics directly. generics-sop is built on top of GHC.Generics and provides functions for manipulating all the fields in a record in a uniform way.

    In this answer I use the ReadP parser which comes with base, but any other Applicative parser would do. Some preliminary imports:

    {-# language DeriveGeneric #-}
    {-# language FlexibleContexts #-}
    {-# language FlexibleInstances #-}
    {-# language TypeFamilies #-}
    {-# language DataKinds #-}
    {-# language TypeApplications #-} -- for the Proxy
    
    import Text.ParserCombinators.ReadP (ReadP,readP_to_S)
    import Text.ParserCombinators.ReadPrec (readPrec_to_P)
    import Text.Read (readPrec)
    import Data.Proxy
    import qualified GHC.Generics as GHC
    import Generics.SOP
    

    We define a typeclass that can produce an Applicative parser for each of its instances. Here we define only the instances for Int and Bool:

    class HasSimpleParser c where
        getSimpleParser :: ReadP c
    
    instance HasSimpleParser Int where
        getSimpleParser = readPrec_to_P readPrec 0
    
    instance HasSimpleParser Bool where
        getSimpleParser = readPrec_to_P readPrec 0
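
    The Read-based instances above do not consume any separator between fields, while the wire format in the question puts a NUL after each field. A minimal sketch of instances that handle that, assuming every field (including the last) is NUL-terminated and that numeric fields are plain decimal digits; the helper nul and the example parser idAndName are hypothetical names, not part of the answer:

    ```haskell
    {-# LANGUAGE FlexibleInstances #-}

    import Data.Char (isDigit)
    import Text.ParserCombinators.ReadP (ReadP, char, munch, munch1, readP_to_S)

    -- Same class as in the answer, repeated so the sketch compiles on its own.
    class HasSimpleParser c where
        getSimpleParser :: ReadP c

    -- Hypothetical helper: consume the NUL that ends a field.
    nul :: ReadP ()
    nul = () <$ char '\NUL'

    -- Assumption: a numeric field is a run of decimal digits followed by NUL.
    instance HasSimpleParser Int where
        getSimpleParser = read <$> munch1 isDigit <* nul

    -- Assumption: a text field is everything up to the next NUL.
    instance HasSimpleParser String where
        getSimpleParser = munch (/= '\NUL') <* nul

    -- Hypothetical example: one Int field followed by one String field.
    idAndName :: ReadP (Int, String)
    idAndName = (,) <$> getSimpleParser <*> getSimpleParser

    main :: IO ()
    main = print (readP_to_S idAndName "1\NULJohn\NUL")
    -- prints [((1,"John"),"")]
    ```

    Because munch and munch1 are greedy and deterministic, each field yields exactly one parse, so the result list has a single entry.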
    

    Now we define a generic parser for records in which every field has a HasSimpleParser instance:

    recParser :: (Generic r, Code r ~ '[xs], All HasSimpleParser xs) => ReadP r
    recParser = to . SOP . Z <$> hsequence (hcpure (Proxy @HasSimpleParser) getSimpleParser)
    

    The Code r ~ '[xs], All HasSimpleParser xs constraint means "this type has only one constructor, the list of field types is xs, and all the field types have HasSimpleParser instances".

    hcpure constructs an n-ary product (NP) where each component is a parser for the corresponding field of r. (NP products wrap each component in a type constructor, which in our case is the parser type ReadP).

    Then we use hsequence to turn an n-ary product of parsers into the parser of an n-ary product.

    Finally, we fmap into the resulting parser and turn the n-ary product back into the original r record using to. The Z and SOP constructors are required for turning the n-ary product into the sum-of-products the to function expects.
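
    To make the hsequence step concrete, here is a small sketch that builds the product of parsers by hand instead of with hcpure, for a fixed field list '[Int, Bool]. The names fieldParsers and fieldsParser are made up for this illustration:

    ```haskell
    {-# LANGUAGE DataKinds #-}
    {-# LANGUAGE GADTs #-}

    import Generics.SOP (I (..), NP (..), hsequence)
    import Text.ParserCombinators.ReadP (ReadP, readP_to_S)
    import Text.ParserCombinators.ReadPrec (readPrec_to_P)
    import Text.Read (readPrec)

    -- A product of parsers, written out by hand. hcpure builds the same
    -- shape from the HasSimpleParser instances.
    fieldParsers :: NP ReadP '[Int, Bool]
    fieldParsers = readPrec_to_P readPrec 0 :* readPrec_to_P readPrec 0 :* Nil

    -- hsequence runs the component parsers left to right and collects
    -- the results in a product of plain values (each wrapped in I).
    fieldsParser :: ReadP (NP I '[Int, Bool])
    fieldsParser = hsequence fieldParsers

    main :: IO ()
    main =
        case readP_to_S fieldsParser "55False" of
            [(I n :* I b :* Nil, rest)] -> print (n, b, rest)
            other -> putStrLn ("unexpected number of parses: " ++ show (length other))
    ```

    Wrapping the resulting NP I xs in Z and SOP and applying to is all that recParser adds on top of this.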


    Ok, let's define an example record and make it an instance of Generics.SOP.Generic:

    data Foo = Foo { x :: Int, y :: Bool } deriving (Show, GHC.Generic)
    
    instance Generic Foo -- Generic from generics-sop
    

    Let's check if we can parse Foo with recParser:

    main :: IO ()
    main = do
        print $ readP_to_S (recParser @Foo) "55False"
    

    The result is

    [(Foo {x = 55, y = False},"")]
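
    For comparison, this is roughly the hand-written parser that recParser saves you from writing for each record. It is only a sketch using base; fooParser is a name chosen here for illustration:

    ```haskell
    import Text.ParserCombinators.ReadP (ReadP, readP_to_S)
    import Text.ParserCombinators.ReadPrec (readPrec_to_P)
    import Text.Read (readPrec)

    data Foo = Foo { x :: Int, y :: Bool } deriving (Show)

    -- Hand-written counterpart of recParser @Foo: one field parser per
    -- constructor argument, combined applicatively in declaration order.
    fooParser :: ReadP Foo
    fooParser = Foo <$> readPrec_to_P readPrec 0 <*> readPrec_to_P readPrec 0

    main :: IO ()
    main = print (readP_to_S fooParser "55False")
    ```

    It should give the same result as the generic version; the difference is that this definition has to be updated by hand whenever a field is added or reordered.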
    
  • 2021-01-19 11:50

    You can write your own parser, but there is already a package that can do the parsing for you: cassava. SO is usually not the place for library recommendations, but I want to include this answer for people who need a solution that works out of the box and do not have the time to implement one themselves.

    {-# LANGUAGE DeriveGeneric #-}
    {-# LANGUAGE OverloadedStrings #-}
    
    import Data.Csv
    import Data.Vector (Vector)
    import qualified Data.ByteString.Lazy as B
    import GHC.Generics
    
    data Person = P { personId :: Int
                    , firstName :: String
                    , lastName :: String
                    } deriving (Eq, Generic, Show)
    
     -- the following are provided by friendly neighborhood Generic
    instance FromRecord Person
    instance ToRecord Person
    
    main :: IO ()
    main = do B.writeFile "test" "1\NULThomas\NULof Aquin"
              Right thomas <- decodeWith (DecodeOptions 0) NoHeader <$> 
                                  B.readFile "test"
    
              print (thomas :: Vector Person)
    

    Basically, cassava allows you to parse all X-separated structures into a Vector, provided you can write down a FromRecord instance (which needs a parseRecord :: Parser … function to work).
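
    If you cannot or do not want to derive the instance through Generic, a hand-written FromRecord instance is short. The following is a sketch under the assumption that every record has exactly three NUL-separated fields; it decodes from an in-memory literal instead of a file to stay self-contained:

    ```haskell
    {-# LANGUAGE OverloadedStrings #-}

    import Control.Monad (mzero)
    import Data.Csv
        ( DecodeOptions (..), FromRecord (..), HasHeader (NoHeader)
        , decodeWith, defaultDecodeOptions, (.!) )
    import Data.Vector (Vector)
    import qualified Data.Vector as V

    data Person = P
        { personId  :: Int
        , firstName :: String
        , lastName  :: String
        } deriving (Eq, Show)

    -- Manual instance: index into the record and convert each field.
    instance FromRecord Person where
        parseRecord v
            | V.length v == 3 = P <$> v .! 0 <*> v .! 1 <*> v .! 2
            | otherwise       = mzero  -- reject records of the wrong width

    main :: IO ()
    main =
        case decodeWith opts NoHeader "1\NULThomas\NULof Aquin" of
            Left err     -> putStrLn ("decode failed: " ++ err)
            Right people -> print (people :: Vector Person)
      where
        -- Byte 0 (NUL) as the field delimiter, as in the example above.
        opts = defaultDecodeOptions { decDelimiter = 0 }
    ```

    Matching on Left and Right explicitly, rather than binding Right in a do block, keeps a malformed response from turning into a pattern-match failure at runtime.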

    Side note on Generic: until recently I thought everything in Haskell had a Generic instance, or could derive one. That is not the case. I wanted to serialize some ThreadId to CSV/JSON and happened to find out that unboxed types are not so easily "genericked"!

    And before I forget: since you mention streaming and a server, there is also cassava-conduit, which might be of help.
