Writing a parser from scratch in Haskell

前端 未结 2 695
一向
一向 2021-01-29 23:22

Is there any good tutorial for writing a parser for a given grammar in Haskell from scratch?

I found:

  1. parsing expressions and statements (HaskellWiki)

相关标签:
2条回答
  • 2021-01-29 23:54

    It's actually surprisingly easy to build Parsec-from-scratch. The actual library code itself is heavily generalized and optimized which contorts the core abstraction, but if you're just building things from scratch to understand more about what's going on you can write it in just a few lines of code. I'll build a slightly weaker Applicative parser below.

    Essentially, we want to produce a Applicative, Parser along with a primitive parser value

    satisfy :: (Char -> Bool) -> Parser Char
    

    and a few combinators such as try, which "undoes" a parser if it fails

    try :: Parser a -> Parser a
    

    and orElse which allows us to continue with a second parser if the first parser fails. Usually this is actually written with the infix combinator (<|>)

    orElse, (<|>) :: Parser a -> Parser a -> Parser a
    

    Since our Applicative needs to keep track of the current stream state and be able to fail, we'll build it by combining the state Applicative and the Either applicative.

    type Error = String
    newtype Parser a = P { unP :: String -> (String, Either Error a) }
    
    instance Functor Parser where
      fmap f (P st) = P $ \stream -> case st stream of
        (res, Left err) -> (res, Left err)
        (res, Right a ) -> (res, Right (f a))
    
    instance Applicative Parser where
      pure a = P (\stream -> (stream, Right a))
      P ff <*> P xx = P $ \stream0 -> case ff stream0 of   -- produce an f
        (stream1, Left err) -> (stream1, Left err)
        (stream1, Right f ) -> case xx stream1 of          -- produce an x
          (stream2, Left err) -> (stream2, Left err)
          (stream2, Right x ) -> (stream2, Right (f x))    -- return (f x)
    

    If we follow the (<*>) method in the Applicative instance carefully we see that it just passes the stream into the f-producing Parser, takes the result stream and, if successful, passes it to the x-producing Parser and if they both succeed it returns their application (f x). This means that if we have a function-producing parser and an argument-producing parser we can sequence them with (<*>)

    -- given
    parseChar :: Char -> Parser Char
    
    parseHi   :: Parser (Char, Char) -- parses 'h' then 'i'
    parseHi = pure (,) <$> parseChar 'h' <*> parseChar 'i'
    

    We can use the mechanics of this Applicative to build the needed combinators as well. Here's satisfy

    -- | Peek at the next character and return successfully if it satisfies a predicate
    satisfy :: (Char -> Bool) -> Parser Char
    satisfy f = P $ \stream -> case stream of
      []                 -> ([], Left "end of stream")
      (c:cs) | f c       -> (cs, Right c)
             | otherwise -> (cs, Left "did not satisfy")
    

    And here's try

    -- | Run a parser but if it fails revert the stream to it's original state
    try :: Parser a -> Parser a
    try (P f) = P $ \stream0 -> case f stream0 of
      (_      , Left err) -> (stream0, Left err)
      (stream1, Right a ) -> (stream1, Right a )
    

    And here's orElse

    orElse :: Parser a -> Parser a -> Parser a
    orElse (P f1) (P f2) = P $ \stream0 -> case f1 stream0 of
      (stream1, Left err) -> f2 stream1
      (stream1, Right a ) -> (stream1, Right a)
    

    Typically at this point we also note that Parser forms an Alternative instance with orElse if we also provide an immediately failing parser, empty

    instance Alternative Parser where
      empty = P $ \stream -> (stream, Left "empty")
      (<|>) = orElse
    
      many = manyParser
      some = someParser
    

    And we can write manyParser and someParser as combinators which run a parser repeatedly.

    -- | 0 or more
    manyParser :: Parser a -> Parser [a]
    manyParser (P f) = P go where
      go stream = case f stream of
        (_      , Left err) -> (stream, Right [])  -- throws away the error
        (stream', Right a ) -> case go stream' of 
          (streamFin, Left err) -> (streamFin, Left err)
          (streamFin, Right as) -> (streamFin, Right (a : as))
    
    -- | 1 or more
    someParser :: Parser a -> Parser [a]
    someParser (P f) = P $ \stream -> case f stream of
      (stream', Left err) -> (stream', Left err)
      (stream', Right a ) -> 
        let (P fmany) = manyParser (P f)
        in case fmany stream' of
          (stream'', Left err) -> (stream'', Left err)
          (stream'', Right as) -> (stream'', Right (a:as))
    

    And from here we can begin to work at much higher levels of abstraction.

    char :: Char -> Parser Char
    char c = satisfy (== c)
    
    string :: String -> Parser String
    string []     = pure []
    string (c:cs) = (:) <$> char c <*> string cs
    
    oneOf :: [Char] -> Parser Char
    oneOf cs = satisfy (`elem` cs)
    
    parens :: Parser a -> Parser a
    parens parseA = dropFirstAndLast <$> char '(' <*> parseA <*> char ')'
      where
        dropFirstAndLast _ a _ = a
    
    0 讨论(0)
  • 2021-01-29 23:56

    I really liked Graham Hutton's "Programming in Haskell". It gives a gentle introduction to monads and parser combinators. The eighth chapter builds a basic parser library.

    Here is the link to Programming in Haskell book site. You will also find link to the parser library and basic expression parser.

    Further, if you are interested, you can also look at uu-parsinglib applicative style parser combinators developed at Utrecht University.

    0 讨论(0)
提交回复
热议问题