Implementing Read typeclass where parsing strings includes “$”

天涯浪人 2021-01-05 16:21

I've been playing with Haskell for about a month. For my first "real" Haskell project I'm writing a parts-of-speech tagger. As part of this project I have a type called

2 answers
  • 2021-01-05 16:32

    You're abusing Read here.

    Show and Read are meant to print and parse valid Haskell values, to enable debugging, etc. This doesn't always work perfectly (e.g. if you import Data.Map qualified and then call show on a Map value, the call to fromList isn't qualified) but it's a valid starting point.
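
    As a small illustration of that convention, here is a sketch using a hypothetical three-constructor Tag with derived instances (the names roundTrips and shownMap are made up for this example). The derived Show output is valid Haskell source that the derived Read accepts, while the Map case shows the unqualified fromList mentioned above.

    ```haskell
    import qualified Data.Map as Map

    -- Hypothetical cut-down tag type; the instances are derived, not hand-written.
    data Tag = CC | CD | JJ deriving (Show, Read, Eq)

    -- Derived Show prints the constructor name and derived Read parses it back.
    roundTrips :: Bool
    roundTrips = read (show JJ) == JJ

    -- Show for Map prints "fromList ..." without the Map. qualifier, so the
    -- text is only valid source where fromList is in scope unqualified.
    shownMap :: String
    shownMap = show (Map.fromList [(1 :: Int, "a")])
    ```

    In GHCi, roundTrips evaluates to True and putStrLn shownMap prints fromList [(1,"a")].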

    If you want to print or parse your values to match some specific format, then use a pretty-printing library for the former and an actual parsing library (e.g. uu-parsinglib, polyparse, parsec, etc.) for the latter. They typically have much nicer support for parsing than ReadS (though ReadP in GHC isn't too bad).

    You may argue that this isn't necessary because it's just a quick'n'dirty hack, but quick'n'dirty hacks have a tendency to linger. Do yourself a favour and do it right the first time: there will be less to rewrite when you want to do it "properly" later on.
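
    One way to follow this advice without leaving base is a standalone ReadP parser that is not tied to the Read class at all. This is only a sketch: the Tag type is cut down, and tagTable, tagP and parseTag are hypothetical names chosen for the example.

    ```haskell
    import Data.Maybe (listToMaybe)
    import Text.ParserCombinators.ReadP
      (ReadP, choice, eof, readP_to_S, skipSpaces, string)

    -- Hypothetical cut-down tag type; Show stays derived for debugging.
    data Tag = CC | CD | JJ | JJR | JJS deriving (Show, Eq)

    -- External spelling of each tag. "JJ$" is not a Haskell token, which is
    -- why it lives in this table and not in a Read instance.
    tagTable :: [(String, Tag)]
    tagTable = [("CC", CC), ("CD", CD), ("JJ", JJ), ("JJR", JJR), ("JJ$", JJS)]

    -- Try every spelling; ReadP explores the alternatives in parallel.
    tagP :: ReadP Tag
    tagP = choice [tag <$ string s | (s, tag) <- tagTable]

    -- Accept optional surrounding whitespace and require the whole input
    -- to be consumed, so "JJ$" is not mistaken for "JJ".
    parseTag :: String -> Maybe Tag
    parseTag input =
      listToMaybe
        [t | (t, "") <- readP_to_S (skipSpaces *> tagP <* skipSpaces <* eof) input]
    ```

    With these definitions, parseTag "JJ$" gives Just JJS, parseTag "JJ" gives Just JJ, and parseTag "XX" gives Nothing. The same table could drive a pretty-printer, so the external format is kept in one place.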

  • 2021-01-05 16:47

    Don't use the Haskell lexer then. In GHC the read functions are built on the ReadP and ReadPrec parser combinators, which work much like Parsec; the Real World Haskell book has an excellent introduction to Parsec.

    Here's some code that seems to work:

    import Text.Read
    import Text.ParserCombinators.ReadP (string)
    
    data Tag = CC | CD | DT | EX | FW | IN | JJ | JJR | JJS deriving (Show)
    
    -- Turn each (text, value) pair into a parser that matches the text
    -- literally and then returns the value.
    strValMap :: [(String, a)] -> [ReadPrec a]
    strValMap = map (\(x, y) -> lift $ string x >> return y)
    
    instance Read Tag where
        -- Only a few tags are listed here; extend the table for the rest.
        readPrec = choice $ strValMap [
            ("CC", CC),
            ("CD", CD),
            ("JJ$", JJS)
            ]
    

    Just run it with:

    (read "JJ$") :: Tag
    

    The code is pretty self-explanatory. The parser string x matches the literal text x, and if it succeeds (a ReadP parser fails by producing no result, not by throwing an exception), then y is returned. We use choice to select among all of these. choice tries every alternative, so if you add a CCC constructor, the CC alternative matches only the first two characters of "CCC", is rejected because of the leftover "C", and the CCC alternative still succeeds. Of course, if you don't need this, use the left-biased <++ combinator instead; for ReadP and ReadPrec, <|> behaves the same as choice.
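
    The backtracking behaviour can be checked with a small experiment. The sketch below uses a hypothetical Tag that includes a CCC constructor, a helper named keyword, and a Greedy wrapper that exists only to contrast choice with the left-biased <++ from ReadPrec; none of these names come from the code above.

    ```haskell
    import Text.ParserCombinators.ReadP (string)
    import Text.Read (Read (..), ReadPrec, choice, lift, readMaybe, (<++))

    -- Hypothetical tag type where one spelling is a prefix of another.
    data Tag = CC | CCC | JJS deriving (Show, Eq)

    -- Match the literal text s, then return v.
    keyword :: String -> a -> ReadPrec a
    keyword s v = lift (string s) >> return v

    -- choice keeps every alternative alive, so "CCC" still parses.
    instance Read Tag where
        readPrec = choice [keyword "CC" CC, keyword "CCC" CCC, keyword "JJ$" JJS]

    -- Wrapper used only to give the same type a second, left-biased parser.
    newtype Greedy = Greedy Tag deriving (Show, Eq)

    -- <++ commits to the left parser as soon as it produces a result,
    -- so the CCC alternative is never tried on input "CCC".
    instance Read Greedy where
        readPrec = Greedy <$> (keyword "CC" CC <++ keyword "CCC" CCC)
    ```

    Here readMaybe "CCC" :: Maybe Tag gives Just CCC, while readMaybe "CCC" :: Maybe Greedy gives Nothing, because the committed CC match leaves an unconsumed "C". readMaybe is used in place of read so that a failed parse returns Nothing and does not raise an error.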
