Choosing a Haskell parser

一生所求 2021-01-30 07:25

There are many open-source parser implementations available to us in Haskell. Parsec seems to be the standard for text parsing and attoparsec seems to be a popular choice for b

4 answers
  • 2021-01-30 07:44

    Bryan O’Sullivan’s blog post "What’s in a parser? Attoparsec rewired (2/2)" includes a nice performance benchmark comparing several implementations, along with some comments comparing memory usage.

  • 2021-01-30 07:48

    Just to add to Don's post: Personally, I quite like Text.ParserCombinators.ReadP (part of base) for no-nonsense quick and easy stuff. Particularly when Parsec seems like overkill.
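To give a feel for how little ceremony ReadP needs, here is a minimal sketch that parses a dotted version string such as "1.2.3". The `Version` type and the `number`, `version`, and `parseVersion` names are made up for illustration; only the combinators imported from `Text.ParserCombinators.ReadP` come from base.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- Hypothetical example type: a dotted version number such as "1.2.3".
newtype Version = Version [Int]
  deriving (Show, Eq)

-- One or more digits, read as an Int.
number :: ReadP Int
number = read <$> munch1 isDigit

-- Numbers separated by dots; eof rejects any trailing input.
version :: ReadP Version
version = Version <$> sepBy1 number (char '.') <* eof

-- ReadP returns a list of all possible parses; take the first one, if any.
parseVersion :: String -> Maybe Version
parseVersion s = case readP_to_S version s of
  ((v, _) : _) -> Just v
  []           -> Nothing

main :: IO ()
main = do
  print (parseVersion "1.2.3")  -- Just (Version [1,2,3])
  print (parseVersion "1..3")   -- Nothing
```

Note that a failed parse is just an empty result list, so there is no error message to report. That is part of the trade-off against Parsec.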

    There is a bytestringreadp library for the bytestring version, but it doesn't cover Char8 bytestrings, and I suspect attoparsec would be a better choice at this point.

  • 2021-01-30 07:50

    You have several good options.

    For lightweight parsing of String types:

    • parsec
    • polyparse

    For packed bytestring parsing, e.g. of HTTP headers:

    • attoparsec
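As a rough illustration of the HTTP header case, the sketch below parses a single `Name: value` line from a strict ByteString with attoparsec. It assumes the attoparsec and bytestring packages are installed; the `header` parser is a simplified, hypothetical one and does not try to follow the full HTTP grammar.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Attoparsec.ByteString.Char8 as A
import Data.ByteString (ByteString)

-- Simplified header line: a name, a colon, optional spaces, then the value.
header :: A.Parser (ByteString, ByteString)
header = do
  name  <- A.takeWhile1 (\c -> c /= ':' && c /= '\r' && c /= '\n')
  _     <- A.char ':'
  A.skipWhile (== ' ')
  value <- A.takeTill (\c -> c == '\r' || c == '\n')
  return (name, value)

-- parseOnly runs the parser on a complete input with no resumption.
parseHeader :: ByteString -> Either String (ByteString, ByteString)
parseHeader = A.parseOnly (header <* A.endOfInput)

main :: IO ()
main = print (parseHeader "Host: example.com")
-- Right ("Host","example.com")
```

The results are slices of the input ByteString rather than freshly built Strings, which is a large part of why this style suits packed data.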

    For actual binary data most people use either:

    • binary -- for lazy binary parsing
    • cereal -- for strict binary parsing
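For the binary case, a small sketch with the binary package might look like the following. The `Header` record and its three-byte layout (one version byte followed by a big-endian 16-bit length) are invented for the example; `Get`, `getWord8`, `getWord16be`, and `runGetOrFail` are from `Data.Binary.Get`.

```haskell
import Data.Binary.Get (Get, getWord8, getWord16be, runGetOrFail)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8, Word16)

-- Hypothetical wire format: 1 version byte, then a big-endian 16-bit length.
data Header = Header
  { hdrVersion :: Word8
  , hdrLength  :: Word16
  } deriving (Show, Eq)

getHeader :: Get Header
getHeader = Header <$> getWord8 <*> getWord16be

-- runGetOrFail reports short or malformed input as a Left
-- instead of throwing an exception.
decodeHeader :: BL.ByteString -> Either String Header
decodeHeader bytes = case runGetOrFail getHeader bytes of
  Left  (_, _, msg) -> Left msg
  Right (_, _, hdr) -> Right hdr

main :: IO ()
main = do
  print (decodeHeader (BL.pack [1, 0x01, 0x02]))
  -- Right (Header {hdrVersion = 1, hdrLength = 258})
  print (decodeHeader (BL.pack [1]))  -- a Left, since the input is too short
```

cereal offers a very similar `Get` interface over strict ByteStrings, so a decoder like this should port with few changes.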

    The main question to ask yourself is what is the underlying string type?

    • String?
    • bytestring (strict)?
    • bytestring (lazy)?
    • Unicode text?

    That decision largely determines which parser toolset you'll use.

    The second question to ask is: do I already have a grammar for the data type? If so, I can just use Happy:

    • The Happy parser generator

    And obviously, for specific data formats there are a variety of good existing parsers:

    • XML
      • haxml
      • xml-light
      • hxt
      • hexpat
    • CSV
      • bytestring-csv
      • csv
    • JSON
      • json
    • rss/atom
      • feed
  • 2021-01-30 07:51

    I recently converted some code from Parsec to Attoparsec. Both are quite capable.

    Attoparsec wins on performance and memory footprint, but Parsec provides better error reporting and has more complete documentation.
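To make the error-reporting point concrete, here is a small Parsec sketch, assuming the parsec package is installed. The `key=number` format and the `keyValue` name are invented for the example. The `<?>` labels are what Parsec uses to phrase its "expecting ..." messages, and every error carries a source position.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical format: a key made of letters, '=', then a decimal number.
keyValue :: Parser (String, Int)
keyValue = do
  key <- many1 letter <?> "key"
  _   <- char '='
  val <- many1 digit <?> "number"
  eof
  return (key, read val)

main :: IO ()
main = do
  print (parse keyValue "<input>" "port=80")
  -- Malformed input: the ParseError includes the source name, line and
  -- column, and a description of what was found and what was expected.
  case parse keyValue "<input>" "port=80x" of
    Left err -> putStrLn ("parse error at " ++ show err)
    Right kv -> print kv
```

With attoparsec's `parseOnly`, the failure would come back as a plain `String` without position tracking, so if users will be reading the error messages, Parsec is usually the more comfortable choice.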
