Haskell source encoding

逝去的感伤 2021-02-15 19:09

The Haskell 2010 Language Report says:

Haskell uses the Unicode [2] character set. However, source programs are currently biased toward the ASCII character set
4 answers
  •  逝去的感伤
    2021-02-15 19:58

    While the Haskell standard names Unicode as the set of possible characters (as opposed to, e.g., ASCII or Latin-1), it doesn't specify which of the several different encodings (UTF-8, UTF-16, or UTF-32, each with its byte-order variants) to use.

    Alex, the lexer generator that comes with the Haskell Platform, requires its input to be UTF-8 encoded*, which is why you see the code you mention. In practice, I think all the major implementations of Haskell require source to be in UTF-8.

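    * This is actually a real problem, as Data.Text has stored text internally as UTF-16 (the text package switched to UTF-8 in version 2.0), and String is a list of Unicode code points, so neither matches UTF-8 input byte for byte. It would be nice to be able to lex these directly rather than converting back and forth.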
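
To make the character set versus encoding distinction in the answer above concrete, here is a minimal sketch that encodes the same two characters in several Unicode encodings and prints the resulting bytes. It assumes the `text` and `bytestring` packages that ship with GHC; the names `sample` and `encodings` are made up for this illustration and are not taken from Alex or GHC.

```haskell
module Main where

import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical sample: U+03BB (Greek small letter lambda) followed by 'x'.
-- The escape keeps this file pure ASCII, so it does not depend on how
-- the compiler decodes the source file itself.
sample :: T.Text
sample = T.pack ['\x3BB', 'x']

-- The same characters, serialised under different encodings.
encodings :: [(String, T.Text -> B.ByteString)]
encodings =
  [ ("UTF-8",    TE.encodeUtf8)
  , ("UTF-16LE", TE.encodeUtf16LE)
  , ("UTF-16BE", TE.encodeUtf16BE)
  , ("UTF-32LE", TE.encodeUtf32LE)
  ]

main :: IO ()
main = do
  -- Print the raw bytes each encoding produces for the same text.
  mapM_ report encodings
  -- Decoding with the matching decoder recovers the original text.
  print (TE.decodeUtf8 (TE.encodeUtf8 sample) == sample)
  where
    report (name, enc) =
      putStrLn (name ++ ": " ++ show (B.unpack (enc sample)))
```

The character sequence is identical in every case; only the byte sequence differs (three bytes in UTF-8, four in UTF-16, eight in UTF-32). That is why a tool reading source files has to settle on one encoding even though the language report only names the character set.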
