In Parsec, is there a way to prevent lexeme from consuming newlines?

生来不讨喜 2021-02-13 12:36

All of the parsers in Text.Parsec.Token politely use lexeme to eat whitespace after a token. Unfortunately for me, whitespace includes new lines, whic

4 answers

再見小時候 (original poster)

2021-02-13 12:44
No, there is not. Here is the relevant code.

From Text.Parsec.Token:
```
lexeme p
    = do{ x <- p; whiteSpace; return x  }


--whiteSpace
whiteSpace
    | noLine && noMulti  = skipMany (simpleSpace <?> "")
    | noLine             = skipMany (simpleSpace <|> multiLineComment <?> "")
    | noMulti            = skipMany (simpleSpace <|> oneLineComment <?> "")
    | otherwise          = skipMany (simpleSpace <|> oneLineComment <|> multiLineComment <?> "")
    where
      noLine  = null (commentLine languageDef)
      noMulti = null (commentStart languageDef)
```
Notice that the where clause of `whiteSpace` only looks at options that deal with comments; there is no setting that controls which characters count as whitespace. The `lexeme` function uses `whiteSpace`, and `lexeme` is used liberally throughout the rest of Text.Parsec.Token.
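Since Text.Parsec.Token offers no switch for this, one possible workaround is to skip the token module and define a small lexeme combinator by hand. The following is only a minimal sketch: the names `inlineSpace`, `lexeme'`, `identifier'`, `tokenLine`, and `program` are hypothetical and not part of Parsec, and the grammar (lines of identifiers) is an assumption chosen just to show newlines surviving as separators.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Skip spaces and tabs only; newlines stay in the input.
inlineSpace :: Parser ()
inlineSpace = skipMany (oneOf " \t")

-- Hypothetical stand-in for Text.Parsec.Token's lexeme:
-- run p, then eat trailing spaces and tabs but not newlines.
lexeme' :: Parser a -> Parser a
lexeme' p = p <* inlineSpace

-- An identifier token: a letter followed by letters or digits.
identifier' :: Parser String
identifier' = lexeme' ((:) <$> letter <*> many alphaNum)

-- One line holds one or more identifiers.
tokenLine :: Parser [String]
tokenLine = many1 identifier'

-- Lines are separated (and optionally ended) by a newline.
program :: Parser [[String]]
program = inlineSpace *> (tokenLine `sepEndBy` (newline *> inlineSpace)) <* eof
```

With these definitions, `parse program "" "a b\nc d\n"` should group the identifiers by line, because the newline is still available for `sepEndBy` to match after each token has been read.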

Update Sept. 28, 2015

The ultimate solution for me was to use a proper lexical analyser (alex). Parsec does a very good job as a parsing library and it is a credit to the design that it can be mangled into doing lexical analysis, but for all but small and simple projects it will quickly become unwieldy. I now use alex to create a linear set of tokens and then Parsec turns them into an AST.
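As a rough illustration of that split, the sketch below runs Parsec over a list of tokens rather than over characters. The `Tok` type is a hypothetical stand-in for whatever an Alex-generated lexer would actually emit, the helper names are invented for this example, and source positions are deliberately not tracked.

```haskell
import Text.Parsec

-- Hypothetical token type; a real Alex lexer would define its own.
data Tok = TIdent String | TNewline
  deriving (Show, Eq)

type TokParser a = Parsec [Tok] () a

-- Accept one token if the function returns Just.
-- The position is left unchanged for simplicity.
satisfyTok :: (Tok -> Maybe a) -> TokParser a
satisfyTok = tokenPrim show (\pos _ _ -> pos)

ident :: TokParser String
ident = satisfyTok $ \t -> case t of
  TIdent s -> Just s
  _        -> Nothing

newlineTok :: TokParser ()
newlineTok = satisfyTok $ \t -> case t of
  TNewline -> Just ()
  _        -> Nothing

-- Newlines are ordinary tokens, so the grammar decides where they matter.
statements :: TokParser [[String]]
statements = (many1 ident `sepEndBy` newlineTok) <* eof
```

Because whitespace handling has already happened in the lexer, the parser never needs `lexeme` at all; a newline only shows up where the lexer chose to emit a `TNewline` token.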