Excluding certain elements from a specified set in Parsing Expression Grammar (PEG.js)?

廉价感情. Submitted on 2019-12-24 12:11:20

Question


I am writing a lexer for Haskell using JavaScript and Parsing Expression Grammar, the implementation I use being PEG.js.
I have a problem with making it work for reserved words, as demonstrated in a simplified form here:

program = ( word / " " )+  
word = ( reserved / id )  
id = ( "a" / "b" )+  
reserved = ( "aa" )

The point here is to get a series of tokens, separated by spaces, each of which is either the sequence "aa" or an arbitrary sequence of a's and/or b's.
What I actually get is one of two things: either every token that is not a space is recognized as id, or a token that should be recognized as id has all its initial pairs of a's eaten up as reserved. For example,
"aab" gets recognized as reserved "aa" followed by id "b".

The way the Haskell lexical specification solves this ambiguity is to specify id like this:

id = ( "a" / "b" )+[BUT NOT reserved]

I have tried replicating this using various combinations of the PEG ! and & operators to achieve the same effect, but have not found a way to get this to work properly.
The solution:

id = !reserved ( "a" / "b" )+

that I've seen suggested in several places does not work.
Is this a limitation of the particular PEG implementation, of PEG itself, or (hopefully) of my methods?

Thanks in advance!


Answer 1:


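!reserved ident is a perfectly acceptable technique in any PEG implementation, and PEG.js seems to support it as well. By the way, you should also add !id at the end of the definition of reserved, so that "aa" only counts as reserved when no further identifier characters follow it.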
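
To make the suggestion above concrete, here is a minimal sketch in plain JavaScript that hand-rolls a few PEG-style combinators instead of calling PEG.js. The helper names (lit, seq, choice, plus, not, tokenize) and the extra idChar rule are assumptions made for illustration; they are not part of PEG.js or of the original grammar. The sketch implements reserved = "aa" !idChar and id = !reserved idChar+, so it looks ahead for a single identifier character rather than for a whole id.

```javascript
// Hypothetical hand-rolled PEG combinators (NOT the PEG.js API).
// A parser takes (input, pos) and returns the end offset, or -1 on failure.
const lit = (s) => (input, pos) => (input.startsWith(s, pos) ? pos + s.length : -1);

const seq = (...parsers) => (input, pos) => {
  for (const p of parsers) {
    pos = p(input, pos);
    if (pos < 0) return -1;
  }
  return pos;
};

const choice = (...parsers) => (input, pos) => {
  for (const p of parsers) {
    const end = p(input, pos);
    if (end >= 0) return end; // ordered choice: the first success wins
  }
  return -1;
};

// One or more repetitions; stops as soon as p fails or makes no progress.
const plus = (p) => (input, pos) => {
  let end = p(input, pos);
  if (end < 0) return -1;
  for (let next = p(input, end); next > end; next = p(input, end)) end = next;
  return end;
};

// Negative lookahead (the PEG "!" operator): succeeds without consuming input.
const not = (p) => (input, pos) => (p(input, pos) < 0 ? pos : -1);

// idChar   = "a" / "b"          (assumed helper rule)
// reserved = "aa" !idChar
// id       = !reserved idChar+
const idChar = choice(lit("a"), lit("b"));
const reserved = seq(lit("aa"), not(idChar));
const id = seq(not(reserved), plus(idChar));

// program = ( word / " " )+  with  word = reserved / id
function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    if (input[pos] === " ") { pos += 1; continue; }
    let kind = "reserved";
    let end = reserved(input, pos);
    if (end < 0) { kind = "id"; end = id(input, pos); }
    if (end < 0) throw new Error(`Unexpected character at offset ${pos}`);
    tokens.push([kind, input.slice(pos, end)]);
    pos = end;
  }
  return tokens;
}

console.log(JSON.stringify(tokenize("aab aa aaaa b")));
// [["id","aab"],["reserved","aa"],["id","aaaa"],["id","b"]]
```

I look ahead for one character rather than writing !id literally because, as far as I can tell, when id itself starts with !reserved the two predicates end up referring to each other, and tracing that grammar by hand on "aaaa" gives two reserved tokens instead of one id.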




Answer 2:


As far as I know, PEG choices are ordered. That basically means that the alternatives of a choice are tried deterministically from the first to the last one, and the first one that matches wins. That said, you just need to put the "reserved" alternative before the "identifier" one.
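
The effect of ordering can be seen in a small sketch in plain JavaScript. It does not use PEG.js; the tokenize helper and the way alternatives are passed in are assumptions made purely for illustration. It runs the simplified grammar from the question, without any lookahead, once with id first and once with reserved first.

```javascript
// Each rule returns the end offset of its match, or -1 on failure.
// reserved = "aa"
const reserved = (s, i) => (s.startsWith("aa", i) ? i + 2 : -1);

// id = ( "a" / "b" )+
const id = (s, i) => {
  let j = i;
  while (j < s.length && (s[j] === "a" || s[j] === "b")) j += 1;
  return j > i ? j : -1;
};

// Hypothetical tokenizer: tries the alternatives in the given order,
// like a PEG ordered choice, and skips single spaces between tokens.
function tokenize(input, alternatives) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    if (input[pos] === " ") { pos += 1; continue; }
    let matched = false;
    for (const [kind, rule] of alternatives) {
      const end = rule(input, pos);
      if (end >= 0) {
        tokens.push([kind, input.slice(pos, end)]);
        pos = end;
        matched = true;
        break; // first matching alternative wins
      }
    }
    if (!matched) throw new Error(`Unexpected character at offset ${pos}`);
  }
  return tokens;
}

const idFirst = tokenize("aa aab", [["id", id], ["reserved", reserved]]);
const reservedFirst = tokenize("aa aab", [["reserved", reserved], ["id", id]]);

console.log(JSON.stringify(idFirst));
// [["id","aa"],["id","aab"]]
console.log(JSON.stringify(reservedFirst));
// [["reserved","aa"],["reserved","aa"],["id","b"]]
```

These are the two outcomes described in the question: with id first, reserved is never reached, and with reserved first, "aab" is split. So in this sketch ordering decides which alternative wins, but on its own it does not stop reserved from matching a prefix of a longer identifier; that still seems to need a lookahead on reserved.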



Source: https://stackoverflow.com/questions/4933788/excluding-certain-elements-from-a-specified-set-in-parsing-expressive-grammar-p
