ANTLR on a noisy data stream

温柔的废话 2021-01-22 16:39

I'm very new to the ANTLR world and I'm trying to figure out how I can use this parsing tool to interpret a set of "noisy" strings. What I would like to achieve is the follow

1 answer
  •  执笔经年
    2021-01-22 17:42

    You could define just a few lexer rules (the ones you posted, for example) and, as the last lexer rule, match any single character and skip() it:

    VERB            : 'SLEEPING' | 'WALKING';
    SUBJECT         : 'CAT'|'DOG'|'BIRD';
    INDIRECT_OBJECT : 'CAR'| 'SOFA';
    ANY             : . {skip();};
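
    To make the effect of these lexer rules concrete, here is a rough emulation in plain Python. It is not ANTLR: the names TOKEN_RE and tokenize are made up for this sketch, and a regex alternation picks the first alternative that matches, which is assumed to give the same result as the lexer for this particular set of keywords.

```python
import re

# Token patterns listed in the same order as the lexer rules; ANY comes last.
# (Hypothetical emulation of the grammar's lexer, not generated by ANTLR.)
TOKEN_RE = re.compile(
    r"(?P<VERB>SLEEPING|WALKING)"
    r"|(?P<SUBJECT>CAT|DOG|BIRD)"
    r"|(?P<INDIRECT_OBJECT>CAR|SOFA)"
    r"|(?P<ANY>.)",
    re.DOTALL,  # let ANY match newlines too, like '.' in a lexer rule
)


def tokenize(text):
    """Return (type, text) pairs, dropping every character matched by ANY."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        if match.lastgroup != "ANY":  # emulate skip()
            tokens.append((match.lastgroup, match.group()))
    return tokens


text = ("It's 10PM and the Lazy CAT is currently SLEEPING heavily on the "
        "SOFA in front of the TV. The DOG is WALKING on the SOFA.")
for token_type, token_text in tokenize(text):
    print(token_type, token_text)
```

    Only the six keyword tokens survive; every other character is consumed one at a time by the ANY alternative and thrown away.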
    

    The order is important here: the lexer tries to match tokens from top to bottom, so anything that cannot be matched as one of the tokens VERB, SUBJECT or INDIRECT_OBJECT "falls through" to the ANY rule, which consumes a single character and skips it. You can then use these parser rules to filter your input stream:

    parse
      :  sentenceParts+ EOF
      ;
    
    sentenceParts
      :  SUBJECT VERB INDIRECT_OBJECT
      ;  
    

    which will parse the input text:

    It's 10PM and the Lazy CAT is currently SLEEPING heavily on the SOFA in front of the TV. The DOG is WALKING on the SOFA.

    as follows:

    [Image: parse tree]
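
    What the two parser rules do with the filtered token stream can be sketched in plain Python as well. This is only an illustration: the parse function and the hard-coded token list are assumptions made for this sketch, and a parser generated by ANTLR would report and recover from errors rather than raise an exception.

```python
def parse(tokens):
    """Group (type, text) tokens into (subject, verb, indirect_object) triples.

    Mirrors the grammar:
        parse         : sentenceParts+ EOF ;
        sentenceParts : SUBJECT VERB INDIRECT_OBJECT ;
    """
    expected = ("SUBJECT", "VERB", "INDIRECT_OBJECT")
    # sentenceParts+ needs at least one complete group of three tokens.
    if not tokens or len(tokens) % 3 != 0:
        raise ValueError("expected one or more SUBJECT VERB INDIRECT_OBJECT groups")
    sentences = []
    for i in range(0, len(tokens), 3):
        group = tokens[i:i + 3]
        types = tuple(token_type for token_type, _ in group)
        if types != expected:
            raise ValueError(f"unexpected token sequence at index {i}: {types}")
        sentences.append(tuple(token_text for _, token_text in group))
    return sentences


# Token stream left over after all noise characters were skipped by the lexer.
tokens = [
    ("SUBJECT", "CAT"), ("VERB", "SLEEPING"), ("INDIRECT_OBJECT", "SOFA"),
    ("SUBJECT", "DOG"), ("VERB", "WALKING"), ("INDIRECT_OBJECT", "SOFA"),
]
print(parse(tokens))
# prints [('CAT', 'SLEEPING', 'SOFA'), ('DOG', 'WALKING', 'SOFA')]
```

    Each triple corresponds to one sentenceParts node under the parse rule.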
