Traversal of tokens using ParserRuleContext in listener - ANTLR4

Question

While walking the parse tree with a Listener, how can I use the ParserRuleContext to peek at the next token, or the next few tokens, in the token stream?

In the code below I am trying to peek at all the tokens after the current token till the EOF:

@Override 
public void enterSemicolon(JavaParser.SemicolonContext ctx) {

    Token tok, semiColon = ctx.getStart();  
    int currentIndex = semiColon.getStartIndex();
    int reqInd = currentIndex+1;
    TokenSource tokSrc= semiColon.getTokenSource();
    CharStream srcStream = semiColon.getInputStream();
    srcStream.seek(currentIndex);

    while(true){

        tok = tokSrc.nextToken() ;
        System.out.println(tok);
        if(tok.getText()=="<EOF>"){break;}
        srcStream.seek(reqInd++);
    }
}

But the output I get is:

            .
            .
            .
            .
            .
[@-1,131:130='',<-1>,13:0]
[@-1,132:131='',<-1>,13:0]
[@-1,133:132='',<-1>,13:0]
[@-1,134:133='',<-1>,13:0]
[@-1,135:134='',<-1>,13:0]
[@-1,136:135='',<-1>,13:0]
[@-1,137:136='',<-1>,13:0]
[@-1,138:137='',<-1>,13:0]
[@-1,139:138='',<-1>,13:0]
[@-1,140:139='',<-1>,13:0]
[@-1,141:140='',<-1>,13:0]
[@-1,142:141='',<-1>,13:0]
[@-1,143:142='',<-1>,13:0]
[@-1,144:143='',<-1>,13:0]
[@-1,145:144='',<-1>,13:0]
[@-1,146:145='',<-1>,13:0]
[@-1,147:146='',<-1>,13:0]
[@-1,148:147='',<-1>,13:0]
[@-1,149:148='',<-1>,13:0]
[@-1,150:149='',<-1>,13:0]
[@-1,151:150='',<-1>,13:0]
[@-1,152:151='',<-1>,13:0]
[@-1,153:152='',<-1>,13:0]
[@-1,154:153='',<-1>,13:0]
[@-1,155:154='',<-1>,13:0]
[@-1,156:155='',<-1>,13:0]
[@-1,157:156='',<-1>,13:0]
[@-1,158:157='',<-1>,13:0]
[@-1,159:158='',<-1>,13:0]
[@-1,160:159='',<-1>,13:0]
[@-1,161:160='<EOF>',<-1>,13:0]
[@-1,137:136='',<-1>,13:0]
[@-1,138:137='',<-1>,13:0]
[@-1,139:138='',<-1>,13:0]
[@-1,140:139='',<-1>,13:0]
[@-1,141:140='',<-1>,13:0]
[@-1,142:141='',<-1>,13:0]
[@-1,143:142='',<-1>,13:0]
[@-1,144:143='',<-1>,13:0]
[@-1,145:144='',<-1>,13:0]
[@-1,146:145='',<-1>,13:0]
[@-1,147:146='',<-1>,13:0]
[@-1,148:147='',<-1>,13:0]
[@-1,149:148='',<-1>,13:0]
[@-1,150:149='',<-1>,13:0]
[@-1,151:150='',<-1>,13:0]
[@-1,152:151='',<-1>,13:0]
[@-1,153:152='',<-1>,13:0]
[@-1,154:153='',<-1>,13:0]
[@-1,155:154='',<-1>,13:0]
[@-1,156:155='',<-1>,13:0]
[@-1,157:156='',<-1>,13:0]
[@-1,158:157='',<-1>,13:0]
[@-1,159:158='',<-1>,13:0]
[@-1,160:159='',<-1>,13:0]
[@-1,161:160='<EOF>',<-1>,13:0]
            .
            .
            .
            .

Although I can traverse all the tokens up to EOF, I am unable to get the actual content or type of the tokens. Is there a neat way of doing this while traversing with a listener?

Answer 1:

Hard to be certain, but

tok = tokSrc.nextToken() ;

appears to be rerunning the lexer, starting at a presumed proper token boundary, but without having reset the lexer. The lexer throwing errors might explain the observed behavior.

Still, a better approach would be to simply recover the existing Token stream:

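// Needs org.antlr.v4.runtime.CommonTokenStream on the import list.
public class Walker implements YourJavaListener {

    CommonTokenStream tokens;

    public Walker(JavaParser parser) {
        // Reuse the stream the parser already consumed instead of re-lexing.
        tokens = (CommonTokenStream) parser.getTokenStream();
    }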

then access the stream to get the desired tokens:

@Override 
public void enterSemicolon(JavaParser.SemicolonContext ctx) {
    TerminalNode semi = ctx.semicolon(); // adjust as needed for your impl.
    Token tok = semi.getSymbol();
    int idx = tok.getTokenIndex();

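    // Token.EOF has the same value as IntStream.EOF and cannot be mistaken
    // for java.util.stream.IntStream.
    while (tok.getType() != Token.EOF) {
        System.out.println(tok);
        tok = tokens.get(++idx); // pre-increment, so the semicolon is not read twice
    }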
}
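
To make the loop shape concrete, here is a minimal, library-free sketch of index-based lookahead over an already buffered token list. It deliberately uses no ANTLR classes: Tok, EOF and textAfter are hypothetical stand-ins for Token, Token.EOF and the listener body, and the sketch assumes the starting index points at a non-EOF token.

```java
import java.util.ArrayList;
import java.util.List;

// Library-free model of a buffered token stream. Tok and EOF are hypothetical
// stand-ins for ANTLR's Token and Token.EOF, not real ANTLR classes.
final class LookaheadSketch {
    static final int EOF = -1;

    static final class Tok {
        final int type;
        final String text;
        Tok(int type, String text) { this.type = type; this.text = text; }
    }

    // Returns the text of every token after position idx, stopping before EOF.
    // Assumes idx refers to a non-EOF token and that the list ends with EOF.
    static List<String> textAfter(List<Tok> tokens, int idx) {
        List<String> out = new ArrayList<>();
        // Pre-increment: move to the next index before reading, so the
        // token at idx itself is not returned.
        Tok tok = tokens.get(++idx);
        while (tok.type != EOF) {
            out.add(tok.text);
            tok = tokens.get(++idx);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> tokens = List.of(
            new Tok(1, "x"), new Tok(2, ";"), new Tok(1, "y"),
            new Tok(2, ";"), new Tok(EOF, "<EOF>"));
        // Everything after the first ';' (index 1), up to but excluding EOF.
        System.out.println(textAfter(tokens, 1));
    }
}
```

The point of the model is that lookahead becomes plain index arithmetic once the tokens are buffered, with the loop ending on the token type rather than on the token text.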

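An entirely different approach that might serve your ultimate purpose is to get a limited set of tokens, bounded by the first and last token of the parent context:

ParserRuleContext pctx = ctx.getParent();
// ParserRuleContext.getTokens(int) only filters child tokens by type, so take
// the range from the token stream instead. getStop() is not yet set if the
// listener runs while the parser is still parsing.
List<Token> toks = tokens.get(pctx.getStart().getTokenIndex(), pctx.getStop().getTokenIndex());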
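
As a rough illustration of taking the tokens between a context's first and last token, the following library-free sketch models the buffered stream as a plain list. RangeSketch and between are hypothetical names, and startIdx/stopIdx stand in for the token indexes of the context's start and stop tokens. Both ends are included.

```java
import java.util.List;

// Library-free model: the list stands in for the buffered token stream, and
// startIdx/stopIdx for the token indexes of a context's first and last token.
final class RangeSketch {
    // Returns tokens startIdx..stopIdx with BOTH ends included.
    static List<String> between(List<String> tokens, int startIdx, int stopIdx) {
        // subList's upper bound is exclusive, hence the +1.
        return tokens.subList(startIdx, stopIdx + 1);
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("int", "x", "=", "1", ";", "<EOF>");
        // The whole statement, without the trailing EOF token.
        System.out.println(between(tokens, 0, 4));
    }
}
```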

Source: https://stackoverflow.com/questions/29941553/traversal-of-tokens-using-parserrulecontext-in-listener-antlr4

Tags

java

antlr

abstract-syntax-tree

antlr4

static-analysis