How to implement JavaScript/ECMAScript “no LineTerminator here” rule in JavaCC?

一整个雨季 2021-01-22 02:23

I am continuing to work on my JavaCC grammar for ECMAScript 5.1. It is going quite well; I think I've covered most of the expressions now.

I have now two questions, bot

2 answers
  •  心在旅途
    2021-01-22 03:14

    Update: As Gunther pointed out, my original solution was not correct because of this paragraph in section 7.4 of the spec:

    Comments behave like white space and are discarded except that, if a MultiLineComment contains a line terminator character, then the entire comment is considered to be a LineTerminator for purposes of parsing by the syntactic grammar.

    I'm posting a correction but leaving my original solution at the end of this answer.
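
    The rule quoted from section 7.4 can be checked in isolation. The following is a minimal sketch in plain Java, not part of the original solution: the class and method names are hypothetical, and it takes the comment text as a plain string rather than a JavaCC token.

```java
// Hypothetical helper for illustration; not generated by JavaCC and not from the original answer.
final class CommentLineTerminators {

    /** True for the four ECMAScript 5.1 line terminator characters: LF, CR, LS and PS. */
    static boolean isLineTerminator(char c) {
        return c == '\n' || c == '\r' || c == '\u2028' || c == '\u2029';
    }

    /** True if the comment text contains a line terminator and so counts as a LineTerminator. */
    static boolean actsAsLineTerminator(String commentImage) {
        for (int i = 0; i < commentImage.length(); i++) {
            if (isLineTerminator(commentImage.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(actsAsLineTerminator("/* same line */"));
        System.out.println(actsAsLineTerminator("/* first\n second */"));
    }
}
```

    Running it prints false for the comment that stays on one line and true for the comment that spans two lines.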

    Corrected solution

    The core idea, as proposed by Theodore Norvell, is to use semantic lookahead. However, I decided to implement a safer check:

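    // StringUtils is org.apache.commons.lang3.StringUtils (Apache Commons Lang 3).
    /**
     * Returns true if a line terminator, or a multi-line comment containing one,
     * appears among the special tokens directly before the given token.
     */
    public static boolean precededByLineTerminator(Token token) {
        // Walk backwards through the special tokens attached to this token.
        for (Token specialToken = token.specialToken; specialToken != null; specialToken = specialToken.specialToken) {
            if (specialToken.kind == EcmaScriptParserConstants.LINE_TERMINATOR) {
                return true;
            } else if (specialToken.kind == EcmaScriptParserConstants.MULTI_LINE_COMMENT) {
                // Per section 7.4, a comment containing LF, CR, LS or PS counts as a LineTerminator.
                final String image = specialToken.image;
                if (StringUtils.containsAny(image, (char)0x000A, (char)0x000D, (char)0x2028,
                        (char)0x2029)) {
                    return true;
                }
            }
        }
        return false;
    }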
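
    The loop above relies on how JavaCC links special tokens: a regular token's specialToken field points at the special token immediately before it, and each special token's own specialToken points at the one before that. This only works if the grammar declares line terminators and comments as SPECIAL_TOKEN rather than SKIP, which the use of specialToken here implies. The following is a self-contained sketch of that chain walk, under stated assumptions: DemoToken is a stand-in for the Token class JavaCC generates, the kind constants are hypothetical, and a plain-Java check replaces StringUtils.containsAny.

```java
// Stand-in for the JavaCC-generated Token class; only the fields used here.
final class DemoToken {
    final int kind;
    final String image;
    DemoToken specialToken; // special token immediately before this one, or null

    DemoToken(int kind, String image) {
        this.kind = kind;
        this.image = image;
    }
}

final class SpecialTokenChainDemo {
    // Hypothetical kinds; a real parser takes them from its generated constants interface.
    static final int LINE_TERMINATOR = 1;
    static final int MULTI_LINE_COMMENT = 2;
    static final int WHITE_SPACE = 3;
    static final int INCR = 4;

    static boolean containsLineTerminator(String s) {
        return s.chars().anyMatch(c -> c == '\n' || c == '\r' || c == 0x2028 || c == 0x2029);
    }

    /** Same logic as the check above, written against the stand-in token class. */
    static boolean precededByLineTerminator(DemoToken token) {
        for (DemoToken t = token.specialToken; t != null; t = t.specialToken) {
            if (t.kind == LINE_TERMINATOR) {
                return true;
            }
            if (t.kind == MULTI_LINE_COMMENT && containsLineTerminator(t.image)) {
                return true;
            }
        }
        return false;
    }

    /** Links the given special tokens (in source order) in front of a regular token. */
    static DemoToken regularAfter(DemoToken regular, DemoToken... specials) {
        DemoToken previous = null;
        for (DemoToken s : specials) {
            s.specialToken = previous;
            previous = s;
        }
        regular.specialToken = previous;
        return regular;
    }

    public static void main(String[] args) {
        // a /* x */ ++   : comment stays on one line
        DemoToken sameLine = regularAfter(new DemoToken(INCR, "++"),
                new DemoToken(WHITE_SPACE, " "),
                new DemoToken(MULTI_LINE_COMMENT, "/* x */"),
                new DemoToken(WHITE_SPACE, " "));
        // a /* x <newline> y */ ++   : comment spans two lines
        DemoToken viaComment = regularAfter(new DemoToken(INCR, "++"),
                new DemoToken(MULTI_LINE_COMMENT, "/* x\n y */"));
        // a <newline> ++
        DemoToken viaNewline = regularAfter(new DemoToken(INCR, "++"),
                new DemoToken(LINE_TERMINATOR, "\n"));

        System.out.println(precededByLineTerminator(sameLine));
        System.out.println(precededByLineTerminator(viaComment));
        System.out.println(precededByLineTerminator(viaNewline));
    }
}
```

    Only the first case reports false, so only there would the postfix ++ attach to the preceding expression.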
    

    And the grammar is:

    // "++" and "--" are written as string literals here; use your grammar's own token names if it defines them.
    expression = LeftHandSideExpression()
    (
        LOOKAHEAD ( "++", { !TokenUtils.precededByLineTerminator(getToken(1)) } )
        "++"
        {
            return expression.postIncr();
        }
    |   LOOKAHEAD ( "--", { !TokenUtils.precededByLineTerminator(getToken(1)) } )
        "--"
        {
            return expression.postDecr();
        }
    ) ?
    {
        return expression;
    }
    

    So the ++ or -- is consumed here if and only if it is not preceded by a line terminator, including a multi-line comment that contains one.


    Original solution

    This is not how I finally solved it.

    The core idea, as proposed by Theodore Norvell, is to use semantic lookahead. However, I decided to implement a safer check:

    public static boolean precededBySpecialTokenOfKind(Token token, int kind) {
        for (Token specialToken = token.specialToken; specialToken != null; specialToken = specialToken.specialToken) {
            if (specialToken.kind == kind) {
                return true;
            }
        }
        return false;
    }
    

    And the grammar is:

    // "++" and "--" are written as string literals here; use your grammar's own token names if it defines them.
    expression = LeftHandSideExpression()
    (
        LOOKAHEAD ( "++", { !TokenUtils.precededBySpecialTokenOfKind(getToken(1), LINE_TERMINATOR) } )
        "++"
        {
            return expression.postIncr();
        }
    |   LOOKAHEAD ( "--", { !TokenUtils.precededBySpecialTokenOfKind(getToken(1), LINE_TERMINATOR) } )
        "--"
        {
            return expression.postDecr();
        }
    ) ?
    {
        return expression;
    }
    

    So the ++ or -- is consumed here if and only if it is not preceded by a line terminator token.
