How to implement JavaScript/ECMAScript “no LineTerminator here” rule in JavaCC?

I am continuing to work on my JavaCC grammar for ECMAScript 5.1. It is going quite well; I think I've covered most of the expressions now.

I have now two questions, bot

2 Answers

傲寒

2021-01-22 03:03

I think for the "restricted productions" you can use semantic lookahead that compares token line numbers:

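void PostfixExpression() : 
{} {
     LeftHandSideExpression() 
     (
         // Accept "++" only if it starts on the line where the previous token ends.
         // endLine (not beginLine) is used so a multi-line previous token is handled.
         LOOKAHEAD( "++", {getToken(0).endLine == getToken(1).beginLine})
         "++"
     |
         LOOKAHEAD( "--", {getToken(0).endLine == getToken(1).beginLine})
         "--"
     |
         {}
     )
}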
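
The line comparison used in that lookahead can be tried outside the parser. The following is a minimal sketch, not generated JavaCC code: SameLineSketch and Pos are hypothetical names, and Pos only mirrors the beginLine and endLine fields of a JavaCC Token. It compares the line where the previous token ends with the line where the next token begins.

```java
final class SameLineSketch {

    /** Hypothetical stand-in for the line fields of a JavaCC Token. */
    static final class Pos {
        final int beginLine;
        final int endLine;

        Pos(int beginLine, int endLine) {
            this.beginLine = beginLine;
            this.endLine = endLine;
        }
    }

    /** True when the next token starts on the line where the previous token ends. */
    static boolean onSameLine(Pos prev, Pos next) {
        return prev.endLine == next.beginLine;
    }

    public static void main(String[] args) {
        Pos operand = new Pos(1, 1);
        System.out.println(onSameLine(operand, new Pos(1, 1))); // "a ++" on one line
        System.out.println(onSameLine(operand, new Pos(2, 2))); // "++" on the next line
    }
}
```

One caveat with this approach: it depends on how the character stream counts lines, which may not match the ECMAScript set of line terminators (for example U+2028 and U+2029), so treat it as an approximation.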

心在旅途

2021-01-22 03:14

Update: As Gunther pointed out, my original solution was not correct because of this paragraph in section 7.4 of the spec:

Comments behave like white space and are discarded except that, if a MultiLineComment contains a line terminator character, then the entire comment is considered to be a LineTerminator for purposes of parsing by the syntactic grammar.
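
To make the rule concrete, here is a minimal sketch of the test it implies for a comment's text: does the comment contain any of the four ECMAScript 5.1 line terminator characters (U+000A, U+000D, U+2028, U+2029)? The class and method names are hypothetical and only illustrate the check.

```java
final class LineTerminatorChars {

    /** True for the four ECMAScript 5.1 line terminator code units. */
    static boolean isLineTerminator(char c) {
        return c == 0x000A || c == 0x000D || c == 0x2028 || c == 0x2029;
    }

    /** True if the given token image (for example a comment) spans a line break. */
    static boolean containsLineTerminator(String image) {
        for (int i = 0; i < image.length(); i++) {
            if (isLineTerminator(image.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsLineTerminator("/* one line */"));
        System.out.println(containsLineTerminator("/* two\nlines */"));
    }
}
```

A comment for which containsLineTerminator returns true has to be treated like a LineTerminator by the syntactic grammar; one for which it returns false behaves like ordinary white space.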

I'm posting a correction but leaving my original solution at the end of this answer.

Corrected solution

The core idea, as proposed by Theodore Norvell, is to use semantic lookahead. However, I have decided to implement a safer check that inspects the special tokens preceding the token instead of comparing line numbers:

/**
 * Walks the chain of special tokens that precede the given token and reports
 * whether any of them acts as a line terminator.
 */
public static boolean precededByLineTerminator(Token token) {
    for (Token specialToken = token.specialToken; specialToken != null; specialToken = specialToken.specialToken) {
        if (specialToken.kind == EcmaScriptParserConstants.LINE_TERMINATOR) {
            return true;
        } else if (specialToken.kind == EcmaScriptParserConstants.MULTI_LINE_COMMENT) {
            // A multi-line comment counts as a line terminator if it contains LF, CR, LS or PS.
            final String image = specialToken.image;
            if (StringUtils.containsAny(image, (char)0x000A, (char)0x000D, (char)0x2028,
                    (char)0x2029)) {
                return true;
            }
        }
    }
    return false;
}

And the grammar is:

expression = LeftHandSideExpression()
(
    LOOKAHEAD ( <INCR>, { !TokenUtils.precededByLineTerminator(getToken(1))} )
    <INCR>
    {
        return expression.postIncr();
    }
|   LOOKAHEAD ( <DECR>, { !TokenUtils.precededByLineTerminator(getToken(1))} )
    <DECR>
    {
        return expression.postDecr();
    }
) ?
{
    return expression;
}

So ++ or -- is matched as a postfix operator here iff it is not preceded by a line terminator, including one hidden inside a multi-line comment.
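
The behaviour of this check can be explored without a generated parser. The sketch below is an illustration under assumptions: Tok is a hypothetical stand-in for the JavaCC Token class (only kind, image and the specialToken link are modelled), and the kind constants are invented for the example rather than taken from EcmaScriptParserConstants.

```java
final class RestrictedProductionSketch {
    // Hypothetical token kinds; a generated parser defines its own constants.
    static final int WHITE_SPACE = 1;
    static final int LINE_TERMINATOR = 2;
    static final int MULTI_LINE_COMMENT = 3;
    static final int INCR = 4;

    /** Minimal stand-in for a JavaCC Token and its chain of preceding special tokens. */
    static final class Tok {
        final int kind;
        final String image;
        final Tok specialToken; // the special token immediately before this one, or null

        Tok(int kind, String image, Tok specialToken) {
            this.kind = kind;
            this.image = image;
            this.specialToken = specialToken;
        }
    }

    static boolean hasLineTerminator(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == 0x000A || c == 0x000D || c == 0x2028 || c == 0x2029) {
                return true;
            }
        }
        return false;
    }

    /** Same walk as above: a LINE_TERMINATOR or a comment spanning lines counts. */
    static boolean precededByLineTerminator(Tok token) {
        for (Tok s = token.specialToken; s != null; s = s.specialToken) {
            if (s.kind == LINE_TERMINATOR) {
                return true;
            }
            if (s.kind == MULTI_LINE_COMMENT && hasLineTerminator(s.image)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Models: a /* one line */++
        Tok sameLine = new Tok(INCR, "++",
                new Tok(MULTI_LINE_COMMENT, "/* one line */",
                        new Tok(WHITE_SPACE, " ", null)));
        // Models: a /* two
        //            lines */++
        Tok split = new Tok(INCR, "++",
                new Tok(MULTI_LINE_COMMENT, "/* two\nlines */",
                        new Tok(WHITE_SPACE, " ", null)));
        System.out.println(precededByLineTerminator(sameLine));
        System.out.println(precededByLineTerminator(split));
    }
}
```

In the first case the operator may be taken as postfix; in the second the lookahead fails, so the postfix alternative is skipped and the operator is left for the following statement.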

Original solution

This is not how I finally solved it; it is kept here for reference.

The core idea, as proposed by Theodore Norvell, is to use semantic lookahead. However, I decided to implement a safer check:

public static boolean precededBySpecialTokenOfKind(Token token, int kind) {
    for (Token specialToken = token.specialToken; specialToken != null; specialToken = specialToken.specialToken) {
        if (specialToken.kind == kind) {
            return true;
        }
    }
    return false;
}

And the grammar is:

expression = LeftHandSideExpression()
(
    LOOKAHEAD ( <INCR>, { !TokenUtils.precededBySpecialTokenOfKind(getToken(1), LINE_TERMINATOR)} )
    <INCR>
    {
        return expression.postIncr();
    }
|   LOOKAHEAD ( <DECR>, { !TokenUtils.precededBySpecialTokenOfKind(getToken(1), LINE_TERMINATOR)} )
    <DECR>
    {
        return expression.postDecr();
    }
) ?
{
    return expression;
}

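So ++ or -- is matched here iff it is not preceded by a LINE_TERMINATOR special token. This version ignores multi-line comments that contain a line terminator, which is why it is not correct.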
