Is D's grammar really context-free?

萌比男神i 2021-01-30 03:18

I posted this on the D newsgroup some months ago, but for some reason the answer never really convinced me, so I thought I'd ask it here.


The grammar of D is

7 answers
  •  [愿得一人]
    2021-01-30 03:28

    The property of being context-free is a very formal concept with a precise definition. Note that it applies to grammars: a language is said to be context-free if at least one context-free grammar generates it. Other grammars, possibly not context-free, may generate the same language.

    Basically what it means is that the definition of a language element cannot change according to which elements surround it. By language elements I mean concepts like expressions and identifiers and not specific instances of these concepts inside programs, like a + b or count.

    Let's try and build a concrete example. Consider this simple COBOL statement:

       01 my-field PICTURE 9.9 VALUE 9.9.
    

    Here I'm defining a field, i.e. a variable, which is dimensioned to hold one integral digit, the decimal point, and one decimal digit, with initial value 9.9. A very incomplete grammar for this could be:

    field-declaration ::= level-number identifier 'PICTURE' expression 'VALUE' expression '.'
    expression ::= digit+ ( '.' digit+ )
    

    Unfortunately the valid expressions that can follow PICTURE are not the same valid expressions that can follow VALUE. I could rewrite the second production in my grammar as follows:

    'PICTURE' expression ::= digit+ ( '.' digit+ ) | 'A'+ | 'X'+
    'VALUE' expression ::= digit+ ( '.' digit+ )
    

    This would make my grammar context-sensitive, because expression would be a different thing depending on whether it was found after 'PICTURE' or after 'VALUE'. However, as has been pointed out, this doesn't say anything about the underlying language. A better alternative would be:

    field-declaration ::= level-number identifier 'PICTURE' format 'VALUE' expression '.'
    format ::= digit+ ( '.' digit+ ) | 'A'+ | 'X'+
    expression ::= digit+ ( '.' digit+ )
    

    which is context-free.
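
    The difference between the two versions is visible in the shape of the productions alone: a grammar is context-free when every left-hand side is a single nonterminal. The following is a minimal sketch of that check in Python. The representation (tuples for left-hand sides, quoted strings for terminals) and the names is_terminal and is_context_free are assumptions made for illustration, not part of any COBOL or D tooling.

```python
# Each production is (left_hand_side, right_hand_side).
# The left-hand side is a tuple of symbols; the right-hand side is kept as
# plain text because only the left-hand side matters for this check.
# Assumption: symbols written in single quotes are terminals.

def is_terminal(symbol):
    return symbol.startswith("'")

def is_context_free(productions):
    """Return True if every left-hand side is exactly one nonterminal."""
    return all(
        len(lhs) == 1 and not is_terminal(lhs[0])
        for lhs, _rhs in productions
    )

# First attempt: 'expression' means different things after PICTURE and VALUE.
contextual = [
    (("field-declaration",),
     "level-number identifier 'PICTURE' expression 'VALUE' expression '.'"),
    (("'PICTURE'", "expression"), "digit+ ( '.' digit+ ) | 'A'+ | 'X'+"),
    (("'VALUE'", "expression"), "digit+ ( '.' digit+ )"),
]

# Second attempt: a separate nonterminal 'format' removes the dependency.
context_free = [
    (("field-declaration",),
     "level-number identifier 'PICTURE' format 'VALUE' expression '.'"),
    (("format",), "digit+ ( '.' digit+ ) | 'A'+ | 'X'+"),
    (("expression",), "digit+ ( '.' digit+ )"),
]

print(is_context_free(contextual))    # False
print(is_context_free(context_free))  # True
```

    Both grammars describe the same declarations; only the way the rules are written differs, which is why the check says something about the grammar and nothing about the language.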

    As you can see this is very different from your understanding. Consider:

    a = b + c;
    

    There is very little you can say about this statement without looking up the declarations of a, b and c, in any of the languages for which this is a valid statement. However, this by itself doesn't imply that any of those languages is not context-free. Probably what is confusing you is the fact that context freedom is different from ambiguity. This is a simplified version of your C++ example:

    a < b > (c)
    

    This is ambiguous in that by looking at it alone you cannot tell whether this is a function template call or a boolean expression. The previous example, on the other hand, is not ambiguous; from the point of view of grammars it can only be interpreted as:

    identifier assignment identifier binary-operator identifier semi-colon
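
    To make this concrete, here is a small sketch of a tokenizer in Python that maps a statement to the sequence of token classes above. It never consults any declaration. The token names follow the line above; the regular expressions and the function name token_classes are assumptions chosen for this example and do not reflect the lexer of any particular language.

```python
import re

# Token classes and the patterns that recognise them (illustrative only).
TOKEN_SPEC = [
    ("identifier", re.compile(r"[A-Za-z_]\w*")),
    ("assignment", re.compile(r"=")),
    ("binary-operator", re.compile(r"[+\-*/]")),
    ("semi-colon", re.compile(r";")),
]

def token_classes(source):
    """Return the list of token class names for source, skipping whitespace."""
    classes = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        for name, pattern in TOKEN_SPEC:
            match = pattern.match(source, pos)
            if match:
                classes.append(name)
                pos = match.end()
                break
        else:
            # No pattern matched at this position.
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
    return classes

print(" ".join(token_classes("a = b + c;")))
# identifier assignment identifier binary-operator identifier semi-colon
```

    Whatever a, b and c turn out to be, the grammatical structure of the statement is the same; their declarations only matter later, during semantic analysis.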
    

    In some cases you can resolve ambiguities by introducing context sensitivity at the grammar level. I don't think this is the case with the ambiguous example above: in this case you cannot eliminate the ambiguity without knowing whether a is a template or not. Note that when such information is not available, for instance when it depends on a specific template specialization, the language provides ways to resolve ambiguities: that is why you sometimes have to use typename to refer to certain types within templates or to use template when you call member function templates.
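
    As a rough illustration of why that extra knowledge is needed, the sketch below (in Python) classifies the token sequence a < b > (c) in two different ways depending on a set of known template names. The function interpret, the template_names parameter and the tuple shapes it returns are hypothetical, invented for this example; a real C++ front end is far more involved.

```python
def interpret(tokens, template_names):
    """Classify the token sequence  name < name > ( name )  using a symbol table.

    template_names is the set of identifiers known to be templates
    (a stand-in for the compiler's symbol table).
    """
    if len(tokens) != 7:
        raise ValueError("expected exactly seven tokens: a < b > ( c )")
    name, lt, arg, gt, lparen, operand, rparen = tokens
    if (lt, gt, lparen, rparen) != ("<", ">", "(", ")"):
        raise ValueError("expected the shape: a < b > ( c )")
    if name in template_names:
        # a<b> names a template instance, which is then called with c.
        return ("call", ("template-instance", name, arg), operand)
    # Otherwise the operators are comparisons, grouped left to right.
    return (">", ("<", name, arg), operand)

tokens = ["a", "<", "b", ">", "(", "c", ")"]
print(interpret(tokens, template_names={"a"}))
# ('call', ('template-instance', 'a', 'b'), 'c')
print(interpret(tokens, template_names=set()))
# ('>', ('<', 'a', 'b'), 'c')
```

    The same tokens yield two different trees, and the only thing that selects between them is information that lives outside the token sequence.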
