When I try to use print
without parentheses on a simple name in Python 3.4 I get:
>>> print max
Traceback (most recent call last):
...
Looking at the source code for exceptions.c, right above _set_legacy_print_statement_msg
there's this nice block comment:
/* To help with migration from Python 2, SyntaxError.__init__ applies some
* heuristics to try to report a more meaningful exception when print and
* exec are used like statements.
*
* The heuristics are currently expected to detect the following cases:
* - top level statement
* - statement in a nested suite
* - trailing section of a one line complex statement
*
* They're currently known not to trigger:
* - after a semi-colon
*
* The error message can be a bit odd in cases where the "arguments" are
* completely illegal syntactically, but that isn't worth the hassle of
* fixing.
*
* We also can't do anything about cases that are legal Python 3 syntax
* but mean something entirely different from what they did in Python 2
* (omitting the arguments entirely, printing items preceded by a unary plus
* or minus, using the stream redirection syntax).
*/
So there's some interesting info. In addition, in the SyntaxError_init
method in the same file, we can see
/*
* Issue #21669: Custom error for 'print' & 'exec' as statements
*
* Only applies to SyntaxError instances, not to subclasses such
* as TabError or IndentationError (see issue #31161)
*/
if ((PyObject*)Py_TYPE(self) == PyExc_SyntaxError &&
self->text && PyUnicode_Check(self->text) &&
_report_missing_parentheses(self) < 0) {
return -1;
}
Note also that the above references issue #21669 on the python bugtracker with some discussion between the author and Guido about how to go about this. So we follow the rabbit (that is, _report_missing_parentheses
) which is at the very bottom of the file, and see...
legacy_check_result = _check_for_legacy_statements(self, 0);
However, there are some cases where this is bypassed and the normal SyntaxError
message is printed, see MSeifert's answer for more about that. If we go one function up to _check_for_legacy_statements
we finally see the actual check for legacy print statements.
/* Check for legacy print statements */
if (print_prefix == NULL) {
print_prefix = PyUnicode_InternFromString("print ");
if (print_prefix == NULL) {
return -1;
}
}
if (PyUnicode_Tailmatch(self->text, print_prefix,
start, text_len, -1)) {
return _set_legacy_print_statement_msg(self, start);
}
So, to answer the question: "Why isn't Python able to detect the problem earlier?", I would say the problem with parentheses isn't what is detected; it is actually parsed after the syntax error. It's a syntax error the whole time, but the actual minor piece about parentheses is caught afterwards just to give an additional hint.