Why does Python 3 allow “00” as a literal for 0 but not allow “01” as a literal for 1?

情书的邮戳 2020-12-08 09:06

Why does Python 3 allow "00" as a literal for 0 but not allow "01" as a literal for 1? Is there a good reason? This inconsistency baffles me. (And we're talking about P

3 Answers
  • 2020-12-08 09:21

    Per https://docs.python.org/3/reference/lexical_analysis.html#integer-literals:

    Integer literals are described by the following lexical definitions:

    integer        ::=  decimalinteger | octinteger | hexinteger | bininteger
    decimalinteger ::=  nonzerodigit digit* | "0"+
    nonzerodigit   ::=  "1"..."9"
    digit          ::=  "0"..."9"
    octinteger     ::=  "0" ("o" | "O") octdigit+
    hexinteger     ::=  "0" ("x" | "X") hexdigit+
    bininteger     ::=  "0" ("b" | "B") bindigit+
    octdigit       ::=  "0"..."7"
    hexdigit       ::=  digit | "a"..."f" | "A"..."F"
    bindigit       ::=  "0" | "1"
    

    There is no limit for the length of integer literals apart from what can be stored in available memory.

    Note that leading zeros in a non-zero decimal number are not allowed. This is for disambiguation with C-style octal literals, which Python used before version 3.0.

    As noted here, leading zeros in a non-zero decimal number are not allowed. "0"+ is legal as a very special case, which wasn't present in Python 2:

    integer        ::=  decimalinteger | octinteger | hexinteger | bininteger
    decimalinteger ::=  nonzerodigit digit* | "0"
    octinteger     ::=  "0" ("o" | "O") octdigit+ | "0" octdigit+
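
The two grammars can be compared side by side with a small regex sketch. This is only an illustration of the rules quoted above, not the tokenizer's actual code: the names PY3_DECIMAL, PY2_DECIMAL, PY2_OLD_OCTAL and classify are made up for this example, and it only covers plain digit runs (no 0o/0x/0b prefixes, no Python 2 "L" suffix, no underscores).

```python
import re

# Python 3 rule quoted above: nonzerodigit digit* | "0"+
PY3_DECIMAL = re.compile(r"(?:[1-9][0-9]*|0+)\Z")

# Python 2 rules quoted above: decimalinteger is nonzerodigit digit* | "0",
# and a leading "0" followed by octal digits is an (old-style) octinteger.
PY2_DECIMAL = re.compile(r"(?:[1-9][0-9]*|0)\Z")
PY2_OLD_OCTAL = re.compile(r"0[0-7]+\Z")


def classify(text):
    """Return (python2_kind, python3_kind) for a run of digits.

    Each kind is the grammar rule that matches, or None if no rule does.
    """
    if PY2_DECIMAL.match(text):
        py2 = "decimalinteger"
    elif PY2_OLD_OCTAL.match(text):
        py2 = "octinteger"
    else:
        py2 = None
    py3 = "decimalinteger" if PY3_DECIMAL.match(text) else None
    return py2, py3


for sample in ("0", "00", "01", "10", "09"):
    print(sample, classify(sample))
```

Under these rules "00" matches in both versions, but as an octinteger in Python 2 and as a decimalinteger in Python 3, while "01" matches only the Python 2 octinteger rule.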
    

    SVN commit r55866 implemented PEP 3127 in the tokenizer, which forbids the old 0<octal> numbers. However, curiously, it also adds this note:

    /* in any case, allow '0' as a literal */
    

    with a special nonzero flag that only throws a SyntaxError if the following sequence of digits contains a nonzero digit.
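
In Python terms, the logic described above might look roughly like the sketch below. It is a paraphrase for illustration, not the real C tokenizer: the function name scan_after_leading_zero and the error messages are assumptions.

```python
def scan_after_leading_zero(rest):
    """Hypothetical sketch of the check made after a leading '0'.

    `rest` is the run of decimal digits that follows the first '0'.
    Returns 0 if the literal is accepted, raises SyntaxError otherwise.
    """
    nonzero = False  # plays the role of the tokenizer's "nonzero" flag
    for ch in rest:
        if ch not in "0123456789":
            raise ValueError("expected only decimal digits")
        if ch != "0":
            nonzero = True
    if nonzero:
        # Something like 01 or 0010: an old-style octal, now rejected.
        raise SyntaxError("invalid token")
    # Only zeros followed the leading zero, so the literal is just 0.
    return 0
```

With this sketch, scan_after_leading_zero("000") returns 0, while scan_after_leading_zero("10") raises SyntaxError.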

    This is odd because PEP 3127 does not allow this case:

    This PEP proposes that the ability to specify an octal number by using a leading zero will be removed from the language in Python 3.0 (and the Python 3.0 preview mode of 2.6), and that a SyntaxError will be raised whenever a leading "0" is immediately followed by another digit.

    (emphasis mine)

    So, the fact that multiple zeros are allowed technically violates the PEP, and was basically implemented as a special case by Georg Brandl. He made the corresponding documentation change to note that "0"+ was a valid case for decimalinteger (previously that had been covered under octinteger).

    We'll probably never know exactly why Georg chose to make "0"+ valid - it may forever remain an odd corner case in Python.


    UPDATE [28 Jul 2015]: This question led to a lively discussion thread on python-ideas in which Georg chimed in:

    Steven D'Aprano wrote:

    Why was it defined that way? [...] Why would we write 0000 to get zero?

    I could tell you, but then I'd have to kill you.

    Georg

    Later on, the thread spawned this bug report aiming to get rid of this special case. Here, Georg says:

    I don't recall the reason for this deliberate change (as seen from the docs change).

    I'm unable to come up with a good reason for this change now [...]

    and thus we have it: the precise reason behind this inconsistency is lost to time.

    Finally, note that the bug report was rejected: leading zeros will continue to be accepted only on zero integers for the rest of Python 3.x.
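
The current behaviour is easy to check against whatever interpreter you are running. The helper below is just a convenience for this answer (the name accepts is made up); it uses the built-in compile() to see whether a piece of text is accepted as an expression.

```python
def accepts(literal):
    """Return True if the running interpreter accepts `literal` as an expression."""
    try:
        compile(literal, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


for text in ("0", "00", "0000", "01", "010", "0o10"):
    print(text, accepts(text))
```

On a Python 3 interpreter, the all-zero forms and 0o10 should be accepted, and 01 and 010 should be rejected.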

  • 2020-12-08 09:23

    Python 2 used a leading zero to specify octal numbers:

    >>> 010
    8
    

    To avoid this (misleading?) behaviour, Python 3 requires the explicit prefixes 0b, 0o and 0x for binary, octal and hexadecimal literals:

    >>> 0o10
    8
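
As a small aside, the same rules show up when converting strings with int(). The snippet below is a sketch for Python 3; the variable name base0_rejected is only there for the demonstration.

```python
# The three explicit prefixes all spell the same value.
assert 0b1000 == 0o10 == 0x8 == 8

# int() with the default base 10 simply ignores leading zeros in a string.
assert int("010") == 10

# With an explicit base 8 the same string is read as octal.
assert int("010", 8) == 8

# Base 0 means "interpret like a source-code literal", so "010" is rejected.
try:
    int("010", 0)
except ValueError:
    base0_rejected = True
else:
    base0_rejected = False

print("int('010', 0) rejected:", base0_rejected)
```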
    
  • 2020-12-08 09:39

    It's a special case ("0"+)

    2.4.4. Integer literals

    Integer literals are described by the following lexical definitions:
    
    integer        ::=  decimalinteger | octinteger | hexinteger | bininteger
    decimalinteger ::=  nonzerodigit digit* | "0"+
    nonzerodigit   ::=  "1"..."9"
    digit          ::=  "0"..."9"
    octinteger     ::=  "0" ("o" | "O") octdigit+
    hexinteger     ::=  "0" ("x" | "X") hexdigit+
    bininteger     ::=  "0" ("b" | "B") bindigit+
    octdigit       ::=  "0"..."7"
    hexdigit       ::=  digit | "a"..."f" | "A"..."F"
    bindigit       ::=  "0" | "1"
    

    If you look at the grammar, it's easy to see that 0 needs a special case, because nonzerodigit digit* can never match it. I'm not sure why the '+' is considered necessary there though. Time to dig through the dev mailing list...


    It is interesting to note that in Python 2, more than one 0 was parsed as an octinteger (the end result is still 0, though):

    decimalinteger ::=  nonzerodigit digit* | "0"
    octinteger     ::=  "0" ("o" | "O") octdigit+ | "0" octdigit+
    