Question
I am using the tokenize module in Python and wonder why there are two different newline tokens:
NEWLINE = 4
NL = 54
Any examples of code that would produce both tokens would be appreciated.
Answer 1:
According to the Python documentation:
tokenize.NL
Token value used to indicate a non-terminating newline. The NEWLINE token indicates the end of a logical line of Python code; NL tokens are generated when a logical line of code is continued over multiple physical lines.
More here: https://docs.python.org/2/library/tokenize.html
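A minimal sketch of how to see both tokens for yourself, assuming Python 3 (where tokenize.generate_tokens accepts a readline callable that returns str). The helper name newline_kinds and the sample source are made up for illustration:

```python
import io
import tokenize

# One logical line spread over two physical lines by the open parenthesis.
source = "total = (1 +\n         2)\n"


def newline_kinds(src):
    """Return the names of the newline-type tokens in src, in order."""
    readline = io.StringIO(src).readline
    return [
        tokenize.tok_name[tok.type]
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.NL, tokenize.NEWLINE)
    ]


print(newline_kinds(source))  # ['NL', 'NEWLINE']
```

The first physical line ends inside the parentheses, so its newline is an NL; the second one ends the statement, so it is a NEWLINE.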
Answer 2:
There are at least 4 possible cases of '\n' in Python code; 2 of them are represented by tokens:
1. Statement-terminating newline: tokenize.NEWLINE. This token corresponds more or less to the ; of C or Java.
2. Any newline that does not terminate a statement and does not belong to cases 3 or 4: tokenize.NL.
3. The newlines inside multiline strings. These are part of the STRING token.
4. A newline that follows a line-continuation backslash (\). Contrary to what the documentation would seem to indicate, this case does not produce any token at all.
Thus the following example:
# case 1
a = 6
b = 7

# case 2
answer = (
    a * b
)

# case 3
format = """
A multiline string
"""

# case 4
print "something that is continued" \
    "on the following line."
gives all the possible cases:
1,0-1,8: COMMENT '# case 1'
1,8-1,9: NL '\n'
2,0-2,1: NAME 'a'
2,2-2,3: OP '='
2,4-2,5: NUMBER '6'
2,5-2,6: NEWLINE '\n'
3,0-3,1: NAME 'b'
3,2-3,3: OP '='
3,4-3,5: NUMBER '7'
3,5-3,6: NEWLINE '\n'
4,0-4,1: NL '\n'
5,0-5,8: COMMENT '# case 2'
5,8-5,9: NL '\n'
6,0-6,6: NAME 'answer'
6,7-6,8: OP '='
6,9-6,10: OP '('
6,10-6,11: NL '\n'
7,4-7,5: NAME 'a'
7,6-7,7: OP '*'
7,8-7,9: NAME 'b'
7,9-7,10: NL '\n'
8,0-8,1: OP ')'
8,1-8,2: NEWLINE '\n'
9,0-9,1: NL '\n'
10,0-10,8: COMMENT '# case 3'
10,8-10,9: NL '\n'
11,0-11,6: NAME 'format'
11,7-11,8: OP '='
11,9-13,3: STRING '"""\nA multiline string\n"""'
13,3-13,4: NEWLINE '\n'
14,0-14,1: NL '\n'
15,0-15,8: COMMENT '# case 4'
15,8-15,9: NL '\n'
16,0-16,5: NAME 'print'
16,6-16,35: STRING '"something that is continued"'
17,4-17,28: STRING '"on the following line."'
17,28-17,29: NEWLINE '\n'
18,0-18,0: ENDMARKER ''
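The listing above comes from Python 2 (note the print statement). As a rough sketch, cases 3 and 4 can be checked in Python 3 as well; the helper token_names and the two sample strings are my own, not part of the tokenize module:

```python
import io
import tokenize


def token_names(src):
    """Return the token type names produced for src, in order."""
    readline = io.StringIO(src).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


# Case 4: backslash continuation. No token appears between the two strings.
continued = 'text = "a" \\\n    "b"\n'
print(token_names(continued))
# ['NAME', 'OP', 'STRING', 'STRING', 'NEWLINE', 'ENDMARKER']

# Case 3: the newlines inside the triple-quoted string belong to the STRING token.
multiline = 'text = """\nA multiline string\n"""\n'
print(token_names(multiline))
# ['NAME', 'OP', 'STRING', 'NEWLINE', 'ENDMARKER']
```

In both cases the only newline-type token is the final NEWLINE that ends the statement.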
Answer 3:
In addition to the quote from the docs:
The NEWLINE token indicates the end of a logical line of Python code; NL tokens are generated when a logical line of code is continued over multiple physical lines.
here is an example:
def a_func(a, b):
    pass
This will generate
1,0-1,3: NAME 'def'
1,4-1,10: NAME 'a_func'
1,10-1,11: OP '('
1,11-1,12: NAME 'a'
1,12-1,13: OP ','
1,14-1,15: NAME 'b'
1,15-1,16: OP ')'
1,16-1,17: OP ':'
1,17-1,18: NEWLINE '\n'
2,0-2,4: INDENT '    '
2,4-2,8: NAME 'pass'
2,8-2,9: NEWLINE '\n'
3,0-3,0: DEDENT ''
Whereas
def a_func(a,
           b):
    pass
will generate this
1,0-1,3: NAME 'def'
1,4-1,10: NAME 'a_func'
1,10-1,11: OP '('
1,11-1,12: NAME 'a'
1,12-1,13: OP ','
1,13-1,14: NL '\n'
2,11-2,12: NAME 'b'
2,12-2,13: OP ')'
2,13-2,14: OP ':'
2,14-2,15: NEWLINE '\n'
3,0-3,4: INDENT '    '
3,4-3,8: NAME 'pass'
3,8-3,9: NEWLINE '\n'
4,0-4,0: DEDENT ''
4,0-4,0: ENDMARKER ''
Note the 1,13-1,14: NL '\n' after a,
Basically the difference between NEWLINE and NL is that NL is generated after a line that is not 'complete':
def a_func(a, b):
results in NEWLINE because the entire logical line is on 1 physical line
def another_func(a,
                 b)
results in NL, because the code for that 1 logical line is spread over 2 physical lines
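One practical consequence: counting NEWLINE tokens gives the number of logical lines, regardless of how many physical lines they occupy. A small sketch, assuming Python 3; count_logical_lines is a hypothetical helper, and the sample adds the colon and a body so that it is a complete function definition:

```python
import io
import tokenize


def count_logical_lines(src):
    """Count logical lines in src by counting NEWLINE tokens."""
    readline = io.StringIO(src).readline
    return sum(
        1
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.NEWLINE
    )


src = "def another_func(a,\n                 b):\n    pass\n"

print(len(src.splitlines()))     # 3 physical lines
print(count_logical_lines(src))  # 2 logical lines: the def header and pass
```

The newline after a, is an NL, so it does not add to the count.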
Source: https://stackoverflow.com/questions/24519046/python-2-newline-tokens-in-tokenize-module