Question
I want code that can analyze a function call like this:
whatever(foo, baz(), 'puppet', 24+2, meow=3, *meowargs, **meowargs)
And return the positions of each and every argument, in this case foo, baz(), 'puppet', 24+2, meow=3, *meowargs, **meowargs.
I tried using the _ast module, and it seems to be just the thing for the job, but unfortunately there were problems. For example, in an argument like baz(), which is a function call itself, I couldn't find a simple way to get its length. (And even if I found one, I don't want a bunch of special cases for every different kind of argument.)
I also looked at the tokenize module but couldn't see how to use it to get the arguments.
Any idea how to solve this?
Answer 1:
This code uses a combination of ast (to find the initial argument offsets) and regular expressions (to identify boundaries of the arguments):
# Python 2 code: ast.Call.starargs and ast.Call.kwargs were removed in Python 3.5.
import ast
import re

def collect_offsets(call_string):
    def _abs_offset(lineno, col_offset):
        # convert a (line, column) pair into an offset into call_string
        current_lineno = 0
        total = 0
        # keep line endings so that newline characters are counted too
        for line in call_string.splitlines(True):
            current_lineno += 1
            if current_lineno == lineno:
                return col_offset + total
            total += len(line)

    # parse call_string with ast
    call = ast.parse(call_string).body[0].value
    # collect offsets provided by ast
    offsets = []
    for arg in call.args:
        a = arg
        # descend to the leftmost operand of a binary expression
        while isinstance(a, ast.BinOp):
            a = a.left
        offsets.append(_abs_offset(a.lineno, a.col_offset))
    for kw in call.keywords:
        offsets.append(_abs_offset(kw.value.lineno, kw.value.col_offset))
    if call.starargs:
        offsets.append(_abs_offset(call.starargs.lineno, call.starargs.col_offset))
    if call.kwargs:
        offsets.append(_abs_offset(call.kwargs.lineno, call.kwargs.col_offset))
    offsets.append(len(call_string))
    return offsets

def argpos(call_string):
    def _find_start(prev_end, offset):
        # the argument starts after the last '(' or ',' before offset
        s = call_string[prev_end:offset]
        m = re.search(r'(\(|,)(\s*)(.*?)$', s)
        return prev_end + m.regs[3][0]

    def _find_end(start, next_offset):
        # the argument ends before the last ',' or ')' and any trailing whitespace
        s = call_string[start:next_offset]
        m = re.search(r'(\s*)$', s[:max(s.rfind(','), s.rfind(')'))])
        return start + m.start()

    offsets = collect_offsets(call_string)
    result = []
    # previous end
    end = 0
    # given offsets = [9, 14, 21, ...],
    # zip(offsets, offsets[1:]) returns [(9, 14), (14, 21), ...]
    for offset, next_offset in zip(offsets, offsets[1:]):
        #print 'I:', offset, next_offset
        start = _find_start(end, offset)
        end = _find_end(start, next_offset)
        #print 'R:', start, end
        result.append((start, end))
    return result

if __name__ == '__main__':
    try:
        while True:
            call_string = raw_input()
            positions = argpos(call_string)
            for p in positions:
                print ' ' * p[0] + '^' + ((' ' * (p[1] - p[0] - 2) + '^') if p[1] - p[0] > 1 else '')
            print positions
    # a tuple is needed here to catch both exception types
    except (EOFError, KeyboardInterrupt):
        pass
Output:
whatever(foo, baz(), 'puppet', 24+2, meow=3, *meowargs, **meowargs)
         ^ ^
              ^   ^
                     ^      ^
                               ^  ^
                                     ^    ^
                                             ^       ^
                                                        ^        ^
[(9, 12), (14, 19), (21, 29), (31, 35), (37, 43), (45, 54), (56, 66)]
f(1, len(document_text) - 1 - position)
  ^
     ^                               ^
[(2, 3), (5, 38)]
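On Python 3.9 or newer the regular expressions may not be needed, because every argument node (including keyword and starred arguments) carries its own col_offset and end_col_offset. The following is only a minimal sketch under that assumption, and it also assumes a single-line call string. The name arg_spans is hypothetical and not part of the code above.

```python
import ast

def arg_spans(call_string):
    """Return (start, end) character offsets (end exclusive) of each argument.

    Assumes Python 3.9+ and that call_string is one single-line call.
    """
    call = ast.parse(call_string, mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("expected a single function call")

    raw = call_string.encode("utf-8")

    def to_char(byte_col):
        # ast column offsets count UTF-8 bytes, so convert them to characters
        return len(raw[:byte_col].decode("utf-8"))

    # positional/starred arguments and keyword arguments live in two lists;
    # sort them together so the result follows source order
    nodes = sorted(call.args + call.keywords,
                   key=lambda n: (n.lineno, n.col_offset))
    return [(to_char(n.col_offset), to_char(n.end_col_offset)) for n in nodes]

if __name__ == "__main__":
    print(arg_spans("f(1, len(document_text) - 1 - position)"))
```

For the two example inputs above this should produce the same (start, end) pairs as the Python 2 version.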
Answer 2:
You may want to get the abstract syntax tree for a call of your function.
Here is a Python recipe to do so, based on the ast module.
Python's ast module is used to parse the code string and create an AST node. The recipe then walks the resulting ast.AST node to find the features, using a NodeVisitor subclass.
The function explain does the parsing. Here is how you analyse your function call, and what you get:
>>> explain('mymod.nestmod.func("arg1", "arg2", kw1="kword1", kw2="kword2", *args, **kws)')
[Call( args=['arg1', 'arg2'],keywords={'kw1': 'kword1', 'kw2': 'kword2'},
starargs='args', func='mymod.nestmod.func', kwargs='kws')]
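The recipe's own code is not reproduced here, so the following is only a rough sketch of the NodeVisitor idea, not the recipe itself. It assumes Python 3.9+ (for ast.unparse), returns plain dicts rather than the recipe's Call objects, and the class name CallCollector is hypothetical.

```python
import ast

class CallCollector(ast.NodeVisitor):
    """Collect a summary of every function call found in a syntax tree."""

    def __init__(self):
        self.calls = []

    def visit_Call(self, node):
        self.calls.append({
            "func": ast.unparse(node.func),
            # plain positional arguments
            "args": [ast.unparse(a) for a in node.args
                     if not isinstance(a, ast.Starred)],
            # *args style arguments are Starred nodes inside node.args
            "starargs": [ast.unparse(a.value) for a in node.args
                         if isinstance(a, ast.Starred)],
            # name=value arguments
            "keywords": {k.arg: ast.unparse(k.value) for k in node.keywords
                         if k.arg is not None},
            # **kwargs style arguments are keywords whose arg is None
            "kwargs": [ast.unparse(k.value) for k in node.keywords
                       if k.arg is None],
        })
        self.generic_visit(node)  # also descend into nested calls

def explain(source):
    collector = CallCollector()
    collector.visit(ast.parse(source))
    return collector.calls

if __name__ == "__main__":
    print(explain('mymod.nestmod.func("arg1", kw1="kword1", *args, **kws)'))
```

Note that this gives the structure of the call but not the argument positions; for positions the nodes' col_offset attributes would still have to be read.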
Answer 3:
If I understand correctly, from your example you want something like:
--> arguments("whatever(foo, baz(), 'puppet', 24+2, meow=3, *meowargs, **meowkwds)")
{
    'foo': slice(9, 12),
    'baz()': slice(14, 19),
    "'puppet'": slice(21, 29),
    '24+2': slice(31, 35),
    'meow=3': slice(37, 43),
    '*meowargs': slice(45, 54),
    '**meowkwds': slice(56, 66),
}
Note that I changed the name of your last argument, as you can't have two arguments with the same name.
If this is what you want, then you need to have the original string in question (shouldn't be a problem if you're building an IDE), and you need a string parser. A simple state machine should do the trick.
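One possible shape for such a state machine, as a sketch only: it lets the tokenize module split the string into tokens (so commas inside string literals are not a problem) and tracks bracket depth to find top-level commas. It assumes a single-line call whose callee contains no brackets of its own (so something like a[0](x) is not handled). The names argument_slices, OPENERS and CLOSERS are hypothetical.

```python
import io
import tokenize

OPENERS = ("(", "[", "{")
CLOSERS = (")", "]", "}")

def argument_slices(call_string):
    """Return one slice per argument of a single-line call string."""
    slices = []
    depth = 0            # current bracket nesting depth
    start = end = None   # span of the argument being scanned
    readline = io.StringIO(call_string).readline
    for tok in tokenize.generate_tokens(readline):
        is_op = tok.type == tokenize.OP
        if is_op and tok.string in OPENERS:
            depth += 1
            if depth == 1:
                continue              # the call's own opening parenthesis
        elif is_op and tok.string in CLOSERS:
            depth -= 1
            if depth == 0:            # the call's own closing parenthesis
                if start is not None:
                    slices.append(slice(start, end))
                break
        elif is_op and tok.string == "," and depth == 1:
            if start is not None:
                slices.append(slice(start, end))
            start = end = None        # a top-level comma ends the argument
            continue
        if depth >= 1:                # token belongs to the current argument
            if start is None:
                start = tok.start[1]
            end = tok.end[1]
    return slices

def arguments(call_string):
    # identical argument texts collapse into a single dict key
    return {call_string[s]: s for s in argument_slices(call_string)}

if __name__ == "__main__":
    print(arguments("whatever(foo, baz(), 'puppet', 24+2, meow=3)"))
```

Because the dict is keyed by argument text, a call such as f(a, a) keeps only one entry; argument_slices returns every span if that matters.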
Source: https://stackoverflow.com/questions/16635254/parsing-python-function-calls-to-get-argument-positions