I need a function that takes one of Python's operator symbols or keywords as a string, along with its operands, evaluates it, and returns the result.
You can use a crude regex on the docstrings of the operator module's functions:
import re, operator

def get_symbol(op):
    sym = re.sub(r'.*\w\s?(\S+)\s?\w.*', r'\1', getattr(operator, op).__doc__)
    if re.match(r'^\W+$', sym):
        return sym
Examples:
get_symbol('matmul')
'@'
get_symbol('add')
'+'
get_symbol('eq')
'=='
get_symbol('le')
'<='
get_symbol('mod')
'%'
get_symbol('inv')
'~'
get_symbol('ne')
'!='
Just to mention a few. You could also do:
{get_symbol(i):i for i in operator.__all__}
This gives you a dictionary keyed by symbol. You will see that some entries, such as abs, come out wrong, since no symbolic version of them exists.
Python does not map symbols to operator functions. It interprets symbols by calling special dunder methods.
For example, when you write 2 * 3, it doesn't call mul(2, 3); it calls some C code that figures out whether to use two.__mul__, three.__rmul__, or the C-type equivalents (the slots nb_multiply and sq_repeat are both equivalent to both __mul__ and __rmul__). You can call that same code from a C extension module as PyNumber_Multiply(two, three). If you look at the source to operator.mul, it's a completely separate function that calls the same PyNumber_Multiply.
So, there is no mapping from * to operator.mul for Python to expose.
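A quick way to see that dispatch in action is a small class that defines only the reflected method. This is just an illustrative sketch; the class name Scale is made up for the example and is not part of any library.

```python
import operator

class Scale:
    """Hypothetical class that only knows how to be the right-hand operand."""
    def __init__(self, factor):
        self.factor = factor

    def __rmul__(self, other):
        # Reached only after int.__mul__ returns NotImplemented for a Scale.
        return other * self.factor

# The * symbol and operator.mul are separate entry points into the same
# dispatch logic, so both end up in Scale.__rmul__.
via_symbol = 2 * Scale(3)
via_function = operator.mul(2, Scale(3))
print(via_symbol, via_function)
```

Both calls give the same result, but neither one is implemented in terms of the other.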
If you want to do this programmatically, the best I can think of is to parse the docstrings of the operator functions (or, maybe, the operator.c source). For example:
import re, operator

runary = re.compile(r'Same as (.+)a')
rbinary = re.compile(r'Same as a (.+) b')
unary_ops, binary_ops = {}, {}
funcnames = dir(operator)
for funcname in funcnames:
    # Skip private names and the r*/i* variants of functions that also exist without the prefix.
    if (not funcname.startswith('_') and
            not (funcname.startswith('r') and funcname[1:] in funcnames) and
            not (funcname.startswith('i') and funcname[1:] in funcnames)):
        func = getattr(operator, funcname)
        doc = func.__doc__ or ''  # guard against a missing docstring
        m = runary.search(doc)
        if m:
            unary_ops[m.group(1)] = func
        m = rbinary.search(doc)
        if m:
            binary_ops[m.group(1)] = func
I don't think this misses anything, but it definitely has some false positives, like "a + b, for a " as an operator that maps to operator.concat and callable( as an operator that maps to operator.isCallable. (The exact set depends on your Python version.) Feel free to tweak the regexes, blacklist such methods, etc. to taste.
However, if you really want to write a parser, you're probably better off writing a parser for your actual language than writing a parser for the docstrings to generate your language parser…
If the language you're trying to parse is a subset of Python, Python does expose the internals to help you there. See the ast module for the starting point. You might still be happier with something like pyparsing, but you should at least play with ast. For example:
import ast
sentinel = object()
def string_op(op, arg1, arg2=sentinel):
    # Unary: "op arg1"; binary: "arg1 op arg2" (infix order, so that it parses).
    s = '{} {}'.format(op, arg1) if arg2 is sentinel else '{} {} {}'.format(arg1, op, arg2)
    a = ast.parse(s).body
    return a
Print out a (or, better, ast.dump(a[0]), since a is a list of statement nodes), play with it, etc. You'll still need to map from _ast.Add to operator.add, however. But if you want to instead map to an actual Python code object… well, the code for that is available too.
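To make the missing step concrete, here is a rough sketch of the mapping from AST operator nodes to operator functions. The tables are hand-written and deliberately incomplete, and the names BINARY, UNARY, COMPARE and apply_symbol are invented for this example. It parses a template expression with placeholder names a and b, so the real operands never go through the parser.

```python
import ast
import operator

# Hand-written tables from AST node types to operator functions (not exhaustive).
BINARY = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
          ast.Div: operator.truediv, ast.FloorDiv: operator.floordiv,
          ast.Mod: operator.mod, ast.Pow: operator.pow, ast.MatMult: operator.matmul}
UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos,
         ast.Invert: operator.inv, ast.Not: operator.not_}
COMPARE = {ast.Lt: operator.lt, ast.LtE: operator.le, ast.Eq: operator.eq,
           ast.NotEq: operator.ne, ast.Gt: operator.gt, ast.GtE: operator.ge}

def apply_symbol(symbol, *operands):
    """Find the AST node for `symbol`, then call the mapped function on the operands."""
    if len(operands) == 1:
        node = ast.parse('{} a'.format(symbol), mode='eval').body
        if isinstance(node, ast.UnaryOp):
            return UNARY[type(node.op)](*operands)
    elif len(operands) == 2:
        node = ast.parse('a {} b'.format(symbol), mode='eval').body
        if isinstance(node, ast.BinOp):
            return BINARY[type(node.op)](*operands)
        if isinstance(node, ast.Compare) and len(node.ops) == 1:
            return COMPARE[type(node.ops[0])](*operands)
    raise ValueError('unsupported operator {!r} for {} operand(s)'.format(symbol, len(operands)))

print(apply_symbol('+', 2, 3))   # binary
print(apply_symbol('-', 4))      # unary
print(apply_symbol('<=', 2, 2))  # comparison
```

Anything the tables don't cover (and, or, in, shifts, and so on) raises KeyError or ValueError here; extending it is a matter of adding entries.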
If you're going to use such a map, why not map directly to functions instead of having a layer of indirection by name? For example:
symbol_func_map = {
    '<': (lambda x, y: x < y),
    '<=': (lambda x, y: x <= y),
    '==': (lambda x, y: x == y),
    #...
}
While this wouldn't be any more concise than your current implementation, it should get the correct behaviour in the majority of cases. The remaining problems are where a unary and a binary operator conflict, and those could be addressed by adding arity to the dictionary keys:
symbol_func_map = {
    ('<', 2): (lambda x, y: x < y),
    ('<=', 2): (lambda x, y: x <= y),
    ('==', 2): (lambda x, y: x == y),
    ('-', 2): (lambda x, y: x - y),
    ('-', 1): (lambda x: -x),
    #...
}
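With arity in the key, the lookup function stays small. The following is only a sketch: evaluate is a name chosen for this example, and the table is trimmed to a few entries (with a keyword operator added to show that those fit the same scheme).

```python
symbol_func_map = {
    ('<', 2): (lambda x, y: x < y),
    ('-', 2): (lambda x, y: x - y),
    ('-', 1): (lambda x: -x),
    ('not', 1): (lambda x: not x),
}

def evaluate(symbol, *operands):
    """Pick the function by (symbol, number of operands) and apply it."""
    try:
        func = symbol_func_map[(symbol, len(operands))]
    except KeyError:
        raise ValueError('no {}-operand operator {!r}'.format(len(operands), symbol)) from None
    return func(*operands)

print(evaluate('-', 5, 2))  # binary minus
print(evaluate('-', 5))     # unary minus
```

The number of operands passed decides which version of an ambiguous symbol is used, so the caller never has to say "unary" or "binary" explicitly.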
You could use eval to generate lambda functions for the operators instead of using the operator module. Eval is generally bad practice, but I think it's fine for this purpose, since the strings being evaluated are a fixed set of operator symbols rather than arbitrary input.
def make_binary_op(symbol):
    return eval('lambda x, y: x {0} y'.format(symbol))

operators = {}
for symbol in '+ - * / ^ %'.split(' '):  # extend with other binary operators as needed
    operators[symbol] = make_binary_op(symbol)
operators['*'](3, 5) # == 15
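If the symbol ever comes from outside your own code, one way to keep eval contained is to check it against a fixed whitelist first. This is a sketch under that assumption; ALLOWED and its contents are my own choice, not a complete list. The same trick also covers keyword operators such as in and and.

```python
# Whitelist of binary operators that may reach eval (illustrative, not complete).
ALLOWED = ('+', '-', '*', '/', '//', '%', '**',
           '<', '<=', '==', '!=', '>', '>=',
           'and', 'or', 'in', 'is')

def make_binary_op(symbol):
    """Build lambda x, y: x <symbol> y, but only for whitelisted symbols."""
    if symbol not in ALLOWED:
        raise ValueError('operator not allowed: {!r}'.format(symbol))
    return eval('lambda x, y: x {0} y'.format(symbol))

operators = {symbol: make_binary_op(symbol) for symbol in ALLOWED}

print(operators['*'](3, 5))
print(operators['in'](1, [1, 2]))
```

Note that and and or lose their short-circuit behaviour this way, because both arguments are evaluated before the lambda is called.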