The Python documentation states that pickle is not secure and shouldn't be used to parse untrusted user input. If you research this, almost all examples demonstrat
The name of the module (os) is part of the opcode, and pickle automatically imports the module:
# pickle.py
def find_class(self, module, name):
    # Subclasses may override this
    __import__(module)
    mod = sys.modules[module]
    klass = getattr(mod, name)
    return klass
Note the __import__(module) line.
The function is called when the GLOBAL 'os system' pickle bytecode instruction is executed.
This mechanism is necessary in order to be able to unpickle instances of classes whose modules haven't been explicitly imported into the caller's namespace.
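Because find_class is the hook through which every GLOBAL instruction resolves a name, overriding it is one way to limit what a pickle can reach. The following is a minimal sketch in Python 3 syntax (the find_class excerpt above is from Python 2), following the pattern described in the pickle documentation's section on restricting globals. SAFE_GLOBALS, RestrictedUnpickler and restricted_loads are illustrative names, and the one-entry allowlist is an assumption chosen only for the demonstration.

```python
import io
import pickle

# Hypothetical allowlist of (module, name) pairs the unpickler may resolve.
SAFE_GLOBALS = {("builtins", "len")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every GLOBAL instruction; refuse anything not allowlisted
        # before the default implementation gets a chance to import it.
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name)
        )


def restricted_loads(data):
    """Like pickle.loads(), but resolves only allowlisted globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

With this in place, restricted_loads(b"cos\nsystem\n(S'ls ~'\ntR.") raises pickle.UnpicklingError instead of running the command. An allowlist narrows what a pickle can call, but it does not make unpickling untrusted data safe in general.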
If you use pickletools.dis to disassemble the pickle, you can see how this mechanism is used:
import pickletools
# dis() prints the disassembly itself and returns None, so no print is needed.
pickletools.dis(b"cos\nsystem\n(S'ls ~'\ntR.")
Output:
    0: c    GLOBAL     'os system'
   11: (    MARK
   12: S        STRING     'ls ~'
   20: t        TUPLE      (MARK at 11)
   21: R    REDUCE
   22: .    STOP
A pickle is a program for a simple stack-based virtual machine: it records the instructions used to reconstruct the object, and unpickling executes them. In other words, the pickled instructions in your example are:
1. Push self.find_class(module_name, class_name), i.e. push os.system.
2. Push the string 'ls ~'.
3. Build a tuple from the topmost stack items.
4. Apply the callable to the argument tuple, both on the stack, i.e. os.system(*('ls ~',)).
Source
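To make the stack behaviour concrete, here is a rough sketch of an interpreter for just the six opcodes that appear in the disassembly (c, (, S, t, R and .). It is not how the pickle module is implemented. run_tiny_pickle and its allowed parameter are hypothetical names, string escapes are ignored, the code uses Python 3 syntax, and the demo swaps os.system for the harmless builtins.len so that nothing is executed.

```python
import importlib


def run_tiny_pickle(data, allowed):
    """Interpret a protocol-0 pickle that uses only: c ( S t R ."""
    stack = []
    mark = object()  # sentinel pushed by the MARK opcode
    pos = 0

    def readline():
        # Read an ASCII, newline-terminated argument and advance the cursor.
        nonlocal pos
        end = data.index(b"\n", pos)
        line = data[pos:end].decode("ascii")
        pos = end + 1
        return line

    while True:
        op = data[pos:pos + 1]
        pos += 1
        if op == b"c":  # GLOBAL: import the module, push the attribute
            module, name = readline(), readline()
            if (module, name) not in allowed:
                raise ValueError("refusing %s.%s" % (module, name))
            stack.append(getattr(importlib.import_module(module), name))
        elif op == b"(":  # MARK
            stack.append(mark)
        elif op == b"S":  # STRING: strip the surrounding quotes
            stack.append(readline()[1:-1])
        elif op == b"t":  # TUPLE: collect everything above the last mark
            idx = max(i for i, item in enumerate(stack) if item is mark)
            items = tuple(stack[idx + 1:])
            del stack[idx:]
            stack.append(items)
        elif op == b"R":  # REDUCE: call the callable with the argument tuple
            args = stack.pop()
            func = stack.pop()
            stack.append(func(*args))
        elif op == b".":  # STOP: the top of the stack is the result
            return stack.pop()
        else:
            raise ValueError("unsupported opcode %r" % op)


# Same shape as the example pickle, with builtins.len in place of os.system.
result = run_tiny_pickle(b"cbuiltins\nlen\n(S'ls ~'\ntR.", {("builtins", "len")})
```

Here result is len('ls ~'). The GLOBAL branch is the only place where a module gets imported, which is why the real find_class is the natural point to intercept.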
For altogether too much information on writing malicious Pickles that go much further than the standard os.system() example, see this presentation and its accompanying paper.
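An import statement binds the module's name only in the namespace where it runs, which is not necessarily the one you're in. Calling __import__() directly does not even do that; it loads the module and returns it without binding any name: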
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__']
>>> __import__('os')
<module 'os' from '/usr/lib64/python2.7/os.pyc'>
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__']
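The session above can be reproduced as a script. This is a small sketch in Python 3 syntax; import_without_binding is a hypothetical helper name, and colorsys is picked only as a harmless standard-library module that the script does not otherwise import.

```python
import sys


def import_without_binding(module_name):
    """Call __import__() and report where the module ends up."""
    module = __import__(module_name)      # the return value is the module object
    cached = module_name in sys.modules   # the import system caches it here
    bound = module_name in globals()      # no global name is created for it
    return module, cached, bound


module, cached, bound = import_without_binding("colorsys")
print(cached, bound)
```

The module is cached in sys.modules even though no name is bound in the caller's namespace, which is why find_class can fetch it with sys.modules[module] right after calling __import__(module).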