Is the following code snippet from a Python WSGI app safe from directory traversal? It reads a file name passed as parameter and returns the named file.
file_nam
Your code does not prevent directory traversal. You can guard against this with the os.path module.
>>> import os.path
>>> os.curdir
'.'
>>> startdir = os.path.abspath(os.curdir)
>>> startdir
'/home/jterrace'
startdir
is now an absolute path where you don't want to allow the path to go outside of. Now let's say we get a filename from the user and they give us the malicious /etc/passwd
.
>>> filename = "/etc/passwd"
>>> requested_path = os.path.relpath(filename, startdir)
>>> requested_path
'../../etc/passwd'
>>> requested_path = os.path.abspath(requested_path)
>>> requested_path
'/etc/passwd'
We have now converted their path into an absolute path relative to our starting path. Since this wasn't in the starting path, it doesn't have the prefix of our starting path.
>>> os.path.commonprefix([requested_path, startdir])
'/'
You can check for this in your code. If the commonprefix function returns a path that doesn't start with startdir
, then the path is invalid and you should not return the contents.
The above can be wrapped to a static method like so:
import os
def is_directory_traversal(file_name):
current_directory = os.path.abspath(os.curdir)
requested_path = os.path.relpath(file_name, start=current_directory)
requested_path = os.path.abspath(requested_path)
common_prefix = os.path.commonprefix([requested_path, current_directory])
return common_prefix != current_directory
There's a much simpler solution here:
relative_path = os.path.relpath(path, start=self.test_directory)
has_dir_traversal = relative_path.startswith(os.pardir)
relpath
takes care of normalising path for us. And if the relative path starts with ..
, then you don't allow it.
Use only the base name of the user inputed file:
file_name = request.path_params["file"]
file_name = os.path.basename(file_name)
file = open(os.path.join("/path", file_name), "rb")
os.path.basename
strips ../
from the path:
>>> os.path.basename('../../filename')
'filename'