Does my code prevent directory traversal?
Solution 1
Your code does not prevent directory traversal. You can guard against this with the os.path module.
>>> import os.path
>>> os.curdir
'.'
>>> startdir = os.path.abspath(os.curdir)
>>> startdir
'/home/jterrace'
startdir
is now an absolute path where you don't want to allow the path to go outside of. Now let's say we get a filename from the user and they give us the malicious /etc/passwd
.
>>> filename = "/etc/passwd"
>>> requested_path = os.path.relpath(filename, startdir)
>>> requested_path
'../../etc/passwd'
>>> requested_path = os.path.abspath(requested_path)
>>> requested_path
'/etc/passwd'
We have now converted their path into an absolute path relative to our starting path. Since this wasn't in the starting path, it doesn't have the prefix of our starting path.
>>> os.path.commonprefix([requested_path, startdir])
'/'
You can check for this in your code. If the commonprefix function returns a path that doesn't start with startdir
, then the path is invalid and you should not return the contents.
The above can be wrapped to a static method like so:
import os
def is_directory_traversal(file_name):
current_directory = os.path.abspath(os.curdir)
requested_path = os.path.relpath(file_name, start=current_directory)
requested_path = os.path.abspath(requested_path)
common_prefix = os.path.commonprefix([requested_path, current_directory])
return common_prefix != current_directory
Solution 2
Use only the base name of the user inputed file:
file_name = request.path_params["file"]
file_name = os.path.basename(file_name)
file = open(os.path.join("/path", file_name), "rb")
os.path.basename
strips ../
from the path:
>>> os.path.basename('../../filename')
'filename'
Solution 3
There's a much simpler solution here:
relative_path = os.path.relpath(path, start=self.test_directory)
has_dir_traversal = relative_path.startswith(os.pardir)
relpath
takes care of normalising path for us. And if the relative path starts with ..
, then you don't allow it.
Related videos on Youtube
deamon
Updated on June 04, 2022Comments
-
deamon almost 2 years
Is the following code snippet from a Python WSGI app safe from directory traversal? It reads a file name passed as parameter and returns the named file.
file_name = request.path_params["file"] file = open(file_name, "rb") mime_type = mimetypes.guess_type(file_name)[0] start_response(status.OK, [('Content-Type', mime_type)]) return file
I mounted the app under
http://localhost:8000/file/{file}
and sent requests with the URLshttp://localhost:8000/file/../alarm.gif
andhttp://localhost:8000/file/%2e%2e%2falarm.gif
. But none of my attempts delivered the (existing) file. So is my code already safe from directory traversal?New approach
It seems like the following code prevents directory traversal:
file_name = request.path_params["file"] absolute_path = os.path.join(self.base_directory, file_name) normalized_path = os.path.normpath(absolute_path) # security check to prevent directory traversal if not normalized_path.startswith(self.base_directory): raise IOError() file = open(normalized_path, "rb") mime_type = mimetypes.guess_type(normalized_path)[0] start_response(status.OK, [('Content-Type', mime_type)]) return file
-
KatrielWhat happens if you give it an absolute path?
-
-
Graham Dumpleton almost 13 yearsJust don't rely on doing things relative to the current working directory as that can be anything in a web application. Should always base it off an absolute path as starting point, be that hard wired or calculated from __file__.
-
jterrace almost 13 years@graham-dumpleton This has nothing to do with relative or absolute. A path can always break out unless you do this type of sanity check.
-
Graham Dumpleton almost 13 yearsYou don't seem to quite get what I am referring to. You gave in your example 'startdir = os.path.abspath(os.curdir)'. That will set 'startdir' to the current working directory. In a Python web application there are no guarantees what the current working directory will be. So, it is a poor example to use because people will blindly cut and paste the code without understanding that they should be anchoring it at an absolute path which has meaning for their application rather than relying on whatever os.getcwd() is going to return when os.path.abspath(os.curdir) is called.
-
deamon almost 13 yearsThis does not prevent directory traversal since
file_name
can contain../
! But your code was helpful anyway. -
deamon almost 13 yearsSorry, your solution seems to work too. But it would be limited to a single directory level (without sub directories). I'll correct my down vote if you modify your answer slightly so that I can vote again).
-
evandrix about 9 yearsYou can test this code snippet out against a directory traversal attack fuzzer like dotdotpwn @ github.com/wireghoul/dotdotpwn, eg. './dotdotpwn.pl -m http-url -u "google.com/?q=TRAVERSAL" -O -k "root:"'
-
Clodoaldo Neto about 9 years@daemon
os.path.basename
strips../
from the path. Check the update answer. -
Flimm about 8 years
os.path.basename
does strip../
, but it also strips all path related information.os.path.basename("a/b/c")
returns"c"
-
Flimm about 8 yearsNote this depends on
startdir
ending with a forward slash, which it doesn't in this case! This code will incorrectly accept the input../jterrace-attack/foobar
. -
Clodoaldo Neto about 8 years@Flimm: Yes, that is intended.