I have a directory bar inside a directory foo, with file foo_file.txt in directory foo and file bar_file.txt in directory bar.
os.path.relpath(arg1, arg2)
gives the relative path to arg1 as seen from arg2, which it treats as a directory. To get from arg2 to arg1 in your case, you need to go up one directory (..), enter the bar directory (bar), and then reach bar_file.txt. Therefore, the relative path is
../bar/bar_file.txt
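A quick way to check the argument order is to run it on the paths from the question. This is only a minimal sketch: the printed results assume a POSIX system (separators differ on Windows), and nothing has to exist on disk for it to run.

```python
import os.path

# relpath(path, start): "how do I reach `path` if I am standing in `start`?"
# `start` is always treated as a directory, even when it names a file.
target = 'foo/bar/bar_file.txt'
start = 'foo/foo_file.txt'

print(os.path.relpath(target, start))  # ../bar/bar_file.txt
print(os.path.relpath(target, 'foo'))  # bar/bar_file.txt
```

The second call shows what happens once the start really is the directory foo.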
os.path.relpath()
assumes that its arguments are directories.
>>> os.path.join(os.path.relpath(os.path.dirname('foo/bar/bar_file.txt'),
...                              os.path.dirname('foo/foo_file.txt')),
...              os.path.basename('foo/bar/bar_file.txt'))
'bar/bar_file.txt'
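The same idea can be wrapped in a small helper. The sketch below is hypothetical (the name relpath_between_files is made up here) and assumes both arguments are file paths. It only strips the filename from the start path, since relpath already keeps the last component of its first argument intact.

```python
import os.path

def relpath_between_files(target_file, start_file):
    """Relative path to target_file from the directory containing start_file.

    Hypothetical helper: both arguments are assumed to name files.
    """
    # relpath treats its second argument as a directory, so pass the
    # directory that holds start_file rather than start_file itself.
    start_dir = os.path.dirname(start_file) or os.curdir
    return os.path.relpath(target_file, start_dir)

print(relpath_between_files('foo/bar/bar_file.txt', 'foo/foo_file.txt'))
# bar/bar_file.txt  (on a POSIX system)
```

This does no filesystem access, so it is up to the caller to know that both paths are files.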
relpath
has unexpected behavior. It treats every element of a path as though it were a directory. So, in the path:
/path/to/a/file.txt
file.txt
is treated like a directory as well.
This means that when you run relpath
on two file paths, the result contains one '..' too many. For example:
>>> from os.path import relpath
>>> relpath('/path/to/dest/file.txt', '/path/to/origin/file.txt')
'../../dest/file.txt'
This is not the path you want. The relative path from the directory origin to dest/file.txt is '../dest/file.txt'.
This gets especially frustrating if you're trying to create symlinks and they end up being malformed.
To fix the problem, we must first find out whether each path points to a file. If neither does, we can do the comparison as usual. Otherwise, we need to remove the filename from the end, do the comparison with only directories, and then add the filename back to the end.
Note that this only works if the files actually exist on your system, because Python must access the filesystem to find the node types.
import os

def realrelpath(origin, dest):
    '''Get the relative path from origin to dest, accounting for file paths'''
    # get the absolute paths so that strings can be compared
    origin = os.path.abspath(origin)
    dest = os.path.abspath(dest)
    # find out if the origin and destination are file paths
    origin_isfile = os.path.isfile(origin)
    dest_isfile = os.path.isfile(dest)
    # relpath treats the origin as a directory, so drop its filename if it has one
    if origin_isfile:
        origin = os.path.dirname(origin)
    if dest_isfile:
        # compare directories only, then re-add the destination's filename
        filename = os.path.basename(dest)
        return os.path.join(os.path.relpath(os.path.dirname(dest), origin), filename)
    # dest is not a file, so just run relpath as usual
    return os.path.relpath(dest, origin)
To get the real relative path from directory origin to dest, run:
>>> realrelpath('/path/to/origin/file.txt', '/path/to/dest/file.txt')
'../dest/file.txt'
I flipped the argument order because in my brain it makes more sense to say, "I want to know the relative path to take from arg1 to get to arg2". The standard relpath
implementation has it backwards (probably because that's how UNIX does it).
This need to access the filesystem is the real reason that relpath
has such strange behavior. Filesystem calls are expensive, so Python leaves it up to you to know whether you're dealing with a file or with a directory, and only performs string operations on the paths you provide.
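To see that only string operations are involved, relpath can be run on paths that do not exist at all. In this small sketch the paths are made up and assumed to be absent from your machine, and the printed results assume a POSIX system.

```python
import os.path

# A made-up path that is assumed not to exist on this machine.
missing = '/no/such/dir/file.txt'

print(os.path.exists(missing))                      # False, assuming it is absent
print(os.path.relpath(missing, '/no/such/other'))   # ../dir/file.txt
```

relpath answers without complaint because it never looks at the disk.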
Note: There is probably a way to make the realrelpath
function a bit more efficient. For example, I'm not sure if the abspath
calls are necessary, or if they could be bundled with the os.path.isfile
checks with a syscall that returns more information. I welcome improvements.
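One possible simplification, offered only as a sketch: relpath already calls abspath on both of its arguments, and isfile accepts relative paths, so the explicit abspath calls can be dropped. Only the origin needs its filename removed, because relpath keeps the last component of the destination as-is. The name realrelpath_short is hypothetical, and the result can differ cosmetically from the version above (for example 'file.txt' instead of './file.txt' when both files share a directory).

```python
import os

def realrelpath_short(origin, dest):
    """Relative path from origin to dest, where origin may be a file.

    Hypothetical shorter variant; still needs origin to exist on disk
    so that isfile() can tell a file from a directory.
    """
    if os.path.isfile(origin):
        # relpath treats its start as a directory, so use the file's parent.
        origin = os.path.dirname(origin) or os.curdir
    # relpath makes both arguments absolute internally before comparing them.
    return os.path.relpath(dest, origin)
```

This makes one filesystem check instead of two. Called with two existing files such as /path/to/origin/file.txt and /path/to/dest/file.txt, it should give '../dest/file.txt' on a POSIX system.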