How to resolve symbolic links in a shell script

死守一世寂寞 2020-11-28 00:48

Given an absolute or relative path (in a Unix-like system), I would like to determine the full path of the target after resolving any intermediate symlinks. Bonus points for

19 Answers
  • 2020-11-28 01:22

    Note: I believe this to be a solid, portable, ready-made solution, which is invariably lengthy for that very reason.

    Below is a fully POSIX-compliant script / function that is therefore cross-platform (works on macOS too, whose readlink still doesn't support -f as of 10.12 (Sierra)) - it uses only POSIX shell language features and only POSIX-compliant utility calls.

    It is a portable implementation of GNU's readlink -e (the stricter version of readlink -f).

    You can run the script with sh or source the function in bash, ksh, and zsh:

    For instance, inside a script you can use it as follows to get the running script's true directory of origin, with symlinks resolved:

    trueScriptDir=$(dirname -- "$(rreadlink "$0")")
    

    rreadlink script / function definition:

    The code was adapted with gratitude from this answer.
    I've also created a bash-based stand-alone utility version here, which you can install with
    npm install rreadlink -g, if you have Node.js installed.

    #!/bin/sh
    
    # SYNOPSIS
    #   rreadlink <fileOrDirPath>
    # DESCRIPTION
    #   Resolves <fileOrDirPath> to its ultimate target, if it is a symlink, and
    #   prints its canonical path. If it is not a symlink, its own canonical path
    #   is printed.
    #   A broken symlink causes an error that reports the non-existent target.
    # LIMITATIONS
    #   - Won't work with filenames with embedded newlines or filenames containing 
    #     the string ' -> '.
    # COMPATIBILITY
    #   This is a fully POSIX-compliant implementation of what GNU readlink's
    #    -e option does.
    # EXAMPLE
    #   In a shell script, use the following to get that script's true directory of origin:
    #     trueScriptDir=$(dirname -- "$(rreadlink "$0")")
    rreadlink() ( # Execute the function in a *subshell* to localize variables and the effect of `cd`.
    
      target=$1 fname= targetDir= CDPATH=
    
      # Try to make the execution environment as predictable as possible:
      # All commands below are invoked via `command`, so we must make sure that
      # `command` itself is not redefined as an alias or shell function.
      # (Note that command is too inconsistent across shells, so we don't use it.)
      # `command` is a *builtin* in bash, dash, ksh, zsh, and some platforms do not 
      # even have an external utility version of it (e.g, Ubuntu).
      # `command` bypasses aliases and shell functions and also finds builtins 
      # in bash, dash, and ksh. In zsh, option POSIX_BUILTINS must be turned on for
      # that to happen.
      { \unalias command; \unset -f command; } >/dev/null 2>&1
      [ -n "$ZSH_VERSION" ] && options[POSIX_BUILTINS]=on # make zsh find *builtins* with `command` too.
    
      while :; do # Resolve potential symlinks until the ultimate target is found.
          [ -L "$target" ] || [ -e "$target" ] || { command printf '%s\n' "ERROR: '$target' does not exist." >&2; return 1; }
          command cd "$(command dirname -- "$target")" # Change to target dir; necessary for correct resolution of target path.
          fname=$(command basename -- "$target") # Extract filename.
          [ "$fname" = '/' ] && fname='' # !! curiously, `basename /` returns '/'
          if [ -L "$fname" ]; then
            # Extract [next] target path, which may be defined
            # *relative* to the symlink's own directory.
            # Note: We parse `ls -l` output to find the symlink target
            #       which is the only POSIX-compliant, albeit somewhat fragile, way.
            target=$(command ls -l "$fname")
            target=${target#* -> }
            continue # Resolve [next] symlink target.
          fi
          break # Ultimate target reached.
      done
      targetDir=$(command pwd -P) # Get canonical dir. path
      # Output the ultimate target's canonical path.
      # Note that we manually resolve paths ending in /. and /.. to make sure we have a normalized path.
      if [ "$fname" = '.' ]; then
        command printf '%s\n' "${targetDir%/}"
      elif  [ "$fname" = '..' ]; then
        # Caveat: something like /var/.. will resolve to /private (assuming /var@ -> /private/var), i.e. the '..' is applied
        # AFTER canonicalization.
        command printf '%s\n' "$(command dirname -- "${targetDir}")"
      else
        command printf '%s\n' "${targetDir%/}/$fname"
      fi
    )
    
    rreadlink "$@"
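
    The function above finds a symlink's target by parsing `ls -l` output. As a minimal sketch of just that step in isolation (the scratch directory and the names `link` and `target file` are made up for illustration), the parameter expansion strips everything up to and including the first ' -> ':

```shell
# Scratch directory and link names are hypothetical, for illustration only.
tmp=$(mktemp -d)
ln -s 'target file' "$tmp/link"   # a dangling symlink is fine; ls -l still reports it

line=$(ls -l "$tmp/link")         # long listing ends in "<link path> -> <target>"
parsed=${line#* -> }              # drop the shortest prefix ending in ' -> '

printf '%s\n' "$parsed"
rm -rf "$tmp"
```

    This also shows why the limitation noted in the script header exists: a link name or path that itself contains ' -> ' would make the prefix match stop too early.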
    

    A tangent on security:

    jarno, in reference to the function ensuring that builtin command is not shadowed by an alias or shell function of the same name, asks in a comment:

    What if unalias or unset and [ are set as aliases or shell functions?

    The motivation behind rreadlink ensuring that command has its original meaning is to use it to bypass (benign) convenience aliases and functions often used to shadow standard commands in interactive shells, such as redefining ls to include favorite options.
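
    As a small illustration of that motivation, here is a sketch in which a hypothetical convenience function shadows `ls`; calling it through `command` skips the function and runs the real utility:

```shell
# Hypothetical convenience wrapper of the kind found in interactive profiles.
ls() { printf '%s\n' 'shadowed ls'; }

via_function=$(ls /)          # runs the shell function defined above
via_command=$(command ls /)   # bypasses the function and runs the real ls

printf '%s\n' "$via_function"
unset -f ls                   # remove the wrapper again
```
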

    I think it's safe to say that unless you're dealing with an untrusted, malicious environment, worrying about unalias or unset - or, for that matter, while, do, ... - being redefined is not a concern.

    There is something that the function must rely on to have its original meaning and behavior - there is no way around that.
    That POSIX-like shells allow redefinition of builtins and even language keywords is inherently a security risk (and writing paranoid code is hard in general).

    To address your concerns specifically:

    The function relies on unalias and unset having their original meaning. Having them redefined as shell functions in a manner that alters their behavior would be a problem; redefinition as an alias is not necessarily a concern, because quoting (part of) the command name (e.g., \unalias) bypasses aliases.

    However, quoting is not an option for shell keywords (while, for, if, do, ...) and while shell keywords do take precedence over shell functions, in bash and zsh aliases have the highest precedence, so to guard against shell-keyword redefinitions you must run unalias with their names (although in non-interactive bash shells (such as scripts) aliases are not expanded by default - only if shopt -s expand_aliases is explicitly called first).

    To ensure that unalias - as a builtin - has its original meaning, you must use \unset on it first, which requires that unset have its original meaning:

    unset is a shell builtin, so to ensure that it is invoked as such, you'd have to make sure that it itself is not redefined as a function. While you can bypass an alias form with quoting, you cannot bypass a shell-function form - catch 22.

    Thus, unless you can rely on unset to have its original meaning, from what I can tell, there is no guaranteed way to defend against all malicious redefinitions.

  • 2020-11-28 01:24

    Try this:

    cd "$(dirname -- "$([ -L "$0" ] && readlink -f -- "$0" || echo "$0")")"
    
  • 2020-11-28 01:25
    readlink -f "$path"
    

    Editor's note: The above works with GNU readlink and FreeBSD/PC-BSD/OpenBSD readlink, but not on OS X as of 10.11.
    GNU readlink offers additional, related options, such as -m for resolving a symlink whether or not the ultimate target exists.

    Note: since GNU coreutils 8.15 (2012-01-06), there is a realpath program available that is less obtuse and more flexible than the above. It's also compatible with the FreeBSD util of the same name, and it can generate a relative path between two files.

    realpath "$path"
    

    [Admin addition below from comment by halloleo —danorton]

    For Mac OS X (through at least 10.11.x), use readlink without the -f option:

    readlink "$path"
    

    Editor's note: This will not resolve symlinks recursively and thus won't report the ultimate target; e.g., given symlink a that points to b, which in turn points to c, this will only report b (and won't ensure that it is output as an absolute path).
    Use the following perl command on OS X to fill the gap of the missing readlink -f functionality:
    perl -MCwd -le 'print Cwd::abs_path(shift)' "$path"
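
    To make the a, b, c example concrete, here is a rough sketch (file names and the scratch directory are hypothetical) that contrasts a single readlink call with a loop that keeps following links. It assumes a readlink utility is installed, does not guard against symlink cycles, and does not canonicalize the directory part:

```shell
tmp=$(mktemp -d)
: > "$tmp/c"                 # the real file
ln -s c "$tmp/b"             # b -> c
ln -s b "$tmp/a"             # a -> b

one_level=$(readlink "$tmp/a")   # only the immediate target

path=$tmp/a
while [ -L "$path" ]; do
  next=$(readlink "$path")
  case $next in
    /*) path=$next ;;                          # absolute target: use as-is
    *)  path=$(dirname -- "$path")/$next ;;    # relative target: resolve against the link's directory
  esac
done
final=${path##*/}            # file name of the ultimate target

printf '%s %s\n' "$one_level" "$final"
rm -rf "$tmp"
```
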

  • 2020-11-28 01:26

    "pwd -P" seems to work if you just want the directory, but if for some reason you want the name of the actual executable I don't think that helps. Here's my solution:

    #!/bin/bash
    
    # get the absolute path of the executable
    SELF_PATH=$(cd -P -- "$(dirname -- "$0")" && pwd -P) && SELF_PATH=$SELF_PATH/$(basename -- "$0")
    
    # resolve symlinks
    while [[ -h $SELF_PATH ]]; do
        # 1) cd to directory of the symlink
        # 2) cd to the directory of where the symlink points
        # 3) get the pwd
        # 4) append the basename
        DIR=$(dirname -- "$SELF_PATH")
        SYM=$(readlink "$SELF_PATH")
        SELF_PATH=$(cd "$DIR" && cd "$(dirname -- "$SYM")" && pwd)/$(basename -- "$SYM")
    done
    
  • 2020-11-28 01:27
    readlink -e [filepath]
    

    seems to be exactly what you're asking for: it accepts an arbitrary path, resolves all symlinks, and returns the "real" path, and it's "standard *nix" that most systems likely already have

  • 2020-11-28 01:30

    According to the standards, pwd -P should return the path with symlinks resolved.

    C function char *getcwd(char *buf, size_t size) from unistd.h should have the same behaviour.

    getcwd pwd
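
    A quick way to see the difference, sketched with a made-up scratch directory containing a real directory and a symlink to it: after changing into the symlink, pwd -L reports the path as typed, while pwd -P reports the physical directory.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/real"
ln -s real "$tmp/link"       # link -> real

logical=$(cd "$tmp/link" && pwd -L)    # logical path, ends in /link
physical=$(cd "$tmp/link" && pwd -P)   # symlinks resolved, ends in /real

printf '%s\n%s\n' "$logical" "$physical"
rm -rf "$tmp"
```

    Note that this only covers directories; for a symlink that points to a file you still need to resolve the final path component separately.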
