POSIX sh equivalent for Bash’s printf %q

我寻月下人不归 2020-12-05 05:36

Suppose I have a #!/bin/sh script which can take a variety of positional parameters, some of which may include spaces, either/both kinds of quotes, etc. I want

4 Answers
  • 2020-12-05 05:37

    This is absolutely doable.

    The answer you see by Jesse Glick is approximately there, but it has a couple of bugs, and I have a few more alternatives for your consideration, since this is a problem I ran into more than once.

    First, and you might already know this: echo is a bad idea if the goal is portability; use printf instead. POSIX leaves echo's behavior implementation-defined when its first argument is "-n" (or when any argument contains a backslash), and in practice some implementations of echo treat -n as an option, while others print it like any other argument. So that becomes this:

    esceval()
    {
        printf %s "$1" | sed "s/'/'\"'\"'/g"
    }
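
    As a quick check of the printf-based version, here is a minimal sketch that feeds it the problematic "-n" argument and a value containing a single quote. It assumes a POSIX sh with sed on the PATH; the sample values are purely illustrative.

```shell
#!/bin/sh
# Same function as above: escape embedded single quotes as '"'"'
esceval()
{
    printf %s "$1" | sed "s/'/'\"'\"'/g"
}

# printf treats "-n" as plain data, so it comes through unchanged.
printf '%s\n' "$(esceval "-n")"      # prints: -n

# An embedded single quote becomes '"'"'
printf '%s\n' "$(esceval "it's")"    # prints: it'"'"'s
```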
    

    Alternatively, instead of escaping embedded single quotes by making them into:

    '"'"'
    

    ..instead you could turn them into:

    '\''
    

    ..a stylistic difference, I guess (I imagine the performance difference is negligible either way, though I've never tested it). The resulting sed string looks like this:

    esceval()
    {
        printf %s "$1" | sed "s/'/'\\\\''/g"
    }
    

    (It's four backslashes because the double quotes swallow two of them, leaving two, and then sed swallows one more, leaving just the one. Personally, I find this form more readable, so that's what I'll use in the rest of the examples that involve it, but the two should be equivalent.)
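
    To make the backslash counting concrete, this small sketch prints the sed program exactly as sed receives it, then applies it to a sample value. The variable name sed_program is only for illustration.

```shell
#!/bin/sh
# The shell's double quotes reduce the four backslashes to two.
sed_program="s/'/'\\\\''/g"

# What sed actually sees.
printf '%s\n' "$sed_program"                 # prints: s/'/'\\''/g

# sed reduces the remaining pair to one literal backslash in the output.
printf '%s\n' "it's" | sed "$sed_program"    # prints: it'\''s
```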

    BUT, we still have a bug: command substitution strips trailing newlines from the command's output (POSIX specifies that all of them are removed; only newlines, not other whitespace). So the above solution works unless an argument ends in one or more newlines, in which case you lose them. The fix is simple: append another character after the actual value before outputting it from your quote/esceval function. Incidentally, we needed to do that anyway, because the escaped argument has to start and end with single quotes. Honestly, I don't understand why that wasn't done to begin with. You have two alternatives:

    esceval()
    {
        printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1 s/^/'/; $ s/$/'/"
    }
    

    This will ensure the argument comes out already fully escaped, no need for adding more single quotes when building the final string. This is probably the closest thing you will get to a single, inline-able version. If you're okay with having a sed dependency, you can stop here.
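
    A round-trip check for the trailing-newline case might look like the sketch below. It assumes a POSIX sh with sed available; the variable names original, quoted and roundtrip are illustrative only.

```shell
#!/bin/sh
esceval()
{
    printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1 s/^/'/; $ s/$/'/"
}

# Build a value that ends in a newline. The trailing "x" protects the
# newline from command substitution and is removed afterwards.
original=$(printf 'it'\''s here\nx')
original=${original%x}

# Command substitution is safe here: the output now ends with the
# closing single quote, so only the newline after it is stripped.
quoted=$(esceval "$original")
eval "roundtrip=$quoted"

if [ "$roundtrip" = "$original" ]
then
    echo "round trip preserved the value"
else
    echo "round trip changed the value"
fi
```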

    If you're not okay with the sed dependency, but you're fine with assuming that your shell is actually POSIX-compliant (there are still some out there, notably the /bin/sh on Solaris 10 and below, which won't be able to do this next variant - but almost all shells you need to care about will do this just fine):

    esceval()
    {
        printf \'
        UNESCAPED=$1
        while :
        do
            case $UNESCAPED in
            *\'*)
                printf %s "${UNESCAPED%%\'*}""'\''"
                UNESCAPED=${UNESCAPED#*\'}
                ;;
            *)
                printf %s "$UNESCAPED"
                break
            esac
        done
        printf \'
    }
    

    You might notice seemingly redundant quoting here:

    printf %s "${UNESCAPED%%\'*}""'\''"
    

    ..this could be replaced with:

    printf %s "${UNESCAPED%%\'*}'\''"
    

    The only reason I do the former is that once upon a time there were Bourne shells which had bugs when substituting variables into quoted strings where the quotes around the variable didn't start and end exactly where the variable substitution did. Hence it's a paranoid portability habit of mine. In practice, you can do the latter, and it won't be a problem.

    If you don't want to clobber the variable UNESCAPED in the rest of your shell environment, then you can wrap the entire contents of that function in a subshell, like so:

    esceval()
    {
      (
        printf \'
        UNESCAPED=$1
        while :
        do
            case $UNESCAPED in
            *\'*)
                printf %s "${UNESCAPED%%\'*}""'\''"
                UNESCAPED=${UNESCAPED#*\'}
                ;;
            *)
                printf %s "$UNESCAPED"
                break
            esac
        done
        printf \'
      )
    }
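
    To see the effect of the subshell, this sketch sets UNESCAPED in the caller, invokes the function directly (not through a command substitution, which would already be a subshell of its own), and checks that the caller's value survives. The sample strings are illustrative.

```shell
#!/bin/sh
esceval()
{
  (
    printf \'
    UNESCAPED=$1
    while :
    do
        case $UNESCAPED in
        *\'*)
            printf %s "${UNESCAPED%%\'*}""'\''"
            UNESCAPED=${UNESCAPED#*\'}
            ;;
        *)
            printf %s "$UNESCAPED"
            break
        esac
    done
    printf \'
  )
}

UNESCAPED="caller's value"

# Direct call: the assignment inside the function happens in the subshell.
esceval "it's"              # prints: 'it'\''s'
echo

echo "$UNESCAPED"           # prints: caller's value
```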
    

    "But wait", you say: "What if I want to do this on MULTIPLE arguments in one command? And I want the output to still look kinda nice and legible for me as a user if I run it from the command line for whatever reason."

    Never fear, I have you covered:

    esceval()
    {
        case $# in 0) return 0; esac
        while :
        do
            printf "'"
            printf %s "$1" | sed "s/'/'\\\\''/g"
            shift
            case $# in 0) break; esac
            printf "' "
        done
        printf "'\n"
    }
    

    ..or the same thing, but with the shell-only version:

    esceval()
    {
      case $# in 0) return 0; esac
      (
        while :
        do
            printf "'"
            UNESCAPED=$1
            while :
            do
                case $UNESCAPED in
                *\'*)
                    printf %s "${UNESCAPED%%\'*}""'\''"
                    UNESCAPED=${UNESCAPED#*\'}
                    ;;
                *)
                    printf %s "$UNESCAPED"
                    break
                esac
            done
            shift
            case $# in 0) break; esac
            printf "' "
        done
        printf "'\n"
      )
    }
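
    One way the multi-argument form could be used is to save "$@" as a string and restore it later with eval. The sketch below does that with the shell-only version; the variable name saved and the sample arguments are illustrative only.

```shell
#!/bin/sh
esceval()
{
  case $# in 0) return 0; esac
  (
    while :
    do
        printf "'"
        UNESCAPED=$1
        while :
        do
            case $UNESCAPED in
            *\'*)
                printf %s "${UNESCAPED%%\'*}""'\''"
                UNESCAPED=${UNESCAPED#*\'}
                ;;
            *)
                printf %s "$UNESCAPED"
                break
            esac
        done
        shift
        case $# in 0) break; esac
        printf "' "
    done
    printf "'\n"
  )
}

set -- "a b" "it's" 'say "hi"'
saved=$(esceval "$@")
printf '%s\n' "$saved"      # prints: 'a b' 'it'\''s' 'say "hi"'

# Throw the arguments away, then rebuild them from the saved string.
set --
eval "set -- $saved"
printf '<%s>\n' "$@"        # prints <a b>, <it's> and <say "hi"> on separate lines
```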
    

    In those last four, you could collapse some of the outer printf statements and roll their single quotes up into another printf - I kept them separate because I feel it makes the logic more clear when you can see the starting and ending single-quotes on separate print statements.

    P.S. There's also this monstrosity I made, which is a polyfill which will select between the previous two versions depending on if your shell seems to be capable of supporting the necessary variable substitution syntax (it looks awful though, because the shell-only version has to be inside an eval-ed string to keep the incompatible shells from barfing when they see it): https://github.com/mentalisttraceur/esceval/blob/master/sh/esceval.sh

  • 2020-12-05 05:37

    I think this is POSIX. It works by clearing $@ after expanding it for the for loop, but only once so that we can iteratively build it back up (in reverse) using set.

    flag=0
    for i in "$@"; do
        [ "$flag" -eq 0 ] && shift $#
        set -- "$i" "$@"
        flag=1
    done
    
    echo "$@"   # To see that "$@" has indeed been reversed
    ls "$@"
    

    I realize reversing the arguments was just an example, but you may be able to use this trick of set -- "$arg" "$@" or set -- "$@" "$arg" in other situations.
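
    As one such situation, the same trick can filter arguments instead of reversing them. This is only a sketch: print_nonempty is a hypothetical helper that keeps the non-empty arguments, and it relies on the for loop having expanded "$@" before the body modifies it.

```shell
#!/bin/sh
# Hypothetical helper: print only the non-empty arguments, one per line.
print_nonempty()
{
    first=1
    for arg in "$@"
    do
        # Clear "$@" once, on the first iteration. The loop's word list
        # was expanded beforehand, so iteration is unaffected.
        if [ "$first" -eq 1 ]
        then
            set --
            first=0
        fi
        # Append only the arguments we want to keep.
        if [ -n "$arg" ]
        then
            set -- "$@" "$arg"
        fi
    done
    case $# in 0) return 0; esac
    printf '%s\n' "$@"
}

print_nonempty "a b" "" "c"     # prints "a b" and "c" on separate lines
```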

    And yes, I realize I may have just reimplemented (poorly) ormaaj's Push.

  • 2020-12-05 05:46

    The following seems to work with everything I have thrown at it so far, including spaces, both kinds of quotes and a variety of other metacharacters, and embedded newlines:

    #!/bin/sh
    quote() {
        echo "$1" | sed "s/'/'\"'\"'/g"
    }
    args=
    for arg in "$@"
    do
        argq="'"`quote "$arg"`"'"
        args="$argq $args"
    done
    eval "ls $args"
    
  • 2020-12-05 05:48

    Push. See the readme for examples.
