Reading quoted/escaped arguments correctly from a string

难免孤独 2020-11-22 03:48

I'm encountering an issue passing an argument to a command in a Bash script.

poc.sh:

#!/bin/bash

ARGS='"hi there" test'
./swap ${ARGS}
4 Answers
  • 2020-11-22 04:12

    This might not be the most robust approach, but it is simple, and seems to work for your case:

    ## demonstration matching the question
    $ ( ARGS='"hi there" test' ; ./swap ${ARGS} )
    there" "hi
    
    ## simple solution, using 'xargs'
    $ ( ARGS='"hi there" test' ; echo ${ARGS} |xargs ./swap )
    test hi there
    
  • 2020-11-22 04:21

    Embedded quotes do not protect whitespace; they are treated literally. Use an array in bash:

    args=( "hi there" test)
    ./swap "${args[@]}"
    

    In POSIX shell, you are stuck using eval (which is why most shells support arrays).

    args='"hi there" test'
    eval "./swap $args"
    

    As usual, be very sure you know the contents of $args and understand how the resulting string will be parsed before using eval.
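
    The difference between the three forms can be seen in a small sketch. The show_args function below is a hypothetical stand-in for ./swap (it is not part of the question); it only prints each argument it receives in brackets, so the word boundaries become visible:

```shell
#!/usr/bin/env bash
# show_args is a hypothetical helper: it prints each argument it receives in brackets.
show_args() {
  local arg
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

args_string='"hi there" test'

# Unquoted expansion: split on whitespace only; the quote characters stay in the data.
unquoted_result=$(show_args $args_string)

# Array: each element is passed through as exactly one argument.
args_array=( "hi there" test )
array_result=$(show_args "${args_array[@]}")

# eval: the string is re-parsed as shell code, so the quotes are honored
# (and any command substitution inside it would be executed as well).
eval_result=$(eval "show_args $args_string")

printf '%s\n' "unquoted:" "$unquoted_result" "array:" "$array_result" "eval:" "$eval_result"
```

    The unquoted form yields three arguments with the quote characters still attached, while the array and eval forms both yield two.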

  • 2020-11-22 04:24

    Ugly Idea Alert: Pure Bash Function

    Here's a quoted-string parser written in pure bash (what terrible fun)!

    Caveat: just like the xargs example above, this errors in the case of an escaped quote. This could be fixed... but much better to do in an actual programming language.

    Example Usage

    MY_ARGS="foo 'bar baz' qux * "'$(dangerous)'" sudo ls -lah"
    
    # Create array from multi-line string
    IFS=$'\r\n' GLOBIGNORE='*' args=($(parseargs "$MY_ARGS"))
    
    # Show each of the arguments array
    for arg in "${args[@]}"; do
        echo "$arg"
    done
    

    Example Output

    foo
    bar baz
    qux
    *
    

    Parse Argument Function

    This literally goes character-by-character and either adds to the current string or the current array.

    set -u
    set -e
    
    # ParseArgs will parse a string that contains quoted strings the same as bash does
    # (same as most other *nix shells do). This is secure in the sense that it doesn't do any
    # executing or interpreting. However, it also doesn't do any escaping, so you shouldn't pass
    # these strings to shells without escaping them.
    parseargs() {
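        # Use the empty string as the "not inside a quote" sentinel: a single
        # character can never equal it, so a literal "-" (as in -lah) is kept.
        notquote=""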
        str=$1
        declare -a args=()
        s=""
    
        # Strip leading space, then trailing space, then end with space.
        str="${str## }"
        str="${str%% }"
        str+=" "
    
        last_quote="${notquote}"
        is_space=""
        n=$(( ${#str} - 1 ))
    
        for ((i=0;i<=$n;i+=1)); do
            c="${str:$i:1}"
    
            # If we're ending a quote, break out and skip this character
            if [ "$c" == "$last_quote" ]; then
                last_quote=$notquote
                continue
            fi
    
            # If we're in a quote, count this character
            if [ "$last_quote" != "$notquote" ]; then
                s+=$c
                continue
            fi
    
            # If we encounter a quote, enter it and skip this character
            if [ "$c" == "'" ] || [ "$c" == '"' ]; then
                is_space=""
                last_quote=$c
                continue
            fi
    
            # If it's a space, store the string
            re="[[:space:]]+" # must be used as a var, not a literal
            if [[ $c =~ $re ]]; then
                if [ "0" == "$i" ] || [ -n "$is_space" ]; then
                    # Skip leading or repeated whitespace without emitting anything
                    continue
                fi
                is_space="true"
                args+=("$s")
                s=""
                continue
            fi
    
            is_space=""
            s+="$c"
        done
    
        if [ "$last_quote" != "$notquote" ]; then
            >&2 echo "error: quote not terminated"
            return 1
        fi
    
        for arg in "${args[@]}"; do
            echo "$arg"
        done
        return 0
    }
    

    I may or may not keep this updated at:

    • https://git.coolaj86.com/coolaj86/git-scripts/src/branch/master/git-proxy

    Seems like a rather stupid thing to do... but I had the itch... oh well.

  • 2020-11-22 04:29

    A Few Introductory Words

    If at all possible, don't use shell-quoted strings as an input format.

    • It's hard to parse consistently: Different shells have different extensions, and different non-shell implementations implement different subsets (see the deltas between shlex and xargs below).
    • It's hard to generate programmatically. ksh and bash have printf '%q', which produces a shell-quoted string from the contents of an arbitrary variable, but the POSIX sh standard has no equivalent.
    • It's easy to parse badly. Many folks consuming this format use eval, which has substantial security concerns.
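
    As a rough illustration of the printf '%q' point in the list above, here is a minimal bash-only sketch. The array contents are made up for the example, and reading the string back requires eval, so it is only reasonable for strings you generated yourself:

```shell
#!/usr/bin/env bash
# Hypothetical argument list with a space, a quote and a glob character.
original=( "hi there" "it's" '*' )

# %q quotes each element so that the shell reads it back as the same single word.
printf -v quoted '%q ' "${original[@]}"

# Re-parsing needs eval; never do this with a string from an untrusted source.
eval "restored=( $quoted )"

printf 'quoted form: %s\n' "$quoted"
printf 'restored %d elements\n' "${#restored[@]}"
```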

    NUL-delimited streams are a far better practice, as they can accurately represent any possible shell array or argument list with no ambiguity whatsoever.
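
    A minimal sketch of such a round trip in bash, assuming the producer and the consumer are both under your control (the array contents are invented for the example):

```shell
#!/usr/bin/env bash
# Hypothetical arguments, including one with an embedded newline.
original=( "hi there" $'two\nlines' '*' )

restored=( )
# printf writes each element followed by a NUL; read -d '' consumes up to each NUL.
while IFS= read -r -d '' item; do
  restored+=( "$item" )
done < <(printf '%s\0' "${original[@]}")

printf 'restored %d elements\n' "${#restored[@]}"
```

    No quoting or escaping is involved: NUL is the one character that cannot appear inside an argument, so it works as a separator for any content.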


    xargs, with bashisms

    If you're getting your argument list from a human-generated input source using shell quoting, you might consider using xargs to parse it. Consider:

    array=( )
    while IFS= read -r -d ''; do
      array+=( "$REPLY" )
    done < <(xargs printf '%s\0' <<<"$ARGS")
    
    ./swap "${array[@]}"
    

    ...will put the parsed content of $ARGS into the array array. If you wanted to read from a file instead, substitute <filename for <<<"$ARGS".


    xargs, POSIX-compliant

    If you're trying to write code compliant with POSIX sh, this gets trickier. (I'm going to assume file input here for reduced complexity):

    # This does not work with entries containing literal newlines; you need bash for that.
    run_with_args() {
      while IFS= read -r entry; do
        set -- "$@" "$entry"
      done
      "$@"
    }
    xargs printf '%s\n' <argfile | run_with_args ./swap
    

    These approaches are safer than running xargs ./swap <argfile, inasmuch as they fail with an error if there are more or longer arguments than can be accommodated, rather than silently passing the excess arguments to separate invocations of the command.
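
    The splitting behavior of xargs can be made visible with a small sketch. Here -n 2 is only an assumption used to force the split with a tiny input; without it, the same thing happens once the argument list exceeds the size limit xargs works with:

```shell
#!/usr/bin/env bash
# -n 2 caps each invocation at two arguments, so echo is run twice:
# once with "one two" and once with "three".
split_output=$(printf '%s\n' one two three | xargs -n 2 echo)
printf '%s\n' "$split_output"
```

    If echo were ./swap, the second invocation would see only the leftover arguments, which is rarely what the caller intended.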


    Python shlex -- rather than xargs -- with bashisms

    If you need more accurate POSIX sh parsing than xargs implements, consider using the Python shlex module instead:

    shlex_split() {
      # python3 is assumed to be on PATH; shlex.split applies POSIX-style quoting rules
      python3 -c '
    import shlex, sys
    for item in shlex.split(sys.stdin.read()):
        sys.stdout.write(item + "\0")
    '
    }
    array=( )
    while IFS= read -r -d ''; do
      array+=( "$REPLY" )
    done < <(shlex_split <<<"$ARGS")
    