In a bash script, how do I sanitize user input?

抹茶落季 2020-12-24 05:01

I'm looking for the best way to take a simple input:

echo -n "Enter a string here: "
read -e STRING

and clean it up by removing non-alph

6 answers
  • 2020-12-24 05:41

    Bash can do this all on its own, thank you very much. If you look at the section of the man page on Parameter Expansion, you'll see that bash has built-in substitution, substring extraction, and prefix and suffix trimming.

    To eliminate all non-alphanumeric characters, do

    CLEANSTRING=${STRING//[^a-zA-Z0-9]/}
    

    That's Occam's razor. No need to launch another process.
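
    A minimal sketch of that substitution wrapped in a function. The name `strip_non_alnum` is hypothetical, not something from the man page, and the sketch assumes the ranges `a-z`, `A-Z` and `0-9` behave as they do in the C locale:

```shell
#!/usr/bin/env bash
# strip_non_alnum is a hypothetical helper name used for illustration only.
# It deletes every character not matched by a-z, A-Z or 0-9, using only
# Bash parameter expansion (no external process is started).
strip_non_alnum() {
    local input=$1
    # ${var//pattern/} removes every match of the glob pattern from var
    printf '%s\n' "${input//[^a-zA-Z0-9]/}"
}

strip_non_alnum 'Hello, World! 123'   # prints: HelloWorld123
```

    The original `$STRING` is left untouched, so the raw input stays available if you need it later.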

  • 2020-12-24 05:43

    You could run it through perl.

    # perl reads STRING from its environment, so pass it in explicitly
    export CLEANSTRING=$(STRING="$STRING" perl -e 'print join( q//, map { s/\s+/_/g; lc } split /[^\s\w]+/, $ENV{STRING} )')
    

    I'm using ksh-style `$(...)` command substitution here; it is also part of POSIX and works in bash.

    That's the nice thing about shell: you can use perl, awk, sed, grep....

  • 2020-12-24 05:44

    Quick and dirty:

    STRING=`echo 'dit /ZOU/ een test123' | perl -pe 's/ //g;tr/A-Z/a-z/;s/[^a-zA-Z0-9]//g'`

  • 2020-12-24 05:48

    For Bash >= 4.0:

    CLEAN="${STRING//_/}" && \
    CLEAN="${CLEAN// /_}" && \
    CLEAN="${CLEAN//[^a-zA-Z0-9]/}" && \
    CLEAN="${CLEAN,,}"
    

    This is especially useful for creating container names programmatically using docker/podman. However, in this case you'll also want to remove the underscores:

    # Sanitize $STRING for a container name
    CLEAN="${STRING//[^a-zA-Z0-9]/}" && \
    CLEAN="${CLEAN,,}"
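
    One edge case worth guarding against: if the input contains no letters or digits at all, the result is an empty string. A minimal sketch of a wrapper that reports this, assuming Bash >= 4.0; `container_name_from` is a hypothetical helper name, not a docker or podman command:

```shell
#!/usr/bin/env bash
# container_name_from is a hypothetical helper, not a docker or podman command.
# Requires Bash >= 4.0 for the ${var,,} lowercase expansion.
container_name_from() {
    local clean="${1//[^a-zA-Z0-9]/}"   # keep letters and digits only
    clean="${clean,,}"                  # lowercase
    if [[ -z $clean ]]; then
        # every character was stripped, so there is nothing usable left
        echo "container_name_from: no alphanumeric characters in input" >&2
        return 1
    fi
    printf '%s\n' "$clean"
}

container_name_from 'My App_v2!'   # prints: myappv2
```

    The non-zero return status lets the caller decide what to do, for example `name=$(container_name_from "$STRING") || exit 1`.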
    
  • 2020-12-24 06:05

    After a bit of looking around it seems tr is indeed the simplest way:

    export CLEANSTRING="`echo -n "${STRING}" | tr -cd '[:alnum:] [:space:]' | tr '[:space:]' '-'  | tr '[:upper:]' '[:lower:]'`"
    

    Occam's razor, I suppose.
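
    The same pipeline, sketched one stage per line so each `tr` call is easier to follow. The function name `slugify` is hypothetical, `printf` stands in for `echo -n`, and the sketch assumes ASCII input:

```shell
#!/usr/bin/env bash
# slugify is a hypothetical name for the three-stage tr pipeline.
slugify() {
    printf '%s' "$1" |
        tr -cd '[:alnum:][:space:]' |   # 1. delete all but letters, digits, whitespace
        tr '[:space:]' '-' |            # 2. turn each whitespace character into a hyphen
        tr '[:upper:]' '[:lower:]'      # 3. lowercase what is left
}

CLEANSTRING=$(slugify 'Hello, World!')
echo "$CLEANSTRING"   # prints: hello-world
```

    Each run of several whitespace characters becomes several hyphens, since `tr` translates character by character.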

  • 2020-12-24 06:06

    As dj_segfault points out, the shell can do most of this for you. Before Bash 4.0 (which added the `${var,,}` expansion), you'll have to fall back on something external for lower-casing the string, though. For this you have many options, like the perl one-liners above, but I think tr is probably the simplest.

    # first, strip underscores
    CLEAN=${STRING//_/}
    # next, replace spaces with underscores
    CLEAN=${CLEAN// /_}
    # now, clean out anything that's not alphanumeric or an underscore
    CLEAN=${CLEAN//[^a-zA-Z0-9_]/}
    # finally, lowercase with tr
    CLEAN=`echo -n "$CLEAN" | tr A-Z a-z`
    

    The order here is somewhat important. We want to get rid of underscores, plus replace spaces with underscores, so we have to be sure to strip underscores first. By waiting to pass things to tr until the end, we know we have only alphanumeric and underscores, and we can be sure we have no spaces, so we don't have to worry about special characters being interpreted by the shell.
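
    To make the ordering point concrete, here is a small sketch that runs the same steps in both orders. Both function names are hypothetical and exist only for the comparison:

```shell
#!/usr/bin/env bash
# Both function names are hypothetical, used only to compare the two orderings.

# Order from the answer: strip underscores first, then spaces -> underscores.
clean_in_order() {
    local s="${1//_/}"
    s="${s// /_}"
    s="${s//[^a-zA-Z0-9_]/}"
    printf '%s\n' "$s" | tr 'A-Z' 'a-z'
}

# Swapped order: the underscores created from spaces are stripped again.
clean_swapped() {
    local s="${1// /_}"
    s="${s//_/}"
    s="${s//[^a-zA-Z0-9_]/}"
    printf '%s\n' "$s" | tr 'A-Z' 'a-z'
}

clean_in_order 'My_File Name'   # prints: myfile_name
clean_swapped  'My_File Name'   # prints: myfilename
```

    With the swapped order the word boundary is lost, which is why the underscores have to be stripped before the spaces are converted.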
