Bash script to create multiple arrays from csv with unknown columns.
I am trying to write a script to compare two csv files with similar columns. I need it to locat
arr$varEnvCol[$index]="$(...)"
doesn't work the way you expect it to - you cannot assign to shell variables indirectly - via an expression that expands to the variable name - this way.
Your attempted workaround with eval is also flawed - see below.
In bash 4.3+, use a nameref:
declare -n targetArray="arr$varEnvCol"
targetArray[index]=$(echo $line | awk -F, '{print $1}')
In bash 4.2 and earlier, use declare:
declare "arr$varEnvCol"[index]="$(echo $line | awk -F, '{print $1}')"
Caveat: This will work in your particular situation, but may fail subtly in others; read on for details, including a more robust, but cumbersome alternative based on read.
The eval-based solution mentioned by @shellter in a since-deleted comment is problematic not only for security reasons (as they mentioned), but also because it can get quite tricky with respect to quoting; for completeness, here's the eval-based solution:
eval "arr$varEnvCol[index]"='$(echo $line | awk -F, '\''{print $1}'\'')'
See below for an explanation.
Assigning to a bash array variable indirectly:
bash 4.3+: use declare -n to effectively create an alias ('nameref') of another variable
This is by far the best option, if available:
declare -n targetArray="arr$varEnvCol"
targetArray[index]=$(echo $line | awk -F, '{print $1}')
declare -n effectively allows you to refer to a variable by another name (whether that variable is an array or not), and the name to create an alias for can be the result of an expression (an expanded string), as demonstrated.
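To make the nameref approach concrete, here is a minimal, self-contained sketch. The values of varEnvCol, index and line are hypothetical stand-ins for what the question's script would compute, and the first CSV field is extracted with a parameter expansion rather than awk just to keep the example dependency-free.

```shell
#!/usr/bin/env bash
# Requires bash 4.3+ for 'declare -n'.

# Hypothetical stand-ins for values the real script would derive from the CSV.
varEnvCol=Prod
index=2
line='alpha,beta,gamma'

# Create an alias for the array whose name is only known at runtime.
declare -n targetArray="arr$varEnvCol"

# Assign through the alias; ${line%%,*} keeps everything before the first comma.
targetArray[index]=${line%%,*}

# The underlying array, arrProd, now holds the value at index 2.
echo "${arrProd[2]}"   # prints: alpha
```

Because the alias is an ordinary variable name, any later array operations (appending, iterating, taking the length) can also go through targetArray.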
bash 4.2-: there are several options, each with tradeoffs
NOTE: With non-array variables, the best approach is to use printf -v. Since this question is about array variables, this approach is not discussed further.
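For reference only, the printf -v technique for non-array variables might look like the sketch below. The variable names (varEnvCol, line, and the generated firstDev) are assumptions made up for illustration, not part of the question.

```shell
#!/usr/bin/env bash
# Hypothetical inputs; in the real script these would come from the CSV.
varEnvCol=Dev
line='alpha,beta,gamma'

# printf -v takes the NAME of the variable to assign to, so the name
# may be the result of an expansion. Here it expands to "firstDev".
printf -v "first$varEnvCol" '%s' "${line%%,*}"

echo "$firstDev"   # prints: alpha
```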
[most robust, but cumbersome]: use read:
IFS=$'\n' read -r -d '' "arr$varEnvCol"[index] <<<"$(echo $line | awk -F, '{print $1}')"
IFS=$'\n' ensures that leading and trailing whitespace in each input line is left intact.
-r prevents interpretation of \ chars. in the input.
-d '' ensures that ALL input is captured, even multi-line. Note, however, that trailing \n chars. are stripped.
"arr$varEnvCol"[index] expands to the variable - array element, in this case - to assign to; note that referring to variable index inside an array subscript does NOT need the $ prefix, because subscripts are evaluated in arithmetic context, where the prefix is optional.
<<< - a so-called here-string - sends its argument to stdin, where read takes its input from.
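The read-based assignment can be tried in isolation with a sketch like the following. The input values are hypothetical, and the field is padded with spaces on purpose to show that surrounding whitespace survives. One detail worth knowing: with -d '', read reaches end of input without ever seeing its NUL delimiter, so it returns a non-zero exit status even though the assignment succeeds; the trailing || true guards against that in scripts that use set -e.

```shell
#!/usr/bin/env bash
# Hypothetical inputs; note the deliberate spaces around the first field.
varEnvCol=Test
index=0
line='  padded value ,beta'

# Assign the first CSV field to arrTest[0] without naming arrTest literally.
# 'index' needs no $ prefix because subscripts are evaluated arithmetically.
IFS=$'\n' read -r -d '' "arr$varEnvCol"[index] <<<"${line%%,*}" || true

# Brackets make the preserved leading and trailing spaces visible.
printf '[%s]\n' "${arrTest[0]}"   # prints: [  padded value ]
```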
[simplest, but may break]: use declare:
declare "arr$varEnvCol"[index]="$(echo $line | awk -F, '{print $1}')"
(declare is meant to declare, not modify, a variable, but it works in bash 3.x and 4.x, with the constraints noted below.)
Works OUTSIDE a function, whether the array was explicitly declared with declare or not.
Caveat: INSIDE a function, this only works with LOCAL variables - you cannot reference shell-global variables (variables declared outside the function) from inside a function that way. Attempting to do so invariably creates a LOCAL variable ECLIPSING the shell-global variable.
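The function-scope caveat is easy to reproduce. In this sketch, arrDemo and set_inside_function are hypothetical names chosen for illustration; the point is only that the global array is left untouched.

```shell
#!/usr/bin/env bash
# A shell-global array with an initial value.
arrDemo=(original)

set_inside_function() {
  local suffix=$1
  # Inside a function, declare creates a LOCAL arrDemo that eclipses the global.
  declare "arr$suffix"[0]="changed"
  echo "inside: ${arrDemo[0]}"
}

set_inside_function Demo        # prints: inside: changed
echo "outside: ${arrDemo[0]}"   # prints: outside: original
```

The assignment appears to succeed inside the function, which is what makes this failure mode subtle: the change simply vanishes when the function returns.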
[insecure and tricky]: use eval:
eval "arr$varEnvCol[index]"='$(echo $line | awk -F, '\''{print $1}'\'')'
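Only use eval if you fully control the contents of the string being evaluated; eval will execute any command contained in a string, with potentially unwanted results.
Getting the quoting right is tricky: the right-hand side should be expanded when eval executes, rather than by the immediate expansion that happens when arguments are passed to eval.
The command above therefore single-quotes the right-hand side, with the embedded ' chars. spliced in as '\''.
The left-hand side passed to eval must be a double-quoted string - using an unquoted string with selective quoting of $ won't work, curiously:
eval "arr$varEnvCol[index]"=...   # works
eval arr\$varEnvCol[index]=...    # fails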
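If eval cannot be avoided, one way to limit the risk is to validate the dynamic part of the name before it ever reaches eval. The sketch below assumes hypothetical input values and adds a name check that is not part of the original command; it is an illustration of the quoting, not a recommendation over the alternatives above.

```shell
#!/usr/bin/env bash
# Hypothetical inputs.
varEnvCol=Qa
index=1
line='alpha,beta,gamma'

# Added safeguard (an assumption, not in the original): refuse anything
# that is not made up purely of letters, digits and underscores.
if [[ ! $varEnvCol =~ ^[A-Za-z0-9_]+$ ]]; then
  echo "refusing unsafe name: $varEnvCol" >&2
  exit 1
fi

# The left-hand side is double-quoted, so $varEnvCol expands NOW.
# The right-hand side is single-quoted, so it expands only when eval runs;
# the embedded single quotes are spliced in as '\''.
eval "arr$varEnvCol[index]"='$(echo "$line" | awk -F, '\''{print $1}'\'')'

echo "${arrQa[1]}"   # prints: alpha
```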