I tried creating a pseudo-multidimensional array in awk.
# Calculate cumulative context score
BEGIN { FS=OFS="\t" }
{
a[$2+FS+$7,$3]+=$6
}
END { for (i,j) in a
{ print i,j,a[i,j] }
}
Output:
awk: ccstscan.awk:9: END { for (i,j) in a
awk: ccstscan.awk:9: ^ syntax error
This is what the GNU awk manual says:
To test whether a particular index sequence exists in a multidimensional array, use the same operator (in) that is used for single dimensional arrays. Write the whole sequence of indices in parentheses, separated by commas, as the left operand:
(subscript1, subscript2, ...) in array
I tried modifying the script to create a true multidimensional array:
BEGIN { FS=OFS="\t" }
{
a[$2+FS+$7][$3]+=$6
}
END { for i in a
{
for j in a[i]
{ print i,j,a[i][j]
}
}
}
I ran it with gawk. It also gave errors:
gawk: ccstscan.awk:6: a[$2+FS+$7][$3]+=$6
gawk: ccstscan.awk:6: ^ syntax error
gawk: ccstscan.awk:9: END { for i in a
gawk: ccstscan.awk:9: ^ syntax error
gawk: ccstscan.awk:11: for j in a[i]
gawk: ccstscan.awk:11: ^ syntax error
gawk: ccstscan.awk:11: for j in a[i]
gawk: ccstscan.awk:11: ^ syntax error
gawk: ccstscan.awk:12: { print i,j,a[i][j]
gawk: ccstscan.awk:12: ^ syntax error
What is the correct syntax to create and scan multidimensional associative arrays?
If you are using the simulated multi-dimensional arrays, your loop would need to be like this:
END {
    for (ij in a) {
        split(ij, indices, SUBSEP)
        i = indices[1]
        j = indices[2]
        print i, j, a[ij]
    }
}
The (i,j) in a syntax only works for testing whether a particular index is in the array. It does not work as the loop variable of a for loop, even though for (key in array) looks similar.
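To make the difference concrete, here is a minimal sketch that can be pasted into a shell. The three input lines are made-up sample data (seven tab-separated fields, chosen only to match the field numbers in the question), and the function name simulated_demo is hypothetical. It assumes any POSIX awk is installed as awk.

```shell
# Hypothetical sample data: 7 tab-separated fields per line.
simulated_demo() {
    printf 'r1\tA\tx\t-\t-\t2\tB\nr2\tA\tx\t-\t-\t3\tB\nr3\tC\ty\t-\t-\t5\tD\n' |
    awk '
    BEGIN { FS = OFS = "\t" }
    { a[$2 FS $7, $3] += $6 }          # stored under one key: ($2 FS $7) SUBSEP $3
    END {
        for (ij in a) {                # the loop sees one combined key at a time
            split(ij, idx, SUBSEP)     # recover the two subscripts
            print idx[1], idx[2], a[ij]
        }
        # The parenthesized form is a membership test only:
        if (("A" FS "B", "x") in a) print "found A/B,x"
    }' | LC_ALL=C sort                 # for-in order is unspecified, so sort
}
simulated_demo
```

The sort at the end is there only because awk does not guarantee any iteration order for for (key in array).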
For true multidimensional arrays (arrays of arrays), you can write it like this. Note that the for statement requires parentheses around i in a:
BEGIN { FS = OFS = "\t" }
# $2 FS $7 concatenates the fields; "+" would add them numerically
{ a[$2 FS $7][$3] += $6 }
END {
    for (i in a) {
        for (j in a[i]) {
            print i, j, a[i][j]
        }
    }
}
However, arrays of arrays were only added in gawk 4.0, so your version of gawk may not support them.
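If you are not sure which gawk you have, one possible approach is to check the version before relying on the nested syntax. The sketch below assumes gawk is on your PATH and reads the major version from PROCINFO["version"]; the helper names have_gawk4 and nested_demo and the sample input are hypothetical.

```shell
# Succeeds only if a gawk with major version >= 4 is available.
have_gawk4() {
    gawk 'BEGIN { split(PROCINFO["version"], v, "."); exit !(v[1] + 0 >= 4) }' 2>/dev/null
}

# Hypothetical sample data: 7 tab-separated fields per line.
nested_demo() {
    printf 'r1\tA\tx\t-\t-\t2\tB\nr2\tA\tx\t-\t-\t3\tB\nr3\tC\ty\t-\t-\t5\tD\n' |
    gawk '
    BEGIN { FS = OFS = "\t" }
    { a[$2 FS $7][$3] += $6 }          # a[...] is itself an array
    END {
        for (i in a)
            for (j in a[i])
                print i, j, a[i][j]
    }' | LC_ALL=C sort                 # for-in order is unspecified, so sort
}

if have_gawk4; then
    nested_demo
else
    echo "gawk 4.0 or later not found; use the SUBSEP form instead" >&2
fi
```

On an older gawk, or on another awk such as mawk, the SUBSEP-based loop remains the portable option.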
Another note: on this line:
a[$2+FS+$7,$3]+=$6
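It seems like you are trying to concatenate $2, FS, and $7, but "+" is numeric addition, not concatenation. In awk, strings are concatenated by writing them next to each other. With "+", the tab in FS is converted to 0, so the index becomes the numeric sum of $2 and $7. You would need to write it like this: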
a[$2 FS $7,$3] += $6
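A quick way to see the difference is to print both expressions for the same record. This is only an illustrative sketch: the input line is made up, with 3 in field 2 and 4 in field 7, and the function name concat_demo is hypothetical.

```shell
concat_demo() {
    printf 'r1\t3\tx\t-\t-\t2\t4\n' |
    awk 'BEGIN { FS = "\t" }
    {
        # "+" converts each operand to a number; the tab in FS becomes 0
        print "plus:   [" ($2 + FS + $7) "]"
        # juxtaposition concatenates: field 2, a tab, field 7
        print "concat: [" ($2 FS $7) "]"
    }'
}
concat_demo
```

With "+", different field pairs that happen to have the same sum (or any non-numeric fields, which all become 0) would collapse into one array index, so the totals would be merged incorrectly.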
Source: https://stackoverflow.com/questions/14280877/multidimensional-arrays-in-awk