Is it possible to append an item to an array in awk without specifying an index?

Question

I realize that awk has associative arrays, but I wonder if there is an awk equivalent to this:

http://php.net/manual/en/function.array-push.php

The obvious workaround is to just say:

array[$new_element] = $new_element

However, this seems less readable and more hackish than it needs to be.

Answer 1:

I don't think an array length is immediately available in awk (at least not in the versions I fiddle around with). But you could simply maintain the length and then do something like this:

array[arraylen++] = $0;

And then access the elements via the same integer values:

for ( i = 0; i < arraylen; i++ )
   print array[i];
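
Putting the two snippets above together, here is a minimal runnable sketch. The wrapper name append_lines is made up for this example; the awk code itself only relies on an uninitialized variable counting as 0, so it should run on any POSIX awk.

```shell
# append_lines is a hypothetical wrapper name used only for this sketch.
append_lines() {
  awk '
    # arraylen doubles as the next free index; it starts out as 0.
    { array[arraylen++] = $0 }
    END {
      for (i = 0; i < arraylen; i++)
        print i, array[i]
    }
  '
}

printf 'fnord\nick\nbaz\n' | append_lines
```

This prints "0 fnord", "1 ick" and "2 baz", one per line, in the order the lines were read.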

Answer 2:

In gawk you can find the length of an array with length(var) so it's not very hard to cook up your own function.

function push(A,B) { A[length(A)+1] = B }

Notice this discussion, though: http://objectmix.com/awk/361598-gawk-length-array-question.html -- all the places I can access right now have gawk 3.1.5 so I cannot properly test my function, duh. But here is an approximation.

vnix$ gawk '# BEGIN: make sure arr is an array
>   BEGIN { delete arr[0] }
>   { print "=" length(arr); arr[length(arr)+1] = $1;
>     print length(arr), arr[length(arr)] }
>   END { print "---";
>     for (i=1; i<=length(arr); ++i) print i, arr[i] }' <<HERE
> fnord foo
> ick bar
> baz quux
> HERE
=0
1 fnord
=1
2 ick
=2
3 baz
---
1 fnord
2 ick
3 baz

Answer 3:

As others have said, awk provides no functionality like this out of the box. Your "hackish" workaround may work for some datasets, but not others. Consider that you might add the same array value twice, and want it represented twice within the array.

$ echo 3 | awk 'BEGIN{ a[1]=5; a[2]=12; a[3]=2 }
>   { a[$1] = $1 }
>   END {print length(a) " - " a[3]}'
3 - 3
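
The duplicate case can be shown the same way. In this sketch (dup_demo, seen and list are names invented for the illustration) each input line is stored twice: once using the value as its own key, and once under a running counter. Keys are counted with a for-in loop so that it does not depend on length() accepting an array.

```shell
# dup_demo is a hypothetical name used only for this sketch.
dup_demo() {
  awk '
    {
      seen[$0] = $0     # value used as its own key: duplicates overwrite
      list[n++] = $0    # running counter as key: duplicates are kept
    }
    END {
      for (k in seen) distinct++
      print "value-as-key:", distinct + 0
      print "counter-as-key:", n + 0
    }
  '
}

printf 'foo\nbar\nfoo\n' | dup_demo
```

With the input foo, bar, foo this reports 2 elements for the value-as-key array and 3 for the counter-keyed one.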

The best solution depends on the data that are in the array, but here are some thoughts.

First off, if you are certain that your index will always be numeric, will always start at 1, and that you will never delete array elements, then tripleee's suggestion of A[length(A)+1]="value" may work for you. But if you do delete an element, then your next write may overwrite your last element.
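
To see the overwrite concretely, here is a small sketch. The name overwrite_demo is made up for this example, and it assumes an awk whose length() accepts an array, as gawk's does.

```shell
# overwrite_demo is a hypothetical name; assumes length(array) is supported.
overwrite_demo() {
  awk '
    BEGIN {
      a[1] = "one"; a[2] = "two"; a[3] = "three"
      delete a[2]                 # length(a) is now 2, but key 3 is still taken
      a[length(a) + 1] = "NEW"    # 2 + 1 = 3, so "three" is replaced
      print length(a), a[3]
    }
  '
}

overwrite_demo
```

This prints "2 NEW": the array still has two elements, and the old value at index 3 is gone.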

If your index does not matter, and you're not worried about wasting space with long keys, you could use a random number that's long enough to reduce the likelihood of collisions. A quick & dirty option might be:

srand()
a[rand() rand() rand()]="value"

Remember to use srand() for better randomization, and don't trust rand() to produce actual random numbers. This is a less than perfect solution in a number of ways, but it has the advantage of being a single line of code.

If your keys are numeric but possibly sparse, as in the example that would break tripleee's solution, you can add a small search to your push function:

function push (a, v,     n) {
  n=length(a)+1
  while (n in a) n++
  a[n]=v
}

The while loop ensures that you'll assign an unused index. This function is also compatible with arrays that use non-numeric indices -- it assigns keys that are numeric, but it doesn't care what's already there.

Note that awk does not guarantee the order of elements within an array, so the idea that you will "push an item onto the end of the array" is wrong. You'll add this element to the array, but there's no guarantee it will appear last when you step through with a for (i in a) loop.

$ cat a
#!/usr/bin/awk -f

function push (a, v,     n) {
  n=length(a)+1
  while (n in a) n++
  a[n]=v
}

{
  push(a, $0)
}

END {
  print "length=" length(a)
  for(i in a) print i " - " a[i]
}

$ printf '3\nfour\ncinq\n' | ./a
length=3
2 - four
3 - cinq
1 - 3

Source: https://stackoverflow.com/questions/10758564/is-it-possible-to-append-an-item-to-an-array-in-awk-without-specifying-an-index

Tags

awk

gawk