Question
Here's my array (gawk script):
myArray["peter"] = 32
myArray["bob"] = 5
myArray["john"] = 463
myArray["jack"] = 11
After sorting, I need the following result:
bob 5
jack 11
peter 32
john 463
When I use asort(), the indices are lost. How can I sort by array value without losing the indices? (I need the indices ordered by their values.)
(I need to obtain this result with awk/gawk only, not a shell script, Perl, etc.)
If my post isn't clear enough, here is another post explaining the same issue: http://www.experts-exchange.com/Programming/Languages/Scripting/Shell/Q_26626841.html
Thanks in advance
Update:
Thanks to you both, but I need to sort by values, not indices (I want the indices ordered according to their values).
In other words, I need this result:
bob 5
jack 11
peter 32
john 463
not:
bob 5
jack 11
john 463
peter 32
(I agree, my example is confusing, the chosen values are pretty bad)
Starting from Catcall's code, I wrote a quick implementation that works, but it's rather ugly (I concatenate keys and values before sorting and split them again during comparison). Here's what it looks like:
# Sort A[left..right] in place, comparing the numeric value part of each element.
function qsort(A, left, right,    i, last) {
    if (left >= right)
        return
    swap(A, left, left+int((right-left+1)*rand()))
    last = left
    for (i = left+1; i <= right; i++)
        if (getPart(A[i], "value") < getPart(A[left], "value"))
            swap(A, ++last, i)
    swap(A, left, last)
    qsort(A, left, last-1)
    qsort(A, last+1, right)
}
function swap(A, i, j,    t) {
    t = A[i]; A[i] = A[j]; A[j] = t
}
# Extract the key or the numeric value from a "key#value" string.
function getPart(str, part) {
    if (part == "key")
        return substr(str, 1, index(str, "#")-1)
    if (part == "value")
        return substr(str, index(str, "#")+1, length(str))+0
    return
}
BEGIN { }
{ }
END {
    myArray["peter"] = 32
    myArray["bob"] = 5
    myArray["john"] = 463
    myArray["jack"] = 11
    n = 0
    for (key in myArray)
        sortvalues[++n] = key "#" myArray[key]   # 1-based, to match the print loop
    qsort(sortvalues, 1, n)
    for (i = 1; i <= n; i++)
        print getPart(sortvalues[i], "key"), getPart(sortvalues[i], "value")
}
Of course, I'm interested if you have something cleaner...
Thanks for your time
Answer 1:
Edit:
Sort by values
Oh! To sort by values, it's a bit of a kludge, but you can create a temporary array whose indices are a concatenation of the values and the indices of the original array. Then you can asorti() the temporary array and split the concatenated strings back into indices and values. If you can't follow that convoluted description, the code is much easier to understand. It's also very short.
# right justify the integers into space-padded strings and cat the index
# to create the new index
for (i in myArray) tmpidx[sprintf("%12s", myArray[i]),i] = i
num = asorti(tmpidx)
j = 0
for (i=1; i<=num; i++) {
    split(tmpidx[i], tmp, SUBSEP)
    indices[++j] = tmp[2]  # tmp[2] is the name
}
for (i=1; i<=num; i++) print indices[i], myArray[indices[i]]
Edit 2:
If you have GAWK 4, you can traverse the array by order of values without performing an explicit sort:
#!/usr/bin/awk -f
BEGIN {
    myArray["peter"] = 32
    myArray["bob"] = 5
    myArray["john"] = 463
    myArray["jack"] = 11
    # Traverse the array by value, compared numerically, in ascending order.
    PROCINFO["sorted_in"] = "@val_num_asc"
    for (i in myArray) {
        print i, myArray[i]
    }
}
There are settings for traversing by index or value, ascending or descending and other options. You can also specify a custom function.
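As a rough sketch of the custom-function option: gawk calls the function named in PROCINFO["sorted_in"] with the index and value of two elements and expects a negative, zero, or positive number back, much like a C qsort() comparator. The function name by_value_desc and the tie-breaking rule are made up for this illustration, and the script assumes gawk 4.0 or later.

```awk
#!/usr/bin/gawk -f
# Hypothetical comparison function (the name is illustrative only).
# i1/i2 are the indices and v1/v2 the values of the two elements compared.
function by_value_desc(i1, v1, i2, v2) {
    if (v1 != v2)
        return v2 - v1                   # larger values come first
    return (i1 < i2) ? -1 : (i1 > i2)    # equal values: index ascending
}

BEGIN {
    myArray["peter"] = 32
    myArray["bob"]   = 5
    myArray["john"]  = 463
    myArray["jack"]  = 11

    # Name the function instead of one of the predefined "@..." orders.
    PROCINFO["sorted_in"] = "by_value_desc"
    for (name in myArray)
        print name, myArray[name]
}
```

With the sample data this should list john, peter, jack, bob in that order. Swapping v2 - v1 for v1 - v2 would give the ascending order asked for in the question.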
Previous answer:
Sort by indices
If you have an AWK, such as gawk 3.1.2 or greater, which supports asorti():
#!/usr/bin/awk -f
BEGIN {
    myArray["peter"] = 32
    myArray["bob"] = 5
    myArray["john"] = 463
    myArray["jack"] = 11
    num = asorti(myArray, indices)
    for (i=1; i<=num; i++) print indices[i], myArray[indices[i]]
}
If you don't have asorti():
#!/usr/bin/awk -f
BEGIN {
    myArray["peter"] = 32
    myArray["bob"] = 5
    myArray["john"] = 463
    myArray["jack"] = 11
    for (i in myArray) indices[++j] = i
    num = asort(indices)
    for (i=1; i<=num; i++) print i, indices[i], myArray[indices[i]]
}
Answer 2:
Pipe the output to the Unix sort command. This keeps the Awk code simple and follows the Unix philosophy.
Create an input file with comma-separated values:
peter,32
jack,11
john,463
bob,5
Create a sort.awk file with this code:
BEGIN { FS=","; }
{
    myArray[$1]=$2;
}
END {
    # Every line goes to the same sort process, which orders them numerically by field 2.
    for (name in myArray)
        printf ("%s,%d\n", name, myArray[name]) | "sort -t, -k2 -n"
}
Run the program; it should give you this output:
$ awk -f sort.awk data
bob,5
jack,11
peter,32
john,463
Answer 3:
PROCINFO["sorted_in"] = "@val_num_desc";
Put the above statement before iterating over the array. Note that it works in gawk 4.0.1 but not in gawk 3.1.7.
I am not sure in which intermediate version it was introduced.
Answer 4:
And the simple answer...
# Comparison function for asorti(): it must return a negative, zero, or
# positive number, not just a boolean, for the sort to be reliable.
function sort_by_myArray(i1, v1, i2, v2) {
    return myArray[i1] - myArray[i2];
}
BEGIN {
    myArray["peter"] = 32;
    myArray["bob"] = 5;
    myArray["john"] = 463;
    myArray["jack"] = 11;
    len = length(myArray);
    asorti(myArray, k, "sort_by_myArray");
    # Print result.
    for(n = 1; n <= len; ++n) {
        print k[n], myArray[k[n]]
    }
}
Answer 5:
The authors of The Awk Programming Language provide a quicksort function, which is available online.
I think you'd do something like this.
END {
    for (key in myArray) {
        sortkeys[++n] = key;
    }
    qsort(sortkeys, 1, n);  # qsort(A, left, right) sorts A[left..right] in place.
    for (i = 1; i <= n; i++) {
        print sortkeys[i], myArray[sortkeys[i]];
    }
}
Source: https://stackoverflow.com/questions/5342782/sort-associative-array-with-awk