Question
Hi, as suggested in my previous question, I will try to clarify what I want to achieve. In file1, column $4 contains numbers that do not form a continuous sequence (1, 2, 3, 4, 5, ...), so I need to print the missing ones; e.g. after number 3 I should get number 4, and so on.
cat file1
A R5 A48 1
B R5 A48 2
C R4 A48 3
D R8 A48 15
E R9 A48 22
F R20 B55 21
G R55 B22 19
R B1 I77 14
AA B8 PP 18
BX A255 PA 7
CA A77 PB 10
WW W7 PX 11
I found a partial solution in this awk one-liner, which returns:
awk '{ print $4 }' file1 | \
awk -v first=1 -v last=23 'BEGIN { for (i = first; i <= last; i++) array[i] = 0 }
{ array[$1]++ } END { for (i = first; i <= last; i++) if (array[i] == 0) print i }'
4
5
6
8
9
12
13
16
17
20
23
This is what I want, BUT I am still missing the remaining numbers after 23, up to 31, and I want the numbers appended to file2 as column $3, one per row/line of file2.
cat file2
md5sum 25d422cc23b44c3bbd7a66c76d52af46
md5sum 25d422cc23b44c3bbd7a66c76d52af47
md5sum 25d422cc23b44c3bbd7a66c76d52af48
md5sum 25d422cc23b44c3bbd7a66c76d52af41
md5sum 25d422cc23b44c3bbd7a66c76d52af22
md5sum 25d422cc23b44c3bbd7a66c76d52af33
md5sum 25d422cc23b44c3bbd7a66c76d52af12
md5sum 25d422cc23b44c3bbd7a66c76d52af01
md5sum 25d422cc23b44c3bbd7a66c76d52af55
md5sum 25d422cc23b44c3bbd7a66c76d52af14
md5sum 25d422cc23b44c3bbd7a66c76d52af18
md5sum 25d422cc23b44c3bbd7a66c76d52af17
md5sum 25d422cc23b44c3bbd7a66c76d52af77
md5sum 25d422cc23b44c3bbd7a66c76d52af06
md5sum 25d422cc23b44c3bbd7a66c76d52af05
md5sum 25d422cc23b44c3bbd7a66c76d52af72
md5sum 25d422cc23b44c3bbd7a66c76d52af73
md5sum 25d422cc23b44c3bbd7a66c76d52af74
md5sum 25d422cc23b44c3bbd7a66c76d52af75
md5sum 25d422cc23b44c3bbd7a66c76d52af76
resulting in
md5sum 25d422cc23b44c3bbd7a66c76d52af46 4
md5sum 25d422cc23b44c3bbd7a66c76d52af47 5
md5sum 25d422cc23b44c3bbd7a66c76d52af48 6
md5sum 25d422cc23b44c3bbd7a66c76d52af41 8
md5sum 25d422cc23b44c3bbd7a66c76d52af22 9
md5sum 25d422cc23b44c3bbd7a66c76d52af33 12
md5sum 25d422cc23b44c3bbd7a66c76d52af12 13
md5sum 25d422cc23b44c3bbd7a66c76d52af01 16
md5sum 25d422cc23b44c3bbd7a66c76d52af55 17
md5sum 25d422cc23b44c3bbd7a66c76d52af14 19
md5sum 25d422cc23b44c3bbd7a66c76d52af18 20
md5sum 25d422cc23b44c3bbd7a66c76d52af17 23
md5sum 25d422cc23b44c3bbd7a66c76d52af77 24
md5sum 25d422cc23b44c3bbd7a66c76d52af06 25
md5sum 25d422cc23b44c3bbd7a66c76d52af05 26
md5sum 25d422cc23b44c3bbd7a66c76d52af72 27
md5sum 25d422cc23b44c3bbd7a66c76d52af73 28
md5sum 25d422cc23b44c3bbd7a66c76d52af74 29
md5sum 25d422cc23b44c3bbd7a66c76d52af75 30
md5sum 25d422cc23b44c3bbd7a66c76d52af76 31
E.g. if the next file2 has 22 rows/lines, the numbers would continue up to 32, for example.
I believe there should be a better way to do this, e.g. by also putting the numbers from file1 column $4 into an array and handling the remaining logic there.
Answer 1:
awk to the rescue! No need to bring bash into the script. awk is a fully fledged programming language, especially for text processing.
$ awk 'NR==FNR{a[$NF]; next} {while(++c in a); print $0, c}' file1 file2
md5sum 25d422cc23b44c3bbd7a66c76d52af46 4
md5sum 25d422cc23b44c3bbd7a66c76d52af47 5
md5sum 25d422cc23b44c3bbd7a66c76d52af48 6
md5sum 25d422cc23b44c3bbd7a66c76d52af41 8
md5sum 25d422cc23b44c3bbd7a66c76d52af22 9
md5sum 25d422cc23b44c3bbd7a66c76d52af33 12
md5sum 25d422cc23b44c3bbd7a66c76d52af12 13
md5sum 25d422cc23b44c3bbd7a66c76d52af01 16
md5sum 25d422cc23b44c3bbd7a66c76d52af55 17
md5sum 25d422cc23b44c3bbd7a66c76d52af14 20
md5sum 25d422cc23b44c3bbd7a66c76d52af18 23
md5sum 25d422cc23b44c3bbd7a66c76d52af17 24
md5sum 25d422cc23b44c3bbd7a66c76d52af77 25
md5sum 25d422cc23b44c3bbd7a66c76d52af06 26
md5sum 25d422cc23b44c3bbd7a66c76d52af05 27
md5sum 25d422cc23b44c3bbd7a66c76d52af72 28
md5sum 25d422cc23b44c3bbd7a66c76d52af73 29
md5sum 25d422cc23b44c3bbd7a66c76d52af74 30
md5sum 25d422cc23b44c3bbd7a66c76d52af75 31
md5sum 25d422cc23b44c3bbd7a66c76d52af76 32
Note that 19 is in your first file, so it's skipped in the output. Your expected output is not consistent with your spec for the given input.
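For readers who find the one-liner dense, here is the same idea spelled out with comments, as a minimal sketch. The tiny input files and the names tmp, used and result are made up for illustration and are not the files from the question.

```shell
#!/bin/sh
# Hypothetical miniature inputs (assumed for illustration only).
tmp=$(mktemp -d)
printf '%s\n' 'A R5 A48 1' 'B R5 A48 2' 'C R4 A48 5' > "$tmp/file1"
printf '%s\n' 'md5sum aaa' 'md5sum bbb' 'md5sum ccc' > "$tmp/file2"

result=$(awk '
    # First file (NR == FNR): record each last-field value as an array key.
    NR == FNR { used[$NF]; next }
    # Second file: advance the counter c past every number already used,
    # then append the first free number to the current line.
    {
        while (++c in used) ;
        print $0, c
    }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Here 1, 2 and 5 are taken, so the three lines of the second file get 3, 4 and 6. Because the counter is never reset, the numbering simply continues past the largest value in the first file for as many lines as the second file has. One caveat of the NR==FNR idiom: if the first file is empty, the lines of the second file are treated as if they belonged to the first.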
Source: https://stackoverflow.com/questions/61719638/awk-find-missing-number-in-sequence-from-file1-and-append-to-column-in-file2