awk to compare two files by identifier & output in a specific format

Submitted by ╄→гoц情女王★ on 2019-12-13 23:44:52

Question


I have two large files, both pipe-delimited, that I need to compare.

file 1

a||d||f||a
1||2||3||4

file 2

a||d||f||a
1||1||3||4
1||2||r||f

Now I want to compare the files and print the differences: any field that was updated in file 2 should be printed as updated_value#old_value, and any new line added to file 2 should be printed as well.

So the desired output (only the updated and new data) is:

1||1#2||3||4
1||2||r||f

What I have tried so far prints only the changed values, each on its own line:

awk -F '[||]+' 'NR==FNR{for(i=1;i<=NF;i++)a[NR,i]=$i;next}{for(i=1;i<=NF;i++)if(a[FNR,i]!=$i)print $i"#"a[FNR,i]}' file1 file2 >output

But I want to print the whole line. How can I achieve that?


Answer 1:


I would say:

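awk 'BEGIN{FS=OFS="|"}
     # first file: remember every data field by line number and column
     FNR==NR {for (i=1;i<=NF;i+=2) a[FNR,i]=$i; next}
     # second file: rewrite each field that differs as new#old
     {for (i=1; i<=NF; i+=2)
         if (a[FNR,i] && a[FNR,i]!=$i)
             $i=$i"#"a[FNR,i]
     }1' f1 f2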

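This stores file1 in an array a[line number, column]. Then, for each line of file2, it compares every field with the value stored for the same line and column of file1, and rewrites the field as new#old when they differ. The trailing 1 prints every line of file2, changed or not.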
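A side note on the a[line number, column] array: awk joins the two subscripts into one string key (using SUBSEP), and the a[FNR,i] && test in the code above checks the stored value, not whether the key exists. The sketch below uses a made-up one-line input (0|x is only an illustration, not data from the question) to show the difference, which could matter if file1 contains 0 or empty fields:

```shell
# Hypothetical input: a single line whose first field is 0.
result=$(printf '%s\n' '0|x' | awk -F'|' '{
    a[NR, 1] = $1     # stored under the key NR SUBSEP 1

    # Value test: a field holding 0 (or an empty string) counts as false.
    print "value test:", (a[NR, 1] ? "true" : "false")

    # Key test: only asks whether the subscript exists in the array.
    print "key (NR,1):", (((NR, 1) in a) ? "present" : "absent")
    print "key (NR,9):", (((NR, 9) in a) ? "present" : "absent")
}')
printf '%s\n' "$result"
```

If such fields can occur, testing with ((FNR,i) in a) instead of the value may be the safer choice.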

Note that I am using the field separator | instead of || and looping in steps of two, so that only the data fields are used (with FS="|", every second field is the empty string between the two pipes). I did this because, for example, gawk -F'||' '{print NF}' f1 returned just 1, meaning that FS was not interpreted as I intended. I would be grateful if someone could point out the error here!
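A likely explanation for the NF problem: when FS is longer than one character, awk treats it as a regular expression, and | is the alternation operator there, so || is not read as two literal pipes. One way around it, sketched here with the file1 sample from the question (the temporary directory is just scaffolding for the example), is to put each pipe in a bracket expression:

```shell
dir=$(mktemp -d)
printf '%s\n' 'a||d||f||a' '1||2||3||4' > "$dir/f1"

# [|][|] matches exactly two literal pipe characters,
# so each sample line splits into four fields.
counts=$(awk -F'[|][|]' '{print NF}' "$dir/f1")
printf '%s\n' "$counts"
```

With a separator like this the loops could run over every field (i++ instead of i+=2), as long as OFS is set to "||" so that rewritten lines keep the original delimiter.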

Test

$ awk 'BEGIN{FS=OFS="|"} FNR==NR {for (i=1;i<=NF;i+=2) a[FNR,i]=$i; next} {for (i=1; i<=NF; i+=2) if (a[FNR,i] && a[FNR,i]!=$i) $i=$i"#"a[FNR,i]}1' f1 f2
a||d||f||b#a
1||1#2||3||4
1||2||r||f
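
The command above prints every line of file2, while the question asks for only the updated and new lines. A possible variation is sketched below; it is not part of the original answer. It assumes, like the answer, that lines are paired by line number, and also that file1 is not empty. The function name compare_files and the variables n and changed are made up for this example.

```shell
# Sample data copied from the question.
dir=$(mktemp -d)
printf '%s\n' 'a||d||f||a' '1||2||3||4' > "$dir/f1"
printf '%s\n' 'a||d||f||a' '1||1||3||4' '1||2||r||f' > "$dir/f2"

# compare_files OLD NEW: print only the changed or added lines of NEW.
compare_files() {
    awk 'BEGIN { FS = "[|][|]"; OFS = "||" }
         # first file: store each field and remember the line count
         FNR == NR { n = FNR; for (i = 1; i <= NF; i++) a[FNR, i] = $i; next }
         # lines beyond the end of the first file are new: print as they are
         FNR > n { print; next }
         {
             changed = 0
             for (i = 1; i <= NF; i++)
                 if (((FNR, i) in a) && a[FNR, i] != $i) {
                     $i = $i "#" a[FNR, i]    # new_value#old_value
                     changed = 1
                 }
             if (changed) print               # skip unchanged lines
         }' "$1" "$2"
}

compare_files "$dir/f1" "$dir/f2"
```

On the sample files from the question this should print the two lines given there as the desired output.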


Source: https://stackoverflow.com/questions/30300585/awk-to-compare-two-file-by-identifier-output-in-a-specific-format