I am trying to combine data from two different files. In each file, some data is linked to some ID. I want to 'combine' both files in the sense that all IDs mus
Does it have to be awk, or did you choose it because you think it's the best or easiest way?
You can do this with join. Both files must be sorted on the ID field, and -o auto is a GNU extension:
$ join -j 1 -a 1 -a 2 -o auto file_1 file_2 | column -t -s' ' -o' '
1.01 data_a data_aa
1.02 data_b
1.03 data_c data_cc
1.04 data_d
1.05 data_e data_ee
1.06 data_f
1.09 data_ii
Edit: As per the excellent suggestion from KamilCuk, you can format the output afterwards by piping it through column.
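If -o auto is not available, an explicit field list should do the same job. The following is only a sketch: it assumes two-column, space-separated files, the sample contents are reconstructed from the output shown above, and the - placeholder passed to -e is an arbitrary choice for marking missing data.

```shell
# Hypothetical sample inputs, reconstructed from the output shown above.
tmpdir=$(mktemp -d)
printf '%s\n' '1.01 data_a' '1.02 data_b' '1.03 data_c' \
              '1.04 data_d' '1.05 data_e' '1.06 data_f' > "$tmpdir/file_1"
printf '%s\n' '1.01 data_aa' '1.03 data_cc' \
              '1.05 data_ee' '1.09 data_ii' > "$tmpdir/file_2"

# join expects both inputs sorted on the join field, in the same locale.
LC_ALL=C sort -k1,1 -o "$tmpdir/file_1" "$tmpdir/file_1"
LC_ALL=C sort -k1,1 -o "$tmpdir/file_2" "$tmpdir/file_2"

# -a 1 -a 2     : also print lines that have no partner in the other file
# -o 0,1.2,2.2  : output the join field, then field 2 of each file
# -e '-'        : fill fields that are missing with a placeholder
joined=$(LC_ALL=C join -a 1 -a 2 -e '-' -o 0,1.2,2.2 \
         "$tmpdir/file_1" "$tmpdir/file_2")
printf '%s\n' "$joined"

rm -rf "$tmpdir"
```

With the placeholder in place, every line has exactly three fields, so an ID that exists only in file_2 (1.09 here) keeps its data in the third column instead of sliding into the second.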
1st solution: If your Input_file(s) contain duplicate values of $1, the following handles that case as well.
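awk '
BEGIN{
  OFS="\t"
}
FNR==NR{                  # first file on the command line: store data by ID
  a[$1]=$2
  next
}
$1 in a{                  # ID present in both files
  print $1,a[$1],$2
  c[$1]                   # mark the ID as matched; a[$1] is kept for later duplicates
  next
}
{
  b[$1]=$2                # ID present only in the second file
}
END{
  for(i in a){
    if(!(i in c)){
      print i,a[i],""     # only in the first file: empty third column
    }
  }
  for(j in b){
    print j,"",b[j]       # only in the second file: empty second column
  }
}
' Input_file2 Input_file1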
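To see what the duplicate handling means in practice, here is a small sketch with made-up input in which the ID 1.01 occurs twice in the file that is read second. The file contents are hypothetical, and the awk program is a condensed restatement of the logic above (with empty strings for the missing columns), not a replacement for it.

```shell
# Hypothetical inputs: the ID 1.01 appears twice in the file read second.
tmpdir=$(mktemp -d)
printf '%s\n' '1.01 data_aa' > "$tmpdir/Input_file2"
printf '%s\n' '1.01 data_a1' '1.01 data_a2' > "$tmpdir/Input_file1"

dup_out=$(awk '
BEGIN { OFS = "\t" }
FNR == NR { a[$1] = $2; next }                   # first file: store data by ID
$1 in a   { print $1, a[$1], $2; c[$1]; next }   # match: print, mark as seen, keep a[$1]
          { b[$1] = $2 }                         # ID only in the second file
END {
  for (i in a) if (!(i in c)) print i, a[i], ""  # unmatched IDs from the first file
  for (j in b) print j, "", b[j]                 # unmatched IDs from the second file
}' "$tmpdir/Input_file2" "$tmpdir/Input_file1")
printf '%s\n' "$dup_out"

rm -rf "$tmpdir"
```

Both 1.01 lines are paired with data_aa because a[$1] is kept after the first match. Duplicates inside the file that is read first still overwrite each other in a[], so only the last of them survives.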
2nd solution: Try the following if you are NOT worried about the order of the output. You do not need to run that many commands; you can simply pass your input files to this code.
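awk '
BEGIN{
  OFS="\t"
}
FNR==NR{                  # first file on the command line: store data by ID
  a[$1]=$2
  next
}
$1 in a{                  # ID present in both files
  print $1,a[$1],$2
  delete a[$1]            # matched, so it must not be printed again in END
  next
}
{
  b[$1]=$2                # ID present only in the second file
}
END{
  for(i in a){
    print i,a[i],""       # only in the first file: empty third column
  }
  for(j in b){
    print j,"",b[j]       # only in the second file: empty second column
  }
}
' file2 file1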
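The for (i in a) loops visit array elements in an unspecified order, so the lines printed from END can come out in any order. If the order does matter, one option is to pipe the result through sort. A minimal sketch, assuming hypothetical sample files reconstructed from the join output earlier in this thread, and using a condensed form of the program above:

```shell
# Hypothetical sample inputs, reconstructed from the join output in this thread.
tmpdir=$(mktemp -d)
printf '%s\n' '1.01 data_a' '1.02 data_b' '1.03 data_c' \
              '1.04 data_d' '1.05 data_e' '1.06 data_f' > "$tmpdir/file1"
printf '%s\n' '1.01 data_aa' '1.03 data_cc' \
              '1.05 data_ee' '1.09 data_ii' > "$tmpdir/file2"

sorted=$(awk '
BEGIN { OFS = "\t" }
FNR == NR { a[$1] = $2; next }                          # first file: store data by ID
$1 in a   { print $1, a[$1], $2; delete a[$1]; next }   # ID in both files
          { b[$1] = $2 }                                # ID only in the second file
END {
  for (i in a) print i, a[i], ""                        # left over from the first file
  for (j in b) print j, "", b[j]                        # left over from the second file
}' "$tmpdir/file2" "$tmpdir/file1" | LC_ALL=C sort -k1,1)   # order lines by ID
printf '%s\n' "$sorted"

rm -rf "$tmpdir"
```

Because file2 is read first, its data ends up in the second column and the data from file1 in the third; swap the two arguments to swap the columns.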