Matching data to correct ID from two files in awk

耶瑟儿~ 2021-01-16 06:23

I am trying to combine data from two different files. In each file, some data is linked to some ID. I want to 'combine' both files in the sense that all IDs mus

2 Answers
  • 2021-01-16 07:03

    Does it have to be awk, or did you choose it because you thought it was the best or easiest way?

    You can also do this with join:

    $ join -j 1 -a 1 -a 2 -o auto file_1 file_2 | column -t -s' ' -o' '
    1.01 data_a data_aa
    1.02 data_b
    1.03 data_c data_cc
    1.04 data_d
    1.05 data_e data_ee
    1.06 data_f
    1.09        data_ii
    

    Edit: As per the excellent suggestion from KamilCuk, you can preserve the column alignment of the output by piping it through column afterwards.
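
For reference, here is a minimal sketch of the join approach that marks missing values with a visible placeholder. The sample input files are an assumption, reconstructed from the output shown above, and the `-` placeholder passed via `-e` is only an illustrative choice. Note that `-o auto` is a GNU coreutils extension, so this may not work with other join implementations.

```shell
#!/bin/sh
# Sample inputs (assumed; reconstructed from the output shown in the answer).
dir=$(mktemp -d)
printf '%s\n' '1.01 data_a' '1.02 data_b' '1.03 data_c' \
              '1.04 data_d' '1.05 data_e' '1.06 data_f' > "$dir/file_1"
printf '%s\n' '1.01 data_aa' '1.03 data_cc' \
              '1.05 data_ee' '1.09 data_ii' > "$dir/file_2"

# Full outer join on field 1.
#   -a 1 -a 2 : also print unpairable lines from either file
#   -o auto   : GNU extension; give every output line the same number of fields
#   -e '-'    : text printed in place of a missing field (illustrative choice)
joined=$(join -j 1 -a 1 -a 2 -e '-' -o auto "$dir/file_1" "$dir/file_2")
printf '%s\n' "$joined"

rm -rf "$dir"
```

Both inputs must already be sorted on the join field; if they are not, sort them first.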

  • 2021-01-16 07:05

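    1st solution: If your Input_file(s) contain duplicate values of $1, the following handles that case as well. Matched IDs are recorded in a separate array (c) instead of being deleted from a, so a repeated ID in the second file still finds its match.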

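    awk '
    BEGIN{
      OFS="\t"              # tab-separated output
    }
    FNR==NR{                # first file on the command line (Input_file2)
      a[$1]=$2
      next
    }
    $1 in a{                # ID present in both files
      print $1,a[$1],$2
      c[$1]                 # mark the ID as matched; a[$1] stays for duplicates
      next
    }
    {
      b[$1]=$2              # ID found only in the second file (Input_file1)
    }
    END{
      for(i in a){
        if(!(i in c)){
          print i,a[i],""   # ID only in the first file: empty third column
        }
      }
      for(j in b){
        print j,"",b[j]     # ID only in the second file: empty second column
      }
    }
    ' Input_file2  Input_file1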
    


    2nd solution: Try the following if you are NOT worried about the order of the output. There is no need to run several commands; you can simply pass both input files to this code.

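    awk '
    BEGIN{
      OFS="\t"              # tab-separated output
    }
    FNR==NR{                # first file on the command line (file2)
      a[$1]=$2
      next
    }
    $1 in a{                # ID present in both files
      print $1,a[$1],$2
      delete a[$1]          # matched: only unmatched IDs remain in a for END
      next
    }
    {
      b[$1]=$2              # ID found only in the second file (file1)
    }
    END{
      for(i in a){
        print i,a[i],""     # ID only in the first file: empty third column
      }
      for(j in b){
        print j,"",b[j]     # ID only in the second file: empty second column
      }
    }
    ' file2 file1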
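
If the order of the output does matter, one possible variation is to collect every ID first and sort the result at the end. This is only a sketch: the sample files, the `ids` array, and the `-` placeholder for missing values are assumptions made for illustration and are not part of the answers above.

```shell
#!/bin/sh
# Sample inputs (assumed for illustration).
dir=$(mktemp -d)
printf '%s\n' '1.01 data_a' '1.02 data_b' '1.03 data_c' \
              '1.04 data_d' '1.05 data_e' '1.06 data_f' > "$dir/file1"
printf '%s\n' '1.01 data_aa' '1.03 data_cc' \
              '1.05 data_ee' '1.09 data_ii' > "$dir/file2"

merged=$(awk '
BEGIN { OFS = "\t" }
FNR == NR { a[$1] = $2; ids[$1]; next }   # first file: remember value and ID
          { b[$1] = $2; ids[$1] }         # second file: same, in its own array
END {
  # One line per ID seen in either file; "-" marks a missing value.
  for (id in ids)
    print id, ((id in a) ? a[id] : "-"), ((id in b) ? b[id] : "-")
}' "$dir/file1" "$dir/file2" | sort -k1,1)   # for..in order is unspecified, so sort

printf '%s\n' "$merged"
rm -rf "$dir"
```

Because the first file is passed first here, the second column holds its values and the third column holds those of the second file. Unlike the 1st solution, this sketch keeps only the last value seen for a duplicated ID.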
    