Join two files on Linux

予麋鹿 2021-01-15 21:51

I have two files; I want to join them.

$ cat t1
 1 1.2
 2 2.2
$ cat t2
 1
 2
 1

I want to have the output below

$cat joind.tx         


        
5 answers
  • 2021-01-15 22:17

    Something like the following will do:

    $ while IFS= read -r line; do grep -m 1 "^$line" t1; done <t2
     1 1.2
     2 2.2
     1 1.2
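
    One caveat: the pattern "^$line" matches any line that starts with the key, so the key 1 would also match a line starting with 10. A minimal sketch of an exact first-field lookup, assuming whitespace-separated columns; the sample data, the extra key 10, and the helper name lookup are made up for illustration:

    ```shell
    # Hypothetical sample data modelled on the question, plus a key "10"
    # placed first to show the prefix problem.
    tmp=$(mktemp -d)
    printf ' 10 9.9\n 1 1.2\n 2 2.2\n' > "$tmp/t1"
    printf ' 1\n 2\n 1\n' > "$tmp/t2"

    lookup() {
      # $1 = dictionary file, $2 = file with the wanted keys
      while read -r key; do              # default IFS trims surrounding blanks
        [ -n "$key" ] || continue        # skip empty lines
        # ($1 "") forces a string comparison of the first field with the key;
        # print the first matching line and stop, like grep -m 1.
        awk -v k="$key" '($1 "") == k { print; exit }' "$1"
      done < "$2"
    }

    result=$(lookup "$tmp/t1" "$tmp/t2")
    printf '%s\n' "$result"
    rm -rf "$tmp"
    ```

    With this data, grep -m 1 "^ 1" would return the line for key 10, while the exact comparison returns the line for key 1.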
    
  • 2021-01-15 22:32

    A simple awk command suffices for this:

    awk 'FNR==NR{a[$1]=$2;next} {print $1, a[$1]}' t1 t2
    1 1.2
    2 2.2
    1 1.2
    

    Breakdown:

    NR == FNR {                  # While processing the first file
      a[$1] = $2                 # store the second field by the first
      next                       # move to next record in 1st file
    }
    {                            # while processing the second file
      print $1, a[$1]            # print $1 and the remembered
                                 # value from the first file.
    }
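
    The breakdown above does not say what happens when a key in t2 has no entry in t1: the one-liner prints the key followed by an empty value. A minimal sketch that prints a placeholder instead, assuming that is the behaviour you want; the placeholder string NA and the sample key 3 are my own choices, not part of the question:

    ```shell
    # Hypothetical sample data: key 3 appears in t2 but not in t1.
    tmp=$(mktemp -d)
    printf ' 1 1.2\n 2 2.2\n' > "$tmp/t1"
    printf ' 1\n 3\n 2\n' > "$tmp/t2"

    result=$(awk '
      NR == FNR { a[$1] = $2; next }               # first file: remember value by key
      { print $1, (($1 in a) ? a[$1] : "NA") }     # second file: look up, or placeholder
    ' "$tmp/t1" "$tmp/t2")

    printf '%s\n' "$result"
    rm -rf "$tmp"
    ```

    The test ($1 in a) checks for the key without creating an empty array entry, which a plain a[$1] reference would do.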
    
  • 2021-01-15 22:34

    If I understand correctly, you want to match the first column of t1 against the values in t2, so t1 is a dictionary and t2 holds the wanted keys.

    If so, you can use this:

    $ cat t2 | xargs -n1 -I{} grep -P "^\Q{}\E\s+" t1
    

    How does it work?

    xargs executes grep once for each entry of t2 (-n1). The -I{} option lets me put the value wherever I want in the command.

    grep then matches the wanted value in the dictionary using a regular expression:

    ^    # Any line that begins with
    \Q   # Quote the value (in case we have special chars inside it)
    {}   # The value substituted by xargs
    \E   # End of quoting
    \s+  # Followed by one or more spaces (alternatively we can use `\b`)
    .*   # Followed by anything (optional)
    
    t1   # Inside the file `t1`
    

    Alternatively you can play with Perl :)

    cat t2 | perl -e '$_ = qx{cat $ARGV[0]};                # slurp t1
          $t1{$1} = $2 while /^(\w+)\s+(.*)/gm;             # key => rest of line
          print "$t1{$_}\n" for (split "\n", do { local $/; <STDIN> })' t1
    
  • 2021-01-15 22:38

    You can try awk:

    awk 'NR==FNR{a[$1]=$2}NR>FNR{print $1,a[$1]}' t1 t2

  • 2021-01-15 22:41

    join requires both files to be sorted on the join field. If you sort them first, you'll get all of the output:

    $ sort t1 > t1.sorted
    $ sort t2 > t2.sorted
    $ join -j1 -o 1.1,1.2 t1.sorted t2.sorted
    1 1.2
    1 1.2
    2 2.2
    

    Without the sort:

    $ join -j1 -o 1.1,1.2 t1 t2
    1 1.2
    2 2.2
    

    This assumes that the order of your inputs doesn't need to be preserved; if it does, you will need a custom script like the ones other answers have provided.
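
    If the order of t2 does matter, join can still be used by numbering the lines first, joining, and then sorting back by line number. A minimal sketch, assuming whitespace-separated fields with no embedded spaces; the sample data and the intermediate file names are made up for illustration:

    ```shell
    export LC_ALL=C                      # same collation for sort and join
    tmp=$(mktemp -d)
    printf ' 1 1.2\n 2 2.2\n' > "$tmp/t1"
    printf ' 1\n 2\n 1\n' > "$tmp/t2"

    # Sort the dictionary on its key, ignoring leading blanks as join does.
    sort -k1b,1 "$tmp/t1" > "$tmp/t1.sorted"

    # Decorate: prefix each key of t2 with its line number, then sort by key.
    awk '{ print NR, $1 }' "$tmp/t2" | sort -k2b,2 > "$tmp/t2.numbered"

    # Join on the key (field 2 of the numbered file, field 1 of t1),
    # restore the original order by line number, then drop the number.
    result=$(join -1 2 -2 1 -o 1.1,2.1,2.2 "$tmp/t2.numbered" "$tmp/t1.sorted" |
             sort -n -k1,1 | cut -d' ' -f2-)

    printf '%s\n' "$result"
    rm -rf "$tmp"
    ```

    Keys in t2 that have no match in t1 are dropped by join in this sketch.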
