Awk: Word frequency from one text file, how to output into myFile.txt?

心在旅途 2021-01-25 22:51

Given a .txt file with space-separated words such as:

But where is Esope the holly Bastard
But where is

And the Awk f

3 Answers
  •  遥遥无期
    2021-01-25 23:17

    Your pipeline isn't very efficient; you should do the whole thing in awk instead:

    awk '{a[$1]++}END{for(k in a)print a[k],k}' RS=" |\n" file > myfile
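
The one-liner above depends on a regular-expression RS, which gawk and mawk accept but which POSIX leaves unspecified when RS is longer than one character. A minimal sketch that avoids this by looping over the fields of each line, assuming words are separated by whitespace (count_words is a name made up for this example, not part of the original answer):

```shell
# count_words FILE: print "count word" for every whitespace-separated word.
# (count_words is a hypothetical helper name used only in this sketch.)
count_words() {
    awk '
        { for (i = 1; i <= NF; i++) freq[$i]++ }   # tally every field on the line
        END { for (w in freq) print freq[w], w }   # iteration order is unspecified
    ' "$1"
}
```

For example, `count_words file > myFile.txt` would write the counts to myFile.txt.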
    

    If you want the output in sorted order:

    awk '{a[$1]++}END{for(k in a)print a[k],k}' RS=" |\n" file | sort > myfile
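
Plain sort compares lines as text, so a count of 10 would sort before a count of 2. A sketch that sorts numerically with the most frequent word first, assuming the "count word" output format used above (top_words is a hypothetical name, and the field loop replaces the regex RS):

```shell
# top_words FILE: word counts, highest count first, ties ordered by word.
# (top_words is a hypothetical helper name used only in this sketch.)
top_words() {
    awk '
        { for (i = 1; i <= NF; i++) freq[$i]++ }   # tally every field on the line
        END { for (w in freq) print freq[w], w }   # emit "count word" pairs
    ' "$1" |
    sort -k1,1nr -k2,2   # field 1 numeric descending, then field 2 ascending
}
```

For example, `top_words file > myfile` keeps the same output format but orders it by frequency.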
    

    The actual output given by your pipeline is:

    $ tr ' ' '\n' < file | sort | uniq -c | awk '{print $2"@"$1}'
    Bastard@1
    But@2
    Esope@1
    holly@1
    is@2
    the@1
    where@2
    

    Note: using cat is useless here; we can just redirect the input with <. The awk script doesn't make sense either: it just swaps the order of each word and its frequency and separates them with an @. If we drop the awk script, the output is closer to the desired output (notice, however, the leading spaces, and that it is sorted by word rather than by count):

    $ tr ' ' '\n' < file | sort | uniq -c 
          1 Bastard
          2 But
          1 Esope
          1 holly
          2 is
          1 the
          2 where
    

    We could sort again and remove the leading spaces with sed:

    $ tr ' ' '\n' < file | sort | uniq -c | sort | sed 's/^\s*//'
    1 Bastard
    1 Esope
    1 holly
    1 the
    2 But
    2 is
    2 where
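
The \s in the sed expression is a GNU extension, and tr ' ' '\n' produces empty lines when words are separated by more than one space. A hedged variant of the same pipeline, assuming a POSIX-style tr, sort, uniq and sed (count_words_pipeline is a hypothetical name):

```shell
# count_words_pipeline FILE: same idea as the pipeline above, written portably.
# (count_words_pipeline is a hypothetical helper name used only in this sketch.)
count_words_pipeline() {
    tr -s ' ' '\n' < "$1" |      # one word per line; -s squeezes repeated newlines
    sort |                       # group identical words so uniq can count them
    uniq -c |                    # prefix each distinct word with its count
    sort -n |                    # order by count, ascending
    sed 's/^[[:space:]]*//'      # strip the padding that uniq -c adds
}
```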
    

    But, as I mentioned at the start, let awk handle it:

    $ awk '{a[$1]++}END{for(k in a)print a[k],k}' RS=" |\n" file | sort
    1 Bastard
    1 Esope
    1 holly
    1 the
    2 But
    2 is
    2 where
    
