Awk: Word frequency from one text file, how to output into myFile.txt?

心在旅途 2021-01-25 22:51

Given a .txt file with space-separated words such as:

But where is Esope the holly Bastard
But where is

And the Awk f

3 answers
  • 2021-01-25 23:17

    Your pipeline isn't very efficient; you should do the whole thing in awk instead:

    awk '{a[$1]++}END{for(k in a)print a[k],k}' RS=" |\n" file > myfile
    

    If you want the output in sorted order:

    awk '{a[$1]++}END{for(k in a)print a[k],k}' RS=" |\n" file | sort > myfile
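
    Note that a plain sort compares the lines as text, so a count of 10 would be placed before a count of 2. A minimal sketch of a numeric sort on the count, assuming made-up data in a hypothetical sample.txt and an awk that accepts a regex RS (gawk or mawk, for instance):

```shell
# Hypothetical sample: "a" occurs 10 times, "b" twice, "c" once.
printf 'a a a a a a a a a a\nb b\nc\n' > sample.txt

# Count the words, then sort by count (numeric, descending);
# ties are broken by the word itself.
awk '{a[$1]++}END{for(k in a)print a[k],k}' RS=" |\n" sample.txt |
    sort -k1,1nr -k2,2 > myfile

cat myfile
```

    Drop the r in -k1,1nr if you would rather have the least frequent words first.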
    

    The actual output given by your pipeline is:

    $ tr ' ' '\n' < file | sort | uniq -c | awk '{print $2"@"$1}'
    Bastard@1
    But@2
    Esope@1
    holly@1
    is@2
    the@1
    where@2
    

    Note: cat is unnecessary here; we can redirect the input with < instead. The awk script doesn't add much either: it just swaps the word and its count and joins them with an @. If we drop the awk script, the output is closer to the desired output (note, however, the leading spaces, and that it is not sorted by count):

    $ tr ' ' '\n' < file | sort | uniq -c 
          1 Bastard
          2 But
          1 Esope
          1 holly
          2 is
          1 the
          2 where
    

    We could sort again and remove the leading spaces with sed:

    $ tr ' ' '\n' < file | sort | uniq -c | sort | sed 's/^\s*//'
    1 Bastard
    1 Esope
    1 holly
    1 the
    2 But
    2 is
    2 where
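
    One caveat: \s in sed is a GNU extension, and tr ' ' '\n' produces empty lines when words are separated by more than one space. A possible more portable variant, sketched on a hypothetical sample.txt that contains a doubled space:

```shell
# Hypothetical sample; note the two spaces after the second "But".
printf 'But where is Esope the holly Bastard\nBut  where is\n' > sample.txt

# tr -s squeezes runs of whitespace into a single newline,
# sort -k1,1n orders by count numerically (ties by word),
# and [[:space:]] is the POSIX spelling of GNU's \s.
tr -s '[:space:]' '\n' < sample.txt | sort | uniq -c |
    sort -k1,1n -k2,2 | sed 's/^[[:space:]]*//' > counts.txt

cat counts.txt
```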
    

    But, as I mentioned at the start, let awk handle it:

    $ awk '{a[$1]++}END{for(k in a)print a[k],k}' RS=" |\n" file | sort
    1 Bastard
    1 Esope
    1 holly
    1 the
    2 But
    2 is
    2 where
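
    If your awk does not accept a regular expression in RS (POSIX leaves a multi-character RS unspecified), looping over the fields of each line should give the same counts. A minimal sketch, assuming a hypothetical sample.txt; the array name count is just illustrative:

```shell
# Hypothetical sample input.
printf 'But where is Esope the holly Bastard\nBut  where is\n' > sample.txt

# Default field splitting already handles repeated spaces and tabs,
# so every field on every line is counted as one word.
awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
     END { for (w in count) print count[w], w }' sample.txt |
    sort -k1,1n -k2,2 > myFile.txt

cat myFile.txt
```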
    
  • 2021-01-25 23:29

    Just redirect output to a file.

    cat /pathway/to/your/file.txt | tr ' ' '\n' | sort | uniq -c | \
    awk '{print $2"@"$1}' > myFile.txt
    
  • 2021-01-25 23:41

    Just use shell redirection:

     echo "test" > overwrite-file.txt
     echo "test" >> append-to-file.txt
    

    Tips

    A useful command is tee, which lets you write to a file and still see the output:

    echo "test" | tee overwrite-file.txt
    echo "test" | tee -a append-file.txt
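
    Applied to the pipeline from the question, that could look like the sketch below (sample.txt is a hypothetical stand-in for your own file):

```shell
# Hypothetical sample input taken from the question's example lines.
printf 'But where is Esope the holly Bastard\nBut where is\n' > sample.txt

# tee writes the result to myFile.txt and also passes it on to the terminal.
tr ' ' '\n' < sample.txt | sort | uniq -c | awk '{print $2"@"$1}' | tee myFile.txt
```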
    

    Sorting and locale

    I see you are working with Asian script, so you need to be careful with the locale used by your system, as the resulting sort order might not be what you expect:

    * WARNING * The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values.

    And have a look at the output of:

    locale 
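
    LC_ALL can also be set for a single command rather than for the whole session. A small sketch with made-up words in a hypothetical words.txt:

```shell
# Hypothetical sample with mixed case.
printf 'apple\nBanana\ncherry\n' > words.txt

# Byte-value order: uppercase letters sort before lowercase ones.
LC_ALL=C sort words.txt > sorted_c.txt
cat sorted_c.txt

# Order under whatever locale the environment currently sets;
# this may or may not differ from the result above.
sort words.txt
```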
    