Awk: Word frequency from one text file, how to output into myFile.txt?

Submitted by 左心房为你撑大大i on 2019-12-02 06:35:48

Your pipeline isn't very efficient; you should do the whole thing in awk instead:

awk '{a[$1]++}END{for(k in a)print a[k],k}' RS=" |\n" file > myfile

If you want the output in sorted order:

awk '{a[$1]++}END{for(k in a)print a[k],k}' RS=" |\n" file | sort > myfile
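
Setting RS to a regular expression such as " |\n" works in gawk and mawk, but POSIX leaves the behaviour of a multi-character RS unspecified, so other awk implementations may treat it differently. Here is a minimal sketch of a variant that loops over the fields of each line instead, assuming the words are separated by whitespace. The function name word_freq and the sample input are made up for illustration:

```shell
# Hypothetical helper: count whitespace-separated words read from stdin.
# It loops over the fields of each line instead of relying on a regex RS,
# so repeated spaces and blank lines do not produce an empty "word".
word_freq() {
    awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
         END { for (w in count) print count[w], w }'
}

# Sample input invented for this example; replace with "< file" in practice.
printf 'But where is\nBut where\n' | word_freq | LC_ALL=C sort
# prints:
# 1 is
# 2 But
# 2 where
```

To write the result to a file, append the same redirection as above, for example: word_freq < file | sort > myfile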

The actual output given by your pipeline is:

$ tr ' ' '\n' < file | sort | uniq -c | awk '{print $2"@"$1}'
Bastard@1
But@2
Esope@1
holly@1
is@2
the@1
where@2

Note: using cat is useless here; we can just redirect the input with <. The awk script doesn't make much sense either: it just swaps the order of each word and its frequency and separates them with an @. If we drop the awk script, the output is closer to the desired output (notice, however, the leading spaces, and that it isn't sorted by frequency):

$ tr ' ' '\n' < file | sort | uniq -c 
      1 Bastard
      2 But
      1 Esope
      1 holly
      2 is
      1 the
      2 where

We could sort again and remove the leading spaces with sed:

$ tr ' ' '\n' < file | sort | uniq -c | sort | sed 's/^\s*//'
1 Bastard
1 Esope
1 holly
1 the
2 But
2 is
2 where

But as I mentioned at the start, let awk handle it:

$ awk '{a[$1]++}END{for(k in a)print a[k],k}' RS=" |\n" file | sort
1 Bastard
1 Esope
1 holly
1 the
2 But
2 is
2 where
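
One caveat: a plain sort compares the lines as text, so once a count reaches two digits, 10 would be placed before 2. A small sketch of sorting numerically on the count instead, highest first, with ties broken by the word. The function name sort_by_count and the sample counts are invented for illustration:

```shell
# Hypothetical helper: sort "count word" lines by count (numeric, descending),
# then by word. LC_ALL=C keeps the tie-break order byte-based and predictable.
sort_by_count() {
    LC_ALL=C sort -k1,1nr -k2,2
}

# Sample counts made up for this example.
printf '2 is\n10 the\n1 holly\n' | sort_by_count
# prints:
# 10 the
# 2 is
# 1 holly
```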

Just redirect output to a file.

cat /pathway/to/your/file.txt | tr ' ' '\n' | sort | uniq -c | \
awk '{print $2"@"$1}' > myFile.txt

Just use shell redirection:

 echo "test" > overwrite-file.txt
 echo "test" >> append-to-file.txt

Tips

A useful command is tee, which lets you redirect output to a file and still see it on the terminal:

echo "test" | tee overwrite-file.txt
echo "test" | tee -a append-file.txt

Sorting and locale

I see you are working with Asian script. You need to be careful with the locale used by your system, as the resulting sort order might not be what you expect:

* WARNING * The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values.

And have a look at the output of:

locale 
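
To see what the warning means in practice, here is a small sketch that forces the C locale for a single sort call. The four input lines are made up for illustration. Under LC_ALL=C the comparison uses byte values, so all uppercase ASCII letters come before the lowercase ones; under another locale (for example en_US.UTF-8, if it is installed) the same input may come out in a different order:

```shell
# Force byte-value ordering for this one command only;
# the rest of the shell session keeps its normal locale.
printf 'b\nA\na\nB\n' | LC_ALL=C sort
# prints:
# A
# B
# a
# b
```

Setting the variable in front of the command, as here, avoids changing the locale for anything else in the pipeline.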