I am trying to find the frequency of every letter of the English alphabet in an input file. How can I do this in a bash script?
A solution with sed, sort and uniq:
sed 's/\(.\)/\1\n/g' file | sort | uniq -c
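This counts every character, not only letters. It also relies on GNU sed, which treats \n in the replacement as a newline; other sed implementations may insert a literal n instead. To keep only letters, filter with grep: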
sed 's/\(.\)/\1\n/g' file | grep '[A-Za-z]' | sort | uniq -c
To treat uppercase and lowercase as the same letter, add a translation with tr:
sed 's/\(.\)/\1\n/g' file | tr '[:upper:]' '[:lower:]' | grep '[a-z]' | sort | uniq -c
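If GNU sed is not available, the same case-insensitive count can probably be obtained without sed at all. The following is only a sketch: the function name letter_freq is made up for illustration, and it assumes that only the ASCII letters A-Z and a-z should be counted, which is why it forces the C locale.

```shell
#!/usr/bin/env bash
# letter_freq is a hypothetical helper name, not part of the original answer.
# Assumption: only ASCII letters matter, so LC_ALL=C is forced for every stage.
letter_freq() {
    # 1. delete every byte that is not a letter (newlines included)
    # 2. fold uppercase to lowercase
    # 3. fold -w1 puts each remaining letter on its own line
    # 4. sort groups identical letters so that uniq -c can count them
    LC_ALL=C tr -cd '[:alpha:]' < "$1" \
        | LC_ALL=C tr '[:upper:]' '[:lower:]' \
        | LC_ALL=C fold -w1 \
        | LC_ALL=C sort \
        | uniq -c
}
```

Calling letter_freq file prints one line per letter that occurs in the file, with the count first and the letter second, in the same format as the pipelines above.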