Getting the count of unique values in a column in bash

一整个雨季 2021-01-30 08:14

I have tab delimited files with several columns. I want to count the frequency of occurrence of the different values in a column for all the files in a folder and sort them in d

5 answers

庸人自扰 (original poster)

2021-01-30 08:25
The GNU site suggests this nice awk script, which prints both the words and their frequency.

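Possible changes:
- You can pipe the output through sort -nr to see the result in descending order. For that to work, swap word and freq[word] in the printf (and the %s and %d in its format string) so the count comes first on each line.
- If you want a specific column, you can omit the for loop and simply write freq[$3]++, replacing 3 with the column number. For tab-delimited input, also run awk with -F'\t' so that values containing spaces stay in one field.
Here goes: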
```
 # wordfreq.awk --- print list of word frequencies

 {
     $0 = tolower($0)    # remove case distinctions
     # remove punctuation
     gsub(/[^[:alnum:]_[:blank:]]/, "", $0)
     for (i = 1; i <= NF; i++)
         freq[$i]++
 }

 END {
     for (word in freq)
         printf "%s\t%d\n", word, freq[word]
 }
```
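The column-specific variant described above could be sketched roughly as follows. This is only an illustration, not part of the GNU script: the function name count_column and the .tsv extension in the usage comment are assumptions. It also leaves out the tolower and gsub steps, on the assumption that column values should be counted verbatim.

```shell
# count_column COL FILE...
# Count how often each distinct value occurs in tab-separated column COL
# across all given files, and print the counts with the most frequent first.
# (count_column is a hypothetical helper name used for this sketch.)
count_column() {
    col=$1
    shift
    awk -F'\t' -v col="$col" '
        { freq[$col]++ }        # tally the value found in the chosen column
        END {
            for (value in freq)
                # count goes first so that sort -nr can order by it
                printf "%d\t%s\n", freq[value], value
        }
    ' "$@" | sort -nr
}

# Usage (assumes the files in the folder end in .tsv):
# count_column 3 ./*.tsv
```

Passing all files to a single awk invocation gives one combined tally for the whole folder. Rows that have fewer fields than the requested column are counted under an empty value.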