Question
I am trying to understand what this command does. I only know the basics, and this is what I have found so far:
last | cut -d" " -f 1 | sort | uniq -c | sort
last searches back through the file /var/log/wtmp (or the file designated by the -f flag) and displays a list of all users logged in (and out) since that file was created.
cut extracts the desired column. The -d option specifies the field delimiter used in the input, and -f specifies which field to extract. I think the 1 is the output, but I am not sure about that.
After that the output is sorted and passed to uniq, which helps remove or detect duplicate entries in a file.
If anyone can explain this command, and also why there are two sorts, I would appreciate it.
Answer 1:
Your explanation of cut is right: cut -d" " -f1 (the space after -f is not needed) gets the first field (-f) of a stream, using " " (a single space) as the delimiter (-d).
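To see the field extraction on its own, here is a small sketch that feeds cut two made-up lines loosely shaped like last output. The user names, terminals and address are invented for illustration; real last output has more columns.

```shell
# Hypothetical sample lines, not real `last` output.
sample='alice pts/0 192.168.0.10
bob tty1'

# -d" " sets the delimiter to a single space; -f1 keeps only the first field.
first_fields=$(printf '%s\n' "$sample" | cut -d" " -f1)
printf '%s\n' "$first_fields"
```

This prints alice and bob, one per line. Note that cut treats every single space as a separator, so runs of spaces produce empty fields; that does not matter for -f1, because the first field ends at the first space.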
Then why sort | uniq -c | sort?
From man uniq:
Note: 'uniq' does not detect repeated lines unless they are adjacent. You may want to sort the input first, or use 'sort -u' without 'uniq'. Also, comparisons honor the rules specified by 'LC_COLLATE'.
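That is why you need to sort the lines before piping them to uniq. Finally, the output of uniq -c is ordered by line content, not by count, so you need to sort again to order the items by how often they occur. A plain sort lists the lowest counts first; use sort -rn to see the most repeated items first.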
Here is an example of sort and uniq -c on a file with repeated items:
$ seq 5 >>a
$ seq 5 >>a
$ cat a
1
2
3
4
5
1
2
3
4
5
$ sort a | uniq -c | sort    # input sorted first: each value appears once, with its full count
2 1
2 2
2 3
2 4
2 5
$ uniq -c a | sort    # input not sorted first: non-adjacent duplicates are counted separately
1 1
1 1
1 2
1 2
1 3
1 3
1 4
1 4
1 5
1 5
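In the example above every value occurs twice, so the effect of the final sort is hard to see. Here is a minimal sketch with uneven counts, assuming made-up login names in place of the first column of last, and using sort -rn (numeric, reversed) so that the largest count comes first:

```shell
# Hypothetical login names standing in for the first column of `last`.
names='bob
alice
bob
carol
bob
alice'

# The first sort groups identical names, uniq -c counts each group,
# and sort -rn orders the counts numerically, largest first.
counts=$(printf '%s\n' "$names" | sort | uniq -c | sort -rn)
printf '%s\n' "$counts"
```

bob, with three entries, is listed first, followed by alice with two and carol with one. The width of the padding in front of each count depends on the uniq implementation.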
Note that you can do the sort | uniq -c step in one go with this awk command:
last | awk '{a[$1]++} END{for (i in a) print i, a[i]}'
This uses the values of the first column as keys of the a[] array and increments the matching counter each time a value appears again. The END{} block prints the results unsorted, so you can pipe the output to sort again.
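As a sketch of that last step, the awk output can be ordered by its second column (the count). The input names below are made up, and sort -k2,2nr is just one way to sort numerically on field 2 in reverse order:

```shell
# Hypothetical login names standing in for the first column of `last`.
names='bob
alice
bob
carol
bob
alice'

# awk counts each name in an associative array; the order of `for (i in a)`
# is unspecified, so the result is sorted on the count (field 2) afterwards.
sorted=$(printf '%s\n' "$names" |
  awk '{a[$1]++} END{for (i in a) print i, a[i]}' |
  sort -k2,2nr)
printf '%s\n' "$sorted"
```

Here the name comes first and the count second, the reverse of the uniq -c layout, which is why the sort key is field 2.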
Answer 2:
uniq -c is being used to create a frequency histogram. The second sort then orders that histogram by frequency.
The first sort is needed because uniq only compares each line with the one before it when deciding whether the line is a duplicate.
Source: https://stackoverflow.com/questions/22556470/what-is-the-meaning-of-delimiter-in-cut-and-why-in-this-command-it-is-sorting-tw