I want to print the words inside "ctr{words}" and count how many times each word occurs in a file.
I tried:
sed -n 's/.*ctr{\(.[^}]*\).*/\1/p' file
When searching for matches in files, grep is usually the best choice.
Using grep with a positive lookbehind and uniq -c:
$ grep -Po "(?<=ctr{)[^}]+" file | uniq -c
1 Mo7afazat
1 JaishanaIN
2 ZainElKul
1 ZainUnlimited
1 AlBarakehNew
1 ZainElKulSN
From man uniq:
Note: 'uniq' does not detect repeated lines unless they are adjacent.
For files where the duplicates are not adjacent, pipe to sort first; however, the order in which each match is found in the original file will be lost:
$ grep -Po "(?<=ctr{)[^}]+" file | sort | uniq -c
1 AlBarakehNew
1 JaishanaIN
1 Mo7afazat
2 ZainElKul
1 ZainElKulSN
1 ZainUnlimited
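If the first-seen order matters, one option is to do the counting in awk instead of sorting. The following is a minimal sketch, assuming the extracted words arrive one per line on standard input; count_in_order is a hypothetical helper name and the demo input is made up:

```shell
# count_in_order: hypothetical helper; reads one word per line from stdin and
# prints "count word" in the order each word was first seen.
count_in_order() {
  awk '
    !($0 in count) { order[++n] = $0 }   # remember the first appearance
    { count[$0]++ }                      # tally every occurrence
    END { for (i = 1; i <= n; i++) print count[order[i]], order[i] }
  '
}

# Demo with made-up input where the duplicates are not adjacent.
printf '%s\n' ZainElKul Mo7afazat ZainElKul | count_in_order
```

With the file from the question this would be used as: grep -Po "(?<=ctr{)[^}]+" file | count_in_order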
It looks like you are missing the counts. The easiest way to get them is to pipe your output through sort and uniq -c:
$ sed -n 's/.*ctr{\(.[^}]*\).*/\1/p' file | sort | uniq -c
1 **Mo7afazat**
1 **JaishanaIN**
2 **ZainElKul**
1 ZainUnlimited
1 **AlBarakehNew**
1 **ZainElKulSN**
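One caveat with the sed approach: because the leading .* is greedy, it prints at most one word per line (the last ctr{...} on that line). The question does not say whether a line can hold several, so the following is only a sketch for that case, assuming a grep that supports -o and -h; count_all_ctr is a hypothetical helper name:

```shell
# count_all_ctr: hypothetical helper; counts every ctr{...} word in the given
# files (or stdin), including several occurrences on one line.
count_all_ctr() {
  grep -oh 'ctr{[^}]*}' "$@" |   # one match per output line, no file names
    sed 's/^ctr{//; s/}$//' |    # strip the ctr{ prefix and the closing }
    LC_ALL=C sort |              # make duplicates adjacent for uniq
    uniq -c
}

# Demo with made-up input: two words on the first line.
printf '%s\n' 'a ctr{ZainElKul} b ctr{Mo7afazat}' 'ctr{ZainElKul}' | count_all_ctr
```

For the file in the question it would be called as: count_all_ctr file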
Another way, using only awk (this needs GNU awk, since the three-argument form of match is a gawk extension):
$ awk 'match($0,".*ctr{([^}]*)}.*",m){a[m[1]]++}END{for(i in a) print i,a[i]}' file
ZainUnlimited 1
**ZainElKulSN** 1
**Mo7afazat** 1
**ZainElKul** 2
**JaishanaIN** 1
**AlBarakehNew** 1
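Where gawk is not available, roughly the same idea can be written with the two-argument match(), which sets RSTART and RLENGTH. This is a sketch rather than a drop-in replacement: it takes the first ctr{...} on each line (the one-liner above takes the last), and count_ctr_posix is a hypothetical helper name:

```shell
# count_ctr_posix: hypothetical helper; counts ctr{...} words using only the
# two-argument match() and substr().
count_ctr_posix() {
  awk '
    match($0, /ctr[{][^}]*[}]/) {
      # Skip the 4 characters of "ctr{" and drop the trailing "}".
      word = substr($0, RSTART + 4, RLENGTH - 5)
      count[word]++
    }
    END { for (w in count) print w, count[w] }
  ' "$@" | LC_ALL=C sort   # for-in order is unspecified, so sort for stable output
}

# Demo with made-up input.
printf '%s\n' 'x ctr{ZainElKul} y' 'ctr{Mo7afazat}' 'ctr{ZainElKul}' | count_ctr_posix
```

For the file in the question it would be called as: count_ctr_posix file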