I have a text file containing both text and numbers, and I want to use grep to extract only the numbers I need. For example, given a file as follows:
miss rate 0.21
You can use:
grep -P "miss rate \d+(\.\d+)?" file.txt
or:
grep -E "miss rate [0-9]+(\.[0-9]+)?" file.txt
Both of those commands will print out miss rate 0.21. If you want to extract only the number, why not use Perl, sed, or awk?
If you really want to avoid those, maybe this will work?
grep -E "miss rate [0-9]+(\.[0-9]+)?" file.txt | xargs -n 1 basename | tail -n 1
The grep-and-cut solution would look like this. To get the 3rd field of every matching line, use:
grep "^miss rate " yourfile | cut -d ' ' -f 3
Or, to get the 3rd field and everything after it, use:
grep "^miss rate " yourfile | cut -d ' ' -f 3-
Or if you use bash and "miss rate" only occurs once in your file you can also just do:
a=( $(grep -m 1 "miss rate" yourfile) )
echo "${a[2]}"
where ${a[2]} is your result.
If "miss rate" occurs more than once, you can loop over the grep output in bash, reading only what you need.
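The loop over the grep output could be sketched roughly as follows. This is only an illustration: the function name extract_rates and the variable names are made up here, and it assumes every matching line has exactly the form "miss rate <number>".

```shell
# extract_rates is a hypothetical helper name, used only for this example.
# Usage: extract_rates yourfile
extract_rates() {
    # grep keeps only the lines that start with "miss rate ";
    # read splits each line on whitespace, and the last variable
    # (value) receives the third word plus anything after it.
    grep '^miss rate ' "$1" | while read -r first second value; do
        printf '%s\n' "$value"
    done
}
```

Calling extract_rates yourfile prints one value per matching line, so the output can be piped into further commands or captured with $(...).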
If you really want to use only grep for this, then you can try:
grep "miss rate" file | grep -oe '\([0-9.]*\)'
It first finds the matching line, and then prints only the runs of digits and dots on it.
Sed might be a bit more readable, though:
sed -n 's#miss rate ##p' file
Use awk instead:
awk '/^miss rate/ { print $3 }' yourfile
To do it with just grep, you need non-standard extensions. For example, GNU grep supports PCRE (-P), which allows a positive lookbehind (?<=..), combined with match-only output (-o):
grep -Po '(?<=miss rate ).*' yourfile
I believe
sed 's|[^0-9]*\([0-9\.]*\)|\1 |g' filename
will do the trick. However, every entry will be on its own line, if that is OK. I am sure there is a way for sed to produce a comma- or space-delimited list, but I am not a super master of all things sed.
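For a single delimited line, one possible approach is to skip sed and join the numbers with paste. This is a sketch, not the sed solution hinted at above: the function name numbers_on_one_line is invented for this example, and it assumes a grep that supports -o (GNU and BSD grep both do, but -o is not required by POSIX).

```shell
# numbers_on_one_line is a hypothetical helper name, used only for this example.
# Usage: numbers_on_one_line filename
numbers_on_one_line() {
    # grep -o prints every integer or decimal number on its own line;
    # paste -s then joins those lines, using a space as the delimiter.
    grep -oE '[0-9]+(\.[0-9]+)?' "$1" | paste -sd ' ' -
}
```

Changing the delimiter passed to paste (for example -sd ',') gives a comma-delimited list instead.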
Using the \K escape, which drops everything matched so far from the reported match, with grep's PCRE engine:
grep -oP 'miss rate \K.*' file.txt
or with perl:
perl -lne 'print $& if /miss rate \K.*/' file.txt