I have tried to extract a number as given below but nothing is printed on screen:
echo "This is an example: 65 apples" | sed -n 's/.*\([0-9]*\) apples/\1/p'
echo "This is an example: 65 apples" | ssed -nR -e 's/.*?\b([0-9]*) apples/\1/p'
You will, however, need super-sed (ssed) for this to work. The -R option enables Perl-style regular expressions.
$ echo "This is an example: 65 apples" | sed -r 's/^[^0-9]*([0-9]+).*/\1/'
65
It's because your first .* is greedy, and your [0-9]* allows 0 or more digits.
Hence the .* gobbles up as much as it can (including the digits) and the [0-9]* matches nothing.
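To see this for yourself, here is a small sketch that wraps the captured group in brackets so an empty capture becomes visible. The function name show_capture is made up for this example:

```shell
# Hypothetical helper: show what \1 captures by wrapping it in brackets.
show_capture() {
    # Same pattern as in the question; only the replacement differs.
    printf '%s\n' "$1" | sed 's/.*\([0-9]*\) apples/[\1]/'
}

show_capture "This is an example: 65 apples"   # prints []
```

The empty brackets show that the group matched, but with zero digits in it.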
You can do:
echo "This is an example: 65 apples" | sed -n 's/.*\b\([0-9]\+\) apples/\1/p'
where I forced the [0-9] to match at least one digit, and also added a word boundary before the digits so the whole number is matched.
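A quick comparison of what happens with and without the word boundary, assuming GNU sed (as far as I know, \+ and \b are GNU extensions). The two function names are hypothetical and only used here:

```shell
# Assumes GNU sed: \+ and \b are GNU extensions to basic regular expressions.

# Without \b the greedy .* leaves only the last digit for the group.
without_boundary() {
    printf '%s\n' "$1" | sed -n 's/.*\([0-9]\+\) apples/\1/p'
}

# With \b the group has to start at the beginning of the number.
with_boundary() {
    printf '%s\n' "$1" | sed -n 's/.*\b\([0-9]\+\) apples/\1/p'
}

without_boundary "This is an example: 65 apples"   # prints 5
with_boundary "This is an example: 65 apples"      # prints 65
```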
However, it's easier to use grep, where you match just the number:
echo "This is an example: 65 apples" | grep -P -o '[0-9]+(?= +apples)'
The -P means "Perl regex" (so I don't have to worry about escaping the '+').
The -o means "only print the matches".
The (?= +apples) is a lookahead: the digits must be followed by one or more spaces and the word apples, but those are not included in the match.
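To make the effect of the lookahead concrete, here is a sketch that runs the same match with and without it. It assumes a grep built with PCRE support (needed for -P); the function names are invented for the example:

```shell
# Assumes a grep with -P (PCRE) support, e.g. a typical GNU grep build.

# Without the lookahead, " apples" is part of the match and gets printed too.
match_with_word() {
    printf '%s\n' "$1" | grep -P -o '[0-9]+ +apples'
}

# With the lookahead, " apples" must follow but is not part of the match.
match_number_only() {
    printf '%s\n' "$1" | grep -P -o '[0-9]+(?= +apples)'
}

match_with_word "3 pears and 65 apples"     # prints 65 apples
match_number_only "3 pears and 65 apples"   # prints 65
```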
What you are seeing is the greedy behavior of regex. In your first example, .* gobbles up all the digits. Something like this does it:
echo "This is an example: 65144 apples" | sed -n 's/[^0-9]*\([0-9]\+\) apples/\1/p'
65144
This way, you can't match any digits in the first bit. Some regex dialects have a way to ask for non-greedy matching, but I don't believe sed has one.
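As far as I know, the \+ used in that command is a GNU extension. A sketch of the same idea that should also work with other sed implementations spells "one or more digits" as [0-9][0-9]* and anchors the pattern at the start of the line. The function name is hypothetical:

```shell
# Hypothetical helper: first number on the line, if it is followed by " apples".
first_number_before_apples() {
    # ^[^0-9]* cannot run past the first digit, so it acts like a
    # non-greedy prefix; [0-9][0-9]* is the portable spelling of [0-9]\+.
    printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\) apples.*/\1/p'
}

first_number_before_apples "This is an example: 65144 apples"   # prints 65144
```

Note that this only works when the number you want is the first one on the line; for "2 boxes of 65 apples" it prints nothing.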
A simple way to extract all numbers from a string:
echo "1213 test 456 test 789" | grep -P -o "\d+"
And the result:
1213
456
789
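\d is only understood with -P. If that option is not available, a sketch using extended regular expressions should give the same result for this input. It assumes a grep that supports -o (GNU and BSD grep do); the function name is made up:

```shell
# Hypothetical helper: print every run of digits on its own line.
all_numbers() {
    # -E: extended regex, so + needs no backslash; -o: print only the matches.
    printf '%s\n' "$1" | grep -oE '[0-9]+'
}

all_numbers "1213 test 456 test 789"
```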