I am trying to output a string that contains everything between two words of a string:
input:
"Here is a String"
output:
"is a"
The accepted answer does not remove text that could appear before Here or after String. This will:
sed -e 's/.*Here\(.*\)String.*/\1/'
The main difference is the addition of .* immediately before Here and after String, so the whole line is replaced by the captured group. For comparison, the version without them:
sed -e 's/Here\(.*\)String/\1/'
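To see what the surrounding .* changes, here is a small sketch that runs both commands on a made-up line with text on either side of the two words (the sample line and the variable names are assumptions for illustration only):

```shell
# Hypothetical sample line with text before "Here" and after "String".
line='prefix Here is a String suffix'

# Without the outer .*: only "Here" and "String" are removed, the rest stays.
without=$(printf '%s\n' "$line" | sed -e 's/Here\(.*\)String/\1/')

# With the outer .*: the whole line is replaced by the captured group.
with=$(printf '%s\n' "$line" | sed -e 's/.*Here\(.*\)String.*/\1/')

printf '[%s]\n' "$without"   # [prefix  is a  suffix]
printf '[%s]\n' "$with"      # [ is a ]
```

The brackets are only there to make the leading and trailing spaces of the captured text visible.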
GNU grep also supports positive and negative look-ahead and look-behind through its -P (Perl-compatible regex) option. For your case, the command would be:
echo "Here is a string" | grep -o -P '(?<=Here).*(?=string)'
If there are multiple occurrences of Here and string, you can choose whether to match from the first Here to the last string, or to match each pair individually. In regex terms, the first is called a greedy match and the second a non-greedy match:
$ echo 'Here is a string, and Here is another string.' | grep -oP '(?<=Here).*(?=string)' # Greedy match
is a string, and Here is another
$ echo 'Here is a string, and Here is another string.' | grep -oP '(?<=Here).*?(?=string)' # Non-greedy match (Notice the '?' after '*' in .*)
is a
is another
If you have a long file with many multi-line occurrences, it is useful to print line numbers first:
cat -n file | sed -n '/Here/,/String/p'
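A minimal sketch of that range print on a throwaway file (the file contents and the temporary path are made up for illustration). Note that the range is line-based: sed prints every whole line from the first one matching Here through the next one matching String, including those two lines themselves.

```shell
# Create a small sample file (contents are hypothetical).
tmpfile=$(mktemp)
printf '%s\n' 'intro' 'Here starts' 'middle' 'the String ends' 'outro' > "$tmpfile"

# Number the lines, then print only the /Here/ .. /String/ range.
range=$(cat -n "$tmpfile" | sed -n '/Here/,/String/p')
printf '%s\n' "$range"

rm -f "$tmpfile"
```

With this sample, the lines numbered 2 through 4 are printed and the intro and outro lines are skipped.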
Through GNU awk,
$ echo "Here is a string" | awk -v FS="(Here|string)" '{print $2}'
is a
grep with the -P (perl-regexp) option supports \K, which discards the previously matched characters from the reported match. In our case, the previously matched string was Here, so it was dropped from the final output.
$ echo "Here is a string" | grep -oP 'Here\K.*(?=string)'
is a
$ echo "Here is a string" | grep -oP 'Here\K(?:(?!string).)*'
is a
If you want the output to be is a, without the surrounding spaces, then you could try the following:
$ echo "Here is a string" | grep -oP 'Here\s*\K.*(?=\s+string)'
is a
$ echo "Here is a string" | grep -oP 'Here\s*\K(?:(?!\s+string).)*'
is a
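The (?:(?!string).)* form used above consumes one character at a time, checking before each one that string does not start there, so it stops at the first string rather than the last. A sketch on the two-occurrence sample line from the greedy example earlier in this thread (it assumes a GNU grep built with -P support; the variable names are mine):

```shell
# Sample with two Here ... string pairs.
line='Here is a string, and Here is another string.'

# Here\K            match "Here", then drop it from the reported match
# (?:(?!string).)*  take one character at a time while "string" does not start here
pairs=$(printf '%s\n' "$line" | grep -oP 'Here\K(?:(?!string).)*')
printf '%s\n' "$pairs"
```

Because -o prints each match on its own line, every Here ... string pair is reported separately, much like the non-greedy .*? variant.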
All of the above solutions fall short when the last search string is repeated elsewhere in the string. I found it best to write a bash function.
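function str_str {
    local str
    # Strip everything up to and including the first occurrence of $2.
    str="${1#*"$2"}"
    # Strip everything from the first remaining occurrence of $3 onward.
    str="${str%%"$3"*}"
    # printf avoids echo's option parsing if the result starts with a dash.
    printf '%s' "$str"
}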
# test it ...
mystr="this is a string"
str_str "$mystr" "this " " string"
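To illustrate the repeated-delimiter case, here is a sketch that re-declares the function so it runs on its own (with the patterns quoted so they match literally, which is an assumption about the intent) and compares it with the greedy sed command on a made-up string where the end word appears twice:

```shell
# Re-declared so the snippet is self-contained.
str_str() {
  local str
  str="${1#*"$2"}"       # drop everything up to and including the first start marker
  str="${str%%"$3"*}"    # drop everything from the first end marker onward
  printf '%s' "$str"
}

# Hypothetical input: " string" occurs twice after the start marker.
mystr='this is a string and another string'

result=$(str_str "$mystr" 'this ' ' string')
greedy=$(printf '%s\n' "$mystr" | sed -e 's/.*this \(.*\) string.*/\1/')

printf '[%s]\n' "$result"   # [is a]
printf '[%s]\n' "$greedy"   # [is a string and another]
```

The parameter expansion stops at the first end marker after the start marker, while the greedy sed pattern runs on to the last one.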