Is it possible to have a regex that parses only a1bcdea1 from this line: a1bcdea1ABCa1DEFa1?
This grep command does not work:
How about using a ^ start anchor and restricting the character set allowed between the two 1s:
grep -o '^[A-Za-z]1[A-Za-z]*1'
See this Bash demo or Regex Pattern at regex101
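As a quick check of the anchored pattern, here is a minimal sketch that feeds the sample line from the question to grep through a pipe. The variable names line and first are only for illustration.

```shell
# Sample line from the question (variable names are illustrative only).
line='a1bcdea1ABCa1DEFa1'

# ^ pins the match to the start of the line; [A-Za-z]* cannot cross a digit,
# so the match has to stop at the second 1.
first=$(printf '%s\n' "$line" | grep -o '^[A-Za-z]1[A-Za-z]*1')

printf '%s\n' "$first"
```

Because the middle part only accepts letters, the match cannot run on into ABCa1DEFa1.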
If you expect digits or other characters in between, go with this:
grep -oP '^[A-Za-z]1.*?[A-Za-z]1'
The lazy quantifier .*? requires Perl-compatible mode (-P). If the match does not begin at the start of the line, go with this:
grep -oP '^.*?\K[A-Za-z]1.*?[A-Za-z]1'
\K resets the beginning of the reported match, so the lazily skipped prefix is left out of the output. It is a PCRE feature as well.
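To see the \K variant in action, here is a small sketch on a made-up line where the wanted text is not at the line start and has digits in between. It assumes a GNU grep built with PCRE support (-P); the input line and variable names are hypothetical.

```shell
# Hypothetical input: a prefix before the first letter+1 pair, digits in between.
line='xx: a1bc23dea1ABCa1DEFa1'

# ^.*? lazily skips the prefix, \K drops it from the reported match,
# and the second .*? stops at the first following letter+1 pair.
found=$(printf '%s\n' "$line" | grep -oP '^.*?\K[A-Za-z]1.*?[A-Za-z]1')

printf '%s\n' "$found"
```

The ^ anchor keeps grep -o from reporting later pairs on the same line.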
Here is a GNU awk solution using the split function:
awk '(n = split($0, a, /[a-zA-Z]1/, b)) > 1 {print b[1] a[2] b[2]}' file
a1bcdea1
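This awk command splits each line on the regex /[a-zA-Z]1/ and stores the split tokens in array a and the delimiters in array b. The fourth argument to split is a gawk extension. Printing b[1] a[2] b[2] then reassembles the first delimiter, the text after it, and the second delimiter.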