I have a string that contains sequences delimited by multiple characters: << and >>. I need a regular expression to only give me the innermost sequences:
$string = 'do not match this <<but match this>> not this <<BUT NOT THIS <<this too>> IT HAS CHILDREN>> <<and <also> this>>';
@matches = $string =~ /(<<(?:[^<>]+|<(?!<)|>(?!>))*>>)/g;
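To make the alternation easier to follow, here is the same pattern spelled out with the /x flag as a self-contained script. This is only a sketch for reading purposes: the comments are my interpretation of each branch, and `use strict` / `my` are added so it runs on its own.

```perl
use strict;
use warnings;

my $string = 'do not match this <<but match this>> not this <<BUT NOT THIS <<this too>> IT HAS CHILDREN>> <<and <also> this>>';

my @matches = $string =~ /
    (                   # capture the whole delimited sequence
        <<              # opening delimiter
        (?:
            [^<>]+      # a run of characters that are neither < nor >
          | < (?!<)     # a single < that is not followed by another <
          | > (?!>)     # a single > that is not followed by another >
        )*
        >>              # closing delimiter
    )
/gx;

# Prints the three innermost sequences, one per line:
#   <<but match this>>
#   <<this too>>
#   <<and <also> this>>
print "$_\n" for @matches;
```

The outer `<<BUT NOT THIS ...` never matches because the body cannot step over the nested `<<`, so the engine gives up on that starting point and later picks up `<<this too>>` on its own.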
Here's a way to use split for the job:
my $str = 'do not match this <<but match this>> not this <<BUT NOT THIS <<this too>> IT HAS CHILDREN>> <<and <also> this>>';
my @a = split /(?=<<)/, $str;
@a = map { split /(?<=>>)/, $_ } @a;
my @match = grep { /^<<.*?>>$/ } @a;
This keeps the << and >> delimiters in the results. If you want them removed, just do:
@match = map { s/^<<//; s/>>$//; $_ } @match;
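For reference, the split steps can be put together into one runnable script. This is a minimal sketch, not a replacement for the code above: the variable names `@pieces` and `@inner` are mine, and the last step filters and strips the delimiters in one pass with a capture group instead of the separate grep and s/// calls.

```perl
use strict;
use warnings;

my $str = 'do not match this <<but match this>> not this <<BUT NOT THIS <<this too>> IT HAS CHILDREN>> <<and <also> this>>';

# Step 1: cut the string in front of every <<.
my @pieces = split /(?=<<)/, $str;

# Step 2: cut each piece again right behind every >>.
@pieces = map { split /(?<=>>)/, $_ } @pieces;

# Step 3: keep only pieces that both start with << and end with >>,
# returning the text between the delimiters.
my @inner = map { /^<<(.*)>>$/s ? $1 : () } @pieces;

# Prints:
#   but match this
#   this too
#   and <also> this
print "$_\n" for @inner;
```

A piece such as `<<BUT NOT THIS ` is dropped in step 3 because the nested `<<` caused a cut before its closing `>>` was reached.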
@matches = $string =~ /(<<(?:(?!<<|>>).)*>>)/g;
(?:(?!PAT).)* is to patterns as [^CHAR]* is to characters.
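As a rough illustration of that idiom, the sketch below wraps it in a small helper so it works for any pair of multi-character delimiters. The `innermost` function is hypothetical (it is not part of the answer above), and two details are my own assumptions: the delimiters are escaped with quotemeta, and the /s flag is added so that `.` also matches newlines.

```perl
use strict;
use warnings;

# Hypothetical helper: return the innermost sequences delimited by
# $open and $close, delimiters included.
sub innermost {
    my ($text, $open, $close) = @_;
    my $o = quotemeta $open;    # escape regex metacharacters such as [ or (
    my $c = quotemeta $close;

    # (?:(?!$o|$c).)* consumes one character at a time, but only while
    # neither delimiter starts at the current position.
    my @found = $text =~ /($o(?:(?!$o|$c).)*$c)/gs;
    return @found;
}

print "$_\n" for innermost('a <<b <<c>> d>> <<e <f> g>>', '<<', '>>');
# Prints:
#   <<c>>
#   <<e <f> g>>

print "$_\n" for innermost('[[x [[y]] z]]', '[[', ']]');
# Prints:
#   [[y]]
```

A character class like `[^<>]*` can only exclude single characters, which is why a lone `<` or `>` inside a sequence needs special handling there; the lookahead form excludes the whole two-character delimiter directly.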