Question
I am still learning Perl, so apologies if this is an obvious question. Is there a way to match text that is NOT enclosed in parentheses? For example, searching for foo in the text below should match on the second line only.
(bar foo bar)
bar foo (
bar foo
(bar) (foo)
)
Answer 1:
This is very far from "obvious"; on the contrary. There is no direct way to say "don't match" for a complex pattern (there is good support at a character level, with [^a], \S etc). Regex is firstly about matching things, not about not-matching them.
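To illustrate the character-level support mentioned above, here is a minimal sketch. The sample string and the variable names are made up for this example; only the [^(] and \S constructs come from the answer.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical sample line, chosen only for illustration
my $line = 'bar foo (baz)';

# Negated character class: capture everything up to the first "("
my ($head) = $line =~ /^([^(]*)/;

# \S is the negation of \s: capture runs of non-whitespace characters
my @words = $line =~ /(\S+)/g;

say "head:  '$head'";
say "words: ", scalar @words;
```

This works because the negation applies to a single character at a time. It does not extend to "not inside a balanced pair of parentheses", which is why the approaches below are needed.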
One approach is to match those (possibly nested) delimiters and get everything other than that.
A good tool for finding nested delimiters is the core module Text::Balanced. As it matches it can also give us the substring before the match and the rest of the string after the match.
use warnings;
use strict;
use feature 'say';
use Text::Balanced qw(extract_bracketed);
my $text = <<'END';
(bar foo bar)
bar foo (
bar foo
(bar) (foo)
)
END
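my ($match, $before);
my $remainder = $text;
while (1) {
    # Skip the prefix matched by '[^(]*', then extract one balanced (...) group
    ($match, $remainder, $before) = extract_bracketed($remainder, '(', '[^(]*');
    # Print the text before the group, or whatever is left once extraction fails
    print $before // $remainder;
    last if not defined $match;
}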
The extract_bracketed function returns the match, the remainder substring ($remainder), and the substring before the match ($before), so we keep matching in the remainder.
Taken from this post, where there are more details and another way, using Regexp::Common.
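The loop above only prints the text outside the parentheses. To answer the original question, one could collect that text and then search it. The following is a sketch under that assumption; the helper name outside_parens is hypothetical and not part of Text::Balanced.

```perl
use strict;
use warnings;
use feature 'say';
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper: return $text with every balanced (...) group removed.
sub outside_parens {
    my ($text) = @_;
    my $outside   = '';
    my $remainder = $text;
    while (1) {
        my ($match, $rest, $before) =
            extract_bracketed($remainder, '(', '[^(]*');
        if (not defined $match) {
            # No further balanced group: keep whatever is left and stop
            $outside .= $remainder;
            last;
        }
        $outside .= $before;    # text that preceded the extracted group
        $remainder = $rest;     # continue after the extracted group
    }
    return $outside;
}

my $text = <<'END';
(bar foo bar)
bar foo (
bar foo
(bar) (foo)
)
END

# Search only the text that is left outside the parentheses
say for grep { /foo/ } split /\n/, outside_parens($text);
```

With the sample text, only the "bar foo " fragment from the second line should remain to be matched. Note that this sketch discards the original offsets, so it suits a yes/no search better than reporting positions.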
Answer 2:
Regex patterns have an implicit leading \G(?s:.)*? ("skip characters until a match is found"). The following expands that definition to consider nested parens to be a character to skip.
while (
    $string =~ m{
        \G (?&MEGA_DOT)*?
        ( foo )
        (?(DEFINE)
            (?<MEGA_DOT> [^()] | \( (?&MEGA_DOT)*+ \) )
        )
    }xg
) {
    say "Found a match at pos $-[1].";
}
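For a version that runs as is, here is a sketch that applies the same pattern to the sample text from the question. The function name find_foo_outside_parens and the line-number calculation are additions for illustration, not part of the answer.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical wrapper: return the offsets of every "foo" that is not
# inside a balanced pair of parentheses.
sub find_foo_outside_parens {
    my ($string) = @_;
    my @positions;
    while (
        $string =~ m{
            \G (?&MEGA_DOT)*?      # skip plain chars and whole (...) groups
            ( foo )
            (?(DEFINE)
                (?<MEGA_DOT> [^()] | \( (?&MEGA_DOT)*+ \) )
            )
        }xg
    ) {
        push @positions, $-[1];    # start offset of capture group 1
    }
    return @positions;
}

my $string = <<'END';
(bar foo bar)
bar foo (
bar foo
(bar) (foo)
)
END

for my $pos (find_foo_outside_parens($string)) {
    # Line number = 1 + number of newlines before the match
    my $line = 1 + (substr($string, 0, $pos) =~ tr/\n//);
    say "Found 'foo' at pos $pos (line $line).";
}
# prints: Found 'foo' at pos 18 (line 2).
```

Because the pattern is anchored with \G, scanning stops at the first unbalanced parenthesis, so this sketch assumes the input is well balanced.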
Source: https://stackoverflow.com/questions/47778760/matching-text-not-enclosed-by-parenthesis