Question
I'm creating some custom BBcode for a forum. I'm trying to get the regular expression right, but it has been eluding me for weeks. Any expert advice is welcome.
Sample input (a very basic example):
[quote=Bob]I like Candace. She is nice.[/quote]
Ashley Ryan Thomas
Essentially, I want to encase any names (from a specified list) in [user][/user] BBcode... except, of course, those being quoted, because doing that causes some terrible parsing errors.
The desired output:
[quote=Bob]I like [user]Candace[/user]. She is nice.[/quote]
[user]Ashley[/user] [user]Ryan[/user] [user]Thomas[/user]
My current code:
$searchArray = array(
'/(?i)([^=]|\b|\s|\/|\r|\n|\t|^)(Ashley|Bob|Candace|Ryan|Thomas)(\s|\r|\n|\t|,|\.(\b|\s|\.|$)|;|:|\'|"|-|!|\?|\)|\/|\[|$)/'
);
$replaceArray = array(
"\\1[user]\\2[/user]\\3"
);
$text = preg_replace($searchArray, $replaceArray, $input);
What it currently produces:
[quote=Bob]I like [user]Candace[/user]. She is nice.[/quote]
[user]Ashley[/user] Ryan [user]Thomas[/user]
Notice that Ryan isn't encapsulated by [user] tags. Also note that many of the additional characters in the regex were added on an as-needed basis as they cropped up on the forums, so removing them will simply make it fail to match in other situations (i.e. a no-no). Unless, of course, you spot a glaring error in the regex itself, in which case please do point it out.
Really, though, any assistance would be greatly appreciated! Thank you.
Answer 1:
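It's quite simply that you are matching delimiters (\s|\r|...) at both ends of the searched names. The poor Ashley and Ryan share a single space character in your test string, but the regex can only match it once - as the right border of Ashley or as the left border of Ryan. Once that space has been consumed by the Ashley match, it is no longer available to start a match for Ryan.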
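To make the consumed-delimiter problem visible, here is a minimal sketch using a cut-down version of the pattern (only \s, ^ and $ as delimiters, which is a simplification for illustration and not the asker's full list). It prints the full text of each match, including the delimiters that were swallowed.

```php
<?php
// Simplified stand-in for the original pattern: a delimiter is captured
// (and therefore consumed) on both sides of each name.
$pattern = '/(\s|^)(Ashley|Ryan|Thomas)(\s|$)/';
$input   = 'Ashley Ryan Thomas';

preg_match_all($pattern, $input, $matches);

// $matches[0] holds the full text of every match, delimiters included.
// Two matches are found: "Ashley " and " Thomas". The space after Ashley
// belongs to the first match, so nothing is left to precede Ryan.
var_dump($matches[0]);
```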
The solution here is to use lookaround assertions, which check for a delimiter without consuming it. Enclose the left list in (?<= ) and the right in (?= ) so they become:
(?<=[^=]|\b|\s|\/|^)
(?=\s|,|\.(\b|\s|\.|$)|;|:|\'|"|-|!|\?|\)|\/|\[|$)
Btw, \s already contains \r|\n|\t so you can probably remove those.
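Put together, the two assertions can be dropped into the original call roughly as follows. This is a sketch rather than a tested forum plugin: the variable names $names and $pattern are introduced here for readability, and the input is the sample from the question.

```php
<?php
$names = 'Ashley|Bob|Candace|Ryan|Thomas';

// Delimiter lists from the question, moved into lookarounds so they are
// checked but not consumed. \r, \n and \t are dropped because \s covers them.
$pattern = '/(?i)(?<=[^=]|\b|\s|\/|^)(' . $names . ')'
         . '(?=\s|,|\.(\b|\s|\.|$)|;|:|\'|"|-|!|\?|\)|\/|\[|$)/';

$input = "[quote=Bob]I like Candace. She is nice.[/quote]\nAshley Ryan Thomas";

// The delimiters are no longer captured, so the name is now group 1
// and the replacement only needs to re-emit that group.
$text = preg_replace($pattern, '[user]$1[/user]', $input);
echo $text, "\n";
```

With the sample input this wraps Candace, Ashley, Ryan and Thomas and leaves Bob alone. Note that Bob is skipped only because ] is not in the right-hand list: the [^=] alternative on the left does not exclude it, since \b also succeeds between = and B.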
Answer 2:
Since you don't really need to match the spaces on either side (just make sure they're there, right?) try replacing your search expression with this:
$searchArray = array(
'/\b(Ashley|Bob|Candace|Ryan|Thomas)\b/i'
);
$replaceArray = array(
'[user]$1[/user]'
);
$text = preg_replace($searchArray, $replaceArray, $input);
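One caveat with plain word boundaries: \b also matches between = and Bob, so the name inside [quote=Bob] gets wrapped too, which is what the question wanted to avoid. A possible variation, offered as an assumption and not part of the answer above, is to add a negative lookbehind (?<!=) in front. The sketch below compares both on the sample input; $plain and $guarded are names made up for this example.

```php
<?php
$input = "[quote=Bob]I like Candace. She is nice.[/quote]\nAshley Ryan Thomas";

// Plain word boundaries: also wraps the name inside [quote=Bob].
$plain = preg_replace(
    '/\b(Ashley|Bob|Candace|Ryan|Thomas)\b/i',
    '[user]$1[/user]',
    $input
);

// Assumed variation: (?<!=) rejects a name that directly follows "=".
$guarded = preg_replace(
    '/(?<!=)\b(Ashley|Bob|Candace|Ryan|Thomas)\b/i',
    '[user]$1[/user]',
    $input
);

echo $plain, "\n\n", $guarded, "\n";
```

The guard only covers the [quote=Name] form shown in the question. Other spellings, such as a quoted attribute value, would need their own handling.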
Source: https://stackoverflow.com/questions/7772025/complex-name-matching-regex-for-vbulletin