Question
There doesn't appear to be a clear answer on the best way to do this.
I have some bbcode which may have links in bbcode format:
[url=http://thisisalink.com]link[/url]
as well as possible copy/pasted urls:
http://thisisalink.com
I want to replace both instances with a clickable link. I currently have the following regexes running:
"/\[link=http:\/\/(.*?)\](.*?)\[\/link\]/is"
"/\[link=https:\/\/(.*?)\](.*?)\[\/link\]/is"
"/\[link=(.*?)\](.*?)\[\/link\]/is"
$URLRegex = '/(?:(?<!(\[\/link\]|\[\/link=))(\s|^))'; // No [url]-tag in front and is start of string, or has whitespace in front
$URLRegex.= '('; // Start capturing URL
$URLRegex.= '(https?|ftps?|ircs?):\/\/'; // Protocol
$URLRegex.= '\S+'; // Any non-space character
$URLRegex.= ')'; // Stop capturing URL
$URLRegex.= '(?:(?<![[:punct:]])(\s|\.?$))/i'; // Doesn't end with punctuation and is end of string, or has whitespace after
It just seems that I can't get both to work. In this case, the last regex seems to undo the links produced by the first ones.
Surely the best way to link up both bbcode links and pasted URLs without them conflicting with each other has been documented somewhere.
Answer 1:
What you can do is use an alternation that begins with the bbcode pattern, so that a URL inside bbcode tags is consumed by the first branch and is not replaced a second time. For example:
$pattern = '~\[url\s*+=\s*+([^]\s]++)]([^[]++)\[/url]|((http://\S++))~i';
$result = preg_replace($pattern, '<a href="$1$3">$2$4</a>', $string);
Note that I have captured the copy/pasted URL twice (groups 3 and 4) so that a single replacement string works for both branches without needing preg_replace_callback.
I have used a simplified pattern for the copy/pasted URL; you can replace it with whatever you need in order to handle https, ftp, ftps, and so on.
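As a rough illustration of that last point, the same alternation can be extended to more schemes and moved into preg_replace_callback, which also makes it easy to escape the output. This is only a sketch: the function name linkify, the named groups, and the choice of schemes are assumptions, and it assumes the input text has not already been HTML-escaped.

```php
<?php
// Hypothetical helper: turns [url=...]text[/url] tags and bare URLs into anchors.
function linkify(string $text): string
{
    // The bbcode branch comes first, so a URL inside a tag is consumed there
    // and never reaches the bare-URL branch.
    $pattern = '~\[url\s*+=\s*+(?<href>[^]\s]++)](?<label>[^[]++)\[/url]'
             . '|(?<bare>(?:https?|ftps?)://\S++)~i';

    return preg_replace_callback($pattern, function (array $m): string {
        // "bare" is only present (and non-empty) when the second branch matched.
        $isBare = isset($m['bare']) && $m['bare'] !== '';
        $href   = $isBare ? $m['bare'] : $m['href'];
        $label  = $isBare ? $m['bare'] : $m['label'];

        return '<a href="' . htmlspecialchars($href, ENT_QUOTES) . '">'
             . htmlspecialchars($label, ENT_QUOTES) . '</a>';
    }, $text);
}

echo linkify('[url=http://thisisalink.com]link[/url] and https://example.org/page');
```

Like the simplified pattern above, the bare-URL branch takes every non-space character, so trailing punctuation such as a full stop would end up inside the link.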
Answer 2:
I ended up going with this. I then pass the result to a callback, which lets me run some extra PHP code for link checking:
# MATCH '?://www.link.com' and make it a bbcode link
$URLRegex = '/(?:(?<!(\[\/link\]|\[\/link=))(\s|^))'; // No [url]-tag in front and is start of string, or has whitespace in front
$URLRegex.= '('; // Start capturing URL
$URLRegex.= '(https?|ftps?|ircs?):\/\/'; // Protocol (the optional "s" already covers http, ftp and irc)
$URLRegex.= '\S+'; // Any non-space character
$URLRegex.= ')'; // Stop capturing URL
$URLRegex.= '(?:(?<![[:punct:]])(\s|\.?$))/i';
$output = preg_replace($URLRegex, "$2[link=$3]$3[/link]$5", $output);
# MATCH 'www.link.com' and make it a bbcode link
$URLRegex2 = '/(?:(?<!(\[\/link\]|\[\/link=))(\s|^))'; // No [url]-tag in front and is start of string, or has whitespace in front
$URLRegex2.= '('; // Start capturing URL
$URLRegex2.= 'www\.'; // Literal "www." prefix (no protocol)
$URLRegex2.= '\S+'; // Any non-space character
$URLRegex2.= ')'; // Stop capturing URL
$URLRegex2.= '(?:(?<![[:punct:]])(\s|\.?$))/i';
$output = preg_replace($URLRegex2, "$2[link=http://$3]$3[/link]$4", $output); // $4 is the trailing whitespace group, since this pattern has no protocol group
# link up a [link=....]some words[/link]
$output = preg_replace_callback(
    "/\[link=(.*?):\/\/(.*?)\](.*?)\[\/link\]/is",
    array($this, 'bbcode_format_link1'),
    $output
);
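The body of bbcode_format_link1 is not shown in the answer, so the following is only a guess at what such a callback might look like. The class name BbcodeFormatter, the format() wrapper, and the scheme whitelist are assumptions made for illustration; only the method name and the regex come from the code above.

```php
<?php
// Hypothetical formatter class; only the callback's name comes from the answer.
class BbcodeFormatter
{
    // Schemes allowed to become clickable links (an assumption for this sketch).
    private const ALLOWED_SCHEMES = ['http', 'https', 'ftp', 'ftps', 'irc', 'ircs'];

    // Callback for /\[link=(.*?):\/\/(.*?)\](.*?)\[\/link\]/is
    // $m[1] = scheme, $m[2] = rest of the URL, $m[3] = link text.
    public function bbcode_format_link1(array $m): string
    {
        $label = htmlspecialchars($m[3], ENT_QUOTES);

        // Example "link check": refuse schemes that are not whitelisted.
        if (!in_array(strtolower($m[1]), self::ALLOWED_SCHEMES, true)) {
            return $label; // keep the text, drop the link
        }

        $href = htmlspecialchars($m[1] . '://' . $m[2], ENT_QUOTES);
        return '<a href="' . $href . '">' . $label . '</a>';
    }

    // Hypothetical wrapper that applies the callback to a whole string.
    public function format(string $output): string
    {
        return preg_replace_callback(
            '/\[link=(.*?):\/\/(.*?)\](.*?)\[\/link\]/is',
            [$this, 'bbcode_format_link1'],
            $output
        );
    }
}
```

With this sketch, `(new BbcodeFormatter())->format('[link=http://a.com]A[/link]')` would produce an anchor tag, while a tag with a non-whitelisted scheme would be reduced to its plain label.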
Source: https://stackoverflow.com/questions/17151716/regex-for-bbcode-links-non-bbcode-urls