REGEX for bbcode links + non-bbcode URLs

Posted by 丶灬走出姿态 on 2020-01-07 04:06:05

Question


There doesn't appear to be a clear answer on the best way to do this.

I have some bbcode which may have links in bbcode format:

[url=http://thisisalink.com]link[/url]

as well as possible copy/pasted urls:

http://thisisalink.com

I want to replace both with a clickable link. I currently have the following regexes running:

"/\[link=http:\/\/(.*?)\](.*?)\[\/link\]/is"
"/\[link=https:\/\/(.*?)\](.*?)\[\/link\]/is"
"/\[link=(.*?)\](.*?)\[\/link\]/is"

$URLRegex = '/(?:(?<!(\[\/link\]|\[\/link=))(\s|^))'; // No [url]-tag in front and is start of string, or has whitespace in front
$URLRegex.= '(';                                    // Start capturing URL
$URLRegex.= '(https?|ftps?|ircs?):\/\/';            // Protocol
$URLRegex.= '\S+';                                  // Any non-space character
$URLRegex.= ')';                                    // Stop capturing URL
$URLRegex.= '(?:(?<![[:punct:]])(\s|\.?$))/i';      // Doesn't end with punctuation and is end of string, or has whitespace after

I just can't seem to get both to work together: the last regex appears to undo the links produced by the first ones.

Surely the best way to turn both bbcode links and pasted URLs into links, without the two conflicting, has been documented somewhere.


Answer 1:


What you can do is use an alternation that begins with the bbcode pattern, so that a URL inside bbcode tags is consumed by that branch and is not replaced a second time. For example:

$pattern = '~\[url\s*+=\s*+([^]\s]++)]([^[]++)\[/url]|((http://\S++))~i';
$result = preg_replace($pattern, '<a href="$1$3">$2$4</a>', $string);

Note that I captured the copy/pasted URL twice (groups 3 and 4) so that it can fill both the href and the link text without needing the preg_replace_callback function.

I used a simplified pattern for the copy/pasted URL; you can replace it with whatever you need in order to handle https, ftp, ftps, and so on.
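As a rough sketch of that last point, the bare-URL branch can be widened to several schemes, with the replacement moved into preg_replace_callback so the href and link text can be escaped. The function name linkify and the sample input are made up for illustration, and the scheme list is borrowed from the question. Note that \S++ still swallows trailing punctuation, so this is a starting point rather than a finished pattern.

```php
<?php
// Hypothetical helper (not part of the original answer): converts both
// [url=...]text[/url] tags and bare URLs into HTML anchors in one pass.
function linkify(string $text): string
{
    // Branch 1: bbcode tag -> group 1 (href) and group 2 (link text)
    // Branch 2: bare URL   -> group 3
    // The bbcode branch comes first, so a URL inside a tag is consumed
    // there and never reaches the bare-URL branch.
    $pattern = '~\[url\s*+=\s*+([^]\s]++)]([^[]++)\[/url]'
             . '|((?:https?|ftps?|ircs?)://\S++)~i';

    return preg_replace_callback($pattern, function (array $m): string {
        if (isset($m[3])) {          // bare URL matched
            $href  = $m[3];
            $label = $m[3];
        } else {                     // bbcode tag matched
            $href  = $m[1];
            $label = $m[2];
        }
        return '<a href="' . htmlspecialchars($href, ENT_QUOTES) . '">'
             . htmlspecialchars($label, ENT_QUOTES) . '</a>';
    }, $text);
}

echo linkify('See [url=http://thisisalink.com]link[/url] or https://example.org/page');
```

Using a callback here trades the double-capture trick for an explicit branch check, which also leaves room for escaping or validation.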




Answer 2:


I ended up going with this. I first convert bare URLs into bbcode links, then pass every bbcode link to a callback, which lets me run some extra PHP code for link checking:

# MATCH '?://www.link.com' and make it a bbcode link
$URLRegex = '/(?:(?<!(\[\/link\]|\[\/link=))(\s|^))'; // No [link]-tag in front and is start of string, or has whitespace in front
$URLRegex.= '(';                                    // Start capturing URL
$URLRegex.= '(https?|ftps?|ircs?):\/\/';            // Protocol
$URLRegex.= '\S+';                                  // Any non-space character
$URLRegex.= ')';                                    // Stop capturing URL
$URLRegex.= '(?:(?<![[:punct:]])(\s|\.?$))/i';      // Doesn't end with punctuation and is end of string, or has whitespace after
$output = preg_replace($URLRegex, "$2[link=$3]$3[/link]$5", $output);

# MATCH 'www.link.com' and make it a bbcode link
$URLRegex2 = '/(?:(?<!(\[\/link\]|\[\/link=))(\s|^))'; // No [link]-tag in front and is start of string, or has whitespace in front
$URLRegex2.= '(';                                    // Start capturing URL
$URLRegex2.= 'www\.';                                // Literal "www." prefix (no protocol given)
$URLRegex2.= '\S+';                                  // Any non-space character
$URLRegex2.= ')';                                    // Stop capturing URL
$URLRegex2.= '(?:(?<![[:punct:]])(\s|\.?$))/i';      // Doesn't end with punctuation and is end of string, or has whitespace after
$output = preg_replace($URLRegex2, "$2[link=http://$3]$3[/link]$4", $output); // No protocol group here, so the trailing whitespace is $4


# link up a [link=....]some words[/link]
$output = preg_replace_callback(
    "/\[link=(.*?):\/\/(.*?)\](.*?)\[\/link\]/is", 
    Array($this,'bbcode_format_link1'),
    $output);
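
The callback itself is not shown in the answer. Purely as an illustration, a minimal sketch of what such a method might look like follows. The class name BbcodeFormatter, the format method, and the scheme allow-list are assumptions; only the method name bbcode_format_link1 and the [link=...] regex come from the answer. It also assumes the text has not been HTML-escaped yet.

```php
<?php
// Hypothetical class; only the method name bbcode_format_link1 and the
// [link=...] regex are taken from the answer above.
class BbcodeFormatter
{
    // Assumed allow-list of schemes, mirroring the ones matched earlier.
    private const ALLOWED_SCHEMES = ['http', 'https', 'ftp', 'ftps', 'irc', 'ircs'];

    // $m[1] = scheme, $m[2] = rest of the URL, $m[3] = link text
    public function bbcode_format_link1(array $m): string
    {
        $label = htmlspecialchars($m[3], ENT_QUOTES);
        if (!in_array(strtolower($m[1]), self::ALLOWED_SCHEMES, true)) {
            return $label; // unknown scheme: keep the text, drop the link
        }
        $href = htmlspecialchars($m[1] . '://' . $m[2], ENT_QUOTES);
        return '<a href="' . $href . '">' . $label . '</a>';
    }

    // Runs the [link=...] regex and hands each match to the callback.
    public function format(string $output): string
    {
        return preg_replace_callback(
            '/\[link=(.*?):\/\/(.*?)\](.*?)\[\/link\]/is',
            array($this, 'bbcode_format_link1'),
            $output
        );
    }
}

$formatter = new BbcodeFormatter();
echo $formatter->format('[link=http://thisisalink.com]link[/link]');
```

Any real link checking (blacklists, redirects, nofollow rules) would go inside the callback; the allow-list here only stands in for that logic.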


Source: https://stackoverflow.com/questions/17151716/regex-for-bbcode-links-non-bbcode-urls
