Extract URLs from text in PHP

I have this text:

$string = "this is my friend's website http://example.com I think it is coll";

How can I extract the link into another variable?

14 answers

抹茶落季

2020-11-22 13:41

You could do it like this:

<?php
$string = "this is my friend's website http://example.com I think it is coll";
echo explode(' ', strstr($string, 'http://'))[0]; // prints http://example.com (only the first http:// URL)

北荒

2020-11-22 13:44

If the text you extract the URLs from is user-submitted and you are going to display the result as links anywhere, you have to be very, VERY careful to avoid XSS vulnerabilities. The most prominent risk is "javascript:" protocol URLs, but malformed URLs can also trick your regex and/or the displaying browser into executing them as JavaScript URLs. At the very least, you should accept only URLs that start with "http", "https" or "ftp".

There's also a blog entry by Jeff where he describes some other problems with extracting URLs.
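
The allowlist advice above can be sketched roughly as follows. `filter_safe_urls` and `render_link` are hypothetical helper names, and the sketch assumes the candidate URLs have already been extracted by some other means. It relies only on PHP's built-in `parse_url` and `htmlspecialchars`, and it is a starting point rather than a complete XSS defence.

```php
<?php
// Hypothetical helper: keep only URLs whose scheme is on the allowlist.
function filter_safe_urls(array $urls, array $allowedSchemes = ['http', 'https', 'ftp']): array
{
    $safe = [];
    foreach ($urls as $url) {
        $url = trim($url);
        // parse_url() yields null when the scheme is missing and false when the URL cannot be parsed.
        $scheme = parse_url($url, PHP_URL_SCHEME);
        if (is_string($scheme) && in_array(strtolower($scheme), $allowedSchemes, true)) {
            $safe[] = $url;
        }
    }
    return $safe;
}

// Hypothetical helper: render an already filtered URL as a link, escaped for HTML output.
function render_link(string $url): string
{
    $escaped = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    return '<a href="' . $escaped . '" rel="nofollow">' . $escaped . '</a>';
}
```

Filtering by scheme and escaping are kept as two separate steps on purpose: the first decides whether a URL may be linked at all, the second makes sure whatever is linked cannot break out of the attribute.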

庸人自扰

2020-11-22 13:44

This regex works well for me, and I have tested it against many kinds of URLs:

<?php
$string = "Thisregexfindurlhttp://www.rubular.com/r/bFHobduQ3n mixedwithstring";
preg_match_all('/(https?|ssh|ftp):\/\/[^\s"]+/', $string, $url);
$all_url = $url[0]; // array of all URLs found
$one_url = $url[0][0] ?? null; // the first URL found, or null if there was no match
?>

The URLs I tested it against can be found here: http://www.rubular.com/r/bFHobduQ3n

再見小時候

2020-11-22 13:47
URLs have a lot of edge cases. For example, a URL may contain brackets, or it may lack a protocol. That's why a regex alone is not enough.
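
To make those two edge cases concrete, here is a small sketch. The pattern and the `simple_extract` name are assumptions chosen for illustration, not code taken from any library.

```php
<?php
// Illustrative only: a simple scheme-based pattern, assumed for this sketch.
function simple_extract(string $text): array
{
    preg_match_all('~https?://[^\s"]+~', $text, $matches);
    return $matches[0];
}

// A URL wrapped in brackets: the closing bracket is swallowed into the match.
print_r(simple_extract('see (http://example.com/page) for details'));

// A URL without a protocol: nothing is matched at all.
print_r(simple_extract('visit www.example.com today'));
```

The first call returns `http://example.com/page)` with the stray bracket attached, and the second returns an empty array.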

I created a PHP library that deals with many of these edge cases: Url highlight.

Example:
```
<?php

use VStelmakh\UrlHighlight\UrlHighlight;

$urlHighlight = new UrlHighlight();
$urlHighlight->getUrls("this is my friend's website http://example.com I think it is coll");
// return: ['http://example.com']
```
For more details, see the readme. For the URL cases covered, see the tests.

长发绾君心

2020-11-22 13:48
Probably the safest way is to use code from WordPress. Download the latest version (3.1.1 at the time of writing) and look at wp-includes/formatting.php. It has a function named make_clickable that takes plain text as its parameter and returns a formatted string. You can borrow its code for extracting URLs, though it is fairly complex.

This one-line regex might be helpful:
```
preg_match_all('#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#', $string, $match);
```
But this regex still can't reject some malformed URLs (e.g. http://google:ha.ckers.org).
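
One way to catch such cases is to run each candidate through a stricter check after extraction. The sketch below is an assumption, not part of the regex above: `looks_like_valid_http_url` is a hypothetical name, and it combines PHP's built-in `filter_var` with `FILTER_VALIDATE_URL` and a deliberately narrow pattern for the host and port.

```php
<?php
// Hypothetical helper: a stricter post-check for URLs found by a looser pattern.
function looks_like_valid_http_url(string $url): bool
{
    // filter_var() with FILTER_VALIDATE_URL returns the URL on success and false on failure.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // Assumption: only plain host names with an optional numeric port are wanted,
    // so user:password@host forms and IPv6 literals are rejected as well.
    $hostPattern = '~^https?://[a-z0-9](?:[a-z0-9.-]*[a-z0-9])?(?::\d{1,5})?(?:[/?#]|\z)~i';
    return preg_match($hostPattern, $url) === 1;
}

var_dump(looks_like_valid_http_url('http://example.com'));          // bool(true)
var_dump(looks_like_valid_http_url('http://google:ha.ckers.org'));  // bool(false)
```

The second check is narrower than the URL syntax allows, which is a trade-off: it rejects some legitimate URLs in exchange for refusing anything with a non-numeric port.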

See also: How to mimic StackOverflow Auto-Link Behavior

别那么骄傲

2020-11-22 13:54

Here is a function I use. I can't remember where it came from, but it seems to do a pretty good job of finding URLs in text and turning them into links.

You can change the function to suit your needs. I just wanted to share this as I was looking around and remembered I had this in one of my helper libraries.

function make_links($str){

  $pattern = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';

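  return preg_replace_callback("#$pattern#i", function($matches) {
    $input = $matches[0];
    // Prepend a scheme when the match has none, e.g. www.example.com.
    $url = preg_match('!^https?://!i', $input) ? $input : "http://$input";
    // Escape both values so characters such as " or & cannot break out of the markup.
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '" rel="nofollow" target="_blank">'
      . htmlspecialchars($input, ENT_QUOTES, 'UTF-8') . '</a>';
  }, $str);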
}

Use:

$subject = "this is a link http://google:ha.ckers.org maybe don't want to visit it?";
echo make_links($subject);

Output:

this is a link <a href="http://google:ha.ckers.org" rel="nofollow" target="_blank">http://google:ha.ckers.org</a> maybe don't want to visit it?
