Extract URLs from text in PHP

野趣味 2020-11-22 13:29

I have this text:

$string = "this is my friend's website http://example.com I think it is coll";

How can I extract the link into another variable?

14 Answers
  • 2020-11-22 13:41

    You could do it like this:

    <?php
    $string = "this is my friend's website http://example.com I think it is coll";
    echo explode(' ', strstr($string, 'http://'))[0]; // prints "http://example.com"
    
  • 2020-11-22 13:44

    If the text you extract the URLs from is user-submitted and you are going to display the results as links anywhere, you have to be very, VERY careful to avoid XSS vulnerabilities. The most prominent risk is "javascript:" protocol URLs, but malformed URLs can also trick your regex and/or the displaying browser into executing them as JavaScript URLs. At the very least, you should accept only URLs that start with "http", "https" or "ftp".

    There's also a blog entry by Jeff where he describes some other problems with extracting URLs.
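
    A minimal sketch of that scheme allowlist, assuming the URLs have already been extracted from the text. The function name safe_link and the $allowed parameter are made up for illustration and do not come from any library:

    ```php
    <?php
    // Hypothetical helper: render a URL as a link only if its scheme is allowlisted.
    function safe_link(string $url, array $allowed = ['http', 'https', 'ftp']): string
    {
        // Escape once; used both as the attribute value and as the link text.
        $escaped = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

        // parse_url() returns false or null when the URL is malformed or has no scheme.
        $scheme = parse_url($url, PHP_URL_SCHEME);
        if (!is_string($scheme) || !in_array(strtolower($scheme), $allowed, true)) {
            return $escaped; // not trusted: show it as plain text, not as a link
        }

        return '<a href="' . $escaped . '" rel="nofollow">' . $escaped . '</a>';
    }

    echo safe_link('http://example.com'), "\n";  // rendered as a link
    echo safe_link('javascript:alert(1)'), "\n"; // left as plain text
    ```

    Falling back to escaped plain text rather than dropping the value keeps the user's message readable while never putting an untrusted scheme into an href.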

  • 2020-11-22 13:44

    This regex works well for me, and I have checked it against many types of URL:

    <?php
    $string = "Thisregexfindurlhttp://www.rubular.com/r/bFHobduQ3n mixedwithstring";
    preg_match_all('/(https?|ssh|ftp):\/\/[^\s"]+/', $string, $url);
    $all_url = $url[0]; // array of all URLs found
    $one_url = $url[0][0]; // first URL in that array
    ?>
    

    I checked it against lots of URLs, which you can find here: http://www.rubular.com/r/bFHobduQ3n

  • 2020-11-22 13:47

    There are a lot of edge cases with URLs. For example, a URL may contain brackets, or it may not include a protocol. That's why a regex alone is not enough.

    I created a PHP library that can deal with many of these edge cases: Url highlight.

    Example:

    <?php
    
    use VStelmakh\UrlHighlight\UrlHighlight;
    
    $urlHighlight = new UrlHighlight();
    $urlHighlight->getUrls("this is my friend's website http://example.com I think it is coll");
    // return: ['http://example.com']
    

    For more details, see the readme. For the URL cases that are covered, see the tests.
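
    The two edge cases mentioned above (brackets and a missing protocol) are easy to reproduce. This is only a sketch with a deliberately simple pattern and a made-up sample sentence, not the library's implementation:

    ```php
    <?php
    // A simple scheme-based pattern, shown only to illustrate its limits.
    $pattern = '/https?:\/\/[^\s"]+/';
    $text = 'Docs (see http://example.com/page) or www.example.org';

    preg_match_all($pattern, $text, $m);
    print_r($m[0]);
    // The closing bracket is captured as part of the first URL, and
    // www.example.org is not matched at all because it has no protocol.
    ```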

  • 2020-11-22 13:48

    Probably the safest way is to reuse code from WordPress. Download the latest release (3.1.1 at the time of writing) and look at wp-includes/formatting.php. It has a function named make_clickable that takes plain text as its parameter and returns a formatted string. You can borrow its code for extracting URLs, although it is fairly complex.

    This one-line regex might be helpful:

    preg_match_all('#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#', $string, $match);
    

    However, this regex still cannot filter out some malformed URLs (e.g. http://google:ha.ckers.org).

    See also: How to mimic StackOverflow Auto-Link Behavior
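
    One possible way to drop malformed matches like the one above is to pass each candidate through PHP's built-in filter_var() with FILTER_VALIDATE_URL. This is a sketch, not a complete solution: the helper name extract_valid_urls is hypothetical, and the built-in validator has its own limits (for instance, it only accepts ASCII URLs).

    ```php
    <?php
    // Hypothetical helper: collect candidates with the regex above, then keep
    // only those that PHP's built-in URL validator accepts.
    function extract_valid_urls(string $text): array
    {
        preg_match_all('#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#', $text, $match);

        $valid = array_filter($match[0], function (string $url): bool {
            return filter_var($url, FILTER_VALIDATE_URL) !== false;
        });

        return array_values($valid); // reindex from 0
    }

    print_r(extract_valid_urls('see http://example.com/path and http://google:ha.ckers.org today'));
    ```

    With this sample text, only http://example.com/path should remain, because the validator does not accept "ha.ckers.org" as a port.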

  • 2020-11-22 13:54

    Here is a function I use. I can't remember where it came from, but it seems to do a pretty good job of finding links in text and turning them into clickable links.

    You can change the function to suit your needs. I just wanted to share this as I was looking around and remembered I had this in one of my helper libraries.

    function make_links($str){
    
      $pattern = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';
    
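      return preg_replace_callback("#$pattern#i", function($matches) {
        $input = $matches[0];
        // Prepend a scheme when the match starts with "www." or a bare domain.
        $url = preg_match('!^https?://!i', $input) ? $input : "http://$input";
        // Escape the attribute value and the link text so a quote in the match
        // cannot break out of the markup; existing entities are not re-encoded.
        $flags = ENT_QUOTES | ENT_SUBSTITUTE;
        return '<a href="' . htmlspecialchars($url, $flags, 'UTF-8', false) . '" rel="nofollow" target="_blank">'
          . htmlspecialchars($input, $flags, 'UTF-8', false) . '</a>';
      }, $str);
    }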
    

    Use:

    $subject = "this is a link http://google:ha.ckers.org maybe don't want to visit it?";
    echo make_links($subject);
    

    Output

    this is a link <a href="http://google:ha.ckers.org" rel="nofollow" target="_blank">http://google:ha.ckers.org</a> maybe don't want to visit it?
    