Replace URLs in text with HTML links

栀梦 2020-11-22 11:27

Here is a design question: for example, if I put a link such as

http://example.com

in textarea. How do I get PHP t

17 answers
  • 2020-11-22 11:44

    I know this answer has been accepted and that this question is quite old, but it can be useful for other people looking for other implementations.

    This is a modified version of the code posted by: Angel.King.47 on July 27,09:

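    // Note: the /e (eval) modifier used below was deprecated in PHP 5.5 and removed in PHP 7.0;
    // on current PHP versions the same rules need preg_replace_callback().
    $text = preg_replace(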
     array(
       '/(^|\s|>)(www.[^<> \n\r]+)/iex',
       '/(^|\s|>)([_A-Za-z0-9-]+(\\.[A-Za-z]{2,3})?\\.[A-Za-z]{2,4}\\/[^<> \n\r]+)/iex',
       '/(?(?=<a[^>]*>.+<\/a>)(?:<a[^>]*>.+<\/a>)|([^="\']?)((?:https?):\/\/([^<> \n\r]+)))/iex'
     ),  
     array(
       "stripslashes((strlen('\\2')>0?'\\1<a href=\"http://\\2\" target=\"_blank\">\\2</a>&nbsp;\\3':'\\0'))",
       "stripslashes((strlen('\\2')>0?'\\1<a href=\"http://\\2\" target=\"_blank\">\\2</a>&nbsp;\\4':'\\0'))",
       "stripslashes((strlen('\\2')>0?'\\1<a href=\"\\2\" target=\"_blank\">\\3</a>&nbsp;':'\\0'))",
     ),  
     $text
    );
    

    Changes:

    • I removed rules #2 and #3 (I'm not sure in which situations they are useful).
    • Removed email parsing as I really don't need it.
    • I added one more rule that recognizes URLs of the form [domain]/* (without www), for example "example.com/faq/". Multi-part TLDs are matched as domain.{2-3}.{2-4}/.
    • When parsing strings starting with "http://", the prefix is removed from the link label.
    • Added "target='_blank'" to all links.
    • URLs can be specified just after any(?) tag. For example: <b>www.example.com</b>

    As "Søren Løvborg" has stated, this function does not escape the URLs. I tried his/her class but it just didn't work as I expected (If you don't trust your users, then try his/her code first).
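Because the `/e` modifier in the answer above no longer exists in PHP 7 and later, here is a minimal sketch of how its first rule (bare "www." addresses) might look with `preg_replace_callback()`. The function name `linkifyWww` is hypothetical, the trailing `&nbsp;` is dropped, and the `htmlspecialchars()` call is an addition that assumes the input is plain text rather than already-escaped HTML.

```php
<?php
// Hypothetical helper (not from the original answer): handles only the "www." rule.
function linkifyWww(string $text): string
{
    return preg_replace_callback(
        // Group 1: start of text, whitespace, or ">"; group 2: the www address.
        '/(^|\s|>)(www\.[^<>\s]+)/i',
        function (array $m): string {
            // Escape the captured text before writing it into the attribute and the label.
            $url = htmlspecialchars($m[2], ENT_QUOTES);
            return $m[1] . '<a href="http://' . $url . '" target="_blank">' . $url . '</a>';
        },
        $text
    );
}

// prints: Visit <a href="http://www.example.com/faq" target="_blank">www.example.com/faq</a> today
echo linkifyWww('Visit www.example.com/faq today');
```

The other two rules could be ported the same way, each with its own callback.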

  • 2020-11-22 11:45

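    Something along the lines of:

    <?php
    // Match a URL at the start of the input, up to the first whitespace or the end of the string.
    // (PHP's preg functions have no "g" modifier.)
    if (preg_match('@^http://(.*?)(?:\s|$)@', $textarea_url, $matches)) {
        $url = htmlspecialchars($matches[1]); // escape before printing into HTML
        echo '<a href="http://', $url, '">', $url, '</a>';
    }
    ?>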
    
  • 2020-11-22 11:45

    This class I created works for my needs, although admittedly it does need some work:

    class addLink
    {
        public function link($string)
        {
            $expression = "/(?i)\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,63}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))/";
            // preg_replace() rewrites every match; $1 is the full matched URL.
            // Note: matches without a scheme (e.g. "www.example.com") still need "http://" prepended to work as an href.
            return preg_replace($expression, '<a href="$1" target="_blank">$1</a>', $string);
        }
    }
    

    An example of using this code:

    include 'PHP/addLink.php';
    
    if (class_exists('addLink'))
    {
        $al = new addLink();
        // Only call link() when the class was actually loaded.
        $paragraph = $al->link($paragraph);
    }
    else
    {
        echo 'Class not found...';
    }
    
  • 2020-11-22 11:46

    This class converts links into plain text while keeping links to the home URL as they are. I hope it helps and saves you some time. Enjoy.

    class RegClass 
    { 
    
         function preg_callback_url($matches) 
         { 
            //var_dump($matches); 
            //Get the matched URL  text <a>text</a>
            $text = $matches[2];
            //Get the matched URL link <a href ="http://www.test.com">text</a>
            $url = $matches[1];
    
            if($url=='href ="http://www.test.com"'){
             //replace all a tag as it is
             return '<a href='.$url.' rel="nofollow"> '.$text.' </a>'; 
    
             }else{
             //replace all a tag to text
             return " $text " ;
             }
    } 
    function ParseText($text){ 
    
        // Give bare "www." addresses a scheme, then undo the doubled scheme
        // on addresses that already had one.
        $text = preg_replace( "/www\./", "http://www.", $text );
        $regex = "/http:\/\/http:\/\/www\./";
        $text = preg_replace( $regex, "http://www.", $text );
        $regex2 = "/https:\/\/http:\/\/www\./";
        $text = preg_replace( $regex2, "https://www.", $text );
    
        return preg_replace_callback('/<a\s(.+?)>(.+?)<\/a>/is',
                array( $this, 'preg_callback_url'), $text); 
    } 
    
    } 
    $regexp = new RegClass();
    echo $regexp->ParseText($text);
    
  • 2020-11-22 11:50

    Let's look at the requirements. You have some user-supplied plain text, which you want to display with hyperlinked URLs.

    1. The "http://" protocol prefix should be optional.
    2. Both domains and IP addresses should be accepted.
    3. Any valid top-level domain should be accepted, e.g. .aero and .xn--jxalpdlp.
    4. Port numbers should be allowed.
    5. URLs must be allowed in normal sentence contexts. For instance, in "Visit stackoverflow.com.", the final period is not part of the URL.
    6. You probably want to allow "https://" URLs too, and perhaps other schemes as well.
    7. As always when displaying user supplied text in HTML, you want to prevent cross-site scripting (XSS). Also, you'll want ampersands in URLs to be correctly escaped as &amp;.
    8. You probably don't need support for IPv6 addresses.
    9. Edit: As noted in the comments, support for email addresses is definitely a plus.
    10. Edit: Only plain text input is to be supported – HTML tags in the input should not be honoured. (The Bitbucket version supports HTML input.)
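Requirement 7 in the list above is the easiest one to overlook. As a minimal sketch (the sample `$url` and `$label` values are made up for illustration), PHP's `htmlspecialchars()` covers both the ampersand escaping and the XSS concern when it is applied to each piece of untrusted text as it is written into the HTML:

```php
<?php
// Illustrative values only; in practice both come from user-supplied text.
$url   = 'http://www.google.com/search?rls=en&q=42';
$label = '<script>alert(1)</script>';

// htmlspecialchars() turns &, <, > and " into HTML entities,
// so neither value can break out of the attribute or inject markup.
$link = '<a href="' . htmlspecialchars($url) . '">' . htmlspecialchars($label) . '</a>';

// prints: <a href="http://www.google.com/search?rls=en&amp;q=42">&lt;script&gt;alert(1)&lt;/script&gt;</a>
echo $link;
```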

    Edit: Check out GitHub for the latest version, with support for email addresses, authenticated URLs, URLs in quotes and parentheses, HTML input, as well as an updated TLD list.

    Here's my take:

    <?php
    $text = <<<EOD
    Here are some URLs:
    stackoverflow.com/questions/1188129/pregreplace-to-detect-html-php
    Here's the answer: http://www.google.com/search?rls=en&q=42&ie=utf-8&oe=utf-8&hl=en. What was the question?
    A quick look at http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax is helpful.
    There is no place like 127.0.0.1! Except maybe http://news.bbc.co.uk/1/hi/england/surrey/8168892.stm?
    Ports: 192.168.0.1:8080, https://example.net:1234/.
    Beware of Greeks bringing internationalized top-level domains: xn--hxajbheg2az3al.xn--jxalpdlp.
    And remember.Nobody is perfect.
    
    <script>alert('Remember kids: Say no to XSS-attacks! Always HTML escape untrusted input!');</script>
    EOD;
    
    $rexProtocol = '(https?://)?';
    $rexDomain   = '((?:[-a-zA-Z0-9]{1,63}\.)+[-a-zA-Z0-9]{2,63}|(?:[0-9]{1,3}\.){3}[0-9]{1,3})';
    $rexPort     = '(:[0-9]{1,5})?';
    $rexPath     = '(/[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]*?)?';
    $rexQuery    = '(\?[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
    $rexFragment = '(#[!$-/0-9:;=@_\':;!a-zA-Z\x7f-\xff]+?)?';
    
    // Solution 1:
    
    function callback($match)
    {
        // Prepend http:// if no protocol specified
        $completeUrl = $match[1] ? $match[0] : "http://{$match[0]}";
    
        return '<a href="' . $completeUrl . '">'
            . $match[2] . $match[3] . $match[4] . '</a>';
    }
    
    print "<pre>";
    print preg_replace_callback("&\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\s|$))&",
        'callback', htmlspecialchars($text));
    print "</pre>";
    
    • To properly escape < and & characters, I throw the whole text through htmlspecialchars before processing. This is not ideal, as the HTML escaping can cause misdetection of URL boundaries.
    • As demonstrated by the "And remember.Nobody is perfect." line (in which remember.Nobody is treated as a URL, because of the missing space), further checking on valid top-level domains might be in order.

    Edit: The following code fixes the above two problems, but is quite a bit more verbose since I'm more or less re-implementing preg_replace_callback using preg_match.

    // Solution 2:
    
    $validTlds = array_fill_keys(explode(" ", ".aero .asia .biz .cat .com .coop .edu .gov .info .int .jobs .mil .mobi .museum .name .net .org .pro .tel .travel .ac .ad .ae .af .ag .ai .al .am .an .ao .aq .ar .as .at .au .aw .ax .az .ba .bb .bd .be .bf .bg .bh .bi .bj .bm .bn .bo .br .bs .bt .bv .bw .by .bz .ca .cc .cd .cf .cg .ch .ci .ck .cl .cm .cn .co .cr .cu .cv .cx .cy .cz .de .dj .dk .dm .do .dz .ec .ee .eg .er .es .et .eu .fi .fj .fk .fm .fo .fr .ga .gb .gd .ge .gf .gg .gh .gi .gl .gm .gn .gp .gq .gr .gs .gt .gu .gw .gy .hk .hm .hn .hr .ht .hu .id .ie .il .im .in .io .iq .ir .is .it .je .jm .jo .jp .ke .kg .kh .ki .km .kn .kp .kr .kw .ky .kz .la .lb .lc .li .lk .lr .ls .lt .lu .lv .ly .ma .mc .md .me .mg .mh .mk .ml .mm .mn .mo .mp .mq .mr .ms .mt .mu .mv .mw .mx .my .mz .na .nc .ne .nf .ng .ni .nl .no .np .nr .nu .nz .om .pa .pe .pf .pg .ph .pk .pl .pm .pn .pr .ps .pt .pw .py .qa .re .ro .rs .ru .rw .sa .sb .sc .sd .se .sg .sh .si .sj .sk .sl .sm .sn .so .sr .st .su .sv .sy .sz .tc .td .tf .tg .th .tj .tk .tl .tm .tn .to .tp .tr .tt .tv .tw .tz .ua .ug .uk .us .uy .uz .va .vc .ve .vg .vi .vn .vu .wf .ws .ye .yt .yu .za .zm .zw .xn--0zwm56d .xn--11b5bs3a9aj6g .xn--80akhbyknj4f .xn--9t4b11yi5a .xn--deba0ad .xn--g6w251d .xn--hgbk6aj7f53bba .xn--hlcj6aya9esc7a .xn--jxalpdlp .xn--kgbechtv .xn--zckzah .arpa"), true);
    
    $position = 0;
    while (preg_match("{\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\s|$))}", $text, $match, PREG_OFFSET_CAPTURE, $position))
    {
        list($url, $urlPosition) = $match[0];
    
        // Print the text leading up to the URL.
        print(htmlspecialchars(substr($text, $position, $urlPosition - $position)));
    
        $domain = $match[2][0];
        $port   = $match[3][0];
        $path   = $match[4][0];
    
        // Check if the TLD is valid - or that $domain is an IP address.
        $tld = strtolower(strrchr($domain, '.'));
        if (preg_match('{\.[0-9]{1,3}}', $tld) || isset($validTlds[$tld]))
        {
            // Prepend http:// if no protocol specified
            $completeUrl = $match[1][0] ? $url : "http://$url";
    
            // Print the hyperlink.
            printf('<a href="%s">%s</a>', htmlspecialchars($completeUrl), htmlspecialchars("$domain$port$path"));
        }
        else
        {
            // Not a valid URL.
            print(htmlspecialchars($url));
        }
    
        // Continue text parsing from after the URL.
        $position = $urlPosition + strlen($url);
    }
    
    // Print the remainder of the text.
    print(htmlspecialchars(substr($text, $position)));
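The TLD check at the heart of Solution 2 can also be looked at in isolation. The following is a small sketch under stated assumptions: the function name `hasKnownTld` is hypothetical, the three-entry list stands in for the full `$validTlds` array, and the IP-address branch of the original check is left out.

```php
<?php
// Hypothetical helper (not part of the original answer).
function hasKnownTld(string $domain, array $validTlds): bool
{
    // strrchr() returns the part of the string from the last "." onward, e.g. ".com".
    $tld = strrchr($domain, '.');
    if ($tld === false) {
        return false; // no dot at all, so there is no TLD to look up
    }
    // The TLDs are array keys, so isset() is a constant-time lookup.
    return isset($validTlds[strtolower($tld)]);
}

// Deliberately tiny list for illustration; the list in the answer is much longer.
$validTlds = array_fill_keys(explode(' ', '.com .org .net'), true);

var_dump(hasKnownTld('stackoverflow.com', $validTlds)); // bool(true)
var_dump(hasKnownTld('remember.Nobody', $validTlds));   // bool(false)
```

Storing the TLDs as keys rather than values is what lets the check use `isset()` instead of scanning the list with `in_array()`.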
    
  • 2020-11-22 11:50

    Here is a function that does this with regular expressions:

    <?php
    //Function definition
    function MakeUrls($str)
    {
    $find=array('`((?:https?|ftp)://\S+[[:alnum:]]/?)`si','`((?<!//)(www\.\S+[[:alnum:]]/?))`si');
    
    $replace=array('<a href="$1" target="_blank">$1</a>', '<a href="http://$1" target="_blank">$1</a>');
    
    return preg_replace($find,$replace,$str);
    }
    //Function testing
    $str="www.cloudlibz.com";
    $str=MakeUrls($str);
    echo $str;
    ?>
    