How to find a URL in content with PHP?

Submitted by 情到浓时终转凉″ on 2020-01-11 11:58:10

Question


I need a simple preg_match that finds "c.aspx" (without quotes) in the content and, if it is found, returns the whole URL. For example:

$content = '<div>[4]<a href="/m/c.aspx?mt=01_9310ba801f1255e02e411d8a7ed53ef95235165ee4fb0226f9644d439c11039f%7c8acc31aea5ad3998&amp;n=783622212">New message</a><br/>';

Now it should match "c.aspx" in $content and give this output:

"/m/c.aspx?mt=01_9310ba801f1255e02e411d8a7ed53ef95235165ee4fb0226f9644d439c11039f%7c8acc31aea5ad3998&amp;n=783622212"

The $content will contain other links besides the "c.aspx" ones. I don't want those. I only want the URLs that contain "c.aspx".

Please let me know how I can do it.


Answer 1:


You use DOM to parse HTML, not regex. You can use regex to parse the attribute value though.

Edit: updated example so it checks for c.aspx.

$content = '<div>[4]<a href="/m/c.aspx?mt=01_9310ba801f1255e02e411d8a7ed53ef95235165ee4fb0226f9644d439c11039f%7c8acc31aea5ad3998&amp;n=783622212">New message</a>

<a href="#bar">foo</a>

<br/>';

$dom = new DOMDocument();
$dom->loadHTML($content);

$anchors = $dom->getElementsByTagName('a');

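// $anchors->length is already an integer, so compare it directly.
if ( $anchors->length > 0 ) {
    foreach ( $anchors as $anchor ) {
        if ( $anchor->hasAttribute('href') ) {
            $link = $anchor->getAttribute('href');
            // strpos() returns 0 for a match at the start, so test against false.
            if ( strpos( $link, 'c.aspx') !== false ) {
                echo $link;
            }
        }
    }
}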
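
The same DOM approach can be wrapped in a function that returns every matching URL as an array instead of echoing it. This is only a sketch: the function name findLinksContaining and the shortened sample URL are made up for illustration. Note that the DOM decodes entities, so an href written with &amp;amp; comes back with a plain &amp; in it.

```php
<?php
// Hypothetical helper (not part of the original answer): returns every
// href value in $html that contains $needle.
function findLinksContaining(string $html, string $needle): array
{
    // Collect libxml warnings internally so sloppy markup does not print notices.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $links = [];
    // Select only <a> elements that actually carry an href attribute.
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $href = $anchor->getAttribute('href');
        if (strpos($href, $needle) !== false) {
            $links[] = $href;
        }
    }
    return $links;
}

// Shortened sample markup, assumed for the example.
$content = '<div><a href="/m/c.aspx?mt=01&amp;n=7">New message</a>'
         . '<a href="#bar">foo</a><br/></div>';

$links = findLinksContaining($content, 'c.aspx');
print_r($links);
```

Filtering with strpos() in PHP rather than inside the XPath expression avoids having to quote the search string for XPath.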



Answer 2:


If you want to find any quoted string with c.aspx in it:

/"[^"]*c\.aspx[^"]*"|'[^']*c\.aspx[^']*'/

But really, for parsing most HTML you'd be better off with some sort of DOM parser so that you can be sure what you're matching is really an href.
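
For completeness, here is a rough sketch of how that pattern could be used with preg_match_all. The sample markup is shortened and invented for the example. Each match still includes its surrounding quotes, so they are trimmed afterwards, and unlike the DOM the regex leaves entities such as &amp;amp; untouched.

```php
<?php
// Shortened sample markup, assumed for the example; it mixes double- and
// single-quoted attributes to exercise both halves of the pattern.
$content = '<div>[4]<a href="/m/c.aspx?mt=01&amp;n=7">New message</a> '
         . '<a href=\'/x/c.aspx?id=2\'>Other</a> <a href="#bar">foo</a><br/>';

// The pattern from the answer above, written as a PHP string literal.
$pattern = '/"[^"]*c\.aspx[^"]*"|\'[^\']*c\.aspx[^\']*\'/';

preg_match_all($pattern, $content, $matches);

// $matches[0] holds the full matches, quotes included; strip the quotes.
$urls = array_map(function ($match) {
    return trim($match, "\"'");
}, $matches[0]);

print_r($urls);
```

If the decoded URL is needed, html_entity_decode() can be applied to each result.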



Source: https://stackoverflow.com/questions/1449618/how-to-find-a-url-from-a-content-by-php
