Regex problem in PHP


How can I match the HTML inside (and including) <div class="begin"> in PHP?

I need a


5 Answers

抹茶落季

2021-01-26 18:08

Use DOM and DOMXPath instead of regex, you'll thank me for it:

    // Helper: serialize a single node (and its children) to an HTML string.
    function dumpDomNode($node) {
        $temp = new DOMDocument();
        // A node has to be imported before it can be appended to another document.
        $temp->appendChild($temp->importNode($node, true));
        return $temp->saveHTML();
    }

    $dom = new DOMDocument();
    $dom->loadHTML($html_string);
    $xpath = new DOMXPath($dom);
    $elements = $xpath->query("//div[@class='begin']");
    foreach ($elements as $el) {
        echo dumpDomNode($el); // <-- or do something more useful with it
    }

Trying this with regex will lead you down the path to insanity...


陌清茗

2021-01-26 18:10

Here is your Regex:

preg_match('/<div class=\"begin\">.*<\/div>/simU', $string, $matches);

But:

Regular expressions do not know what XML/HTML elements are. To them, HTML is just a string. This is why the others are right: regular expressions are not for parsing a DOM. They are for finding string patterns.

I have provided the Regex because you do not intend to parse an entire HTML page, but just grab one defined piece of text from it, in which case a Regex is fine to use.

If there is a nested DIV inside the DIV, the Regex will not work as expected. If this is the case, do not use Regex. Use one of the other solutions, because then you need DOM parsing, not string matching.
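To make the nested case concrete, here is a small sketch that runs the pattern above against a made-up input (the `$string` value is an assumption for illustration, not from the question). Because the `U` flag makes `.*` lazy, the match ends at the first `</div>`, which belongs to the inner DIV:

```php
<?php
// Hypothetical input: a nested <div> inside the target <div class="begin">.
$string = '<p>before</p><div class="begin">outer <div>inner</div> tail</div><p>after</p>';

// Same pattern as above; with the U flag, .* stops at the first </div> it can reach.
preg_match('/<div class="begin">.*<\/div>/simU', $string, $matches);

// Prints: <div class="begin">outer <div>inner</div>
echo $matches[0], "\n";
```

The " tail</div>" part of the outer DIV is cut off, which is why a DOM parser is the safer choice once nesting is possible.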

For finding strings with a more or less clearly defined start and end, consider using regular string functions instead, as they are often quicker.
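As a rough sketch of the string-function route, assuming the start and end markers are fixed strings and the block is not nested: `extractBetween` below is a hypothetical helper written for this example, built only on `strpos`, `strlen` and `substr`.

```php
<?php
// Hypothetical helper (not from the answer): return the text from $start
// through the first following $end, or null if either marker is missing.
function extractBetween(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;
    }
    // Look for the end marker only after the start marker.
    $to = strpos($haystack, $end, $from + strlen($start));
    if ($to === false) {
        return null;
    }
    return substr($haystack, $from, $to + strlen($end) - $from);
}

$string = 'blah <div class="begin">some text</div> blah';

// Prints: <div class="begin">some text</div>
echo extractBetween($string, '<div class="begin">', '</div>'), "\n";
```

Like the regex, this stops at the first closing tag, so it shares the same limitation with nested DIVs.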


醉梦人生

2021-01-26 18:11

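    // Requires the PHP Simple HTML DOM Parser library (see the manual linked below).
    include 'simple_html_dom.php';

    // Create DOM from URL
    $html = file_get_html('http://example.org/');
    echo $html->find('div.begin', 0)->outertext;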

http://simplehtmldom.sourceforge.net/manual.htm


醉酒成梦

2021-01-26 18:11

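Here's one way using string functions:

    $str = <<<A
    blah <div class="begin"> blah blah blah blah blah </div> blah
    A;

    // Split on the closing tag, then print each chunk from the opening tag onward.
    $s = explode("</div>", $str);
    foreach ($s as $k => $v) {
        $m = strpos($v, '<div class="begin">');
        if ($m !== FALSE) {
            echo substr($v, $m);
        }
    }

Output (note that the closing </div> is not included, because explode() removes the delimiter):

    $ php test.php
    <div class="begin"> blah blah blah blah blah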


萌比男神i

2021-01-26 18:33

This sums it up pretty well.

In short, don't use regular expressions to parse HTML. Instead, look at the DOM classes and especially DOMDocument::loadHTML
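A minimal sketch of that approach, assuming the markup is available as a string (the `$html` value here is made up for illustration). It relies on `DOMDocument::saveHTML()` accepting a node argument, so only the matched subtree is serialized:

```php
<?php
// Hypothetical input; in practice this would be the page being processed.
$html = '<html><body><p>intro</p><div class="begin">outer <div>inner</div> tail</div></body></html>';

$dom = new DOMDocument();
// Real-world HTML is often not well-formed; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$result = '';
foreach ($xpath->query("//div[@class='begin']") as $div) {
    // Passing a node to saveHTML() serializes just that element and its children.
    $result .= $dom->saveHTML($div);
}
echo $result, "\n";
```

Unlike the regex, this keeps the nested DIV and the text after it, since the parser tracks which closing tag belongs to which element.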
