I'm creating a little web app to help me manage and analyze the content of my websites, and cURL is my favorite new toy. I've figured out how to extract info about all sorts of elements, but two things still elude me: how do I extract the HTML comments from a page, and how do I get the HTML of a specific element?
For the HTML comments, a fast method is:

function getComments($html) {
    $rcomments = array();
    $comments = array();
    if (preg_match_all('#<!--(.*?)-->#s', $html, $rcomments)) {
        // $rcomments[0] holds the full matches including the delimiters;
        // $rcomments[1] holds the captured comment bodies
        foreach ($rcomments[1] as $c) {
            $comments[] = $c;
        }
        return $comments;
    } else {
        // No comments matched
        return null;
    }
}
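A minimal usage sketch, assuming the page is fetched with cURL as in the question (the URL below is a placeholder):

// Hypothetical example: fetch a page and list its comments.
$ch = curl_init('https://example.com/'); // placeholder URL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$html = curl_exec($ch);
curl_close($ch);

$comments = getComments($html);
if ($comments !== null) {
    foreach ($comments as $c) {
        echo trim($c), "\n";
    }
}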
For comments, you're looking for a recursive regex. For instance, to strip HTML comments:

preg_replace('/<!--(?(?=<!--)(?R)|.)*?-->/s', '', $yourHTML);

To find them:

preg_match_all('/(<!--(?(?=<!--)(?R)|.)*?-->)/s', $yourHTML, $comments);
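The gain over a flat pattern only shows up on nested comments, which are invalid HTML but do turn up in the wild. A small sketch with an invented sample string:

$html = 'a <!-- outer <!-- inner --> still outer --> b';
preg_match_all('/(<!--(?(?=<!--)(?R)|.)*?-->)/s', $html, $m);
// The recursive pattern treats the whole nested block as one comment:
// $m[0][0] === '<!-- outer <!-- inner --> still outer -->'
print_r($m[0]);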
Comment nodes should be easy to find in XPath with the comment() test, analogous to the text() test:

$comments = $xpath->query('//comment()'); // or another path, as you prefer

They are standard nodes: see the PHP manual entry for the DOMComment class.

To your other question, it's a bit trickier. The simplest way is to use saveXML() with its optional $node argument:

$html = $dom->saveXML($el); // $el is the element you want the HTML for
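Putting both parts together, a self-contained sketch (assuming $html was already fetched, e.g. with cURL; the //div path is just an example):

libxml_use_internal_errors(true); // real-world HTML is rarely well-formed

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Every comment node in the document
foreach ($xpath->query('//comment()') as $comment) {
    echo trim($comment->nodeValue), "\n";
}

// HTML of the first <div>, if there is one
$el = $xpath->query('//div')->item(0);
if ($el !== null) {
    echo $dom->saveXML($el);
}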
This regex should help:

\s*<!--[\s\S]+?-->

You can try it out in an online regex tester.
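Note that the leading \s* also swallows any whitespace before each comment, which matters if you want the raw comment text. A quick sketch with an invented input:

$html = "<p>hi</p>\n  <!-- first -->\n<!-- second -->";
preg_match_all('/\s*<!--[\s\S]+?-->/', $html, $m);
print_r($m[0]); // ["\n  <!-- first -->", "\n<!-- second -->"]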