How can I use php to remove tags with empty text node?

Posted by ≯℡__Kan透↙ on 2019-12-05 08:06:33

Question


How can I use php to remove tags with empty text node?

For instance,

<div class="box"></div> remove

<a href="#"></a> remove

<p><a href="#"></a></p> remove

<span style="..."></span> remove

But I want to keep tags that contain a text node, like this,

<a href="#">link</a> keep

Edit:

I want to remove something messy like this too,

<p><strong><a href="http://xx.org.uk/dartmoor-arts"></a></strong></p>
<p><strong><a href="http://xx.org.uk/depw"></a></strong></p>
<p><strong><a href="http://xx.org.uk/devon-guild-of-craftsmen"></a></strong></p>

I tested both of the regexes below,

$content = preg_replace('!<(.*?)[^>]*>\s*</\1>!','',$content);
$content = preg_replace('%<(.*?)[^>]*>\\s*</\\1>%', '', $content);

But they leave something like this,

<p><strong></strong></p>
<p><strong></strong></p>
<p><strong></strong></p>

Answer 1:


One way could be:

$dom = new DOMDocument();
$dom->loadHTML(
    '<p><strong><a href="http://xx.org.uk/dartmoor-arts">test</a></strong></p>
    <p><strong><a href="http://xx.org.uk/depw"></a></strong></p>
    <p><strong><a href="http://xx.org.uk/devon-guild-of-craftsmen"></a></strong></p>'
);

$xpath = new DOMXPath($dom);

// Remove elements that have no child nodes at all. Removing one can leave its
// parent empty, so query again until nothing matches.
while (($nodeList = $xpath->query('//*[not(node())]')) && $nodeList->length > 0) {
    foreach ($nodeList as $node) {
        $node->parentNode->removeChild($node);
    }
}

echo $dom->saveHTML();

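You will probably have to adapt this to your needs. As written, the query matches any element with no child nodes, so it also removes void elements such as <br> and <img>, and it may keep elements that contain only whitespace.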
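A variant of the same DOM approach, as a minimal sketch: it treats whitespace-only elements as empty and leaves a few void elements alone. The function name `removeEmptyElements`, the wrapper id `gr-wrapper`, and the short list of void elements are assumptions made for illustration, not part of the original answer. Input without a charset declaration is read by `loadHTML` as ISO-8859-1, so non-ASCII content may need extra handling.

```php
<?php
// Hypothetical helper (not from the original answer): removes elements whose
// text content is empty or whitespace-only, keeping a few void elements.
function removeEmptyElements(string $html): string
{
    $dom = new DOMDocument();
    // Collect libxml warnings instead of printing them for fragment markup.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment so its nodes can be found again after parsing.
    $dom->loadHTML('<div id="gr-wrapper">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Assumed, partial list of void elements to keep: br, hr, img, input.
    $query = '//div[@id="gr-wrapper"]//*[not(normalize-space())'
           . ' and not(self::br or self::hr or self::img or self::input)'
           . ' and not(.//br or .//hr or .//img or .//input)]';

    // Wrappers such as <p><strong><a></a></strong></p> match as a whole,
    // because their combined text content is empty, so one pass is enough.
    foreach ($xpath->query($query) as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    // Serialize only the children of the wrapper, not the added html/body.
    $wrapper = $xpath->query('//div[@id="gr-wrapper"]')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo removeEmptyElements(
    '<p><strong><a href="http://xx.org.uk/depw"></a></strong></p><a href="#">link</a>'
), "\n";
```

Because the query tests the text content of each element rather than its child count, nested empty wrappers are caught without re-running the query in a loop.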




Answer 2:


You should buffer the PHP output, then parse that output with some regex, like this:

// start buffering output
ob_start();
// do some output
echo '<div id="non-empty">I am not empty</div><a class="empty"></a>';
// at this point you want to output the contents to the client:
// fetch the buffered contents and discard the buffer without sending it,
// otherwise the unsanitized output is sent as well
$contents = ob_get_clean();
// replace empty html tags
$contents = preg_replace('%<(.*?)[^>]*>\\s*</\\1>%', '', $contents);
// echo the sanitized contents
echo $contents;

Let me know if this helps :)
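The buffering idea can also be written with a callback passed to `ob_start()`, which PHP calls with the buffered output before sending it. The sketch below is only an illustration: the callback name `stripEmptyTags` is made up, and the pattern uses `\w+` for the tag name instead of the `(.*?)` from the answer. Like the answer, it makes a single pass, so a wrapper that only becomes empty after its child is removed is left in place.

```php
<?php
// Hypothetical output callback (name is an assumption): receives the whole
// buffer when it is flushed and returns the text that is actually sent.
function stripEmptyTags(string $buffer): string
{
    // One pass: remove tags whose body is empty or whitespace-only.
    $clean = preg_replace('%<(\w+)\b[^>]*>\s*</\1>%', '', $buffer);
    // preg_replace() returns null on a regex error; fall back to the input.
    return $clean ?? $buffer;
}

// Register the callback, produce output, then flush through the callback.
ob_start('stripEmptyTags');
echo '<div id="non-empty">I am not empty</div><a class="empty"></a>';
ob_end_flush();
```

With the callback registered, there is no need to fetch the buffer and echo it again by hand.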




Answer 3:


You could do a regex replace like:

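// $content holds the HTML to clean up.
// Repeat the replacement until a pass changes nothing, so that tags
// emptied by the previous pass are removed as well.
do {
    $previous = $content;
    $content = preg_replace('!<(.*?)[^>]*>\s*</\1>!', '', $previous);
} while ($content !== $previous);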

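Each pass only removes the innermost empty tags, so putting the replacement in a loop that runs until the string stops changing also removes the wrappers that become empty along the way.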
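The same loop can be driven by the replacement count that `preg_replace()` reports through its fifth argument, instead of comparing strings. This is a minimal sketch under assumptions: the function name `removeEmptyTags` is hypothetical, and the pattern uses `\w+` for the tag name so that the captured name cannot be a mere prefix of the opening tag.

```php
<?php
// Hypothetical helper (name is an assumption): repeats the replacement
// until a pass removes nothing.
function removeEmptyTags(string $html): string
{
    // \w+ captures the whole tag name, so "<a href></a>" matches
    // while a mismatched pair like "<abbr></a>" does not.
    $pattern = '%<(\w+)\b[^>]*>\s*</\1>%';
    do {
        $result = preg_replace($pattern, '', $html, -1, $count);
        if ($result === null) {
            return $html; // regex error: return the last good string
        }
        $html = $result;
    } while ($count > 0); // $count is the number of tags removed in this pass
    return $html;
}

$messy = '<p><strong><a href="http://xx.org.uk/depw"></a></strong></p><a href="#">link</a>';
echo removeEmptyTags($messy), "\n"; // prints <a href="#">link</a>
```

For the sample above, the first pass removes the empty `<a>`, the second removes `<strong>`, the third removes `<p>`, and the fourth finds nothing and ends the loop.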



Source: https://stackoverflow.com/questions/6801586/how-can-i-use-php-to-remove-tags-with-empty-text-node
