Question
How can I get all the unclosed tags in a given string, preferably in the order in which they should be closed?
Note: assume there are no errors in the HTML and that it was simply cut off after X characters. This is not a case of bad HTML, overlapping tags, etc. Also there will be no ending
Example:
<p><span>Lorem</span><b>ipsum
---return---> </b></p>
-OR-
<ul><li>1</li><li>2
---return---> </li></ul>
That way, concatenating the string with the function's output re-creates valid HTML.
I'm not sure whether a RegExp would do the trick here. Basically, I want to get every tag between < and > that does not have a matching </ > closing tag.
Thank you.
Answer 1:
This is not an easy task. You might want to look at Tidy:
Tidy is a binding for the Tidy HTML clean and repair utility which allows you to not only clean and otherwise manipulate HTML documents, but also traverse the document tree.
http://php.net/manual/en/book.tidy.php
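If the input really is well-formed HTML that was only truncated, as the question states, a lighter alternative to Tidy may be enough: scan the complete tags in order and keep a stack of the ones still open. The sketch below is not from the original answer. It is written in Python for brevity (the same loop should port to PHP with preg_match_all), and the names unclosed_tags, TAG_RE and VOID_TAGS are made up for illustration. It assumes attribute values contain no < or > and that no tags appear inside comments or scripts.

```python
import re

# HTML void elements never take a closing tag, so they can never be "unclosed".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

# Matches one complete opening or closing tag.
# Group 1: "/" for a closing tag, group 2: tag name, group 3: "/" for <x/>.
# A tag that was cut off before its ">" does not match and is ignored.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9-]*)[^<>]*?(/?)>")


def unclosed_tags(fragment: str) -> str:
    """Return the closing tags needed to balance a truncated HTML fragment."""
    stack = []  # names of tags opened so far and not yet closed
    for closing, name, self_closing in TAG_RE.findall(fragment):
        name = name.lower()
        if name in VOID_TAGS or self_closing:
            continue  # nothing to close for <br>, <img ... />, etc.
        if closing:
            # Well-formed input means a closing tag matches the innermost open tag.
            if stack and stack[-1] == name:
                stack.pop()
        else:
            stack.append(name)
    # The innermost tag must be closed first, so emit the stack in reverse.
    return "".join(f"</{name}>" for name in reversed(stack))


print(unclosed_tags("<p><span>Lorem</span><b>ipsum"))  # </b></p>
print(unclosed_tags("<ul><li>1</li><li>2"))            # </li></ul>
```

One caveat: if the cut falls in the middle of a tag (for example after `<a href="x`), that partial tag is skipped by the pattern but is still present in the string, so it would need to be trimmed off before appending the result. For anything messier than plain truncation, a real parser such as Tidy is the safer route.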
Source: https://stackoverflow.com/questions/3822156/php-get-all-unclosed-html-tags-in-string