My first guess was the PHP DOM classes (with the formatOutput property). However, I cannot get this block of HTML to be formatted and output correctly. As you can see, the
Here's a relevant comment on php.net: http://ru2.php.net/manual/en/domdocument.save.php#88630
It looks like when you load HTML from a string (as you did), DOMDocument gets lazy and does not format anything in it.
Here's a working solution to your problem:
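// Clean your HTML by hand first: drop the whitespace between tags
$html = preg_replace('/>\s+</', '><', $html);
$dom = new DOMDocument;
// preserveWhiteSpace (capital S) only affects parsing, so set it before loading
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->loadHTML($html);
// Use saveXML(), not saveHTML()
print $dom->saveXML();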
Basically, you throw out the whitespace between tags and use saveXML() instead of saveHTML(). saveHTML() just does not apply the formatting in this situation. The catch is that saveXML() puts an XML declaration on the first line of the output.
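If the XML declaration is unwanted, one possible way around it is to pass a node to saveXML(), which serializes only that node and leaves the declaration out. The following is a minimal sketch, not a tested fix for your exact markup: the sample $html string and the $formatted variable are made up for illustration, and serializing from documentElement also drops the DOCTYPE that loadHTML() adds.

```php
<?php
// Hypothetical sample markup; substitute your own HTML string.
$html = '<div>   <p>Hello</p>   <p>World</p>   </div>';

// Drop the whitespace between tags, as in the solution above
$html = preg_replace('/>\s+</', '><', $html);

$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->loadHTML($html);

// Passing a node makes saveXML() dump just that node,
// so no XML declaration is emitted
$formatted = $dom->saveXML($dom->documentElement);
print $formatted . "\n";
```

Note that loadHTML() wraps a fragment in html and body elements, so those wrappers show up in the output as well.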