I want to remove new lines from some HTML (with PHP), except inside <pre> tags, where whitespace is obviously important.
If the HTML is well formed, you can rely on the fact that <pre> tags aren't allowed to be nested. Make two passes: first, split the input into <pre> blocks and everything else. You can use a regular expression for this task. Then strip new lines from each non-pre block, and finally join all the blocks back together in their original order.
Note that most HTML isn't well formed, so this approach has limits: an unclosed or nested <pre> tag will make the split pick the wrong block boundaries.
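
The two passes described above could be sketched roughly like this in PHP. This is only a minimal sketch under the same assumption of well-formed input; the function name strip_newlines_outside_pre is made up for illustration, and the regular expression is one possible choice, not the only one.

```php
<?php
// Hypothetical helper (name is illustrative, not from any library).
// Assumes well-formed HTML in which <pre> blocks are closed and never nested.
function strip_newlines_outside_pre(string $html): string
{
    // Pass 1: split on <pre>...</pre> blocks. The capture group together with
    // PREG_SPLIT_DELIM_CAPTURE keeps the <pre> blocks in the result, so the
    // array alternates: non-pre, pre, non-pre, pre, ...
    $parts = preg_split(
        '~(<pre\b[^>]*>.*?</pre>)~is',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    if ($parts === false) {
        // The regex engine failed; return the input unchanged.
        return $html;
    }

    // Pass 2: even indexes hold the text outside <pre>; strip new lines there.
    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $parts[$i] = str_replace(["\r\n", "\r", "\n"], '', $part);
        }
    }

    // Join everything back together in the original order.
    return implode('', $parts);
}

$input = "<p>a\nb</p>\n<pre>x\ny</pre>\n<div>\nc</div>";
echo strip_newlines_outside_pre($input);
```

Removing new lines outright can glue two words together when a line break was the only whitespace between them; replacing them with a single space instead of an empty string is an alternative if that matters for your markup.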