Parse your HTML string or document with HTML Agility Pack. This gives you an HtmlDocument object that is very similar to an XmlDocument.
You can then use its methods, such as SelectNodes on the document's DocumentNode, to access the portions of the document that you are interested in.
If you choose another approach, be aware that parsing HTML (a non-regular language) with regular expressions is widely regarded as a bad idea.
Regardless of the approach, if you are keeping some markup, use a whitelist: remove everything that is not explicitly allowed.
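
To make the whitelist idea concrete, here is a minimal sketch. It does not use HTML Agility Pack; it uses Python's standard-library html.parser purely to illustrate the principle, and the same logic carries over to any real HTML parser. The allowed tags, the allowed attributes, the URL prefixes, and the names WhitelistSanitizer and sanitize are all assumptions chosen for the example, not a recommended policy.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical policy: tag name -> attributes permitted on that tag.
ALLOWED_TAGS = {"b": set(), "i": set(), "p": set(), "a": {"href"}}
# Elements whose text content is discarded along with the tag itself.
DROP_WITH_CONTENT = {"script", "style"}
# Only links starting with one of these prefixes are kept (assumption).
SAFE_URL_PREFIXES = ("http://", "https://", "mailto:")


class WhitelistSanitizer(HTMLParser):
    """Re-emits only whitelisted tags and attributes; everything else is dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag, keep its text content
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_TAGS[tag] or value is None:
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_URL_PREFIXES):
                continue  # rejects javascript: and other unlisted schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # Text is re-escaped so it can never be read back as markup.
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()  # flushes any buffered trailing text
    return "".join(parser.out)
```

With this policy, calling sanitize on `<p onclick="x()">Hi <b>there</b><script>alert(1)</script></p>` keeps the p and b elements, strips the onclick attribute, and discards the script element together with its content. The point is that nothing passes through unless the policy names it, which is the opposite of trying to list everything dangerous.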