Extract text and links from HTML using Regular Expressions
Question

I would like to extract the text from an HTML document while keeping the links inside it. For example, from this HTML code:

<div class="CssClass21">bla1 bla1 bla1 <a href="http://www.ibrii.com">go to ibrii</a> bla2 bla2 bla2 <img src="http://www.contoso.com/hello.jpg"> <span class="cssClass34">hello hello</span>

I would like to extract just this:

bla1 bla1 bla1 <a href="http://www.ibrii.com">go to ibrii</a> bla2 bla2 bla2 hello hello

In another post on Stack Overflow I found the regex <[^>]*> which