I want to remove HTML tags from a given string using JavaScript. I looked into the current approaches, but they leave some problems unsolved.
Current solutions
If you want to keep invalid markup untouched, regular expressions are your best bet. Something like this might work:
text = html.replace(/<\/?(span|div|img|p...)\b[^<>]*>/g, "")
Expand (span|div|img|p...) into a list of all tags (or only those you want to remove). NB: the list must be sorted by length, longer tags first!
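As a rough sketch of that expansion, the pattern can be built from an array of tag names instead of being typed by hand. The helper names buildTagRegex and stripTags and the short tag list are made up for illustration, and the sketch assumes tag names are plain letters and digits, so they need no regex escaping:

```javascript
// Hypothetical helper: builds the tag-stripping pattern from a list of tag names.
// Assumes names contain only letters/digits, so no regex escaping is done.
function buildTagRegex(tags) {
  // Copy the array, then sort longer names first, as the note above advises.
  const sorted = [...tags].sort((a, b) => b.length - a.length);
  return new RegExp("<\\/?(" + sorted.join("|") + ")\\b[^<>]*>", "g");
}

// Hypothetical helper: removes only the listed tags and leaves everything else alone.
function stripTags(html, tags) {
  return html.replace(buildTagRegex(tags), "");
}

// Illustrative list only; extend it with the tags you actually want removed.
const tags = ["p", "pre", "span", "div", "img"];
const html = '<div><p>Hello <span class="x">world</span></p><pre>code</pre> 1 < 2</div>';

console.log(stripTags(html, tags)); // Hello worldcode 1 < 2
```

Tags that are not in the list, and stray characters such as the lone < in "1 < 2", pass through unchanged.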
This regex may give incorrect results in some edge cases (such as attributes containing < or > characters), but the only real alternative would be to write a complete HTML parser yourself. That would not be extremely complicated, but it might be overkill here. Let us know.
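To make that edge case concrete, here is a small sketch using the same pattern with a short, illustrative tag list (not the full list). An attribute value containing > ends the match early, so part of the tag is left behind:

```javascript
// Same pattern as above, with a short illustrative tag list.
const tagPattern = /<\/?(span|div|img|p)\b[^<>]*>/g;

// The alt attribute contains a ">" character.
const tricky = 'A<img alt="a > b" src="y.png">B';

// The match stops at the first ">", i.e. inside the attribute value.
const result = tricky.replace(tagPattern, "");

console.log(result); // A b" src="y.png">B
```

Whether this matters depends on how much you trust the input; for markup you control it may never come up.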