Is there an easy way to take a string of html in JavaScript and strip out the html?
The function posted above by hypoxide works fine, but I was after something that would take HTML created in a web rich-text editor (for example FCKEditor) and strip out all the HTML while keeping the links. I wanted both the HTML and the plain-text version so that I could build the correct parts of an SMTP email (both HTML and plain text).
After a long time searching Google, my colleagues and I came up with this, using the regex engine in JavaScript:
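var str = 'this string has <i>html</i> code i want to <b>remove</b><br>Link Number 1 -><a href="http://www.bbc.co.uk">BBC</a> Link Number 1<br><p>Now back to normal text and stuff</p>';
// <br> tags become newlines
str = str.replace(/<br\s*\/?>/gi, "\n");
// opening <p> tags become newlines
str = str.replace(/<p\b[^>]*>/gi, "\n");
// <a href="url">text</a> becomes " text (Link->url) "
str = str.replace(/<a\b[^>]*?href="([^"]*)"[^>]*>(.*?)<\/a>/gi, " $2 (Link->$1) ");
// strip all remaining tags
str = str.replace(/<[^>]+>/g, "");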
Rendered in a browser, the str variable starts out like this:
this string has html code i want to remove
Link Number 1 ->BBC Link Number 1
Now back to normal text and stuff
After the code has run, it looks like this:
this string has html code i want to remove
Link Number 1 -> BBC (Link->http://www.bbc.co.uk) Link Number 1
Now back to normal text and stuff
As you can see, all the HTML has been removed and the link has been preserved, with the hyperlinked text still intact. I have also replaced the <br> and <p> tags with \n (newline character) so that some visual formatting is retained.
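For reuse, the replacements described above could be wrapped in a single function. This is only a sketch: the name htmlToText is made up for illustration, and the patterns assume reasonably well-formed markup with double-quoted href values and link text that stays on one line.

```javascript
// Hypothetical helper (the name is an assumption, not from the original post).
// Assumes well-formed HTML with double-quoted href attributes.
function htmlToText(html) {
  return html
    // <br>, <br/> and <br /> become newlines
    .replace(/<br\s*\/?>/gi, "\n")
    // opening <p ...> tags become newlines
    .replace(/<p\b[^>]*>/gi, "\n")
    // <a href="url">text</a> becomes " text (Link->url) "
    .replace(/<a\b[^>]*?href="([^"]*)"[^>]*>(.*?)<\/a>/gi, " $2 (Link->$1) ")
    // drop every remaining tag
    .replace(/<[^>]+>/g, "");
}

const sample = 'Read <b>this</b><br>at <a href="http://www.bbc.co.uk">BBC</a> now';
console.log(htmlToText(sample));
```

The character classes such as [^>]* keep each match inside a single tag, so one replacement cannot accidentally swallow the text between two tags.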
To change the link format (e.g. BBC (Link->http://www.bbc.co.uk)), just edit the $2 (Link->$1) replacement string, where $1 is the href URL/URI and $2 is the hyperlinked text. With the links placed directly in the body of the plain text, most mail clients convert them so that the user can click on them.
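If the link format needs more than a different template string, String.prototype.replace also accepts a function as its second argument. A rough sketch of that idea follows; the formatLinks name and the square-bracket style are illustrative choices only, and the pattern assumes double-quoted href values.

```javascript
// Hypothetical helper: rewrites each <a href="...">text</a> using a
// caller-supplied formatter. All names here are illustrative assumptions.
function formatLinks(html, formatter) {
  return html.replace(
    /<a\b[^>]*?href="([^"]*)"[^>]*>(.*?)<\/a>/gi,
    // match is the whole anchor, url is capture group 1, text is group 2
    (match, url, text) => formatter(text, url)
  );
}

const html = 'Visit <a href="http://www.bbc.co.uk">BBC</a> today';
console.log(formatLinks(html, (text, url) => `${text} [${url}]`));
// Visit BBC [http://www.bbc.co.uk] today
```

Square brackets are used here rather than angle brackets so that a later pass that strips the remaining tags does not remove the URL as well.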
Hope you find this useful.