I have this string:
var string = \'
You need a global search that matches any characters, any number of times, between < and >:
<script type="text/javascript">
var str='<article><img alt="Ice-cream" src=http://placehold.it/300x300g"><div style="float: right; width: 50px;"><p>Lorem Ipsum </p></div></article>';
var patt = /<.*?>/g; // lazy match, so each tag is matched separately; g replaces them all
var result = str.replace(patt, "");
console.log(result);
</script>
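The lazy quantifier is what makes this pattern work. As a minimal sketch (the shortened sample string and the variable names are illustrative assumptions, not from the original), compare the greedy and lazy forms:

```javascript
// Shortened sample input (an assumption for illustration).
var sample = '<div><p>Lorem Ipsum </p></div>';

// Greedy: .* runs from the first "<" to the last ">" on the line,
// so the whole string is consumed as a single match.
var greedyResult = sample.replace(/<.*>/g, '');

// Lazy: .*? stops at the nearest ">", so each tag is matched separately.
var lazyResult = sample.replace(/<.*?>/g, '');

console.log(JSON.stringify(greedyResult)); // ""
console.log(JSON.stringify(lazyResult));   // "Lorem Ipsum "
```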
You can use a regex to get the text from a string that contains HTML tags.
<script type="text/javascript">
var regex = /<(.|\n)*?>/g; // a regex literal (not a quoted string), with g to match every tag
var string = '<article><img alt="Ice-cream" src=http://placehold.it/300x300g"><div style="float: right; width: 50px;"><p>Lorem Ipsum </p></div></article>';
var result = string.replace(regex, "");
alert(result); // result should be "Lorem Ipsum "
</script>
This replaces every HTML tag with an empty string, leaving only the text.
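Two details decide whether this approach works: the pattern has to be a regex literal rather than a quoted string, and it needs the g flag. A small sketch of the difference, using a shortened sample string as an illustrative assumption:

```javascript
// Shortened sample input (an assumption for illustration).
var html = '<div><p>Lorem Ipsum </p></div>';

// A quoted string is searched for literally; it does not occur in html,
// so nothing is replaced.
var unchanged = html.replace("/<(.|\n)*?>/", "");

// A regex literal without the g flag replaces only the first tag.
var firstOnly = html.replace(/<(.|\n)*?>/, "");

// With the g flag, every tag is replaced.
var allTags = html.replace(/<(.|\n)*?>/g, "");

console.log(unchanged);               // <div><p>Lorem Ipsum </p></div>
console.log(firstOnly);               // <p>Lorem Ipsum </p></div>
console.log(JSON.stringify(allTags)); // "Lorem Ipsum "
```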
Let the browser do the sanitizing with this trick:
var str= '<article><img alt="Ice-cream" src=http://placehold.it/300x300g">'+
'<div style="float: right; width: 50px;"><p>Lorem Ipsum </p></div></article>';
var dummyNode = document.createElement('div'),
resultText = '';
dummyNode.innerHTML = str;
resultText = dummyNode.innerText || dummyNode.textContent;
This creates a dummy DOM element and sets its HTML content to the input string.
The text alone can then be read from the DOM property innerText or textContent.
This is also more reliable and robust than a regex, because the browser's own HTML parser already handles cases a pattern misses, such as a > inside a quoted attribute value.
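The same idea can be wrapped in a reusable function. This is only a sketch: the name htmlToText is hypothetical, and the regex fallback for environments without a document object (such as Node.js) is an added assumption, not part of the answer above.

```javascript
// Hypothetical helper: returns the text content of an HTML string.
function htmlToText(html) {
  if (typeof document !== 'undefined') {
    // Browser: let the HTML parser build the nodes, then read their text.
    var dummyNode = document.createElement('div');
    dummyNode.innerHTML = html;
    return dummyNode.innerText || dummyNode.textContent || '';
  }
  // No DOM available (assumed fallback): strip the tags with a regex instead.
  return html.replace(/<[^>]*>/g, '');
}

var text = htmlToText('<div style="float: right;"><p>Lorem Ipsum </p></div>');
console.log(JSON.stringify(text)); // "Lorem Ipsum "
```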