Unescape HTML entities in Javascript?

前端未结

关注

 30  2912

野趣味

I have some Javascript code that communicates with an XML-RPC backend. The XML-RPC returns strings of the form:

相关标签:

30条回答

暖寄归人

2020-11-21 05:57

You're welcome...just a messenger...full credit goes to ourcodeworld.com, link below.

window.htmlentities = {
        /**
         * Converts a string to its html characters completely.
         *
         * @param {String} str String with unescaped HTML characters
         **/
        encode : function(str) {
            var buf = [];

            for (var i=str.length-1;i>=0;i--) {
                buf.unshift(['&#', str[i].charCodeAt(), ';'].join(''));
            }

            return buf.join('');
        },
        /**
         * Converts an html characterSet into its original character.
         *
         * @param {String} str htmlSet entities
         **/
        decode : function(str) {
            return str.replace(/&#(\d+);/g, function(match, dec) {
                return String.fromCharCode(dec);
            });
        }
    };

Full Credit: https://ourcodeworld.com/articles/read/188/encode-and-decode-html-entities-using-pure-javascript

0 讨论(0)

北恋

2020-11-21 05:57

I use this in my project: inspired by other answers but with an extra secure parameter, can be useful when you deal with decorated characters

var decodeEntities=(function(){

    var el=document.createElement('div');
    return function(str, safeEscape){

        if(str && typeof str === 'string'){

            str=str.replace(/\</g, '&lt;');

            el.innerHTML=str;
            if(el.innerText){

                str=el.innerText;
                el.innerText='';
            }
            else if(el.textContent){

                str=el.textContent;
                el.textContent='';
            }

            if(safeEscape)
                str=str.replace(/\</g, '&lt;');
        }
        return str;
    }
})();

And it's usable like:

var label='safe <b> character &eacute;ntity</b>';
var safehtml='<div title="'+decodeEntities(label)+'">'+decodeEntities(label, true)+'</div>';

0 讨论(0)

慢半拍i

2020-11-21 05:58
If you're using jQuery:
```
function htmlDecode(value){ 
  return $('<div/>').html(value).text(); 
}
```
Otherwise, use Strictly Software's Encoder Object, which has an excellent htmlDecode() function.
0 讨论(0)
发布评论:

提交评论
- 加载中...

半阙折子戏

2020-11-21 05:58

function decodeHTMLContent(htmlText) {
  var txt = document.createElement("span");
  txt.innerHTML = htmlText;
  return txt.innerText;
}

var result = decodeHTMLContent('One &amp; two &amp; three');
console.log(result);

0 讨论(0)

醉话见心

2020-11-21 05:59
Most answers given here have a huge disadvantage: if the string you are trying to convert isn't trusted then you will end up with a Cross-Site Scripting (XSS) vulnerability. For the function in the accepted answer, consider the following:
```
htmlDecode("<img src='dummy' onerror='alert(/xss/)'>");
```
The string here contains an unescaped HTML tag, so instead of decoding anything the htmlDecode function will actually run JavaScript code specified inside the string.

This can be avoided by using DOMParser which is supported in all modern browsers:
```
function htmlDecode(input) {
  var doc = new DOMParser().parseFromString(input, "text/html");
  return doc.documentElement.textContent;
}

console.log(  htmlDecode("&lt;img src='myimage.jpg'&gt;")  )    
// "<img src='myimage.jpg'>"

console.log(  htmlDecode("<img src='dummy' onerror='alert(/xss/)'>")  )  
// ""
```
This function is guaranteed to not run any JavaScript code as a side-effect. Any HTML tags will be ignored, only text content will be returned.

Compatibility note: Parsing HTML with DOMParser requires at least Chrome 30, Firefox 12, Opera 17, Internet Explorer 10, Safari 7.1 or Microsoft Edge. So all browsers without support are way past their EOL and as of 2017 the only ones that can still be seen in the wild occasionally are older Internet Explorer and Safari versions (usually these still aren't numerous enough to bother).
0 讨论(0)
发布评论:

提交评论
- 加载中...

自闭症患者

2020-11-21 06:02

There is an variant that 80% as productive as the answers at the very top.

See the benchmark: https://jsperf.com/decode-html12345678/1

performance test

console.log(decodeEntities('test: &gt'));

function decodeEntities(str) {
  // this prevents any overhead from creating the object each time
  const el = decodeEntities.element || document.createElement('textarea')

  // strip script/html tags
  el.innerHTML = str
    .replace(/<script[^>]*>([\S\s]*?)<\/script>/gmi, '')
    .replace(/<\/?\w(?:[^"'>]|"[^"]*"|'[^']*')*>/gmi, '');

  return el.value;
}

If you need to leave tags, then remove the two .replace(...) calls (you can leave the first one if you do not need scripts).

0 讨论(0)