HTML-encoding lost when attribute read from input field

时光说笑 2020-11-21 04:04

I’m using JavaScript to pull a value out from a hidden field and display it in a textbox. The value in the hidden field is encoded.

For example,



        
25 answers
  •  南旧 (original poster)
    2020-11-21 04:42

    The jQuery trick doesn't encode quote marks and in IE it will strip your whitespace.

    Based on the escape templatetag in Django, which I guess is heavily used/tested already, I made this function which does what's needed.

    It's arguably simpler (and possibly faster) than any of the workarounds for the whitespace-stripping issue - and it encodes quote marks, which is essential if you're going to use the result inside an attribute value for example.

    function htmlEscape(str) {
        return str
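            // '&' goes first so the entities produced below are not escaped again
            .replace(/&/g, '&amp;')
            .replace(/"/g, '&quot;')
            .replace(/'/g, '&#39;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;');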
    }
    
    // I needed the opposite function today, so adding here too:
    function htmlUnescape(str){
        return str
            .replace(/&quot;/g, '"')
            .replace(/&#39;/g, "'")
            .replace(/&lt;/g, '<')
            .replace(/&gt;/g, '>')
            // '&amp;' goes last so text like '&amp;lt;' is not decoded twice
            .replace(/&amp;/g, '&');
    }
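
A quick round-trip check illustrates two details of the functions above: quote marks have to be encoded before a value goes inside an attribute, and `&` has to be handled first when escaping and last when unescaping. The sketch below repeats both functions so that it runs on its own; the sample strings are made up for illustration.

```javascript
function htmlEscape(str) {
    return str
        .replace(/&/g, '&amp;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

function htmlUnescape(str) {
    return str
        .replace(/&quot;/g, '"')
        .replace(/&#39;/g, "'")
        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&amp;/g, '&');
}

// Made-up sample value containing every character the functions handle.
var raw = 'chalk & "cheese" <b>it\'s</b>';
var escaped = htmlEscape(raw);

// Because the quote marks are encoded, the value cannot close the attribute early.
var markup = '<input type="text" value="' + escaped + '">';
console.log(markup);

// Text that already looks like an entity survives the round trip,
// which only works because '&amp;' is decoded last.
var tricky = '&lt;';
console.log(htmlUnescape(htmlEscape(tricky)) === tricky); // true
```

If `&amp;` were decoded first, `&amp;lt;` would become `&lt;` and then `<`, so the round trip would silently change the text.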
    

    Update 2013-06-17:
    In the search for the fastest escaping I have found this implementation of a replaceAll method:
    http://dumpsite.com/forum/index.php?topic=4.msg29#msg29
    (also referenced here: Fastest method to replace all instances of a character in a string)
    Some performance results here:
    http://jsperf.com/htmlencoderegex/25

    It produces the same result string as the built-in replace chains above. I'd be very happy if someone could explain why it's faster!
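
For comparison, another common way to avoid five separate passes over the string is a single regular expression with a lookup table. This is not the replaceAll implementation linked above, and no speed claim is made for it here; `HTML_ESCAPES` and `htmlEscapeSinglePass` are hypothetical names used only for this sketch.

```javascript
// Lookup table for the same five characters the replace chain handles.
var HTML_ESCAPES = {
    '&': '&amp;',
    '"': '&quot;',
    "'": '&#39;',
    '<': '&lt;',
    '>': '&gt;'
};

// Hypothetical helper: scans the string once and swaps each match via the table.
function htmlEscapeSinglePass(str) {
    return str.replace(/[&"'<>]/g, function (ch) {
        return HTML_ESCAPES[ch];
    });
}

console.log(htmlEscapeSinglePass('a < b && "c"'));
// a &lt; b &amp;&amp; &quot;c&quot;
```

Since each character is visited only once, the order of the entries does not matter here, unlike in the chained version where `&` has to be replaced first.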

    Update 2015-03-04:
    I just noticed that AngularJS are using exactly the method above:
    https://github.com/angular/angular.js/blob/v1.3.14/src/ngSanitize/sanitize.js#L435

    They add a couple of refinements: they appear to handle an obscure Unicode issue, and they convert all non-alphanumeric characters to entities. I was under the impression the latter is not necessary as long as you have a UTF-8 charset specified for your document.

    I will note that (4 years later) Django still does not do either of these things, so I'm not sure how important they are:
    https://github.com/django/django/blob/1.8b1/django/utils/html.py#L44
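
To make those two refinements more concrete, here is a rough sketch of the idea. It is not AngularJS's actual source: `htmlEscapeNumeric` is a hypothetical name, the regular expressions are assumptions chosen for illustration, and for simplicity it only converts characters outside ASCII rather than every non-alphanumeric character.

```javascript
// Hypothetical helper, written for illustration only.
function htmlEscapeNumeric(str) {
    return str
        .replace(/&/g, '&amp;')
        // Characters outside the BMP are stored as two UTF-16 code units
        // (a surrogate pair); emit one entity for the real code point.
        .replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g, function (pair) {
            return '&#' + pair.codePointAt(0) + ';';
        })
        // Any remaining non-ASCII character becomes a numeric entity.
        .replace(/[\u0080-\uFFFF]/g, function (ch) {
            return '&#' + ch.charCodeAt(0) + ';';
        })
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// U+00E9 (e with acute accent) and U+1F600 (an emoji stored as a surrogate pair).
console.log(htmlEscapeNumeric('caf\u00e9 <\uD83D\uDE00>'));
// caf&#233; &lt;&#128512;&gt;
```

The output is pure ASCII, so it displays correctly even if the page's charset is wrong or missing, which is presumably the motivation for this kind of refinement.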

    Update 2016-04-06:
    You may also wish to escape the forward slash /. This is not required for correct HTML encoding; however, OWASP recommends it as an anti-XSS safety measure. (Thanks to @JNF for suggesting this in the comments.)

            .replace(/\//g, '&#x2F;');
    
