Sanitize <script> element contents

喜你入骨 submitted on 2019-12-01 08:51:38
Cody Gustafson

Edited for non-mutation of data.

If I'm interpreting this correctly, you want to prevent the user from ending the script tag prematurely within the user-submitted string. For HTML that can be done just as you stated, by adding a backslash to the closing tag: <\/script>. That should be the only escaping you have to worry about in that case. You shouldn't need to escape HTML comments, as the browser will interpret them as part of the JavaScript. If some older browsers don't default script tags to text/javascript (the language="javascript" attribute is deprecated), adding type='text/javascript' may be necessary.
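A minimal sketch of that backslash escaping, assuming the user string is first turned into a JavaScript string literal with JSON.stringify. The helper name escapeClosingScriptTag is hypothetical, not something from the question:

```javascript
// Hypothetical helper: stop a closing script tag inside generated
// JavaScript source from ending the surrounding <script> element.
function escapeClosingScriptTag(source) {
  // HTML tag names are case-insensitive, so match </SCRIPT as well.
  // Inside a JS string literal "\/" is just "/", so the value is unchanged.
  return source.replace(/<\/(script)/gi, '<\\/$1');
}

const userInput = 'x</script><script>alert(1)</script>';
const literal = JSON.stringify(userInput);     // quoted string literal
const safe = escapeClosingScriptTag(literal);  // no "</script" left in it

console.log('var data = ' + safe + ';');
```

This only covers the closing-tag case discussed in this answer; it does nothing about other sequences such as HTML comment openers.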


Based on Mike Samuel's answer here, I may have been wrong about not needing to escape HTML comments. However, I was not able to reproduce it in Chrome or Chromium.

SilverlightFox

Assuming that you're doing this:

Payload is set to

var data = '[this is user controlled data]';

and the rest of the code (assignment, quotes and semicolon) is generated by your application, then the encoding you want is hex entity encoding, i.e. JavaScript \xHH hex escapes.

See the OWASP XSS Prevention Cheat Sheet, Rule #3 for more information. This will convert

</script><script>alert("Muahahaha!")

into

var data = '\x3c\x2fscript\x3e\x3cscript\x3ealert\x28\x22Muahahaha\x21\x22\x29';

Try this and you will see that it stores the user-supplied string exactly, no matter what characters it contains. It also takes care of single and double quote encoding. As a super bonus, it is also suitable for use in HTML attributes:

<a onclick="alert('[user data]');" />

which normally would have to be HTML encoded again for correct display (because &amp; inside an HTML attribute is interpreted as &). However, hex entity encoding does not produce any characters with special meaning in HTML, so you get two for the price of one.
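A minimal sketch of the encoding described above, assuming the common rule of leaving ASCII letters and digits alone and escaping everything else. The function name jsHexEscape is hypothetical, and the \uXXXX branch for code units above 255 is an assumption added here, since \xHH can only express two hex digits:

```javascript
// Hypothetical helper: encode a string for use inside a quoted
// JavaScript string literal that the application generates.
function jsHexEscape(value) {
  let out = '';
  for (let i = 0; i < value.length; i++) {
    const ch = value[i];
    const code = value.charCodeAt(i);
    if (/[A-Za-z0-9]/.test(ch)) {
      out += ch; // ASCII alphanumerics pass through unchanged
    } else if (code < 256) {
      out += '\\x' + code.toString(16).padStart(2, '0');
    } else {
      // Assumption: fall back to \uXXXX for code units above 0xFF.
      out += '\\u' + code.toString(16).padStart(4, '0');
    }
  }
  return out;
}

const payload = '</script><script>alert("Muahahaha!")';
const statement = "var data = '" + jsHexEscape(payload) + "';";
console.log(statement);
```

Because the output contains only letters, digits and backslashes, the same encoded value can be placed inside a quoted string within an event handler attribute such as the onclick example.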

Update from comments

The OP indicated that the server-side code would be generated in the form

var data = <%= JSON.stringify(data) %>;

The above still applies. It is up to the JSON class to properly hex entity encode values as they're inserted into the JSON. This cannot easily be done outside of the class, as you'd effectively have to parse the JSON again to determine the current language context. I wouldn't recommend the simple option of escaping only the forward slash in </script>, because other sequences can end the grammar context, such as CDATA closing tags. Escape properly and your code will be future-proof and secure.
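As a rough sketch of what such encoding could look like in JavaScript, the hypothetical wrapper below (stringifyForScript is not a built-in) post-processes JSON.stringify output. It uses \uXXXX rather than \xHH because JSON itself has no \x escape, and it relies on the assumption that <, >, & and the U+2028/U+2029 line separators can only occur inside string literals in JSON.stringify output, which is what lets a blanket replacement work without re-parsing:

```javascript
// Hypothetical wrapper: JSON.stringify whose output contains no
// characters that the HTML parser treats specially inside <script>.
// Assumes `value` is JSON-serializable (JSON.stringify returns a string).
function stringifyForScript(value) {
  const escapes = {
    '<': '\\u003c',
    '>': '\\u003e',
    '&': '\\u0026',
    '\u2028': '\\u2028',
    '\u2029': '\\u2029',
  };
  return JSON.stringify(value).replace(/[<>&\u2028\u2029]/g, (ch) => escapes[ch]);
}

const data = { html: '</script><!--', note: 'a & b' };
console.log('var data = ' + stringifyForScript(data) + ';');
```

The escaped text is still valid JSON and parses back to the same value, so it can stand in for the JSON.stringify(data) call in the template.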
