How to stop HTTP (and rfc822, email) header injection?

杀马特。学长 韩版系。学妹 submitted on 2019-12-04 18:23:04

Don't escape. Just stop processing (drop the header or the whole request).
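
A minimal sketch of the "stop processing" approach, assuming the caller turns the raised error into a dropped header or a rejected request. The function name require_no_linebreaks is made up for this illustration and is not part of any library:

```python
def require_no_linebreaks(value):
    """Return value unchanged, or raise ValueError if it contains CR or LF.

    Hypothetical helper: nothing is escaped or repaired, the input is
    simply refused so the caller can drop the header or the request.
    """
    if "\r" in value or "\n" in value:
        raise ValueError("header value contains a line break; refusing to use it")
    return value


# A harmless value passes through untouched.
print(require_no_linebreaks("text/plain"))
```

A value such as "a\r\nb: c" makes the function raise instead of returning anything.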

Abbafei

If the input data is being included in a header field parameter (for example the filename parameter of the Content-Disposition header), it can be encoded with email.utils.encode_rfc2231 (as constrained by these specifications, which define a variation of the rfc2231 encoding).
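
As a rough sketch of the parameter case, the following builds a Content-Disposition value with the filename* form. The function name content_disposition_attachment and the sample filename are assumptions made for this example; only email.utils.encode_rfc2231 comes from the standard library:

```python
import email.utils


def content_disposition_attachment(filename):
    """Build a Content-Disposition value with an encoded filename* parameter.

    Hypothetical helper for illustration only.
    """
    # encode_rfc2231 percent-encodes the value (CR and LF included) and
    # prefixes it with the charset and an empty language tag: UTF-8''...
    encoded = email.utils.encode_rfc2231(filename, "UTF-8")
    return "attachment; filename*=" + encoded


# The injected line break ends up percent-encoded instead of splitting the header.
print(content_disposition_attachment("report\r\nSet-Cookie: x=1.txt"))
```

Whether a given user agent decodes filename* as intended is a separate question that this sketch does not address.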

If it is not being included in a header field parameter, then it seems that this method cannot be used. In that situation, the safest bet is likely to not include the input at all, as Julian Reschke wrote. If you insist on including the input, you may want to try one of the following methods:

(These may be insecure: HTTP is not a MIME-compliant protocol, so unless the MIME-Version header is used (and possibly even if it is), these methods may not work correctly for HTTP.)

One way...

to do this is to use the email.header module, although it is not foolproof when used by itself: it accepts \r\n\r\n, which terminates the headers and starts the body! Therefore \r and \n need to be handled separately, unless they are followed by non-\r/\n whitespace (like tabs or spaces). The module is designed specifically for rfc822 headers (but seemingly not for HTTP headers, since the email package used to be several separate modules (example)), so it would seem to be the tool for the job. The Header class is meant for encoding header values, not the full Header-Name: value line, and so is a candidate for this job (where we want to validate or escape the value only).

(Hint: many of the tools in the email module are also handy when working with other MIME-format (and possibly also MIME-like) data; so are those in the cgi module, cgi.FieldStorage in particular for HTTP form parsing, although cgi was deprecated in Python 3.11 and removed in 3.13.)

However, email.header will only raise an error if the input seems malicious (that is, seems to contain another, embedded header); it does not, it seems, handle invalid input by escaping it (please correct this in the comments if that is wrong). In Python 3 the check runs in Header.encode(); calling str() on a Header returns the text without it. The charset parameter should escape the header fragment and return valid output, but it may not have such good compatibility with user agents (email, HTTP, etc.); see here. (Many HTTP user agents support the rfc5987 encoding, but not necessarily the encodings selected by the charset parameter of the email.header.Header class, which seems to use some MIME-specific encodings besides the rfc2231 encoding.)
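
To illustrate the hint, here is a small sketch that uses the email package to pick apart a MIME-style header block that did not come from an email. The raw string is invented for the example:

```python
import email

# A made-up header block followed by a body, in the usual MIME layout.
raw = "Content-Type: text/html; charset=utf-8\n\n<p>hello</p>\n"

msg = email.message_from_string(raw)

# The parsed message exposes the media type and its parameters separately.
print(msg.get_content_type())
print(msg.get_param("charset"))
```

This only shows parsing; it says nothing about whether a real HTTP message would be handled correctly by the same code.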
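
The effect of the charset parameter can be seen in a short sketch. The sample text is an assumption for illustration, and the exact encoded form (base64 or quoted-printable) is chosen by the library, so it is not shown here:

```python
from email.header import Header, decode_header, make_header

# With an explicit charset, encode() produces an ASCII-only MIME
# "encoded-word" of the form =?utf-8?...?= instead of the raw text.
encoded = Header("héllo wörld", charset="utf-8").encode()
print(encoded)

# A MIME-aware reader can turn it back into the original text.
decoded = str(make_header(decode_header(encoded)))
print(decoded)
```

A receiver that does not understand encoded-words will display the =?utf-8?...?= form literally, which is the compatibility concern raised above.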

Example:

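import email.header
import re

def check_string_for_rfc822_header(s):
    # Header.encode() raises email.errors.HeaderParseError if the value
    # appears to contain an embedded header (in Python 3, str() does not).
    wip_header_component = email.header.Header(s).encode()
    # Also reject any line break that is not followed by other whitespace,
    # i.e. anything that is not a folded continuation line.
    if re.search(r'(\r?\n[\S\n\r]|\r[\S\r])', wip_header_component):
        raise ValueError('unfolded line break in header value')
    return wip_header_component

# testing...
>>> check_string_for_rfc822_header("aaa")
'aaa'
>>> check_string_for_rfc822_header("a\r\nb")
<error>
>>> check_string_for_rfc822_header("a\r\nb: c")
<error>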

Another way...

to do this, it seems, would be to simply remove \r and \n characters (each separately, however; do not just remove occurrences of the full string \r\n, since that would still leave them unescaped when they occur separately, and many (most?) HTTP utils will accept each of them on its own!). Similarly, we can escape the header value by replacing each of \r\n, \r, and \n with itself followed by whitespace (which is how a header line is folded; see the standard).

However, this method does not take into account the details of the standards (for example, rfc822 headers must be ASCII), which may be exploitable on their own.

Example:

def remove_linebreakers(s):
    return s.replace("\n", "").replace("\r", "")

# or...
import re

def remove_linebreakers(s):
    return re.sub(r'[\n\r]', '', s)


# testing...
>>> remove_linebreakers("aaa")
"aaa"
>>> remove_linebreakers("a\r\nb")
"ab"
>>> remove_linebreakers("a\r\nb: c")
"ab: c"
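
The folding alternative (a line break followed by whitespace) could be sketched as below. The function name fold_linebreaks is made up for this example, and collapsing each run of line breaks into a single CRLF plus one space is an assumption of the sketch, not something the standards mandate:

```python
import re

# A run of CR/LF characters together with any spaces or tabs around it.
_LINEBREAK_RUN = re.compile(r'[ \t]*[\r\n]+[ \t]*')


def fold_linebreaks(value):
    """Turn every line break into a folded continuation (CRLF + space).

    Hypothetical helper for illustration only.
    """
    # Drop leading/trailing breaks and blanks so the result cannot start
    # or end with an empty or whitespace-only line.
    value = value.strip('\r\n \t')
    return _LINEBREAK_RUN.sub('\r\n ', value)


print(repr(fold_linebreaks("a\r\nb: c")))
```

Note that RFC 7230 deprecates line folding in HTTP header fields, so for HTTP it may be safer to replace each run with a single space, or to reject the input, than to emit folded lines.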

In summary...

the first way seems better, but only for validation (not for escaping), unless the input is a parameter value, in which case escape it using email.utils.encode_rfc2231.

Example:

import email.header
import email.utils
import re

# if we are not working with a header param value, the following...
# ...raises email.errors.HeaderParseError if input is poisonous when in a header
wip_header_component = email.header.Header('<input>').encode()
if re.search(r'(\r?\n[\S\n\r]|\r[\S\r])', wip_header_component):
    raise ValueError('unfolded line break in header value')
header_component = wip_header_component
# ...or if we *are* working with a header param value...
param_value = email.utils.encode_rfc2231('<input>', 'UTF-8')