Parse an HTTP request Authorization header with Python

情书的邮戳 2020-12-30 06:47

I need to take a header like this:

 Authorization: Digest qop="chap",
     realm="testrealm@host.com",
     username="Foobear",
     response="6629fae


        
10 Answers
  • 2020-12-30 07:44

    If your header arrives as a single string whose layout never varies, with one key=value expression per line, you can split it on newlines into a list called authentication_array and match each line with a regex:

    import re

    pattern_array = ['qop', 'realm', 'username', 'response', 'cnonce']
    parsed_dict = {}

    # Pair each expected key with the line it should appear on.
    for key, line in zip(pattern_array, authentication_array):
        # Capture the key and its value, dropping the surrounding quotes.
        match = re.search('(' + re.escape(key) + r')="(.*?)"', line)
        if match:
            parsed_dict[match.group(1)] = match.group(2)
    
  • 2020-12-30 07:46

    A little regex:

    import re
    reg = re.compile(r'(\w+)[:=] ?"?(\w+)"?')

    >>> dict(reg.findall(headers))
    
    {'username': 'Foobear', 'realm': 'testrealm', 'qop': 'chap', 'cnonce': '5ccc069c403ebaf9f0171e9517f40e41', 'response': '6629fae49393a05397450978507c4ef1', 'Authorization': 'Digest'}
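Note that `\w+` stops at the first non-word character, which is why `realm` comes back as `testrealm` rather than `testrealm@host.com` in the output above. The following is a minimal sketch of a variant that captures everything between the quotes. It assumes values contain no escaped quotes, and the names `PARAM_RE` and `parse_auth_params` are made up for this illustration.

```python
import re

# Matches key="quoted value" or key=bare-token.
# Assumption: quoted values never contain escaped quotes.
PARAM_RE = re.compile(r'(\w+)\s*=\s*(?:"([^"]*)"|([^\s,"]+))')

def parse_auth_params(header):
    """Return (scheme, params) from an 'Authorization: <scheme> k=v, ...' line."""
    # Drop the header name, then split the scheme from the parameter list.
    _, _, value = header.partition(':')
    scheme, _, rest = value.strip().partition(' ')
    params = {}
    for key, quoted, bare in PARAM_RE.findall(rest):
        # Exactly one of the two value groups participates in each match.
        params[key] = quoted if quoted else bare
    return scheme, params
```

Because the scheme is split off before matching, it no longer shows up as a spurious `'Authorization': 'Digest'` entry in the result.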
    
  • 2020-12-30 07:46

    An older question but one I found very helpful.

    I needed a parser to handle any properly formed Authorization header, as defined by RFC 7235 (raise your hand if you enjoy reading ABNF).

    Authorization = credentials
    
    BWS = <BWS, see [RFC7230], Section 3.2.3>
    
    OWS = <OWS, see [RFC7230], Section 3.2.3>
    
    Proxy-Authenticate = *( "," OWS ) challenge *( OWS "," [ OWS
     challenge ] )
    Proxy-Authorization = credentials
    
    WWW-Authenticate = *( "," OWS ) challenge *( OWS "," [ OWS challenge
     ] )
    
    auth-param = token BWS "=" BWS ( token / quoted-string )
    auth-scheme = token
    
    challenge = auth-scheme [ 1*SP ( token68 / [ ( "," / auth-param ) *(
     OWS "," [ OWS auth-param ] ) ] ) ]
    credentials = auth-scheme [ 1*SP ( token68 / [ ( "," / auth-param )
     *( OWS "," [ OWS auth-param ] ) ] ) ]
    
    quoted-string = <quoted-string, see [RFC7230], Section 3.2.6>
    
    token = <token, see [RFC7230], Section 3.2.6>
    token68 = 1*( ALPHA / DIGIT / "-" / "." / "_" / "~" / "+" / "/" )
     *"="
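For readers who find ABNF hard to parse, here is a rough standard-library translation of the `token`, `token68`, and `auth-param` rules into regular expressions. It is a simplified sketch rather than a full implementation of the grammar: it ignores the empty list elements the grammar permits, its quoted-string pattern is looser than the one in RFC 7230, and `classify_credentials` is a hypothetical helper name.

```python
import re

# tchar (RFC 7230, section 3.2.6): the listed punctuation plus DIGIT and ALPHA.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# token68 = 1*( ALPHA / DIGIT / "-" / "." / "_" / "~" / "+" / "/" ) *"="
TOKEN68 = r"[A-Za-z0-9\-._~+/]+=*"
# Simplified quoted-string: any characters, with backslash escapes allowed.
QUOTED = r'"(?:[^"\\]|\\.)*"'
# auth-param = token BWS "=" BWS ( token / quoted-string )
AUTH_PARAM = rf"{TOKEN}[ \t]*=[ \t]*(?:{TOKEN}|{QUOTED})"
# Comma-separated auth-params (empty list elements are not handled here).
PARAM_LIST = rf"{AUTH_PARAM}(?:[ \t]*,[ \t]*{AUTH_PARAM})*"

def classify_credentials(value):
    """Classify the text after the auth-scheme as 'token68', 'params', or 'invalid'."""
    if re.fullmatch(TOKEN68, value):
        return 'token68'
    if re.fullmatch(PARAM_LIST, value):
        return 'params'
    return 'invalid'
```

A Basic credential such as `Zm9vOmJhcg==` falls under `token68`, while a Digest parameter list falls under `params`.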
    

    Starting with PaulMcG's answer, I came up with this:

    import pyparsing as pp
    
    tchar = '!#$%&\'*+-.^_`|~' + pp.nums + pp.alphas
    t68char = '-._~+/' + pp.nums + pp.alphas
    
    token = pp.Word(tchar)
    token68 = pp.Combine(pp.Word(t68char) + pp.ZeroOrMore('='))
    
    scheme = token('scheme')
    
    header = pp.Keyword('Authorization')
    name = pp.Word(pp.alphas, pp.alphanums)
    value = pp.quotedString.setParseAction(pp.removeQuotes)
    name_value_pair = name + pp.Suppress('=') + value
    params = pp.Dict(pp.delimitedList(pp.Group(name_value_pair)))
    
    credentials = scheme + (token68('token') ^ params('params'))
    
    auth_parser = header + pp.Suppress(':') + credentials
    

    This allows for parsing any Authorization header:

    parsed = auth_parser.parseString('Authorization: Basic Zm9vOmJhcg==')
    print('Authenticating with {0} scheme, token: {1}'.format(parsed['scheme'], parsed['token']))
    

    which outputs:

    Authenticating with Basic scheme, token: Zm9vOmJhcg==
    

    Bringing it all together into an Authenticator class:

    import pyparsing as pp
    from base64 import b64decode
    import re
    
    class Authenticator:
        def __init__(self):
            """
            Use pyparsing to create a parser for Authorization headers
            """
            tchar = "!#$%&'*+-.^_`|~" + pp.nums + pp.alphas
            t68char = '-._~+/' + pp.nums + pp.alphas
    
            token = pp.Word(tchar)
            token68 = pp.Combine(pp.Word(t68char) + pp.ZeroOrMore('='))
    
            scheme = token('scheme')
    
            auth_header = pp.Keyword('Authorization')
            name = pp.Word(pp.alphas, pp.alphanums)
            value = pp.quotedString.setParseAction(pp.removeQuotes)
            name_value_pair = name + pp.Suppress('=') + value
            params = pp.Dict(pp.delimitedList(pp.Group(name_value_pair)))
    
            credentials = scheme + (token68('token') ^ params('params'))
    
            # the moment of truth...
            self.auth_parser = auth_header + pp.Suppress(':') + credentials
    
    
        def authenticate(self, auth_header):
            """
            Parse auth_header and call the correct authentication handler
            """
            authenticated = False
            try:
                parsed = self.auth_parser.parseString(auth_header)
                scheme = parsed['scheme']
                details = parsed['token'] if 'token' in parsed.keys() else parsed['params']
    
                print('Authenticating using {0} scheme'.format(scheme))
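                # Map token characters that are not valid in identifiers to '_'.
                safe_scheme = re.sub(r"[!#$%&'*+\-.^_`|~]", '_', scheme.lower())
                # Look the handler up without try/except so that an AttributeError
                # raised inside a handler is not mistaken for a missing handler.
                handler = getattr(self, 'auth_handle_' + safe_scheme, None)
                if handler is None:
                    print('This is a valid Authorization header, but we do not handle this scheme yet.')
                else:
                    authenticated = handler(details)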
    
            except pp.ParseException as ex:
                print('Not a valid Authorization header')
                print(ex)
    
            return authenticated
    
    
        # The following methods are fake, of course.  They should use what's passed
        # to them to actually authenticate, and return True/False if successful.
        # For this demo I'll just print some of the values used to authenticate.
        @staticmethod
        def auth_handle_basic(token):
            print('- token is {0}'.format(token))
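            try:
                username, password = b64decode(token).decode().split(':', 1)
            except Exception as ex:
                # Bad base64, non-UTF-8 bytes, or a missing ':' separator.
                raise ValueError('Malformed Basic credentials') from ex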
            print('- username is {0}'.format(username))
            print('- password is {0}'.format(password))
            return True
    
        @staticmethod
        def auth_handle_bearer(token):
            print('- token is {0}'.format(token))
            return True
    
        @staticmethod
        def auth_handle_digest(params):
            print('- username is {0}'.format(params['username']))
            print('- cnonce is {0}'.format(params['cnonce']))
            return True
    
        @staticmethod
        def auth_handle_aws4_hmac_sha256(params):
            print('- Signature is {0}'.format(params['Signature']))
            return True
    

    To test this class:

    tests = [
        'Authorization: Digest qop="chap", realm="testrealm@example.com", username="Foobar", response="6629fae49393a05397450978507c4ef1", cnonce="5ccc069c403ebaf9f0171e9517f40e41"',
        'Authorization: Bearer cn389ncoiwuencr',
        'Authorization: Basic Zm9vOmJhcg==',
        'Authorization: AWS4-HMAC-SHA256 Credential="AKIAIOSFODNN7EXAMPLE/20130524/us-east-1/s3/aws4_request", SignedHeaders="host;range;x-amz-date", Signature="fe5f80f77d5fa3beca038a248ff027d0445342fe2855ddc963176630326f1024"',
        'Authorization: CrazyCustom foo="bar", fizz="buzz"',
    ]
    
    authenticator = Authenticator()
    
    for test in tests:
        authenticator.authenticate(test)
        print()
    

    Which outputs:

    Authenticating using Digest scheme
    - username is Foobar
    - cnonce is 5ccc069c403ebaf9f0171e9517f40e41
    
    Authenticating using Bearer scheme
    - token is cn389ncoiwuencr
    
    Authenticating using Basic scheme
    - token is Zm9vOmJhcg==
    - username is foo
    - password is bar
    
    Authenticating using AWS4-HMAC-SHA256 scheme
    - Signature is fe5f80f77d5fa3beca038a248ff027d0445342fe2855ddc963176630326f1024
    
    Authenticating using CrazyCustom scheme 
    This is a valid Authorization header, but we do not handle this scheme yet.
    

    To handle CrazyCustom in the future, we would just add a matching static method to the class:

    @staticmethod
    def auth_handle_crazycustom(params):
        ...
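The name-based dispatch can be seen in isolation in the sketch below, which needs only the standard library. `SchemeDispatcher` is a hypothetical stand-in for the Authenticator class, and the acceptance rule inside `auth_handle_crazycustom` is invented purely for illustration.

```python
import re

class SchemeDispatcher:
    """Route an auth scheme name to a method named auth_handle_<scheme>."""

    def dispatch(self, scheme, details):
        # Replace characters that are legal in a scheme token but not in a
        # Python identifier, e.g. 'AWS4-HMAC-SHA256' -> 'aws4_hmac_sha256'.
        safe_scheme = re.sub(r'\W', '_', scheme.lower())
        handler = getattr(self, 'auth_handle_' + safe_scheme, None)
        if handler is None:
            return False  # well-formed header, but no handler for this scheme
        return handler(details)

    @staticmethod
    def auth_handle_crazycustom(params):
        # Hypothetical rule for illustration only: accept when foo == "bar".
        return params.get('foo') == 'bar'
```

Adding support for a new scheme then means adding one method. No registry or if/elif chain needs to change.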
    
  • 2020-12-30 07:48

    The HTTP Digest Authorization header field is a bit of an odd beast. Its format is similar to that of RFC 2616's Cache-Control and Content-Type header fields, but just different enough to be incompatible. If you're still looking for a library that's a little smarter and more readable than the regex, you might try removing the Authorization: Digest part with str.split() and parsing the rest with parse_dict_header() from Werkzeug's http module. (Werkzeug can be installed on App Engine.)
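The same split-then-parse idea can be sketched without Werkzeug. The example below swaps in `parse_http_list` and `parse_keqv_list` from the standard library's `urllib.request` in place of `parse_dict_header()`, so it illustrates the approach rather than Werkzeug's API. `parse_digest_header` is a hypothetical helper name.

```python
from urllib.request import parse_http_list, parse_keqv_list

def parse_digest_header(header_line):
    """Split off 'Authorization: Digest' and parse the remaining key=value list."""
    name, _, value = header_line.partition(':')
    scheme, _, params = value.strip().partition(' ')
    if name.strip().lower() != 'authorization' or scheme.lower() != 'digest':
        raise ValueError('not a Digest Authorization header')
    # parse_http_list splits on commas that are outside quoted strings;
    # parse_keqv_list turns each key=value item into a dict entry and
    # strips the surrounding quotes from the value.
    return parse_keqv_list(parse_http_list(params))
```

Unlike the `\w+` regex, this keeps the full quoted value, so `realm` is returned as `testrealm@host.com`.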
