Parse an HTTP request Authorization header with Python

前端未结

关注

 10  507

I need to take a header like this:

 Authorization: Digest qop=\"chap\",
     realm=\"testrealm@host.com\",
     username=\"Foobear\",
     response=\"6629fae


                      
              相关标签:


      
      
        
          10条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  走了就别回头了        
                
              
                            
                2020-12-30 07:22
              
            
            
                                                                       
Here's my pyparsing attempt:

text = """Authorization: Digest qop="chap",
    realm="testrealm@host.com",     
    username="Foobear",     
    response="6629fae49393a05397450978507c4ef1",     
    cnonce="5ccc069c403ebaf9f0171e9517f40e41" """

from pyparsing import *

AUTH = Keyword("Authorization")
ident = Word(alphas,alphanums)
EQ = Suppress("=")
quotedString.setParseAction(removeQuotes)

valueDict = Dict(delimitedList(Group(ident + EQ + quotedString)))
authentry = AUTH + ":" + ident("protocol") + valueDict

print authentry.parseString(text).dump()


which prints:

['Authorization', ':', 'Digest', ['qop', 'chap'], ['realm', 'testrealm@host.com'],
 ['username', 'Foobear'], ['response', '6629fae49393a05397450978507c4ef1'], 
 ['cnonce', '5ccc069c403ebaf9f0171e9517f40e41']]
- cnonce: 5ccc069c403ebaf9f0171e9517f40e41
- protocol: Digest
- qop: chap
- realm: testrealm@host.com
- response: 6629fae49393a05397450978507c4ef1
- username: Foobear


I'm not familiar with the RFC, but I hope this gets you rolling.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  感情败类        
                
              
                            
                2020-12-30 07:23
              
            
            
                                                                       
You can also use urllib2 as CheryPy does.

here is the snippet:

input= """
 Authorization: Digest qop="chap",
     realm="testrealm@host.com",
     username="Foobear",
     response="6629fae49393a05397450978507c4ef1",
     cnonce="5ccc069c403ebaf9f0171e9517f40e41"
"""
import urllib2
field, sep, value = input.partition("Authorization: Digest ")
if value:
    items = urllib2.parse_http_list(value)
    opts = urllib2.parse_keqv_list(items)
    opts['protocol'] = 'Digest'
    print opts


it outputs:

{'username': 'Foobear', 'protocol': 'Digest', 'qop': 'chap', 'cnonce': '5ccc069c403ebaf9f0171e9517f40e41', 'realm': 'testrealm@host.com', 'response': '6629fae49393a05397450978507c4ef1'}

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  再見小時候        
                
              
                            
                2020-12-30 07:25
              
            
            
                                                                       
Your original concept of using PyParsing would be the best approach.  What you've implicitly asked for is something that requires a grammar... that is, a regular expression or simple parsing routine is always going to be brittle, and that sounds like it's something you're trying to avoid.

It appears that getting pyparsing on google app engine is easy: How do I get PyParsing set up on the Google App Engine?

So I'd go with that, and then implement the full HTTP authentication/authorization header support from rfc2617.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  北荒        
                
              
                            
                2020-12-30 07:30
              
            
            
                                                                       
If those components will always be there, then a regex will do the trick:

test = '''Authorization: Digest qop="chap", realm="testrealm@host.com", username="Foobear", response="6629fae49393a05397450978507c4ef1", cnonce="5ccc069c403ebaf9f0171e9517f40e41"'''

import re

re_auth = re.compile(r"""
    Authorization:\s*(?P<protocol>[^ ]+)\s+
    qop="(?P<qop>[^"]+)",\s+
    realm="(?P<realm>[^"]+)",\s+
    username="(?P<username>[^"]+)",\s+
    response="(?P<response>[^"]+)",\s+
    cnonce="(?P<cnonce>[^"]+)"
    """, re.VERBOSE)

m = re_auth.match(test)
print m.groupdict()


produces:

{ 'username': 'Foobear', 
  'protocol': 'Digest', 
  'qop': 'chap', 
  'cnonce': '5ccc069c403ebaf9f0171e9517f40e41', 
  'realm': 'testrealm@host.com', 
  'response': '6629fae49393a05397450978507c4ef1'
}

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  暖寄归人        
                
              
                            
                2020-12-30 07:40
              
            
            
                                                                       
I would recommend finding a correct library for parsing http headers unfortunately I can't reacall any. :(

For a while check the snippet below (it should mostly work):

input= """
 Authorization: Digest qop="chap",
     realm="testrealm@host.com",
     username="Foob,ear",
     response="6629fae49393a05397450978507c4ef1",
     cnonce="5ccc069c403ebaf9f0171e9517f40e41"
"""

field, sep, value = input.partition(":")
if field.endswith('Authorization'):
   protocol, sep, opts_str = value.strip().partition(" ")

   opts = {}
   for opt in opts_str.split(",\n"):
        key, value = opt.strip().split('=')
        key = key.strip(" ")
        value = value.strip(' "')
        opts[key] = value

   opts['protocol'] = protocol

   print opts

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  盖世英雄少女心        
                
              
                            
                2020-12-30 07:42
              
            
            
                                                                       
Nadia's regex only matches alphanumeric characters for the value of a parameter.  That means it fails to parse at least two fields.  Namely, the uri and qop.  According to RFC 2617, the uri field is a duplicate of the string in the request line (i.e. the first line of the HTTP request).  And qop fails to parse correctly if the value is "auth-int" due to the non-alphanumeric '-'.

This modified regex allows the URI (or any other value) to contain anything but ' ' (space), '"' (qoute), or ',' (comma).  That's probably more permissive than it needs to be, but shouldn't cause any problems with correctly formed HTTP requests.

reg re.compile('(\w+)[:=] ?"?([^" ,]+)"?')


Bonus tip:  From there, it's fairly straight forward to convert the example code in RFC-2617 to python.  Using python's md5 API, "MD5Init()" becomes "m = md5.new()", "MD5Update()" becomes "m.update()" and "MD5Final()" becomes "m.digest()".
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     1
2
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复