Browser simulation - Python

悲哀的现实 2021-02-20 11:41

I need to access a few HTML pages through a Python script. The problem is that I need cookie support, so a simple urllib HTTP request won't work.

Any ideas?

4 Answers
  •  长发绾君心
    2021-02-20 12:15

    Here's something that does cookies, and as a bonus does authentication for a site that requires a username and password.

    # Python 2 only: urllib2 and cookielib became urllib.request and http.cookiejar in Python 3.
    import urllib2
    import cookielib
    
    
    def cook():
        url="http://wherever"
        cj = cookielib.LWPCookieJar()
        authinfo = urllib2.HTTPBasicAuthHandler()
        realm="realmName"
        username="userName"
        password="passWord"
        host="www.wherever.com"
        authinfo.add_password(realm, host, username, password)
        opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj), authinfo)
        urllib2.install_opener(opener)
    
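        # Create the request object with a browser-like User-agent header
        txheaders = { 'User-agent' : "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)" }
        try:
            req = urllib2.Request(url, None, txheaders)
            # Optional: HTTPCookieProcessor already attaches stored cookies when the request is opened
            cj.add_cookie_header(req)
            f = urllib2.urlopen(req)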
    
        except IOError as e:
            print "Failed to open", url
            # HTTPError carries the HTTP status in .code; a plain URLError does not
            if hasattr(e, 'code'):
                print "Error code:", e.code
    
        else:
            print f
            print f.read()
            print f.info()
            f.close()
            print 'Cookies:'
            for index, cookie in enumerate(cj):
                print index, " : ", cookie
            cj.save("cookies.lwp")
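
The answer above is written for Python 2. For readers on Python 3, a rough port might look like the sketch below. It assumes the same approach (an `LWPCookieJar` plus HTTP basic auth); the names `make_opener`, `fetch`, and `USER_AGENT` are made up for illustration and are not part of the original answer.

```python
import http.cookiejar
import urllib.error
import urllib.request

# Hypothetical User-agent string; substitute whatever the target site expects.
USER_AGENT = "Mozilla/5.0 (compatible; example-script)"


def make_opener(cookie_file, realm, host, username, password):
    """Return (opener, cookie_jar) with cookie handling and basic auth."""
    cj = http.cookiejar.LWPCookieJar(cookie_file)
    auth = urllib.request.HTTPBasicAuthHandler()
    auth.add_password(realm, host, username, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cj), auth
    )
    # Sent with every request made through this opener.
    opener.addheaders = [("User-agent", USER_AGENT)]
    return opener, cj


def fetch(opener, cj, url):
    """Fetch url and save cookies; return the body bytes, or None on failure."""
    try:
        with opener.open(url) as response:
            body = response.read()
    except urllib.error.URLError as e:
        print("Failed to open", url)
        # HTTPError (a URLError subclass) carries the status in .code.
        if hasattr(e, "code"):
            print("Error code:", e.code)
        return None
    # ignore_discard=True also keeps session cookies, which are otherwise dropped.
    cj.save(ignore_discard=True)
    return body
```

Calling `make_opener("cookies.lwp", "realmName", "www.wherever.com", "userName", "passWord")` and then `fetch(opener, cj, "http://wherever")` would mirror the placeholder values used above. Because the opener is returned rather than installed globally, other code that calls `urllib.request.urlopen` is unaffected.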
    
