urllib | 易学教程

Python read page from URL? Better documentation?

Question: I'm having quite a bit of trouble with Python's documentation. Is there anything like the Mozilla Developer Network for it? I'm doing a Python puzzle website and I need to be able to read the content of the page. I saw the following posted on a site:

    import urllib2
    urlStr = 'http://www.python.org/'
    try:
        fileHandle = urllib2.urlopen(urlStr)
        str1 = fileHandle.read()
        fileHandle.close()
        print ('-'*50)
        print ('HTML code of URL =', urlStr)
        print ('-'*50)
    except IOError:
        print ('Cannot open URL %s
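The snippet quoted in the question targets Python 2, where the module is called urllib2. As a minimal sketch of the same idea on Python 3, assuming the goal is simply to fetch a page and get its text, something like the following may work. The helper name `read_url`, the 10-second timeout, and the UTF-8 fallback are illustrative assumptions, not part of the original question.

```python
from urllib.request import urlopen


def read_url(url, timeout=10):
    """Return the body of `url` decoded to text (hypothetical helper)."""
    # The response object is a context manager, so it is closed on exit.
    with urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # Prefer the charset declared in the Content-Type header;
        # falling back to UTF-8 is an assumption.
        charset = response.headers.get_content_charset() or "utf-8"
    # Replace undecodable bytes instead of raising.
    return raw.decode(charset, errors="replace")
```

A call such as `read_url('http://www.python.org/')` can be wrapped in `try: ... except urllib.error.URLError:` to report unreachable URLs, mirroring the `except IOError` in the quoted code.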

Python urllib2 parse html problem

Question: I am using mechanize to parse the HTML of a website, but with this website I got a strange result.

    from mechanize import Browser
    br = Browser()
    r = br.open("http://www.heavenplaza.com")
    result = r.read()

The result is something I cannot understand; you can see it here: http://paste2.org/p/1556077 Does anyone have a method to get that website's HTML, with mechanize or urllib? Thanks

Answer 1:

    import urllib2, StringIO, gzip
    f = urllib2.urlopen("http://www.heavenplaza.com")
    data = StringIO.StringIO(f.read())
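The answer above is cut off, but its imports (gzip, StringIO) suggest the server returned a gzip-compressed body that has to be decompressed before parsing. Under that assumption, a Python 3 sketch of the decompression step could look like this. The helper name `decode_body` and the magic-byte check are illustrative choices, not taken from the answer.

```python
import gzip


def decode_body(raw, content_encoding=None):
    """Return the decompressed body bytes (hypothetical helper)."""
    # gzip streams start with the magic bytes 0x1f 0x8b, so the body can be
    # recognized even when the Content-Encoding header is missing.
    is_gzip = content_encoding == "gzip" or raw[:2] == b"\x1f\x8b"
    return gzip.decompress(raw) if is_gzip else raw
```

With a urllib response, the arguments would be `response.read()` and `response.headers.get("Content-Encoding")`.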

Loading url with cyrillic symbols

Question: I have to load some URLs with Cyrillic symbols. My script should work with this: http://wincode.org/%D0%BF%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC%D0%BC%D0%B8%D1%80%D0%BE%D0%B2%D0%B0%D0%BD%D0%B8%D0%B5/ If I use this in a browser, it is replaced with normal symbols, but the urllib code fails with a 404 error. How do I decode this URL correctly? When I use that URL directly in code, like address = 'that address', it works perfectly. But I got this URL by parsing a page. I have a list of urls
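The question is truncated, so the actual cause of the 404 is not known here. One common cause, assumed for this sketch, is that the URL scraped from the page contains raw (or differently encoded) Cyrillic characters rather than UTF-8 percent-escapes. The hypothetical helper below normalizes only the path component with `urllib.parse.quote`.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_path(url):
    """Percent-encode non-ASCII path characters as UTF-8 (hypothetical helper)."""
    parts = urlsplit(url)
    # "/" and "%" are kept as-is so that separators survive and URLs that
    # are already percent-encoded are not encoded a second time.
    path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

Applied to the raw form of the address in the question, this should yield the `%D0%BF...` form shown there, while an already-encoded URL passes through unchanged.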

How to get python to successfully download large images from the internet

Question: So I've been using urllib.request.urlretrieve(URL, FILENAME) to download images off the internet. It works great, but fails on some images. The ones it fails on seem to be the larger images, e.g. http://i.imgur.com/DEKdmba.jpg. It downloads them fine, but when I try to open these files, the photo viewer gives me the error "windows photo viewer cant open this picture because the file appears to be damaged corrupted or too large". What might be the reason it can't download these, and how can I fix

POST not receiving correct response

Question: I am trying to get the response (a table) from this webpage with the help of Python's urllib module: http://www.mcxindia.com/SitePages/indexhistory.aspx Here is what I have got:

    import httplib
    import urllib
    import urllib2
    from BeautifulSoup import BeautifulSoup

    headers = {
        'X-MicrosoftAjax': 'Delta=true',
        'Cache-Control': 'no-cache',
        'Content-Type': 'text/plain; charset=utf-8',
        'X-Requested-With': 'XMLHttpRequest'
    }
    data = {
        'ScrMgrIndexDetail': 'UpdatePanelIndexDetail|mBtnGo',
        '_
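The question's code is cut off before the request is sent, so what follows is only a sketch of how a form POST is typically assembled with Python 3's urllib, not a diagnosis of the problem. The helper name `build_post_request` is hypothetical, and only the one form field visible in the excerpt is used in the example call.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post_request(url, fields, headers=None):
    """Build (but do not send) a form-encoded POST request (hypothetical helper)."""
    # urlencode percent-escapes keys and values; the body must be bytes.
    body = urlencode(fields).encode("ascii")
    # Supplying `data` is what turns the request into a POST.
    return Request(url, data=body, headers=headers or {})


request = build_post_request(
    "http://www.mcxindia.com/SitePages/indexhistory.aspx",
    {"ScrMgrIndexDetail": "UpdatePanelIndexDetail|mBtnGo"},
    headers={"X-Requested-With": "XMLHttpRequest"},
)
```

The request would then be sent with `urllib.request.urlopen(request)`. ASP.NET pages of this kind usually also expect hidden form fields copied from a prior GET of the page, which the truncated excerpt does not show.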

HTML data is hidden from urllib

Question: How do I get the real content from this page: http://kursuskatalog.au.dk/da/course/74960/105E17-Demokrati-og-diktatur-i-komparativt-perspektiv All I get from the code below is some links to JavaScript and CSS files. Is there a way out of this?

    from urllib.request import urlopen
    html = urlopen("http://kursuskatalog.au.dk/da/course/74960/105E17-Demokrati-og-diktatur-i-komparativt-perspektiv")
    print(html.read())

Best regards, Kresten

Answer 1: Content in this URL is created with JavaScript after page

python urllib2 can't get google url

Question: I'm having a really tough time with getting the results page of this url with Python's urllib2: http://www.google.com/search?tbs=sbi:AMhZZitAaz7goe6AsfVSmFw1sbwsmX0uIjeVnzKHjEXMck70H3j32Q-6FApxrhxdSyMo0OedyWkxk3-qYbyf0q1OqNspjLu8DlyNnWVbNjiKGo87QUjQHf2_1idZ1q_1vvm5gzOCMpChYiKsKYdMywOLjJzqmzYoJNOU2UsTs_1zZGWjU-LsjdFXt_1D5bDkuyRK0YbsaLVcx4eEk_1KMkcJpWlfFEfPMutxTLGf1zxD-9DFZDzNOODs0oj2j_1KG8FRCaMFnTzAfTdl7JfgaDf_1t5Vti8FnbeG9i7qt9wF6P-QK9mdvC15hZ5UR29eQdYbcD1e4woaOQCmg8Q1VLVPf4-kf8dAI7p3jM

How can I retrieve an image from a url and convert it to a PIL object

Question:

    import urllib.request
    import Image
    urllib.request.urlretrieve("https://www.spriters-resource.com/download/50365/", "mario_image.PNG")
    img = Image.open("mario_image.PNG")

I'm wondering how I can retrieve an image from a URL and immediately convert it to a PIL Image without having to load the image from the file.

Answer 1:

    import requests
    from PIL import Image
    r = requests.get(URL, stream=True)
    img = Image.open(r.raw)

Source: https://stackoverflow.com/questions/41050871/how-can-i-retrieve-an-image-from-a

Does urllib or urllib2 in Python 2.5 support https?

Question: Thanks for the help in advance. I am puzzled that the same code works for Python 2.6 but not 2.5. Here is the code:

    import cgi, urllib, urlparse, urllib2
    url='https://graph.facebook.com'
    req=urllib2.Request(url=url)
    p=urllib2.urlopen(req)
    response = cgi.parse_qs(p.read())

And here is the exception I got:

    Traceback (most recent call last):
      File "t2.py", line 6, in <module>
        p=urllib2.urlopen(req)
      File "/home/userx/lib/python2.5/urllib2.py", line 124, in urlopen
        return _opener.open(url, data)
      File

Keeping url parameters in order when encoding with urllib

Question: I am trying to simulate a GET request with Python. I have a dictionary of parameters and am using urllib.urlencode to URL-encode them. I notice that although the dictionary is of the form { "k1":"v1", "k2":"v2", "k3":"v3", .. }, upon URL-encoding the order of the parameters is switched to /?k1=v1&k3=v3&k2=v2... Why does this happen, and can I force the order in the dictionary to stay the same?

Answer 1: As you can see in the comments, Python dictionaries are not ordered, but there is an OrderedDict
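The answer is cut off after mentioning OrderedDict. As a minimal sketch in Python 3 (where the function lives at `urllib.parse.urlencode` rather than `urllib.urlencode`), the order can be pinned by passing a sequence of key/value pairs or an OrderedDict. The variable names below are illustrative.

```python
from collections import OrderedDict
from urllib.parse import urlencode

# A list of (key, value) tuples is encoded in exactly the order given.
ordered_params = [("k1", "v1"), ("k2", "v2"), ("k3", "v3")]
query_from_list = urlencode(ordered_params)

# An OrderedDict keeps insertion order as well.
query_from_ordered_dict = urlencode(OrderedDict(ordered_params))

print(query_from_list)  # k1=v1&k2=v2&k3=v3
```

On Python 3.7 and later, plain dictionaries also preserve insertion order, so the reordering described in the question is specific to older interpreters.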