urllib

Python read page from URL? Better documentation?

丶灬走出姿态 submitted on 2019-12-11 08:08:26
Question: I'm having quite a bit of trouble with Python's documentation. Is there anything like the Mozilla Developer Network for it? I'm doing a Python puzzle website and I need to be able to read the content of the page. I saw the following posted on a site:

    import urllib2
    urlStr = 'http://www.python.org/'
    try:
        fileHandle = urllib2.urlopen(urlStr)
        str1 = fileHandle.read()
        fileHandle.close()
        print ('-'*50)
        print ('HTML code of URL =', urlStr)
        print ('-'*50)
    except IOError:
        print ('Cannot open URL %s
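
On Python 3, urllib2 was merged into urllib.request. A minimal sketch of the same read under that assumption; the helper name read_page, the timeout, and the UTF-8 fallback are illustrative choices and not part of the original post:

```python
from urllib.error import URLError
from urllib.request import urlopen


def read_page(url, timeout=10):
    """Return the decoded body of `url`, or None if it cannot be opened.

    `read_page` is a hypothetical helper name used for illustration.
    """
    try:
        # The context manager closes the response even if read() fails.
        with urlopen(url, timeout=timeout) as response:
            raw = response.read()
            # Use the charset the server declares; assume UTF-8 otherwise.
            charset = response.headers.get_content_charset() or "utf-8"
    except URLError as exc:
        print("Cannot open URL %s: %s" % (url, exc.reason))
        return None
    return raw.decode(charset, errors="replace")
```

Calling read_page('http://www.python.org/') would then return the page's HTML as a string, or None after printing the reason if the URL cannot be opened.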

Python urllib2 parse html problem

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-11 07:39:37
Question: I am using mechanize to parse the HTML of a website, but with this website I got a strange result.

    from mechanize import Browser
    br = Browser()
    r = br.open("http://www.heavenplaza.com")
    result = r.read()

The result is something I cannot understand. You can see it here: http://paste2.org/p/1556077 Does anyone have a method to get that website's HTML, with mechanize or urllib? Thanks

Answer 1:

    import urllib2, StringIO, gzip
    f = urllib2.urlopen("http://www.heavenplaza.com")
    data = StringIO.StringIO(f.read())
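
The answer's use of StringIO together with gzip suggests the server returns a gzip-compressed body. A Python 3 sketch of that idea, assuming the body really is gzip data; the helper name decode_body is hypothetical, and the check on the two-byte gzip magic number is a defensive assumption rather than something the answer shows:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_body(raw, content_encoding=None):
    """Decompress `raw` if the server declared gzip or the bytes look like gzip.

    Anything else is returned unchanged.
    """
    declared = (content_encoding or "").lower() == "gzip"
    if declared or raw.startswith(GZIP_MAGIC):
        return gzip.decompress(raw)
    return raw
```

With a response object f from urllib.request.urlopen, this could be called as decode_body(f.read(), f.headers.get("Content-Encoding")).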

Loading url with cyrillic symbols

假装没事ソ submitted on 2019-12-11 07:26:33
Question: I have to load a URL with Cyrillic symbols. My script should work with this: http://wincode.org/%D0%BF%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC%D0%BC%D0%B8%D1%80%D0%BE%D0%B2%D0%B0%D0%BD%D0%B8%D0%B5/ If I use this in a browser, it is replaced with normal symbols, but the urllib code fails with a 404 error. How do I decode this URL correctly? When I use that URL directly in code, like address = 'that address', it works perfectly. But I get this URL by parsing a page. I have a list of urls
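
The question is cut off, so the actual cause of the 404 is not visible here. As a sketch of how the two forms of such a URL relate on Python 3: unquote() gives the readable form a browser displays, and quote() gives the percent-encoded UTF-8 form that goes over the wire. The helper name encode_url_path is hypothetical, and it assumes the path is meant to be UTF-8:

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def encode_url_path(url):
    """Percent-encode the path of `url` as UTF-8 without double-encoding.

    `encode_url_path` is a hypothetical helper name used for illustration.
    """
    parts = urlsplit(url)
    # unquote() first so an already-encoded path is not encoded twice.
    path = quote(unquote(parts.path), safe="/")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


encoded = "http://wincode.org/%D0%BF%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC%D0%BC%D0%B8%D1%80%D0%BE%D0%B2%D0%B0%D0%BD%D0%B8%D0%B5/"
readable = unquote(encoded)  # the form a browser shows in the address bar
```

Running every scraped link through encode_url_path before opening it would give urllib the same encoded form whether the page listed the link in readable or in percent-encoded form.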

How to get python to successfully download large images from the internet

时光怂恿深爱的人放手 submitted on 2019-12-11 07:19:14
Question: So I've been using urllib.request.urlretrieve(URL, FILENAME) to download images from the internet. It works great, but fails on some images. The ones it fails on seem to be the larger images, e.g. http://i.imgur.com/DEKdmba.jpg. It downloads them fine, but when I try to open these files, the photo viewer gives me the error "Windows Photo Viewer can't open this picture because the file appears to be damaged, corrupted or too large". What might be the reason it can't download these, and how can I fix
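
The excerpt ends before any answer, so the real cause is not known here. One thing that can be ruled out in code is a truncated download. The sketch below streams the response to a file opened in binary mode and compares the byte count with the Content-Length header when the server sends one; the helper name download and the size check are assumptions for illustration:

```python
import shutil
from urllib.request import urlopen


def download(url, filename, timeout=30):
    """Stream `url` to `filename` and return the number of bytes written.

    Raises IOError if the server declared a Content-Length that does not
    match what was actually received (a sign of a truncated download).
    """
    with urlopen(url, timeout=timeout) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out)  # copies in chunks, not all at once
        written = out.tell()
        expected = response.headers.get("Content-Length")
    if expected is not None and int(expected) != written:
        raise IOError("expected %s bytes, got %d" % (expected, written))
    return written
```

If the byte counts match and the viewer still rejects the file, the problem more likely lies in what the server returned (for example an HTML page instead of the image) than in the download itself.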

POST not receiving correct response

十年热恋 submitted on 2019-12-11 05:56:50
Question: I am trying to get the response (a table) from this webpage with the help of Python's urllib module: http://www.mcxindia.com/SitePages/indexhistory.aspx Here is what I have got:

    import httplib
    import urllib
    import urllib2
    from BeautifulSoup import BeautifulSoup

    headers = {
        'X-MicrosoftAjax': 'Delta=true',
        'Cache-Control': 'no-cache',
        'Content-Type': 'text/plain; charset=utf-8',
        'X-Requested-With': 'XMLHttpRequest'
    }
    data = {
        'ScrMgrIndexDetail': 'UpdatePanelIndexDetail|mBtnGo',
        '_
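
The rest of the form data is cut off, so this does not attempt to reproduce the full request or explain the wrong response. It is only a sketch of how a form-encoded POST is built with urllib.request on Python 3. The helper name build_post_request is hypothetical, and the application/x-www-form-urlencoded content type is an assumption about what the server expects for form fields:

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post_request(url, fields, extra_headers=None):
    """Build (but do not send) a form-encoded POST request."""
    body = urlencode(fields).encode("ascii")  # Request data must be bytes on Python 3
    headers = {"Content-Type": "application/x-www-form-urlencoded; charset=utf-8"}
    headers.update(extra_headers or {})
    return Request(url, data=body, headers=headers)


request = build_post_request(
    "http://www.mcxindia.com/SitePages/indexhistory.aspx",
    # Only the one field visible in the question; the rest of the form is cut off.
    {"ScrMgrIndexDetail": "UpdatePanelIndexDetail|mBtnGo"},
    {"X-MicrosoftAjax": "Delta=true", "X-Requested-With": "XMLHttpRequest"},
)
```

Passing the resulting object to urllib.request.urlopen(request) would send it; supplying data is what makes urllib issue a POST instead of a GET.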

HTML data is hidden from urllib

∥☆過路亽.° submitted on 2019-12-11 05:14:51
Question: How do I get the real content from this page: http://kursuskatalog.au.dk/da/course/74960/105E17-Demokrati-og-diktatur-i-komparativt-perspektiv All I get from the code below is some links to JavaScript and CSS files. Is there a way out of this?

    from urllib.request import urlopen
    html = urlopen("http://kursuskatalog.au.dk/da/course/74960/105E17-Demokrati-og-diktatur-i-komparativt-perspektiv")
    print(html.read())

Best regards, Kresten

Answer 1: Content in this URL is created with JavaScript after page
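
urlopen returns only the HTML the server sends and does not execute JavaScript, which fits the answer's explanation. One way to check this for a given page is to look at what the static HTML actually contains. The sketch below uses the standard-library HTML parser; the class name StaticContentProbe and the sample markup are hypothetical:

```python
from html.parser import HTMLParser


class StaticContentProbe(HTMLParser):
    """Collect external script URLs and visible text from static HTML."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self.text = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
            src = dict(attrs).get("src")
            if tag == "script" and src:
                self.scripts.append(src)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that would be visible without running any script.
        if not self._skip_depth and data.strip():
            self.text.append(data.strip())


# Hypothetical markup resembling a page that is rendered client-side.
sample = '<html><head><script src="app.js"></script></head><body><div id="root"></div></body></html>'
probe = StaticContentProbe()
probe.feed(sample)
```

When probe.text is empty while probe.scripts is not, the page is probably rendered in the browser, and urllib alone will not see the final content.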

python urllib2 can't get google url

China☆狼群 submitted on 2019-12-11 04:41:23
Question: I'm having a really tough time getting the results page of this URL with Python's urllib2: http://www.google.com/search?tbs=sbi:AMhZZitAaz7goe6AsfVSmFw1sbwsmX0uIjeVnzKHjEXMck70H3j32Q-6FApxrhxdSyMo0OedyWkxk3-qYbyf0q1OqNspjLu8DlyNnWVbNjiKGo87QUjQHf2_1idZ1q_1vvm5gzOCMpChYiKsKYdMywOLjJzqmzYoJNOU2UsTs_1zZGWjU-LsjdFXt_1D5bDkuyRK0YbsaLVcx4eEk_1KMkcJpWlfFEfPMutxTLGf1zxD-9DFZDzNOODs0oj2j_1KG8FRCaMFnTzAfTdl7JfgaDf_1t5Vti8FnbeG9i7qt9wF6P-QK9mdvC15hZ5UR29eQdYbcD1e4woaOQCmg8Q1VLVPf4-kf8dAI7p3jM

How can I retrieve an image from a url and convert it to a PIL object

房东的猫 submitted on 2019-12-11 04:36:18
Question:

    import urllib.request
    import Image
    urllib.request.urlretrieve("https://www.spriters-resource.com/download/50365/", "mario_image.PNG")
    img = Image.open("mario_image.PNG")

I'm wondering how I can retrieve an image from a URL and immediately convert it to a PIL Image without having to load the image from the file.

Answer 1:

    import requests
    from PIL import Image
    r = requests.get(URL, stream=True)
    img = Image.open(r.raw)

Source: https://stackoverflow.com/questions/41050871/how-can-i-retrieve-an-image-from-a
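
The answer relies on the third-party requests library. A standard-library alternative is sketched below, assuming Pillow is installed for the final step: the image bytes are read into an in-memory io.BytesIO buffer, which Image.open accepts in place of a filename. The helper names fetch_to_buffer and open_image_from_url are hypothetical:

```python
import io
from urllib.request import urlopen


def fetch_to_buffer(url, timeout=30):
    """Download `url` fully into memory and return a seekable BytesIO."""
    with urlopen(url, timeout=timeout) as response:
        return io.BytesIO(response.read())


def open_image_from_url(url):
    """Open the image at `url` with Pillow without touching the disk."""
    from PIL import Image  # third-party: assumes Pillow is installed

    return Image.open(fetch_to_buffer(url))
```

Calling open_image_from_url on the URL from the question would give a PIL Image without writing mario_image.PNG first. Note that the question's bare import Image is the old PIL spelling; Pillow uses from PIL import Image, as the answer does.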

Does urllib or urllib2 in Python 2.5 support https?

久未见 submitted on 2019-12-11 03:56:14
Question: Thanks for the help in advance. I am puzzled that the same code works for Python 2.6 but not 2.5. Here is the code:

    import cgi, urllib, urlparse, urllib2
    url='https://graph.facebook.com'
    req=urllib2.Request(url=url)
    p=urllib2.urlopen(req)
    response = cgi.parse_qs(p.read())

And here is the exception I got:

    Traceback (most recent call last):
      File "t2.py", line 6, in <module>
        p=urllib2.urlopen(req)
      File "/home/userx/lib/python2.5/urllib2.py", line 124, in urlopen
        return _opener.open(url, data)
      File
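
The traceback is cut off before the actual error, so the cause cannot be read from this excerpt. One thing worth ruling out in a situation like this is an interpreter built without SSL support, in which case urllib has no handler for https:// URLs. A Python 3 sketch of that check, where the helper name has_https_support is hypothetical:

```python
import http.client
import urllib.request


def has_https_support():
    """Return True if this interpreter's urllib can open https:// URLs."""
    try:
        import ssl  # absent when Python was built without OpenSSL
    except ImportError:
        return False
    # urllib.request only defines HTTPSHandler when http.client has HTTPSConnection.
    return hasattr(http.client, "HTTPSConnection") and hasattr(urllib.request, "HTTPSHandler")
```

If this returns False, the interpreter itself needs to be rebuilt or replaced; no change to the calling code will help.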

Keeping url parameters in order when encoding with urllib

依然范特西╮ submitted on 2019-12-11 02:43:50
Question: I am trying to simulate a GET request with Python. I have a dictionary of parameters and am using urllib.urlencode to URL-encode them. I notice that although the dictionary is of the form { "k1":"v1", "k2":"v2", "k3":"v3", .. }, upon URL-encoding the order of the parameters is switched to /?k1=v1&k3=v3&k2=v2... Why does this happen, and can I force the order in the dictionary to stay the same?

Answer 1: As you can see in the comments, Python dictionaries are not ordered, but there is an OrderedDict
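
A short sketch of the two usual ways to pin the parameter order, written for Python 3, where urllib.urlencode lives in urllib.parse. urlencode accepts a sequence of (key, value) pairs as well as a mapping and encodes them in the order it receives them. Since Python 3.7 a plain dict also preserves insertion order, so the reordering described in the question applies to older interpreters:

```python
from collections import OrderedDict
from urllib.parse import urlencode

# A sequence of (key, value) pairs is encoded in exactly the order given.
pairs = [("k1", "v1"), ("k2", "v2"), ("k3", "v3")]
query_from_pairs = urlencode(pairs)

# An OrderedDict keeps insertion order on every Python version that has it.
query_from_ordered = urlencode(OrderedDict(pairs))
```

Both variables hold the same query string, with the keys in the order k1, k2, k3.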