urllib

Python urllib cache

Question: I'm writing a script in Python that should determine if it has internet access.

    import urllib

    CHECK_PAGE = "http://64.37.51.146/check.txt"
    CHECK_VALUE = "true\n"
    PROXY_VALUE = "Privoxy"
    OFFLINE_VALUE = ""

    page = urllib.urlopen(CHECK_PAGE)
    response = page.read()
    page.close()

    if response.find(PROXY_VALUE) != -1:
        urllib.getproxies = lambda x = None: {}
        page = urllib.urlopen(CHECK_PAGE)
        response = page.read()
        page.close()

    if response != CHECK_VALUE:
        print "'" + response + "' != '" + CHECK_VALUE +
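
In Python 3, where urllib.urlopen became urllib.request.urlopen, one way to skip any configured proxy is to build an opener with an empty ProxyHandler rather than replacing getproxies. The sketch below is an assumption about what the question is after, not the asker's code. The helper names fetch_without_proxy and is_online are hypothetical, and the demo reads a local file:// URL in place of the check page so that it runs without a network.

```python
import pathlib
import tempfile
import urllib.request

CHECK_VALUE = "true\n"


def fetch_without_proxy(url, timeout=5):
    """Fetch url with proxy auto-detection disabled (empty ProxyHandler)."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=timeout) as page:
        return page.read().decode("utf-8")


def is_online(url, expected=CHECK_VALUE):
    """Return True only if the page could be read and has the expected body."""
    try:
        return fetch_without_proxy(url) == expected
    except OSError:  # urllib.error.URLError is a subclass of OSError
        return False


def demo():
    """Check a local stand-in file and a missing one (no network needed)."""
    with tempfile.TemporaryDirectory() as tmp:
        check_file = pathlib.Path(tmp) / "check.txt"
        check_file.write_bytes(b"true\n")
        missing = pathlib.Path(tmp) / "missing.txt"
        return is_online(check_file.as_uri()), is_online(missing.as_uri())


if __name__ == "__main__":
    print(demo())
```

With a real check page, is_online(CHECK_PAGE) would be called with the HTTP URL from the question instead of a file:// URI.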

Multi threaded web scraper using urlretrieve on a cookie-enabled site

Question: I am trying to write my first Python script, and with lots of Googling, I think that I am just about done. However, I will need some help getting myself across the finish line. I need to write a script that logs onto a cookie-enabled site, scrapes a bunch of links, and then spawns a few processes to download the files. I have the program running single-threaded, so I know that the code works. But when I tried to create a pool of download workers, I ran into a wall.

    #manager.py
    import Fetch

TypeError: can't use a string pattern on a bytes-like object, api

Question:

    import json
    import urllib.request, urllib.error, urllib.parse

    Name = 'BagFullOfHoles' #Random player
    Platform = 'xone'#pc, xbox, xone, ps4, ps3
    url = 'http://api.bfhstats.com/api/playerInfo?plat=' + Platform + '&name=' + Name
    json_obj = urllib.request.urlopen(url)
    data = json.load(json_obj)
    print (data)

    TypeError: can't use a string pattern on a bytes-like object

I recently ran 2to3.py, and this error (or others) comes up when I try to fix it. Does anyone have any pointers?

Answer 1: The json_obj =
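
In Python 3, urlopen returns a response whose read() yields bytes, and json.loads only accepts bytes from Python 3.6 onward. A common fix on older versions is to decode the payload before parsing. The sketch below assumes the API responds with UTF-8 JSON. The helper name parse_json_bytes and the sample payload are invented for illustration, and the payload stands in for the real response so that nothing is fetched.

```python
import json


def parse_json_bytes(raw, encoding="utf-8"):
    """Decode a bytes payload to str, then parse it as JSON."""
    return json.loads(raw.decode(encoding))


# Stand-in for urllib.request.urlopen(url).read(); this payload is invented.
sample = b'{"player": {"name": "BagFullOfHoles", "plat": "xone"}}'
data = parse_json_bytes(sample)
print(data["player"]["name"])
```

Against the live API, the same helper would be called as parse_json_bytes(urllib.request.urlopen(url).read()).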

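urllib: encoding and decoding Chinese characters in URLs

Decoding:

    from urllib import parse

    rawurl = "%e7%bd%91%e7%9b%98"
    data = parse.unquote(rawurl)  # percent-escapes are decoded as UTF-8 by default
    print(data)
    # prints: 网盘

Encoding:

    from urllib.parse import quote

    name = '网盘'
    data = quote(name)  # non-ASCII characters are encoded as UTF-8 by default
    print(data)
    # prints: %E7%BD%91%E7%9B%98

Source: https://www.cnblogs.com/0916m/p/11484367.html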

Redirection url using urllib in Python 3

Question: I need to know my final URL after following redirections with urllib in Python 3. Let's say I have some code like this:

    opener = urllib.request.build_opener()
    request = urllib.request.Request(url)
    u = opener.open(request)

If my URL redirects to another website, how can I find out the new website's URL? I've found nothing useful in the documentation. Thanks for your help!

Answer 1: You can simply use u.geturl() to get the URL you were redirected to (or the original one if no redirect happened).
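
A minimal sketch of the geturl() approach follows. To keep it runnable offline, it assumes a throwaway local server that redirects /old to /new. RedirectHandler, final_url, and demo are hypothetical names, and the empty ProxyHandler is only there so the loopback request is not sent through an environment proxy.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: /old redirects to /new, which returns 200."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"arrived"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def final_url(url, opener=None):
    """Open url, follow any redirects, and return the URL finally fetched."""
    opener = opener or urllib.request.build_opener()
    request = urllib.request.Request(url)
    with opener.open(request) as u:
        return u.geturl()


def demo():
    """Start the local server, request /old, and report where we ended up."""
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    no_proxy = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    base = "http://127.0.0.1:%d" % server.server_port
    try:
        return base, final_url(base + "/old", no_proxy)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```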

Catching http errors

Question: How can I catch the 404 and 403 errors for pages in Python with urllib(2), for example? Are there any fast ways without big class wrappers?

Added info (stack trace):

    Traceback (most recent call last):
      File "test.py", line 3, in <module>
        page = urllib2.urlopen("http://localhost:4444")
      File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
        return _opener.open(url, data, timeout)
      File "/usr/lib/python2.6/urllib2.py", line 391, in open
        response = self._open(req, data)
      File "/usr/lib/python2.6
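
In Python 3, urllib2 was split into urllib.request and urllib.error, and HTTP status failures surface as urllib.error.HTTPError, whose code attribute holds the status. The sketch below shows one way to catch 403 and 404 without a wrapper class. It is not taken from the thread: StatusHandler, fetch, and demo are hypothetical names, and the local server exists only so the example runs offline.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StatusHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: /403 and /404 answer with that status."""

    def do_GET(self):
        code = {"/403": 403, "/404": 404}.get(self.path, 200)
        body = b"ok" if code == 200 else b"error"
        self.send_response(code)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(url, opener):
    """Return (status, body); body is None when the server answers 403/404."""
    try:
        with opener.open(url) as page:
            return page.status, page.read()
    except urllib.error.HTTPError as err:
        if err.code in (403, 404):
            return err.code, None
        raise  # any other HTTP error is left to the caller


def demo():
    """Request an OK page, a forbidden page, and a missing page."""
    server = HTTPServer(("127.0.0.1", 0), StatusHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    base = "http://127.0.0.1:%d" % server.server_port
    try:
        return [fetch(base + path, opener)[0] for path in ("/", "/403", "/404")]
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Connection failures such as a refused connection raise urllib.error.URLError rather than HTTPError, so they would need a separate except clause placed after the HTTPError one.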

Remove newline in python with urllib

Question: I am using Python 3.x. While using urllib.request to download a webpage, I am getting a lot of \n in between. I have tried to remove them using the methods given in other threads on the forum, but without success. I have used the strip() and replace() functions, but no luck! I am running this code in Eclipse. Here is my code:

    import urllib.request

    #Downloading entire Web Document
    def download_page(a):
        opener = urllib.request.FancyURLopener({})
        try:
            open_url = opener.open(a)
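
The excerpt ends before the part of the code that handles the downloaded data, so the actual cause is not visible here. One frequent source of stray \n text in Python 3 is calling str() on the bytes returned by read(), which produces a literal backslash followed by n instead of a newline character, so replace("\n", "") finds nothing to remove. The sketch below illustrates that assumption with an invented payload. strip_newlines is a hypothetical helper.

```python
# Stand-in for the bytes returned by read(); this payload is invented.
raw = b"<html>\n<body>\nHello\n</body>\n</html>\n"

# str() on bytes gives the repr: "b'<html>\\n<body>...'" (backslash + n).
as_repr = str(raw)

# decode() gives text containing real newline characters.
as_text = raw.decode("utf-8")


def strip_newlines(text):
    """Remove newline and carriage-return characters from decoded text."""
    return text.replace("\r", "").replace("\n", "")


print(strip_newlines(as_text))
```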

python read file from a web URL

Question: I am currently trying to read a txt file from a website. My script so far is:

    webFile = urllib.urlopen(currURL)

This way, I can work with the file. However, when I try to store the file (in webFile), I only get a link to the socket. Another solution I tried was to use read():

    webFile = urllib.urlopen(currURL).read()

However, this seems to remove the formatting (\n, \t, etc.). If I open the file like this:

    webFile = urllib.urlopen(currURL)

I can read it line by line:

    for line in
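
For comparison, here is a Python 3 sketch of both approaches from the question: reading the whole body at once and iterating line by line. The response yields bytes, so each piece is decoded, and the body is assumed to be UTF-8. read_text, read_lines, and demo are hypothetical names, and the demo uses a local file:// URL instead of a website so that it runs offline.

```python
import pathlib
import tempfile
import urllib.request


def read_text(url, encoding="utf-8"):
    """Return the whole body as one string, newlines and tabs included."""
    with urllib.request.urlopen(url) as web_file:
        return web_file.read().decode(encoding)


def read_lines(url, encoding="utf-8"):
    """Return the body as a list of lines without their line endings."""
    with urllib.request.urlopen(url) as web_file:
        return [line.decode(encoding).rstrip("\r\n") for line in web_file]


def demo():
    """Read a small local text file through a file:// URL."""
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp) / "sample.txt"
        path.write_bytes(b"a\tb\nc\n")
        uri = path.as_uri()
        return read_text(uri), read_lines(uri)


if __name__ == "__main__":
    text, lines = demo()
    print(repr(text))
    print(lines)
```

Printing repr(text) makes the \n and \t characters visible, which shows that read() keeps them.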

How to log in to a website with urllib?

Question: I am trying to log on to this website: http://www.broadinstitute.org/cmap/index.jsp. I am using Python 3.3 on Windows. I followed this answer: https://stackoverflow.com/a/2910487/651779. My code:

    import http.cookiejar
    import urllib

    url = 'http://www.broadinstitute.org/cmap/index.jsp'
    values = {'j_username' : 'username', 'j_password' : 'password'}
    data = urllib.parse.urlencode(values)
    binary_data = data.encode('ascii')
    cookies = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
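
The code above is cut off at build_opener. A typical cookie-aware setup, sketched here as an assumption rather than as the asker's actual continuation, passes an HTTPCookieProcessor wrapping the CookieJar. The submodules are imported explicitly because a bare import urllib does not guarantee that urllib.parse and urllib.request are loaded. build_login is a hypothetical helper, and whether the site accepts these form fields is not something this sketch can confirm, so no request is sent.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_login(values):
    """Return a cookie-aware opener, its cookie jar, and the encoded form body."""
    binary_data = urllib.parse.urlencode(values).encode("ascii")
    cookies = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookies)
    )
    return opener, cookies, binary_data


values = {"j_username": "username", "j_password": "password"}
opener, cookies, binary_data = build_login(values)
print(binary_data)

# Submitting the form would be a network call, so it is left commented out:
# response = opener.open(url, binary_data)
```

Any cookies set by the login response are stored in the jar, and later calls made through the same opener send them back automatically.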