urllib2

JSON.loads() ValueError Extra Data in Python

Submitted by 痞子三分冷 on 2020-01-01 19:41:16
Question: I'm trying to read individual values from a JSON feed. Here is an example of the feed data: { "sendtoken": "token1", "bytes_transferred": 0, "num_retries": 0, "timestamp": 1414395374, "queue_time": 975, "message": "internalerror", "id": "mailerX", "m0": { "binding_group": "domain.com", "recipient_domain": "hotmail.com", "recipient_local": "destination", "sender_domain": "domain.com", "binding": "mail.domain.com", "message_id": "C1/34-54876-D36FA645", "api_credential": "creds", "sender_local":
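The "Extra data" ValueError usually means the feed contains more than one JSON object back to back, while json.loads() expects exactly one document. A minimal sketch using json.JSONDecoder.raw_decode to walk a concatenated stream (the two-record sample below is invented for illustration, not the real feed):

```python
import json

def parse_concatenated(text):
    """Yield each JSON object found in a string of back-to-back objects."""
    decoder = json.JSONDecoder()
    idx, length = 0, len(text)
    while idx < length:
        # Skip whitespace (or newlines) separating the objects.
        while idx < length and text[idx].isspace():
            idx += 1
        if idx >= length:
            break
        # raw_decode returns the object plus the index where it ended.
        obj, end = decoder.raw_decode(text, idx)
        yield obj
        idx = end

feed = ('{"sendtoken": "token1", "id": "mailerX"} '
        '{"sendtoken": "token2", "id": "mailerY"}')
records = list(parse_concatenated(feed))
```

After this, `records[0]["sendtoken"]` gives the first record's token; json.loads() on the same string would raise the "Extra data" error at the start of the second object.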

Python: urlretrieve PDF downloading

Submitted by 僤鯓⒐⒋嵵緔 on 2020-01-01 15:35:14
Question: I am using urllib's urlretrieve() function in Python to try to grab some PDFs from websites. It has (at least for me) stopped working and is downloading damaged data (15 KB instead of 164 KB). I have tested this with several PDFs, all with no success (i.e. random.pdf). I can't seem to get it to work, and I need to be able to download PDFs for the project I am working on. Here is an example of the kind of code I am using to download the PDFs (and parse the text using pdftotext.exe):
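A 15 KB file where a 164 KB PDF was expected is often an HTML error page saved under a .pdf name, served because the site rejects the default Python User-Agent. One defensive sketch, assuming that is the cause: send a browser-like User-Agent and check the PDF magic bytes before saving (Python 3's urllib.request shown; the header value and URL are placeholders):

```python
import urllib.request

def looks_like_pdf(data):
    """Every real PDF file starts with the %PDF magic bytes."""
    return data.startswith(b"%PDF")

def fetch_pdf(url, dest, user_agent="Mozilla/5.0 (compatible; pdf-fetcher)"):
    """Download a PDF, refusing to save an HTML error page in its place."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        data = resp.read()
    if not looks_like_pdf(data):
        raise ValueError("got %d bytes that are not a PDF" % len(data))
    with open(dest, "wb") as fh:
        fh.write(data)
    return len(data)

# fetch_pdf("http://example.com/random.pdf", "random.pdf")  # network call
```

The magic-byte check also catches truncated downloads that pdftotext.exe would otherwise choke on.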

urllib2.HTTPError: HTTP Error 401 while querying using the new Bing API ( in azure marketplace )

Submitted by 感情迁移 on 2020-01-01 12:23:05
Question: So, I've made corrections based on most of the answers under the same roof on Stack Overflow, and I'm still unable to resolve this problem. queryBingFor = "Google Fibre" quoted_query = urllib.quote(queryBingFor) account_key = "dslfkslkdfhsehwekhrwkj2187iwekjfkwej3" rootURL = "https://api.datamarket.azure.com/Bing/Search/v1/" searchURL = rootURL + "Image?format=json&Query=" + quoted_query cred = base64.encodestring(accountKey) reqBing = urllib2.Request(url=searchURL) author = "Basic %s" % cred
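Two likely culprits in the snippet above: `base64.encodestring()` appends a trailing newline, which corrupts the Authorization header, and the Azure Marketplace expected the account key supplied as both username and password. A sketch of building the header under those assumptions, using `b64encode` (which adds no newline); the key is the placeholder from the question:

```python
import base64

def basic_auth_header(account_key):
    """Build a Basic auth header value with the account key as both
    username and password; b64encode adds no trailing newline."""
    cred = "%s:%s" % (account_key, account_key)
    token = base64.b64encode(cred.encode("ascii")).decode("ascii")
    return "Basic " + token

key = "dslfkslkdfhsehwekhrwkj2187iwekjfkwej3"
header = basic_auth_header(key)
# reqBing.add_header("Authorization", header)  # then urllib2.urlopen(reqBing)
```

Note also that the original code mixes `account_key` and `accountKey`, which would raise a NameError before the request was ever sent.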

urllib2.urlopen will hang forever despite of timeout

Submitted by 此生再无相见时 on 2020-01-01 03:24:07
Question: Hopefully this is quite a simple question, but it's driving me crazy. I'm using Python 2.7.3 on an out-of-the-box installation of Ubuntu 12.10 server. I kept zooming in on the problem until I got to this snippet: import urllib2 x = urllib2.urlopen("http://casacinema.eu/movie-film-Matrix+trilogy+123+streaming-6165.html", timeout=5) It simply hangs forever and never times out. I'm evidently doing something wrong. Could anybody please help? Thank you very much indeed! Matteo Answer 1: Looks like you are

HTTPS log in with urllib2

Submitted by 核能气质少年 on 2020-01-01 03:22:09
Question: I currently have a little script that downloads a webpage and extracts some data I'm interested in. Nothing fancy. Currently I'm downloading the page like so: import commands command = 'wget --output-document=- --quiet --http-user=USER --http-password=PASSWORD https://www.example.ca/page.aspx' status, text = commands.getstatusoutput(command) Although this works perfectly, I thought it would make sense to remove the dependency on wget. I thought it should be trivial to convert the above to urllib2
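The wget flags `--http-user`/`--http-password` correspond to HTTP Basic auth, which urllib2 handles with a password manager plus HTTPBasicAuthHandler. A sketch of the equivalent in Python 3's urllib.request (the Python 2 urllib2 module has the same class names; URL and credentials are the placeholders from the question):

```python
import urllib.request

def make_authenticated_opener(url, user, password):
    """Build an opener that answers HTTP Basic auth challenges for `url`."""
    mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # None = accept whatever realm the server advertises.
    mgr.add_password(None, url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(mgr)
    return urllib.request.build_opener(handler)

opener = make_authenticated_opener(
    "https://www.example.ca/page.aspx", "USER", "PASSWORD")
# text = opener.open("https://www.example.ca/page.aspx").read()  # network call
```

Unlike wget, this handler only sends credentials after a 401 challenge; if the server expects them preemptively, add an Authorization header to the Request instead.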

10 - Setting Up a Proxy to Bypass Blocking | 01. Data Scraping | Python

Submitted by 天涯浪子 on 2020-01-01 02:56:27
10 - Setting Up a Proxy to Bypass Blocking. By Zheng Yun (郑昀), 2010-05. Part of the section "01. Data Scraping". When we access blocked sites such as Twitter, we need to configure a proxy.

1. Using an HTTP proxy

The following are the setup methods for an ordinary HTTP proxy:

1.1. pycurl setup

_proxy_connect = "http://127.0.0.1:1984"
c = pycurl.Curl()
…
c.setopt(pycurl.PROXY, _proxy_connect)

1.2. urllib2 setup

req = urllib2.Request(link)
proxy = urllib2.ProxyHandler({'http': 'http://127.0.0.1:1984'})
opener = urllib2.build_opener(proxy, urllib2.HTTPHandler)
urllib2.install_opener(opener)
req.add_header('User-Agent', URLLIB2_USER_AGENT)
urllib2.urlopen(link)
…
opener.close()

1.3. httplib setup

conn = httplib.HTTPConnection("127.0.0.1", 1984)
conn.request("GET", "http:/
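For readers on Python 3, the urllib2 recipe above translates to urllib.request almost line for line; a sketch using the same local proxy address from the article (the User-Agent value is a placeholder):

```python
import urllib.request

# Route both plain and TLS traffic through the local proxy at 127.0.0.1:1984.
proxy = urllib.request.ProxyHandler({
    "http": "http://127.0.0.1:1984",
    "https": "http://127.0.0.1:1984",
})
opener = urllib.request.build_opener(proxy)
opener.addheaders = [("User-Agent", "Mozilla/5.0 (compatible; example)")]
urllib.request.install_opener(opener)  # later urlopen() calls use the proxy

# urllib.request.urlopen("http://twitter.com/")  # network call via the proxy
```

If no ProxyHandler is installed, urllib.request falls back to the `http_proxy`/`https_proxy` environment variables, which is often the simpler route for one-off scripts.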

urllib2 Error 403: Forbidden

Submitted by 情到浓时终转凉″ on 2019-12-31 06:24:12
Question: I have posted to this site before and received really helpful guidance, so I return with another question. Where have I gone wrong here? I was pretty sure this is what is required to access information from various sites, in this case the CME Group. import urllib2 url = "http://www.cmegroup.com/trading/energy/natural-gas/natural-gas.html" request= urllib2.Request(url) handle = urllib2.urlopen(request) content = handle.read() splitted_page = content.split("<span class=\"cmeSubHeading\">", 1); splitted
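A 403 from a request that works in a browser usually means the site is rejecting the default Python User-Agent string. A sketch of the usual fix, shown with Python 3's urllib.request (in Python 2, pass the same `headers` dict to urllib2.Request); the User-Agent value is illustrative, not prescribed by the site:

```python
import urllib.request

url = "http://www.cmegroup.com/trading/energy/natural-gas/natural-gas.html"

# Present a browser-like User-Agent instead of the default "Python-urllib/x.y".
request = urllib.request.Request(url, headers={
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
})
# content = urllib.request.urlopen(request).read()  # network call
# splitted_page = content.split(b'<span class="cmeSubHeading">', 1)
```

If the 403 persists with a browser User-Agent, the site may also check Referer or cookies, or forbid scraping outright in its terms of use.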
