urllib

python read file from a web URL

余生长醉 submitted on 2019-12-01 11:03:16
I am currently trying to read a txt file from a website. My script so far is:
    webFile = urllib.urlopen(currURL)
This way, I can work with the file. However, when I try to store the file (in webFile), I only get a link to the socket. Another solution I tried was to use read():
    webFile = urllib.urlopen(currURL).read()
However, this seems to remove the formatting (\n, \t, etc.). If I open the file like this:
    webFile = urllib.urlopen(currURL)
I can read it line by line:
    for line in webFile:
        print line
This should result in: "this" "is" "a" "textfile" But I get: 't' 'h' 'i' ... I
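One likely explanation for the single characters is that the loop ran over the string returned by read(); iterating over a string yields characters, not lines. Below is a minimal Python 3 sketch of reading a text resource line by line. The helper names decode_lines and read_lines are hypothetical, and UTF-8 is an assumed encoding that may not match the real file.

```python
import io
import urllib.request


def decode_lines(binary_stream, encoding="utf-8"):
    """Decode each line of a binary file-like object and strip its line ending."""
    # Iterating a binary stream (or an HTTP response) yields one bytes object per line.
    return [raw.decode(encoding).rstrip("\r\n") for raw in binary_stream]


def read_lines(url, encoding="utf-8"):
    """Fetch `url` and return its content as a list of text lines."""
    with urllib.request.urlopen(url) as response:
        return decode_lines(response, encoding)


if __name__ == "__main__":
    # Offline demonstration with an in-memory stream instead of a real URL.
    sample = io.BytesIO(b"this\nis\na\ntextfile\n")
    print(decode_lines(sample))
```

Splitting the decoding into its own helper keeps the network call separate, so the line handling can be checked without a connection.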

How to log in to a website with urllib?

自闭症网瘾萝莉.ら submitted on 2019-12-01 10:49:13
I am trying to log in to this website: http://www.broadinstitute.org/cmap/index.jsp . I am using Python 3.3 on Windows. I followed this answer: https://stackoverflow.com/a/2910487/651779 . My code:
    import http.cookiejar
    import urllib
    url = 'http://www.broadinstitute.org/cmap/index.jsp'
    values = {'j_username' : 'username', 'j_password' : 'password'}
    data = urllib.parse.urlencode(values)
    binary_data = data.encode('ascii')
    cookies = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener( urllib.request.HTTPRedirectHandler(), urllib.request.HTTPHandler(debuglevel=0), urllib.request
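As a rough illustration of the approach in this question, here is a self-contained Python 3 sketch that pairs a cookie-aware opener with a form-encoded POST request. The function name build_login is hypothetical, the field names j_username and j_password are taken from the question, and whether the site actually accepts a form posted to this URL is an assumption that the excerpt does not confirm.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_login(url, username, password):
    """Return (opener, request, cookies) for a form-based login attempt."""
    cookies = http.cookiejar.CookieJar()
    # HTTPCookieProcessor stores cookies from responses and sends them back later.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cookies))
    form = urllib.parse.urlencode({"j_username": username, "j_password": password})
    # Supplying `data` (as bytes) turns the request into a POST.
    request = urllib.request.Request(url, data=form.encode("ascii"))
    return opener, request, cookies


if __name__ == "__main__":
    opener, request, cookies = build_login(
        "http://www.broadinstitute.org/cmap/index.jsp", "username", "password"
    )
    print(request.get_method(), request.data)
    # To actually send it: response = opener.open(request)
```

Reusing the same opener for later requests is what carries the session cookie forward.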

Python's urllib2 doesn't work on some sites

☆樱花仙子☆ submitted on 2019-12-01 09:45:35
Question: I found that you can't read from some sites using Python's urllib2 (or urllib). An example:
    urllib2.urlopen("http://www.dafont.com/").read() # Returns ''
These sites work when you visit them with a browser. I can even scrape them using PHP (I didn't try other languages). I have seen other sites with the same issue, but can't remember the URLs at the moment. My questions are: What is the cause of this issue? Are there any workarounds?
Answer 1: I believe it gets blocked because of the User-Agent. You can change
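If the User-Agent really is the cause, one way to change it in Python 3 is to set default headers on an opener, as in the sketch below. The helper name make_opener and the User-Agent string are illustrative assumptions, not values from the answer, and there is no guarantee that a given site will accept them.

```python
import urllib.request

# Assumed, illustrative value; any browser-like string can be substituted.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"


def make_opener(user_agent=BROWSER_UA):
    """Build an opener that sends `user_agent` instead of the Python-urllib default."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs applied to every request
    # that does not already carry that header.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


if __name__ == "__main__":
    opener = make_opener()
    print(opener.addheaders)
    # To fetch a page: html = opener.open("http://www.dafont.com/").read()
```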

urllib.urlencode: TypeError not a valid non-string sequence or mapping object

时光总嘲笑我的痴心妄想 submitted on 2019-12-01 08:37:13
I am trying to run the following code, but it gives me the error below:
    Traceback (most recent call last):
      File "put_message.py", line 43, in <module>
        translatedWord=getTranslatedValue(source_lang,source_word,dest_lang,apiKey)
      File "put_message.py", line 22, in getTranslatedValue
        source_word=urllib.urlencode(source_word)
      File "/usr/lib/python2.7/urllib.py", line 1318, in urlencode
        raise TypeError
    TypeError: not a valid non-string sequence or mapping object
My program is given below:
    Script to Translate the data from one language to another
    import MySQLdb
    import json
    import urllib, urllib2
    import
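The traceback shows a plain string being passed to urlencode, which expects a mapping or a sequence of key/value pairs. The sketch below uses the Python 3 equivalents in urllib.parse (the question itself uses Python 2.7); the helper names and the parameter names q and target are hypothetical.

```python
from urllib.parse import quote_plus, urlencode


def encode_params(params):
    """Encode a mapping (or a sequence of pairs) as a query string."""
    return urlencode(params)


def encode_word(word):
    """Percent-encode a single string value, with spaces becoming '+'."""
    return quote_plus(word)


if __name__ == "__main__":
    print(encode_params({"q": "hello world", "target": "fr"}))
    print(encode_word("hello world"))
    try:
        urlencode("hello world")  # a bare string is rejected
    except TypeError as exc:
        print("TypeError:", exc)
```

In short, use urlencode for a whole set of parameters and quote_plus (or quote) for one value.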

Obnoxious CryptographyDeprecationWarning because of missing hmac.compare_time function everywhere

血红的双手。 submitted on 2019-12-01 05:39:20
Things were running along fine until one of my projects started printing this everywhere, at the top of every execution, at least once: local/lib/python2.7/site-packages/cryptography/hazmat/primitives/constant_time.py:26: CryptographyDeprecationWarning: Support for your Python version is deprecated. The next version of cryptography will remove support. Please upgrade to a 2.7.x release that supports hmac.compare_digest as soon as possible. I have no idea why it started and it's disrupting the applications'/tools' output, especially when it's being captured and consumed by other tools. Like

Learning Python urllib

江枫思渺然 submitted on 2019-12-01 05:08:09
    import json
    import urllib.request
    ua = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'
    # Plain request with urllib's default headers
    r = urllib.request.urlopen('http://httpbin.org/get')
    # Request to the same service with a custom User-Agent header
    req = urllib.request.Request("http://httpbin.org/user-agent")
    req.add_header('User-Agent', ua)
    r = urllib.request.urlopen(req)
    r = r.read().decode()
    rsp = json.loads(r)
    print(rsp.get('user-agent'))
Check the User-Agent.
Source: https://www.cnblogs.com/xupanfeng/p/11657754.html

What is the difference between <class 'str'> and <type 'str'>

旧城冷巷雨未停 submitted on 2019-12-01 03:57:51
I am new to Python and I'm confused by <class 'str'>. I got a str by using:
    response = urllib.request.urlopen(req).read().decode()
The type of 'response' is <class 'str'>, not <type 'str'>. When I try to iterate over this str in a for loop:
    for ID in response:
the 'response' is read not line by line, but character by character. I intend to put every line of 'response' into an individual element of a list. For now I have to write the response to a file and use open() to get a string of <type 'str'> that I can use in a for loop. As mentioned by the commenters, in Python 3:
    >>> st = 'Hello Stack!'
    >>> type(st)
    <class
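For context, <class 'str'> is simply how Python 3 prints the type that Python 2 showed as <type 'str'>. Iterating over any str yields characters in both versions, so the file round trip is not needed. Here is a small sketch of splitting the decoded response into lines; the helper name split_response and the sample text are made up for illustration.

```python
def split_response(text):
    """Return the lines of `text` without their line endings."""
    # str.splitlines handles \n, \r\n and \r alike.
    return text.splitlines()


if __name__ == "__main__":
    response = "ID1\nID2\r\nID3\n"  # stand-in for the decoded HTTP body
    print(list(response)[:3])        # iterating a str gives single characters
    print(split_response(response))  # one list element per line
```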

Again urllib.error.HTTPError: HTTP Error 400: Bad Request

别来无恙 submitted on 2019-12-01 03:43:02
Hi! I tried to open a web page that opens normally in a browser, but Python just swears and refuses to work.
    import urllib.request, urllib.error
    f = urllib.request.urlopen('http://www.booking.com/reviewlist.html?cc1=tr;pagename=sapphire')
And another way:
    import urllib.request, urllib.error
    opener=urllib.request.build_opener()
    f=opener.open('http://www.booking.com/reviewlist.html?cc1=tr;pagename=sapphire')
Both options give the same type of error:
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Python34\lib\urllib\request.py", line 461, in open
        response =
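When a server answers with an error status such as 400, urllib raises urllib.error.HTTPError. The exception carries the status code, the reason, and the response body, which often hints at what the server disliked. The sketch below shows one way to surface that information; describe_http_error and fetch are hypothetical helper names, and the demonstration uses a hand-built error object rather than a real request.

```python
import email.message
import io
import urllib.error
import urllib.request


def describe_http_error(err):
    """Summarise an HTTPError as 'code reason: start of body'."""
    body = err.read().decode("utf-8", errors="replace")
    return "{} {}: {}".format(err.code, err.reason, body[:200])


def fetch(url):
    """Return the body of `url`, printing details before re-raising on HTTP errors."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        print(describe_http_error(err))
        raise


if __name__ == "__main__":
    # Hand-built error for an offline demonstration; not a real server reply.
    fake = urllib.error.HTTPError(
        "http://example.invalid/", 400, "Bad Request",
        email.message.Message(), io.BytesIO(b"missing parameter"),
    )
    print(describe_http_error(fake))
```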

How to fetch network resources with urllib2 in Python

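馋奶兔 submitted on 2019-12-01 01:58:28
urllib2 is a Python module for fetching URLs (Uniform Resource Locators). It offers a very simple interface in the form of the urlopen function, which can fetch URLs over a variety of protocols. It also offers a somewhat more complex interface for handling common situations such as basic authentication, cookies, proxies and so on. These are provided by objects called handlers and openers.
urllib2 supports fetching URLs for many URL schemes (identified by the string before the ":" in the URL; for example, "ftp" is the scheme of "ftp://python.org/"), using their associated network protocols (e.g. FTP, HTTP). This tutorial focuses on the most common case, HTTP.
For simple uses, urlopen is very easy to work with. But as soon as you run into errors or exceptions when opening HTTP URLs, you will need some understanding of the HyperText Transfer Protocol (HTTP). The most authoritative reference on HTTP is of course RFC 2616 ( http://rfc.net/rfc2616.html ). It is a technical document and not easy to read. The purpose of this HOWTO is to show how to use urllib2, with enough detail about HTTP to help you understand it. It is not a replacement for the urllib2 documentation, but a supplement to it.
Fetching URLs
The simplest way to use urllib2 is as follows:
    import urllib2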
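To illustrate the point about one interface serving several URL schemes, here is a short sketch using urllib.request, the Python 3 successor to urllib2. It fetches a file: URL so that it runs without a network connection. The helper names fetch and demo, the 10-second timeout, and the sample file content are all assumptions made for the example.

```python
import pathlib
import tempfile
import urllib.request


def fetch(url, timeout=10):
    """Return the raw bytes behind `url` (http:, https:, ftp:, file:, ...)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def demo():
    """Write a temporary file and fetch it back through a file: URL."""
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp) / "sample.txt"
        path.write_bytes(b"hello from a file URL\n")
        # as_uri() turns the absolute path into a file:// URL.
        return fetch(path.as_uri())


if __name__ == "__main__":
    print(demo())
```

The same fetch call works for an HTTP URL; only the string passed in changes.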

How to download a webpage that require username and password?

大兔子大兔子 submitted on 2019-12-01 01:40:14
For example, I want to download this page after entering a username and password: http://forum.ubuntu-it.org/ I have tried wget, but it doesn't work. Is there a solution with Python? You can test with this username and password: username: johnconnor password: hellohello
Like @robert says, use mechanize. To get you started:
    from mechanize import Browser
    b = Browser()
    b.open("http://forum.ubuntu-it.org/index.php")
    b.select_form(nr=0)
    b["user"] = "johnconnor"
    b["passwrd"] = "hellohello"
    b.submit()
    response = b.response().read()
    if "Salve <b>johnconnor</b>" in response:
        print "Logged in!"
Try