Question
This works for me:
import xml.etree.ElementTree as ET
from urllib2 import urlopen
url = 'http://example.com'
# this url points to a `xml` page
tree = ET.parse(urlopen(url))
However, when I switch to requests, something goes wrong:
import requests
import xml.etree.ElementTree as ET
url = 'http://example.com'
# this url points to a `xml` page
tree = ET.parse(requests.get(url))
The traceback is shown below:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in ()
----> 1 tree = ET.parse(requests.get(url, proxies={'http': '192.168.235.36:7788'}))
/usr/lib/python2.7/xml/etree/ElementTree.py in parse(source, parser)
1180 def parse(source, parser=None):
1181 tree = ElementTree()
-> 1182 tree.parse(source, parser)
1183 return tree
1184
/usr/lib/python2.7/xml/etree/ElementTree.py in parse(self, source, parser)
645 close_source = False
646 if not hasattr(source, "read"):
--> 647 source = open(source, "rb")
648 close_source = True
649 try:
TypeError: coercing to Unicode: need string or buffer, Response found
So, my question is: what is wrong with requests in my situation, and how can I make ET work with requests?
Answer 1:
You are passing the requests response object to ElementTree; you want to pass in the raw file object instead:
r = requests.get(url, stream=True)
ET.parse(r.raw)
.raw returns the 'file-like' socket object, from which ElementTree.parse() will read, just like it'll read from the urllib2 response (which is itself a file-like object).
Concrete example:
>>> r = requests.get('http://www.enetpulse.com/wp-content/uploads/sample_xml_feed_enetpulse_soccer.xml', stream=True)
>>> tree = ET.parse(r.raw)
>>> tree
<xml.etree.ElementTree.ElementTree object at 0x109dadc50>
>>> tree.getroot().tag
'spocosy'
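The same behaviour can be reproduced without a network connection. The following is only a sketch, assuming Python 3: io.BytesIO stands in for r.raw, the XML payload is made up, and NoRead is a hypothetical class that plays the role of an object without a read() method (such as the Response object in the question).

```python
import io
import xml.etree.ElementTree as ET

# Made-up payload standing in for the bytes a server would send.
xml_bytes = b"<?xml version='1.0' encoding='utf-8'?><feed><item id='1'/></feed>"

# BytesIO has a read() method, like r.raw or a urllib2 response,
# so ET.parse() reads from it directly.
file_like = io.BytesIO(xml_bytes)
tree = ET.parse(file_like)
root_tag = tree.getroot().tag


class NoRead(object):
    """Hypothetical object with no read() method."""


# Without read(), ET.parse() treats the argument as a filename
# and hands it to open(), which rejects it with a TypeError.
try:
    ET.parse(NoRead())
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

print(root_tag, error_name)
```

This is the check visible at line 646 of the traceback: anything lacking read() is passed to open() as if it were a path.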
If the server sends a compressed response, the raw socket (like urllib2) returns the compressed data undecoded; in that case you can use the ET.fromstring() function on the binary response content:
r = requests.get(url)
ET.fromstring(r.content)
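To illustrate why undecoded data is a problem, here is a small offline sketch, assuming Python 3. The payload is made up, and gzip.compress() merely simulates what a server using gzip content encoding would put on the wire; requests itself is not involved.

```python
import gzip
import xml.etree.ElementTree as ET

# Made-up XML document and its gzip-compressed form.
xml_bytes = b"<feed><item id='1'/></feed>"
compressed = gzip.compress(xml_bytes)

# Feeding the still-compressed bytes to the parser fails,
# because they are not well-formed XML.
try:
    ET.fromstring(compressed)
    parsed_compressed = True
except ET.ParseError:
    parsed_compressed = False

# After decompression the same bytes parse normally.
root = ET.fromstring(gzip.decompress(compressed))

print(parsed_compressed, root.tag)
```

With requests, r.content has already had gzip and deflate encodings decoded, so no manual decompression is needed there. Another commonly used option, if you want to keep streaming, is to set r.raw.decode_content = True before calling ET.parse(r.raw); treat that as a suggestion to verify against your requests version.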
Answer 2:
You're not feeding ElementTree the response text, but the requests Response object itself, which is why you get the type error: need string or buffer, Response found. Do this instead:
r = requests.get(url)
tree = ET.fromstring(r.text)
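One detail worth knowing when using this approach: ET.fromstring() returns the root Element, not an ElementTree like ET.parse() does. A short sketch, assuming Python 3 and a made-up XML string in place of r.text:

```python
import xml.etree.ElementTree as ET

# Made-up XML text standing in for r.text.
xml_text = "<feed><item id='1'/><item id='2'/></feed>"

# fromstring() gives back the root Element directly.
root = ET.fromstring(xml_text)
ids = [item.get("id") for item in root.findall("item")]

# Wrap it in an ElementTree if tree-level methods are needed.
tree = ET.ElementTree(root)

print(root.tag, ids)
```

So there is no need to call getroot() on the result of fromstring(); it can be searched right away.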
Source: https://stackoverflow.com/questions/16933637/python-typeerror-while-using-xml-etree-elementree-and-requests