How to call Twitter's Streaming/Filter Feed with urllib2/httplib?

Submitted by 无人久伴 on 2019-12-11 16:21:40

Question


Update:

I switched this back from answered as I tried the solution posed in cogent Nick's answer and switched to Google's urlfetch:

logging.debug("starting urlfetch for http://%s%s" % (self.host, self.url))
result = urlfetch.fetch("http://%s%s" % (self.host, self.url), payload=self.body, method="POST", headers=self.headers, allow_truncated=True, deadline=5)
logging.debug("finished urlfetch")

but unfortunately "finished urlfetch" is never printed. I see the timeout happen in the logs (it returns 200 after 5 seconds), but execution doesn't seem to return.


Hi All-

I'm attempting to play around with Twitter's Streaming (aka firehose) API with Google App Engine (I'm aware this probably isn't a great long term play as you can't keep the connection perpetually open with GAE), but so far I haven't had any luck getting my program to actually parse the results returned by Twitter.

Some code:

logging.debug("firing up urllib2")
req = urllib2.Request(url="http://%s%s" % (self.host, self.url), data=self.body, headers=self.headers)
logging.debug("built request for %s%s, about to call urlopen" % (self.host, self.url))
fobj = urllib2.urlopen(req)
logging.debug("called urlopen")

When this executes, unfortunately, my debug output never shows the called urlopen line printed. I suspect what's happening is that Twitter keeps the connection open and urllib2 doesn't return because the server doesn't terminate the connection.

Wireshark shows the request being sent properly and a response returned with results.

I tried adding Connection: close to my request header, but that didn't yield a successful result.

Any ideas on how to get this to work?


Answer 1:


On App Engine, urllib2 (like urllib) is a thin wrapper around the urlfetch API. You're right about what's happening: Twitter's streaming API never terminates its response, so the request times out and urlfetch throws an exception.

If you use urlfetch directly, you can set the timeout (up to 10 seconds), and set allow_truncated to True so you can get the partial result. The Twitter streaming API really isn't a good match for App Engine, though, because App Engine requests are limited to 30 seconds of execution time, and urlfetch requests can't send back results progressively, or take more than 10 seconds. Using Twitter's 'standard' API would be a better option.
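As a rough illustration of the allow_truncated approach described above, here is a minimal sketch. It assumes the fetch call does return whatever content arrived before the deadline, and that the stream is newline-delimited JSON with blank keep-alive lines; neither is confirmed in this thread. The names fetch_stream_sample and parse_stream_chunk are hypothetical helpers. The fetch argument stands in for urlfetch.fetch so the parsing can be tried outside App Engine.

```python
import json


def parse_stream_chunk(content):
    """Split a possibly truncated chunk of newline-delimited JSON.

    Returns (messages, leftover). leftover is the text after the last
    newline, which may be an incomplete message cut off by the deadline.
    """
    if isinstance(content, bytes):
        # A cut-off multi-byte character at the end is replaced, not fatal.
        content = content.decode("utf-8", "replace")
    lines = content.split("\n")
    leftover = lines.pop()  # text after the last newline; may be partial
    messages = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # assumed blank keep-alive line
        try:
            messages.append(json.loads(line))
        except ValueError:
            continue  # skip anything that is not valid JSON
    return messages, leftover


def fetch_stream_sample(fetch, url, body, headers, deadline=10):
    """Make one bounded POST and parse whatever arrived before the deadline.

    fetch is expected to behave like google.appengine.api.urlfetch.fetch
    (an assumption here; it is passed in so the sketch runs anywhere).
    """
    result = fetch(url, payload=body, method="POST", headers=headers,
                   allow_truncated=True, deadline=deadline)
    return parse_stream_chunk(result.content)
```

On App Engine this would be called as fetch_stream_sample(urlfetch.fetch, "http://%s%s" % (host, url), body, headers). Each call still yields only one short sample of the stream, which is why the standard API remains the better fit.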



Source: https://stackoverflow.com/questions/2543339/how-to-call-twitters-streaming-filter-feed-with-urllib2-httplib
