How to grab streaming data from Twitter with pycurl and parse it using NLTK and regular expressions

Question

I am a newbie in Python and my boss has given me the following task:

1. Grab streaming data from Twitter, connecting with pycurl, and output it as JSON
2. Parse it using NLTK and regular expressions
3. Save it to a database (MySQL) or to a plain file (txt)

Note: this is the URL that I want to grab ('http://search.twitter.com/search.json?geocode=-0.789275%2C113.921327%2C1.0km&q=+near%3Aindonesia+within%3A1km&result_type=recent&rpp=10')
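
For readability, the query string in that URL can be rebuilt from its decoded parameters. This is only a sketch: the names `BASE` and `params` are made up for illustration, and it assumes nothing about whether the endpoint still responds.

```python
from urllib.parse import urlencode

# Base endpoint copied from the URL in the question.
BASE = "http://search.twitter.com/search.json"

# Decoded form of the query parameters; urlencode() percent-encodes
# the commas and colons and turns spaces into '+'.
params = {
    "geocode": "-0.789275,113.921327,1.0km",
    "q": " near:indonesia within:1km",
    "result_type": "recent",
    "rpp": 10,
}

url = BASE + "?" + urlencode(params)
print(url)
```

Keeping the parameters in a dict makes it easier to change the location, radius, or result count without hand-editing escape sequences.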

Does anyone know how to grab streaming data from Twitter using the steps above?

Any help would be greatly appreciated :)

Answer 1:

I would look at pattern: it's a very nice web mining library for Python, and it comes with a Twitter mining API as well. The documentation is pretty good too.

Otherwise, look at https://dev.twitter.com/docs/twitter-libraries for Twitter libraries; getting the stream with one of them should be pretty straightforward too.
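
If you do want to stay with pycurl, the three steps from the question could be sketched roughly as below. Treat this as a minimal illustration, not a tested client: the function names (`fetch`, `parse`, `save`) and the `SAMPLE` payload are made up, and the `"results"` / `"text"` keys are an assumption about the shape of the JSON the search URL returns. Plain `re` is used for the extraction; NLTK's `regexp_tokenize` could be swapped in for the same patterns.

```python
import json
import re
from io import BytesIO


def fetch(url):
    """Download `url` with pycurl and return the response body as text."""
    import pycurl  # imported lazily so the parsing code runs without pycurl

    buf = BytesIO()
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, url)
    curl.setopt(pycurl.WRITEDATA, buf)  # collect the body in memory
    curl.perform()
    curl.close()
    return buf.getvalue().decode("utf-8")


# Simple patterns for hashtags and @mentions.
HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")


def parse(body):
    """Turn a JSON response body into a list of small dicts.

    Assumption: tweets sit in a top-level "results" list and each one
    has a "text" field.
    """
    payload = json.loads(body)
    rows = []
    for tweet in payload.get("results", []):
        text = tweet.get("text", "")
        rows.append({
            "text": text,
            "hashtags": HASHTAG.findall(text),
            "mentions": MENTION.findall(text),
        })
    return rows


def save(rows, path):
    """Write one JSON object per line to a plain text file."""
    with open(path, "w", encoding="utf-8") as fh:
        for row in rows:
            fh.write(json.dumps(row, ensure_ascii=False) + "\n")


# Hypothetical payload standing in for a real response.
SAMPLE = '{"results": [{"text": "Macet lagi di #Jakarta, thanks @infoll"}]}'
rows = parse(SAMPLE)
print(rows)
```

With a live endpoint you would call `save(parse(fetch(url)), "tweets.txt")`; writing to MySQL instead would mean replacing `save` with inserts through a database driver of your choice.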

Source: https://stackoverflow.com/questions/6853943/how-to-grab-streaming-data-from-twitter-connect-with-pycurl-using-nltk-regular

Tags

regex

streaming

real-time

nltk

pycurl
