How to grab streaming data from Twitter with pycurl and parse it using NLTK and regular expressions

Posted by 旧时模样 on 2020-01-01 19:05:12

Question


I am a newbie in Python and my boss has given me the following task:

  1. Grab streaming data from Twitter with pycurl and output it as JSON
  2. Parse it using NLTK and regular expressions
  3. Save it to a database (MySQL) or a flat file (txt)

Note: this is the URL that I want to grab ('http://search.twitter.com/search.json?geocode=-0.789275%2C113.921327%2C1.0km&q=+near%3Aindonesia+within%3A1km&result_type=recent&rpp=10')

Does anyone know how to grab streaming data from Twitter using the steps above?

Any help would be greatly appreciated :)


Answer 1:


I would look at pattern: it's a very nice web mining library, and it comes with a Twitter mining API as well. The documentation is pretty good too.

Otherwise, look at https://dev.twitter.com/docs/twitter-libraries for Twitter libraries; getting the stream with one of them should be pretty straightforward too.
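
If you would rather stay with pycurl, here is a rough sketch of the three steps from the question. It is only an illustration under some assumptions: the `results`, `from_user` and `text` field names and the `SAMPLE` payload are my guess at the shape of the old search.json response, the helper names (`fetch`, `parse`, `save`) are made up for this example, and the search URL in the question may no longer respond. Plain `re` patterns are used for the parsing step; NLTK's `regexp_tokenize` could be swapped in if you need proper tokenization.

```python
import json
import re
from io import BytesIO

# Hypothetical sample shaped like a search.json response; the field names
# ("results", "from_user", "text") are assumptions, not a confirmed schema.
SAMPLE = '{"results": [{"from_user": "budi", "text": "Macet di #Jakarta lagi http://t.co/abc @andi"}]}'

HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")
URL_RE = re.compile(r"https?://\S+")


def fetch(url):
    """Step 1: download the response body with pycurl and return it as text."""
    import pycurl  # imported lazily so the parsing code runs without pycurl

    buffer = BytesIO()
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, url)
    curl.setopt(pycurl.WRITEDATA, buffer)
    curl.perform()
    curl.close()
    return buffer.getvalue().decode("utf-8")


def parse(raw_json):
    """Step 2: extract user, hashtags, mentions and URLs from each tweet."""
    rows = []
    for tweet in json.loads(raw_json).get("results", []):
        text = tweet.get("text", "")
        rows.append({
            "user": tweet.get("from_user", ""),
            "hashtags": HASHTAG_RE.findall(text),
            "mentions": MENTION_RE.findall(text),
            "urls": URL_RE.findall(text),
            "text": text,
        })
    return rows


def save(rows, path):
    """Step 3: append one tab-separated line per tweet to a text file."""
    with open(path, "a", encoding="utf-8") as handle:
        for row in rows:
            # Collapse whitespace so a tweet never spans several lines.
            clean_text = " ".join(row["text"].split())
            fields = [row["user"], ",".join(row["hashtags"]), clean_text]
            handle.write("\t".join(fields) + "\n")


# Parse the bundled sample; with network access you could pass fetch(url) instead.
rows = parse(SAMPLE)
```

Calling `save(rows, "tweets.txt")` afterwards would append the parsed tweets to a text file; writing to MySQL instead would mean replacing `save` with inserts through a database driver of your choice.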



Source: https://stackoverflow.com/questions/6853943/how-to-grab-streaming-data-from-twitter-connect-with-pycurl-using-nltk-regular
