问题
I saw this previous post but I have not been able to adapt the answer to get my code to work.
I am trying to filter on the term bruins and need to reference cacert.pem since for authentication on my Windows machine. Lastly, I have written a function to parse each response (my.function) and need to include this as well.
postForm("https://stream.twitter.com/1/statuses/sample.json",
userpwd="user:pass",
cainfo = "cacert.pem",
a = "bruins",
write=my.function)
I am looking to stay completely within R and unfortunately need to use Windows.
Simply, how can I include the search term(s) that I want such that the response is filtered?
Thanks in advance.
回答1:
Alright, so I've looked at what you're doing, and some of what you're working on may be helped by examining the Twitter API methods, although it can be difficult to figure out how to translate some of the examples into R (via the RCurl Package).
What you're currently trying is very close to what you need to do, you simply need to change two things.
First of all, you're querying the url for the random sample of statuses. This url returns a random sample of roughly 1% of all tweets.
If you're interested in collecting only tweets about specific keywords, you want to use the filter API url: "https://stream.twitter.com/1/statuses/filter.json"
After changing that, you simply need to change your parameter from "a" to "postfields", and the parameter you'd be passing would look like: "track=bruins"
Finally, you should use the getURL function, to open a continuous stream, so all tweets with your keywords can be collected, rather than using the postForm command (which I believe is intended for HTML forms).
so your final function call should look like the following:
getURL("https://stream.twitter.com/1/statuses/filter.json",
userpwd="Username:Password",
cainfo = "cacert.pem",
write=my.function,
postfields="track=bruins")
回答2:
For manipulating twitter, use the twitteR
package.
library(twitteR)
searchTwitter("bruins")
You can include other parameters (like cainfo
) in the call to searchTwitter
, and they should get passed getForm
underneath.
回答3:
I don't think the Streaming API is currently included in twitteR - the search api is different (it's backward looking, whereas streaming is "current looking").
From my understanding, streaming is quite different to how lots of APIs work typically work; rather than pulling data from a web service and having a defined object returned, you're setting up a "pipe" for Twitter to push data to you and you then listen for that response.
You also need to worry about OAuth I think (which twitteR does deal with).
Is there any reason that you want to keep in R? I've used python successfully with the Streaming API and a package called tweepy to write data to a MySQL database and then use R to query and analyse the data.
回答4:
Last time I checked, twitteR did not talk to the streaming API. Moreover, as far as I know, very few publicly-available Twitter Streaming API connection libraries in any language honor Twitter's recommendations on reconnecting when Streaming disconnects / throws an error.
My recommendation is to access Streaming via a library that's actively maintained, write the re-connection protocol yourself if you have to, and persist the data into a database that handles JSON natively. I'm about to embark on a project of this nature and will be writing the collector in Perl, doing my own re-connect logic and persisting into either PostgreSQL or MongoDB. Most likely it will be MongoDB; PostgreSQL doesn't get native JSON till 9.2.
回答5:
Late to the game, I know, but you'll want to use the "streamR" package for access to Twitter's streaming API.
来源:https://stackoverflow.com/questions/8797608/rcurl-twitter-streaming-api-keyword-filtering