How to retrieve multiple tweets from tweet_id using R

孤城傲影 2021-01-28 19:56

I am using the twitteR package in R to extract tweets based on their IDs, but I am unable to do this for multiple tweet IDs without hitting a rate limit.

1 Answer
  • 2021-01-28 20:36

    I have come across the same issue recently. For retrieving tweets in bulk, Twitter recommends using the statuses/lookup method provided by its API. That way you can get up to 100 tweets per request.

    Unfortunately, this has not been implemented in the twitteR package yet, so I've hacked together a quick function (re-using lots of code from the twitteR package) that calls that API method:

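    lookupStatus <- function (ids, ...){
      # Validate every ID with twitteR's internal checker
      lapply(ids, twitteR:::check_id)
    
      # statuses/lookup accepts at most 100 IDs per request
      batches <- split(ids, ceiling(seq_along(ids)/100))
    
      results <- lapply(batches, function(batch) {
        params <- parseIDs(batch)
        # Extra arguments (e.g. retryOnRateLimit) are passed on to doAPICall
        statuses <- twitteR:::twInterfaceObj$doAPICall(paste("statuses", "lookup", 
                                                             sep = "/"),
                                                       params = params, ...)
        twitteR:::import_statuses(statuses)
      })
      # Flatten the per-batch lists into a single list of status objects
      return(unlist(results))
    }
    
    # Build the request parameters: one comma-separated string of IDs
    parseIDs <- function(ids){
      id_list <- list()
      if (length(ids) > 0) {
        id_list$id <- paste(ids, collapse = ",")
      }
      return(id_list)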
    }
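
    To see what the batching step in lookupStatus does on its own, here is a small sketch that needs no Twitter connection. The IDs are made-up stand-ins and id_params is just an illustrative name; only the split() idiom and the comma-separated format come from the function above.

```r
# Stand-in IDs for illustration only (not real tweet IDs)
ids <- as.character(seq_len(250))

# Positions 1..100 go to batch "1", 101..200 to batch "2", and so on
batches <- split(ids, ceiling(seq_along(ids) / 100))

# Number of IDs in each batch
batch_sizes <- lengths(batches)
print(batch_sizes)

# Each batch is collapsed into one comma-separated string,
# which is what parseIDs() puts into the `id` parameter
id_params <- vapply(batches, paste, character(1), collapse = ",")
print(substr(id_params[["3"]], 1, 23))
```

    With 250 IDs this gives three batches of 100, 100 and 50 IDs, so three API requests in total.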
    

    Make sure that your vector of IDs is of class character (otherwise there can be some problems with very large IDs).
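
    The reason, as far as I can tell, is that R stores plain numbers as doubles, which can only represent every integer exactly up to 2^53, while tweet IDs are larger than that. A minimal sketch of the problem (the second ID is a hypothetical one, chosen to differ from the first only in its last digit):

```r
# Two distinct 18-digit IDs; `b` is made up for this illustration
a <- "432656548536401920"
b <- "432656548536401921"

# As character strings the IDs stay distinct
same_as_character <- (a == b)                           # FALSE

# As numerics both are rounded to the same double
same_as_numeric <- (as.numeric(a) == as.numeric(b))     # TRUE

# Beyond this value doubles can no longer hold every integer exactly
exact_limit <- 2^53

print(c(character = same_as_character, numeric = same_as_numeric))
```

    So if the IDs come from a CSV file or a database, it is safer to read that column as character from the start rather than converting it afterwards.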

    Use the function like this:

    ids <- c("432656548536401920", "332526548546401821")
    tweets <- lookupStatus(ids, retryOnRateLimit=100)
    

    Setting a high retryOnRateLimit ensures you get all your tweets, even if your vector of IDs has more than 18,000 entries (100 IDs per request, 180 requests per 15-minute window).
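
    To get a feeling for how long a big lookup will take, the arithmetic above can be sketched as follows. The helper estimate_wait_minutes is hypothetical (it is not part of twitteR), and it assumes you start with a fresh rate-limit window and make no other API calls in the meantime, so treat the result as a rough lower bound.

```r
ids_per_request     <- 100   # IDs accepted per statuses/lookup request
requests_per_window <- 180   # requests allowed per rate-limit window
window_minutes      <- 15    # length of one rate-limit window

# IDs that fit into a single window
ids_per_window <- ids_per_request * requests_per_window

# Hypothetical helper: minutes spent waiting for new windows
estimate_wait_minutes <- function(n_ids) {
  requests <- ceiling(n_ids / ids_per_request)
  windows  <- ceiling(requests / requests_per_window)
  (windows - 1) * window_minutes
}

print(ids_per_window)                 # 18000
print(estimate_wait_minutes(18000))   # 0: everything fits into one window
print(estimate_wait_minutes(50000))   # 30: 500 requests span 3 windows
```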

    As usual, you can turn the tweets into a data frame with twListToDF(tweets).
