How do I pull tweets from a user for specific dates in Python?

て烟熏妆下的殇ゞ submitted on 2020-05-30 08:13:31

Question


I am trying to download tweets from the Reuters (@reuters) Twitter account for the month of November 2019.

I am using tweepy in Python and this is my code:

# Install first, from a shell: pip install tweepy
import tweepy as tw

#Keys
consumer_key = "..."
consumer_secret = "..."
access_token = "..."
access_token_secret = "..."

# Login
auth = tw.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tw.API(auth, wait_on_rate_limit=True)

#Get user's tweets
tweets = tw.Cursor(api.user_timeline,
                   id="reuters",
                   lang="en",
                   since="2019-11-01",
                   until="2019-11-30").items()

all_tweets = [tweet.text for tweet in tweets]

all_tweets[:100]

The "until" parameter does not seem to be working, because the tweets that my code pulls include the latest tweets.


Answer 1:


Someone has already answered this question. Please do have a look here:

tweepy get tweets between two dates




Answer 2:


The tweepy library only supports Twitter's older standard search API at this time, and the standard search only covers 7 days of history. In order to search as far back as November 2019, you would need to use either the premium full-archive search API, or the enterprise full-archive search. These APIs are both commercial, but the premium API has a free tier called "sandbox" that would also work. In Python, you could use the search-tweets library.

The timeline method mentioned in the other answer would also be an option, but it would depend on Tweets from November being within the scope of the timeline API, which supports up to 3200 Tweets back from today.
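If the November Tweets are still within reach of the timeline, one way to apply the date window is on the client side, since the problem in the question suggests the timeline call does not filter by date itself. The following is only a minimal sketch: the helper name `tweets_in_range` is hypothetical, and it assumes the timeline arrives newest first and that each tweet object exposes a `created_at` datetime (as tweepy status objects do).

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def tweets_in_range(tweets, start, end):
    """Yield tweets with start <= created_at < end.

    Assumes `tweets` is ordered newest first, so iteration can stop
    as soon as a tweet older than `start` is seen.
    """
    for tweet in tweets:
        created = tweet.created_at
        if created.tzinfo is None:
            # Assumption: naive timestamps are UTC.
            created = created.replace(tzinfo=timezone.utc)
        if created >= end:
            continue  # newer than the window, keep paging back
        if created < start:
            break  # older than the window, nothing further can match
        yield tweet


# Offline demo with stand-in objects instead of real API results.
demo = [
    SimpleNamespace(created_at=datetime(2019, 12, 2), text="december"),
    SimpleNamespace(created_at=datetime(2019, 11, 15), text="november"),
    SimpleNamespace(created_at=datetime(2019, 10, 31), text="october"),
]
start = datetime(2019, 11, 1, tzinfo=timezone.utc)
end = datetime(2019, 12, 1, tzinfo=timezone.utc)
print([t.text for t in tweets_in_range(demo, start, end)])
```

With tweepy, the `tweets` argument would be the cursor from the question without the `since`, `until` and `lang` arguments, for example `tw.Cursor(api.user_timeline, screen_name="reuters").items()`. For an account as active as @reuters the 3200-Tweet limit may be reached before November 2019, in which case the generator simply yields nothing.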




Answer 3:


Below are two simple ways to extract a specific user's tweets for a specific period.

Solution 1: using TwitterAPI. As andy_piper mentioned, you need premium or sandbox access, and a paid premium account is expensive. Unless you are extracting a huge corpus from Twitter, the free sandbox account is more than enough. You can enable it at https://developer.twitter.com/en/pricing/aaa-all, which gives you access to the archive with a limited number of requests.

Create a dev environment label linked to your Twitter account: go to the dev environments page in your Twitter developer account and create a label for the sandbox. Once the label is configured, the code below will extract the matching tweets (change maxResults as needed).

from TwitterAPI import TwitterAPI

# consumer_key, consumer_secret, access_token and access_token_secret
# are the same credentials as in the question.
product = 'fullarchive'
label = 'Dev'  # the dev environment label you created
api = TwitterAPI(consumer_key, consumer_secret, access_token, access_token_secret)
tweets = api.request('tweets/search/%s/:%s' % (product, label),
                     {'query': 'from:reuters',
                      'maxResults': '10',
                      'fromDate': '201911010000',  # YYYYMMDDHHMM
                      'toDate': '201911300000'})

for tweet in tweets:
    print(tweet['id'])
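The fromDate and toDate values in the request above are UTC timestamps in YYYYMMDDHHMM form. As far as I understand the premium search documentation, toDate is exclusive, so a toDate of 201911300000 would leave out Tweets posted on 30 November. Here is a small sketch for building a whole-month window; the helper names `premium_timestamp` and `month_window` are my own and not part of TwitterAPI.

```python
from datetime import datetime, timezone


def premium_timestamp(moment):
    """Format a datetime as YYYYMMDDHHMM in UTC."""
    if moment.tzinfo is None:
        # Assumption: naive datetimes are already in UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M")


def month_window(year, month):
    """Return (fromDate, toDate) strings covering one calendar month.

    The end is the first minute of the following month, which suits
    an exclusive upper bound.
    """
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    # month // 12 is 1 only for December, which rolls over the year.
    end = datetime(year + month // 12, month % 12 + 1, 1, tzinfo=timezone.utc)
    return premium_timestamp(start), premium_timestamp(end)


from_date, to_date = month_window(2019, 11)
print(from_date, to_date)  # 201911010000 201912010000
```

The two strings can then be passed as the 'fromDate' and 'toDate' values of the request dictionary.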

Solution 2: using GetOldTweets3. I would not prefer this approach, since I am not sure about its licence and I am a bit wary of how it sits with Twitter's privacy policy. It works like a charm, though, without even a Twitter developer account. Here is the code anyway.

import GetOldTweets3 as got

username = 'reuters'
count = 100
# Parentheses replace the trailing backslashes; the original backslash
# after setUntil() joined the next statement and caused a syntax error.
tweetCriteria = (got.manager.TweetCriteria()
                 .setUsername(username)
                 .setMaxTweets(count)
                 .setSince("2019-11-01")
                 .setUntil("2019-11-30"))
tweets = got.manager.TweetManager.getTweets(tweetCriteria)
for tweet in tweets:
    print(tweet.id, tweet.author_id, tweet.date)

References:
https://pypi.org/project/GetOldTweets3/
https://github.com/geduldig/TwitterAPI/blob/master/examples/premium_search.py



Source: https://stackoverflow.com/questions/61725154/how-do-i-pull-tweets-from-a-user-for-specific-dates-on-python
