Question
I am new to the Twitter API and have spent a tremendous amount of time trying to figure this out.
I would like to extract a large number (100k to 1M) of tweets for a given search term, starting from the most recent tweets. I tried working with tweepy and was able to set up a stream, but I need data from the past as well.
I also tried the following code, but it only gives me 100 tweets at a time, and I don't understand how to use since_id and max_id to page through past tweets. Also, does anyone know how to extract hashtags from a post? Currently I am splitting the posts into words and finding the words that contain "#", but api.search has an attribute 'hash' and I am not sure how to call it.
results = api.search(q=movies[0],count=100,lang='en')
Any guidance would be appreciated.
Answer 1:
You can collect the tweets in a results list like this:
results = []
# Get the first 1000 items matching the search query and store them
# ('%23' is the URL-encoded form of '#', so this searches for #Trump)
for tweet in tweepy.Cursor(api.search, q='%23Trump').items(1000):
    results.append(tweet)
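Once the tweets are collected, the hashtags the question asks about do not need to be parsed out of the text. In the v1.1 API each tweet carries an `entities` dictionary whose `hashtags` list holds one entry per hashtag, and tweepy exposes it as `tweet.entities`. Here is a minimal sketch of reading it; the helper name `extract_hashtags` and the `sample` object are made up for illustration, with `sample` standing in for a real tweepy status so the snippet runs without credentials.

```python
from types import SimpleNamespace


def extract_hashtags(tweet):
    """Return the hashtag texts (without the leading '#') for one tweet."""
    # Fall back to an empty dict if the object has no entities at all.
    entities = getattr(tweet, "entities", None) or {}
    return [tag["text"] for tag in entities.get("hashtags", [])]


# Hypothetical stand-in for a tweepy status object, shaped like the
# 'entities' field the v1.1 API returns.
sample = SimpleNamespace(
    text="Loved #Inception and #Interstellar",
    entities={
        "hashtags": [
            {"text": "Inception", "indices": [6, 16]},
            {"text": "Interstellar", "indices": [21, 34]},
        ]
    },
)

print(extract_hashtags(sample))
```

With the list built in this answer, something like `[extract_hashtags(t) for t in results]` should give the hashtags per tweet.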
Answer 2:
You will want to use a Tweepy Cursor. To create a Cursor, pass it the API method and any parameters:
cursor = tweepy.Cursor(api.search, q=movies[0], count=100, lang='en')
Then iterate over the results returned by the Cursor's items method. You can pass in an optional limit on the number of results:
for item in cursor.items(limit=20):  # the limit can be omitted
    # do something with the item
    pass
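If you would rather page by hand, as the question asks, the usual pattern with max_id is to request a page, find the lowest tweet id in it, and ask for the next page with max_id set to that id minus one (max_id is inclusive). The sketch below assumes a `search` callable that accepts `q`, `count` and `max_id` the way `api.search` does (the method was renamed `search_tweets` in Tweepy 4). `fetch_older_tweets` and `fake_search` are hypothetical names, and `fake_search` only simulates an index of 250 tweets so the loop can be run offline.

```python
from types import SimpleNamespace


def fetch_older_tweets(search, query, total, page_size=100):
    """Collect up to `total` tweets by walking backwards with max_id."""
    collected = []
    max_id = None
    while len(collected) < total:
        kwargs = {"q": query, "count": page_size}
        if max_id is not None:
            kwargs["max_id"] = max_id
        page = search(**kwargs)
        if not page:
            break  # nothing older is available
        collected.extend(page)
        # max_id is inclusive, so step just below the oldest tweet seen.
        max_id = min(tweet.id for tweet in page) - 1
    return collected[:total]


def fake_search(q, count, max_id=None):
    """Hypothetical stand-in for api.search: ids 250 down to 1, newest first."""
    top = 250 if max_id is None else max_id
    return [SimpleNamespace(id=i) for i in range(top, max(top - count, 0), -1)]


tweets = fetch_older_tweets(fake_search, "movies", total=1000)
print(len(tweets), tweets[0].id, tweets[-1].id)
```

With a real client the call would look like `fetch_older_tweets(api.search, movies[0], 1000)`. Rate limits still apply, and the standard search endpoint only reaches back about a week, so this alone will not reach very old tweets.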
Answer 3:
The total archive is limited to 3,200 tweets, and there is also a daily limit of 1,500.
Source: https://stackoverflow.com/questions/20751894/twitter-api-searching-tweets-for-hashtags