How to create a pandas dataframe using Tweepy?

梦谈多话 2021-01-01 01:19

In Python 3 I wrote a program to extract posts and likes from Twitter:

import tweepy
import pandas as pd

consumer_key = ''
consumer_secret = ''
access_token          


        
2 Answers
  • 2021-01-01 02:01

    Here's an easy way:

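    import os
    import tweepy
    import pandas as pd
    
    # Read the credentials from environment variables (set in ~/.bashrc,
    # ~/.zshrc, etc.) so they never appear in your code.
    consumer_key = os.environ.get('c_key')
    # The original elided the next three lookups; these variable names
    # are placeholders, so use whatever names you exported.
    consumer_secret = os.environ.get('c_secret')
    access_token = os.environ.get('a_token')
    access_token_secret = os.environ.get('a_token_secret')
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth)
    
    # Tweepy 4.0 renamed this method to api.search_tweets
    results = api.search(q='cheese', count=100)
    
    # Each result keeps the raw API payload as a dict in ._json
    json_data = [r._json for r in results]
    
    # pd.json_normalize (pandas >= 1.0) replaces the deprecated
    # pd.io.json.json_normalize
    df = pd.json_normalize(json_data)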
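
To see what the normalization step does without calling the API, here is a minimal offline sketch. The two dicts are made-up, heavily trimmed stand-ins for what `r._json` holds (real payloads carry many more fields), and it assumes pandas 1.0 or later for `pd.json_normalize`.

```python
import pandas as pd

# Hypothetical, trimmed stand-ins for the dicts found in `r._json`.
json_data = [
    {"id": 1, "text": "I like cheese", "favorite_count": 3,
     "user": {"screen_name": "alice", "followers_count": 10}},
    {"id": 2, "text": "Cheddar or brie?", "favorite_count": 7,
     "user": {"screen_name": "bob", "followers_count": 25}},
]

# Nested keys become dotted column names such as "user.screen_name".
df = pd.json_normalize(json_data)

# Keep only the columns of interest, e.g. the post text and its likes.
likes = df[["text", "favorite_count", "user.screen_name"]]
print(likes)
```

Because the nested `user` dict is flattened into its own columns, you can select posts and like counts with ordinary column indexing.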
    
  • 2021-01-01 02:13

    Importing the required libraries that we are going to use:

    import pandas as pd
    import tweepy
    import json
    

    Providing our keys to connect to the Twitter API:

    consumer_key = '....'
    consumer_secret = '....'
    access_token = '....'
    access_secret = '....'
    

    The next step is creating an OAuthHandler instance...

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    

    ...and then gain access to the Twitter API.

    auth.set_access_token(access_token, access_secret)
    

    Finally, we create an API object that we will use to fetch the tweets:

    # wait_on_rate_limit_notify was removed in Tweepy 4.0; drop it on newer versions
    api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
    

    Fetching the last 20 tweets from the FC Barcelona Twitter account:

    last_20_tweets_of_FC_Barcelona = api.user_timeline(screen_name='FCBarcelona')
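
If 20 tweets are not enough, one possible approach is to page backwards through the timeline with the `count` and `max_id` parameters of `user_timeline`. The sketch below is only an illustration: `fetch_timeline` and `FakeAPI` are hypothetical names, and `FakeAPI` is a stand-in for `tweepy.API` so the example runs without credentials.

```python
from types import SimpleNamespace


def fetch_timeline(api, screen_name, max_tweets=500, page_size=200):
    """Collect up to max_tweets statuses by paging backwards with max_id."""
    statuses = []
    max_id = None
    while len(statuses) < max_tweets:
        kwargs = {"screen_name": screen_name, "count": page_size}
        if max_id is not None:
            kwargs["max_id"] = max_id
        page = api.user_timeline(**kwargs)
        if not page:
            break  # no older tweets left
        statuses.extend(page)
        # Ask the next request only for tweets older than the oldest seen.
        max_id = page[-1].id - 1
    return statuses[:max_tweets]


class FakeAPI:
    """Hypothetical stand-in for tweepy.API so the sketch runs offline."""

    def __init__(self, n):
        # Newest first, like a real timeline.
        self._tweets = [SimpleNamespace(id=i) for i in range(n, 0, -1)]

    def user_timeline(self, screen_name, count=20, max_id=None):
        older = [t for t in self._tweets if max_id is None or t.id <= max_id]
        return older[:count]


tweets = fetch_timeline(FakeAPI(450), "FCBarcelona", max_tweets=300)
print(len(tweets))
```

With a real `tweepy.API` object in place of `FakeAPI`, the returned statuses can be processed exactly like the 20 tweets in the rest of this answer.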

    Next, we extract the JSON part of each Tweepy Status object we downloaded and collect them all in a list...

    my_list_of_dicts = []
    for each_json_tweet in last_20_tweets_of_FC_Barcelona:
        my_list_of_dicts.append(each_json_tweet._json)
    

    ...and then we write this list into a txt file:

    with open('tweet_json_Barca.txt', 'w') as file:
        file.write(json.dumps(my_list_of_dicts, indent=4))
    

    Now we are going to create a DataFrame from the tweet_json_Barca.txt file:

    my_demo_list = []
    with open('tweet_json_Barca.txt', encoding='utf-8') as json_file:
        all_data = json.load(json_file)
        for each_dictionary in all_data:
            # Keep only the fields we need from each tweet
            tweet_id = each_dictionary['id']
            text = each_dictionary['text']
            favorite_count = each_dictionary['favorite_count']
            retweet_count = each_dictionary['retweet_count']
            created_at = each_dictionary['created_at']
            my_demo_list.append({'tweet_id': str(tweet_id),
                                 'text': str(text),
                                 'favorite_count': int(favorite_count),
                                 'retweet_count': int(retweet_count),
                                 'created_at': created_at,
                                })
    # Build the DataFrame once, after the loop has collected every tweet
    tweet_json = pd.DataFrame(my_demo_list, columns = 
                              ['tweet_id', 'text', 
                               'favorite_count', 'retweet_count', 
                               'created_at'])
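
If you do not need the intermediate file, the same table can be built straight from the list of dicts. This is a sketch under assumptions: the two sample dicts are invented and only mimic the shape of the entries in `my_list_of_dicts`, and the format string assumes `created_at` looks like `Fri Jan 01 01:19:00 +0000 2021`.

```python
import pandas as pd

# Hypothetical sample in the shape of the dicts stored in my_list_of_dicts.
my_list_of_dicts = [
    {"id": 101, "text": "Matchday!", "favorite_count": 120,
     "retweet_count": 30, "created_at": "Fri Jan 01 01:19:00 +0000 2021",
     "lang": "en"},
    {"id": 102, "text": "Full time.", "favorite_count": 80,
     "retweet_count": 12, "created_at": "Fri Jan 01 03:05:00 +0000 2021",
     "lang": "en"},
]

# Passing `columns` keeps only these keys and drops the rest (e.g. "lang").
columns = ["id", "text", "favorite_count", "retweet_count", "created_at"]
tweet_df = pd.DataFrame(my_list_of_dicts, columns=columns)

# Match the column name and type used above.
tweet_df = tweet_df.rename(columns={"id": "tweet_id"})
tweet_df["tweet_id"] = tweet_df["tweet_id"].astype(str)

# Turn the timestamp strings into timezone-aware datetimes.
tweet_df["created_at"] = pd.to_datetime(
    tweet_df["created_at"], format="%a %b %d %H:%M:%S %z %Y"
)
print(tweet_df.dtypes)
```

Parsing `created_at` up front makes it possible to sort or resample by time later, which a plain string column does not allow.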
    