Parse User name for extracting user location Twitter

Posted by 时间秒杀一切 on 2020-01-11 14:51:42

Question


I am trying to scrape the profile location for each user name in a list of Twitter users.

Input: the user list has more than 50K user names.

AkkiPritam,6.77E+17,12/15/2015,#chennaifloods
AkkiPritam,6.77E+17,12/15/2015,#bhoomikatrust
AkkiPritam,6.77E+17,12/15/2015,#akshaykumar
gischethans,6.77E+17,12/15/2015,#chennaifloods
mid_day,6.77E+17,12/15/2015,#bollywood
mid_day,6.77E+17,12/15/2015,#chennaifloods
Nanthivarman16,6.77E+17,12/15/2015,#admkfails
Nanthivarman16,6.77E+17,12/15/2015,#jayafails
Nanthivarman16,6.77E+17,12/15/2015,#stickergovt
Nanthivarman16,6.77E+17,12/15/2015,#chennaifloods
AdilaMatra,6.77E+17,12/15/2015,#chennaifloods
AdilaMatra,6.77E+17,12/15/2015,#climatechange
AdilaMatra,6.77E+17,12/15/2015,#delhichokes
AdilaMatra,6.77E+17,12/15/2015,#smog
HDFCERGOGIC,6.77E+17,12/15/2015,#chennaifloods
HDFCERGOGIC,6.77E+17,12/15/2015,#tnfloods
ImSoorej,6.77E+17,12/15/2015,#chennaifloods
ImSoorej,6.77E+17,12/15/2015,#chennaimicr

Code: I want to find each user's geolocation, ideally as geo coordinates.

from __future__ import print_function
import tweepy
from tweepy import OAuthHandler
from tweepy import Stream
from tweepy.streaming import StreamListener
import pandas as pd
import csv

consumer_key = 'xyz'
consumer_secret = 'xyz'
access_token = 'xyz'
access_token_secret = 'xyz'

data = pd.read_csv('user_keyword.csv')
df = ['user_name', 'user_id', 'date', 'keyword']

def get_user_details(username):
        userobj = api.get_user(username)
        return userobj

if __name__ == '__main__':
    #authenticating the app (https://apps.twitter.com/)
    auth = tweepy.auth.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth)

    username = df['user_name']
    userOBJ = get_user_details(username)
    print(userOBJ.location)

Error: I am having trouble parsing the user names into the program.

Traceback (most recent call last):
  File "user_profile_location.py", line 38, in <module>
    username = df['user_name']
TypeError: list indices must be integers, not str

Answer 1:


You are using 'data' to hold your DataFrame and 'df' for what I think should be the DataFrame's column names:

data = pd.read_csv('user_keyword.csv')
df = ['user_name', 'user_id', 'date', 'keyword']

I assume that the user_keyword.csv file has no header row. Try adding:

data.columns = df

It will change the column names to the values stored in df. Then later instead of:

username = df['user_name']

Try:

username = data['user_name']
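One caveat if the file really has no header row: by default pd.read_csv treats the first line as the header, so that line would be dropped from the data when the columns are renamed afterwards. A minimal sketch of one way around this, passing header=None and names= at read time. The in-memory sample below is an assumption that stands in for user_keyword.csv and copies two rows from the question:

```python
import io
import pandas as pd

# Two rows in the same shape as the question's input (no header row).
# io.StringIO stands in for the real 'user_keyword.csv' file.
sample = io.StringIO(
    "AkkiPritam,6.77E+17,12/15/2015,#chennaifloods\n"
    "gischethans,6.77E+17,12/15/2015,#chennaifloods\n"
)

columns = ['user_name', 'user_id', 'date', 'keyword']

# header=None keeps the first line as data; names= assigns the column labels.
data = pd.read_csv(sample, header=None, names=columns)

username = data['user_name']   # a pandas Series with one entry per row
print(len(data))
print(username.tolist())
```

With the real file, the same call would be pd.read_csv('user_keyword.csv', header=None, names=columns).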

Keep in mind that username is now a whole column (a pandas Series), so get_user_details(username) can no longer assume it receives a single string; it has to be called once per user name.
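A rough sketch of that per-name loop, under some assumptions: collect_locations, fake_lookup and the placeholder locations are hypothetical names and made-up values used only so the example runs offline. The lookup is passed in as a function; with the question's code it could be get_user_details, and in newer Tweepy releases (4.x, as far as I know) something like lambda name: api.get_user(screen_name=name). Duplicate user names are removed first, since the input repeats each user once per hashtag.

```python
from collections import namedtuple
import pandas as pd

def collect_locations(usernames, lookup):
    """Call lookup(name) once per distinct user name; return {name: location}."""
    locations = {}
    for name in pd.Series(usernames).dropna().unique():
        try:
            locations[name] = lookup(name).location
        except Exception:
            # The lookup failed (for example, the account no longer exists).
            locations[name] = None
    return locations

# --- Offline stand-in for the Twitter API (hypothetical, made-up values) ---
FakeUser = namedtuple('FakeUser', 'location')
_profiles = {'AkkiPritam': 'placeholder city A', 'mid_day': 'placeholder city B'}
calls = []

def fake_lookup(name):
    calls.append(name)
    return FakeUser(_profiles[name])   # raises KeyError for unknown names

names = pd.Series(['AkkiPritam', 'AkkiPritam', 'mid_day', 'ghost_user'])
locations = collect_locations(names, fake_lookup)
print(locations)
```

Note that the profile's location attribute is free text typed by the user, not coordinates, so turning it into geo coordinates would need a separate geocoding step. With more than 50K names, API rate limits will probably matter as well; tweepy.API accepts wait_on_rate_limit=True for that.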



Source: https://stackoverflow.com/questions/38524071/parse-user-name-for-extracting-user-location-twitter
