Loading Large Twitter JSON Data (7GB+) into Python

长发绾君心 2021-01-14 22:24

I've set up a public stream via AWS to collect tweets and now want to do some preliminary analysis. All my data is stored in an S3 bucket (in 5 MB files).

I downlo

2 answers
  •  无人及你
    2021-01-14 22:50

    I'm a very new user, but I may be able to offer a partial solution. I believe the formatting of your files is off: you can't load them as JSON unless they are actually valid JSON. You should be able to fix this by getting the tweets into a data frame (or several separate data frames) and then calling the "DataFrame.to_json" method. You will need pandas installed if it isn't already.

    pandas (10-minute introduction) - http://pandas.pydata.org/pandas-docs/stable/10min.html

    DataFrame.to_json - http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_json.html
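
The data-frame route suggested in this answer might look roughly like the sketch below. It assumes each downloaded file holds one tweet object per line, which the question does not confirm. The sample tweets and the names `raw`, `records`, and `as_json` are hypothetical and only there for illustration.

```python
import io
import json

import pandas as pd

# Hypothetical sample standing in for one downloaded file.
# Assumption: one JSON object (one tweet) per line; the real layout may differ.
raw = io.StringIO(
    '{"id": 1, "text": "hello", "lang": "en"}\n'
    '{"id": 2, "text": "hola", "lang": "es"}\n'
)

# Parse each non-empty line into a dict, one tweet at a time.
records = [json.loads(line) for line in raw if line.strip()]

# Build the data frame from the list of dicts.
df = pd.DataFrame(records)

# With no path given, to_json returns a string; orient="records"
# produces a single valid JSON array of objects.
as_json = df.to_json(orient="records")
```

For the real data, `raw` would be replaced by an open file handle. With roughly 7 GB of tweets it is probably safer to process the small files one at a time than to build one large frame in memory.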
