Read a CSV file from AWS S3 using boto and pandas

一整个雨季

I have already read through the answers available here and here, and they do not help.

I am trying to read a CSV object from an S3 bucket and have

3 Answers

萌比男神i

2021-02-04 07:57

Here is what I did to successfully read a DataFrame from a CSV file on S3.

import pandas as pd
import boto3

bucket = "yourbucket"
file_name = "your_file.csv"

s3 = boto3.client('s3')
# 's3' is the service name; this creates an S3 client using the default credentials and configuration

obj = s3.get_object(Bucket=bucket, Key=file_name)
# fetch the object stored under the given key (file name) in the bucket

initial_df = pd.read_csv(obj['Body'])  # 'Body' is the response key that holds the file content as a stream
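
The same steps can be wrapped in a small helper. This is only a sketch under stated assumptions: the function name read_csv_from_s3 and its client parameter are hypothetical and not part of boto3 or pandas. Accepting the client as an argument lets you try the function without AWS credentials. The object body is read fully into memory, so this suits files that fit in RAM.

```python
import io

import pandas as pd


def read_csv_from_s3(bucket, key, client=None, **read_csv_kwargs):
    """Download the object at bucket/key and parse it as CSV.

    Hypothetical helper: `client` is any object that exposes boto3's
    get_object(Bucket=..., Key=...) interface.
    """
    if client is None:
        # Imported lazily so the helper can be exercised with a stand-in client.
        import boto3
        client = boto3.client("s3")

    response = client.get_object(Bucket=bucket, Key=key)
    # response["Body"] is a stream; read it fully and wrap the bytes in a
    # seekable in-memory buffer before handing it to pandas.
    data = response["Body"].read()
    return pd.read_csv(io.BytesIO(data), **read_csv_kwargs)
```

With the names from the answer above, the call would be initial_df = read_csv_from_s3(bucket, file_name). Extra keyword arguments such as sep or usecols are passed straight through to pd.read_csv.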

粉色の甜心

2021-02-04 08:09

You could also try pandas read_sql with pyathena, which queries the data through Amazon Athena:

from pyathena import connect
import pandas as pd

conn = connect(s3_staging_dir='s3://bucket/folder', region_name='region')
df = pd.read_sql('select * from database.table', conn)

刺人心

2021-02-04 08:15

This worked for me.

import pandas as pd
import boto3
import io

s3_file_key = 'data/test.csv'
bucket = 'data-bucket'

s3 = boto3.client('s3')
obj = s3.get_object(Bucket=bucket, Key=s3_file_key)

initial_df = pd.read_csv(io.BytesIO(obj['Body'].read()))
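
One reason to read the body into bytes first: the body is a stream that is consumed as it is read, so a second read() returns nothing. Keeping the raw bytes lets you parse the same content more than once, for example to peek at a few rows before loading everything. A minimal sketch, in which an io.BytesIO object stands in for obj['Body'] and the helper name parse_csv_bytes is hypothetical:

```python
import io

import pandas as pd


def parse_csv_bytes(raw, **read_csv_kwargs):
    """Parse CSV content held in a bytes object (hypothetical helper)."""
    # A fresh BytesIO per call gives pandas a seekable buffer starting at 0.
    return pd.read_csv(io.BytesIO(raw), **read_csv_kwargs)


# Stand-in for obj['Body']; like a real S3 body stream, it is consumed by read().
body = io.BytesIO(b"id,name\n1,a\n2,b\n3,c\n")

raw = body.read()        # the first read returns the whole content
leftover = body.read()   # the stream is now exhausted, so this is empty

preview = parse_csv_bytes(raw, nrows=2)   # peek at the first rows only
full_df = parse_csv_bytes(raw)            # parse everything from the same bytes
```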
