Is it possible to get the contents of an S3 file without downloading it using boto3?

心在旅途 2021-02-08 01:21

I am working on a process to dump files from a Redshift database, and would prefer not to have to locally download the files to process the data. I saw that

2 Answers
  •  遥遥无期
    2021-02-08 02:18

    This may or may not be relevant to what you want to do, but for my situation one thing that worked well was using tempfile:

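    import tempfile
    import boto3
    import PyPDF2

    bucket_name = 'my_bucket'
    key_name = 'my_file.pdf'  # placeholder: S3 object key (left undefined in the original snippet)
    s3 = boto3.resource('s3')

    # The temporary file is removed automatically when the with-block exits.
    with tempfile.NamedTemporaryFile() as temp:
        s3.Bucket(bucket_name).download_file(key_name, temp.name)
        with open(temp.name, 'rb') as pdfFileObj:
            # PdfFileReader is the class name in older PyPDF2 releases; newer ones call it PdfReader.
            pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
            # ... do what you will with your file ...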
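
If the goal is to avoid touching the local disk at all, one option is to read the object body straight into memory. The following is only a minimal sketch under a few assumptions: `s3_client` is expected to be a boto3 S3 client (for example `boto3.client('s3')`), the helper names `read_s3_object` and `iter_s3_lines` are made up for illustration, and the object is assumed to be UTF-8 text when iterating line by line.

```python
def read_s3_object(s3_client, bucket, key):
    """Return the whole object as bytes, held in memory (no local file)."""
    # get_object returns a dict; "Body" is a streaming, file-like object.
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read()


def iter_s3_lines(s3_client, bucket, key, encoding="utf-8"):
    """Yield the object one decoded line at a time, without loading it all at once."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # iter_lines() yields raw bytes lines with the line endings stripped.
    for raw_line in response["Body"].iter_lines():
        yield raw_line.decode(encoding)
```

With a real client this might be called as `read_s3_object(boto3.client('s3'), 'my_bucket', key_name)`. The data is still transferred over the network, but nothing is written to disk; for large dumps the line-by-line variant is probably the safer choice, since `read()` holds the entire object in memory.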
    
