Is it possible to get the contents of an S3 file without downloading it using boto3?

后端未结

关注

 2  664

I am working on a process to dump files from a Redshift database, and would prefer not to have to locally download the files to process the data. I saw that


                      
              相关标签:


      
      
        
          2条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  醉梦人生        
                
              
                            
                2021-02-08 01:54
              
            
            
                                                                       
If you have a mybucket S3 bucket, which contains a beer key, here is how to download and fetch the value without storing it in a local file:

import boto3
s3 = boto3.resource('s3')
print s3.Object('mybucket', 'beer').get()['Body'].read()

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  遥遥无期        
                
              
                            
                2021-02-08 02:18
              
            
            
                                                                       
This may or may not be relevant to what you want to do, but for my situation one thing that worked well was using tempfile:

import tempfile
import boto3
import PyPDF2

bucket_name = 'my_bucket'
s3 = boto3.resource('s3')
temp = tempfile.NamedTemporaryFile()
s3.Bucket(bucket_name).download_file(key_name, temp.name)
pdfFileObj = open(temp.name,'rb')
pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
[... do what you will with your file ...]
temp.close()

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复