Upload image available at public URL to S3 using boto

Backend · Unresolved · 9 answers · 1208 views
Asked by 广开言路, 2020-12-23 13:48

I'm working in a Python web environment and I can simply upload a file from the filesystem to S3 using boto's key.set_contents_from_filename(path/to/file). However, I'd like to upload an image that is already available at a public URL directly to S3, without first saving it to disk.

9 Answers
  • 2020-12-23 14:27

    S3 doesn't support uploading directly from a remote URL, as far as I can tell. You can use the class below to upload an image to S3: the upload method first downloads the image and holds it in memory until the upload completes. To connect to S3, install the AWS CLI with pip install awscli, then enter your credentials with aws configure:

    import boto3
    import urllib3
    import uuid
    from pathlib import Path
    from io import BytesIO
    from errors import custom_exceptions as cex
    
    BUCKET_NAME = "xxx.yyy.zzz"
    POSTERS_BASE_PATH = "assets/wallcontent"
    CLOUDFRONT_BASE_URL = "https://xxx.cloudfront.net/"
    
    
    class S3(object):
        def __init__(self):
            self.client = boto3.client('s3')
            self.bucket_name = BUCKET_NAME
            self.posters_base_path = POSTERS_BASE_PATH
    
        def __download_image(self, url):
            manager = urllib3.PoolManager()
            try:
                res = manager.request('GET', url)
            except Exception:
                print("Could not download the image from URL: ", url)
                raise cex.ImageDownloadFailed
            return BytesIO(res.data)  # any file-like object that implements read()
    
        def upload_image(self, url):
            try:
                image_file = self.__download_image(url)
            except cex.ImageDownloadFailed:
                raise cex.ImageUploadFailed
    
            extension = Path(url).suffix
            id = uuid.uuid1().hex + extension
            final_path = self.posters_base_path + "/" + id
            try:
                self.client.upload_fileobj(image_file,
                                           self.bucket_name,
                                           final_path
                                           )
            except Exception:
                print("Image Upload Error for URL: ", url)
                raise cex.ImageUploadFailed
    
            return CLOUDFRONT_BASE_URL + id
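    The key-generation step inside upload_image can be exercised on its own. Path(url).suffix works here because only the path tail of the URL matters (the example URL below is hypothetical):

    ```python
    import uuid
    from pathlib import Path

    url = "https://example.com/images/poster.jpg"  # hypothetical URL
    extension = Path(url).suffix                   # ".jpg"
    object_key = uuid.uuid1().hex + extension      # 32 hex chars + extension
    print(object_key)
    ```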
    
  • 2020-12-23 14:29

    For a 2017-era answer to this question that uses the official boto3 package (instead of the old boto package from the original question):

    Python 3.5

    If you're on a clean Python install, pip install both packages first:

    pip install boto3

    pip install requests

    import boto3
    import requests
    
    # Uses the creds in ~/.aws/credentials
    s3 = boto3.resource('s3')
    bucket_name_to_upload_image_to = 'photos'
    s3_image_filename = 'test_s3_image.png'
    internet_image_url = 'https://docs.python.org/3.7/_static/py.png'
    
    
    # Do this as a quick and easy check to make sure your S3 access is OK
    good_to_go = False
    for bucket in s3.buckets.all():
        if bucket.name == bucket_name_to_upload_image_to:
            print('Good to go. Found the bucket to upload the image into.')
            good_to_go = True
    
    if not good_to_go:
        print('Not seeing your s3 bucket, might want to double check permissions in IAM')
    
    # Given an Internet-accessible URL, download the image and upload it to S3,
    # without needing to persist the image to disk locally
    req_for_image = requests.get(internet_image_url, stream=True)
    file_object_from_req = req_for_image.raw
    req_data = file_object_from_req.read()
    
    # Do the actual upload to s3
    s3.Bucket(bucket_name_to_upload_image_to).put_object(Key=s3_image_filename, Body=req_data)
    
  • 2020-12-23 14:31

    I tried the following with boto3 and it works for me:

    import boto3
    import contextlib
    import requests
    from io import BytesIO
    
    s3 = boto3.resource('s3')
    s3Client = boto3.client('s3')
    for bucket in s3.buckets.all():
      print(bucket.name)
    
    
    url = "@resource url"
    with contextlib.closing(requests.get(url, stream=True, verify=False)) as response:
        # Set up file stream from response content.
        fp = BytesIO(response.content)
        # Upload data to S3
        s3Client.upload_fileobj(fp, 'aws-books', 'reviews_Electronics_5.json.gz')
    
  • 2020-12-23 14:36

    Using the boto3 upload_fileobj method, you can stream a file to an S3 bucket without saving it to disk. Here is my function (Python 2):

    import boto3
    import StringIO
    import contextlib
    import requests
    
    def upload(url):
        # Get the service client
        s3 = boto3.client('s3')
    
        # Remember to set stream=True.
        with contextlib.closing(requests.get(url, stream=True, verify=False)) as response:
            # Set up file stream from response content.
            fp = StringIO.StringIO(response.content)
            # Upload data to S3
            s3.upload_fileobj(fp, 'my-bucket', 'my-dir/' + url.split('/')[-1])
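    Note that the StringIO module above is Python 2 only; on Python 3 the binary response content belongs in io.BytesIO, which upload_fileobj accepts as a file-like object. A minimal sketch with stand-in bytes:

    ```python
    from io import BytesIO

    content = b"\x89PNG fake image bytes"  # stand-in for response.content
    fp = BytesIO(content)
    # upload_fileobj needs a readable binary file-like object:
    print(fp.read(4))  # b'\x89PNG'
    fp.seek(0)         # rewind before handing to upload_fileobj
    ```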
    
  • 2020-12-23 14:40

    Here is how I did it with requests, the key being to set stream=True when initially making the request and to upload to S3 using the upload_fileobj() method:

    import requests
    import boto3
    
    url = "https://upload.wikimedia.org/wikipedia/en/a/a9/Example.jpg"
    r = requests.get(url, stream=True)
    
    session = boto3.Session()
    s3 = session.resource('s3')
    
    bucket_name = 'your-bucket-name'
    key = 'your-key-name' # key is the name of file on your bucket
    
    bucket = s3.Bucket(bucket_name)
    bucket.upload_fileobj(r.raw, key)
    
  • 2020-12-23 14:43

    OK, per @garnaat, it doesn't sound like S3 currently allows uploads by URL. I managed to upload remote images to S3 by reading them into memory only. This works:

    # Legacy Python 2 / boto imports (settings is your config module
    # holding the AWS credentials, e.g. Django settings):
    import boto
    import StringIO
    import urllib2
    from boto.s3.key import Key
    
    def upload(url):
        try:
            conn = boto.connect_s3(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY)
            bucket_name = settings.AWS_STORAGE_BUCKET_NAME
            bucket = conn.get_bucket(bucket_name)
            k = Key(bucket)
            k.key = url.split('/')[::-1][0]    # In my situation, ids at the end are unique
            file_object = urllib2.urlopen(url)           # 'Like' a file object
            fp = StringIO.StringIO(file_object.read())   # Wrap object    
            k.set_contents_from_file(fp)
            return "Success"
        except Exception, e:
            return e
    

    Also thanks to How can I create a GzipFile instance from the “file-like object” that urllib.urlopen() returns?
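    The trick from that link boils down to wrapping the raw bytes in a file-like object, since gzip.GzipFile accepts any object with read() via its fileobj parameter. A self-contained Python 3 sketch (the payload here is compressed in-process rather than downloaded):

    ```python
    import gzip
    from io import BytesIO

    payload = gzip.compress(b"hello from s3")  # stand-in for urlopen(url).read()
    with gzip.GzipFile(fileobj=BytesIO(payload)) as gz:
        print(gz.read())  # b'hello from s3'
    ```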
