How to upload HTML file inputs to S3 without pre-signed URLs

Say I have a form where users can submit image files, and I want to post those files to my AWS S3 bucket. I don't want to upload them directly using pre-signed URLs because I'm compressing/modifying the images first.

Is it okay to first save them to my filesystem and then upload them from there with s3_client.upload_file?

The thing is, I'm hosting my website on Heroku, which has an ephemeral filesystem, so at some point the images will disappear from my static folders. However, I was treating this as an advantage here, since I only need each image on my filesystem until I finish uploading it to my S3 bucket. Would this be a good approach?

Code

If I call s3_client.upload_file before the image exists on my filesystem, the client throws an error.

import os

from flask import current_app
from PIL import Image

with Image.open(image) as i:
    # If the image is for background, create multiple sizes.
    if is_background:
        img_1920_1920 = i.resize((1920, 1920), Image.LANCZOS)
        img_800_533 = i.resize((800, 533), Image.LANCZOS)

        # Add images to s3 bucket.
        # This is the call that fails: upload_file expects the path of a file
        # on disk, and the resized images have not been saved anywhere yet.
        current_app.s3_client.upload_file(safe_filename, current_app.config['AWS_BUCKET_NAME'], os.path.join(image_path, '1920_1920/', safe_filename))
        current_app.s3_client.upload_file(safe_filename, current_app.config['AWS_BUCKET_NAME'], os.path.join(image_path, '800_533/', safe_filename))

Potential solution

I tried to do it this way, but I don't know if it is good practice given the circumstances (website hosted on Heroku).

  • Compress/modify images and save them to the filesystem
  • Post them to AWS S3 as soon as they finish compressing

with Image.open(image) as i:
    # If the image is for background, create multiple sizes.
    if is_background:
        img_1920_1920 = i.resize((1920, 1920), Image.LANCZOS)
        img_800_533 = i.resize((800, 533), Image.LANCZOS)

        # The same relative path is used for the local file and the S3 key.
        path_1920_1920 = os.path.join(image_path, '1920_1920/', safe_filename)
        path_800_533 = os.path.join(image_path, '800_533/', safe_filename)

        # Save images in filesystem
        img_1920_1920.save(path_1920_1920, optimize=True, quality=85)
        img_800_533.save(path_800_533, optimize=True, quality=85)

        # Add images to s3 bucket
        current_app.s3_client.upload_file(path_1920_1920, current_app.config['AWS_BUCKET_NAME'], path_1920_1920)
        current_app.s3_client.upload_file(path_800_533, current_app.config['AWS_BUCKET_NAME'], path_800_533)

Submitted July 28th 2020 by Admin

Answers
Your solution seems perfectly reasonable to me, and I do similar things on some of the systems I have worked on: uploads first go to the website's file system (EC2 in my case), get some post-upload processing, and the resulting files are then moved to S3.
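As a rough sketch of that save-then-upload flow, the helper below writes each resized image into a temporary directory, uploads it with upload_file, and lets the directory be removed afterwards, so cleanup does not depend on the host's ephemeral filesystem. The function name, the variants mapping and the key layout are assumptions made for illustration; s3_client is expected to be a boto3 S3 client and each variant an object with a PIL-style save method.

```python
import os
import tempfile


def upload_via_temp_dir(s3_client, bucket, key_prefix, filename, variants):
    """Save each variant to a temp file, upload it, and return the S3 keys.

    variants maps a size label such as '1920_1920' to an object exposing a
    PIL-style save(path, **params) method (hypothetical layout for this sketch).
    """
    uploaded_keys = []
    # The directory and everything in it is deleted when the block exits.
    with tempfile.TemporaryDirectory() as tmp_dir:
        for label, img in variants.items():
            local_path = os.path.join(tmp_dir, f"{label}_{filename}")
            img.save(local_path, optimize=True, quality=85)

            # S3 keys always use forward slashes, whatever the local OS is.
            key = f"{key_prefix}/{label}/{filename}"
            s3_client.upload_file(local_path, bucket, key)
            uploaded_keys.append(key)
    return uploaded_keys
```

With the names from the question, a call might look like upload_via_temp_dir(current_app.s3_client, current_app.config['AWS_BUCKET_NAME'], image_path, safe_filename, {'1920_1920': img_1920_1920, '800_533': img_800_533}).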

If you wanted to completely bypass using the local file system you would often do:

  1. upload directly to S3 using pre-signed URLs
  2. S3 event notifications to notify AWS Lambda that a new file has arrived
  3. custom Lambda code to perform the post-upload processing

I have used the above approach as well. It adds some complexity, which is a disadvantage, but it generally scales better if you have a lot of uploads going on at the same time.
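To give an idea of what steps 2 and 3 of that list could look like, here is a minimal sketch of a Lambda-style handler driven by an S3 event notification. The make_handler factory, the process callback and the "processed/" output prefix are assumptions for illustration, not part of any AWS API; the event fields (Records, s3.bucket.name, s3.object.key) and the get_object/put_object calls are the standard ones. Skipping keys under the output prefix is one way to keep the function from re-triggering itself when it writes back to the same bucket.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    return [
        # Object keys arrive URL-encoded in the event, so decode them first.
        (r["s3"]["bucket"]["name"], unquote_plus(r["s3"]["object"]["key"]))
        for r in event.get("Records", [])
    ]


def make_handler(s3_client, process, output_prefix="processed/"):
    """Build a handler(event, context) that post-processes new uploads.

    process is a hypothetical callback taking and returning bytes,
    for example a function that resizes an image.
    """
    def handler(event, context):
        written = []
        for bucket, key in extract_s3_objects(event):
            if key.startswith(output_prefix):
                continue  # ignore our own output to avoid an endless loop
            body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
            out_key = output_prefix + key
            s3_client.put_object(Bucket=bucket, Key=out_key, Body=process(body))
            written.append(out_key)
        return written

    return handler
```

In a real deployment the handler would be created once at module level, for example handler = make_handler(boto3.client("s3"), resize_image), where resize_image is whatever processing function you supply.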

That said, start simple with your solution and scale it up only if/when you need it.
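If the disk write itself ever becomes a nuisance, a small variation (not something you need, just an option) is to keep the resized image in memory and hand boto3 a file-like object through upload_fileobj instead of a path. The sketch below assumes the images are small enough to hold in memory; the function name and the default JPEG format are assumptions for illustration.

```python
import io


def upload_image_in_memory(s3_client, bucket, key, img, image_format="JPEG"):
    """Encode img into a BytesIO buffer and upload it; return the byte count.

    img is expected to expose a PIL-style save(fileobj, format=..., **params).
    """
    buffer = io.BytesIO()
    # When saving to a file object, the format must be given explicitly
    # because there is no filename extension to infer it from.
    img.save(buffer, format=image_format, optimize=True, quality=85)

    # Record the size now, since the buffer may be closed by the upload.
    size = buffer.tell()
    buffer.seek(0)  # rewind so the upload starts from the first byte
    s3_client.upload_fileobj(buffer, bucket, key)
    return size
```

This trades disk usage for memory usage, so it suits modest image sizes better than very large uploads.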

Admin | 1 year ago