Disable CloudFront cache if file is not found

Question

I created a CloudFront distribution in front of an S3 bucket, with a RoutingRule that redirects to a Lambda function if the requested file is not found. I'm using this to resize images.

Desired flow:

1. Request the file from CloudFront.
2. If the file is not found in CloudFront, check S3.
3. If the file is not found in S3, redirect to the Lambda function.
4. The Lambda function finds the original file, resizes it, and redirects back to the CloudFront URL.

Redirect rule set on the S3 website:

<RoutingRules>
  <RoutingRule>
    <Condition>
      <KeyPrefixEquals/>
      <HttpErrorCodeReturnedEquals>404</HttpErrorCodeReturnedEquals>
    </Condition>
    <Redirect>
      <Protocol>https</Protocol>
      <HostName>mylambda.execute-api.us-east-1.amazonaws.com</HostName>
      <ReplaceKeyPrefixWith>/?key=</ReplaceKeyPrefixWith>
      <HttpRedirectCode>307</HttpRedirectCode>
    </Redirect>
  </RoutingRule>
</RoutingRules>

I'm having a problem with step 4. When the Lambda function redirects back to the original URL, CloudFront seems to have cached the 404, so the routing rule from S3 redirects to the Lambda function again, causing a loop.

I confirmed that the Lambda function generated the file (if I invalidate the file on CloudFront, I successfully see it served from S3).

I tried adding a 0 TTL to the 404 error page, but it didn't help.

The redirect rule returns a 307 status code (Temporary Redirect), but I don't know how to set a 0 TTL on it. I couldn't find the option on the CloudFront custom error response page.

According to this article, the 307 is cached, so I need to set a rule for it... somewhere.

This is a follow-up question to "RoutingRules on AWS S3 Static website hosting".

I appreciate your help.

Update:

1. Removed the RoutingRule on S3.
2. Added a new origin to the CloudFront distribution (API Gateway).

The Lambda function now returns:

    return {
        statusCode: "200",
        body: "image converted",
    };

Checking the CloudWatch logs, I don't see the Lambda function being invoked, and when I go to https://myCloudfront.cloudfront.net/photos/resized/test.jpg I only see a plain 404.

I also added a custom error page with a 0 TTL for 404.

The good news is that if I call the API Gateway passing key=/photos/resized/test.jpg and then go to https://my.cloudfront.net/photos/resized/test.jpg, it works: the image is read correctly.

I think the problem is that the failover is not triggering the API Gateway call.

Answer 1:

You could, of course, use a Lambda@Edge Origin Response trigger, to modify the response and set the header you want. This would, in a sense, be the "most correct" and therefore "most desirable" solution, but only in a theoretical sense, because it introduces unnecessary cost and complexity.
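For completeness, here is a minimal sketch of what such an Origin Response trigger might look like. It assumes the standard Lambda@Edge event shape (event.Records[0].cf.response, with status as a string and headers as lowercase keys mapping to arrays). The function name disableRedirectCaching and the choice of "no-store" are illustrative, not something from the question.

```javascript
// Redirect statuses we do not want CloudFront (or the browser) to cache.
const REDIRECT_STATUSES = new Set(["301", "302", "307"]);

// Mutates and returns a Lambda@Edge response object. If the origin answered
// with a redirect, a Cache-Control: no-store header is set on it.
function disableRedirectCaching(response) {
    if (REDIRECT_STATUSES.has(response.status)) {
        response.headers["cache-control"] = [
            { key: "Cache-Control", value: "no-store" },
        ];
    }
    return response;
}

// Origin Response handler. In a deployed function this would be exported
// as exports.handler.
const handler = async (event) =>
    disableRedirectCaching(event.Records[0].cf.response);
```

Every origin fetch would invoke this function, which is the extra cost and complexity mentioned above.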

Default TTL is the value that is used by CloudFront, internally, when no Cache-Control response header is found... so, you could set that to 0 and include correct Cache-Control headers when creating your S3 objects, so that the Default TTL wouldn't be used except for the redirects. What I don't like about this is that there's no header to insist that the browser also not cache the redirect.
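As a rough sketch of the "correct Cache-Control headers when creating your S3 objects" part, the resize function could build its PutObject parameters like this. Bucket, Key, Body, ContentType and CacheControl are real PutObject parameters; the helper name, the bucket name, the JPEG content type, and the one-day max-age are assumptions for illustration.

```javascript
// Builds S3 PutObject parameters for a resized image, with an explicit
// Cache-Control value so that CloudFront's Default TTL never applies to it.
function buildPutObjectParams(bucket, key, body) {
    return {
        Bucket: bucket,
        // S3 keys have no leading slash, unlike the request path.
        Key: key.replace(/^\//, ""),
        Body: body,
        ContentType: "image/jpeg",            // assumed image type
        CacheControl: "public, max-age=86400" // assumed lifetime: one day
    };
}

const params = buildPutObjectParams(
    "my-image-bucket",                        // hypothetical bucket name
    "/photos/resized/test.jpg",
    Buffer.from("fake image bytes")
);
```

The resulting object can be passed to the S3 client's PutObject call in whichever SDK version you use.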

But you don't actually need to return a redirect to the browser, here. You don't need a redirect at all.

With CloudFront’s Origin Failover capability, you can setup two origins for your distributions - primary and secondary, such that your content is served from your secondary origin if CloudFront detects that your primary origin is unavailable.

https://aws.amazon.com/about-aws/whats-new/2018/11/amazon-cloudfront-announces-support-for-origin-failover/

The word "unavailable" is unnecessarily vague here, because this feature does more than that, and it will do what you want. To set what triggers origin "failover"...

You can choose any combination of the following status codes: 500, 502, 503, 504, 404, or 403.

https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/high_availability_origin_failover.html#concept_origin_groups.creating

So just the normal bucket behavior is sufficient, no redirect rules required.
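To make that concrete, here is a sketch of the origin group portion of a distribution config, written as a plain object in the shape the CloudFront API uses (OriginGroups with FailoverCriteria and Members). The origin IDs and group ID are made up, and the helper is only there to keep the Quantity fields consistent. I've included 403 alongside 404 because S3 can return 403 for a missing object when the caller lacks list permission; drop it if that doesn't apply to your bucket.

```javascript
// Builds one origin group: try the primary origin first, and retry the same
// request against the secondary when the primary returns one of statusCodes.
function buildOriginGroup(groupId, primaryOriginId, secondaryOriginId, statusCodes) {
    return {
        Id: groupId,
        FailoverCriteria: {
            StatusCodes: { Quantity: statusCodes.length, Items: statusCodes },
        },
        Members: {
            Quantity: 2,
            Items: [{ OriginId: primaryOriginId }, { OriginId: secondaryOriginId }],
        },
    };
}

// Hypothetical IDs: "s3-images" is the bucket origin, "apigw-resizer" is the
// API Gateway origin in front of the Lambda function.
const originGroups = {
    Quantity: 1,
    Items: [
        buildOriginGroup("images-with-resize-fallback", "s3-images", "apigw-resizer", [403, 404]),
    ],
};
```

The cache behavior then has to target the origin group's ID rather than the S3 origin directly; adding the API Gateway origin to the distribution is not enough by itself.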

Note that with this setup, the final response is the only thing that can be cached by CloudFront, whether that response is from the primary (S3) or secondary (Lambda via API Gateway) origin, so this eliminates the issue with transient responses being cached.
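One consequence is that whatever the secondary origin returns is what the viewer gets, so the function should probably return the resized image itself rather than a text body like "image converted". Below is a minimal sketch of such a response in the Lambda proxy integration format. The helper name and the Cache-Control value are assumptions, and API Gateway would also need to be configured to pass binary content through.

```javascript
// Wraps already-resized image bytes in a Lambda proxy integration response.
// Binary bodies must be base64-encoded and flagged with isBase64Encoded.
function buildImageResponse(imageBuffer, contentType) {
    return {
        statusCode: 200,
        headers: {
            "Content-Type": contentType,
            "Cache-Control": "public, max-age=86400", // assumed lifetime: one day
        },
        body: imageBuffer.toString("base64"),
        isBase64Encoded: true,
    };
}

// Usage: after resizing and saving to S3, return the same bytes to the caller.
const response = buildImageResponse(Buffer.from("abc"), "image/jpeg");
```

Saving the resized file to S3 as well means later cache misses are answered by the primary origin without invoking the function.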

Note also that despite the use of the word "failover," CloudFront does not maintain a conceptual model of the state of the origin, so each request stands on its own and will go to the primary origin even when other requests are "failing."
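If it helps, this per-request behavior can be modeled in a few lines. This is only a toy simulation with stand-in origins (a Map for the bucket and a fake resizer), not CloudFront's actual implementation.

```javascript
// Status codes configured to trigger failover (illustrative choice).
const FAILOVER_STATUSES = [403, 404];

// Each request goes to the primary first; only if the primary's status is in
// FAILOVER_STATUSES is the same request sent to the secondary. No state about
// earlier failures is kept between calls.
function requestThroughOriginGroup(path, primary, secondary) {
    const primaryResponse = primary(path);
    if (!FAILOVER_STATUSES.includes(primaryResponse.status)) {
        return primaryResponse;
    }
    return secondary(path);
}

// Stand-in origins for the simulation.
const bucket = new Map();
let primaryCalls = 0;
const s3Origin = (path) => {
    primaryCalls += 1;
    return bucket.has(path)
        ? { status: 200, body: bucket.get(path) }
        : { status: 404, body: "Not Found" };
};
const resizerOrigin = (path) => {
    const image = `resized:${path}`;
    bucket.set(path, image); // the resizer stores its output in the bucket
    return { status: 200, body: image };
};

// First request falls through to the resizer; the second is served by S3.
const first = requestThroughOriginGroup("/photos/resized/test.jpg", s3Origin, resizerOrigin);
const second = requestThroughOriginGroup("/photos/resized/test.jpg", s3Origin, resizerOrigin);
```

Both requests hit the primary origin, and the second one succeeds there because the resizer stored its output during the first.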

Source: https://stackoverflow.com/questions/61024036/disable-cloudfront-cache-if-file-is-not-found

Tags

amazon-web-services

amazon-s3

amazon-cloudfront