Getting 403 (Forbidden) when loading AWS CloudFront file


Solution 1

When restricting access to S3 content using a bucket policy that inspects the incoming Referer: header, you need to do a little bit of custom configuration to "outsmart" CloudFront.

It's important to understand that CloudFront is designed to be a well-behaved cache. By "well-behaved," I mean that CloudFront is designed never to return a response that differs from what the origin server would have returned. I'm sure you can see why that is important.

Let's say I have a web server (not S3) behind CloudFront, and my web site is designed so that it returns different content based on an inspection of the Referer: header... or any other http request header, like User-Agent: for example. Depending on your browser, I might return different content. How would CloudFront know this, so that it would avoid serving a user the wrong version of a certain page?

The answer is, it wouldn't be able to tell -- it can't know this. So, CloudFront's solution is not to forward most request headers to my server at all. What my web server can't see, it can't react to, so the content I return cannot vary based on headers I don't receive, which prevents CloudFront from caching and returning the wrong response, based on those headers. Web caches have an obligation to avoid returning the wrong cached content for a given page.

"But wait," you object. "My site depends on the value from a certain header in order to determine how to respond." Right, that makes sense... so we have to tell CloudFront this:

Instead of caching my pages based on just the requested path, I need you to also forward the Referer: or User-Agent: or one of several other headers as sent by the browser, and cache the response for use on other requests that include not only the same path, but also the same values for the extra header(s) that you forward to me.

However, when the origin server is S3, CloudFront doesn't support forwarding most request headers, on the assumption that since static content is unlikely to vary, these headers would just cause it to cache multiple identical responses unnecessarily.

Your solution is not to tell CloudFront that you're using S3 as the origin. Instead, configure your distribution to use a "custom" origin, and give it the hostname of the bucket to use as the origin server hostname.

Then, you can configure CloudFront to forward the Referer: header to the origin, and your S3 bucket policy that denies/allows requests based on that header will work as expected.
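In the distribution configuration, the relevant piece is the `ForwardedValues` section of the cache behavior (as seen, for example, in the output of `aws cloudfront get-distribution-config`). A sketch of what the whitelist would look like, with placeholder values:

```json
"ForwardedValues": {
    "QueryString": false,
    "Cookies": { "Forward": "none" },
    "Headers": {
        "Quantity": 1,
        "Items": ["Referer"]
    }
}
```

With this in place, CloudFront includes `Referer` in the cache key and passes it through to the origin on cache misses.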

Well, almost as expected. This will lower your cache hit ratio somewhat, since the cached pages will now be cached based on path + referring page. If an S3 object is referenced by more than one of your site's pages, CloudFront will cache a copy for each unique request. It sounds like a limitation, but really, it's only an artifact of proper cache behavior: almost everything that gets forwarded to the back-end must be used to determine whether a particular cached response is usable for servicing future requests.

See http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/distribution-web-values-specify.html#DownloadDistValuesForwardHeaders for configuring CloudFront to whitelist specific headers to send to your origin server.

Important: don't forward any headers you don't need, since every variant request reduces your hit rate further. Particularly when using S3 as the back-end for a custom origin, do not forward the Host: header, because that is probably not going to do what you expect. Select the Referer: header here, and test. S3 should begin to see the header and react accordingly.

Note that when you removed your bucket policy for testing, CloudFront would have continued to serve the cached error page unless you flushed your cache by sending an invalidation request, which causes CloudFront to purge all cached pages matching the path pattern you specify, over the course of about 15 minutes. The easiest thing to do when experimenting is to just create a new CloudFront distribution with the new configuration, since there is no charge for the distributions themselves.
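With the AWS CLI, an invalidation covering everything in the distribution looks like this (the distribution ID is a placeholder for your own):

```shell
aws cloudfront create-invalidation \
    --distribution-id E1234EXAMPLE \
    --paths "/*"
```

Narrower path patterns (e.g. `"/videos/*"`) purge less and are usually preferable once you're past the experimentation stage.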

When viewing the response headers from CloudFront, note the X-Cache: (hit/miss) and Age: (how long ago this particular page was cached) responses. These are also useful in troubleshooting.
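You can inspect those response headers from the command line; for example, against a hypothetical distribution domain:

```shell
curl -sI https://d111111abcdef8.cloudfront.net/path/to/object | grep -iE '^(x-cache|age):'
```

`X-Cache: Hit from cloudfront` with a growing `Age:` value tells you the edge is serving a cached copy rather than going back to the origin.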


Update: @alexjs has made an important observation: instead of doing this using the bucket policy and forwarding the Referer: header to S3 for analysis -- which will hurt your cache ratio to an extent that varies with the spread of resources over referring pages -- you can use the new AWS Web Application Firewall service, which allows you to impose filtering rules against incoming requests to CloudFront, to allow or block requests based on string matching in request headers.

For this, you'd need to connect the distribution to S3 as an S3 origin (the normal configuration, contrary to the "custom" origin I proposed in the solution above) and use the built-in capability of CloudFront to authenticate back-end requests to S3 (so the bucket contents aren't directly accessible to a malicious actor requesting them from S3 directly).
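That built-in capability is an origin access identity. With one in place, the bucket policy grants read access to the OAI instead of the public; a sketch, with placeholder identity ID and bucket name:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCloudFrontOriginAccessIdentity",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity E1234EXAMPLE"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::your.remote.bucket/*"
        }
    ]
}
```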

See https://www.alexjs.eu/preventing-hotlinking-using-cloudfront-waf-and-referer-checking/ for more on this option.

Solution 2

Also, it may be something simple. When you first upload a file to an S3 bucket, it is non-public, even if other files in that bucket are public, and even if the bucket itself is public.

To change this in the AWS Console, check the box next to the folder that you want to make public (the folder you just uploaded), and choose "Make public" from the menu.

The files in that folder (and any subfolders) will be made public, and you'll be able to serve the files from S3.

For the AWS CLI, add the "--acl public-read" option in your command, like so:

aws s3 cp index.html s3://your.remote.bucket --acl public-read
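To upload an entire folder with the same ACL applied to every object, add `--recursive` (local folder name is a placeholder):

```shell
aws s3 cp ./local-folder s3://your.remote.bucket --recursive --acl public-read
```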

Solution 3

I identified another reason why CloudFront can return a 403 (Bad request). It may be an edge case, but I would like to share it with you.

CloudFront implements a forwarding-loop detection mechanism to protect against forwarding-loop attacks.
According to AWS support, you cannot cascade more than two CloudFront distributions as origins.

Let's assume you have configured CloudFront A with CloudFront B as its origin, CloudFront B with CloudFront C as its origin, and CloudFront C with an S3 bucket as its origin.

A --> B --> C --> S3 bucket (can return a 403 error)

If you request a file from CloudFront A that is located in the S3 bucket at the end of the cascade, CloudFront C will return a 403 (Bad request).

If your cascade consists of just two CloudFront distributions with an S3 bucket at the end, requesting a file from the S3 origin works.

A --> B --> S3 bucket (works)

Solution 4

For me, I had to give CodePipeline access via my S3 bucket policy. For example, something like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::mys3bucket/*"
        }
    ]
}
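A policy like this can also be attached from the CLI, assuming the JSON above is saved to `policy.json`:

```shell
aws s3api put-bucket-policy --bucket mys3bucket --policy file://policy.json
```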

Solution 5

I was getting a 403 error from CloudFront for POST requests, where my origin was a domain name instead of an S3 bucket.

The reason was that POST is not allowed by CloudFront by default. I enabled POST from the Behaviors tab in the console, and then it worked.
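In the distribution config, the equivalent of that console change is the `AllowedMethods` setting on the cache behavior, roughly:

```json
"AllowedMethods": {
    "Quantity": 7,
    "Items": ["GET", "HEAD", "OPTIONS", "PUT", "POST", "PATCH", "DELETE"],
    "CachedMethods": {
        "Quantity": 2,
        "Items": ["GET", "HEAD"]
    }
}
```

CloudFront passes POST through to the origin but only caches responses to GET and HEAD.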



Author: Shina (Design + Tech)

Updated on December 21, 2021

Comments

  • Shina
    Shina over 2 years

    I'm working on a video app and storing the files on AWS S3, using the default URL like https://***.amazonaws.com/*** works fine but I have decided to use CloudFront which is faster for content delivery.

    Using CF, I keep getting 403 (Forbidden) using this URL https://***.cloudfront.net/***. Did I miss anything?

    Everything works fine until I decide to load the contents from CloudFront which points to my bucket.

    Any solution please?

    • Michael - sqlbot
      Michael - sqlbot over 8 years
      You haven't given us much to go on. Are you using pre-signed URLs? Does your bucket policy deny requests based on certain request parameters?
    • Shina
      Shina over 8 years
      @Michael-sqlbot I'm not using pre-signed URL, just the standard config. The policy I set was to accept only my URL to load the files.
    • Michael - sqlbot
      Michael - sqlbot over 8 years
      So, you are using a bucket policy with something like "Condition":{ "StringLike":{"aws:Referer":["http://www.example.com/*"]} }?
    • Shina
      Shina over 8 years
      @Michael-sqlbot Exactly, and even deleting the policy just for testing didn't help. I'm a bit confused
    • Michael - sqlbot
      Michael - sqlbot over 8 years
      I suspect you may have been seeing cached error responses from CloudFront after you removed the policy restriction. Answer coming up.
    • alexjs
      alexjs about 8 years
      If I'm reading this correctly, please note that you can now do Referer checking at CloudFront using the WAF rather than using the S3 approach. I've covered this here. (I'm also going to update my post to mention @Michael-sqlbot's answer, which is v neat)
    • Michael - sqlbot
      Michael - sqlbot about 8 years
      @alexjs you are absolutely correct: AWS Web Application Firewall can block (or allow) requests into CloudFront based on string matching of request headers. My mindset when writing the answer was centered around the current configuration, and I overlooked this alternative... which could result in significantly better cache hit ratios by avoiding the page-to-page cache variants caused by forwarding a header to the origin. Thanks, also, for the blog mention.
    • mayankcpdixit
      mayankcpdixit over 4 years
      add /index.html in your url & check?
  • Efren
    Efren over 5 years
    When using CloudFront to access S3, you ought to use an origin access identity rather than exposing the S3 bucket to the public. Then the bucket can grant permission in its bucket policy (this can actually be done automatically if using the console to set up CloudFront).
  • Patrick Chu
    Patrick Chu about 5 years
    You're right, this is the preferred way for Cloudfront (the one I personally use). I guess my answer is more a reminder that even if you mark your bucket public, you also need to mark each individual file public as well.
  • vam
    vam over 3 years
    yeah, changing principal to * might be the solution to most of the cases where we usually configure it to be only accessible from cloudfront
  • vam
    vam over 3 years
    Reloading the page and redirecting to index.html is the biggest problem here. If developers come from a pre-rendered application using Gatsby or pre-rendered.io, here is what you may try: make sure you point the origin to the S3 bucket website endpoint and update the bucket policy to have Principal be '*'. This applies to cases where the entire application is static content and you use a different bucket for client-only routes.