Why is this response being cached?

When "Expires" and "Cache-Control" headers are not specified, but a "Last-Modified" header is specified, browsers have to guess at how long they should keep the document in cache. Some browsers do use algorithms that let the page remain in cache for a day or more.

Google's caching best practices guide states:

Last-Modified is a "weak" caching header in that the browser applies a heuristic to determine whether to fetch the item from cache or not. (The heuristics are different among different browsers.)


Mozilla (Firefox) has an HTTP Caching FAQ that outlines their algorithm for this situation (the document is dated 2002, so the algorithm may have changed since then):

...we look for a "Last-Modified" header. If this header is present, then the cache's freshness lifetime is equal to the value of the "Date" header minus the value of the "Last-modified" header divided by 10.

So in your case, where the difference between the "Last-Modified" and "Date" headers is about 15 days, Firefox would treat the cached copy as fresh for about 1.5 days, counted from the time of download.
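To make the arithmetic concrete, here is a minimal sketch of that 10% rule applied to the Date and Last-Modified values from the question. The function names `heuristic_freshness` and `is_fresh` are made up for illustration, and the divisor of 10 is taken from the Mozilla FAQ quoted above; real browsers may apply additional limits that are not modelled here.

```python
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime


def heuristic_freshness(date_header: str, last_modified_header: str,
                        divisor: int = 10) -> timedelta:
    """Freshness lifetime = (Date - Last-Modified) / divisor.

    The divisor of 10 follows the rule quoted from the Mozilla FAQ;
    it is an assumption, not a value read from any browser's source.
    """
    date = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    age_at_download = date - last_modified
    # A Last-Modified that is not in the past gives nothing to work with.
    if age_at_download <= timedelta(0):
        return timedelta(0)
    return age_at_download / divisor


def is_fresh(date_header: str, last_modified_header: str,
             now: datetime) -> bool:
    """True while the copy downloaded at `Date` is still within its lifetime."""
    downloaded_at = parsedate_to_datetime(date_header)
    lifetime = heuristic_freshness(date_header, last_modified_header)
    return now - downloaded_at < lifetime


# Header values copied from the question.
DATE = "Thu, 10 Oct 2013 07:36:27 GMT"
LAST_MODIFIED = "Tue, 24 Sep 2013 13:34:30 GMT"

lifetime = heuristic_freshness(DATE, LAST_MODIFIED)
print(lifetime)  # 1 day, 13:48:11.700000
```

The lifetime is computed once from the headers of the stored response and counted from the download time, so it does not keep growing as the file gets older. Under these assumptions, a reuse 19 hours after the download still falls inside the window of roughly 37.8 hours.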

It appears that all major browsers use the same 10% rule that Firefox implements. A question on Stack Overflow asks about these heuristics, and the answers there for different browsers show that they all have similar implementations. There are answers for Internet Explorer and WebKit (Chrome and Safari).


The size of the browser's cache will probably be the limiting factor for a file that the caching algorithm determines may be kept for more than a day. Browsers generally have a setting for the amount of disk space they use for cache. Many users also clear their cache when they close their browser. So the amount of time for which such a file is cached usually depends on:

  • The amount of cache space the browser has allocated
  • The number of websites that a user visits (and the size of those sites)
  • Whether or not the user has closed their browser

Author: Robar

Updated on September 18, 2022

Comments

  • Robar
    Robar over 1 year

    I have a client whose site's index.html currently comes back with these headers:

    Accept-Ranges:    bytes
    Connection:       Keep-Alive
    Content-Encoding: gzip
    Content-Length:   3658
    Content-Type:     text/html
    Date:             Thu, 10 Oct 2013 07:36:27 GMT
    ETag:             "4aa95e1-2ed2-4e721324728b7"
    Keep-Alive:       timeout=5, max=100
    Last-Modified:    Tue, 24 Sep 2013 13:34:30 GMT
    Server:           Apache/2.2.22
    Vary:             Accept-Encoding,User-Agent

    I'm obviously going to recommend that they add Expires or Cache-Control as appropriate, but I'm confused: Chrome caches this resource and uses it from cache (not sending a request at all), even after several hours (for instance, this morning at 8:30 a.m. it reused a copy it had cached yesterday at 1:30 p.m.). I can see this quite clearly in the Chrome console's Network tab, where it shows the request and has 200 (OK) in grey in the Status column and (from cache) in the Size column. (I haven't changed Chrome's caching defaults.)

    I realize that the spec allows user agents to make their own decision in the absence of direction from the headers. Is that what's happening here? Chrome sees it was last modified several days ago and feels free to use a version that's (say) up to a day out of date? Or is there something I'm missing?

  • master_dodo
    master_dodo almost 7 years
    Can you please clarify "then Firefox would cache the resource for 1.5 days"? From which date are the 1.5 days counted? If it has already been 15 days, wouldn't it have expired already? And since NOW minus Last-Modified keeps increasing, do you mean it will be cached forever?
  • Stephen Ostermiller
    Stephen Ostermiller almost 7 years
    Not forever. For 1/10 of the time between the Last-Modified header and the time of download. If it has been cached for 15 days for you, that could mean that it had been 150 days since the file was last modified when you downloaded it.