A browser cache question that set off a rookie squabble

How it started

At 17:47 on 2021-09-17, an ordinary evening, I was coding away as usual when a colleague in the group chat asked a question: for the same request, why does Chrome show 200 (from disk cache) while Firefox shows 304?

My answer: a 200 like that means it hit the strong cache, so check whether an expiration time is set. But very quickly, snap, my colleague replied: there isn't one.

He also threw a screenshot in my face. Ouch, that really hurt…

At this point I had already fallen into a pit my colleague had unwittingly dug. I'm sure the experts reading this already see what went wrong, but at the time I did not realize anything was off and fell into deep self-doubt.

Let me describe what I did next.

Reproducing the problem

Since the problem had become complicated and talking alone was not solving it, it was time to get my hands dirty.

What is the first step to solving a problem?

Reproducing it. If you don't even know what the problem looks like, how can you solve it at its root?

Keep clicking the refresh button in your browser and you’ll see that Chrome is indeed 200 and Firefox is 304, as you can see below.

Strange, isn't it? Are Chrome and Firefox squabbling again? Let's look at Chrome first and see whether it really hit the strong cache.

The first thing to notice is that Chrome shows two kinds of responses: a normal 200, and a 200 (from disk cache), which I initially took to be a strong cache hit.

Now let's take a closer look at the headers of each request.

First, the normal request. Hmm… nothing special here, it's just normal. The request headers contain Cache-Control: no-cache and Pragma: no-cache. The response is a plain 200 whose response headers carry ETag and Last-Modified, which play no role in the current request and can be ignored.

Then let's look at the bone of contention, the 200 (from disk cache) response:

How to put it… To me this originally looked like a perfectly normal request too, but after my colleague's explanation I fell right into the pit as well 🤦♀️…

The argument went like this: the request has ETag and Last-Modified, right? So it uses the negotiated cache, and a negotiated cache should return 304, right? But this request returns 200. The logic seemed sound to me, and I was completely lost.

A rookie's attempt at a solution

Confused or not, the problem still had to be solved. After some searching, I found a question that was somewhat (read: not at all) related: cache-control-max-age-0-no-cache-but-browser-bypasses-server-query-and-hits

The specification says that with no-cache a cached response must not be reused for a later request without checking with the server, but when the user presses the back button to return to a page, Chrome does not send a request at all, so even with no-cache the cached data is returned.

This has nothing to do with a negotiated cache returning 200, but it does show that Chrome has a history of not playing by the rules, so perhaps a negotiated cache returning 200 is just another of its quirks. Case closed: I tossed the link to my colleague to chew on and headed home.

However, this colleague was not so easy to fool 🤦♀️

Never mind. Tomorrow's problems are for tomorrow and the headaches can wait. I am clocking out happy right now, and that's final!

The rookie, face swollen from all the slaps, has an epiphany

I got to the office early the next day. I have to say, a rested brain really is sharper: details I had missed the day before surfaced one after another. Balancing work and rest truly pays off. There is only one truth!

Let me reveal the truth step by step.

Negotiation cache? No cache at all

First, the statement that led me astray:

That statement is deeply flawed. ETag and Last-Modified are indeed used for cache negotiation, but they do not indicate whether the current request hit the negotiated cache. They exist so that the client can send If-None-Match and If-Modified-Since in the request headers of subsequent requests, letting the server decide whether the cached copy is still usable. That is what negotiating the cache means. So these two headers have nothing to do with whether the current request hit the cache. In fact, the request in the screenshot did not use the cache at all: it clearly shows Status Code: 200. In other words, it is a normal Ajax request and response, which is consistent with the Cache-Control: no-cache in its request headers.
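The exchange described above can be sketched in a few lines of Python. This is a minimal illustration rather than how any browser or server is actually implemented: the function names are hypothetical, and the ETag comparison is a plain string match that ignores weak validators and multi-value lists.

```python
from email.utils import parsedate_to_datetime


def conditional_headers(cached_response_headers):
    """Build the validators for the NEXT request from a stored response."""
    headers = {}
    etag = cached_response_headers.get("ETag")
    last_modified = cached_response_headers.get("Last-Modified")
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def server_status(request_headers, current_etag, current_last_modified):
    """Hypothetical server-side check: 304 if the client's copy is still valid."""
    if "If-None-Match" in request_headers:
        # If-None-Match takes precedence over If-Modified-Since.
        return 304 if request_headers["If-None-Match"] == current_etag else 200
    if "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        modified = parsedate_to_datetime(current_last_modified)
        return 304 if modified <= since else 200
    # No validators were sent: a plain request gets a plain 200,
    # even though the response itself carries ETag and Last-Modified.
    return 200
```

The last branch is the case from the screenshot: a response can carry ETag and Last-Modified and still be an ordinary 200, because a 304 is only possible when the request brought validators with it.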

What does this tell us? Don't try to reason when your head isn't clear. A statement that was obviously full of holes looked so reasonable that I could not refute it.

Quirks? No, no, no

Next, the question of Chrome hitting the strong cache:

Looking at this screenshot again now, it all reads much more clearly.

Status Code: 200 (from disk cache) indicates that the request hit the strong cache. As for the headers: the request was never actually sent, so the headers shown are simply those of the previously cached response and have nothing to do with the current request.

Look in the Request Headers for the Cache-Control and Pragma directives that control strong caching. Obviously, there are none. In other words, this Ajax request carries no cache-control directive in its request headers, and so it hit the strong cache.
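The header check performed by eye above could be written roughly as follows. This is a simplified sketch with hypothetical function names: it only looks at the no-cache and max-age=0 request directives and at Pragma, and it says nothing about how a particular browser reacts to them.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def request_forbids_cached_answer(request_headers):
    """True if the request asks not to be answered straight from the cache."""
    headers = {k.lower(): v for k, v in request_headers.items()}
    cc = parse_cache_control(headers.get("cache-control", ""))
    if "no-cache" in cc or cc.get("max-age") == "0":
        return True
    # Pragma: no-cache is the HTTP/1.0 fallback, consulted only when
    # the request has no Cache-Control header at all.
    return "cache-control" not in headers and "no-cache" in headers.get("pragma", "").lower()
```

For the first screenshot (Cache-Control: no-cache plus Pragma: no-cache) the function returns True; for the second, where neither header is present, it returns False, which leaves the cache free to answer on its own.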

Why is that?

I looked into the specification and found that "heuristic expiration" kicks in when no explicit cache controls are in place. For the request in the screenshot, an expiration time is computed from the interval between the time of the last request and the returned Last-Modified value, and it is generally no more than 10% of that interval. For example, if the last request was made 1 day after the resource was last modified, the allowed cache time is at most 2.4 hours. The exact value would have to be looked up in the Chrome source code. If my understanding is wrong, please leave a comment to let me know.

For readers who cannot easily reach the original, here is the specification text, in case my limited English has led me to misread it without realizing:

13.2.2 Heuristic Expiration

Since origin servers do not always provide explicit expiration times, HTTP caches typically assign heuristic expiration times, employing algorithms that use other header values (such as the Last-Modified time) to estimate a plausible expiration time. The HTTP/1.1 specification does not provide specific algorithms, but does impose worst-case constraints on their results. Since heuristic expiration times might compromise semantic transparency, they ought to used cautiously, and we encourage origin servers to provide explicit expiration times as much as possible.

13.2.4 Expiration Calculations

// Some unrelated content is omitted here

If none of Expires, Cache-Control: max-age, or Cache-Control: s-maxage (see section 14.9.3) appears in the response, and the response does not include other restrictions on caching, the cache MAY compute a freshness lifetime using a heuristic. The cache MUST attach Warning 113 to any response whose age is more than 24 hours if such warning has not already been added.

Also, if the response does have a Last-Modified time, the heuristic expiration value SHOULD be no more than some fraction of the interval since that time. A typical setting of this fraction might be 10%.
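The 10% rule quoted above can be sketched as a small calculation. This is only an illustration under stated assumptions: the function name is hypothetical, the response's Date header stands in for "the time of the last request", and the fraction is the "typical" 10% from the specification, not a value taken from Chrome's source.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def heuristic_freshness(date_header, last_modified_header, fraction=0.1):
    """Freshness lifetime as a fraction of (Date - Last-Modified)."""
    date = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    interval = date - last_modified
    if interval < timedelta(0):
        # Last-Modified lies in the future: do not treat the response as fresh.
        return timedelta(0)
    return interval * fraction


lifetime = heuristic_freshness(
    "Thu, 16 Sep 2021 09:00:00 GMT",  # response received one day ...
    "Wed, 15 Sep 2021 09:00:00 GMT",  # ... after the last modification
)
print(lifetime.total_seconds() / 3600)  # roughly 2.4 (hours)
```

With a Date one day after Last-Modified, the result is about 2.4 hours, matching the arithmetic in the explanation. During that window a cache is allowed to answer without contacting the server.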

That explains all of Chrome’s problems.

And Firefox? Innocent too.

What about Firefox? Is there something wrong with this kid? No, no, no. There’s just been a little misunderstanding.

After some more searching, I found a write-up on the impact of user actions on the cache. To paraphrase its conclusion: opening a new window, pressing Enter in the address bar, pressing the back button, pressing the refresh button, and pressing the force-refresh button each trigger different cache-control behavior in the browser.

So why would differences in user actions show up as differences between browsers? It comes down to a development habit: we typically develop in Chrome and only occasionally use Firefox or other browsers for supplementary checks. Chrome therefore stays open, and the page is reloaded with "Enter in the address bar" or the refresh button, whereas Firefox gets "open a new window". As a result, the two browsers end up applying different cache policies.

The end result is what my colleague observed: two browsers apparently caching the same request differently.

The truth

1. The 200 versus 304 difference is not mainly caused by the browsers themselves. It arises because Firefox was reopened, which makes the browser force the request through the negotiated cache. That is not the whole story, though: in actual testing, pressing the refresh button in Chrome did not necessarily force a request, and pressing it several times within a short period still hit the strong cache. This is presumably a Chrome optimization, so the write-up on how user actions affect the cache is not entirely accurate.

2. The default caching logic differs between the browser's back button, pressing Enter in the address bar, the refresh button, and requests triggered from within the page (clicking a button that sends a request, and so on).

3. This is a GET request. According to the specification, a GET response may be cached if it meets the caching requirements. The request headers carry no cache-control directives by default and no explicit expiration is set, so the "heuristic expiration" policy applies and the strong cache is hit.
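The three outcomes seen in this investigation can be summarized as a tiny decision function. It is a deliberately simplified model with a hypothetical name and hypothetical boolean inputs, not a description of Chrome's or Firefox's real logic: "fresh" stands for "still within the explicit or heuristic freshness lifetime", and "must_contact_server" covers both a no-cache request header and a user action that forces a check.

```python
def observed_status(cached, fresh, must_contact_server, sends_validators, unchanged):
    """Status shown in the network panel for one GET request (simplified model)."""
    if cached and fresh and not must_contact_server:
        # Strong cache hit: no request leaves the browser.
        return "200 (from disk cache)"
    if cached and sends_validators and unchanged:
        # Negotiated cache: the server confirms the stored copy is still valid.
        return "304"
    # Everything else is a full response from the server.
    return "200"
```

Under this model, the Chrome refresh is `observed_status(True, True, False, True, True)`, the reopened Firefox is `observed_status(True, True, True, True, True)`, and the request carrying Cache-Control: no-cache without validators is `observed_status(True, True, True, False, True)`.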

Next time I will write a proper article on what the specification says about the common caching behaviors mentioned above and how browsers actually implement them.

Takeaways

~~Takeaways? My image as a rookie is now more secure than ever.~~

A small problem that took two days on and off. Every piece of knowledge involved was something I already knew, yet I still learned a lot.

First, solving a problem means looking at it carefully (laughing and crying). When several problems are tangled together, you need to identify them one by one instead of taking it for granted that they are all the same problem.

Second, don't assume there is magic in the world of code. Sometimes "almost right" is actually far off, so when you are unsure about something, go to the documentation and confirm it.

Finally, I really need to improve my English; reading the documentation was painful!! That's all.

References

specification
Impact of user operations on cache
cache-control-max-age-0-no-cache-but-browser-bypasses-server-query-and-hits
Chrome ignores the ETag header and just uses the in memory cache/disk cache