Preface

HTTP has always been essential knowledge for front-end developers. I recently noticed that my grasp of HTTP caching had gotten a little rusty, so in this illustrated post let's review HTTP caching together.

Without further ado, the following diagram shows the HTTP cache flow to give you a basic idea.

Composition of HTTP packets

Before going into HTTP caching in detail, here is a brief look at what an HTTP message is made of:

  1. The header, which carries attributes and additional information (cookies, cache information, etc.). Cache-related rules are carried in the header.

Common request headers:

Accept: text/html,image/*                                      # Media types the browser can accept
Accept-Charset: ISO-8859-1                                     # Character sets the browser can accept
Accept-Encoding: gzip,compress                                 # Compression encodings the browser can accept
Accept-Language: en-us,zh-cn                                   # Languages and regions the browser can accept
Host: www.lks.cn:80                                            # Host and port the browser is requesting
If-Modified-Since: Tue, 11 Jul 2000 18:23:51 GMT               # Last modification time of the cached copy of the page
Referer: http://www.lks.cn/index.html                          # Page the request comes from
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0) # Browser information
Cookie:                                                        # Data set by the server that the browser stores and sends back
Connection: close / Keep-Alive                                 # Connection handling: close is the HTTP/1.0 default, Keep-Alive the HTTP/1.1 default
Date: Tue, 11 Jul 2000 18:23:51 GMT                            # Time the request was sent
Allow: GET                                                     # Methods allowed for the resource, e.g. GET or POST
Keep-Alive: 5                                                  # How long the connection is kept alive: 5 s
Connection: keep-alive                                         # Whether this is a persistent (long-lived) connection
Cache-Control: max-age=300                                     # Maximum cache time: 300 s


  2. The body, which contains the data the HTTP message actually transfers
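To make the header/body split concrete, here is a minimal sketch in Python. The function name `split_http_message` and the sample response are made up for illustration; a real parser also has to deal with chunked bodies, repeated headers, and malformed input, which this sketch ignores.

```python
from typing import Dict, Tuple


def split_http_message(raw: bytes) -> Tuple[str, Dict[str, str], bytes]:
    """Split a raw HTTP message into its start line, headers, and body."""
    # The header section and the body are separated by one empty line.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical response used only for this example.
raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Cache-Control: max-age=300\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
start_line, headers, body = split_http_message(raw_response)
```

Here the cache rule (`max-age=300`) ends up in `headers`, while `body` holds only the payload.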

HTTP cache rules

HTTP caches are classified by whether a request has to be sent to the server again. There are two types: the mandatory cache (also called the forced or strong cache) and the comparison cache (also called the negotiated cache). When the mandatory cache takes effect, the browser does not contact the server at all, whereas the comparison cache always requires a round trip to the server, whether or not the cached copy ends up being used. The two sets of rules can coexist, and the mandatory cache has the higher priority: if the mandatory cache rule is satisfied, the cached copy is used directly and the comparison cache rule is never evaluated.
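The priority between the two rules can be sketched as a small decision function. This is only an illustration of the order described above; the function name, its three boolean inputs, and the returned labels are assumptions made for the example, not how any browser is actually written.

```python
def choose_cache_action(has_cached_copy: bool, is_fresh: bool, has_cache_id: bool) -> str:
    """Decide what to do for a request, mandatory cache first."""
    if not has_cached_copy:
        return "fetch"        # nothing stored yet: ask the server for everything
    if is_fresh:
        return "use-cache"    # mandatory cache hit: no contact with the server
    if has_cache_id:
        return "revalidate"   # comparison cache: ask the server to compare
    return "fetch"            # stale and nothing to compare against


first_visit = choose_cache_action(False, False, False)
within_lifetime = choose_cache_action(True, True, True)
after_expiry = choose_cache_action(True, False, True)
```

Note that the comparison step is only reached once the mandatory cache check has failed.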

Mandatory cache

For the mandatory cache, two response header fields (Expires and Cache-Control) express the expiration rule, that is, how long the current resource remains valid.

To make this easier to follow, assume the browser has a cache database that stores cached information. When the client requests a resource for the first time, the cache database holds no matching entry, so the browser asks the server and stores the response in the cache database once it comes back.

Expires and Cache-Control rules

The value of Expires is an expiration date returned by the server. It tells the browser that, until that date, it can serve the data straight from its cache without requesting it again. However, Expires dates from HTTP 1.0, and browsers now use HTTP 1.1 or later by default, so it mostly serves as a fallback. Its main drawback is that the expiration time is generated from the server's clock but checked against the client's clock; if the two clocks disagree, the cache can expire too early or too late. For that reason, HTTP 1.1 introduced Cache-Control as its replacement.

Cache-Control defines the cache directives that every caching mechanism along the request/response chain must obey. They include public, private, no-cache (the response may be stored, but it must be revalidated with the server before it is used to answer a request), no-store, max-age, s-maxage, and must-revalidate. When both are present, the lifetime set by Cache-Control: max-age overrides the time given in Expires.

Comparison cache
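As a rough sketch of how max-age takes precedence over Expires, the Python below computes when a stored response stops being fresh. The helper names `expiry_time` and `is_fresh` are invented for this example, and other directives such as no-cache, no-store, and s-maxage are deliberately ignored.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def expiry_time(headers: Dict[str, str], received_at: datetime) -> Optional[datetime]:
    """Return the moment the response stops being fresh, or None if unknown."""
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""), re.IGNORECASE)
    if match:
        # max-age is relative to when the response was received,
        # so it does not depend on the server's clock.
        return received_at + timedelta(seconds=int(match.group(1)))
    expires = headers.get("Expires")
    if expires:
        # Expires is an absolute date produced from the server's clock.
        return parsedate_to_datetime(expires)
    return None


def is_fresh(headers: Dict[str, str], received_at: datetime, now: datetime) -> bool:
    expiry = expiry_time(headers, received_at)
    return expiry is not None and now < expiry


received = datetime(2000, 7, 11, 18, 0, 0, tzinfo=timezone.utc)
both = {"Cache-Control": "max-age=300", "Expires": "Tue, 11 Jul 2000 18:23:51 GMT"}
expiry_with_both = expiry_time(both, received)
```

With both headers present, `expiry_with_both` is 300 seconds after `received`, not the later time named in Expires.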

The comparison cache, as the name suggests, needs a comparison to decide whether the cached copy can be used. The first time the browser requests a resource, the server returns a cache identifier along with the data, and the client saves both to the cache database. On the next request, the client sends the saved cache identifier to the server, which checks it. If the check succeeds, the server returns status code 304 to tell the client that its saved copy is still usable.



When the comparison cache takes effect, the status code is 304, and both the response size and the request time drop sharply.

The reason is that, after the comparison, the server returns only the headers and uses the status code to tell the client to use its cache; the message body does not have to be sent again.

For the comparison cache, how the cache identifier is passed is the key thing to understand. It travels in the request and response headers, and there are two kinds of identifier, which we discuss separately.

Last-Modified / If-Modified-Since rules
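On the client side, handling a 304 could look roughly like the sketch below. The dictionary standing in for the cache database and the function `body_for_response` are hypothetical; a real browser cache stores much more than an identifier and a body.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for the cache database: URL -> (cache identifier, body)
CacheDb = Dict[str, Tuple[str, bytes]]


def body_for_response(cache_db: CacheDb, url: str, status: int,
                      cache_id: Optional[str], body: bytes) -> bytes:
    """Return the body to use for a response, reading or updating the cache."""
    if status == 304:
        # The server sent headers only: reuse the body saved earlier.
        return cache_db[url][1]
    if cache_id is not None:
        # Full response: save the identifier together with the data.
        cache_db[url] = (cache_id, body)
    return body


cache_db: CacheDb = {}
first = body_for_response(cache_db, "/index.html", 200, "v1", b"<html>hello</html>")
second = body_for_response(cache_db, "/index.html", 304, "v1", b"")
```

The second call carries an empty body over the wire, yet the caller still receives the full page from the saved copy.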

Last-Modified: when the server responds to a request, this field tells the browser when the resource was last modified.

If-Modified-Since: when the browser requests the resource again, this field tells the server the last modification time that the server returned with the previous response. When the server sees the If-Modified-Since header, it compares that value with the current last modification time of the requested resource. If the resource's last modification time is later than If-Modified-Since, the resource has been modified since then, so the server responds with status code 200 and the entire resource content. If the last modification time is earlier than or equal to If-Modified-Since, the resource has not been modified, so the server responds with HTTP 304, telling the browser to keep using its saved cache.

ETag / If-None-Match rules (take priority over Last-Modified / If-Modified-Since)
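The server-side comparison can be sketched as follows. This is an illustrative function, not a real framework API: the name `respond_if_modified_since` and the (status, headers, body) return shape are assumptions, and the sketch expects a timezone-aware UTC datetime and a well-formed If-Modified-Since value.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Tuple


def respond_if_modified_since(request_headers: Dict[str, str],
                              last_modified: datetime,
                              body: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Answer 304 when the resource is unchanged since the client's copy."""
    # HTTP dates only have one-second resolution.
    last_modified = last_modified.replace(microsecond=0)
    response_headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        return 304, response_headers, b""   # not modified: headers only
    return 200, response_headers, body      # modified or first request: full body


modified_at = datetime(2000, 7, 11, 18, 23, 51, tzinfo=timezone.utc)
first = respond_if_modified_since({}, modified_at, b"data")
repeat = respond_if_modified_since(
    {"If-Modified-Since": first[1]["Last-Modified"]}, modified_at, b"data")
```

The second request echoes back the Last-Modified value from the first response, so the server answers 304 with an empty body.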

ETag: a unique identifier for the current version of a resource on the server. The browser can validate its cached copy against the ETag value, which saves bandwidth. ETags also help prevent simultaneous updates to a resource from overwriting each other. ETag takes priority over Last-Modified.

If-None-Match: when the browser requests the resource again, this field tells the server the unique identifier of the data cached on the client side. When the server sees the If-None-Match header, it compares that value with the current unique identifier of the requested resource. If they differ, the resource has changed, so the server responds with status code 200 and the entire resource content. If they match, the resource has not been modified, so the server responds with HTTP 304, telling the browser to keep using its saved cache.
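Here is a minimal sketch of the same check with ETags. Deriving the ETag from a hash of the content is just one assumed approach (real servers use various schemes), the function names are made up, and the sketch ignores the list and `*` forms that If-None-Match can also carry.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """Build an identifier from the content (an assumption for this example)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_if_none_match(request_headers: Dict[str, str],
                          body: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Answer 304 when the client's identifier matches the current one."""
    etag = make_etag(body)
    response_headers = {"ETag": etag}
    if request_headers.get("If-None-Match") == etag:
        return 304, response_headers, b""   # identifiers match: headers only
    return 200, response_headers, body      # changed or first request: full body


first = respond_if_none_match({}, b"version 1")
unchanged = respond_if_none_match({"If-None-Match": first[1]["ETag"]}, b"version 1")
changed = respond_if_none_match({"If-None-Match": first[1]["ETag"]}, b"version 2")
```

Only the request made after the content changed gets the full body again.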

At this point you might ask: why do we need ETag when Last-Modified can already tell us whether the local cache is up to date? There are three main reasons:

  • Last-Modified is only accurate to the second. If a resource is modified more than once within one second, Last-Modified cannot accurately reflect the freshness of the file.

  • If a resource is regenerated periodically, its Last-Modified changes even when the content does not, so the cached file goes unused.

  • The server may be unable to obtain the correct modification time of the resource, or its clock may be inconsistent with that of a proxy server.
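The first two reasons can be illustrated with a short sketch. The timestamps and contents are invented for the example, and the content-hash ETag is again only an assumed scheme.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime


def last_modified_header(modified_at: datetime) -> str:
    # HTTP dates drop everything below one second.
    return format_datetime(modified_at.replace(microsecond=0), usegmt=True)


def content_etag(body: bytes) -> str:
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


# Two different versions written within the same second.
v1_time = datetime(2000, 7, 11, 18, 23, 51, 100000, tzinfo=timezone.utc)
v2_time = datetime(2000, 7, 11, 18, 23, 51, 900000, tzinfo=timezone.utc)
same_last_modified = last_modified_header(v1_time) == last_modified_header(v2_time)
different_etag = content_etag(b"version 1") != content_etag(b"version 2")

# The same content regenerated a minute later.
regenerated_time = datetime(2000, 7, 11, 18, 24, 51, tzinfo=timezone.utc)
changed_last_modified = last_modified_header(v1_time) != last_modified_header(regenerated_time)
same_etag = content_etag(b"version 1") == content_etag(b"version 1")
```

In the first case Last-Modified misses a real change that the ETag catches; in the second, Last-Modified reports a change that never happened while the ETag stays the same.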

Impact of user operations on cache

Requests that cannot be cached:

  • Requests whose HTTP headers contain Cache-Control: no-cache, Pragma: no-cache, or Cache-Control: max-age=0, which tell the browser not to serve the resource from its cache without contacting the server

  • Dynamic requests whose content depends on cookies, authentication information, or similar input

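  • Requests encrypted with HTTPS (although HTTPS resources can in practice be cached once the response header carries Cache-Control: max-age, or Cache-Control: public in the case of Firefox)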

  • Requests whose HTTP response headers contain neither Last-Modified/ETag nor Cache-Control/Expires

  • Current browser implementations do not cache responses to POST requests (nor, semantically, should they), and the specification does not allow a 304 status code to be returned for them. However, this does not mean POST responses can never be cached: according to RFC 7231, "Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content", a response to a POST request can also be cached if it contains explicit freshness information. See the RFC for details.

I've covered HTTP caching in general here, but this is just the tip of the iceberg. If you are interested, the RFC documents go into far more detail.
