Please credit the source when reprinting:

  1. HTTP header parsing: Cnblogs
  2. HTTP header field: Zhihu

Contents

1. Web servers associated with HTTP

2. HTTP headers


Web servers associated with HTTP

Before explaining HTTP headers, it’s worth taking a look at the Web server that works with HTTP.

Serving multiple domain names from one host

The HTTP/1.1 specification allows a single Web server to serve multiple domain names. Even with only one physical server, the virtual host (also known as virtual server) feature makes it appear as though there are several servers.

Virtual hosting is a technique that lets a single host serve multiple domains, so that one machine can run several websites or services. See Wikipedia's article on virtual hosting for details.

Hosting several domain names on one server raises a problem, though: a single physical server typically has a single IP address. Once DNS has resolved the domain name to that IP address, the server still has to work out which domain a received request is meant for.

There are two ways to solve this. The first is to include the Host field in the request headers to name the host being requested (HTTP/1.1 requires this field in every request). The second is to give the server several IP addresses and bind each service to a different one.
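The Host-based approach can be sketched in a few lines of Python. This is only an illustration: the host names, directory paths, and the `select_site` helper are hypothetical and do not come from any particular server.

```python
# Hypothetical mapping from host name to site root (illustration only).
VIRTUAL_HOSTS = {
    "www.example.com": "/var/www/example",
    "blog.example.com": "/var/www/blog",
}
DEFAULT_SITE = "/var/www/default"


def select_site(headers: dict) -> str:
    """Pick the document root for a request based on its Host header."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    host = normalized.get("host", "")
    # Drop an optional ":port" suffix and ignore case in the host name.
    # (Simplification: IPv6 literals such as "[::1]:8080" are not handled.)
    hostname = host.split(":", 1)[0].strip().lower()
    return VIRTUAL_HOSTS.get(hostname, DEFAULT_SITE)
```

Two requests that arrive at the same IP address but carry different Host values are routed to different sites, which is all a name-based virtual host needs.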

Programs that forward communication data: proxies

A proxy server sits between the client and the server. It receives requests from the client and forwards them to the server, and it receives responses from the server and forwards them to the client. Fiddler and Charles, the packet capture tools commonly used by front-end engineers, capture traffic by acting as proxies.

The basic behavior of a proxy server is to receive the request sent by the client and forward it to another server. The proxy does not change the request URI; it passes the request straight on to the server that holds the resource. That server is called the origin server (the "source server"), and its response travels back through the proxy to the client. Each time a message passes through a proxy, the proxy appends its own information to the Via header field, which records the intermediaries the message has traversed. Without it, there would be no way to tell which proxies a message went through.
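As a rough sketch of the Via bookkeeping described above, a proxy could add its own entry like this. The `append_via` helper and the proxy names are hypothetical; a Via entry consists of the received protocol version followed by a name for the proxy.

```python
def append_via(headers: dict, proxy_name: str, http_version: str = "1.1") -> dict:
    """Return a copy of the headers with this proxy added to the Via field."""
    forwarded = dict(headers)  # leave the caller's dict untouched
    entry = f"{http_version} {proxy_name}"
    existing = forwarded.get("Via")
    # Entries are comma-separated, in the order the proxies were traversed.
    forwarded["Via"] = f"{existing}, {entry}" if existing else entry
    return forwarded
```

After two hops the field reads `1.1 proxy-a, 1.1 proxy-b`, so the receiver can see the path the message took.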

In summary, using a proxy server has the following benefits:

1. Caching reduces network bandwidth consumption.

2. Access to specific websites can be controlled (filtering which sites may and may not be accessed).

Proxies can be used in several ways, which are classified along two criteria: whether they cache responses (a caching proxy) and whether they modify messages (a proxy that forwards messages unmodified is called a transparent proxy). See Wikipedia's article on proxy servers for details.

Caching resources

The cache mentioned above is a copy of a resource stored on the local disk of a proxy server or client. Caching reduces access to the origin server (unexpired cached resources are read from the proxy or the browser instead), which saves traffic and communication time.

The advantage of a caching proxy is that the same resource does not have to be fetched from the origin server again and again. The client gets the resource from the browser cache or a nearby proxy, and the origin server does not have to process the same request repeatedly.

However, whether the resource is cached by a browser or by a proxy server, the cache can expire. If the cached copy has not expired, it can be used directly. If it has expired, the proxy server fetches the updated resource from the origin server again. A browser does not simply re-download the resource; it first sends a conditional GET request (the If-Modified-Since request field, carrying the value of the earlier Last-Modified response field).
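The decision a cache makes here can be sketched as follows. The shape of the cache entry (a dict with `expires_at` and `last_modified` keys) and the `plan_request` helper are assumptions made for illustration, not how any particular browser or proxy stores its cache.

```python
import time


def plan_request(cached, now=None):
    """Decide how to satisfy a request given an optional cache entry.

    `cached` is a hypothetical dict with "expires_at" (epoch seconds) and
    "last_modified" (an HTTP date string), or None when nothing is cached.
    Returns (action, extra_request_headers).
    """
    now = time.time() if now is None else now
    if cached is None:
        return ("fetch", {})  # nothing cached: plain request
    if now < cached["expires_at"]:
        return ("use_cache", {})  # still fresh: no network access needed
    # Expired: ask the server whether the copy is still good (conditional GET).
    return ("revalidate", {"If-Modified-Since": cached["last_modified"]})
```

If the server answers the conditional request with 304 Not Modified, the cached copy can be reused without transferring the body again.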

To summarize briefly:

1. A Web server can serve multiple domain names. Either the request carries a Host field naming the requested host, or the server uses multiple IP addresses to manage the different services.

2. The basic behavior of a proxy server is to forward the client's request to the origin server and relay the response back to the client. Proxy servers and browsers can cache responses, which avoids wasting bandwidth on repeated identical requests to the origin server.

HTTP headers

Header fields fall into several groups. General header fields, entity header fields, and other header fields can appear in both requests and responses. Request header fields appear only in requests, and response header fields appear only in responses. HTTP/1.1 defines 47 header fields.

The following is a brief description of each field.

HTTP/1.1 general header fields

General header fields are headers used in both request and response messages.

Cache-Control

The Cache-Control field controls how caches behave. Its directives are optional, and multiple directives are separated by commas. Cache-Control can be used in both requests and responses. The directives are:

public: a response directive. It states explicitly that the response may be cached and that the cache may serve it to other users as well.

private: a response directive. The response is intended for one specific user. A shared cache such as a proxy server must not return it for requests sent by other users.

no-cache: the purpose is to prevent an expired resource from being returned from the cache. In a request, no-cache means the client will not accept a cached response that has not been revalidated, so the proxy server must forward the request to the origin server. In a response, no-cache means a cache may store the response but must revalidate it with the origin server before each reuse.

no-store: forbids caches, including proxy servers, from storing the response.

s-maxage: works like max-age but applies only to shared caches, that is, proxy servers used by multiple users. It has no effect on a cache that repeatedly returns responses to the same user. When s-maxage is present, the Expires header field and the max-age directive are ignored. For example, Cache-Control: s-maxage=600 (seconds) means a shared proxy may return the cached resource as long as it has been cached for less than 10 minutes.

max-age: the format is Cache-Control: max-age=600 (seconds). In a request, max-age tells the cache that the client accepts a cached resource whose age does not exceed the specified time. If max-age is 0, the proxy server usually needs to forward the request to the origin server.

In a response, max-age is the maximum time the resource may be served from the cache. Until that time has elapsed, the proxy server does not need to confirm the validity of the resource with the origin server.

When an HTTP/1.1 proxy server receives both an Expires field and a max-age directive, the max-age directive takes precedence over the Expires field.

min-fresh: asks the proxy server to return a cached resource only if it will remain fresh for at least the specified time. For example, with Cache-Control: min-fresh=60 (seconds), a cached response that would expire within the next 60 seconds cannot be returned.

max-stale: indicates that the client will accept a cached resource even after it has expired. If a value is given, the response is accepted as long as it has been expired for no longer than that time. If no value is given, the client accepts the response no matter how long ago it expired.

only-if-cached: indicates that the client wants the target resource only if the proxy server already has it cached locally. In other words, the directive asks the proxy server not to reload the response or revalidate the resource. If the proxy server's local cache cannot satisfy the request, it returns the status code 504 Gateway Timeout.

must-revalidate: requires the proxy server to confirm with the origin server that a cached response is still valid before returning it once it has become stale. If the proxy cannot reach the origin server to revalidate the resource, it returns a 504 Gateway Timeout status code to the client. The max-stale directive in the request is also ignored.

proxy-revalidate: requires all proxy servers (shared caches) to verify the validity of a cached response before returning it.

no-transform: in both requests and responses, a cache must not change the media type of the entity body.
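To make the directive syntax and the precedence rules in this section concrete, here is a minimal Python sketch. The `parse_cache_control` and `freshness_lifetime` names are hypothetical, and the parser is simplified: it splits on commas, so quoted arguments that themselves contain commas are not handled.

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive; valueless directives map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(directives: dict, shared_cache: bool):
    """Return the freshness lifetime in seconds, or None if no directive sets it."""
    # For a shared cache (proxy), s-maxage overrides max-age.
    if shared_cache and "s-maxage" in directives:
        return int(directives["s-maxage"])
    if "max-age" in directives:
        return int(directives["max-age"])
    # Neither directive present: the caller would fall back to Expires.
    return None
```

For `public, s-maxage=600, max-age=60` a proxy gets 600 seconds and a browser gets 60, which matches the rule that s-maxage only concerns shared caches.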

Connection

The Connection field does two things:

Controlling header fields that proxies do not forward: the format is Connection: <name of the header field not to forward>. A proxy that receives the message removes the listed header fields before forwarding it, so those fields apply only to the connection between two adjacent nodes.

Managing persistent connections: Connection: keep-alive. HTTP/1.1 connections are persistent by default, so the client and server can exchange many HTTP messages over a single TCP connection. The connection stays open until either side explicitly asks to close it, for example with Connection: close.
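The first use of Connection, naming header fields that must not travel past the next hop, can be sketched as below. The `strip_hop_by_hop` helper and the X-Debug header are made up for the example.

```python
def strip_hop_by_hop(headers: dict) -> dict:
    """Return the headers a proxy would forward, minus those named in Connection."""
    connection = ""
    for name, value in headers.items():
        if name.lower() == "connection":
            connection = value
    # Each token in Connection names a header meant only for this hop.
    hop_by_hop = {token.strip().lower() for token in connection.split(",") if token.strip()}
    hop_by_hop.add("connection")  # the Connection field itself is not forwarded
    return {name: value for name, value in headers.items() if name.lower() not in hop_by_hop}
```

With `Connection: Keep-Alive, X-Debug`, both Keep-Alive and X-Debug are dropped before the request moves on, while Host and Accept pass through unchanged.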

Pragma

This header field exists only for backward compatibility with HTTP/1.0. Its form is Pragma: no-cache. Although it is a general header field, it is used only in requests sent by the client, where it asks every intermediate server not to return a cached resource.

The Pragma header field has the same effect as the no-cache directive. Because not every intermediate server can be assumed to support HTTP/1.1, requests usually carry both fields for compatibility: Cache-Control: no-cache and Pragma: no-cache.

Trailer

The Trailer field declares in advance which header fields are recorded after the message body. It is mainly used with HTTP/1.1 chunked transfer encoding.

Transfer-Encoding

The Transfer-Encoding field specifies the encoding used to transfer the message body. In HTTP/1.1 it is used mainly for chunked transfer encoding.

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Connection: keep-alive

cf0    <- hexadecimal (3312 in decimal)
...3312 bytes of chunk data...
392    <- hexadecimal (914 in decimal)
...914 bytes of chunk data...
......

In the example above, chunked transfer encoding is in effect, as the Transfer-Encoding field value states, and the body is split into two chunks of 3312 bytes and 914 bytes.
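For readers who want to see how such a body is reassembled, here is a small Python sketch of a chunked decoder. The `decode_chunked` name is hypothetical, and the function assumes the whole body is already in memory and well formed; trailer fields after the final chunk are ignored.

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode a chunked-encoded message body into the original bytes."""
    decoded = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The size line is hexadecimal; anything after ";" is a chunk extension.
        size_text = body[pos:line_end].split(b";", 1)[0].strip().decode("ascii")
        size = int(size_text, 16)
        pos = line_end + 2
        if size == 0:
            break  # a zero-size chunk marks the end of the body
        decoded += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(decoded)
```

Each chunk starts with its length in hexadecimal, which is why the example shows `cf0` for 3312 bytes and `392` for 914 bytes.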

Upgrade

The Upgrade field is used to check whether communication can switch to a newer version of HTTP or to a different protocol. WebSocket uses this field: during HTTP communication the client asks to upgrade from HTTP to WebSocket. The server then returns the 101 Switching Protocols status code to indicate that the protocol switch succeeded, after which the WebSocket protocol can be used for full-duplex, bidirectional communication. Readers unfamiliar with WebSocket can refer to the article "WebSocket protocol resolution".

Via

The Via field is used to trace the transmission path of request and response messages between the client and the server. When a message passes through a proxy server or gateway, information about that server is appended to the Via field before the message is forwarded. The Via field is often used together with the Max-Forwards field. For an explanation of Max-Forwards, see the article "Max-Forwards".

Request header fields

Accept

The Accept field tells the server which media types the user agent can handle and the relative priority of each. Media types are written in type/subtype form, and several can be listed at once. A weight given with q= sets the relative priority of a media type; it ranges from 0 to 1, and the default is 1.0.

Accept: application/json;q=1.0, text/plain;q=0.8, */*;q=0.7
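A server that honors these weights has to parse and sort them. The following Python sketch shows one way to do that; `parse_accept` is a hypothetical helper, and parameters other than q are simply ignored.

```python
def parse_accept(value: str) -> list:
    """Return (media_type, q) pairs sorted from most to least preferred."""
    entries = []
    for item in value.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip().lower() == "q":
                q = float(val)
        entries.append((media_type, q))
    # sorted() is stable, so equal weights keep their original order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)
```

The same weighting scheme applies to Accept-Charset, Accept-Encoding, and Accept-Language, so the function works on those values as well.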

Accept-Charset

The Accept-Charset field tells the server which character sets the user agent supports and the relative priority of each. Several character sets can be specified at once. As with the Accept field, a q weight indicates the relative priority.

Accept-Encoding

The Accept-Encoding field tells the server which content encodings the user agent supports and the relative priority of each. Content encodings include gzip, compress, deflate, and identity (the default, which performs no compression).

Accept-Language

The Accept-Language field tells the server which natural languages (for example, Chinese or English) the user agent can handle and the relative priority of each. Several languages can be specified at once.

Accept-Language: zh-CN,zh;q=0.9,en;q=0.8

Authorization

The Authorization field carries the user agent's authentication information (credentials) to the server. Typically, a user agent that wants to authenticate with the server adds the Authorization field to its next request after receiving a 401 response.
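As one concrete case, the Basic scheme encodes "username:password" in Base64. The sketch below assumes that scheme only (others, such as Digest or Bearer, build the value differently), and `basic_authorization` is a hypothetical helper name.

```python
import base64


def basic_authorization(username: str, password: str) -> str:
    """Build the value of an Authorization header for the Basic scheme."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"
```

Base64 is an encoding, not encryption, so Basic credentials are only as safe as the connection that carries them.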

Host

The Host field tells the server the Internet host name and port number of the requested resource. Before the request is sent, DNS resolves the domain name to an IP address. If several domain names (virtual hosts) are deployed on the same IP address, the server cannot tell from the address alone which domain the request is for, so the Host field states the requested host name explicitly.

If-None-Match

If-None-Match is used together with ETag. If the value of the If-None-Match field does not match the resource's ETag, the server processes the request normally. If it matches, the server returns 304 Not Modified.

In typical usage, when a URL is requested, the Web server returns the resource along with its ETag value, which is placed in the HTTP response headers.

ETag: "686897696a7c876b7e"

The client can then decide whether to cache the resource and its ETag. Later, if the client requests the same URL again, it sends the saved ETag value in the If-None-Match field.

If-None-Match: "686897696a7c876b7e"

When the request arrives, the server compares the ETag sent by the client with the ETag of the current version of the resource. If the values match, the resource has not changed, and the server sends back a very short response with the status 304 Not Modified. The 304 status tells the client that its cached version is up to date and should be used. If the values do not match, the resource has probably changed, and the server returns a complete response (200 OK) including the contents of the resource, just as if ETag had not been used. The client can then replace its cached version with the newly returned resource and the new ETag.
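The server-side comparison just described can be sketched as follows. `respond_conditionally` is a hypothetical handler, and it compares tags as exact strings, which ignores the distinction between strong and weak validators.

```python
def respond_conditionally(request_headers: dict, current_etag: str, body: bytes):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    raw = request_headers.get("If-None-Match", "")
    # The field may list several tags separated by commas; "*" matches anything.
    candidates = [tag.strip() for tag in raw.split(",")]
    if current_etag in candidates or "*" in candidates:
        return 304, {"ETag": current_etag}, b""  # client copy is still current
    return 200, {"ETag": current_etag}, body  # send the full resource
```

The 304 response carries no body, which is where the bandwidth saving comes from.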

If-Modified-Since

The If-Modified-Since field carries the Last-Modified value that the server sent in an earlier response. If the resource's current Last-Modified time is later than the If-Modified-Since value, the resource has been updated and the server returns 200 OK. If it is not later, the resource has not been updated and the server returns 304 Not Modified. When If-Modified-Since is sent together with If-None-Match, it is ignored unless the server does not support If-None-Match. If-Modified-Since is used to verify the validity of local resources held by a proxy server or client.
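A minimal Python sketch of this date comparison follows. The `status_for_if_modified_since` name is hypothetical, and the function assumes `last_modified` is a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def status_for_if_modified_since(header_value: str, last_modified: datetime) -> int:
    """Return 200 if the resource changed after the given date, else 304."""
    try:
        since = parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        return 200  # an unparsable date is ignored, so respond normally
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) > since:
        return 200
    return 304
```

Because the comparison works at one-second granularity, two changes within the same second cannot be told apart, which is one reason ETag is preferred when both are available.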

If-Range

The If-Range field tells the server to treat the request as a range request only if the If-Range value matches the ETag value of the requested resource. In that case the response contains a Content-Range field indicating which bytes are returned. Otherwise, the entire resource is returned. This field is used together with the Range field.

Proxy-Authorization

Proxy-Authorization: Basic dFDGADdjgjadfDSFJ5

After receiving an authentication challenge from the proxy server, the client sends a request containing this header field to give the proxy the credentials it requires.

Referer

The Referer field tells the server the URI of the page from which the request originated.

Response header fields

Accept-Ranges

The Accept-Ranges field tells the client whether the server can handle range requests, which ask for only part of a resource. The field takes one of two values: bytes if range requests are supported, and none if they are not.

ETag

The server assigns an ETag value to each resource, and the ETag value must change whenever the resource is updated. The ETag field is usually used together with the If-None-Match field. If the ETag value matches the If-None-Match value, the requested resource has not changed, and the server returns the 304 Not Modified status code. If they do not match, a 200 OK status code is returned. ETags are also divided into strong ETags and weak ETags, distinguished by whether the identifier begins with W/, for example:

"123456789"    -- a strong ETag validator
W/"123456789"  -- a weak ETag validator

See Wikipedia's HTTP ETag article for details.
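The practical difference between the two kinds of validator shows up when tags are compared. The sketch below follows the usual strong and weak comparison rules; the function names are hypothetical.

```python
def parse_etag(value: str):
    """Split an ETag into (is_weak, opaque_tag)."""
    value = value.strip()
    if value.startswith("W/"):
        return True, value[2:]
    return False, value


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: both tags must be strong and identical."""
    weak_a, tag_a = parse_etag(a)
    weak_b, tag_b = parse_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: the opaque tags match, ignoring any W/ prefix."""
    return parse_etag(a)[1] == parse_etag(b)[1]
```

Weak comparison is enough for cache revalidation with If-None-Match, while features that need byte-for-byte equality, such as range requests, call for strong comparison.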

Proxy-Authenticate && WWW-Authenticate

The Proxy-Authenticate field sends the client the authentication information required by the proxy server. It is usually used together with the Proxy-Authorization field.

The WWW-Authenticate field is used for HTTP access authentication. It is usually used together with the Authorization field.

Entity header fields

Entity header fields are headers that describe the entity carried in request and response messages. They supply entity-related information such as when the content was last updated.

Allow

The format is Allow: GET, POST. The Allow field tells the client which HTTP methods the resource supports. When the server receives an unsupported HTTP method, it returns a response with the status code 405 Method Not Allowed.

Content-Encoding

This field tells the client which content encoding the server applied to the entity body. There are four main content encodings: gzip, compress, deflate, and identity.
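A minimal sketch of how a server might pick and apply an encoding is shown below. `encode_body` is a hypothetical helper that only knows about gzip, and for brevity it ignores q weights, so `gzip;q=0` would wrongly count as accepted.

```python
import gzip


def encode_body(body: bytes, accept_encoding: str):
    """Return (headers, payload), gzip-compressing when the client allows it."""
    # Keep just the coding names, dropping any ";q=..." parameters.
    accepted = {item.split(";", 1)[0].strip().lower() for item in accept_encoding.split(",")}
    if "gzip" in accepted:
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body  # identity: send the body unchanged, no header needed
```

The client reads Content-Encoding to know that it must decompress the payload before using it.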

Content-Language && Content-Length

Content-Language tells the client the natural language used in the entity body. Content-Length tells the client the size of the entity body in bytes.

Content-Range && Content-Type

Content-Range is sent in response to a range request and tells the client which part of the entity the response contains. The field value gives, in bytes, the range being sent and the size of the entire entity. The format is Content-Range: bytes 5001-9999/10000. Byte positions start at 0, so the last byte of a 10000-byte entity is 9999.

Content-Type tells the client the media type of the entity body. It is written in type/subtype form, as in the Accept field.

Expires

The Expires field tells the client when the resource expires. When a proxy server receives a response with an Expires field, it caches the resource and answers requests for the same resource from the cache until the specified time. After that time, the proxy server forwards the request to the origin server. If you do not want the proxy server to cache the resource, set the Expires field to the same value as the Date field. On the browser side, when a requested resource has expired, the browser first makes a conditional request (the If-Modified-Since and Last-Modified fields) rather than fetching the whole resource again immediately.

When an Expires field appears together with a max-age directive in the Cache-Control field, the max-age directive takes precedence.
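To show how the Content-Range value is derived from a request, here is a hedged Python sketch. `content_range` is a hypothetical helper that handles only a single `bytes=first-last` or `bytes=first-` range; suffix ranges such as `bytes=-500` and multi-range requests are treated as unsupported.

```python
def content_range(range_header: str, total: int):
    """Build (status, headers) for a single 'bytes=first-last' Range request."""
    unit, _, spec = range_header.partition("=")
    first_text, _, last_text = spec.partition("-")
    if unit.strip() != "bytes" or not first_text.strip().isdigit():
        return 200, {}  # unsupported form: ignore Range and send everything
    first = int(first_text)
    # An open-ended range ("5001-") runs to the last byte of the entity.
    last = int(last_text) if last_text.strip().isdigit() else total - 1
    last = min(last, total - 1)  # byte positions are zero-based
    if first > last:
        return 416, {"Content-Range": f"bytes */{total}"}  # range not satisfiable
    headers = {
        "Content-Range": f"bytes {first}-{last}/{total}",
        "Content-Length": str(last - first + 1),
    }
    return 206, headers
```

A satisfiable range is answered with 206 Partial Content, and a range that starts beyond the end of the entity with 416 Range Not Satisfiable.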

Header fields for cookies

Because HTTP is a stateless protocol, cookies are used alongside HTTP to manage user state. For an explanation of cookies, see the article "Front-end storage solution".


References

1. Illustrated HTTP

2. MDN web docs

3. Wikipedia