
To learn more about HTTP, go to the Network Protocol HTTP section here.

To learn more about HTTPS, go to the Network Protocol Security section here.

Cookie

A cookie is a small piece of data (about 4 KB at most) stored on the client. It was invented to work around the statelessness of HTTP, so that the backend can track state such as login status through the cookie. Because of their security problems, cookies are gradually being abandoned for some uses, but many websites still rely on them: www.baidu.com, for example, sets a lot of cookies.

Cookies are mainly used for the following three aspects:

  • Session state management (such as user login status, shopping cart, game score, or other information that needs to be recorded)
  • Personalization (such as user-defined settings, themes, etc.)
  • Browser behavior tracking (e.g. tracking and analyzing user behavior)

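The use of cookies

Response header (the server sets the cookie): Set-Cookie: name=value

Request header (the browser sends the cookie back): Cookie: name=value

JavaScript access: document.cookie

You can set the following attributes: expiration time (validity period), domain, path, and applicable site (SameSite).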
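The exchange above can be sketched on the server side with Python's standard http.cookies module. This is only a minimal illustration: the second cookie, theme=dark, is a made-up value and not part of the original example.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie response header for name=value.
response_cookie = SimpleCookie()
response_cookie["name"] = "value"
set_cookie_header = response_cookie.output()

# Server side: parse the Cookie request header the browser sends back.
# "theme=dark" is a hypothetical second cookie used for illustration.
request_cookie = SimpleCookie()
request_cookie.load("name=value; theme=dark")
parsed = {key: morsel.value for key, morsel in request_cookie.items()}

print(set_cookie_header)
print(parsed)
```

Note that the Cookie request header carries only name=value pairs; attributes such as the expiration time travel only in Set-Cookie.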

Expiration time

Session cookies do not set an expiration time; they are deleted when the browser session ends.

Persistent cookies require an expiration time, which can be set through Expires or Max-Age.

  • Expires sets the date on which the cookie expires and is deleted
  • Max-Age sets the number of seconds after which the cookie expires
  • Internet Explorer 6, 7, and 8 do not support Max-Age; all browsers support Expires
Set-Cookie: id=123; Expires=Fri, 03 Sep 2021 16:56:41 GMT
Cookie: id=123
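As a rough sketch of how a server might produce both attributes, the following uses only the Python standard library. The helper names expires_value and persistent_cookie are hypothetical, and the one-hour lifetime is an arbitrary choice for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_value(moment: datetime) -> str:
    """Format a timezone-aware datetime as an HTTP date for Expires."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def persistent_cookie(name: str, value: str, now: datetime, lifetime: timedelta) -> str:
    """Build a Set-Cookie value carrying both Expires and Max-Age."""
    expires = expires_value(now + lifetime)
    max_age = int(lifetime.total_seconds())
    return f"{name}={value}; Expires={expires}; Max-Age={max_age}"


# A fixed "now" keeps the example reproducible.
now = datetime(2021, 9, 3, 15, 56, 41, tzinfo=timezone.utc)
header_value = persistent_cookie("id", "123", now, timedelta(hours=1))
print(header_value)
```

Sending both attributes is one way to cover old browsers that ignore Max-Age while letting newer ones use the relative value.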

Cookie scope

The Domain and Path attributes define the scope of the cookie: the URLs to which the cookie should be sent.

Domain (domain)

Domain specifies which hosts can receive the cookie. If it is not specified, it defaults to the host of the current document and does not include subdomains. If Domain is specified, subdomains are generally included. Specifying Domain is therefore less restrictive than omitting it, which is helpful when subdomains need to share information about a user.

Set-Cookie: id=123; Domain=developer.mozilla.org
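The Domain rule can be sketched as a small matching function. This is a simplified model of what browsers do, not their actual implementation, and the function name cookie_applies_to_host is made up for this example.

```python
from typing import Optional


def cookie_applies_to_host(request_host: str, origin_host: str, domain: Optional[str]) -> bool:
    """Decide whether a cookie set by origin_host is sent to request_host."""
    request_host = request_host.lower()
    if domain is None:
        # No Domain attribute: only the exact host that set the cookie matches.
        return request_host == origin_host.lower()
    domain = domain.lower().lstrip(".")
    # Domain given: the domain itself and all of its subdomains match.
    return request_host == domain or request_host.endswith("." + domain)


print(cookie_applies_to_host("developer.mozilla.org", "mozilla.org", None))
print(cookie_applies_to_host("developer.mozilla.org", "mozilla.org", "mozilla.org"))
```

The check for a leading dot before the suffix matters: without it, an unrelated host such as evilmozilla.org would match mozilla.org.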

Path (path)

The Path attribute specifies which paths under the host can receive the cookie. Subpaths are matched as well.

For example, if Path=/docs is set, the following paths all match:

  • /docs
  • /docs/Web/
  • /docs/Web/HTTP
Set-Cookie: id=123; Path=/docs
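The path rule can be sketched as follows. This is a simplified version of the usual cookie path-matching rule, and path_matches is a name chosen for this example.

```python
def path_matches(request_path: str, cookie_path: str) -> bool:
    """Return True if a cookie with the given Path is sent for request_path."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # A prefix only counts when it ends on a "/" boundary.
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


for candidate in ("/docs", "/docs/Web/", "/docs/Web/HTTP", "/docsets"):
    print(candidate, path_matches(candidate, "/docs"))
```

The boundary check is why /docsets does not match Path=/docs even though it starts with the same characters.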

Applicable site (SameSite Cookie)

To address cross-site request forgery (CSRF) at the source, Google drafted a proposal to improve the HTTP protocol: adding a SameSite attribute to the Set-Cookie response header to mark the cookie as a “same-site cookie”. A same-site cookie can only be used as a first-party cookie, not as a third-party cookie. SameSite has three values: Strict, Lax, and None.

SameSite=Strict

This mode, called strict mode, means that the cookie can never be used as a third-party cookie, without exception. For example, b.com sets the following cookies:

Set-Cookie: foo=1; SameSite=Strict
Set-Cookie: bar=2; SameSite=Lax
Set-Cookie: baz=3

When we make any request to b.com from a page on a.com, the cookie foo will never be included in the Cookie request header, while bar may be (subject to the Lax rules). For example, if the cookie that Taobao uses to identify whether a user is logged in were set to SameSite=Strict, then a user who clicks a link on a Baidu search page, or even on a Tmall page, to enter Taobao would not appear logged in, because the browser does not send that cookie to Taobao's server. No request to Taobao coming from another website carries the cookie.

SameSite=Lax

This mode, called loose mode, is more relaxed than Strict: if the request changes the current page or opens a new page, and is also a GET request, the cookie can be used as a third-party cookie. For example, b.com sets the following cookies:

Set-Cookie: foo=1; SameSite=Strict
Set-Cookie: bar=2; SameSite=Lax
Set-Cookie: baz=3

When a user clicks a link on a.com to enter b.com, the cookie foo will not be included in the Cookie request header, but bar and baz will. In other words, following links between different websites is not affected. However, bar will not be sent if the request to b.com from a.com is asynchronous, or if the page jump is triggered by a POST form submission.

If a SameSite cookie is set to Strict, the browser does not carry it in any cross-site request, not even when the site is reopened in a new tab from another site, so there is little chance of a CSRF attack.

The downside is that after jumping to a subdomain, or reopening a site you have just logged in to in a new tab, the previous cookie is not present. For sites that require login in particular, opening the site in a new tab or jumping to a subdomain site means logging in again, which is not a great experience for users.

If a SameSite cookie is set to Lax, the cookie is sent when another website navigates to the page, which preserves the user's login status when the page is opened from an external link. Correspondingly, its security is lower.

In summary, SameSite cookies are a possible alternative to same-origin validation, but Strict and Lax must be chosen appropriately.
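The Strict and Lax rules described above can be condensed into a small decision function. This is a deliberately simplified model under the assumptions stated in the text (Lax allows only top-level GET navigations); real browsers apply further rules, and cookie_is_sent is a hypothetical name.

```python
def cookie_is_sent(same_site: str, cross_site: bool, top_level_navigation: bool, method: str) -> bool:
    """Simplified model of whether a browser attaches a cookie to a request."""
    if not cross_site:
        # Same-site requests always carry the cookie.
        return True
    mode = same_site.lower()
    if mode == "strict":
        # Strict: never sent on cross-site requests.
        return False
    if mode == "lax":
        # Lax: only when the request navigates the page and uses GET.
        return top_level_navigation and method.upper() == "GET"
    # None: no same-site restriction in this model.
    return True


# Clicking a link from a.com to b.com (top-level GET navigation).
print(cookie_is_sent("Strict", True, True, "GET"))
print(cookie_is_sent("Lax", True, True, "GET"))
# A cross-site form POST from a.com to b.com.
print(cookie_is_sent("Lax", True, True, "POST"))
```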

Security

Use the HttpOnly attribute to prevent JavaScript from reading the cookie value.

Cookies marked Secure are only sent to the server over requests encrypted with the HTTPS protocol.

However, these two attributes offer only limited protection. HttpOnly hides the cookie from scripts, which is also inconvenient for developers who legitimately need to read the cookie.

Set-Cookie: id=123; Secure; HttpOnly
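A minimal sketch of setting these flags with Python's standard http.cookies module is shown below. The cookie id=123 is reused from the earlier example, and adding SameSite=Lax is an extra assumption made for illustration (the samesite key needs Python 3.8 or later).

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["id"] = "123"
cookie["id"]["secure"] = True      # only sent over HTTPS
cookie["id"]["httponly"] = True    # hidden from document.cookie
cookie["id"]["samesite"] = "Lax"   # assumption: added for illustration

header_line = cookie.output()
# Split the header into its attributes for easier inspection.
attributes = [part.strip() for part in header_line.split(";")[1:]]

print(header_line)
print(attributes)
```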

Advantages and disadvantages of Cookies

Advantages

  1. Cookies hold only a small amount of data, so they are simple and lightweight.
  2. Persisting data requires no server storage, since cookies are stored on the client and sent to the server to be read, for example to remember user preferences such as the chosen theme.

Disadvantages

  1. The small capacity is a problem in some scenarios: with only about 4 KB, cookies are not suitable for storing large data.
  2. Performance: cookies are added to the request header on every request, which wastes bandwidth.
  3. Security: cookies are easy to exploit, for example through XSS attacks.

Request method

Request methods are the ways HTTP operates on resources. Different methods are used for different operations, which makes the semantics more explicit.

A RESTful architecture maps the “create, delete, update, and query” operations on resources to the corresponding methods, so that each request carries clear semantics.

But some companies don’t bother and simply use POST for everything, haha.

In short, it depends on the company’s technical architecture: aim for real benefits rather than mere convenience.

GET

The GET method requests a representation of the specified resource. Requests using GET should only be used to retrieve data.

Typical examples are images, JavaScript, CSS, and the graphic CAPTCHA on a login page.

HEAD

The HEAD method requests only the headers of a resource, and these headers are the same as those a GET request would return. One use is to obtain the size of a large file before downloading it; since there is no response body, this saves bandwidth.

It can be summarized as a GET request with no response body.

POST

The POST method is used to submit an entity to the specified resource, often causing a state change or side effect on the server.

It is the most widely used method; projects that do not follow a RESTful architecture use POST for almost everything.

PUT

The PUT method replaces the target resource with the content submitted in the request.

Equivalent to an update operation.

DELETE

The DELETE method deletes the specified resource.

Equivalent to a delete operation.

CONNECT

The CONNECT method establishes a tunnel to the server identified by the target resource.

OPTIONS

The OPTIONS method describes the communication options supported for the target resource.

In fact, during project development we often see OPTIONS requests made automatically by the browser, as preflight requests for cross-origin (CORS) calls.

TRACE

The TRACE method performs a message loopback test along the path to the target resource.

PATCH

The PATCH method applies partial changes to a resource, unlike PUT, which replaces the whole resource.
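The difference between these methods can be illustrated with a toy in-memory resource store. This is only a sketch of the semantics described above, not a real server: the handle function, the store dictionary, and the chosen status codes are assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

Resource = Dict[str, str]
store: Dict[str, Resource] = {}  # hypothetical in-memory "database"


def handle(method: str, key: str, body: Optional[Resource] = None) -> Tuple[int, Optional[Resource]]:
    """Apply an HTTP-style method to the store and return (status, body)."""
    method = method.upper()
    if method in ("GET", "HEAD"):
        if key not in store:
            return 404, None
        # HEAD behaves like GET but returns no body.
        return 200, (dict(store[key]) if method == "GET" else None)
    if method == "POST":
        if key in store:
            return 409, None
        store[key] = dict(body or {})
        return 201, dict(store[key])
    if method == "PUT":
        created = key not in store
        store[key] = dict(body or {})  # replace the whole resource
        return (201 if created else 200), dict(store[key])
    if method == "PATCH":
        if key not in store:
            return 404, None
        store[key].update(body or {})  # change only the given fields
        return 200, dict(store[key])
    if method == "DELETE":
        if key not in store:
            return 404, None
        del store[key]
        return 204, None
    return 405, None
```

For instance, after creating a resource with two fields, a PATCH carrying one field leaves the other untouched, while a PUT carrying one field drops the other.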

Headers

Headers include request headers and response headers.

Request header: information sent from the client to the server, used by the server to determine the client’s state, login status, and so on.

Response header: information sent from the server to the client, used by the client to interpret the response.

Both request headers and response headers can be customized. They take the form key: value, and the key is case-insensitive.

For example, a custom request header: token: xxxYYy

Because there are too many fields to cover them all, please refer to a complete list of header fields.
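Case-insensitive lookup of header names can be demonstrated with the standard email.message.Message class, which Python's HTTP libraries build on. The token header is the custom example from above; the Content-Type value is an assumption added for illustration.

```python
from email.message import Message

headers = Message()
headers["Token"] = "xxxYYy"
headers["Content-Type"] = "application/json"

# Lookups ignore the case of the field name; values keep their case.
token = headers["token"]
content_type = headers["CONTENT-TYPE"]
missing = headers["X-Not-Set"]  # absent headers yield None

print(token, content_type, missing)
```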

Request header fields

Name | Description | Example
--- | --- | ---
Accept | Media types acceptable in the response | Accept: text/html
Accept-Charset | Acceptable character sets | Accept-Charset: utf-8
Accept-Encoding | Supported content compression methods | Accept-Encoding: gzip, deflate
Accept-Language | Acceptable languages for the response | Accept-Language: zh-CN
Access-Control-Request-Method, Access-Control-Request-Headers | Announce the request method and request headers of a cross-origin (non-same-origin) request | Access-Control-Request-Method: GET
Cache-Control | Sets the cache policy | Cache-Control: no-cache
Connection | Controls persistent connections. Cannot be used in HTTP/2 | Connection: keep-alive
Content-Encoding | Tells the recipient how to decode the body to obtain the media type identified in Content-Type | Content-Encoding: gzip
Content-Length | Size of the message body (request body) sent to the recipient | Content-Length: 348
Content-Type | Media type of the request body | Content-Type: application/x-www-form-urlencoded
Cookie | Cookies the server previously stored on the client through the Set-Cookie field | Cookie: $Version=1; Skin=new;
Expect | Indicates that the client requires specific server behavior | Expect: 100-continue
Forwarded | Original information about a client that connects to the web server through an HTTP proxy | Forwarded: for=192.0.2.60; proto=http; by=203.0.113.43 or Forwarded: for=192.0.2.43, for=198.51.100.17
Host | The domain name of the server (for virtual hosting) and the TCP port number on which the server is listening. The port can be omitted if it is the standard port for the requested service. Required since HTTP/1.1. Should not be used in requests generated directly in HTTP/2 | Host: en.wikipedia.org:8080 or Host: en.wikipedia.org
If-Match (HTTP/1.1) | The operation is performed only if the entity provided by the client matches the same entity on the server. Mainly used with methods like PUT, to update a resource only if it has not changed since the user last updated it | If-Match: "737060cd8c284d8af7ad3082f209582d"
If-Modified-Since | If the resource has not changed since the given time, 304 is returned and the negotiated cache is used | If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT
If-None-Match | If the resource still matches the given ETag (has not changed), 304 is returned | If-None-Match: "737060cd8c284d8af7ad3082f209582d"
Origin | Indicates which site the request comes from. It identifies only the origin (scheme, host, and port) and contains no path information. Sent with CORS requests and POST requests. Apart from carrying no path information, it is similar to the Referer header | Origin: http://www.example-social-network.com
Range | Requests only part of the entity. Bytes are numbered from 0 | Range: bytes=500-999
Referer | The address of the previous web page, from which a link to the currently requested page was followed. (The word “referrer” is misspelled in the RFC, as well as in most implementations, so much so that it has become standard usage and is considered the correct term.) | Referer: http://en.wikipedia.org/wiki/Main_Page
Transfer-Encoding | The form of encoding used to safely transfer the entity to the user. Currently defined methods: chunked, compress, deflate, gzip, identity. Must not be used with HTTP/2 | Transfer-Encoding: chunked
User-Agent | Lets the network peer identify the application type, operating system, software vendor, and version of the user agent making the request | User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0
Via | Informs the server of the proxies through which the request was sent | Via: 1.0 fred, 1.1 example.com (Apache/1.1)
Warning | A general warning about possible problems with the message body | Warning: 199 Miscellaneous warning
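To make the Range header more concrete, here is a small sketch that parses a single byte range and builds the matching Content-Range response value. It handles only one range per header, and the helper names parse_byte_range and content_range are made up for this example.

```python
from typing import Tuple


def parse_byte_range(range_header: str, total_size: int) -> Tuple[int, int]:
    """Return the inclusive (start, end) of a single 'bytes=start-end' range."""
    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    first, _, last = spec.strip().partition("-")
    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the resource.
        length = int(last)
        start, end = max(total_size - length, 0), total_size - 1
    else:
        start = int(first)
        # An open end ("bytes=500-") runs to the end of the resource.
        end = int(last) if last else total_size - 1
        end = min(end, total_size - 1)
    if start > end:
        raise ValueError("range not satisfiable")
    return start, end


def content_range(start: int, end: int, total_size: int) -> str:
    """Build the Content-Range value for a 206 Partial Content response."""
    return f"bytes {start}-{end}/{total_size}"


start, end = parse_byte_range("bytes=500-999", 47022)
print(content_range(start, end, 47022))
```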

Common non-standard request fields

Field name | Description | Example
--- | --- | ---
DNT | The DNT (Do Not Track) request header indicates the user's preference about site tracking. It lets users state whether they care more about privacy or about customized content | DNT: 1 (Do Not Track enabled) or DNT: 0 (Do Not Track disabled)
X-Forwarded-For | X-Forwarded-For (XFF) is used to obtain the IP address of the client that originally initiated the request when the client reaches the server through an HTTP proxy or load balancer. It has become a de facto standard. When traffic between the client and the server passes through such an intermediary, the server access log can only record the IP address of the proxy or load balancer; X-Forwarded-For comes in handy if you want the IP address of the client that made the original request | X-Forwarded-For: client1, proxy1, proxy2 or X-Forwarded-For: 129.78.138.66, 129.78.64.103
X-Forwarded-Host | A de facto standard header used to determine the original domain name given in the Host header of the client's request. The domain name or port of a reverse proxy (such as a load balancer or CDN) may differ from that of the origin server that handles the request; in this case, X-Forwarded-Host can be used to determine which domain name was originally accessed | X-Forwarded-Host: en.wikipedia.org:8080 or X-Forwarded-Host: en.wikipedia.org
X-Forwarded-Proto | X-Forwarded-Proto (XFP) is a de facto standard header that identifies the protocol (HTTP or HTTPS) used between the client and a proxy server or load balancer. The server access log records the protocol used between the server and the load balancer, not the one used between the client and the load balancer; to determine the latter, X-Forwarded-Proto comes in handy | X-Forwarded-Proto: https
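Reading the original client address from X-Forwarded-For can be sketched as below. The helper names are hypothetical, and since the header is supplied by the client side of the connection, a real service should only trust values added by its own proxies.

```python
from typing import List, Optional


def forwarded_chain(xff_header: str) -> List[str]:
    """Split an X-Forwarded-For value into its ordered list of addresses."""
    return [part.strip() for part in xff_header.split(",") if part.strip()]


def original_client(xff_header: str) -> Optional[str]:
    """Return the left-most entry, which is the client that made the request."""
    chain = forwarded_chain(xff_header)
    return chain[0] if chain else None


print(original_client("129.78.138.66, 129.78.64.103"))
print(forwarded_chain("client1, proxy1, proxy2"))
```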

Response header fields

Field name | Description | Example
--- | --- | ---
Access-Control-Allow-Origin, Access-Control-Allow-Credentials, Access-Control-Expose-Headers, Access-Control-Max-Age, Access-Control-Allow-Methods, Access-Control-Allow-Headers | Specify which sites can participate in CORS | Access-Control-Allow-Origin: *
Accept-Patch | Tells the browser which media types the server understands in requests | Accept-Patch: text/example; charset=utf-8
Accept-Ranges | Indicates that the server supports partial (range) requests. The value defines the unit used for range requests | Accept-Ranges: bytes
Age | The length of time, in seconds, that the object has been stored in a proxy cache | Age: 12
Allow | Lists the set of HTTP methods supported by the resource. If the server returns the status code 405 Method Not Allowed, this header must be sent to the client as well. An empty Allow value means the resource accepts no HTTP method at all, which can happen, for example, when the server needs to temporarily block all access to the resource | Allow: GET, HEAD
Alt-Svc | Alt-Svc stands for Alternative Service. This header lists alternative ways of accessing the current site. It is generally used to remain backward compatible while supporting emerging protocols such as QUIC | Alt-Svc: http/1.1="http2.example.com:8001"; ma=7200
Cache-Control | Tells all caching mechanisms between the server and the client whether they may cache this object. max-age is measured in seconds | Cache-Control: max-age=3600
Connection | Controls persistent connections. Cannot be used in HTTP/2 | Connection: close
Content-Encoding | Encoding format of the response body | Content-Encoding: gzip
Content-Language | Language of the response content | Content-Language: zh-cn
Content-Length | Length of the returned data | Content-Length: 348
Content-Location | Specifies an alternative address for the returned data. Its primary use is to give the URL that results from content negotiation for the requested resource. Location and Content-Location differ: Location gives the destination of a redirect (or the URL of a newly created resource), while Content-Location points to the direct address of the resource, with no further content negotiation needed. Location belongs to the response, while Content-Location belongs to the entity being returned | Content-Location: /index.html
Content-Range | Shows where a partial piece of data sits within the entire file | Content-Range: bytes 21010-47021/47022
Content-Type | Media type of the returned data | Content-Type: text/html; charset=utf-8
Date | The time when the response was created | Date: Tue, 15 Nov 1994 08:12:31 GMT
ETag | An identifier for a specific version of the resource. It makes caching more efficient and saves bandwidth, because the web server does not need to send the full response if the content has not changed. If the content has changed, ETags help prevent simultaneous updates of a resource from overwriting each other (“mid-air collisions”). A new ETag value must be generated whenever the resource at a given URL changes. ETags are therefore similar to fingerprints: comparing them quickly shows whether a resource has changed, but a tracking server can also persist them indefinitely, so some servers may use them for tracking | ETag: "737060cd8c284d8af7ad3082f209582d"
Expires | Expiration time of the response | Expires: Thu, 01 Dec 1994 16:00:00 GMT
Last-Modified | Contains the date and time at which the origin server believes the resource was last modified. It is often used as a validator to determine whether a received or stored resource is the same. Because it is less accurate than ETag, it serves as a fallback mechanism. Conditional requests carrying If-Modified-Since or If-Unmodified-Since headers use this field | Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT
Location | Destination address of a redirect, or the URL of a newly created resource (see Developer.mozilla.org/zh-CN/docs/…) | Example 1: Location: http://www.w3.org/pub/WWW/People.html; Example 2: Location: /pub/WWW/People.html
Set-Cookie | Sets a cookie from the server | Set-Cookie: UserID=JohnDoe; Max-Age=3600; Version=1
Via | A general header added by proxy servers, both forward and reverse, that appears in both request and response headers. It can be used to track message forwarding, prevent request loops, and identify the protocol capabilities of senders along the request or response chain | Via: 1.0 fred, 1.1 example.com (Apache/1.1)
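The interplay between ETag and If-None-Match from the tables above can be sketched as a tiny conditional-response function. This is an illustration only: deriving the ETag from a SHA-256 hash of the body is one possible choice, not something HTTP requires, and make_etag and respond are hypothetical names.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the content (an assumed strategy)."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload), using 304 when the ETag matches."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None and if_none_match.strip() == etag:
        # The client's cached copy is still valid: no body is sent.
        return 304, headers, b""
    headers["Content-Length"] = str(len(body))
    return 200, headers, body


status, headers, payload = respond(b"hello")
revalidated_status, _, revalidated_payload = respond(b"hello", headers["ETag"])
print(status, revalidated_status)
```

The first request returns the full body with its ETag; repeating the request with that ETag in If-None-Match yields a 304 with an empty body, which is where the bandwidth saving comes from.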

Common non-standard response fields

Field name | Description | Example
--- | --- | ---
Content-Security-Policy | Allows the site administrator to control which resources the user agent may load for a given page. With a few exceptions, policies mostly involve specifying server origins and script endpoints. This helps guard against cross-site scripting (XSS) attacks | Content-Security-Policy: <policy-directive>; <policy-directive>

Status code

The HTTP status code describes the outcome of the current request.

Responses fall into five categories: informational responses (100-199), successful responses (200-299), redirections (300-399), client errors (400-499), and server errors (500-599).
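The five categories follow directly from the first digit of the code, which can be sketched with Python's standard http.HTTPStatus enum. The helper names status_class and describe are made up for this example.

```python
from http import HTTPStatus


def status_class(code: int) -> str:
    """Map a status code to one of the five response categories."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return classes[code // 100]


def describe(code: int) -> str:
    """Combine the standard reason phrase with the category."""
    return f"{code} {HTTPStatus(code).phrase}: {status_class(code)}"


print(describe(404))
print(describe(204))
```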

Informational responses (100-199)

Status code | Description
--- | ---
100 Continue | This interim response indicates that everything so far is OK. The client should continue the request, or ignore the response if the request is already finished.
101 Switching Protocols | Sent in response to the client's Upgrade request header, indicating the protocol the server is switching to.
103 Early Hints | Mainly used together with the Link header, to let the user agent start preloading resources while the server is still preparing the response.

Successful responses (200-299)

Status code | Description
--- | ---
200 OK | The request succeeded.
201 Created | The request succeeded and a new resource was created; it is returned either in the response body or referenced in the response headers. This is usually the response to a POST request, or to some PUT requests.
202 Accepted | The request has been received but not yet acted upon, so there is no result. HTTP has no way to send an asynchronous response later with the outcome; this code is meant for cases where another process or service handles the request, or for batch processing.
204 No Content | A success status indicating that the request succeeded but the client does not need to leave the current page. By default, 204 responses are cacheable, and an ETag header is included in this type of response. By convention, 204 No Content is returned for a PUT that updates a resource without requiring any change to the page currently shown to the user. If a resource was created, return 201 Created. If the page should change to the newly updated page, use 200 instead.
205 Reset Content | The server successfully processed the request and returns no content. Unlike a 204 response, this status code asks the requester to reset the document view. It is primarily used to reset a form right after receiving user input, so that the user can easily start another entry. Like a 204 response, it must not contain a message body and ends at the first blank line after the headers. Since pages are now mostly refreshed partially, this status code is rarely used.
206 Partial Content | The server has successfully processed a partial GET request. HTTP download tools such as FlashGet or Xunlei use this type of response to resume interrupted downloads or to split a large document into several segments downloaded at the same time. The request must contain a Range header indicating the range of content the client expects, and may contain If-Range as a request condition.

Redirections (300-399)

Status code | Description
--- | ---
300 Multiple Choices | The requested resource has several possible responses, each with its own address and browser-driven negotiation information. The user or browser can choose a preferred address for redirection.
301 Moved Permanently (HTTP/1.0) | Permanent redirection: the requested resource has been moved to the URL given in the Location header, and this will not change. Search engines update their links based on this response. Although the standard requires that the browser not modify the HTTP method and body when following the redirect, some browsers do. So it is better to use 301 only for GET or HEAD methods, and use 308 instead of 301 otherwise.
302 Found (HTTP/1.0) | The requested resource temporarily responds from a different URI. Since the redirect is temporary, the client should keep sending future requests to the original address. The response is cacheable only if Cache-Control or Expires says so.
303 See Other | The response to the current request can be found at another URI, and the client should access that resource using GET. It exists primarily so that the output of a script-activated POST request can be redirected to a new resource.
304 Not Modified | If the client sends a conditional GET request, the request is allowed, and the content of the document has not changed (since the last access or according to the conditions of the request), the server should return this status code. A 304 response must not contain a message body and therefore always ends at the first blank line after the headers.
307 Temporary Redirect (HTTP/1.1) | The requested resource temporarily responds from a different URI. Since the redirect is temporary, the client should keep sending future requests to the original address. Unlike 302, the client must not change the request method. The response is cacheable only if Cache-Control or Expires says so.
308 Permanent Redirect (HTTP/1.1) | The resource is now permanently located at another URI, given by the Location response header. It has the same semantics as 301 Moved Permanently, except that the user agent must not change the HTTP method: if POST was used in the first request, POST must be used in the second request.
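How the request method carries over a redirect can be summarized in a small function. It models the behavior described in the table (some clients rewrite POST to GET on 301 and 302, while 307 and 308 preserve the method); it is a simplified sketch, and method_after_redirect is a hypothetical name.

```python
def method_after_redirect(status: int, method: str) -> str:
    """Return the method a typical client uses for the redirected request."""
    method = method.upper()
    if status == 303:
        # See Other: the new resource is fetched with GET (HEAD stays HEAD).
        return "HEAD" if method == "HEAD" else "GET"
    if status in (301, 302):
        # Many clients rewrite POST to GET here, despite the standard.
        return "GET" if method == "POST" else method
    if status in (307, 308):
        # The method must be preserved.
        return method
    raise ValueError("not a redirect status handled by this sketch")


print(method_after_redirect(301, "POST"))
print(method_after_redirect(308, "POST"))
```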

Client errors (400-499)

Status code | Description
--- | ---
400 Bad Request | 1. The current request cannot be understood by the server; the client should not resubmit it without modification. 2. The request parameters are incorrect.
401 Unauthorized | The current request is not authenticated, for example because the user is not logged in.
403 Forbidden | The server refuses to serve the request, for example because access to the server or the site has been blocked.
404 Not Found | The requested resource could not be found, either because the address is wrong or because the backend has no such interface.
405 Method Not Allowed | The request method is not allowed for this resource.
406 Not Acceptable | The content characteristics of the requested resource do not satisfy the conditions in the request headers, so no response entity can be generated.
408 Request Timeout | The request timed out: the client did not finish sending the request within the time the server was prepared to wait. The client can submit the request again at any time without changes.
409 Conflict | The request could not be completed because it conflicts with the current state of the requested resource. This code should only be used when the user is expected to be able to resolve the conflict and resubmit the request. The response should contain enough information for the user to identify the source of the conflict.
429 Too Many Requests | The user has sent too many requests in a given amount of time (“rate limiting”).

Server errors (500-599)

Status code | Description
--- | ---
500 Internal Server Error | The server encountered a situation that it does not know how to handle.
501 Not Implemented | The request method is not supported by the server and cannot be processed. Only GET and HEAD are required to be supported by servers.
503 Service Unavailable | The server is temporarily unable to handle the request, most likely because it is restarting.
504 Gateway Timeout | Returned when the server, acting as a gateway, cannot get a response in time.
505 HTTP Version Not Supported | The server does not support the HTTP protocol version used in the request.


If you don’t understand anything, or if there are any inadequacies or mistakes in my article, please point them out in the comments section. Thank you for reading.