Cookie
A cookie is a small piece of data (about 4 KB at most) stored on the client. It was invented to work around the statelessness of HTTP, so that the backend can track login state through the cookie. Because of security concerns, cookies are gradually being phased out for some uses, yet many websites still rely on them; www.baidu.com, for example, sets a lot of cookies.
Cookies are mainly used for three purposes:
- Session state management (such as user login status, shopping cart, game score, or other information that needs to be recorded)
- Personalization (such as user-defined settings, themes, etc.)
- Browser behavior tracking (e.g. tracking and analyzing user behavior)
The use of cookies
Response header: Set-Cookie: name=value
Request header: Cookie: name=value
JavaScript access: document.cookie
You can set the following attributes: expiration time (Expires or Max-Age), domain, path, applicable site (SameSite), and the security flags Secure and HttpOnly.
Expiration time
Session cookies do not need an expiration time; they are deleted when the browser session ends.
Persistent cookies require an expiration time, which can be set through Expires or Max-Age.
Expires
Sets the date at which the cookie expires and is deleted.
Max-Age
Sets the number of seconds until the cookie expires.
- Internet Explorer 6, 7, and 8 do not support Max-Age; all browsers support Expires.
Set-Cookie: id=123; Expires=Fri, 03 Sep 2021 16:56:41 GMT
Cookie: id=123
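As a rough illustration of the two expiration attributes, the sketch below builds a Set-Cookie value with Python's standard http.cookies module. The helper name build_persistent_cookie and the one-hour lifetime are assumptions made for this example, not something the article prescribes.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from http.cookies import SimpleCookie


def build_persistent_cookie(name: str, value: str,
                            lifetime: timedelta, now: datetime) -> str:
    """Return a Set-Cookie header value carrying both Max-Age and Expires."""
    cookie = SimpleCookie()
    cookie[name] = value
    # Max-Age is a number of seconds counted from when the cookie is received.
    cookie[name]["max-age"] = int(lifetime.total_seconds())
    # Expires is an absolute HTTP date, e.g. "Fri, 03 Sep 2021 16:56:41 GMT".
    cookie[name]["expires"] = format_datetime(now + lifetime, usegmt=True)
    return cookie[name].OutputString()


# Hypothetical issue time, chosen so the expiry matches the header shown above.
issued_at = datetime(2021, 9, 3, 15, 56, 41, tzinfo=timezone.utc)
header_value = build_persistent_cookie("id", "123", timedelta(hours=1), issued_at)
print("Set-Cookie:", header_value)
```

Sending both attributes is a common compatibility choice: clients that understand Max-Age generally prefer it, while older clients fall back to Expires.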
Cookie scope
The Domain and Path attributes define the scope of a cookie: the URLs to which the cookie should be sent.
Domain
Domain specifies which hosts can receive the cookie. If it is not specified, it defaults to the host that set the cookie (the origin), and subdomains are not included. If Domain is specified, subdomains are included. Specifying Domain is therefore less restrictive than omitting it, which is helpful when subdomains need to share information about a user.
Set-Cookie: id=123; Domain=developer.mozilla.org
Cookie: id=123
Path
The Path attribute specifies which paths under the host receive the cookie; subpaths also match.
For example, if Path=/docs is set, the following addresses all match:
/docs
/docs/Web/
/docs/Web/HTTP
Set-Cookie: id=123; Path=/docs
Cookie: id=123
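The Domain and Path rules above can be sketched as two small predicates. This is a simplified illustration, not a full implementation of the cookie specification; the function names and the set_by_host parameter are assumptions made for the example.

```python
from typing import Optional


def domain_matches(request_host: str, set_by_host: str,
                   domain_attr: Optional[str]) -> bool:
    """Decide whether a cookie is in scope for request_host."""
    request_host = request_host.lower()
    if domain_attr is None:
        # No Domain attribute: only the exact host that set the cookie matches.
        return request_host == set_by_host.lower()
    domain = domain_attr.lstrip(".").lower()
    # Domain given: the domain itself and all of its subdomains match.
    return request_host == domain or request_host.endswith("." + domain)


def path_matches(request_path: str, cookie_path: str) -> bool:
    """Decide whether request_path falls under the cookie's Path."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # Match only on a "/" boundary, so "/docs" does not match "/docsets".
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


print(domain_matches("developer.mozilla.org", "mozilla.org", None))
print(domain_matches("developer.mozilla.org", "mozilla.org", "mozilla.org"))
print(path_matches("/docs/Web/HTTP", "/docs"))
```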
Applicable site (SameSite cookie)
To solve the cross-site request forgery (CSRF) problem at its source, Google drafted a proposal to improve the HTTP protocol: adding a SameSite attribute to the Set-Cookie response header. The attribute marks the cookie as a "same-site cookie", which can only be used as a first-party cookie and never as a third-party cookie. SameSite has three values: Strict, Lax, and None.
SameSite=Strict
This mode, called strict mode, means the cookie can never be used as a third-party cookie, without exception. For example, b.com sets the following cookies:
Set-Cookie: foo=1; SameSite=Strict
Set-Cookie: bar=2; SameSite=Lax
Set-Cookie: baz=3
When any request is made to b.com from a page on a.com, the cookie foo is never included in the Cookie request header, while bar is included only in the cases that Lax allows. For example, if the cookie that Taobao uses to identify logged-in users is set to SameSite=Strict, then a user who follows a link from a Baidu search page, or even from a Tmall page, arrives at Taobao logged out, because Taobao's server never receives that cookie: no request to Taobao that originates from another website carries it.
SameSite=Lax
This mode, called loose mode, is more relaxed than Strict: if the request is a top-level navigation (it changes the current page or opens a new page) and is also a GET request, the cookie may be sent as a third-party cookie. For example, b.com sets the following cookies:
Set-Cookie: foo=1; SameSite=Strict
Set-Cookie: bar=2; SameSite=Lax
Set-Cookie: baz=3
When a user clicks a link on a.com to enter b.com, the cookie foo is not included in the Cookie request header, but bar and baz are. In other words, following links between different websites is not affected. However, bar is not sent if the request from a.com to b.com is asynchronous, or if the page navigation is triggered by a form submitted with POST.
If SameSite is set to Strict, the browser does not send the cookie on any cross-site request, not even when another site opens the page in a new tab, so a CSRF attack has almost no chance of succeeding.
The cost is usability: when the user arrives from another site through a link or a newly opened tab, the previous cookie is not sent. For a website that requires login, the user then appears logged out and has to log in again, which may not be a good experience.
If SameSite is set to Lax, the cookie is sent when another website navigates to the page, which preserves the user's login state when the page is opened from an external link. Correspondingly, the protection it offers is weaker.
In summary, SameSite cookies are a possible alternative to same-origin validation, but Strict and Lax must be chosen appropriately.
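The Strict and Lax rules described above can be condensed into one decision function. This is a minimal sketch under simplifying assumptions: the function name and parameters are hypothetical, "safe method" is reduced to GET as in the text, and real browsers apply further conditions (for example, None normally also requires Secure).

```python
def cookie_is_sent(same_site: str, cross_site: bool,
                   method: str, top_level_navigation: bool) -> bool:
    """Return True if a browser following the rules above attaches the cookie."""
    if not cross_site:
        return True  # same-site requests always carry the cookie
    mode = same_site.lower()
    if mode == "strict":
        return False  # never sent on cross-site requests
    if mode == "lax":
        # Only top-level navigations made with GET qualify.
        return top_level_navigation and method.upper() == "GET"
    # "none": sent on cross-site requests as well
    return mode == "none"


# A user on a.com clicks a link to b.com (top-level GET navigation).
print(cookie_is_sent("Strict", True, "GET", True))   # foo
print(cookie_is_sent("Lax", True, "GET", True))      # bar
# A form on a.com posts to b.com.
print(cookie_is_sent("Lax", True, "POST", True))     # bar
```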
Security
Use the HttpOnly attribute to prevent JavaScript from reading the cookie value.
Cookies marked Secure are sent to the server only over requests encrypted with HTTPS.
However, these two attributes alone are not a complete security measure. HttpOnly keeps scripts from reading the cookie, which can also be inconvenient for developers when they do need to read it from JavaScript.
Set-Cookie: id=123; Secure; HttpOnly
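For illustration, here is one way to emit these flags from server-side Python using the standard http.cookies module. The cookie name id and the added SameSite=Lax value are assumptions for the example; the samesite key needs Python 3.8 or later.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["id"] = "123"
cookie["id"]["secure"] = True      # only sent over HTTPS
cookie["id"]["httponly"] = True    # hidden from document.cookie
cookie["id"]["samesite"] = "Lax"   # assumption: Lax chosen for the example

header_line = cookie.output()      # a full "Set-Cookie: ..." header line
print(header_line)
```

Flag attributes such as Secure and HttpOnly carry no value; they are simply present or absent in the header.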
Advantages and disadvantages of Cookies
Advantages
- The amount of data stored is small; whether that is a problem depends on the scenario. Cookies are simple and lightweight.
- Data persistence does not require any server resources, since cookies are stored on the client and sent to the server to be read, for example to remember user choices such as the selected theme.
Disadvantages
1. The amount of data that can be stored is small, which is a problem in some scenarios. With only about 4 KB, cookies are not suitable for storing large data.
2. Performance problems. Cookies are added to the request header of every request that is sent, which wastes bandwidth.
3. Security issues. Cookies are easy to exploit, for example through XSS attacks.
Request method
Request methods are the HTTP methods used to operate on resources. Different methods act on resources in different ways, which makes the semantics of a request more explicit.
RESTful architectural design maps the "create, delete, update, and query" operations on resources to these methods, which requires using the matching semantics and makes the intent more explicit.
But some companies don't use it and send everything as POST requests, haha.
In short, it depends on the company's technical architecture: use what brings real benefit rather than unnecessary convenience.
GET
The GET method requests a representation of a specified resource. Requests using GET should only retrieve data.
Examples include images, JavaScript, CSS, and login CAPTCHA images.
HEAD
The HEAD method requests only the headers of a resource; these are the same headers a GET request would return. One use is to check the size of a large file before downloading it. Because there is no response body, this saves bandwidth.
This can be summarized as a GET request with no response body.
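To make the "GET without a body" idea concrete, the sketch below starts a throwaway server on the loopback interface and compares a HEAD and a GET request using only the standard library. The handler, payload, and helper names are assumptions for the demo.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAYLOAD = b"hello world"


class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self) -> None:
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()

    def do_GET(self) -> None:
        self._send_headers()
        self.wfile.write(PAYLOAD)

    def do_HEAD(self) -> None:
        self._send_headers()  # same headers as GET, but no body

    def log_message(self, format, *args) -> None:
        pass  # keep the demo quiet


def fetch(method: str, port: int):
    """Return (Content-Length header, body) for one request."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, "/")
    response = conn.getresponse()
    body = response.read()
    length = response.getheader("Content-Length")
    conn.close()
    return length, body


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
head_result = fetch("HEAD", port)
get_result = fetch("GET", port)
server.shutdown()
server.server_close()
print(head_result, get_result)
```

Both responses advertise the same Content-Length, but only the GET response actually transfers the bytes.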
POST
The POST method is used to submit an entity to a specified resource, often resulting in a state change or side effect on the server.
It is the most widely used method; projects that do not follow a RESTful architecture use POST for almost every request.
PUT
The PUT method replaces the target resource with the content submitted in the request.
It is equivalent to an update operation.
DELETE
The DELETE method deletes the specified resource.
This operation is equivalent to deleting.
CONNECT
The CONNECT method establishes a tunnel to the server identified by the target resource.
OPTIONS
The OPTIONS method describes the communication options supported for the target resource.
TRACE
The TRACE method performs a message loopback test along the path to the target resource.
In practice, TRACE is rarely seen during project development and is often disabled on servers. The requests that browsers do send automatically are OPTIONS preflight requests for cross-origin calls.
PATCH
The PATCH method applies partial changes to a resource, unlike PUT, which replaces the resource as a whole.
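The difference between PUT and PATCH can be illustrated with a plain dictionary standing in for a stored resource. This is only a sketch: the function names are hypothetical, and real PATCH formats (such as JSON Merge Patch) define more rules than a simple dictionary update.

```python
from typing import Any, Dict

Resource = Dict[str, Any]


def apply_put(stored: Resource, submitted: Resource) -> Resource:
    """PUT: the submitted representation replaces the stored one entirely."""
    return dict(submitted)


def apply_patch(stored: Resource, changes: Resource) -> Resource:
    """PATCH: only the submitted fields change; the rest is kept."""
    merged = dict(stored)
    merged.update(changes)
    return merged


user = {"name": "Alice", "theme": "dark", "score": 10}
after_put = apply_put(user, {"name": "Alice", "theme": "light"})
after_patch = apply_patch(user, {"theme": "light"})
print(after_put)
print(after_patch)
```

After the PUT, the score field is gone because it was not part of the submitted representation; after the PATCH it is still there.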
Headers
Headers include request headers and response headers.
Request header: information sent from the client to the server, used for example to determine the client's state or login status.
Response header: information passed from the server to the client, which the client uses to interpret the response.
Both request headers and response headers can be customized. In key: value, the key (the header name) is case-insensitive.
For example, a custom request header: token: xxxYYy
Because there are too many fields to cover here, please refer to the full list of header fields for the rest.
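Case-insensitive header names are easy to check in Python: email.message.Message, the class that http.client builds its response header objects on, looks names up without regard to case. The header values below are the illustrative ones from the text.

```python
from email.message import Message

headers = Message()
headers["Token"] = "xxxYYy"
headers["Content-Type"] = "text/html"

# The name matches regardless of case; the value is returned unchanged.
lower_lookup = headers["token"]
upper_lookup = headers["CONTENT-TYPE"]
print(lower_lookup, upper_lookup)
```

Only the name is matched case-insensitively here; the value is stored and returned exactly as written.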
Request header field
Name | Description | Example |
---|---|---|
Accept | Media types acceptable in the response | Accept: text/html |
Accept-Charset | Acceptable character sets | Accept-Charset: utf-8 |
Accept-Encoding | Supported content compression methods | Accept-Encoding: gzip, deflate |
Accept-Language | Acceptable natural languages | Accept-Language: zh-CN |
Access-Control-Request-Method, Access-Control-Request-Headers | Sent with a cross-origin (CORS preflight) request to declare the method and headers the actual request will use | Access-Control-Request-Method: GET |
Cache-Control | Used to set the cache policy | Cache-Control: no-cache |
Connection | Controls persistent (keep-alive) connections. Cannot be used in HTTP/2 | Connection: keep-alive |
Content-Encoding | Tells the recipient how to decode the body in order to obtain the media type identified by Content-Type | Content-Encoding: gzip |
Content-Length | Used to indicate the size of the message body (request body) to be sent to the recipient | Content-Length: 348 |
Content-Type | Sets the request body type | Content-Type: application/x-www-form-urlencoded |
Cookie | Cookies that the server previously stored on the client through the Set-Cookie field | Cookie: $Version=1; Skin=new; |
Expect | Indicates that the client requires specific server behavior | Expect: 100-continue |
Forwarded | Original information about a client that connects to the web server through an HTTP proxy | Forwarded: for=192.0.2.60; proto=http; by=203.0.113.43 Forwarded: for=192.0.2.43, for=198.51.100.17 |
Host | The domain name of the server (for virtual hosting) and the TCP port number on which the server is listening. The port number can be omitted if it is the standard port for the requested service. Required starting with HTTP/1.1. If the request is generated directly in HTTP/2, it should not be used. | Host: en.wikipedia.org:8080 Host: en.wikipedia.org |
If-Match (HTTP/1.1) | The operation is performed only if the entity supplied by the client matches the same entity on the server. This is mainly used with methods such as PUT, to update a resource only if it has not changed since the user last fetched it. | If-Match: "737060cd8c284d8af7ad3082f209582d" |
If-Modified-Since | If the resource has not changed since the given time, 304 is returned and the cached copy is used (negotiated caching). | If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT |
If-None-Match | If the resource has not changed, 304 is returned | If-None-Match: "737060cd8c284d8af7ad3082f209582d" |
Origin | The Origin request header indicates which site the request comes from. It contains only the server name and no path information. It is sent with CORS requests and with POST requests. Apart from containing no path information, it is similar to the Referer header. | Origin: http://www.example-social-network.com |
Range | Only a portion of the entity is requested. Bytes are numbered from 0. | Range: bytes=500-999 |
Referer | The address of the previous web page, from which a link to the currently requested page was followed. (The word "referrer" is misspelled in the RFC, as well as in most implementations, so much so that the misspelling has become standard usage and is considered the correct term.) | Referer: http://en.wikipedia.org/wiki/Main_Page |
Transfer-Encoding | The form of encoding used to safely transfer the entity to the user. Currently defined methods are: chunked, compress, deflate, gzip, identity. Do not use with HTTP/2. | Transfer-Encoding: chunked |
User-Agent | Lets the network protocol peer identify the application type, operating system, software vendor, and version number of the user agent making the request. | User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0 |
Via | Informs the server of the proxies through which the request was sent. | Via: 1.0 fred, 1.1 example.com (Apache/1.1) |
Warning | A general warning about possible problems with the entity body. | Warning: 199 Miscellaneous warning |
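As a worked example for one of the fields above, the sketch below interprets a single-range Range header such as bytes=500-999 and builds the matching Content-Range value for the response. It is deliberately narrow: open-ended ranges, suffix ranges, and multiple ranges are not handled, and the function names are assumptions for the example.

```python
import re
from typing import Optional, Tuple

_RANGE_RE = re.compile(r"^bytes=(\d+)-(\d+)$")


def parse_byte_range(header_value: str, resource_size: int) -> Optional[Tuple[int, int]]:
    """Return (first, last) byte positions, inclusive and 0-based, or None."""
    match = _RANGE_RE.match(header_value.strip())
    if match is None:
        return None  # unsupported or malformed form
    first, last = int(match.group(1)), int(match.group(2))
    if first > last or first >= resource_size:
        return None  # unsatisfiable range
    return first, min(last, resource_size - 1)


def content_range_value(first: int, last: int, resource_size: int) -> str:
    """Build the Content-Range value that accompanies a 206 response."""
    return f"bytes {first}-{last}/{resource_size}"


data = bytes(range(256)) * 4          # a 1024-byte stand-in resource
first, last = parse_byte_range("bytes=500-999", len(data))
chunk = data[first:last + 1]          # both ends are inclusive
print(len(chunk), content_range_value(first, last, len(data)))
```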
Common non-standard request fields
Field name | Description | Example |
---|---|---|
DNT | The DNT (Do Not Track) request header indicates the user's preference regarding site tracking. It lets users state whether they care more about privacy or about personalized content. | DNT: 1 (Do Not Track enabled) DNT: 0 (Do Not Track disabled) |
X-Forwarded-For | X-Forwarded-For (XFF) is used to obtain the IP address of the client that originated a request when the client reaches the server through an HTTP proxy or load balancer. It has become a de facto standard header. When traffic between the client and the server passes through such an intermediary, the server access log records only the IP address of the proxy or load balancer; X-Forwarded-For comes in handy when you need the IP address of the client that made the original request. | X-Forwarded-For: client1, proxy1, proxy2 X-Forwarded-For: 129.78.138.66, 129.78.64.103 |
X-Forwarded-Host | A de facto standard header that identifies the original domain name the client requested in its Host header. The domain name or port number of a reverse proxy (such as a load balancer or a CDN) may differ from that of the origin server handling the request. In that case, X-Forwarded-Host can be used to determine which domain name was originally accessed. | X-Forwarded-Host: en.wikipedia.org:8080 X-Forwarded-Host: en.wikipedia.org |
X-Forwarded-Proto | X-Forwarded-Proto (XFP) is a de facto standard header that identifies the protocol (HTTP or HTTPS) a client used to connect to a proxy or load balancer. The server access log records the protocol used between the server and the load balancer, not the protocol used between the client and the load balancer. X-Forwarded-Proto comes in handy for determining the latter. | X-Forwarded-Proto: https |
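A server that wants the original client address has to split the X-Forwarded-For list itself. The helper below is a minimal sketch with hypothetical function names; it assumes the header was written by proxies you trust, since clients can send arbitrary values in this field.

```python
from typing import List, Optional


def forwarded_chain(header_value: str) -> List[str]:
    """Split 'client, proxy1, proxy2' into its individual addresses."""
    return [part.strip() for part in header_value.split(",") if part.strip()]


def original_client(header_value: str) -> Optional[str]:
    """Return the left-most entry, the client that made the original request."""
    chain = forwarded_chain(header_value)
    return chain[0] if chain else None


client_ip = original_client("129.78.138.66, 129.78.64.103")
print(client_ip)
```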
Response header field
Field name | Description | Example |
---|---|---|
Access-Control-Allow-Origin, Access-Control-Allow-Credentials, Access-Control-Expose-Headers, Access-Control-Max-Age, Access-Control-Allow-Methods, Access-Control-Allow-Headers | Specify which sites can participate in CORS | Access-Control-Allow-Origin: * |
Accept-Patch | The server uses the Accept-Patch response header to tell the browser which media types it understands in PATCH requests. | Accept-Patch: text/example; charset=utf-8 |
Accept-Ranges | The server uses the Accept-Ranges response header to indicate that it supports partial (range) requests. The value of the field defines the unit used for range requests. | Accept-Ranges: bytes |
Age | The Age header contains the length of time, in seconds, that the object has been stored in a proxy cache. | Age: 12 |
Allow | The Allow header lists the set of HTTP methods supported by the resource. When the server returns status code 405 Method Not Allowed, this header must be sent to the client as well. An empty Allow value means the resource accepts no HTTP method at all, which can happen, for example, when the server needs to temporarily block all access to the resource. | Allow: GET, HEAD |
Alt-Svc | Alt-Svc stands for Alternative Service. This header lists alternative ways of reaching the current site. It is generally used to keep backward compatibility while adding support for emerging protocols such as QUIC. | Alt-Svc: http/1.1="http2.example.com:8001"; ma=7200 |
Cache-Control | Tells all caching mechanisms between the server and the client whether they may cache this object. max-age is measured in seconds | Cache-Control: max-age=3600 |
Connection | Controls persistent (keep-alive) connections. Cannot be used in HTTP/2 | Connection: close |
Content-Encoding | The encoding applied to the response body | Content-Encoding: gzip |
Content-Language | The language of the response content | Content-Language: zh-cn |
Content-Length | The length of the returned data, in bytes | Content-Length: 348 |
Content-Location | The Content-Location header specifies an alternate address for the returned data. Its main use is to give the URL of the resource that was selected by content negotiation. Location and Content-Location are different: Location gives the target of a redirect (or the URL of a newly created resource), while Content-Location points to the direct address of the resource, with no further content negotiation needed. Location is associated with the response, whereas Content-Location is associated with the returned entity. | Content-Location: /index.html |
Content-Range | The Content-Range response header shows where a partial body belongs within the entire file. | Content-Range: bytes 21010-47021/47022 |
Content-Type | The type of data returned | Content-Type: text/html; charset=utf-8 |
Date | Returns the time when the response was created | Date: Tue, 15 Nov 1994 08:12:31 GMT |
ETag | The ETag HTTP response header is an identifier for a specific version of a resource. It makes caching more efficient and saves bandwidth, because the web server does not need to send a full response if the content has not changed. When content does change, ETags help prevent simultaneous updates of a resource from overwriting each other ("mid-air collisions"). A new ETag value must be generated whenever the resource at a given URL changes. ETags are therefore similar to fingerprints and may also be used by some servers for tracking: comparing ETags quickly shows whether a resource has changed, but a tracking server can also make them persist indefinitely. | ETag: "737060cd8c284d8af7ad3082f209582d" |
Expires | Response expiration time | Expires: Thu, 01 Dec 1994 16:00:00 GMT |
Last-Modified | The Last-Modified response header contains the date and time at which the origin server believes the resource was last modified. It is often used as a validator to determine whether a received or stored resource is the same. Because it is less accurate than ETag, it serves as a fallback mechanism. Conditional requests containing the If-Modified-Since or If-Unmodified-Since header use this field. | Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT |
Location | Developer.mozilla.org/zh-CN/docs/… | Example 1: Location: http://www.w3.org/pub/WWW/People.html Example 2: Location: /pub/WWW/People.html |
Set-Cookie | Sets a cookie on the client from the server | Set-Cookie: UserID=JohnDoe; Max-Age=3600; Version=1 |
Via | Via is a general header added by proxy servers. It applies to both forward and reverse proxies and can appear in both request and response headers. It can be used to track message forwarding, prevent request loops, and identify the protocol capabilities of senders along the request or response chain. | Via: 1.0 fred, 1.1 example.com (Apache/1.1) |
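The interplay between ETag in the response and If-None-Match in the next request can be sketched as follows. This is an illustration only: the tag is derived from a SHA-256 hash of the body (an assumption, servers are free to generate tags differently), the function names are hypothetical, and weak validators (W/) and the * wildcard are ignored.

```python
import hashlib
from typing import Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str], bytes]


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the body content."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Response:
    """Answer 304 when the client's cached tag still matches, else 200."""
    etag = make_etag(body)
    if if_none_match is not None:
        client_tags = [tag.strip() for tag in if_none_match.split(",")]
        if etag in client_tags:
            return 304, {"ETag": etag}, b""  # no body: the cache is still valid
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body


page = b"<h1>hello</h1>"
first_status, first_headers, _ = respond(page, None)
second_status, _, second_body = respond(page, first_headers["ETag"])
third_status, _, _ = respond(b"<h1>changed</h1>", first_headers["ETag"])
print(first_status, second_status, third_status)
```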
Common non-standard response fields
Field name | Description | Example |
---|---|---|
Content-Security-Policy | The Content-Security-Policy HTTP response header lets site administrators control which resources the user agent may load for a given page. With a few exceptions, policies mainly involve specifying server origins and script endpoints. This helps guard against cross-site scripting (XSS) attacks. | Content-Security-Policy: <policy-directive>; <policy-directive> |
Status code
The HTTP status code describes the outcome of the response to the current request.
Responses fall into five categories: informational responses (100-199), successful responses (200-299), redirects (300-399), client errors (400-499), and server errors (500-599).
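The five categories follow directly from the first digit of the code, which the small helper below makes explicit. The function name is an assumption; http.HTTPStatus is part of the Python standard library and supplies the standard reason phrases.

```python
from http import HTTPStatus

_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Map a status code to its category using the hundreds digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return _CLASSES[code // 100]


for code in (100, 204, 301, 404, 503):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```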
Informational response (100-199)
Status code | Description |
---|---|
100 Continue | This interim response indicates that everything so far is OK and that the client should continue the request, or ignore the response if the request is already finished. |
101 Switching Protocols | Sent in response to an Upgrade request header from the client; indicates the protocol the server is switching to. |
103 Early Hints | Mainly used together with the Link header, to let the user agent start preloading resources while the server is still preparing the response. |
Successful response (200-299)
Status code | Description |
---|---|
200 OK | The request succeeded. |
201 Created | The request succeeded and a new resource was created; the resource or its location is returned in the response body or a response header. This is usually the response sent after a POST request, or some PUT requests. |
202 Accepted | The request has been received but not yet acted upon. It is noncommittal: HTTP has no way to later send an asynchronous response indicating the outcome of the request. It is intended for cases where another process or server handles the request, or for batch processing. |
204 No Content | The HTTP 204 No Content success status indicates that the request succeeded but the client does not need to leave the current page. By default, 204 responses are cacheable, and an ETag header is included in such a response. A common convention is to return 204 No Content for a PUT that updates a resource without changing the page currently shown to the user. If a resource was created, return 201 Created instead. If the page should change to the newly updated page, use 200 instead. |
205 Reset Content | The server successfully processed the request and returns no content. Unlike a 204 response, a response with this status code asks the requester to reset the document view. It is mainly used to reset a form right after receiving user input, so that the user can easily start another entry. Like the 204 response, it must not contain a message body and ends at the first blank line after the headers. Since pages today are mostly refreshed partially, this status code is rarely used. |
206 Partial Content | The server has successfully processed a partial GET request. HTTP download tools such as FlashGet or Xunlei use this type of response to resume interrupted downloads or to split a large document into several segments downloaded at the same time. The request must contain a Range header indicating the range of content the client wants, and may contain If-Range as a request condition. |
Redirect (300-399)
Status code | Description |
---|---|
300 Multiple Choices | The requested resource has several possible responses, each with its own address and agent-driven negotiation information. The user or browser can choose a preferred address to be redirected to. |
301 Moved Permanently (HTTP/1.0) | HTTP 301 Moved Permanently indicates that the requested resource has been permanently moved to the URL given in the Location header and will not change again. Search engines update their links based on this response. Although the standard requires that the browser not change the HTTP method and body when following the redirect, some browsers do. It is therefore best to use 301 only for GET or HEAD methods and to use 308 in place of 301 otherwise. |
302 Found (HTTP/1.0) | The requested resource now temporarily responds to the request from a different URI. Since such redirects are temporary, the client should continue to send future requests to the original address. The response is cacheable only if specified in Cache-Control or Expires. |
303 See Other | The response to the current request can be found at another URI, and the client should access that resource using GET. This status exists primarily to allow the output of a script-activated POST request to be redirected to a new resource. |
304 Not Modified | If the client sent a conditional GET request, the request is permitted, and the content of the document has not changed (since the last access or according to the conditions of the request), the server should return this status code. A 304 response must not contain a message body, so it always ends at the first blank line after the headers. |
307 Temporary Redirect (HTTP/1.1) | The requested resource now temporarily responds to the request from a different URI. Since such redirects are temporary, the client should continue to send future requests to the original address. Unlike 302, the client must not change the request method when following the redirect. The response is cacheable only if specified in Cache-Control or Expires. |
308 Permanent Redirect (HTTP/1.1) | The resource is now permanently located at another URI, given by the Location HTTP response header. It has the same semantics as the 301 Moved Permanently response code, except that the user agent must not change the HTTP method used: if POST was used in the first request, POST must be used in the second request. |
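The method-handling differences in the table can be summarized in a few lines. The sketch reflects what browsers commonly do when following a redirect (rewriting POST to GET on 301 and 302, switching to GET on 303, and preserving the method on 307 and 308); the function name is hypothetical and other clients may behave differently.

```python
def method_after_redirect(status: int, method: str) -> str:
    """Return the method a typical browser uses for the redirected request."""
    method = method.upper()
    if status == 303:
        # Everything except GET/HEAD is turned into GET.
        return method if method in ("GET", "HEAD") else "GET"
    if status in (301, 302):
        # In practice only POST is rewritten to GET.
        return "GET" if method == "POST" else method
    if status in (307, 308):
        return method  # method and body must be preserved
    raise ValueError(f"not a redirect status handled here: {status}")


for status in (301, 302, 303, 307, 308):
    print(status, "POST ->", method_after_redirect(status, "POST"))
```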
Client error (400-499)
Status code | Description |
---|---|
400 Bad Request | 1. The request cannot be understood by the server; the client should not resubmit it without modification. 2. The request parameters are incorrect. |
401 Unauthorized | The request lacks valid authentication, for example because the user is not logged in |
403 Forbidden | The server understood the request but refuses to fulfil it, for example because the client or the site is blocked |
404 Not Found | The requested resource could not be found, either because the address is wrong or because the backend has no such endpoint |
405 Method Not Allowed | The request method is not allowed for this resource |
406 Not Acceptable | The content characteristics of the requested resource do not satisfy the conditions in the request headers, so a response entity cannot be generated. |
408 Request Timeout | The request timed out. The client did not finish sending a request within the time the server was prepared to wait. The client can submit the request again at any time without making any changes. |
409 Conflict | The request could not be completed because of a conflict with the current state of the requested resource. This code is allowed only when the user is considered able to resolve the conflict and resubmit a new request. The response should contain enough information for the user to identify the source of the conflict. |
429 Too Many Requests | The user has sent too many requests in a given amount of time ("rate limiting"). |
Server error (500-599)
Status code | Description |
---|---|
500 Internal Server Error | The server encountered a situation that it did not know how to handle. |
501 Not Implemented | The request method is not supported by the server and cannot be handled. GET and HEAD are the only methods that servers are required to support. |
503 Service Unavailable | The server is temporarily unable to handle the request, most likely because it is restarting, overloaded, or under maintenance |
504 Gateway Timeout | Returned when the server, acting as a gateway, cannot get a response from the upstream server in time. |
505 HTTP Version Not Supported | The server does not support the HTTP protocol version used in the request. |
References
- "Browser Working Principles and Practices"
- "Perspective HTTP Protocol", Chrono
- MDN
If you don’t understand anything, or if there are any inadequacies or mistakes in my article, please point them out in the comments section. Thank you for reading.