Hello, brave friends. Hello, everyone. I am Xiaowu, your stubborn-mouthed king: healthy in body and sound in mind.

I have a wealth of hair-loss-inducing techniques that can turn you into a veteran.

"Looks easy, fails in practice" is my guiding principle; being hopelessly clumsy is my trademark; I am humble with a trace of stubbornness; and my greatest comfort is that fools have fool's luck.

Welcome to this 20-minute series of essays to solidify your HTTP knowledge.

Preface

This article focuses on HTTP/1.1 and is fairly long. The content is based on the book Illustrated HTTP, supplemented by other materials and my own observations. Let's step into the world of HTTP.

A first look at HTTP

HTTP, which stands for Hypertext Transfer Protocol, is a simple request-response protocol that specifies what messages a client may send to a server and what responses it may receive.

👉 Connectionless

Each connection handles only one request. Once the server has processed the request and the client has received the response, the connection is closed, which saves transmission time.

To keep the connection alive, use keep-alive or WebSocket

👉 Stateless

The protocol keeps no memory of previous transactions, which means that if earlier information is required for later processing, it must be retransmitted. As a result, the amount of data sent per connection increases.

To save state, you can use Session and Cookie

The HTTP message

The information exchanged in HTTP is called an HTTP message. Messages are classified into request messages and response messages

A message consists of a header, a blank line, and a body

We can use curl -v https://www.baidu.com to look at the message structure

👺 The message header consists of the start line (request line or status line) and the header fields

  • The start line of a request message is the request line, which contains [request method, request URI, HTTP version]

  • The start line of a response message is the status line, which contains [HTTP version, status code, reason phrase]

  • Header fields

    • General header fields: headers used by both request and response messages

    • Request header fields

    • Response header fields

    • Entity header fields: used to supplement information related to the entity

    • Other: headers not defined in the HTTP/1.1 specification itself, such as the header fields for the Cookie mechanism (Cookie, Set-Cookie)

👺 The blank line separates the message header from the message body

👺 The message body is the actual data. In a request message it is the request body, and in a response message it is the response body
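The structure above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name parse_http_message and the sample response are made up for this example, and repeated header fields (such as several Set-Cookie lines) would overwrite each other here.

```python
def parse_http_message(raw: bytes):
    """Split a raw HTTP/1.1 message into start line, header fields, and body."""
    # The first blank line (CRLF CRLF) separates the header from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header field names are case-insensitive, so normalise them.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical response message, written out byte by byte.
raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
start_line, headers, body = parse_http_message(raw_response)
# The status line splits into HTTP version, status code, and reason phrase.
version, status_code, reason = start_line.split(" ", 2)
print(version, status_code, reason)
print(headers["content-type"])
print(body)
```

The same function works for a request message; only the start line differs (request method, request URI, HTTP version).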

HTTP request methods

GET & POST

GET is used to obtain resources, and POST is used to submit data to the server

  • From a caching perspective: GET requests are actively cached by browsers, but POST requests are not by default.

  • From an encoding perspective: GET parameters must be URL-encoded, while POST supports multiple encodings.

  • From a parameter perspective: GET parameters are exposed in the URL and only accept ASCII characters; POST parameters are placed in the request body with no such restriction, which makes POST more suitable for transmitting sensitive information (although the body is still plain text unless HTTPS is used).

  • From an idempotency perspective: GET is idempotent, POST is not.

👉 Tips: What is idempotency?

Under the same conditions, if a single request and repeated requests have the same effect on the resource, the operation is idempotent.
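A small sketch can make the definition concrete. The in-memory ResourceStore class below is entirely hypothetical; its get, put, and post methods only mimic how the corresponding HTTP methods are normally expected to behave.

```python
class ResourceStore:
    """Hypothetical in-memory store used only to illustrate idempotency."""

    def __init__(self):
        self.items = {}
        self.next_id = 1

    def get(self, key):
        # GET-like: reading never changes the stored resources.
        return self.items.get(key)

    def put(self, key, value):
        # PUT-like: writes to a key chosen by the client, so repeats change nothing.
        self.items[key] = value

    def post(self, value):
        # POST-like: every call creates a new resource under a fresh key.
        key = f"item-{self.next_id}"
        self.next_id += 1
        self.items[key] = value
        return key


store = ResourceStore()
store.put("profile", "v1")
store.put("profile", "v1")
count_after_puts = len(store.items)   # still one resource

store.post("comment")
store.post("comment")
count_after_posts = len(store.items)  # two more resources were created
```

Repeating the put call leaves the store in the same state, while repeating the post call keeps adding resources, which is why a retried POST can produce duplicates.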

HEAD

The same as a GET request, except that the entity body is not returned, only the response headers.

It can be used to confirm that a resource exists or to check whether it has been updated. Many link checkers are implemented with the HEAD method.

OPTIONS

Queries the methods supported by the server

❓ Why does the browser automatically issue OPTIONS requests?

When making a cross-origin request that is not a simple request, the browser automatically sends a preflight request (an OPTIONS request) to confirm whether the target resource allows the cross-origin access. If the request is a simple request, no preflight is triggered.

In other words, once a preflight is triggered, two requests are sent, which affects performance. The following optimizations can be made:

  • Same-origin requests do not trigger a preflight

  • Try to avoid triggering OPTIONS requests: for example, a POST request with Content-Type: application/json triggers a preflight, while setting Content-Type to one of application/x-www-form-urlencoded, multipart/form-data, or text/plain avoids it.

  • Set Access-Control-Max-Age (in seconds) so that the preflight result is cached, usually for 10 minutes.
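The decision described above can be approximated in code. This is a simplified sketch, not the browser's real algorithm: the names needs_preflight, SIMPLE_METHODS, SIMPLE_CONTENT_TYPES, and SAFELISTED_HEADERS are assumptions made for this example, and real browsers apply further conditions that are left out here.

```python
SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SIMPLE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}
# Request headers a page may set without forcing a preflight (simplified list).
SAFELISTED_HEADERS = {"accept", "accept-language", "content-language", "content-type"}


def needs_preflight(method, headers):
    """Return True if a cross-origin request would trigger an OPTIONS preflight."""
    if method.upper() not in SIMPLE_METHODS:
        return True
    for name, value in headers.items():
        lower_name = name.lower()
        if lower_name not in SAFELISTED_HEADERS:
            return True
        if lower_name == "content-type":
            # Ignore parameters such as "; charset=utf-8".
            media_type = value.split(";", 1)[0].strip().lower()
            if media_type not in SIMPLE_CONTENT_TYPES:
                return True
    return False


print(needs_preflight("POST", {"Content-Type": "application/json"}))
print(needs_preflight("POST", {"Content-Type": "text/plain"}))
```

A custom header such as a hypothetical X-Token also forces a preflight, which is one reason token headers on cross-origin APIs usually come with an Access-Control-Max-Age setting.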

👺 About cross-origin requests

Browsers follow the same-origin policy. Two URLs are considered same-origin only if their protocol, domain name, and port are all the same. When a request is sent to a different origin, it is a cross-origin request. The cross-origin restriction is a security policy that browsers impose on JavaScript.

👉 CORS: cross-origin resource sharing

With CORS, the server sets specific fields in the response header to permit cross-origin access. The preflight (pre-check) request described above is part of this mechanism.

  • Access-Control-Allow-Origin: tells the client which origins are allowed to make cross-origin requests. The value can be set to * to allow any origin

  • Access-Control-Allow-Methods: tells the client which methods are allowed in cross-origin requests

  • Access-Control-Allow-Headers: tells the client which custom request headers are allowed in cross-origin requests

  • Access-Control-Allow-Credentials: specifies whether cookies may be sent

    To use cookies, all of the following conditions must be met:

    • Access-Control-Allow-Credentials in the server response header is set to true

    • Access-Control-Allow-Origin cannot be set to *. A specific origin is required; it can be set dynamically from the Origin field in the request header

    • When the client initiates the request, withCredentials must be set to true

  • Access-Control-Max-Age: specifies how long the preflight result stays valid. During this period, no further preflight requests are sent
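As a rough sketch of how a server might fill in these fields, the function below builds the response headers for one request. The function name cors_headers, the allow-list approach, and the concrete method and header values are assumptions chosen for illustration, not a prescribed configuration.

```python
def cors_headers(request_origin, allowed_origins, allow_credentials=False):
    """Build CORS response headers for one request; empty dict if not allowed."""
    if request_origin not in allowed_origins:
        return {}
    headers = {
        # Echo the specific origin instead of "*", so that cookies can be allowed.
        "Access-Control-Allow-Origin": request_origin,
        "Access-Control-Allow-Methods": "GET, POST, PUT, DELETE",
        "Access-Control-Allow-Headers": "Content-Type",
        # Let the browser reuse the preflight result for 10 minutes.
        "Access-Control-Max-Age": "600",
        # Tell caches that the response depends on the Origin request header.
        "Vary": "Origin",
    }
    if allow_credentials:
        headers["Access-Control-Allow-Credentials"] = "true"
    return headers


# Hypothetical allow-list and request origin.
allowed = {"https://a.example.com"}
print(cors_headers("https://a.example.com", allowed, allow_credentials=True))
print(cors_headers("https://other.example.net", allowed))
```

Echoing the request's Origin only after checking it against an allow-list keeps the "dynamic origin" approach from silently allowing every site.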

👉 JSONP

JSONP uses script tags to make cross-origin requests, because script tags are not subject to the same-origin restriction. It is implemented through the src attribute of the script tag, so it can only send GET requests.

The principle is that the front end defines a function that takes one parameter; the back end returns a call to that function with the data passed as the argument; when the script content is loaded, the function is called and the front end receives the data.
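The back-end half of this exchange can be sketched as follows. The function name jsonp_response and the callback name handleUser are made up for this example; the callback check is an extra precaution added here, not something the text above requires.

```python
import json
import re

# Accept only plain identifiers, so the callback name cannot inject script.
CALLBACK_PATTERN = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def jsonp_response(callback, data):
    """Return the script body a JSONP endpoint would send back."""
    if not CALLBACK_PATTERN.fullmatch(callback):
        raise ValueError("invalid callback name")
    # The response is JavaScript that calls the front-end function with the data.
    return f"{callback}({json.dumps(data)});"


print(jsonp_response("handleUser", {"name": "xiaowu"}))
# handleUser({"name": "xiaowu"});
```

On the front end, the page would define handleUser and add a script tag whose src carries the callback name as a query parameter.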

👉 Nginx

Use Nginx as a proxy to solve cross-origin problems

PUT & DELETE

PUT is used to update resources and DELETE is used to delete resources

The RESTful style uses PUT & DELETE. In practice, you can evaluate whether to use PUT & DELETE or simply POST.

CONNECT

Asks a proxy server to establish a tunnel to the target server

HTTP status codes

1xx ~ The request is being processed

👉 101 Switching Protocols: used when switching protocols, for example upgrading from HTTP to WebSocket; the server sends this status code if it agrees to the change

2xx ~ The request was processed normally

👉 200 OK: the request succeeded

👉 204 No Content: the request succeeded, but no content is returned

👉 206 Partial Content: the client sent a Range request, and the response returns the part of the entity specified by Content-Range

3xx ~ The browser needs to perform some additional operation

👉 301 Moved Permanently: permanent redirect, for example after a domain name migration

👉 302 Found: temporary redirect

👉 304 Not Modified: the resource has not been modified on the server, so the client can use its cached copy directly

4xx ~ An error occurred on the client

👉 400 Bad Request: the syntax of the client request is incorrect and the server cannot understand it

👉 403 Forbidden: the server refuses the client request

👉 404 Not Found: the resource does not exist on the server

👉 408 Request Timeout: the request timed out

👉 429 Too Many Requests: too many requests were sent in a short period of time

5xx ~ An error occurred on the server

👉 500 Internal Server Error: an internal server error occurred

👉 503 Service Unavailable: the server is temporarily overloaded or down for maintenance and cannot process requests at this time
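Because the first digit determines the class, a status code can be classified with simple integer division. The sketch below uses Python's standard http.HTTPStatus enum for the reason phrases; the names STATUS_CLASSES and describe_status are assumptions made for this example.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code):
    """Return (class name, reason phrase) for a standard status code."""
    status_class = STATUS_CLASSES[code // 100]
    # HTTPStatus raises ValueError for codes it does not know.
    phrase = HTTPStatus(code).phrase
    return status_class, phrase


for code in (101, 206, 304, 429, 503):
    print(code, *describe_status(code))
```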

HTTP cache

Strong and negotiated caching

The difference between the two is whether the local cache must be validated with the server. The strong cache uses the local copy directly; the negotiated cache decides whether to use the local copy after checking with the server.

👉 Strong cache

Cache-Control: max-age=xxx, public means that when the resource is accessed again within xxx seconds, the local cache is used. public allows both the client and proxy servers (such as a CDN) to cache the response.

Tips: When testing, pay attention to the Disable cache option in DevTools; while it is checked, the browser cache is bypassed.

Problem: the browser cannot sense resource updates, and the resource is not refetched as long as the cache has not expired.

❓ So should we give up the strong cache and use only the negotiated cache?

No. The biggest problem with the negotiated cache is that the resource has to be validated with the server every time. The point of caching is to reduce requests and use local resources as much as possible, which gives users a better experience and reduces the pressure on the server. The best practice is to hit the strong cache whenever possible and to invalidate the client cache when resources are updated.

Therefore, when bundling static resource files, generate the file names from a hash of the content. When the content changes, the resource URL changes and the browser loads the new resource.

The browser automatically decides whether to store a resource in the memory cache or the disk cache.
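Content-hashed file names can be sketched as below. Build tools usually do this for you; the function name hashed_filename, the choice of SHA-256, and the 8-character length are assumptions made only for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_filename(filename, content, length=8):
    """Insert a short hash of the content before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return f"{path.stem}.{digest}{path.suffix}"


old_name = hashed_filename("app.js", b"console.log('v1');")
same_name = hashed_filename("app.js", b"console.log('v1');")
new_name = hashed_filename("app.js", b"console.log('v2');")
print(old_name, new_name)
```

Unchanged content keeps its name, so a long max-age stays safe, while any change to the content produces a new URL that bypasses the old cache entry.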

👉 Negotiated cache

👺 Last-Modified and If-Modified-Since

As shown in the figure, the first request transfers 8.7 KB; after the cache is hit, only 225 B are transferred.

The first time a resource is requested, the server returns Last-Modified in the response header. When the resource is requested again, the browser automatically sends If-Modified-Since, which the server compares with the resource's current Last-Modified value.

If they are the same, 304 is returned and the cache is hit. If not, 200 is returned along with the latest resource.

Disadvantage: Last-Modified is only accurate to the second. If the resource changes again within the same second, Last-Modified cannot detect the change and the new resource is not returned.

👺 ETag and If-None-Match

The first time a resource is requested, the server returns an ETag in the response header. When the resource is requested again, the browser automatically sends If-None-Match, which the server compares with the current ETag.

If they are the same, 304 is returned and the cache is hit. If not, 200 is returned along with the latest resource.

In contrast to Last-Modified, ETag is an identifier generated from the contents of the file, so it changes whenever the file changes. When both are present, ETag takes precedence over Last-Modified.
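The server-side comparison can be sketched as follows. This is an illustration under assumptions: make_validators and revalidate are hypothetical helpers, the ETag is derived from a SHA-256 hash purely for the example, and the date check uses "not newer than" rather than strict equality.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def make_validators(content, modified_at):
    """Return the ETag and Last-Modified values a server would send."""
    etag = '"' + hashlib.sha256(content).hexdigest()[:16] + '"'
    # HTTP dates are expressed in GMT, so modified_at must be an aware UTC datetime.
    last_modified = format_datetime(modified_at, usegmt=True)
    return etag, last_modified


def revalidate(request_headers, etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200."""
    if "If-None-Match" in request_headers:
        # ETag takes precedence; If-Modified-Since is ignored when it is present.
        return 304 if request_headers["If-None-Match"] == etag else 200
    if "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        current = parsedate_to_datetime(last_modified)
        return 304 if current <= since else 200
    return 200


etag, last_modified = make_validators(
    b"hello", datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
)
print(revalidate({"If-None-Match": etag}, etag, last_modified))
print(revalidate({"If-None-Match": '"stale"'}, etag, last_modified))
```

The last call in the test mirrors the precedence rule: a stale If-None-Match yields 200 even when If-Modified-Since alone would have matched.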

👉 CDN cache

CDN stands for content delivery network, which can be likened to a network of train ticket agencies. The server distributes content to CDN nodes all over the country, and users obtain the content from a nearby node, which reduces request time. It also spreads the traffic, reducing the load on the origin server. The key technologies of a CDN are content storage and content distribution.

  • The browser cache speeds up repeat visits to a page, while the CDN cache is an optimization at the network level that helps speed up the first visit.

  • CDN providers offer a forced-refresh function, so we can manually refresh the CDN cache.

  • HTTP is a request-response protocol. Between the CDN and the origin server, the CDN is the client and the origin server is the responder. Between the browser and the CDN, the browser is the client and the CDN is the responder.

  • Since both follow the HTTP protocol, the caching mechanism is the same. The CDN cache is controlled with Cache-Control: s-maxage. The concept of s-maxage is the same as that of max-age: it sets the cache time, and for shared caches it takes priority over max-age.

The preceding figure shows the result of pinging chat.deeruby.com from my local machine and from a server after the CDN is configured. The pings reach different IP addresses, indicating that different CDN nodes are being hit.
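The priority between s-maxage and max-age can be sketched with a small Cache-Control parser. The function names parse_cache_control and freshness_lifetime are assumptions for this example, and the parser ignores edge cases such as quoted commas.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict of directives."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            # Directives without a value, such as "public", are stored as True.
            directives[name.lower()] = arg.strip('"') if arg else True
    return directives


def freshness_lifetime(cache_control, shared_cache):
    """Return how many seconds a cache may reuse the response, or None."""
    directives = parse_cache_control(cache_control)
    if shared_cache and "s-maxage" in directives:
        # Shared caches such as a CDN prefer s-maxage over max-age.
        return int(directives["s-maxage"])
    if "max-age" in directives:
        return int(directives["max-age"])
    return None


header = "public, max-age=60, s-maxage=3600"
print(freshness_lifetime(header, shared_cache=True))   # CDN node
print(freshness_lifetime(header, shared_cache=False))  # browser
```

With this header, a CDN node may keep the response for an hour while browsers revalidate after a minute, which pairs well with the manual CDN refresh mentioned above.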

HTTP State Management

HTTP is a stateless protocol in which each request is independent of the others. In practice, however, we need to keep some state, such as the login state, so Cookie was introduced.

A Cookie is key-value data that the server sends to the browser in the Set-Cookie response header field. The browser saves it locally and sends it back in the Cookie request header field the next time it makes a request to the same server.

  • Cookies are stored on the client side

  • They are key-value data stored as a string, and the amount of data that can be stored is small -> e.g. k1=v1; k2=v2; k3=v3;

  • They are not shared across domains

  • You can view them under Application -> Cookies in your browser's developer tools

👺 Path: specifies the path for which the Cookie takes effect. / means the Cookie takes effect for all paths in the current domain

👺 Domain: the domain name specified by the Domain attribute is matched by suffix. For example, after example.com is specified, requests to v1.example.com or v2.example.com also send the Cookie

👺 Expires & Max-Age: set the validity period of the Cookie. Once the Cookie expires, it is deleted and no longer sent to the server

👺 HttpOnly: prevents JavaScript from reading the Cookie

When HttpOnly is set, the Cookie does not appear in the value returned by document.cookie

👺 Secure: the Cookie can only be transmitted over HTTPS

👺 SameSite: determines whether the Cookie can be sent with cross-site requests

Same-site vs cross-site: two URLs that share the same effective top-level domain plus second-level domain are same-site; for example, a.deeruby.com and b.deeruby.com are same-site. Everything else is cross-site.

SameSite=None: allows the Cookie to be sent cross-site. To set SameSite=None, the Cookie must also be set to Secure, which means it is only sent over HTTPS

SameSite=Strict: the Cookie is never sent with cross-site requests

SameSite=Lax: the Cookie is sent cross-site only when a form is submitted with the GET method or when an a tag sends a GET request

👉 Tips: Cookies are stored on the client and are easy to intercept. Sensitive information can be stored in the server-side Session instead, with only the SessionId transmitted through the Cookie to verify the user.
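The attributes above map directly onto a Set-Cookie header. The sketch below uses Python's standard http.cookies module (the samesite attribute needs Python 3.8 or later); the cookie name SessionId, its value, and example.com are placeholders chosen for illustration.

```python
from http.cookies import SimpleCookie

# Server side: build the value of a Set-Cookie response header.
cookie = SimpleCookie()
cookie["SessionId"] = "abc123"           # hypothetical session identifier
cookie["SessionId"]["path"] = "/"
cookie["SessionId"]["domain"] = "example.com"
cookie["SessionId"]["max-age"] = 3600    # valid for one hour
cookie["SessionId"]["httponly"] = True   # hidden from document.cookie
cookie["SessionId"]["secure"] = True     # sent over HTTPS only
cookie["SessionId"]["samesite"] = "Lax"

set_cookie_value = cookie["SessionId"].OutputString()
print("Set-Cookie:", set_cookie_value)

# Server side, next request: parse the Cookie request header sent by the browser.
received = SimpleCookie()
received.load("k1=v1; k2=v2; SessionId=abc123")
session_id = received["SessionId"].value
print(session_id)
```

Only the SessionId travels in the Cookie; the user data it refers to stays in the server-side Session.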

HTTP persistent connection

HTTP is connectionless: the connection is closed after each request and response, which wastes a large amount of resources, so persistent connections were introduced to reuse connections.

Persistent connections, also called long connections, are enabled with Connection: keep-alive (in HTTP/1.1 they are the default). HTTP is an application layer protocol and TCP is a transport layer protocol, so in essence this is a persistent TCP connection.

Since HTTP is a request-response protocol, communication can only be initiated by the client. The WebSocket protocol was therefore introduced for two-way communication between the client and the server. Interested readers can refer to my other article on building a simple chat room with WebSocket.

Upgrading to WebSocket requires the Upgrade field together with Connection: Upgrade. The following figure shows an upgrade request whose target protocol is WebSocket

Proxy, gateway, tunnel

👉 Proxy

A proxy is a middleman between the server and the client: it receives requests sent by the client and forwards them to the server, and it receives responses returned by the server and forwards them to the client.

The CDN mentioned in the HTTP cache section above is one type of proxy

Each proxy server identifies itself in the Via field. The order of proxies in Via is the order in which the message passed through them

Forward proxy: the proxy server acts on behalf of the client when interacting with the server. In this case, the server receives the request from the proxy server. This can be used to hide the real client or to get around access restrictions.

Reverse proxy: in contrast to the forward proxy, the proxy server acts on behalf of the server when interacting with the client. The client receives the response from the proxy server. This can be used to hide the real server or to help the server with load balancing and security defense.

👉 Gateway

A gateway connects two or more applications that use different protocols, acting as a protocol converter; for example, handling mail through the browser, with HTTP on one side and POP3 on the other.

Web gateway: a gateway that uses HTTP on one side and another protocol on the other side is called a Web gateway

Security gateway: converts internal HTTP to HTTPS on the way out of the gateway, and converts external HTTPS to HTTP on the way in

👉 Tunnel

Tunneling wraps IP packets in a new outer layer of IP addressing, so that packets that otherwise could not get through can pass. In this case, the server sees the outer IP address.

The tunnel itself does not parse the HTTP request; the request is forwarded to the destination server as is. The tunnel ends when the communication ends. VPNs use tunneling technology.

When a tunnel is set up through an HTTP proxy, a CONNECT request is made

HTTP content negotiation

When accessing the same URI, the server can return different resources as required. The specific resources to be returned are determined by the client and server through negotiation. The mechanism is HTTP content negotiation mechanism.

The Accept fields are how the client tells the server what data it would like to receive, while the Content fields describe the data the server actually sends. The header fields related to content negotiation are as follows:

  • Accept

  • Accept-Language

  • Accept-Charset

  • Accept-Encoding

  • Content-Type

  • Content-Language

  • Content-Encoding

Quality factor: the quality factor q is the weight used in content negotiation. Its value ranges from 0 to 1, and the default is 1.

Accept-Language: zh-CN,zh;q=0.8,en-US;q=0.7,en;q=0.6

The server looks for a matching resource in order from highest to lowest priority until one is found.
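The priority search can be sketched in code. The function names parse_accept_language and pick_language are assumptions for this example, and the matching is exact (a real server would usually also fall back from a regional tag to its base language).

```python
def parse_accept_language(header):
    """Return (language, q) pairs sorted from highest to lowest priority."""
    choices = []
    for item in header.split(","):
        parts = item.strip().split(";")
        language = parts[0].strip()
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                quality = float(value)
        if language:
            choices.append((language, quality))
    # sorted() is stable, so equal weights keep their original order.
    return sorted(choices, key=lambda choice: choice[1], reverse=True)


def pick_language(header, available):
    """Return the first preferred language the server can provide, or None."""
    for language, quality in parse_accept_language(header):
        # `available` is expected to hold lowercase language tags.
        if quality > 0 and language.lower() in available:
            return language.lower()
    return None


header = "zh-CN,zh;q=0.8,en-US;q=0.7,en;q=0.6"
print(parse_accept_language(header))
print(pick_language(header, {"en", "fr"}))
```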

Now let’s pair up and talk about these fields

👉 Accept && Content-Type: specifies the data type

Media type: also known as MIME type, it represents the nature and format of a document, file, or byte stream, using the type/subtype structure

Common top-level types: text, image, audio, video, and application (which also covers binary data)

multipart: the multipart/form-data type is commonly used to send FormData. I used this type in my file upload guide

👉 Accept-Language && Content-Language: specifies the supported language, which can be used to implement internationalization

👉 Accept-Charset && Content-Type: specifies a character set, such as UTF-8

> Accept-Charset: utf-8
< Content-Type: text/html; charset=utf-8

👉 Accept-Encoding && Content-Encoding: specifies the compression method, such as gzip

HTTP range request

For large files, HTTP allows clients to request a portion of the resource at a time, known as a range request.

Accept-Ranges tells the client whether the server can handle range requests: bytes means it can, none means it cannot.

Range tells the server which part of the resource is requested; the server returns 206 when it serves a range request.

Range: bytes=500-999

  • 500-999: from byte offset 500 to byte offset 999, inclusive (offsets start at 0)

  • 500-: from byte offset 500 to the end of the file

  • -499: the last 499 bytes of the file
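To make the three forms concrete, the sketch below resolves a Range header against a known file size. Note that in HTTP range syntax the suffix form -499 selects the final 499 bytes. The function name resolve_range is an assumption for this example, and only a single byte range is handled.

```python
def resolve_range(header, file_size):
    """Turn a single-range "bytes=..." header into inclusive (start, end) offsets."""
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        return None  # other units and multi-range requests are not handled here
    first, sep, last = spec.strip().partition("-")
    if not sep:
        return None
    if first == "":
        # Suffix form "-N": the final N bytes of the file.
        if not last.isdigit() or int(last) == 0:
            return None
        start = max(file_size - int(last), 0)
        end = file_size - 1
    else:
        if not first.isdigit():
            return None
        start = int(first)
        if last == "":
            end = file_size - 1  # open-ended form "N-"
        elif last.isdigit():
            end = min(int(last), file_size - 1)
        else:
            return None
    if start > end:
        return None  # unsatisfiable; a server would answer 416
    return start, end


for header in ("bytes=500-999", "bytes=500-", "bytes=-499"):
    print(header, resolve_range(header, 2000))
```

A server would use the resulting offsets to slice the file and to fill in the Content-Range header of the 206 response.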

UA

User-Agent: indicates the application type, operating system, software vendor, and version of the client.

You can use window.navigator.userAgent to view the UA string of the current machine

The general format is: Mozilla/5.0 (platform information) engine version, browser type and version. It is used for device identification, serving differentiated content, and compatibility handling.

HTTPS

HTTPS = HTTP + TLS (Transport Layer Security, formerly the SSL protocol). It solves the problem of eavesdropping and tampering by a man in the middle when HTTP is transmitted in plain text.

To balance performance and security, HTTPS uses both asymmetric and symmetric encryption: asymmetric encryption protects the key exchange, and symmetric encryption protects the data that follows

To ensure that the public key is not tampered with in transit, a digital signature is used for verification, and the trustworthiness of an HTTPS certificate is guaranteed by the CA and the system root certificate mechanism

👉 Configuring HTTPS

A recommended website for obtaining certificates for free is freessl.cn

Select Browser generation, which automatically downloads the certificate; then upload the downloaded certificate to the server.

For domain validation, add the record to the domain's DNS settings: fill in the host record, set the record type to TXT, and fill in the record value shown by the site.

Then configure nginx:

server {
    listen 443 ssl;
    listen [::]:443 ssl;
    server_name sso.deeruby.com;

    ssl_certificate /etc/nginx/ssl/sso/full_chain.pem;
    ssl_certificate_key /etc/nginx/ssl/sso/private.key;
    ssl_session_cache shared:SSL:1m;
    ssl_session_timeout 5m;
    ssl_ciphers HIGH:!aNULL:!MD5;
    ssl_prefer_server_ciphers on;

    error_page 404 /404.html;
    error_page 500 502 503 504 /50x.html;

    location / {
        proxy_pass http://127.0.0.1:3001;
    }
}

server {
    listen 80;
    server_name sso.deeruby.com;
    rewrite ^(.*) https://$server_name$1 permanent;
}

Reference links & extended reading

[WeChat Reading] Illustrated HTTP

HTTP: The Definitive Guide

The soul of HTTP: strengthen your HTTP knowledge system

[Amandakelake] GET and POST: a dialectical look at 100 continue and the fundamental difference

Explain CORS and how to save an OPTIONS request

[CHEndorid] Discussion on the security of HTTP PUT, DELETE and other methods

[Black Gold Team] Best practices for front-end caching

How to explain reverse proxy to your girlfriend?

[Jerry Qu] HTTP proxy principle and implementation

What is User-Agent? How do I get the browser version and type?

"Big Front-end Advanced Security" series: HTTPS details