Contents
1. HTTP/1.1: URI and URL, message structure, status codes, Cookie, common methods
2. HTTP/2.0
3. HTTPS
4. Cross-domain: same-origin policy, solutions
5. Attacks
6. Caching
7. Other: fetch, performance optimization
Appendix
This is an original article of about ten thousand words, so you may want to bookmark it first and read it later.
1. HTTP/1.1
HTTP was introduced in 1990, but it was not officially published as a standard until May 1996, when HTTP/1.0 was born. HTTP/1.1, released in 1999, is still the mainstream version of the protocol.
The HTTP protocol works at the application layer. The client sends a request, and the server processes the request and returns a response.
URI and URL
URI stands for Uniform Resource Identifier. It identifies an abstract or physical resource.
A URI can be classified as a URL (locator), a URN (name), or a compact string that has the features of both.
- A URN is like a name: it establishes the identity of a resource
- A URL is like an address: it provides a way to find the resource
HTTP Message Structure
The information exchanged over the HTTP protocol is called an HTTP message.
⇩ HTTP message structure
An HTTP message consists of three parts:
- Header: the content and attributes of the request or response that the server or client has to process
- Blank line (CR+LF): CR is carriage return, LF is line feed
- Message body: the data to be sent
An HTTP message sent by the client is called a request message, and one returned by the server is called a response message. The main difference between them lies in the message header.
From top to bottom, the request message header consists of:
- Request line: contains the request method, the request URI, and the HTTP version
- Header fields
  - Request header fields
  - General header fields
  - Entity header fields
  - Other fields
From top to bottom, the response message header consists of:
- Status line: contains the HTTP version, the status code of the response, and the reason phrase
- Header fields
  - Response header fields
  - General header fields
  - Entity header fields
  - Other fields
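The message layout described above can be sketched in a few lines of JavaScript. This is only a minimal illustration, not a full parser: the function name parseHttpMessage and the sample request are made up for this example, and edge cases such as repeated or folded header fields are ignored.

```javascript
// Minimal sketch: split a raw HTTP/1.1 message into start line, headers, and body.
function parseHttpMessage(raw) {
  // The blank line (CR+LF followed by CR+LF) separates the header from the body.
  const separatorIndex = raw.indexOf("\r\n\r\n");
  const head = separatorIndex === -1 ? raw : raw.slice(0, separatorIndex);
  const body = separatorIndex === -1 ? "" : raw.slice(separatorIndex + 4);

  // The first line is the request line (or status line); the rest are header fields.
  const [startLine, ...headerLines] = head.split("\r\n");
  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip malformed lines in this sketch
    // Field names are case-insensitive, so normalize them to lower case.
    headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
  }
  return { startLine, headers, body };
}

// Hypothetical request message used only for illustration.
const rawRequest =
  "POST /form/entry HTTP/1.1\r\n" +
  "Host: example.com\r\n" +
  "Content-Type: application/x-www-form-urlencoded\r\n" +
  "Content-Length: 16\r\n" +
  "\r\n" +
  "name=ueno&age=37";

const request = parseHttpMessage(rawRequest);
const [method, requestUri, httpVersion] = request.startLine.split(" ");
console.log(method, requestUri, httpVersion);
console.log(request.headers["content-type"]);
console.log(request.body);
```

A response message can be parsed the same way; only the first line differs (a status line instead of a request line).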
The HTTP/1.1 specification defines 47 header fields, listed in the appendix at the end of this article. The commonly used Content-Type is covered here; other common header fields appear in the following sections.
Content-Type: specifies the media type of the object in the entity body
- text/html: HTML document
- text/plain: plain text document
- text/xml: XML document; the encoding declared in the XML header is ignored and US-ASCII is used by default
- image/jpeg: JPEG image
- image/gif: GIF image
- application/javascript: JavaScript document
- application/xml: XML document; data in XML format
- application/octet-stream: arbitrary binary data
- application/x-www-form-urlencoded: used for form submission; the request parameters are organized as key1=val1&key2=val2 and placed in the request entity
- application/json: the parameters themselves must be JSON data, which is placed directly in the request entity without any further processing
- multipart/form-data: a multipart media type. A boundary is generated first to separate the fields. In the request entity, each parameter starts with --boundary, followed by additional information and the parameter name, then a blank line, and finally the parameter content. Multiple parameters produce multiple boundary blocks, and a file parameter carries special file fields. The entity ends with --boundary--. Because it supports file upload, multipart/form-data is generally used for forms that upload files.
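To make the three request body formats concrete, here is a small sketch that encodes the same fields in each of them. The field names, values, and the boundary string are invented for illustration, and buildMultipart is a hypothetical helper; real clients generate a random boundary and add more part headers for files.

```javascript
// Hypothetical form fields used only for illustration.
const fields = { name: "Alice Smith", city: "New York" };

// application/x-www-form-urlencoded: key1=val1&key2=val2, with special characters encoded.
const urlencodedBody = new URLSearchParams(fields).toString();

// application/json: the JSON text is placed in the body as it is.
const jsonBody = JSON.stringify(fields);

// multipart/form-data: every part starts with "--" + boundary,
// and the whole body ends with "--" + boundary + "--".
function buildMultipart(fieldMap, boundary) {
  const parts = Object.entries(fieldMap).map(
    ([key, value]) =>
      `--${boundary}\r\n` +
      `Content-Disposition: form-data; name="${key}"\r\n` +
      "\r\n" +
      `${value}\r\n`
  );
  return parts.join("") + `--${boundary}--\r\n`;
}

const boundary = "ExampleBoundary123"; // assumption: a fixed boundary for readability
const multipartBody = buildMultipart(fields, boundary);

console.log(urlencodedBody);
console.log(jsonBody);
console.log(multipartBody);
```

The matching Content-Type header must be sent with each body; for multipart/form-data it also carries the boundary, for example multipart/form-data; boundary=ExampleBoundary123.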
Only some commonly used values are listed here. For the full list, see a MIME type reference.
Common status codes
The status code describes the result of the request that the server returns to the client. From the status code, the user can tell whether the server handled the request properly or an error occurred.
1xx: Informational. The request has been received and is being processed
- 100: the initial part of the request has been received and the client should continue
- 101: switching protocols
2xx: Success. The request was processed properly
- 200: the request succeeded
- 204: the request was processed successfully, but no content is returned
- 206: the content of the specified range is returned
3xx: Redirection. Additional action is required to complete the request
- 301: permanent redirect
- 302: temporary redirect
- 303: temporary redirect; the redirected resource should be retrieved with the GET method
- 304: the requested resource has not changed; use the cache directly
4xx: Client Error. The server cannot process the request
- 400: bad request
- 401: unauthorized
- 403: access is denied by the server
- 404: the requested resource does not exist on the server
5xx: Server Error. The server failed to process the request
- 500: internal server error
- 503: the server is temporarily unable to process requests
- 504: the server, acting as a gateway or proxy, did not receive a response from the upstream server in time
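Since the first digit decides the class, client code often branches on it rather than on individual codes. A minimal sketch follows; statusClass is a hypothetical helper name, not part of any library.

```javascript
// Map a status code to the class described above, based on its first digit.
function statusClass(code) {
  if (!Number.isInteger(code) || code < 100 || code > 599) return "Unknown";
  const classes = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
  };
  return classes[Math.floor(code / 100)];
}

for (const code of [101, 204, 304, 404, 503]) {
  console.log(code, statusClass(code));
}
```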
Cookie
HTTP is a stateless protocol. Some of you might ask: doesn't it have status codes? Why is it stateless?
The two are different things. A status code describes the state of a communication, whereas stateless means that state is not saved. Each new request produces a new response, and the protocol itself retains no information about previous requests or responses. This design allows a large number of transactions to be processed and keeps the protocol scalable.
Suppose a web page requires login. Because HTTP itself does not record the previous state, you would have to log in again every time you moved to a new page.
Cookies were introduced to resolve this kind of contradiction while preserving the stateless nature of the protocol.
Cookies work through user identification and state management. To manage the user's state, a website temporarily writes some data to the user's computer through the web browser. When the user visits the website again, the browser sends the cookie back, and the site can read it.
A cookie consists of the following fields:
- Name: the name of the cookie
- Value: the value of the cookie
- Domain: the domain that is allowed to access this cookie
- Path: the path of the pages that are allowed to access this cookie
- Expires/Max-Age: the expiration time. If it is not set, the cookie expires with the session (when the browser is closed).
- Size: the size of the cookie
- HttpOnly: if true, the cookie is only carried in HTTP request headers and cannot be accessed through document.cookie (which prevents information theft to some extent).
- Secure: whether this cookie may only be sent over HTTPS
Example: username=John Doe; expires=Fri, 18 Dec 2043 12:00:00 GMT
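As a rough sketch of how these fields end up on the wire, the code below builds a Set-Cookie value on the server side and parses a Cookie request header. serializeCookie and parseCookieHeader are hypothetical helpers written for this example; production code would normally rely on a well-tested library and stricter validation.

```javascript
// Build the value of a Set-Cookie response header from the fields listed above.
function serializeCookie(name, value, options = {}) {
  const parts = [`${encodeURIComponent(name)}=${encodeURIComponent(value)}`];
  if (options.domain) parts.push(`Domain=${options.domain}`);
  if (options.path) parts.push(`Path=${options.path}`);
  if (options.expires) parts.push(`Expires=${options.expires.toUTCString()}`);
  if (options.maxAge !== undefined) parts.push(`Max-Age=${options.maxAge}`);
  if (options.httpOnly) parts.push("HttpOnly");
  if (options.secure) parts.push("Secure");
  return parts.join("; ");
}

// Parse the value of a Cookie request header such as "a=1; b=2".
function parseCookieHeader(header) {
  const cookies = {};
  for (const pair of header.split(";")) {
    const index = pair.indexOf("=");
    if (index === -1) continue; // ignore fragments without a value
    const name = decodeURIComponent(pair.slice(0, index).trim());
    cookies[name] = decodeURIComponent(pair.slice(index + 1).trim());
  }
  return cookies;
}

const setCookie = serializeCookie("username", "John Doe", {
  path: "/",
  expires: new Date(Date.UTC(2043, 11, 18, 12, 0, 0)), // month 11 is December
  httpOnly: true,
  secure: true,
});
console.log(setCookie);

const parsed = parseCookieHeader("username=John%20Doe; theme=dark");
console.log(parsed.username, parsed.theme);
```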
Common methods
HTTP/1.1 provides eight methods: GET, POST, PUT, HEAD, DELETE, OPTIONS, TRACE, and CONNECT
- GET: obtains a resource. Parameters are appended directly to the URL, and the length is limited; different browsers and servers impose different limits.
- POST: transfers an entity body. The parameters (entity body) are transmitted as the payload of the request.
- PUT: transfers a file. It has no authentication mechanism of its own, so anyone can upload files. Not recommended.
- HEAD: obtains the message header. The message body is not returned. It is mostly used to confirm that a URI is valid.
- DELETE: deletes a file. As with PUT, it has no authentication mechanism of its own.
- OPTIONS: queries the methods supported by the resource specified by the request URI.
- TRACE: traces the path. It lets the client see how the request it sent was processed or modified along the way. It easily leads to XST (Cross-Site Tracing) attacks and is usually not used.
- CONNECT: asks a proxy server to establish a tunnel, and then uses the tunnel for TCP communication. It is mainly used with SSL (Secure Sockets Layer) and TLS (Transport Layer Security) to encrypt the communication content before it is sent through the tunnel.
It is also said on the web that a POST request sends two packets, first the header and then the entity, while GET sends only one. This is not a general rule: how many packets are sent depends on how the client is implemented. So far, only Ruby is known to do this.
Note: an entity is the payload transmitted with a request or response; it consists of the entity headers and the entity body.
2. HTTP/2.0
In May 2015, the official version of the HTTP/2.0 protocol was released. It significantly improves web performance compared with HTTP/1.x and further reduces network latency, while remaining fully compatible with the semantics of HTTP/1.1.
Let's start with Chrome Developer Tools to see how HTTP/2.0 compares with HTTP/1.1.
In this comparison, loading the same resources over HTTP/2.0 took 46% to 95% less time than over HTTP/1.1.
Why so "fast"?
It all starts with the origins of HTTP/2.0. HTTP/2.0 is based on the SPDY protocol developed by Google, which uses compression, multiplexing, and prioritization to reduce page load time and improve security. The core idea of SPDY is to minimize the number of TCP connections.
The main improvements and features of HTTP/2.0 include:
- Binary framing
  - A binary framing layer is added between the application layer and the transport layer
  - In the binary framing layer, all transmitted information is split into smaller messages and frames and encoded in binary format
- Multiplexing
  - A single TCP connection carries multiple streams, so multiple requests can be sent at the same time
  - Frames from different streams are sent interleaved and are reassembled at the peer according to the stream identifier in each frame header
- Header compression
  - The transmitted headers are encoded in the HPACK (HTTP/2 header compression) format
  - Both ends maintain an index table of the headers that have already appeared. Later transmissions can send only the key (index) of a recorded header, and the peer looks up the corresponding value in its table
- Server push
  - A server can send multiple responses to a single request from a client
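Binary framing and multiplexing can be illustrated with a short sketch. Every HTTP/2 frame starts with a 9-byte header: a 24-bit payload length, an 8-bit type, 8-bit flags, and one reserved bit followed by a 31-bit stream identifier. The code below encodes a few DATA frames from two streams, interleaves them on one "connection", and reassembles them by stream identifier. The helper names and payloads are made up, and everything else a real implementation needs (HEADERS frames, flow control, settings) is left out.

```javascript
// Encode one frame: 9-byte header followed by the payload.
function encodeFrame(type, flags, streamId, payload) {
  const frame = new Uint8Array(9 + payload.length);
  const view = new DataView(frame.buffer);
  view.setUint8(0, (payload.length >>> 16) & 0xff); // length, high byte
  view.setUint16(1, payload.length & 0xffff);       // length, low two bytes
  view.setUint8(3, type);
  view.setUint8(4, flags);
  view.setUint32(5, streamId & 0x7fffffff);         // reserved bit left as 0
  frame.set(payload, 9);
  return frame;
}

// Decode all frames contained in a byte sequence.
function decodeFrames(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const frames = [];
  let offset = 0;
  while (offset + 9 <= bytes.length) {
    const length = (view.getUint8(offset) << 16) | view.getUint16(offset + 1);
    const type = view.getUint8(offset + 3);
    const flags = view.getUint8(offset + 4);
    const streamId = view.getUint32(offset + 5) & 0x7fffffff;
    const payload = bytes.subarray(offset + 9, offset + 9 + length);
    frames.push({ type, flags, streamId, payload });
    offset += 9 + length;
  }
  return frames;
}

// Join several byte arrays into one, as if written to the same TCP connection.
function concat(chunks) {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

const DATA = 0x0; // frame type of DATA frames
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Frames of stream 1 and stream 3 are interleaved on the wire.
const wire = concat([
  encodeFrame(DATA, 0, 1, encoder.encode("Hello, ")),
  encodeFrame(DATA, 0, 3, encoder.encode("foo")),
  encodeFrame(DATA, 0, 1, encoder.encode("stream 1")),
  encodeFrame(DATA, 0, 3, encoder.encode("bar")),
]);

// The receiver groups the payloads by stream identifier.
const streams = new Map();
for (const frame of decodeFrames(wire)) {
  const previous = streams.get(frame.streamId) || "";
  streams.set(frame.streamId, previous + decoder.decode(frame.payload));
}
console.log(streams.get(1));
console.log(streams.get(3));
```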
3. HTTPS
HTTP is a good protocol, but it is not secure enough, mainly in three respects:
- Communication is in plain text, so the content can be eavesdropped on
- The identity of the communicating party is not verified, so it may be impersonated
- The integrity of the message cannot be verified, so it may have been tampered with
We therefore need a protocol that encrypts the content, authenticates the communicating party, and protects the integrity of the message. Hence HTTPS.
HTTP + encryption + authentication + integrity protection = HTTPS
HTTP with these encryption and authentication mechanisms added is called HTTPS.
Note that HTTPS is not a new application-layer protocol; it is a general term for HTTP combined with SSL. (SSL here refers to SSL 3.0 and TLS 1.0; after all, TLS is a protocol based on SSL.)
As the figure above shows, HTTP usually communicates directly with TCP; when SSL is used, HTTP communicates with SSL, and SSL communicates with TCP.
HTTPS addresses the three HTTP problems listed above as follows:
- Hybrid encryption, which combines symmetric and asymmetric encryption, ensures that the content cannot be eavesdropped on. The procedure is:
  - The client requests the public key from the server
  - The client generates a random secret string, encrypts it with the public key, and sends it to the server
  - The server decrypts the message with its private key to obtain the random secret string
  - Subsequent communication is encrypted with the random secret string
- HTTPS authentication comes in two types:
  - Verification of the public key
    - The server registers its public key with a digital certificate authority (CA)
    - The CA digitally signs the server's public key with its own private key and issues a certificate
    - After obtaining the server certificate, the client verifies the signature with the CA's public key to confirm that the server's public key is authentic
  - Client identity authentication
    - The client certificate is distributed to the client in advance and must be installed there
    - The server sends a Certificate Request message, asking the client to provide its certificate
    - The client sends its certificate to the server in a Client Certificate message
    - After verification succeeds, the server obtains the client's public key from the certificate and starts encrypted communication
- A message digest called a MAC (Message Authentication Code) is attached when data is sent. The MAC makes it possible to detect whether the message has been tampered with, which protects its integrity.
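The hybrid encryption and MAC ideas above can be tried out with Node's built-in crypto module. This is a simplified sketch under several assumptions: the RSA key pair stands in for the key inside the server certificate, the algorithms (RSA-OAEP by default, AES-256-CBC, HMAC-SHA256) are chosen for illustration, and the same secret is used for both encryption and the MAC, whereas real TLS derives separate keys and follows a far more involved protocol.

```javascript
const crypto = require("crypto");

// Server: an RSA key pair (stands in for the key pair behind the server certificate).
const { publicKey, privateKey } = crypto.generateKeyPairSync("rsa", {
  modulusLength: 2048,
});

// Client: generate a random secret and encrypt it with the server's public key.
const sharedSecret = crypto.randomBytes(32);
const encryptedSecret = crypto.publicEncrypt(publicKey, sharedSecret);

// Server: recover the secret with the private key.
const recoveredSecret = crypto.privateDecrypt(privateKey, encryptedSecret);

// From here on, both sides use the shared secret for fast symmetric encryption.
function encrypt(key, plaintext) {
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv("aes-256-cbc", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // The MAC over IV and ciphertext lets the receiver detect tampering.
  const mac = crypto.createHmac("sha256", key).update(iv).update(ciphertext).digest();
  return { iv, ciphertext, mac };
}

function decrypt(key, { iv, ciphertext, mac }) {
  const expected = crypto.createHmac("sha256", key).update(iv).update(ciphertext).digest();
  if (!crypto.timingSafeEqual(mac, expected)) {
    throw new Error("message was tampered with");
  }
  const decipher = crypto.createDecipheriv("aes-256-cbc", key, iv);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

const message = encrypt(sharedSecret, "GET /index.html HTTP/1.1");
console.log(decrypt(recoveredSecret, message));
```

Asymmetric encryption is slow, so it is used only once to exchange the secret; the bulk of the traffic is protected with the much faster symmetric cipher.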
Finally, take a look at the complete HTTPS communication process:
1. The client sends a Client Hello message to start SSL communication. The message contains the SSL version and the list of cipher suites the client supports.
2. If the server can use SSL, it responds with a Server Hello message. The message contains the SSL version and the cipher suite selected from the list sent by the client.
3. The server sends a Certificate message, which contains its public key certificate.
4. The server sends a Server Hello Done message to notify the client that the first phase of the SSL handshake is complete.
5. The client sends a Client Key Exchange message. It contains a random secret string (the pre-master secret) encrypted with the public key obtained in step 3.
6. The client then sends a Change Cipher Spec message, telling the server that subsequent communication will be encrypted with a key based on the pre-master secret of step 5.
7. The client sends a Finished message, which contains a checksum of all the messages exchanged so far.
8. The server also sends a Change Cipher Spec message.
9. The server also sends a Finished message.
10. Once the Finished messages have been exchanged, the SSL connection is established.
11. Application-layer communication begins: the client sends HTTP requests and the server returns HTTP responses.
12. Finally, the client disconnects by sending a close_notify message.
13. The client then sends a TCP FIN packet to close the TCP connection.
4. Cross-domain
Under the same-origin policy, a request is cross-domain when at least one of the protocol, domain name, or port differs between the two URLs.
When a cross-domain request occurs, the request is sent normally and the server returns data, but the browser intercepts the response.
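The "protocol, domain name, port" rule can be expressed as a small check with the standard URL class. isSameOrigin is a hypothetical helper and the URLs are examples; note that URL normalizes default ports (80 for http, 443 for https) to an empty string, so an explicit :443 on an https URL still counts as the same port.

```javascript
// Two URLs are same-origin only if protocol, host name, and port all match.
function isSameOrigin(a, b) {
  const first = new URL(a);
  const second = new URL(b);
  return (
    first.protocol === second.protocol &&
    first.hostname === second.hostname &&
    first.port === second.port
  );
}

const page = "https://example.com/index.html";
console.log(isSameOrigin(page, "https://example.com:443/api")); // same origin
console.log(isSameOrigin(page, "http://example.com/api"));      // protocol differs
console.log(isSameOrigin(page, "https://api.example.com/"));    // domain name differs
console.log(isSameOrigin(page, "https://example.com:8443/"));   // port differs
```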
Same-origin policy
The same-origin policy is a security policy that restricts how documents or scripts from one origin can interact with resources from another origin. It isolates malicious documents and reduces the possible attack vectors.
The restrictions cover:
- Cookie, localStorage, sessionStorage, and IndexedDB
- DOM nodes
- AJAX requests
Tags that are allowed to load resources across domains:
<img src=xxx>
<script src=xxx>
<link href=xxx>
Note: CORB may intercept requests from
Solutions
There are five common approaches:
- JSONP: essentially uses a <script> tag to initiate a GET request
- CORS: the server needs to set the Access-Control-Allow-Origin response header. To send cookies, the server must also set Access-Control-Allow-Credentials to true, and the client must set withCredentials to true on the request
- Proxy server forwarding: a same-origin server forwards the cross-domain requests
- WebSocket
- The postMessage() method: works for cross-document, cross-domain, multi-window, and nested iframe communication
When a CORS request is complex (that is, not a simple request), the browser first sends a preflight request with the OPTIONS method to check whether the server allows the cross-domain request, and only then sends the actual request.
A simple request must meet both of the following conditions:
- The method is GET, HEAD, or POST
- The Content-Type is text/plain, multipart/form-data, or application/x-www-form-urlencoded
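The two conditions can be turned into a rough preflight check, paired with a sketch of the response headers a CORS-enabled server might return. Both functions are hypothetical helpers: needsPreflight covers only the two conditions listed above (browsers also look at other request headers), and corsHeaders assumes a simple allow-list of origins.

```javascript
const SIMPLE_METHODS = new Set(["GET", "HEAD", "POST"]);
const SIMPLE_CONTENT_TYPES = new Set([
  "text/plain",
  "multipart/form-data",
  "application/x-www-form-urlencoded",
]);

// Returns true when the request is not "simple" and therefore triggers an OPTIONS preflight.
function needsPreflight(method, contentType) {
  if (!SIMPLE_METHODS.has(method.toUpperCase())) return true;
  if (contentType === undefined) return false; // no Content-Type to check
  // Ignore parameters such as "; charset=utf-8" when comparing media types.
  const mediaType = contentType.split(";")[0].trim().toLowerCase();
  return !SIMPLE_CONTENT_TYPES.has(mediaType);
}

// Server side: headers to add to the response for an allowed origin.
function corsHeaders(requestOrigin, allowedOrigins, allowCredentials = false) {
  if (!allowedOrigins.includes(requestOrigin)) return {}; // origin not allowed
  const headers = { "Access-Control-Allow-Origin": requestOrigin };
  // With credentials, the concrete origin must be echoed; "*" is not accepted by browsers.
  if (allowCredentials) headers["Access-Control-Allow-Credentials"] = "true";
  return headers;
}

console.log(needsPreflight("GET"));
console.log(needsPreflight("POST", "application/json"));
console.log(corsHeaders("https://app.example.com", ["https://app.example.com"], true));
```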
5. Attacks
Security issues in web applications include, but are not limited to: cross-site scripting (XSS), SQL injection, OS command injection, HTTP header injection, email header injection, forced browsing, incorrect error message handling, open redirection, session hijacking, session fixation, and cross-site request forgery (CSRF).
That is a long list, so only two common attacks are covered here.
XSS
XSS stands for cross-site scripting.
It is an attack that runs illegal HTML tags or JavaScript in the browser of a registered user of a website that has a security vulnerability.
Common forms of attack are:
- Using forged forms to trick users into giving up information such as account names and passwords
- Using a script to obtain the user's cookie value
- Displaying content forged by the attacker, such as articles or pictures, on the page
There are two ways to defend against XSS attacks:
- Filter the data entered by the user in advance
- Use the cookie's HttpOnly field to prevent malicious scripts from reading the cookie
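One common way to filter user input is to escape the characters that have special meaning in HTML before the text is inserted into a page. The sketch below shows the idea; escapeHtml is a hypothetical helper, and it only covers text placed in HTML content or quoted attribute values, not URLs, inline scripts, or styles.

```javascript
// Replace characters that could start a tag or break out of an attribute value.
function escapeHtml(text) {
  const replacements = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return String(text).replace(/[&<>"']/g, (character) => replacements[character]);
}

// Hypothetical malicious comment submitted by an attacker.
const comment = '<script>alert("stolen: " + document.cookie)</script>';
console.log(escapeHtml(comment));
```

After escaping, the browser displays the comment as plain text instead of executing it.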
CSRF
CSRF stands for cross-site request forgery.
It is a passive attack in which the attacker sets a trap that forces an authenticated user to make unintended updates to personal information, settings, or other state.
Consider the following example:
- The attacker logs in to website M and posts a message containing malicious code on the site's message board
<img src='http://exampleM.com/gift/send?up=xxx'>
- User A logs in to website M and browses the message board, which triggers the malicious code left by the attacker
- The cookie in the browser already holds user A's login information, so when the malicious code is triggered, a gift is sent to the attacker under user A's identity
CSRF attacks can be defended against in the following ways:
- Do not use GET requests to modify data
- Prevent cookies from being sent with third-party requests (the cookie SameSite attribute)
- Block requests to the interface from third-party sites
- Include authentication information, such as a token or verification code, when the request is initiated
The cookie SameSite attribute has three possible values:
- Strict: third-party cookies are completely prohibited
- Lax (the Chrome default): cookies are sent only with GET requests that navigate to the target URL, which covers three cases: links, preload requests, and GET forms
- None: disables the restriction. The Secure attribute must be set at the same time (so the cookie can only be sent over HTTPS); otherwise the setting has no effect.
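The token defense mentioned above can be sketched as follows: the server stores a random token in the user's session, embeds it in its own pages, and rejects any state-changing request that does not carry the same token. The session object and helper names are assumptions for this example; a page on a third-party site cannot read the token, so a forged request fails the check.

```javascript
const crypto = require("crypto");

// Generate an unpredictable token when the session is created.
function createCsrfToken() {
  return crypto.randomBytes(32).toString("hex");
}

// Compare the submitted token with the one stored in the session.
function isValidCsrfToken(sessionToken, submittedToken) {
  if (typeof submittedToken !== "string") return false;
  const expected = Buffer.from(sessionToken, "utf8");
  const actual = Buffer.from(submittedToken, "utf8");
  // timingSafeEqual requires equal lengths, so check the length first.
  return expected.length === actual.length && crypto.timingSafeEqual(expected, actual);
}

// Hypothetical server-side session for user A.
const session = { csrfToken: createCsrfToken() };

console.log(isValidCsrfToken(session.csrfToken, session.csrfToken)); // legitimate form post
console.log(isValidCsrfToken(session.csrfToken, undefined));         // forged request without a token
```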
6. Caching
Caching here mainly refers to the HTTP caching mechanism.
With HTTP caching, the server uses response headers to tell the browser whether a resource should be cached, whether the cache must be validated, and how long the cache is valid. Based on those response headers and on whether the cache has expired, the browser either uses the cache directly, sends request headers to validate the cache, or fetches the resource again.
Strong cache
Once the data has been fetched from the server for the first time, a strong cache sends no further requests until the cache expires; the data is read directly from the cache. The heart of a strong cache is how to determine whether the cache has expired.
Strong caching is implemented through Cache-Control.
Note: HTTP/1.0 implements it through Expires. Cache-Control takes precedence when both are set.
As a request header field, Cache-Control has the following directives:
Directive | Parameter | Description |
---|---|---|
no-cache | None | Force revalidation with the origin server |
no-store | None | Do not cache any part of the request or response |
max-age=[seconds] | Required | The maximum Age of the response |
max-stale(=[seconds]) | Optional | Accept an expired response |
min-fresh=[seconds] | Required | Expect the response to remain fresh for at least the specified time |
no-transform | None | Proxies must not change the media type |
only-if-cached | None | Get the resource from the cache only |
cache-extension | – | New directive token |
As a response header field, Cache-Control has the following directives:
Directive | Parameter | Description |
---|---|---|
public | None | The response may be cached and served to any user |
private | Optional | The response is intended for a specific user only |
no-cache | Optional | The cached response must be validated before it is used |
no-store | None | Do not cache any part of the request or response |
no-transform | None | Proxies must not change the media type |
must-revalidate | None | Cacheable, but must be revalidated with the origin server |
proxy-revalidate | None | Intermediate cache servers must revalidate the cached response |
max-age=[seconds] | Required | The maximum Age of the response |
s-maxage=[seconds] | Required | The maximum Age of the response for shared cache servers |
cache-extension | – | New directive token |
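How a cache might use these directives to decide whether a stored response is still fresh can be sketched like this. parseCacheControl and isFresh are hypothetical helpers that handle only no-store, no-cache, and max-age; real caches also consider Expires, s-maxage, heuristic freshness, and quoted directive values that contain commas.

```javascript
// Turn "public, max-age=3600" into { public: true, "max-age": "3600" }.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(",")) {
    const trimmed = part.trim();
    if (!trimmed) continue;
    const equals = trimmed.indexOf("=");
    const name = (equals === -1 ? trimmed : trimmed.slice(0, equals)).toLowerCase();
    // Directives without a parameter are stored as true.
    directives[name] = equals === -1 ? true : trimmed.slice(equals + 1).replace(/^"|"$/g, "");
  }
  return directives;
}

// A stored response can be reused without contacting the server
// only while its age is below max-age.
function isFresh(cacheControl, ageSeconds) {
  const directives = parseCacheControl(cacheControl);
  if (directives["no-store"] || directives["no-cache"]) return false;
  const maxAge = Number(directives["max-age"]);
  return Number.isFinite(maxAge) && ageSeconds < maxAge;
}

console.log(isFresh("public, max-age=3600", 120));  // still within max-age
console.log(isFresh("public, max-age=3600", 4000)); // expired, must revalidate
console.log(isFresh("no-cache", 0));                // always revalidate
```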
Negotiated cache
With a negotiated cache, the browser communicates with the server every time it reads the data, using a cache identifier.
- On the first request, the server returns the data together with a cache identifier
- On a later request for the same resource, the browser sends the cache identifier to the server
- The server checks whether the identifier still matches
  - If not, the resource has been updated, and the server returns the new data with a new cache identifier
  - If so, the resource has not been updated; the server returns status code 304, and the browser reads the data directly from the cache
(In HTTP/1.0 the cache identifier is Last-Modified; in HTTP/1.1 it is ETag.)
ETag values are either strong or weak:
- Strong ETag: the value changes whenever the resource changes, no matter how slightly
- Weak ETag: the field value is prefixed with "W/", as in ETag: W/"usagi-1234"; the value changes only when the resource changes in a meaningful way
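The negotiation steps can be sketched from the server's point of view. In this example the ETag is a hash of the body, which is only one possible way to generate it, and respond is a hypothetical handler that understands a single tag in If-None-Match (real requests may carry a list of tags or "*").

```javascript
const crypto = require("crypto");

// Derive a strong ETag from the content: any change to the body changes the tag.
function makeEtag(body) {
  return '"' + crypto.createHash("sha1").update(body).digest("hex") + '"';
}

// Weak comparison: ignore a leading "W/" on either tag.
function weakEquals(a, b) {
  const strip = (tag) => tag.replace(/^W\//, "");
  return strip(a) === strip(b);
}

// Answer 304 without a body when the client's cache identifier still matches.
function respond(requestHeaders, body) {
  const etag = makeEtag(body);
  const ifNoneMatch = requestHeaders["if-none-match"];
  if (ifNoneMatch !== undefined && weakEquals(ifNoneMatch, etag)) {
    return { status: 304, headers: { ETag: etag }, body: "" };
  }
  return { status: 200, headers: { ETag: etag }, body };
}

// First request: no identifier yet, so the full resource is returned.
const first = respond({}, "hello world");
// Second request: the browser sends the identifier back in If-None-Match.
const second = respond({ "if-none-match": first.headers.ETag }, "hello world");
// The resource changed on the server: the identifier no longer matches.
const third = respond({ "if-none-match": first.headers.ETag }, "hello world, updated");
console.log(first.status, second.status, third.status);
```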
Take a look at the HTTP caching mechanism as a whole:
The negotiated cache comes into play only after the strong cache has expired. When reading from the cache, the browser checks the memory cache first and falls back to the disk cache if the resource is not found there.
7. Other
fetch
First, let's see what fetch is.
The Fetch API provides a JavaScript interface for accessing and manipulating parts of the HTTP pipeline, such as requests and responses. It also provides a global fetch() method, which offers a simple, logical way to fetch resources asynchronously across the network.
It is the official replacement for XMLHttpRequest for fetching resources asynchronously across the network.
Features of fetch:
- It is based on Promise and can also be combined with async/await.
- Cross-origin fetch requests do not carry cookies by default (older browser versions sent no cookies at all by default). To send them, set fetch(url, {credentials: 'include'}). The credentials option has three values: omit (never send cookies), same-origin (send cookies only to the same origin), and include (always send cookies).
- A 400 or 500 status code returned by the server does not cause a rejection. The promise returned by fetch is rejected only when a network error prevents the request from completing.
- No version of IE supports fetch natively.
- fetch originally had no way to cancel a request (current browsers support cancellation through AbortController). With XMLHttpRequest, you can cancel a request with the xhr.abort() method, although this is not very reliable and depends on the server implementation.
- fetch has no progress callback. With XMLHttpRequest, the xhr.onprogress callback can update the progress of a request dynamically, something fetch does not currently support natively.
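Because fetch does not reject on 400 or 500 responses, callers usually check response.ok themselves. Here is a minimal sketch of such a wrapper; fetchJson is a hypothetical helper, and the fetch implementation is passed in as a parameter so the example can run without a network by using a stand-in.

```javascript
// Wrap fetch so that HTTP error statuses are turned into rejections.
async function fetchJson(url, options = {}, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url, options);
  // response.ok is true only for status codes in the 200-299 range.
  if (!response.ok) {
    throw new Error(`HTTP ${response.status}`);
  }
  return response.json();
}

// Stand-in for a server that answers with a 500 error (no network needed).
const fakeFetch = async () => ({ ok: false, status: 500, json: async () => ({}) });

fetchJson("https://example.com/api/user", { credentials: "include" }, fakeFetch)
  .then((data) => console.log("received", data))
  .catch((error) => console.log("request failed:", error.message));
```

In a browser, the third argument can simply be omitted so that the global fetch is used.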
Performance optimization
So how do you optimize the front end at the network level?
1. DNS optimization
DNS resolution also takes time. There are two ways to reduce it:
- Reduce the number of DNS lookups
- Use DNS prefetch
You can add the following configuration to the head element (remember to change the domain names):
<meta http-equiv="x-dns-prefetch-control" content="on" />
<link rel="dns-prefetch" href="//picture.example.com" />
<link rel="dns-prefetch" href="//static.example.com" />
<link rel="dns-prefetch" href="//example.tool.com" />
2. Reduce the number of requests and the size of resources
- Use HTTP content-encoding compression, such as the commonly used gzip
- Serve static resources from a CDN; if you are unfamiliar with CDNs, see the previous article
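To get a feel for what gzip saves, the following sketch compresses a repetitive piece of markup with Node's zlib module. The sample content is made up, and the actual ratio depends heavily on the data; in practice, compression is normally enabled in the web server or CDN configuration rather than in application code.

```javascript
const zlib = require("zlib");

// Hypothetical, highly repetitive HTML fragment (text-based resources compress well).
const original = Buffer.from("<li class=\"item\">list entry</li>\n".repeat(500), "utf8");

// What a server would send with "Content-Encoding: gzip".
const compressed = zlib.gzipSync(original);

// What the browser does after receiving the response.
const restored = zlib.gunzipSync(compressed);

console.log(`original: ${original.length} bytes, gzip: ${compressed.length} bytes`);
console.log("round trip ok:", restored.equals(original));
```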
3. Caching
- The strong cache and negotiated cache described earlier
- Use Service Workers, which run in a separate thread, for caching
- Use Push Cache (HTTP/2.0)
4. Upgrade to HTTP/2.0
Appendix:
General header fields
Field name | Description |
---|---|
Cache-Control | Controls caching behavior |
Connection | Hop-by-hop header and connection management |
Date | Date and time when the message was created |
Pragma | Message directives |
Trailer | List of header fields at the end of the message |
Transfer-Encoding | Transfer encoding of the message body |
Upgrade | Upgrade to another protocol |
Via | Information about proxy servers |
Warning | Error notification |
Request header fields
Field name | Description |
---|---|
Accept | Media types the user agent can handle |
Accept-Charset | Preferred character sets |
Accept-Encoding | Preferred content encodings |
Accept-Language | Preferred (natural) languages |
Authorization | Web authentication information |
Expect | Expects specific behavior from the server |
From | Email address of the user |
Host | Server on which the requested resource is located |
If-Match | Compares entity tags (ETag) |
If-Modified-Since | Compares the update time of the resource |
If-None-Match | Compares entity tags (the opposite of If-Match) |
If-Range | Sends a byte range request for the entity if the resource has not been updated |
If-Unmodified-Since | Compares the update time of the resource (the opposite of If-Modified-Since) |
Max-Forwards | Maximum number of hops |
Proxy-Authorization | Client authentication information required by the proxy server |
Range | Byte range request for the entity |
Referer | URI of the page from which the request originated |
TE | Preferred transfer encodings |
User-Agent | Information about the HTTP client program |
Response header fields
Field name | Description |
---|---|
Accept-Ranges | Whether byte range requests are accepted |
Age | Estimated time elapsed since the resource was created |
ETag | Matching information for the resource |
Location | Redirects the client to the specified URI |
Proxy-Authenticate | Authentication information the proxy server requires from the client |
Retry-After | When the request should be made again |
Server | Installation information about the HTTP server |
Vary | Cache management information for proxy servers |
WWW-Authenticate | Authentication information the server requires from the client |
Entity header fields
Field name | Description |
---|---|
Allow | HTTP methods supported by the resource |
Content-Encoding | Encoding applied to the entity body |
Content-Language | Natural language of the entity body |
Content-Length | Size of the entity body (in bytes) |
Content-Location | Alternate URI for the corresponding resource |
Content-MD5 | Message digest of the entity body |
Content-Range | Position range of the entity body |
Content-Type | Media type of the entity body |
Expires | Expiration date and time of the entity body |
Last-Modified | Last modified date and time of the resource |
Since I do not come from a networking background, please point out anything in the article that is expressed improperly. The article is also posted on my personal official account, MelonField.