Let’s take a look at the five-layer network protocol stack:

1. Application layer (DNS and HTTP): DNS resolves the domain name to an IP address, and HTTP sends the request.
2. Transport layer (TCP and UDP): delivers data to the specific process that handles it (identified by port), forwards data, and establishes TCP connections (three-way handshake). UDP provides an unreliable data service; for example, domain name resolution requests are sent over UDP to port 53 of the local preferred DNS server.
3. Network layer (IP and ARP): IP addressing
4. Data link layer (PPP): encapsulates data into frames
5. Physical layer: transmits the bitstream over physical media
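As a rough illustration of how the layers relate, the sketch below wraps an application payload in one labeled header per lower layer on the way down and strips the headers in reverse on the way up. The text labels and the function names `encapsulate` and `decapsulate` are made up for this example; real headers are binary structures, not text tags.

```python
# Toy model of layer encapsulation; the header labels are illustrative only.
LAYERS = ["TCP", "IP", "PPP"]  # transport, network, data link (top to bottom)


def encapsulate(http_payload: str) -> str:
    """Wrap the application data in one header per lower layer."""
    frame = http_payload
    for layer in LAYERS:
        frame = f"[{layer}]{frame}"
    return frame


def decapsulate(frame: str) -> str:
    """Strip the headers in the opposite order on the receiving side."""
    for layer in reversed(LAYERS):
        prefix = f"[{layer}]"
        if not frame.startswith(prefix):
            raise ValueError(f"expected {layer} header")
        frame = frame[len(prefix):]
    return frame


frame = encapsulate("GET / HTTP/1.1")
print(frame)               # [PPP][IP][TCP]GET / HTTP/1.1
print(decapsulate(frame))  # GET / HTTP/1.1
```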

1. DNS resolution

DNS is short for Domain Name System. It is a naming system for computers and network services, organized into a hierarchy of domains. It is used on TCP/IP networks to translate host names and domain names into IP addresses.

Process:

1. The browser checks whether its cache already holds the IP address for the domain name. If so, it uses the cached address.
2. If the browser cache misses, the browser asks the operating system, which has its own resolution step. For example, you can map the GitHub domain name to an IP address in the hosts file, and the browser will use that address first.
3. If there is still no match, the browser extracts the domain name, such as baidu.com, from the URL and asks the local domain name server (LDNS) to resolve it. This server is usually close to you and caches resolution results.
4. If the LDNS has no cached answer for the domain name, it goes directly to a root server, which refers it to the gTLD server responsible for the top-level domain (such as .com).
5. The LDNS sends the request to that gTLD server.
6. The gTLD server that receives the request looks up the domain name and answers the LDNS. Strictly speaking, it returns the address of the domain's authoritative name server, which the LDNS then queries for the final IP address.
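The lookup order in the first steps can be sketched as a simple fallback chain. This is only a model under stated assumptions: the function `resolve`, the dictionaries standing in for the browser cache and hosts file, and the `query_ldns` callback are all hypothetical, and the addresses come from ranges reserved for documentation.

```python
from typing import Callable, Dict, Optional, Tuple


def resolve(
    domain: str,
    browser_cache: Dict[str, str],
    hosts_file: Dict[str, str],
    query_ldns: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Return (ip, source), trying each layer in the order described above."""
    if domain in browser_cache:
        return browser_cache[domain], "browser cache"
    if domain in hosts_file:
        ip: Optional[str] = hosts_file[domain]
        source = "hosts file"
    else:
        ip = query_ldns(domain)  # stands in for a real DNS query
        source = "LDNS"
    if ip is not None:
        browser_cache[domain] = ip  # remember the answer for next time
    return ip, source


browser_cache: Dict[str, str] = {}
hosts_file = {"github.com": "192.0.2.10"}       # made-up hosts entry
fake_ldns = {"example.com": "203.0.113.5"}.get  # made-up LDNS answers

print(resolve("github.com", browser_cache, hosts_file, fake_ldns))
print(resolve("example.com", browser_cache, hosts_file, fake_ldns))
print(resolve("example.com", browser_cache, hosts_file, fake_ldns))
```

The second lookup of the same name is answered from the browser cache, which is why repeated visits skip the later steps entirely.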

2. Establish a TCP connection

TCP three-way handshake

1. The client sends a SYN packet to the server and waits for confirmation.
2. The server receives the SYN packet, acknowledges it, and sends a SYN + ACK packet to the client.
3. The client receives the SYN + ACK packet from the server and sends an ACK packet back.

Once these three packets have been exchanged, the connection is established and the three-way handshake is complete.

Three is the minimum number of messages that lets the client and the server each confirm that both sides can send and receive correctly.
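To make the exchange concrete, here is a small sketch of the sequence and acknowledgment numbers carried by the three segments. The function name `three_way_handshake` and the tuple layout are assumptions for illustration, and the initial sequence numbers are fixed here, whereas a real TCP stack picks them unpredictably.

```python
from typing import List, Tuple

Segment = Tuple[str, str, int, int]  # (sender, flags, seq, ack)


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged to open a connection."""
    # The first SYN acknowledges nothing, so its ack field is shown as 0.
    syn = ("client", "SYN", client_isn, 0)
    # The server acknowledges the client's SYN (ack = client_isn + 1)
    # and announces its own initial sequence number in the same segment.
    syn_ack = ("server", "SYN+ACK", server_isn, client_isn + 1)
    # The client's SYN consumed one sequence number, so it continues from
    # client_isn + 1 and acknowledges the server's SYN.
    ack = ("client", "ACK", client_isn + 1, server_isn + 1)
    return [syn, syn_ack, ack]


for sender, flags, seq, ack in three_way_handshake(client_isn=100, server_isn=300):
    print(f"{sender:>6} {flags:<7} seq={seq} ack={ack}")
```

Each side's sequence number is acknowledged by the other exactly once, which is what confirms that sending and receiving work in both directions.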

A frequently cited difference between GET and POST at the TCP/IP level is that a GET request produces one TCP packet while a POST request produces two. (1) For a GET request, the browser sends the headers and data together, and the server responds with 200 and returns the data. (2) For a POST request, the browser first sends the headers, the server responds with 100 (Continue), the browser then sends the data, and the server responds with 200 and returns the data. Note that this two-step behavior is not required by HTTP: it only happens when the client sends an Expect: 100-continue header, and many clients send a POST's headers and body together.

3. Send the HTTP request; the server processes it and returns an HTTP response

1. An HTTP message consists of general headers, request/response headers, and a request/response body

2. Common HTTP status codes:

301: Permanent redirection. The requested resource has been permanently moved to a new location, and any future references to it should use the URL returned by this response.
302: Temporary redirection. The client is redirected for this request only and keeps sending future requests to the original address.
303: The response to the current request can be found at another URL, and the client should fetch that resource using GET. This code exists mainly so that the output of a script-triggered POST request can be redirected to a new resource.
403: The server received the request but refuses to serve it (authorization failure).
404: The requested resource does not exist.
500: The server has an internal error.
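For quick reference, Python's standard library ships the registered reason phrases, so a status code can be looked up rather than memorized. The helper `describe` below is a hypothetical convenience wrapper; only `http.HTTPStatus` comes from the standard library.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return 'code phrase (class)' for a registered status code."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    return f"{code} {status.phrase} ({STATUS_CLASSES[code // 100]})"


for code in (301, 302, 303, 403, 404, 500, 502):
    print(describe(code))
```

The first digit alone tells the client who has to act: 3xx asks it to look elsewhere, 4xx says the request itself is at fault, and 5xx says the server failed.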

3. Common request headers and response headers

Request header: Accept (the MIME types supported by the browser); Response header: Content-Type (the type of the entity content returned by the server)

Request header: Accept-Encoding (the compression types supported by the browser, such as gzip; the browser cannot decode anything outside this list)

Request header: Content-Type (the type of the entity content sent by the client); Response header: Content-Type (the type of the entity content returned by the server)

Request header: Cache-Control (specifies the caching mechanism followed by the request and response); Response header: Cache-Control with max-age (how many seconds the client may cache the resource locally)

Request header: If-Modified-Since; Response header: Last-Modified

Request header: If-None-Match (used to check whether the file content has changed); Response header: ETag (the current entity tag of the requested resource, a fingerprint of the file that changes whenever the file changes)

Response header: Set-Cookie (sets the cookie associated with the page; the server passes the cookie to the client through this header)

Response header: Keep-Alive (if the client requests keep-alive, the server responds with it as well)

Request header: Origin; Response header: Access-Control-Allow-Origin

Request headers: Host (the host name of the server being requested); Referer (the URL of the page the request came from)

Response headers: Access-Control-Allow-Headers (the request headers allowed by the server); Access-Control-Allow-Methods (the request methods allowed by the server)
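To see where these headers sit in an actual message, the following sketch splits the head of a raw HTTP/1.1 request into its request line and header fields. The function `parse_request` and the sample request are made up for illustration; a production server would rely on a proper HTTP parser.

```python
from typing import Dict, Tuple


def parse_request(raw: str) -> Tuple[str, Dict[str, str]]:
    """Split a raw HTTP/1.1 request head into request line and headers."""
    head = raw.split("\r\n\r\n", 1)[0]  # everything before the body
    request_line, *header_lines = head.split("\r\n")
    headers: Dict[str, str] = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return request_line, headers


raw = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Accept: text/html\r\n"
    "Accept-Encoding: gzip\r\n"
    'If-None-Match: "abc123"\r\n'
    "\r\n"
)
request_line, headers = parse_request(raw)
print(request_line)              # GET /index.html HTTP/1.1
print(headers["host"])           # example.com
print(headers["if-none-match"])  # "abc123"
```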

4. Cookies and sessions

When the user logs in, the server creates a session containing information about the user (such as the account and password) and identifies it with a sessionId (the key for that session on the server). The server then writes the sessionId into a cookie in its response to the login page. On later visits to pages under the same domain name, the browser sends the cookie automatically and the server checks it against the stored session.
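A minimal sketch of this flow, using the standard library's cookie helper, might look like the following. The in-memory `SESSIONS` dictionary and the functions `login` and `current_user` are assumptions for illustration; a real server would also expire sessions and would normally avoid keeping the password in the session at all.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Dict, Optional

SESSIONS: Dict[str, dict] = {}  # server-side store: sessionId -> user info


def login(username: str) -> str:
    """Create a session and return the value for the Set-Cookie header."""
    session_id = secrets.token_hex(16)  # unguessable key for the session
    SESSIONS[session_id] = {"user": username}
    cookie = SimpleCookie()
    cookie["sessionId"] = session_id
    cookie["sessionId"]["httponly"] = True
    cookie["sessionId"]["path"] = "/"
    return cookie["sessionId"].OutputString()


def current_user(cookie_header: str) -> Optional[str]:
    """Look up the session named by the request's Cookie header."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "sessionId" not in cookie:
        return None
    session = SESSIONS.get(cookie["sessionId"].value)
    return session["user"] if session else None


set_cookie = login("alice")
print("Set-Cookie:", set_cookie)
# On later requests the browser sends back only the name=value pair.
cookie_header = set_cookie.split(";", 1)[0]
print(current_user(cookie_header))  # alice
```

Only the sessionId travels in the cookie; the user information itself stays on the server.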

5. Strong cache and negotiated cache

(1) Strong cache and negotiated cache

  • Strong cache: If the browser determines that the local cache is not expired, it directly uses it without sending an HTTP request

  • Negotiated cache: The browser sends an HTTP request to the server; if the file has not changed, the server says so (304 Not Modified) and the browser uses its local cache

(2) How to judge strong cache and negotiated cache

  • Strong cache

HTTP1.0: Pragma / Expires; HTTP1.1: Cache-Control / max-age

Cache-Control directives:
public: shared cache; the response can be cached by browsers and proxy servers
private: private cache; the response can be cached only by the browser
no-cache: the response can be stored, but the cache must revalidate it with the server before each use
no-store: disables caching completely; neither the browser nor proxy servers store the response, so it is fetched from the server every time
max-age: while max-age > 0 and the cached copy is younger than max-age seconds, it is read directly from the browser cache; when max-age <= 0, the browser sends an HTTP request to the server to confirm whether the resource has been modified

  • Negotiated cache

HTTP1.0: If-Modified-Since / Last-Modified; HTTP1.1: If-None-Match / ETag
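The decision between the two cache types can be sketched as one function. This is a simplified model, not a browser's actual algorithm: the `CachedResponse` class, the `fetch` function, and the `server_etag` parameter (which stands in for the comparison the server would perform on If-None-Match) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class CachedResponse:
    body: str
    etag: str
    stored_at: float  # seconds, on the same clock as `now`
    max_age: int      # from Cache-Control: max-age=...


def fetch(
    cached: CachedResponse,
    now: float,
    server_etag: str,
    download: Callable[[], str],
) -> Tuple[str, str]:
    """Return (body, how), where `how` names the path that was taken."""
    age = now - cached.stored_at
    if age < cached.max_age:
        # Strong cache: the copy is still fresh, so no request is sent.
        return cached.body, "strong cache (no request)"
    # Negotiated cache: send If-None-Match and let the server compare.
    if_none_match = cached.etag
    if if_none_match == server_etag:
        return cached.body, "304 Not Modified (reuse local copy)"
    return download(), "200 OK (new body)"


entry = CachedResponse(body="v1", etag='"a1"', stored_at=0, max_age=60)
print(fetch(entry, now=30, server_etag='"a1"', download=lambda: "v2"))
print(fetch(entry, now=90, server_etag='"a1"', download=lambda: "v2"))
print(fetch(entry, now=90, server_etag='"b2"', download=lambda: "v2"))
```

The three calls walk through the three possible outcomes: a fresh copy served locally, a stale copy confirmed by the server, and a stale copy replaced by a new download.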

6. HTTP1.0, HTTP1.1, and HTTP2.0

(1) Differences between HTTP1.0 and HTTP1.1

  • Cache handling: HTTP1.0 mainly uses If-Modified-Since and Expires in the headers to decide whether a cached copy can be used; HTTP1.1 introduces ETag, If-Unmodified-Since, If-None-Match, and other cache headers for finer control over the cache policy.

  • Bandwidth optimization and network connection usage: HTTP1.0 wastes bandwidth in some cases. For example, when the client needs only part of an object, the server still sends the whole object, and resumable downloads are not supported. HTTP1.1 introduces the Range request header, which lets the client request only a portion of a resource; the server then responds with status code 206 (Partial Content).

  • Error notification management: HTTP1.1 adds 24 new error status codes. For example, 409 (Conflict) indicates that the request conflicts with the current state of the resource, and 410 (Gone) indicates that a resource on the server has been permanently deleted.

  • Host header handling: HTTP1.0 assumed that each server was bound to a unique IP address, so the request message did not carry the host name. With the development of virtual hosting, however, one physical server can run multiple virtual hosts that share a single IP address. HTTP1.1 therefore requires the Host header field in request messages, and a request without it is answered with an error (400 Bad Request).

  • Persistent connections: Connection: keep-alive is enabled by default in HTTP1.1, so multiple requests and responses can share one TCP connection. This partly makes up for HTTP1.0 creating a new connection for every request.
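The Range mechanism from the bandwidth item above can be sketched as follows. This handles only the simple `bytes=start-end` and `bytes=start-` forms; the function `serve_range` is a hypothetical name, and real servers also support suffix ranges and multiple ranges.

```python
import re
from typing import Dict, Tuple


def serve_range(resource: bytes, range_header: str) -> Tuple[int, Dict[str, str], bytes]:
    """Answer a single byte range; fall back to a full 200 response otherwise."""
    total = len(resource)
    match = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header.strip())
    if not match:
        return 200, {}, resource  # unsupported form: send everything
    start = int(match.group(1))
    if start >= total:
        # The range starts past the end of the resource.
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    # An empty end means "to the end"; the end position is inclusive.
    end = int(match.group(2)) if match.group(2) else total - 1
    end = min(end, total - 1)
    if end < start:
        return 200, {}, resource  # invalid range: ignore the header
    headers = {"Content-Range": f"bytes {start}-{end}/{total}"}
    return 206, headers, resource[start:end + 1]


data = b"0123456789"
print(serve_range(data, "bytes=0-3"))  # first four bytes
print(serve_range(data, "bytes=6-"))   # resume from byte 6 to the end
```

A resumable download is just the second call: the client remembers how many bytes it already has and asks for the rest.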

(2) New features of HTTP2.0

  • New binary format: HTTP1.x messages are parsed as text, whereas HTTP2.0 messages are parsed in a binary format

  • Multiplexing: connections are shared. Each request is assigned an ID, so a single connection can carry many requests at once, and the frames of different requests can be interleaved arbitrarily. The receiver uses the request ID to reassemble the frames into their separate requests.

  • Header compression: HTTP1.x headers carry a large amount of information and must be resent with every request. HTTP2.0 uses an encoder (HPACK) to reduce the size of the headers that need to be transmitted

  • Server push
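The multiplexing idea can be illustrated with a toy framing scheme: messages are cut into chunks tagged with an ID, mixed on one connection, and regrouped by ID on the other side. This is a sketch only. The `interleave` and `reassemble` functions and the `(stream ID, chunk)` tuple are assumptions for the example, and real HTTP2.0 frames carry binary headers, flow control, and priorities that are left out here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Frame = Tuple[int, bytes]  # (stream ID, chunk of data)


def interleave(messages: Dict[int, bytes], chunk_size: int) -> List[Frame]:
    """Cut each message into frames and mix them round-robin."""
    offsets = {stream_id: 0 for stream_id in messages}
    frames: List[Frame] = []
    while offsets:
        for stream_id in list(offsets):
            start = offsets[stream_id]
            frames.append((stream_id, messages[stream_id][start:start + chunk_size]))
            offsets[stream_id] = start + chunk_size
            if offsets[stream_id] >= len(messages[stream_id]):
                del offsets[stream_id]  # this message is fully sent
    return frames


def reassemble(frames: Iterable[Frame]) -> Dict[int, bytes]:
    """Group frames by stream ID to rebuild each message."""
    streams: Dict[int, bytes] = defaultdict(bytes)
    for stream_id, chunk in frames:
        streams[stream_id] += chunk
    return dict(streams)


messages = {1: b"GET /a", 3: b"GET /style.css"}
frames = interleave(messages, chunk_size=4)
print(frames)
print(reassemble(frames) == messages)  # True
```

Because every frame names its stream, a slow response no longer holds up the others on the same connection the way it does with HTTP1.x.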

7. The difference between HTTP and HTTPS

1. HTTPS requires applying to a certificate authority (CA) for a certificate.

2. HTTP runs directly on TCP, and everything it transmits is plaintext. HTTPS runs on SSL/TLS, which in turn runs on TCP, and everything it transmits is encrypted.

3. HTTP and HTTPS use different connection methods and different default ports: 80 for HTTP and 443 for HTTPS
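As a small illustration of the default ports, the sketch below returns the port a client would connect to for a given URL. The helper name `effective_port` is made up; `urlsplit` is the standard library's URL parser.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the explicit port in the URL, or the scheme's default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


print(effective_port("http://example.com/"))        # 80
print(effective_port("https://example.com/login"))  # 443
print(effective_port("https://example.com:8443/"))  # 8443
```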

8. Symmetric and asymmetric encryption

(1) Symmetric encryption

Symmetric encryption uses the same secret key for both encryption and decryption. Its advantage is speed. Its disadvantage is equally clear: the key must be kept secret, and if it leaks, everything protected by it is at risk.

(2) Asymmetric encryption

Asymmetric encryption generates a pair of keys: a public key and a private key. The private key is kept secret by its owner, and the public key is published. Data encrypted with the public key can only be decrypted with the matching private key.

9. HTTPS principle

(1) Certificate verification stage (asymmetric encryption)

  • The browser initiates an HTTPS request
  • The server returns its HTTPS certificate (which includes the public key)
  • The client verifies whether the certificate is valid and prompts a warning if the certificate is invalid

(2) Data transmission stage (symmetric encryption)

  • If the certificate is valid, the client generates a random number locally
  • The client encrypts the random number with the public key and sends it to the server
  • The server decrypts the random number with its private key
  • The server builds a symmetric cipher from the random number sent by the client, encrypts the response content with it, and transmits it
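The two stages can be walked through with toy numbers. Everything here is deliberately insecure and for illustration only: the RSA key is the tiny textbook pair built from p=61 and q=53, the "symmetric cipher" is a plain XOR with a SHA-256 digest, and the function names are made up. Real TLS uses vetted algorithms, and current versions agree on the shared secret with a Diffie-Hellman exchange rather than by encrypting a random number with the certificate's key.

```python
import hashlib
import secrets

# Textbook-sized RSA key pair (p=61, q=53). Far too small to be secure.
N, E, D = 3233, 17, 2753  # public key is (N, E); private key is (N, D)


def rsa_encrypt(number: int) -> int:
    """Asymmetric step: anyone holding the public key can encrypt."""
    return pow(number, E, N)


def rsa_decrypt(cipher: int) -> int:
    """Only the holder of the private key can decrypt."""
    return pow(cipher, D, N)


def xor_cipher(data: bytes, secret: int) -> bytes:
    """Toy symmetric cipher: the same call encrypts and decrypts."""
    key = hashlib.sha256(str(secret).encode()).digest()
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Client: generate a random number and encrypt it with the public key.
client_secret = secrets.randbelow(N - 2) + 2
encrypted_secret = rsa_encrypt(client_secret)

# Server: recover the random number with the private key.
server_secret = rsa_decrypt(encrypted_secret)
assert server_secret == client_secret

# Both sides now share a secret and switch to symmetric encryption.
ciphertext = xor_cipher(b"<html>hello</html>", server_secret)
print(xor_cipher(ciphertext, client_secret))  # b'<html>hello</html>'
```

The slow asymmetric step runs once to share the secret; the bulk of the traffic then uses the fast symmetric step.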

4. The browser parses and renders the page

Each tab page runs in its own browser kernel process, and each process is multi-threaded, with the following kinds of threads:

1. The GUI thread

This thread parses HTML into a DOM tree, parses CSS into a CSSOM tree, and performs layout and paint. It also runs whenever the page needs a reflow (re-layout) or a repaint. It is mutually exclusive with the JS engine thread.

Bytes -> Characters -> Tokens -> Nodes -> Object Model

Render tree: formed by combining the DOM tree with the CSSOM tree. Elements that are not displayed, such as script, head, or anything with display: none, are not included in the render tree and therefore are not rendered. Page layout is computed from the render tree.
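The filtering step can be sketched on a DOM-like tree of dictionaries. This is a simplification under stated assumptions: the node layout and the functions `build_render_tree` and `tags` are invented for the example, and styles are assumed to be already attached to each node, whereas a browser computes them from the CSSOM.

```python
from typing import Dict, List, Optional

NON_VISUAL_TAGS = {"head", "script"}


def build_render_tree(node: Dict) -> Optional[Dict]:
    """Copy a DOM-like node into the render tree, skipping invisible ones."""
    if node["tag"] in NON_VISUAL_TAGS:
        return None
    style = node.get("style", {})
    if style.get("display") == "none":
        return None  # the whole subtree is dropped
    children: List[Dict] = []
    for child in node.get("children", []):
        rendered = build_render_tree(child)
        if rendered is not None:
            children.append(rendered)
    return {"tag": node["tag"], "style": style, "children": children}


def tags(node: Dict) -> List[str]:
    """Flatten a render tree into a depth-first list of tag names."""
    result = [node["tag"]]
    for child in node["children"]:
        result.extend(tags(child))
    return result


dom = {"tag": "html", "children": [
    {"tag": "head", "children": [{"tag": "title"}]},
    {"tag": "body", "children": [
        {"tag": "h1", "style": {"color": "red"}},
        {"tag": "div", "style": {"display": "none"}, "children": [{"tag": "p"}]},
        {"tag": "script"},
    ]},
]}
print(tags(build_render_tree(dom)))  # ['html', 'body', 'h1']
```

Note that the `p` inside the hidden `div` disappears too: excluding a node excludes its whole subtree.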

2. JS engine thread

This thread parses and executes JS scripts. It is mutually exclusive with the GUI thread: the GUI thread is suspended while the JS engine thread runs and resumes when the JS engine thread finishes. JS execution consists of one main thread, which processes macro tasks and micro tasks, plus any number of Web Worker threads. Because Web Workers are subordinate to the main thread and cannot operate on the DOM, JS is still a single-threaded language.

When JS blocks the page

To be clear, downloading and executing JS files blocks both HTML parsing and page rendering. JS scripts may change the DOM structure, so if they did not block parsing and rendering, any change a script made to the DOM structure or to element styles would trigger a reflow and repaint of content that had already been rendered, which wastes performance unnecessarily.

If you do not want JS to block the page, you can add the async or defer attribute so that the JS file is downloaded asynchronously, in parallel with HTML parsing. The two attributes differ in when the script executes.

defer delays execution: the script behaves as if it were placed at the end of the body. Deferred scripts run in document order after parsing finishes, just before the DOMContentLoaded event (which fires once the DOM is ready, without waiting for style sheets or images).

async executes asynchronously: the script runs as soon as its download completes, in no guaranteed order relative to other scripts, and before the load event (which fires once the DOM, style sheets, and images on the page have all loaded).

3. Event trigger thread

When an event fires (either a Web API completion event or a page interaction event), this thread places the corresponding callback function in the callback queue, where it waits for the JS engine thread to process it

4. Timer thread

This thread does the timing for the setTimeout and setInterval APIs. When a timer expires, its callback function is put into the task queue. Browsers also enforce a minimum delay: a setTimeout delay of less than 4ms is treated as 4ms (the HTML standard applies this clamp once timers are nested more than five levels deep).

5. Asynchronous HTTP request thread

A thread of this kind is opened for each HTTP request. When it detects a state change, it generates a state change event; if that event has a callback function, the callback is put into the task queue. An event polling thread watches the task queue to determine whether it is empty, so that queued callbacks can be handed to the JS engine thread.
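How the timer thread, the task queue, and the JS engine thread cooperate can be modeled with a small single-threaded simulation. This is a sketch on a virtual clock, not how a browser is implemented: the `EventLoop` class and its methods are hypothetical, callbacks are assumed to take no time, and micro tasks and the 4ms clamp are left out.

```python
import heapq
from collections import deque
from typing import Callable, Deque, List, Tuple


class EventLoop:
    """Timers feed a task queue; queued callbacks run one at a time."""

    def __init__(self) -> None:
        self.now = 0  # virtual clock in milliseconds
        self._timers: List[Tuple[int, int, Callable[[], None]]] = []
        self._tasks: Deque[Callable[[], None]] = deque()
        self._next_id = 0

    def set_timeout(self, callback: Callable[[], None], delay_ms: int) -> None:
        # The sequence number keeps timers with equal due times in FIFO order.
        heapq.heappush(self._timers, (self.now + delay_ms, self._next_id, callback))
        self._next_id += 1

    def run(self) -> None:
        while self._timers or self._tasks:
            if not self._tasks:
                # Idle: jump the clock forward to the next due timer.
                self.now = max(self.now, self._timers[0][0])
            # "Timer thread": move expired timers into the task queue.
            while self._timers and self._timers[0][0] <= self.now:
                _, _, callback = heapq.heappop(self._timers)
                self._tasks.append(callback)
            # "JS engine thread": run one queued callback to completion.
            if self._tasks:
                self._tasks.popleft()()


log: List[str] = []
loop = EventLoop()
loop.set_timeout(lambda: log.append("timer 10ms"), 10)
loop.set_timeout(lambda: log.append("timer 0ms"), 0)
loop.set_timeout(lambda: loop.set_timeout(lambda: log.append("nested"), 5), 0)
loop.run()
print(log)  # ['timer 0ms', 'nested', 'timer 10ms']
```

The callbacks never run at the moment their timer expires; they run when the loop picks them off the queue, which is why a long-running script delays every timer behind it.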