What happens from entering the URL to loading the page

Process

  • URL parsing: Address bar URL cache
  • Check the HSTS preload list
  • DNS query: DNS translates a domain name into an IP address (this process is called DNS resolution), with caches at several levels
  • ARP cache
  • TCP connection: TCP send buffer & receive buffer
  • Request processing: HTTP request cache (CDN node cache, proxy server cache, browser cache, back-end dynamic calculation result cache, etc.)
  • Receive the response and render the page

URL parsing

Address bar URL cache

The first cache involved in handling the user's input is the address bar URL cache: we type just a few letters and the browser autocompletes the URL. When we load the autocompleted URL, we can see that the requested static resources are also served from the cache.

The main page document itself (here, timeline), however, is requested from the server every time and cannot be served from the local browser cache. Why? The reason becomes clear once you notice the hash in the static resource file names: the page has to be fetched fresh so that it references the latest hashed file names.

  • Converts non-ASCII Unicode characters

The browser checks whether the hostname contains any character other than a-z, A-Z, 0-9, - or .; if so, it encodes the hostname part with Punycode.
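
As a minimal sketch of this conversion, Python's built-in idna codec shows what a non-ASCII hostname looks like after Punycode encoding. The helper name to_ascii_hostname and the hostname münchen.de are illustrative assumptions, not taken from the original article, and a real browser applies more rules than this.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Return the ASCII (Punycode) form of a hostname, label by label."""
    # The built-in "idna" codec Punycode-encodes each non-ASCII label and
    # prefixes it with "xn--"; pure-ASCII labels pass through unchanged.
    return hostname.encode("idna").decode("ascii")


print(to_ascii_hostname("münchen.de"))  # non-ASCII label gets an xn-- prefix
print(to_ascii_hostname("juejin.im"))   # already ASCII, returned as-is
```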

HSTS preload list

HSTS: HTTP Strict Transport Security is a web security mechanism standardized by the IETF (Internet Engineering Task Force) that forces clients (such as browsers) to use HTTPS when connecting to a server.

With HSTS: after the URL is entered, a browser that supports HSTS checks the HSTS preload list (a list of domain names that ask browsers to use HTTPS only). If the site is on the list, the browser switches to HTTPS itself, which Chrome reports as a 307 internal redirect. Browsers that do not support HSTS simply visit the site without this redirect, so compatibility is preserved.

For example, entering http://juejin.im/timeline for Juejin jumps to https://juejin.im/timeline.
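
The preload check can be sketched roughly as follows. This is a simplified illustration, not how any particular browser implements it: the HSTS_PRELOAD set is a hypothetical stand-in for the browser's built-in list, and the sketch assumes every entry also covers its subdomains.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the browser's built-in preload list.
HSTS_PRELOAD = {"juejin.im"}


def is_preloaded(host: str) -> bool:
    """True if the host or one of its parent domains is in the preload set."""
    labels = host.lower().split(".")
    return any(".".join(labels[i:]) in HSTS_PRELOAD for i in range(len(labels)))


def upgrade_if_hsts(url: str) -> str:
    """Rewrite http:// to https:// before any request is sent, if preloaded."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname and is_preloaded(parts.hostname):
        return parts._replace(scheme="https").geturl()
    return url


print(upgrade_if_hsts("http://juejin.im/timeline"))
```

The key point is that the rewrite happens locally, so no plain HTTP request ever leaves the browser for a preloaded domain.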

Other operations

The browser also performs some additional operations, such as security checks and access restrictions (for example, some browsers in China previously restricted access to 996.icu).

DNS resolution process (enter juejin.im and press Enter to start domain name resolution for juejin.im)

DNS resolution is the process of finding which machine holds the resources you need, that is, obtaining the IP address of the target domain name.

A domain-to-IP lookup takes many steps and can be slow, so DNS is optimized with caching. DNS has multiple levels of cache, ordered by distance from the browser: browser cache, system cache, router cache, ISP server cache, root DNS server cache, top-level DNS server cache, and primary (authoritative) DNS server cache.

The browser first searches its own DNS cache (the browser maintains a mapping between domain names and IP addresses). If there is no match, it searches the operating system's DNS cache. If there is still no match, it searches the operating system's hosts file (a mapping table between domain names and IP addresses, which Windows also maintains). If there is still no match, resolution moves on to the next step.

The browser’s DNS cache

  • Enter chrome://dns/ in Chrome's address bar to see Chrome's DNS cache (recent versions expose this under chrome://net-internals/#dns).
  • On Linux, the system's host mappings are stored in /etc/hosts.
  • Although the DNS cache speeds up lookups, a cached entry cannot reflect the latest IP address when the address changes. For this reason the browser does not keep DNS cache entries for long, about a minute.
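
The trade-off between speed and freshness can be illustrated with a small time-limited cache. This is only a sketch: the DnsCache class is hypothetical, the 60-second default mirrors the "a minute or so" mentioned above, and the address 192.0.2.10 is a made-up value from a documentation range.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class DnsCache:
    """Hypothetical host -> IP cache whose entries expire after a short TTL."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, host: str, ip: str) -> None:
        # Remember the address together with the moment it stops being valid.
        self._entries[host] = (ip, self._clock() + self._ttl)

    def get(self, host: str) -> Optional[str]:
        entry = self._entries.get(host)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            # Stale entry: drop it so the caller performs a fresh lookup.
            del self._entries[host]
            return None
        return ip


cache = DnsCache()
cache.put("juejin.im", "192.0.2.10")  # made-up address
print(cache.get("juejin.im"))
```

Injecting the clock as a parameter is a design choice that makes expiry easy to exercise without waiting a real minute.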

Operating system DNS cache

  • The operating system sends the domain name to the LDNS (local DNS server), which checks its own DNS cache (the hit ratio is about 80%). On a hit, the LDNS returns the result directly. On a miss, it starts an iterative DNS resolution:
  • The LDNS sends a request to the root name server, which knows the addresses of the top-level domain (TLD) name servers such as com, net, and im. Here the root name server returns the address of the TLD name server for the im domain.
  • The LDNS sends a request to the TLD name server for the im domain, which returns the address of the name server for juejin.im.
  • The LDNS sends a request to the juejin.im name server and obtains the IP address of juejin.im.
  • The LDNS returns the IP address to the operating system and caches it. The operating system returns the IP address to the browser and caches it as well.
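
The iterative steps above can be simulated without touching the network. In this sketch the server names, the SERVERS table, and the address 192.0.2.10 are all invented for illustration; a real resolver speaks the DNS protocol rather than reading a dictionary.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory "name servers". Each maps a domain suffix either to
# a referral (the next server to ask) or to a final answer (an IP address).
SERVERS: Dict[str, Dict[str, Tuple[str, str]]] = {
    "root": {"im": ("referral", "tld-im")},
    "tld-im": {"juejin.im": ("referral", "ns-juejin")},
    "ns-juejin": {"juejin.im": ("answer", "192.0.2.10")},
}


def query(server: str, name: str) -> Tuple[str, str]:
    """Ask one server; it either answers or refers us to a closer server."""
    zone = SERVERS[server]
    labels = name.split(".")
    # Try the longest matching suffix first: "juejin.im", then "im".
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in zone:
            return zone[suffix]
    raise LookupError(f"{server} knows nothing about {name}")


def resolve_iteratively(name: str, max_hops: int = 10) -> Tuple[str, List[str]]:
    """Follow referrals from the root until some server returns an answer."""
    server = "root"
    path: List[str] = []
    for _ in range(max_hops):
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        server = value  # referral: ask the next server ourselves
    raise LookupError(f"too many referrals while resolving {name}")


ip, path = resolve_iteratively("juejin.im")
print(ip, path)
```

"Iterative" here means the LDNS itself contacts each server in turn, instead of asking the root server to do the whole lookup on its behalf.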

DNS Prefetch

By default, the browser prefetches (resolves in advance and caches) the domain names that appear in a page and differ from the current domain (the domain of the page being viewed). This is implicit DNS Prefetch. Two DNS-related points matter in front-end optimization:

  • Reduce the number of DNS requests
  • Use DNS Prefetch, and place it as early in the page as possible (near the top of <head>). It is used as follows:
<meta http-equiv="x-dns-prefetch-control" content="on">

<link rel="dns-prefetch" href="//www.img.com">

<link rel="dns-prefetch" href="//www.api.com">

<link rel="dns-prefetch" href="//www.test.com">

If you want to disable implicit DNS Prefetch, you can use the following tags:

<meta http-equiv="x-dns-prefetch-control" content="off">

DNS Load Balancing

The well-known Content Delivery Network (CDN) relies on DNS redirection: the DNS server returns the IP address of the node closest to the user, and that CDN node's server responds to the user's request and serves the required content.

ARP cache

ARP (Address Resolution Protocol) is a protocol for resolving addresses: given the IP address of a peer, it finds that peer's MAC address.

The ARP cache is a buffer for storing IP and MAC addresses. In essence, it is a mapping table of IP and MAC addresses. Each entry in the table records the IP and MAC addresses of another host.

When ARP is asked for the MAC address of a node with a known IP address, it first checks the ARP cache. If there is an entry, the corresponding MAC address is returned directly. If not, an ARP request is sent.
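
The cache-then-request behavior can be sketched as below. The ArpTable class and the fake_network lookup are hypothetical stand-ins for the operating system's ARP implementation and the local network, and the addresses are made up.

```python
from typing import Callable, Dict


class ArpTable:
    """Hypothetical IP -> MAC table that only 'broadcasts' on a cache miss."""

    def __init__(self, send_request: Callable[[str], str]) -> None:
        self._cache: Dict[str, str] = {}
        self._send_request = send_request
        self.requests_sent = 0

    def resolve(self, ip: str) -> str:
        mac = self._cache.get(ip)
        if mac is None:
            # Cache miss: send an ARP request and remember the reply.
            self.requests_sent += 1
            mac = self._send_request(ip)
            self._cache[ip] = mac
        return mac


def fake_network(ip: str) -> str:
    # Stand-in for the host that would answer a real ARP broadcast.
    return {"192.0.2.1": "02:00:00:00:00:01"}[ip]


table = ArpTable(fake_network)
table.resolve("192.0.2.1")
table.resolve("192.0.2.1")  # second lookup is served from the cache
print(table.requests_sent)
```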

TCP connection

HTTP uses TCP as its transport layer protocol. Setting up and tearing down the TCP connection is handled automatically by the TCP/IP protocol stack.

TCP three-way handshake and four-way wave

  • TCP segment format
    • 1) Sequence number (Seq): 32 bits. It identifies the position in the byte stream sent from the TCP source to the destination and is set by the sender when it sends data.
    • 2) Acknowledgment number (Ack): 32 bits. The field is valid only when the ACK flag is 1. During the handshake, Ack = Seq + 1, where Seq is the sequence number received from the peer.
    • 3) Flag bits: there are 6 flag bits in total, namely URG, ACK, PSH, RST, SYN, and FIN. Their meanings are as follows: (A) URG: the urgent pointer is valid. (B) ACK: the acknowledgment number is valid. (C) PSH: the receiver should deliver the segment to the application layer as soon as possible. (D) RST: resets the connection. (E) SYN: initiates a new connection. (F) FIN: releases a connection.
  • TCP three-way handshake
    • To establish a TCP connection, the client and server exchange three segments in total. The purpose of the three-way handshake is to connect to a specified port on the server, establish the TCP connection, and synchronize and exchange the sequence numbers and acknowledgment numbers of both sides
  • TCP four-way wave
    • Tearing down a TCP connection requires four segments, which is called the four-way wave (four-way handshake). Either the client or the server can initiate it. In socket programming, whichever side calls close() starts the teardown
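
To make the Ack = Seq + 1 relationship concrete, the three handshake segments can be modeled with plain numbers. This is a toy model, not a TCP implementation: the Segment class and the initial sequence numbers 100 and 300 are arbitrary assumptions.

```python
from dataclasses import dataclass
from typing import List

SEQ_SPACE = 2 ** 32  # sequence and acknowledgment numbers are 32 bits wide


@dataclass
class Segment:
    flags: str
    seq: int
    ack: int = 0  # only meaningful when the ACK flag is set


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged while opening a connection."""
    # 1) Client -> server: SYN carrying the client's initial sequence number.
    syn = Segment("SYN", seq=client_isn)
    # 2) Server -> client: SYN+ACK, acknowledging the client's Seq + 1.
    syn_ack = Segment("SYN+ACK", seq=server_isn, ack=(syn.seq + 1) % SEQ_SPACE)
    # 3) Client -> server: ACK, acknowledging the server's Seq + 1.
    ack = Segment("ACK", seq=syn_ack.ack, ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

The modulo keeps the numbers inside the 32-bit field, so a sequence number at the top of the range wraps around to 0.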

HTTP request

Sending an HTTP request means constructing an HTTP request message and sending it over TCP to the specified port on the server (HTTP: 80/8080, HTTPS: 443). An HTTP request message consists of three parts: the request line, the request headers, and the request body.
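
The three parts of a request message can be seen by assembling one by hand. This is a minimal HTTP/1.1 sketch; build_request is a hypothetical helper, and real clients add more headers and handle encodings that are skipped here.

```python
from typing import Dict, Optional


def build_request(method: str, path: str, host: str,
                  headers: Optional[Dict[str, str]] = None,
                  body: bytes = b"") -> bytes:
    """Assemble request line + headers + blank line + body as raw bytes."""
    lines = [f"{method} {path} HTTP/1.1",  # request line
             f"Host: {host}"]              # Host is mandatory in HTTP/1.1
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # An empty line separates the headers from the (possibly empty) body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


print(build_request("GET", "/timeline", "juejin.im").decode("ascii"))
```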

  • HTTPS process: before transferring data over HTTPS, the client and server perform a TLS/SSL handshake, during which the key used to encrypt the transmitted data is established. TLS/SSL uses asymmetric encryption, symmetric encryption, and hash algorithms
  • HTTP request cache (CDN node cache, proxy server cache, browser cache, back-end dynamic calculation result cache, etc.)
    • Strong caching (Cache-Control and Expires): strong caching is controlled mainly by the Cache-Control and Expires fields in the response headers. Cache-Control can be used in both request headers and response headers
      • Cache-Control has a higher priority than Expires
      • Request header values: no-cache, no-store
      • Response header values: no-cache, no-store, public
    • Negotiated caching (Last-Modified and ETag)
      • When the client requests the resource again, it attaches the If-Modified-Since field to its request headers (the value is the Last-Modified value returned in the response headers when the resource was first obtained).
      • When the client requests the resource again, it attaches the If-None-Match field to its request headers (the value is the ETag value returned in the response headers when the resource was first obtained).
  • Status code
    • 1xx: informational – The request has been received and processing continues.
    • 2xx: success – The request was successfully received, understood, and accepted.
    • 3xx: redirect – Further action must be taken to complete the request.
    • 4xx: client error – The request has a syntax error or cannot be fulfilled.
    • 5xx: server error – The server failed to fulfill a valid request.
    • Common status codes: 200, 204, 301 (permanent redirect), 302 (temporary redirect), 304 (not modified, use the cache), 400, 401, 403, 404, 422, 502 (bad gateway)
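
The interplay of strong and negotiated caching in the list above can be sketched as a single decision function. This is a simplification under stated assumptions: CachedResponse and plan_request are hypothetical names, only max-age freshness is modeled, and directives such as no-cache and no-store are ignored.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class CachedResponse:
    stored_at: float                     # time the response was cached
    max_age: Optional[float] = None      # from Cache-Control: max-age=...
    etag: Optional[str] = None           # from the ETag response header
    last_modified: Optional[str] = None  # from the Last-Modified header


def plan_request(entry: Optional[CachedResponse],
                 now: float) -> Tuple[str, Dict[str, str]]:
    """Decide between using the cache, revalidating, or fetching afresh."""
    if entry is None:
        return "fetch", {}
    # Strong caching: a still-fresh entry is used without contacting the server.
    if entry.max_age is not None and now - entry.stored_at < entry.max_age:
        return "use-cache", {}
    # Negotiated caching: send validators so the server can reply with 304.
    conditional: Dict[str, str] = {}
    if entry.etag:
        conditional["If-None-Match"] = entry.etag
    if entry.last_modified:
        conditional["If-Modified-Since"] = entry.last_modified
    return ("revalidate" if conditional else "fetch"), conditional


entry = CachedResponse(stored_at=0.0, max_age=60.0, etag='"abc"')
print(plan_request(entry, now=30.0))
print(plan_request(entry, now=90.0))
```

When a revalidation request comes back with status 304, the browser reuses the cached body instead of downloading the resource again.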

The browser parses and renders the page

  • DOM Tree: the browser parses the HTML into a tree data structure. (https://segmentfault.com/a/1190000014520786)
  • CSS Rule Tree (CSSOM): the browser parses the CSS into a tree data structure.
  • Render Tree: the DOM and CSSOM are combined to generate the Render Tree.
  • Layout: with the Render Tree, the browser knows which nodes are on the page, the CSS definition of each node, and their dependencies, so it can compute where each node is on the screen.
  • Painting: following the computed rules, the content is drawn to the screen through the graphics card.
  • Composite: the DOM elements of a page are drawn on multiple layers. Once drawing is complete on each layer, the browser merges all layers into one in a reasonable order and displays the result on the screen. This process is especially important for pages with overlapping elements, because if the layers are merged in the wrong order, the elements will display incorrectly
    • JavaScript -> Style -> Layout -> Paint -> Composite
    • JavaScript -> Style -> Paint -> Composite
    • JavaScript -> Style -> Composite
    • LayoutObject and PaintLayer (RenderObject was renamed LayoutObject, and RenderLayer was renamed PaintLayer)
      • RenderObjects preserve the tree structure. A RenderObject knows how to draw the contents of a Node by making the necessary draw calls to a GraphicsContext.
      • Each Node in the DOM tree has a corresponding LayoutObject. The LayoutObject knows how to paint the content of the Node on the screen.
    • PaintLayer -> GraphicsLayer
      • Certain render layers are treated as compositing layers and have their own GraphicsLayer; render layers that are not compositing layers share one with their nearest ancestor that has a GraphicsLayer
      • Each GraphicsLayer has a GraphicsContext, which is responsible for outputting the layer's bitmap. The bitmaps are stored in shared memory and uploaded to the GPU as textures, and are then drawn to the screen, at which point the page is displayed
      • A render layer can be promoted to a compositing layer only if it is a SelfPaintingLayer. Typical reasons for promotion:
        • Hardware-accelerated iframe elements (such as an iframe that embeds a page with compositing layers)
        • 3D or hardware-accelerated 2D canvas elements
        • The video control bar that overlays a video element
        • Video elements
        • The element has a 3D transform
        • backface-visibility is hidden
        • An animation or transition is applied to opacity, transform, filter, or backdrop-filter
  • Reflow: when the browser finds that a change affects the layout and it has to go back and lay the page out again, the process is called reflow
    • Operations that trigger reflow:
      • display: none
      • Adding or removing visible DOM elements
      • Element position changes
      • Element size changes: margin, padding, border, width, and height
      • Content changes, for example width and height changes caused by text changes or image size changes
      • Initial page rendering
      • Browser window size changes (the resize event occurs)
    • Properties and methods that force the browser to flush its queue when read:
      • (1) offsetTop, offsetLeft, offsetWidth, offsetHeight
      • (2) scrollTop/Left/Width/Height
      • (3) clientTop/Left/Width/Height
      • (4) width, height
      • (5) calls to getComputedStyle(), or IE's currentStyle
    • Ways to reduce reflow:
      • (1) Add CSS styles instead of controlling styles with JS (this is how I ran into the reflow problem)
      • (2) Try to make all DOM changes at once
      • (3) Change the className directly, or use cssText when changing styles dynamically (consider unoptimized browsers)
      • (4) Do not frequently read properties that force the browser to flush its queue; if you must, cache the values
      • (5) Take elements out of the animation flow to reduce the size of the Render Tree that is reflowed
      • (6) Set position to absolute or fixed for an element that needs to be reflowed many times, so that it is removed from the document flow and its changes do not affect other elements. For example, animated elements are best given absolute positioning
      • (7) Try not to use table layout. Without fixed widths, the width of a column is determined by its widest cell, so a cell in the last row that exceeds the previous column width can force the whole table to reflow. A table may need several passes to compute the properties of its nodes in the render tree, which typically takes about three times as long as for an equivalent element
  • Repaint: when an element's background color, text color, border color, and so on change without affecting the layout around or inside it, part of the screen is repainted while the element's geometry stays the same
    • visibility: hidden only triggers repaint, because no position change is involved
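
The difference between display: none (reflow) and visibility: hidden (repaint only) can be illustrated with a toy render tree builder. This is a sketch under simplifying assumptions: DomNode and RenderNode are hypothetical classes, styles are read directly from each node, and CSS inheritance of visibility is ignored.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DomNode:
    tag: str
    style: Dict[str, str] = field(default_factory=dict)
    children: List["DomNode"] = field(default_factory=list)


@dataclass
class RenderNode:
    tag: str
    painted: bool  # False means the box keeps its space but is not drawn
    children: List["RenderNode"] = field(default_factory=list)


def build_render_tree(node: DomNode) -> Optional[RenderNode]:
    """Combine structure and style; display: none subtrees are left out."""
    if node.style.get("display") == "none":
        # No box is generated, so toggling this changes layout (reflow).
        return None
    # visibility: hidden keeps the box in the tree, so only painting changes.
    painted = node.style.get("visibility") != "hidden"
    kids = [r for r in map(build_render_tree, node.children) if r is not None]
    return RenderNode(node.tag, painted, kids)


dom = DomNode("body", children=[
    DomNode("div", {"display": "none"}),
    DomNode("p", {"visibility": "hidden"}),
    DomNode("span"),
])
tree = build_render_tree(dom)
print([(child.tag, child.painted) for child in tree.children])
```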
