Browser kernel
The browser kernel is mainly divided into two parts: the rendering engine and the JS engine.
- Rendering engine: responsible for fetching the content of the web page (HTML, CSS, images, etc.), calculating how the page should be displayed, building it into a DOM tree, and then outputting it to a monitor or printer. Different browser kernels interpret the page's syntax differently, so the rendering results also differ;
- JavaScript engine: parses and executes JavaScript, translating JavaScript code into CPU instructions for execution;
- Major browser kernels:
- IE: Trident kernel;
- Firefox: Gecko kernel;
- Safari: WebKit kernel;
- Opera: formerly the Presto kernel, now switched to the Blink kernel used by Google Chrome;
- Chrome: Blink kernel, which is based on WebKit and was developed by Google together with Opera Software;
Browser storage
Feature | cookie | localStorage | sessionStorage | indexedDB |
---|---|---|---|---|
Data lifetime | Generally generated by the server; an expiration time can be set | Persists until explicitly cleared | Cleared when the page is closed | Persists until explicitly cleared |
Storage size | 4KB | 5MB | 5MB | Effectively unlimited (subject to disk quota) |
Sent to the server | Carried in the headers of every request, which affects request performance | Not sent | Not sent | Not sent |
Cookies, sessionStorage and localStorage
- Cookies are data stored (usually encrypted) on a user’s local terminal by a website to identify the user
- Cookie data is automatically carried on every HTTP request to the same domain (even when it is not needed) and is passed back and forth between the browser and the server (an optimization point) – sessionStorage and localStorage do not automatically send data to the server; they keep it locally
- Storage size:
- Cookie data size cannot exceed 4K
- SessionStorage and localStorage, while also limited in size, are much larger than cookies, reaching 5M or more
- Lifetime:
- LocalStorage stores persistent data. Data will not be lost after the browser is closed unless the data is actively deleted.
- SessionStorage data is automatically deleted after the current browser window is closed;
- A cookie remains valid until its expiration time, even if the window or browser is closed
LocalStorage provides getItem (read data) and setItem (store data) to access data.
- LocalStorage can only store strings. Store and read JSON data with JSON.stringify() and JSON.parse(). Since setItem can throw (for example when storage is disabled or the quota is exceeded), wrap it in try…catch (see the sketch after this list)
- Cookies were not originally designed for storage but for communication with the server; to use them for storage you need to wrap your own read/write helpers around document.cookie.
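A minimal sketch of the two points above, assuming the usual browser globals (localStorage, document.cookie); the helper names are illustrative, not from any library:

```ts
// Store/read JSON in localStorage, guarding against disabled storage or quota errors.
function safeSet(key: string, value: unknown): boolean {
  try {
    localStorage.setItem(key, JSON.stringify(value));
    return true;
  } catch {
    // Storage disabled (e.g. private mode) or quota exceeded.
    return false;
  }
}

function safeGet<T>(key: string): T | null {
  try {
    const raw = localStorage.getItem(key);
    return raw === null ? null : (JSON.parse(raw) as T);
  } catch {
    return null;
  }
}

// Cookies have no storage API of their own, so reads/writes go through document.cookie.
function setCookie(name: string, value: string, days: number): void {
  const expires = new Date(Date.now() + days * 24 * 60 * 60 * 1000).toUTCString();
  document.cookie = `${name}=${encodeURIComponent(value)}; expires=${expires}; path=/`;
}

function getCookie(name: string): string | null {
  const match = document.cookie.match(new RegExp(`(?:^|; )${name}=([^;]*)`));
  return match ? decodeURIComponent(match[1]) : null;
}
```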
HTTP request methods
- GET: retrieves data from the server (read);
- POST: submits data to the server, usually to create data (create);
- PUT: submits data to the server, usually to update data (update); similar to POST it also submits data, but PUT specifies the location of the resource on the server;
- HEAD: requests only the response headers, without the body;
- DELETE: deletes a resource on the server;
- OPTIONS: used to find out which request methods the current URL supports;
- TRACE: echoes the received request back to the client as an application-layer loop-back, mainly for diagnostics;
- CONNECT: establishes a tunnel to the server, turning the connection into a transparent TCP/IP channel, typically used for HTTPS proxying (a fetch-based sketch of the common methods follows this list);
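A hedged sketch of how the most common methods look with the browser fetch API; the URL and payload below are placeholders, not a real endpoint:

```ts
// Illustrative only; runs in a module context with top-level await.
const base = "https://api.example.com/articles";

// GET: read a resource.
const list = await fetch(base).then((res) => res.json());
console.log(list);

// POST: create a resource; the data travels in the request body.
await fetch(base, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ title: "hello" }),
});

// PUT: update the resource at a known location.
await fetch(`${base}/1`, {
  method: "PUT",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ title: "hello, updated" }),
});

// DELETE: remove the resource.
await fetch(`${base}/1`, { method: "DELETE" });

// HEAD: fetch only the headers, e.g. to check Content-Length without downloading the body.
const head = await fetch(`${base}/1`, { method: "HEAD" });
console.log(head.headers.get("content-length"));
```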
HTTP status codes
- 1XX: message status code
- 100: Continue. After the HTTP headers of a POST request have been sent, the server returns this status to confirm the request, and the client then sends the actual parameter data.
- 2XX: success status code
- 200: OK. The request succeeded and the response is returned normally;
- 201: Created. The request succeeded and the server created a new resource;
- 202: Accepted. The server has received the request but has not yet processed it;
- 3XX: redirection
- 301: Moved Permanently. The requested page has been permanently moved to a new URL;
- 302: Found. Temporary redirection;
- 303: See Other. A temporary redirect that always uses GET to request the new URL;
- 304: Not Modified. The requested page has not been modified since the last request;
- 4XX: Client error
- 400: Bad Request. The server cannot understand the format of the request; the client should not retry the request with the same content;
- 401: Unauthorized. The request is not authorized;
- 403: Forbidden. Access is forbidden;
- 404: Not found
- 5XX: Server error
- 500: Internal Server Error. A generic server error;
- 503: Service Unavailable. The server is temporarily unable to handle the request (possibly overloaded or down for maintenance); see the sketch below for one way to handle these codes on the client
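A small, hedged sketch of reacting to status codes on the client with fetch; the function name and endpoint are illustrative:

```ts
async function loadResource(url: string): Promise<string | null> {
  const res = await fetch(url, { cache: "no-cache" });

  if (res.status === 304) {
    // Not Modified: the cached copy is still fresh (fetch usually handles this transparently).
    return null;
  }
  if (res.ok) {
    // Any 2xx code: the body is usable.
    return res.text();
  }
  if (res.status >= 500) {
    // 5xx: server-side problem; a retry with backoff may be reasonable.
    throw new Error(`Server error: ${res.status}`);
  }
  // Remaining 4xx codes: client-side problem; do not blindly retry.
  throw new Error(`Request failed: ${res.status} ${res.statusText}`);
}
```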
Describe the browser's processing flow (overview):
Typically, we enter the URL from the browser’s address bar and wait for the browser to return what we want. In this process, the browser does the following:
- The browser sends the requested URL to the DNS for domain name resolution, finds the real IP address, and sends a request to the server.
- After the server finishes processing the request, it returns the data, and the browser receives the files (HTML, JS, CSS, images, etc.);
- The browser parses the loaded resources (HTML, JS, CSS, etc.) and establishes the corresponding internal data structure (such as HTML DOM).
- Load the parsed resource file, render the page, and finish.
Describe the browser's processing flow in detail:
After entering the URL in the browser address bar:
- The browser checks its cache; if the requested resource is in the cache and is fresh, skip ahead to the decoding step:
- If the resource is not cached, make a new request.
- If it is cached, check whether it is fresh enough; if fresh, provide it to the client directly; otherwise, validate it with the server.
- Freshness is usually controlled by two HTTP headers, Expires and Cache-Control:
- HTTP/1.0 provides Expires, whose value is an absolute time indicating the date until which the cached copy is fresh
- HTTP/1.1 added Cache-Control: max-age=, which gives the maximum freshness lifetime in seconds
- The browser parses the URL to obtain the protocol, host, port, and path
- The browser assembles an HTTP (GET) request packet
- The browser obtains the host IP address as follows:
- Browser cache
- The operating system's DNS cache
- Hosts file
- Router cache
- ISP DNS cache
- DNS recursive query (Inconsistent IP addresses may occur due to load balancing)
- Open a socket with the destination IP address, establish a TCP connection with the port, and shake hands three times as follows:
- The client sends a TCP SYN=1, Seq=X packet to the server port
- The server sends back packets with SYN=1, ACK=x+1,Seq=Y
- The client sends ACK=Y+1, Seq=X+1
- After a TCP connection is established, an HTTP request is sent
- The server receives the request, parses it, and routes it to the corresponding server program; for example, a virtual host uses the HTTP Host header to determine which server program is being requested
- The server checks whether the HTTP request headers contain cache validation information; if the validated cache is still fresh, it returns the corresponding status, such as 304
- The handler program reads the complete request and prepares the HTTP response, which may require querying a database
- The server sends corresponding packets back to the browser over the TCP connection
- The browser receives the HTTP response and then closes the TCP connection or retains the TCP connection for reuse. The four-way handshake for closing the TCP connection is as follows:
- The active party sends a packet with FIN=1, Seq=X
- The passive party replies with ACK=X+1, Seq=Y
- The passive party sends FIN=1, ACK=X+1, Seq=Z
- The active party replies with ACK=Z+1, Seq=X+1
- The browser checks the response status code
- If the resource is cacheable, cache it
- Decode the response (for example, gzip decompression)
- Decide what to do based on the resource type
- Parsing the HTML document, building the DOM tree, downloading resources, building the CSSOM tree, and executing JS scripts do not happen in a strictly sequential order; they interleave
- Build a DOM tree:
- Tokenizing: Parsing character streams into tags according to the HTML specification
- Lexing: Lexical analysis converts tags into objects and defines attributes and rules
- DOM Construction: Organize objects into DOM trees based on HTML tag relationships
- Images, style sheets, and JS files encountered during parsing start downloading immediately
- Build the CSSOM tree:
- Tokenizing: Character stream converted to token stream
- Node: Creates a Node based on the tag
- CSSOM: the nodes are combined into the CSSOM tree
- Build the render tree from the DOM and CSSOM trees – traverse all visible nodes starting from the root of the DOM tree; invisible nodes include: 1) tags that are inherently invisible, such as script and meta; 2) nodes hidden by CSS, such as display: none – for each visible node, find the matching CSSOM rules and apply them – emit each visible node together with its content and computed style
- Js parsing is as follows
- The browser creates the Document object, parses the HTML, and adds the parsed elements and text nodes to the Document
- When the HTML parser encounters a script without async or defer, it adds it to the document and then executes the inline or external script. These scripts execute synchronously, and the parser pauses while the script is downloaded and executed. Such a script can insert text into the input stream with document.write(). Synchronous scripts often simply define functions and register event handlers, and they can traverse and manipulate the document content that precedes them
- When the parser encounters a script with the async attribute set, it starts downloading the script and continues parsing the document. The script is executed as soon as it finishes downloading, but the parser does not pause for the download. document.write() is forbidden in async scripts, which can see their own script element and the document elements that precede it
- When the document has finished parsing, document.readyState becomes "interactive"
- All defer scripts are executed in the order in which they appear in the document. Deferred scripts have access to the complete document tree, and document.write() is forbidden (see the sketch after this list).
- The browser fires the DOMContentLoaded event on the Document object
- When the document is fully parsed, the browser may still be waiting for content such as images to load. When that content finishes loading and all async scripts have loaded and executed, document.readyState changes to "complete" and the window fires the load event
- Display the page (the page is displayed gradually as the HTML is parsed)
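A brief, hedged sketch of the script-loading behaviour described above; the script URLs are placeholders:

```ts
// A dynamically created script behaves like async by default:
// it downloads in parallel and runs as soon as it arrives.
const asyncScript = document.createElement("script");
asyncScript.src = "/js/analytics.js";   // placeholder URL
document.head.appendChild(asyncScript);

// Setting async = false preserves execution order for dynamic scripts, similar to defer.
const orderedScript = document.createElement("script");
orderedScript.src = "/js/app.js";       // placeholder URL
orderedScript.async = false;
document.head.appendChild(orderedScript);

// readyState mirrors the phases described above:
// "loading" -> "interactive" (document parsed) -> "complete" (sub-resources loaded).
document.addEventListener("readystatechange", () => {
  console.log("readyState:", document.readyState);
});

// Fired once the document has been parsed and deferred scripts have run.
document.addEventListener("DOMContentLoaded", () => {
  console.log("DOM is ready");
});

// Fired once images, stylesheets and async scripts have also finished loading.
window.addEventListener("load", () => {
  console.log("page fully loaded");
});
```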
Advantages and disadvantages of Cookies
- Advantages: High scalability and availability
- Data persistence;
- No server resources are required. Cookies are stored on the client and read by the server after being sent;
- You can configure expiration rules to control a cookie's lifetime so that it does not stay valid forever; an attacker who steals it is then likely to get an already expired cookie;
- Simplicity. Text-based lightweight structure;
- Control the size of session objects stored in cookies through good programming;
- Encryption and secure transport (SSL/TLS) reduce the possibility of cookies being cracked;
- Store only non-sensitive data in cookies, so even if they are stolen there is no major loss;
- Disadvantages:
- Limits on the number and length of cookies. The number of cookies per domain is limited: a) IE6 and earlier allow at most 20; b) IE7 and later allow up to 50; c) Firefox allows at most 50; d) Chrome and Safari have no hard limit. Each cookie should not exceed 4KB (4096 bytes) in length, otherwise it will be truncated;
- Potential security risks. Cookies can be intercepted or tampered with; if a cookie is intercepted, all of the session information it carries may be exposed;
- Users can disable cookies. Some users turn off the browser's or client device's ability to accept cookies, which limits this feature;
- Some state cannot be kept on the client. For example, to prevent a form from being submitted repeatedly we need to keep a counter on the server side; keeping that counter on the client is useless.
Browser cache
Browser caches are divided into strong cache and negotiated cache. When a client requests a resource, the process for obtaining the cache is as follows:
- The HTTP headers of the resource are used to determine whether it hits the strong cache. If so, the resource is taken directly from the local cache without sending any request to the server.
- When the strong cache misses, the client sends a request to the server, and the server uses other request headers to verify whether the resource hits the negotiated cache; this is called HTTP revalidation. If it hits, the server returns the response but without the resource body, telling the client to take it from its cache instead; when the client receives that response, it reads the resource from the cache.
- What the strong and negotiated caches have in common is that the server does not return the resource body on a cache hit; the difference is that the strong cache sends no request to the server, while the negotiated cache does.
- When the negotiated cache also misses, the server sends the resource back to the client (a server-side sketch of both cache types follows this list).
- When CTRL + F5 forces a page refresh, load it directly from the server, skipping strong cache and negotiated cache;
- When F5 refreshes a web page, it skips the strong cache but checks the negotiated cache.
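A minimal server-side sketch of the two cache types, assuming a Node.js environment and its built-in http module; the file path and header values are illustrative:

```ts
import { createServer } from "node:http";
import { readFileSync } from "node:fs";
import { createHash } from "node:crypto";

createServer((req, res) => {
  const body = readFileSync("./static/app.js");   // illustrative path
  const etag = `"${createHash("md5").update(body).digest("hex")}"`;

  // Negotiated cache: if the client's validator still matches, answer 304 with no body.
  if (req.headers["if-none-match"] === etag) {
    res.writeHead(304);
    res.end();
    return;
  }

  res.writeHead(200, {
    "Content-Type": "application/javascript",
    // Strong cache: for the next hour the browser reuses its copy without asking.
    "Cache-Control": "max-age=3600",
    // Validator used for the negotiated cache once max-age expires.
    "ETag": etag,
  });
  res.end(body);
}).listen(3000);
```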
Browser Rendering steps
- The HTML is parsed into a DOM Tree
- The CSS is parsed into Style Rules
- The two are associated to generate a Render Tree
- Layout: compute the geometry of each node based on the Render Tree
- Painting renders the entire page based on the calculated information
When the browser parses the document, if it encounters a script tag it executes the script immediately and pauses parsing the document (because JS may change the DOM and CSS, and continuing to parse would be wasted work); for an external script, parsing of the document resumes only after the script has been downloaded and executed. Now that the script tag has the defer and async attributes, scripts can be downloaded without blocking the parser, and the DOM and CSS changes they make are applied to the DOM Tree and Style Rules when they run.
The difference between GET and POST requests
- GET parameters are passed through the URL, while POST parameters are placed in the request body (the URL is part of the request line/headers, so its size limit is small)
- GET requests pass parameters in the URL, which has a length limit, whereas POST does not
- GET is harmless when the browser navigates back, while POST will resubmit the request
- GET requests are actively cached by browsers, whereas POST requests are not, unless set manually
- GET is less secure than POST because parameters are directly exposed in the URL and therefore cannot be used to pass sensitive information
- GET accepts only ASCII characters for the data type of the argument, while POST has no restrictions
- GET requests can only use URL encoding (application/x-www-form-urlencoded), while POST supports multiple encodings
- GET generates one TCP packet while POST generates two. For a GET request, the browser sends the HTTP headers and data together, and the server responds with 200. For POST, the browser first sends the headers, the server responds with 100 Continue, the browser then sends the data, and the server responds with 200 OK (see the sketch below)
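A hedged illustration of the first few differences: GET parameters travel in the URL query string, while POST data goes in the body with an explicit encoding; the endpoint is a placeholder:

```ts
// Illustrative only; runs in a module context with top-level await.
const endpoint = "https://api.example.com/search";

// GET: parameters are URL-encoded into the query string; visible in the URL and easily cached.
const params = new URLSearchParams({ q: "browser cache", page: "1" });
const getRes = await fetch(`${endpoint}?${params}`);
console.log(getRes.status);

// POST: the same data goes in the body; here as form-urlencoded.
await fetch(endpoint, {
  method: "POST",
  headers: { "Content-Type": "application/x-www-form-urlencoded" },
  body: params.toString(),
});

// POST also supports other encodings, e.g. JSON or multipart/form-data.
await fetch(endpoint, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ q: "browser cache", page: 1 }),
});
```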
What is reflow
The process by which the browser recalculates the position and geometry of page elements in order to re-render part or all of the page is called reflow. In plain English: when a developer defines a style (including the browser's default styles), the browser computes it and places each element where it should appear based on the result; this process is reflow. Since reflow is a user-blocking operation in the browser, it is important to understand how to reduce the number of reflows and the depth of the DOM tree, and how CSS efficiency affects reflow time. Reflow is one of the key factors that makes DOM scripting inefficient: any node that triggers a reflow causes its descendants and ancestors to be re-rendered as well. In short, when a change to an element affects the content or structure of the document, or the element's position, that process is called reflow.
Causes of reflow
- Resize the window
- Change the text size
- Add/remove stylesheets
- Content changes (for example, the user typing into an input box)
- Activating pseudo-classes such as :hover, or manipulating the class attribute
- Script manipulation of DOM
- Reading offsetWidth and offsetHeight (forces layout to be calculated)
- Setting the style attribute (see the sketch below)
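A small sketch, assuming plain DOM APIs and an illustrative `.box` selector, showing how interleaving layout reads (offsetWidth) with style writes forces repeated reflows, and how batching avoids it:

```ts
const boxes = Array.from(document.querySelectorAll<HTMLElement>(".box"));

// Bad: each iteration writes a style and then reads offsetWidth,
// forcing the browser to reflow on every loop turn (layout thrashing).
function thrash(): void {
  for (const box of boxes) {
    box.style.width = "200px";
    console.log(box.offsetWidth);   // the read forces a synchronous reflow
  }
}

// Better: batch all reads first, then all writes, so layout is recalculated only once.
function batched(): void {
  const widths = boxes.map((box) => box.offsetWidth);   // reads: one layout pass
  requestAnimationFrame(() => {
    boxes.forEach((box, i) => {
      box.style.width = `${widths[i] + 100}px`;         // writes: applied together
    });
  });
}
```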