How browsers work & front-end security

Network security

Three principles

User privacy data must not be transmitted in plaintext.
User privacy data must not be stored in plaintext on the client.
User privacy data must not be stored in plaintext on the server.

HTTP transmits data in plaintext. If any node along the path, such as the WiFi access point, router, carrier, or data center, is monitored, the transmitted content is completely exposed. An attacker in this position is called a man in the middle, and the attack is a man-in-the-middle (MITM) attack. At the network level, both POST and GET requests can be captured. Without HTTPS, we cannot stop captured packets from being read. If the user's private data is transmitted in plaintext, the consequences can be severe.

Many users reuse the same password across sites. Once a password is stolen, attackers can try it on other websites (credential stuffing) and cause further losses. Plaintext HTTP transport carries three main risks:

Eavesdropping: a third party may learn the contents of the communication.
Tampering: a third party may modify the communication.
Impersonation: a third party may pose as one of the parties to the communication.

This is why HTTPS exists. You can think of HTTPS as HTTP + TLS. TLS (Transport Layer Security) is an encryption protocol that runs on top of TCP, and its predecessor is SSL.

Encrypted transmission (to avoid plaintext transmission)

1. Symmetric encryption

Encryption and decryption use the same key. If the client and server communicate using symmetric encryption with a single shared key, then once that key leaks all traffic can be decrypted. If a different key is used each time, the cost of managing and distributing a large number of keys is relatively high.

2. Asymmetric encryption

Two keys are required for encryption and decryption: a public key and a private key. Asymmetric encryption works as follows:

Party B generates two keys (a public key and a private key). The public key is public and available to anyone, while the private key is kept secret.
Party A obtains Party B's public key and uses it to encrypt the information.
Party B receives the encrypted information and decrypts it with the private key.

But a problem appears when the server returns data. If the server encrypts with the public key, the client has no private key to decrypt with. If the server encrypts with the private key, the client can decrypt with the public key, but the public key was transmitted over the Internet earlier and may already have been obtained by others, so this is not safe either. A single asymmetric key pair therefore cannot meet the requirement on its own. (Strictly speaking, a private key should be used only for signing, not for encryption. The public and private keys are generated under different mathematical requirements, so their ability to resist attacks also differs.)

HTTPS

The purpose of HTTPS is to solve the problems of tampering and eavesdropping during plaintext HTTP transmission.

To balance performance and security, HTTPS combines asymmetric encryption with symmetric encryption.
To ensure that the public key is not tampered with in transit, digital signatures (built on asymmetric cryptography) are used, and the trustworthiness of HTTPS certificates is guaranteed by CAs and the system root certificate mechanism.

Only the certificate is transmitted, which contains the plaintext information and the CA's signature over that information. The CA public key is not transmitted (to prevent man-in-the-middle attacks). The client browser obtains the CA public key from the system root certificates. (This is why the CA certificates and public keys built into the system or browser are critical.)
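The combination of asymmetric and symmetric encryption described above can be sketched as follows. This is only an illustration using Node's built-in crypto module: the function names encryptMessage and decryptMessage are made up for this example, and real TLS negotiates the session key differently (modern versions use Diffie-Hellman key agreement rather than sending an RSA-encrypted key).

```typescript
import {
  generateKeyPairSync, publicEncrypt, privateDecrypt,
  randomBytes, createCipheriv, createDecipheriv,
} from "node:crypto";

interface SealedMessage { iv: Buffer; data: Buffer; tag: Buffer; }

// Symmetric part: AES-256-GCM is fast and also detects tampering.
function encryptMessage(key: Buffer, plaintext: string): SealedMessage {
  const iv = randomBytes(12); // fresh nonce for every message
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, data, tag: cipher.getAuthTag() };
}

function decryptMessage(key: Buffer, msg: SealedMessage): string {
  const decipher = createDecipheriv("aes-256-gcm", key, msg.iv);
  decipher.setAuthTag(msg.tag); // final() throws if the data was modified
  return Buffer.concat([decipher.update(msg.data), decipher.final()]).toString("utf8");
}

// Server: long-lived RSA key pair (in HTTPS the public key travels in the certificate).
const { publicKey, privateKey } = generateKeyPairSync("rsa", { modulusLength: 2048 });

// Client: choose a random session key and send it encrypted with the public key.
const sessionKey = randomBytes(32);
const wrappedKey = publicEncrypt(publicKey, sessionKey);

// Server: recover the session key with the private key.
const serverSessionKey = privateDecrypt(privateKey, wrappedKey);

// Both sides now share a symmetric key and use it for the actual data.
const sealed = encryptMessage(sessionKey, "password=example");
const opened = decryptMessage(serverSessionKey, sealed);
```

The slow asymmetric operation happens once per session, and everything after that uses the cheap symmetric cipher.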

Encrypted storage

Do not store passwords in plaintext. If passwords are stored in plaintext (whether in the database or in logs), then once the data leaks, every user's password is exposed to the attacker. The risks mentioned at the beginning can all occur, and the effort spent encrypting passwords in transit is wasted.

To sum up, if we want to protect users' information as much as possible, we need to do the following:

Use HTTPS for all requests
Use RSA to encrypt passwords before transmitting them
Hash passwords with a one-way function such as bcrypt or PBKDF2 before storing them
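As a rough sketch of the last point, the following uses PBKDF2 from Node's built-in crypto module. The function names, the stored string format, and the iteration count are assumptions made for this example; the work factor in particular should be tuned to current guidance and your hardware.

```typescript
import { pbkdf2Sync, randomBytes, timingSafeEqual } from "node:crypto";

const ITERATIONS = 100_000; // assumed work factor, tune for your environment
const KEY_LENGTH = 32;      // bytes of derived output

// Returns "iterations:saltHex:hashHex"; only this string is stored, never the password.
function hashPassword(password: string): string {
  const salt = randomBytes(16); // a unique salt per user defeats precomputed tables
  const hash = pbkdf2Sync(password, salt, ITERATIONS, KEY_LENGTH, "sha256");
  return `${ITERATIONS}:${salt.toString("hex")}:${hash.toString("hex")}`;
}

// Recomputes the hash with the stored salt and compares in constant time.
function verifyPassword(password: string, stored: string): boolean {
  const [iterations, saltHex, hashHex] = stored.split(":");
  const expected = Buffer.from(hashHex, "hex");
  const actual = pbkdf2Sync(
    password, Buffer.from(saltHex, "hex"), Number(iterations), expected.length, "sha256",
  );
  return timingSafeEqual(actual, expected);
}
```

Because the function is one-way, a leaked database row does not reveal the password directly; an attacker has to guess candidates and pay the full iteration cost for each guess.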

Forcing HTTPS

Some websites buy an SSL certificate, configure it on their web server, and think they are done. But that only means HTTPS is available as an option, and users who arrive over HTTP will probably never notice it. To ensure that every user benefits from HTTPS, you should redirect all incoming HTTP requests to HTTPS. Anyone who visits your site is then switched to HTTPS automatically, and their traffic is encrypted from that point on.
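A minimal sketch of the redirect decision, using the standard URL class. The function name httpsRedirectTarget is hypothetical; in a real server the returned value would go into the Location header of a 301 response.

```typescript
// Returns the HTTPS URL to redirect to, or null if the request is already secure.
function httpsRedirectTarget(requestUrl: string): string | null {
  const url = new URL(requestUrl);
  if (url.protocol === "https:") return null; // nothing to do
  url.protocol = "https:";                    // keep host, path, and query unchanged
  return url.toString();
}

const target = httpsRedirectTarget("http://example.com/login?next=/home");
// target is "https://example.com/login?next=/home"
```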

Setting the Secure attribute on a cookie ensures that the cookie is sent only over HTTPS, so it cannot be taken from a plain HTTP request by a man in the middle.

TCP three-way handshake and four-way teardown

TCP (Transmission Control Protocol) is designed to provide a reliable end-to-end byte stream over an unreliable internetwork.

First handshake: the client requests a connection by sending SYN=1 and a random seq=x (the initial sequence number).
Second handshake: the server agrees and replies with SYN=1, ACK=1, ack=x+1, and a random seq=y.
Third handshake: the client checks that ack is x+1 and that the ACK flag is 1. If both are correct, the client sets the ACK flag to 1 and sends ack=y+1. The server performs the same check (ack=y+1, ACK=1), and the connection is established.
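The sequence and acknowledgement numbers in the handshake can be traced with a small model. The Segment type and the threeWayHandshake function are invented for this illustration; the model ignores the 32-bit wrap-around of real sequence numbers and everything else in the TCP header.

```typescript
interface Segment {
  from: "client" | "server";
  SYN: 0 | 1;   // SYN flag
  ACK: 0 | 1;   // ACK flag
  seq: number;  // sequence number
  ack?: number; // acknowledgement number, meaningful only when ACK is 1
}

// x and y are the random initial sequence numbers of the client and the server.
function threeWayHandshake(x: number, y: number): Segment[] {
  const first: Segment = { from: "client", SYN: 1, ACK: 0, seq: x };
  const second: Segment = { from: "server", SYN: 1, ACK: 1, seq: y, ack: first.seq + 1 };
  const third: Segment = { from: "client", SYN: 0, ACK: 1, seq: x + 1, ack: second.seq + 1 };
  return [first, second, third];
}

const handshake = threeWayHandshake(100, 300);
// second segment acknowledges 101, third segment acknowledges 301
```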

Authentication and authorization + Browser storage

What is Authentication?

Authentication verifies the identity of the current user, that is, it proves on the Internet that you are who you say you are. Common methods:

Logging in with a user name and password
A login link sent by email
A verification code sent to a mobile phone number

What is Authorization?

The user grants a third-party application permission to access certain resources belonging to the user. Examples: when a mobile app is installed, it asks whether it may access the photo album and the geographical location; when the user logs in to a WeChat mini program, it asks whether it may obtain personal information such as nickname, profile picture, region, and gender.

Common mechanisms for implementing authorization are cookies, sessions, tokens, and OAuth.

What are Credentials?

Authentication and authorization both rely on a medium (a credential) that marks the identity of the visitor. After a successful login, the server issues a token to the user's browser to indicate the identity, and the browser carries it with each request.

What is a Cookie?

HTTP is a stateless protocol. Each request is completely independent, so the server cannot confirm the identity of the current visitor or tell whether the sender of this request is the same as the sender of the previous one. For the server and browser to track a session (to know who is visiting), they must actively maintain a piece of state that tells the server whether two requests came from the same browser. This state is implemented through cookies or sessions.
Cookies are stored on the client: a cookie is a small piece of data that the server sends to the user's browser, which keeps it locally. The browser sends it back the next time it makes a request to the same server.
Cookies are not cross-domain: each cookie is bound to a single domain name and cannot be used under other domain names. A parent domain and its subdomains can share cookies, depending on the cookie's Domain attribute.

Features:

Cookie size is limited, generally to about 4 KB.
The number of cookies per domain is limited. The limit varies by browser; older browsers allowed about 20, and current ones allow considerably more.
A cookie can be given an expiration time and is destroyed automatically when it expires. Max-Age is in seconds. A cookie with neither Max-Age nor Expires is a temporary (session) cookie and is discarded when the browser closes; a zero or negative Max-Age expires the cookie immediately. (The rule that a negative value means a temporary cookie, with a default of -1, is the convention of the Java Servlet setMaxAge method, not of HTTP itself.)
Cookies for the current domain are sent with every HTTP request to that domain.
HttpOnly is supported, which prevents client-side JavaScript from reading the cookie.
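To make the attributes concrete, here is a small sketch that builds the value of a Set-Cookie response header. The CookieOptions type and the serializeCookie function are hypothetical helpers written for this example, not part of any library.

```typescript
interface CookieOptions {
  maxAgeSeconds?: number;               // Max-Age, in seconds
  domain?: string;                      // Domain attribute, controls subdomain sharing
  path?: string;
  secure?: boolean;                     // send over HTTPS only
  httpOnly?: boolean;                   // hide from client-side JavaScript
  sameSite?: "Strict" | "Lax" | "None";
}

// Builds the string that follows "Set-Cookie: " in the response.
function serializeCookie(name: string, value: string, opts: CookieOptions = {}): string {
  const parts = [`${name}=${encodeURIComponent(value)}`];
  if (opts.maxAgeSeconds !== undefined) parts.push(`Max-Age=${opts.maxAgeSeconds}`);
  if (opts.domain) parts.push(`Domain=${opts.domain}`);
  if (opts.path) parts.push(`Path=${opts.path}`);
  if (opts.secure) parts.push("Secure");
  if (opts.httpOnly) parts.push("HttpOnly");
  if (opts.sameSite) parts.push(`SameSite=${opts.sameSite}`);
  return parts.join("; ");
}

const header = serializeCookie("sid", "abc123", { maxAgeSeconds: 3600, secure: true, httpOnly: true });
// header is "sid=abc123; Max-Age=3600; Secure; HttpOnly"
```

Omitting maxAgeSeconds produces a cookie with no Max-Age, which the browser treats as a session cookie.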

What is a Session?

A session is another mechanism for recording the state of the conversation between the server and the client.
Sessions are usually implemented on top of cookies: the session data is stored on the server, and the sessionId is stored in a cookie on the client.

The sessionId is the bridge between the cookie and the session, and most systems verify a user's login status on this principle.
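The relationship between the cookie and the server-side session can be sketched with an in-memory store. SessionStore, SessionData, and the cookie name sessionId are assumptions made for this example. The ID generator is passed in because real session IDs must come from a cryptographically secure random source, which is left out here.

```typescript
interface SessionData { userId: string; }

class SessionStore {
  private sessions = new Map<string, SessionData>(); // lives on the server only
  private generateId: () => string;

  constructor(generateId: () => string) {
    this.generateId = generateId;
  }

  // Called after a successful login; the returned ID is sent to the client in a cookie.
  create(data: SessionData): string {
    const id = this.generateId();
    this.sessions.set(id, data);
    return id;
  }

  // Called on every request: extract sessionId from the Cookie header and look it up.
  lookup(cookieHeader: string): SessionData | undefined {
    const match = /(?:^|;\s*)sessionId=([^;]+)/.exec(cookieHeader);
    return match ? this.sessions.get(match[1]) : undefined;
  }

  // Called on logout.
  destroy(id: string): void {
    this.sessions.delete(id);
  }
}
```

The client only ever holds the opaque ID; the user data itself never leaves the server.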

What is localStorage?

Characteristics:

The size is limited, typically to about 5 MB to 10 MB per origin.
Data is shared between all tabs and windows of the same origin.
Data is stored only on the client and is not sent to the server.
Data persists and does not expire. It is still there after the browser is restarted.
Operations on data are synchronous.

What is sessionStorage?

sessionStorage data exists only in the current browser tab.
The data remains after the page is refreshed, but is erased when the browser tab is closed.
It has the same API as localStorage.
Operations on data are synchronous.

What is a Token?

A token is the credential required to access a resource interface (API).
Composition of a simple token: uid (the unique user identifier), time (the timestamp of the current time), and sign (a signature: the preceding parts of the token hashed into a fixed-length hexadecimal string).
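A simple token of this shape might be issued and checked as in the sketch below. The function names and the dot-separated layout are assumptions for this example, and the signature is computed with HMAC-SHA256 and a server-side secret rather than a plain hash, since a plain hash could be recomputed by anyone.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

function sign(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("hex");
}

// Token layout: uid.time.sign (uid must not contain a dot in this sketch).
function issueToken(uid: string, secret: string, now: number = Date.now()): string {
  const payload = `${uid}.${now}`;
  return `${payload}.${sign(payload, secret)}`;
}

// Returns the uid if the signature matches and the token is not too old, else null.
function verifyToken(token: string, secret: string, maxAgeMs: number, now: number = Date.now()): string | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [uid, time, signature] = parts;
  const given = Buffer.from(signature, "hex");
  const expected = Buffer.from(sign(`${uid}.${time}`, secret), "hex");
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  if (now - Number(time) > maxAgeMs) return null; // expired
  return uid;
}
```

The server needs only the secret to verify a token, so it does not have to keep per-user session state.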

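Features:

The server is stateless, which scales well
Supports mobile devices
Security: a token sent in a request header is not attached automatically by the browser, which helps against CSRF
Supports calls across applications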

What is JWT?

JSON Web Token (JWT) is currently one of the most popular cross-domain authentication solutions. (It does not rely on cookies.)

How it is sent: in the Authorization header of the HTTP request; via the URL; or, for cross-domain requests, in the body of a POST request.

The difference between JWT and a session or a simple token is that the JWT itself already contains the user information, so there is no need to query the database for it.

What is XSS?

Cross-site scripting (XSS) is a code injection attack. There are three types:

Stored: the malicious script is submitted through any input that ends up in the database, and the server later splices it into the HTML it renders and returns to the browser.
Reflected: the malicious script is written into the URL (for example in a route parameter) and the user is induced to click the link; the server splices the script into the rendered HTML and returns it to the browser.
DOM-based: the malicious script is written into the URL, and front-end JavaScript reads it from the URL and executes it.

Defenses: set HttpOnly on cookies so that JavaScript cannot read them; validate and filter input on both the front end and the server; escape HTML before output. To avoid DOM XSS pitfalls, prefer .textContent and .setAttribute() over assigning untrusted strings to .innerHTML.
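Escaping HTML can be as small as the sketch below. The helper name escapeHtml is an assumption, and it covers only the case of inserting text into HTML element content or a quoted attribute value; other contexts such as URLs, inline scripts, and CSS need their own encoding.

```typescript
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replaces the five characters that can break out of text or a quoted attribute.
function escapeHtml(untrusted: string): string {
  return untrusted.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const safe = escapeHtml('<img src=x onerror="alert(1)">');
// safe is "&lt;img src=x onerror=&quot;alert(1)&quot;&gt;"
```

After escaping, the browser displays the payload as text instead of parsing it as a tag.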

What is CSRF?

Cross-site request forgery (CSRF): after a user has logged in to trusted site A, an attacker induces the user to visit site B, and B then sends requests to site A. The browser attaches the credentials it holds for site A, so the forged requests pass user authentication. Two conditions must both hold:

1. Log in to trusted website A and generate cookies locally.
2. Visit dangerous website B without logging out of A.

Defenses: check the Origin header, check the Referer header, require a CSRF token, and set the SameSite attribute on cookies.
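Two of these defenses, the Origin or Referer check and the CSRF token, can be combined on the server roughly as follows. The IncomingRequest shape and the passesCsrfChecks function are hypothetical; a real server would read these values from its framework's request object and compare tokens in constant time.

```typescript
interface IncomingRequest {
  origin?: string;    // value of the Origin header, if present
  referer?: string;   // value of the Referer header, if present
  csrfToken?: string; // token sent by the page in a header or form field
}

// Extracts the origin from a Referer URL, or returns undefined if it is malformed.
function originOf(url: string): string | undefined {
  try {
    return new URL(url).origin;
  } catch {
    return undefined;
  }
}

function passesCsrfChecks(req: IncomingRequest, allowedOrigin: string, sessionCsrfToken: string): boolean {
  // 1. The request must come from our own site (Origin first, Referer as fallback).
  const source = req.origin ?? (req.referer ? originOf(req.referer) : undefined);
  if (source !== allowedOrigin) return false;
  // 2. The request must carry the token stored in the session. Site B cannot read it.
  return req.csrfToken !== undefined && req.csrfToken === sessionCsrfToken;
}
```

The SameSite cookie attribute complements these checks on the browser side by not attaching the cookie to cross-site requests in the first place.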

Origin of Base64 encoding

Some network transmission channels do not support all byte values. For example, traditional email supports only printable characters, so ASCII control characters cannot be sent by mail. Base64 solves this by representing binary data with 64 printable characters.

ASCII code

ASCII (American Standard Code for Information Interchange) defines 128 characters, including 33 control characters, each of which fits in a single byte. In a computer, all data is stored and processed as binary numbers (because computers use high and low voltage levels to represent 1 and 0). The 52 letters (upper and lower case), the digits, and common symbols such as *, #, and @ must therefore also be stored as binary numbers. Which binary number stands for which symbol is a matter of convention, and anyone could define their own set (this is called an encoding). But if people want to communicate with each other without confusion, they must use the same encoding rules. The relevant United States standards organization therefore issued ASCII, which specifies which binary numbers represent the common symbols mentioned above [2].

Base64 is one of the most common encodings for transmitting 8-bit byte data over the network. It represents binary data using 64 printable characters, so Base64 encoding is the conversion from binary data to characters.
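The conversion works by regrouping bits: every 3 bytes (24 bits) are split into 4 groups of 6 bits, and each group selects one of the 64 characters. The sketch below writes this out by hand for illustration; the function name base64Encode is made up here, and in practice you would use the platform's built-in encoder.

```typescript
const ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function base64Encode(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const hasSecond = i + 1 < bytes.length;
    const hasThird = i + 2 < bytes.length;
    // Pack up to three bytes into one 24-bit number, padding with zero bits.
    const triple =
      (bytes[i] << 16) | ((hasSecond ? bytes[i + 1] : 0) << 8) | (hasThird ? bytes[i + 2] : 0);
    // Cut the 24 bits into four 6-bit indexes into the alphabet.
    out += ALPHABET[(triple >> 18) & 63];
    out += ALPHABET[(triple >> 12) & 63];
    out += hasSecond ? ALPHABET[(triple >> 6) & 63] : "="; // "=" marks missing input bytes
    out += hasThird ? ALPHABET[triple & 63] : "=";
  }
  return out;
}

// The ASCII bytes of "Man" are 77, 97, 110.
const encoded = base64Encode(new Uint8Array([77, 97, 110]));
// encoded is "TWFu"
```

The output is always about one third larger than the input, which is the price of restricting it to printable characters.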

How browsers work

Asynchronous programming

Asynchronous code, as opposed to synchronous code, does not make the program wait for an operation to finish. The work to be done after an asynchronous operation completes is usually placed in the event queue as a callback function or a promise. The event loop then checks on each pass whether the asynchronous operation has completed; if it has, the corresponding tasks are executed in turn according to the rules of the event queue.

How JavaScript works (single thread, task queue, event loop, microtasks, macrotasks)

Single threaded feature

A single thread avoids the complex synchronization problems associated with multi-threaded operations.

Task queues (how JavaScript works)

All synchronous tasks are executed on the main thread, forming an execution stack.
Alongside the main thread there is a "task queue". Whenever an asynchronous task produces a result, an event is placed in the task queue.
Once all synchronous tasks in the execution stack have completed, the system reads the task queue to see which events are in it. The corresponding asynchronous tasks then leave the waiting state, enter the execution stack, and start executing.

Event Loop

On each tick, the event loop checks whether the task queue contains pending events. If it does, the events and their callback functions are moved onto the execution stack and run by the main thread. Callbacks reach the task queue in different ways:

1. onclick is handled by the DOM binding module of the browser kernel. When the event is triggered, the callback function is added to the task queue immediately.
2. setTimeout is handled by the timer module of the browser kernel. When the delay has elapsed, the callback function is added to the task queue.
3. Ajax is handled by the network module of the browser kernel. The callback is added to the task queue after the network request has completed and returned.

JavaScript is single-threaded, while browsers are multi-threaded. Process and thread are operating system concepts.

Process

A process is a running instance of an application. Each process consists of a private virtual address space, code, data, and other system resources; it is the smallest unit to which the operating system allocates resources and which runs independently. When we start an application, the computer creates at least one process. The operating system allocates a portion of memory to the process, where all the state of the application is stored. The application may also create multiple threads that share the data in this memory. When the application is shut down, the process is terminated and the operating system frees the associated memory.

Threads

A thread is an execution unit within a process, the basic unit that the system schedules and dispatches independently. Once a process is created, the system starts its main thread of execution.
A process is like a factory with a boundary, and threads are like the employees in it, working on their own or in collaboration with each other, so one process can create multiple threads.
Threads do not need the system to allocate separate resources; they share all resources owned by the process with the other threads of that process. PS: Processes do not share resources or address space with each other, so there are fewer security issues between them. Because multiple threads share the same address space and resources, threads face more complex safety issues, such as one thread maliciously modifying or reading data it is not authorized to access.

Chrome has a multi-process architecture

The main processes

Browser Process (the main process, which coordinates and controls): (1) handles the address bar, bookmark bar, and forward and back buttons; (2) handles the browser's invisible low-level operations, such as network requests and file access; (3) manages pages and creates and destroys other processes.
Renderer Process: responsible for everything about rendering a page within a tab, including page rendering, script execution, and event handling.
Plugin Process: controls the plug-ins used by web pages, such as Flash. There is one process per type of plug-in, and it is created only when the plug-in is used.
GPU Process: responsible for GPU-related tasks, such as 3D drawing.

Advantages: By default each new tab gets its own process, so the crash of a single tab does not affect the entire browser. Similarly, a crash in a third-party plug-in does not affect the entire browser. Multiple processes can also take full advantage of modern multi-core CPUs.

Disadvantages: The system allocates memory and CPU resources to every process the browser starts, so more memory and CPU are consumed. That said, Chrome is reasonably good at releasing memory, so it can be returned quickly for other programs to use.

A browser implements at least three resident threads: the JavaScript engine thread, the GUI rendering thread, and the event-trigger thread.

1. The JavaScript engine thread is event-driven and single-threaded. It waits for tasks to arrive in the task queue and then processes them.

2. The GUI rendering thread is responsible for rendering the browser interface. It runs when the interface needs a repaint or when some action causes a reflow. Note, however, that the GUI rendering thread and the JavaScript engine thread are mutually exclusive: the GUI thread is suspended while the JavaScript engine is executing, and GUI updates are held in a queue until the JavaScript engine is idle.

3. The event-trigger thread. When an event is triggered, this thread adds it to the end of the queue to wait for the JavaScript engine to process it. Events can come from the code the JavaScript engine is currently executing, such as setTimeout, or from other threads in the browser kernel, such as mouse clicks and Ajax requests. Because JavaScript is single-threaded, all of these events queue up for the JavaScript engine (asynchronous code runs only when no synchronous code is executing on the thread).

Questions

Why is JavaScript single-threaded?

JavaScript handles user interaction in the page and manipulates the DOM tree and the CSS style tree to give users a rich, dynamic experience, and it also handles interaction with server logic. If JavaScript manipulated the DOM from multiple threads, conflicting UI operations could occur. Locks could resolve such conflicts, but to avoid the extra complexity they bring, JavaScript opted for single-threaded execution from the start.

Why does JS block page loading?

Because JavaScript can manipulate the DOM, rendering the interface while element attributes are being modified (that is, running the JavaScript thread and the UI thread at the same time) could leave the render thread with element data that is inconsistent before and after rendering. To prevent such unexpected rendering results, the browser makes the GUI rendering thread and the JavaScript engine mutually exclusive.

Does CSS loading block?

CSS loading does not block DOM parsing; the two proceed in parallel.
CSS loading does block rendering, because the Render Tree depends on both the DOM Tree and the CSSOM Tree.
CSS loading also blocks the execution of subsequent JS.

What is CRP (Critical Rendering Path)? How to optimize?

HTML can be parsed incrementally, and the DOM and CSSOM are built in parallel, but CSS cannot be applied incrementally: any later rule can change the CSSOM, for example by overriding a font size set earlier, so the browser must wait for the CSSOM to be fully built before entering the next stage. The speed at which CSS loads and the CSSOM is built directly affects first-screen rendering speed, so CSS is treated as a render-blocking resource by default.

Normally the DOM and CSSOM are built in parallel, but when the browser encounters a script tag, DOM construction pauses until the script has finished executing. And since JavaScript can read and modify the CSSOM, the browser must wait until the CSSOM is built before executing the JS.

Optimization revolves around three factors:

Number of critical resources (JS, CSS)

Critical path length

Number of critical bytes (fewer bytes download and process faster, so compress)

Specific practices:

Optimize the DOM

Keep HTML files small: remove redundant code, compress the code, and use the HTTP cache

Optimize the CSSOM

Embed only the CSS required for the first screen in the head via a style tag, and load the rest asynchronously without blocking (the Critical CSS technique)

Avoid using @import

@import turns CSS loading from parallel into serial

Asynchronous JS

Keep all text resources as small as possible: remove unused code and reduce file size (minify), use gzip compression (compress), and use caching (HTTP cache)

Scripts can be loaded asynchronously by adding the async attribute

From entering a URL to rendering the page

After the CSS file is downloaded, it is parsed into a tree data structure (the CSSOM), which is then combined with the DOM tree to form the RenderObject tree. The browser lays out the RenderObject tree (computing the size and position of each element) and then paints it. Finally, the layers are composited and the page is displayed.

The event loop contains at least two queues: the macrotask queue and the microtask queue

async and await are used together. A function marked async returns a Promise object, and callbacks can be attached to it with the then method. The expression that follows await is evaluated synchronously, but the statements after the await line are added to the end of the current microtask queue and run asynchronously as microtasks.

Microtasks first, then macrotasks: after each macrotask finishes, all queued microtasks run before the next macrotask starts.
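The ordering rule can be demonstrated with a toy model of the loop. ToyEventLoop is an illustration written for this note, not the browser's actual implementation: it has one macrotask queue and one microtask queue, and it drains all microtasks after every macrotask.

```typescript
type Task = () => void;

class ToyEventLoop {
  private macrotasks: Task[] = [];
  private microtasks: Task[] = [];

  enqueueMacrotask(task: Task): void { this.macrotasks.push(task); }
  enqueueMicrotask(task: Task): void { this.microtasks.push(task); }

  run(): void {
    while (this.macrotasks.length > 0) {
      const task = this.macrotasks.shift()!;
      task(); // run exactly one macrotask
      // Then drain every microtask, including ones queued by other microtasks.
      while (this.microtasks.length > 0) {
        this.microtasks.shift()!();
      }
    }
  }
}

const log: string[] = [];
const loop = new ToyEventLoop();

// The main script is itself the first macrotask.
loop.enqueueMacrotask(() => {
  log.push("script start");
  loop.enqueueMacrotask(() => log.push("timeout")); // plays the role of setTimeout(fn, 0)
  loop.enqueueMicrotask(() => log.push("promise")); // plays the role of Promise.resolve().then(fn)
  log.push("script end");
});
loop.run();
// log is ["script start", "script end", "promise", "timeout"]
```

Even though the timeout callback was queued before the promise callback, the promise callback runs first because microtasks are drained before the next macrotask.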

Reflow

When the size, structure, or certain attributes of some or all of the elements in the Render Tree change, the browser recomputes the layout of and re-renders part or all of the document. This process is called reflow. Operations that cause reflow:

The first render of the page
The browser window is resized
The size or position of an element changes
The content of an element changes (amount of text, image size, and so on)
The font size of an element changes
Visible DOM elements are added or removed
A CSS pseudo-class is activated (such as :hover)
Certain properties are queried or certain methods are called (for example offsetWidth or getBoundingClientRect())

Repaint

When a change in the style of an element does not affect its position in the document flow (for example color, background-color, or visibility), the browser applies the new style to the element and paints it again. This process is called repaint.

Reflow is more expensive than repaint. Sometimes even when only a single element reflows, its parent element and any elements that follow it reflow as well.

What are the advantages and disadvantages of multithreading?

Advantages:

1. Time-consuming operations (network requests, image downloads, audio downloads, database access, etc.) can run in a child thread, which keeps the main thread from freezing;

2. It can take advantage of multi-core processors and improve CPU utilization.

Disadvantages:

1. Each child thread consumes a certain amount of resources;

2. It will make the code less readable;

3. If multiple threads access a resource at the same time, resource contention will occur.