Preface

As we all know, browser principles are a time-honored topic in front-end interviews, because the subject is extremely broad. From basic HTTP knowledge to cross-origin issues and front-end security, everything involves how the browser works. Naturally, it is also one of the most important areas in interviews.

In this article, I explain all the browser-related questions I encountered while preparing for interviews and organize them into a systematic summary. Let's start the HTTP learning journey.

1. HTTP and HTTPS

(1) The relationship between HTTP and HTTPS

1. What are HTTP and HTTPS?

HTTP: HyperText Transfer Protocol.

HTTPS: HyperText Transfer Protocol Secure, the secure version of HTTP.

2. The difference between HTTP and HTTPS

Item | HTTP | HTTPS
Name | Hypertext Transfer Protocol | Hypertext Transfer Protocol Secure
Default port | 80 | 443
Transmission | Plaintext | Encrypted
Security | Relatively poor: easy to eavesdrop on, impersonate, and tamper with | Relatively good: prevents eavesdropping, impersonation, and tampering
Response speed | Fast (3 packets) | Slow (12 packets: 3 for TCP plus 9 for SSL)
Cost | Lower | Higher: a certificate is required and usually has to be purchased
Connection caching | Relatively efficient | Relatively inefficient: increases data overhead and power consumption

Colloquial answer:

HTTP, the Hypertext Transfer Protocol, is the most widely used network protocol on the Internet. It is a request-response standard between client and server, used to transfer hypertext from a WWW server to the local browser, and it is designed to make browsing efficient and reduce network transmission. HTTPS is an HTTP channel built for security; in short, it is the secure version of HTTP. An SSL layer is added beneath HTTP, and SSL is the security foundation of HTTPS. (This answers what HTTP and HTTPS are.)

An HTTP connection is simple and stateless, and the data it transmits is not encrypted; it travels as plaintext. Netscape designed SSL to encrypt data transmitted over HTTP. HTTPS is therefore a network protocol built from HTTP plus SSL that provides encrypted transmission and identity authentication, which makes it more secure than HTTP. (This answers plaintext versus encrypted transmission.)

HTTP transmits information in plaintext and uses port 80 by default. HTTPS is a secure, SSL-encrypted transfer protocol that uses port 443 by default; it requires a certificate, which adds cost. (This answers the port number and cost questions.)

HTTPS can authenticate both the user and the server, ensuring that data is sent to the correct client and server. When a client communicates with a web server over HTTPS, it accesses the server with an HTTPS URL and asks the server to establish an SSL connection. After receiving the request, the web server sends the website's certificate to the client. The client and the web server then negotiate the security level of the SSL connection, that is, the encryption level. Once both sides agree on the security level, the client creates a session key, encrypts it with the website's public key, and sends it to the website. The web server decrypts the session key with its own private key and uses the session key to encrypt communication with the client. (This answers how an HTTPS connection is established.)

HTTPS is recommended: compared with an equivalent HTTP site, a site encrypted with HTTPS ranks higher in search results.

The HTTPS handshake is time-consuming: it can lengthen page load time by 50% and increase power consumption by 10% to 20%. HTTPS caching is not as efficient as HTTP caching, which increases data overhead. SSL certificates also cost money. An SSL certificate usually has to be bound to an IP address, so multiple domain names cannot share one IP address unless both the server and the clients support SNI. (This answers the disadvantages of HTTPS.)

Small tips:

  • WWW is short for World Wide Web.
  • Why port 80? 80 is the default port of the HTTP protocol. When you enter a web address, the browser (non-IE) fills in the protocol for you, so when you enter baidu.com you actually visit baidu.com:80.

(2) HTTP protocol

1. Basic content of the HTTP1.0, HTTP1.1, and HTTP2.0 protocols

(1) HTTP1.0

Introduction date: HTTP1.0 was introduced in 1996.

Main Contents:

  • HTTP1.0 provides only the most basic authentication, and the user name and password are sent unencrypted;
  • HTTP1.0 supports only short connections: every time data is sent, the TCP three-way handshake and four-way wave must be performed, which is inefficient;
  • HTTP1.0 uses only the If-Modified-Since and Expires headers as the criteria for cache invalidation;
  • HTTP1.0 does not support resumable transfers: every transfer sends all of the data;
  • HTTP1.0 assumes that each computer is bound to only one IP address, so virtual hosts are not supported.

(2) HTTP1.1

Introduction date: HTTP1.1 was introduced in 1999.

Main Contents:

  • HTTP1.1 uses a digest algorithm (MD5/SHA-1) for authentication;

  • HTTP1.1 uses long (persistent) connections by default: one connection is established, multiple pieces of data are transmitted over it, and it is closed only once when transmission is complete. Persistence is controlled by the Connection request header (Connection: keep-alive/close).

  • HTTP1.1 supports resumable transfers through the Range request header.

  • HTTP1.1 supports virtual hosts: multiple virtual hosts can reside on a single physical server and share the same IP address.

(3) HTTP2.0

Introduction date: HTTP2.0 was standardized in 2015.

Main Contents:

  • Header compression: headers are compressed with the HPACK algorithm;

    Why introduce header compression?

    Cookie, Accept, Server, Range, and other HTTP1.1 header fields can take up hundreds to thousands of bytes, while the body is sometimes only a few dozen bytes ("heavy head, light body").

  • Binary format: HTTP2.0 uses a binary format that is closer to TCP/IP and abandons ASCII text to improve parsing efficiency;

  • Enhanced security: HTTP2.0 generally runs on top of HTTPS;

  • Multiplexing: multiple requests can be in flight on one connection.

Small tips:

Memory points for the version differences: authentication algorithm, connections, headers, resumable transfers, virtual hosts

2. Differences between the HTTP1.0, HTTP1.1, and HTTP2.0 protocols

(1) Main differences between HTTP1.0 and HTTP1.1

① Long connection

  • HTTP1.0 needs the keep-alive parameter to tell the server to establish a long connection, while HTTP1.1 supports long connections by default.
  • The reason for using long connections is that HTTP is based on TCP/IP. Creating a TCP connection requires a three-way handshake, which costs time and hurts performance if the connection has to be re-established for every exchange. It is therefore better to maintain one long connection and use it to send multiple requests.
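
To make the default behavior concrete, here is a minimal sketch of how a server might decide whether to keep a connection open. The function name `is_persistent` is my own illustration, not part of any library, and the logic covers only the Connection header described above.

```python
def is_persistent(http_version: str, headers: dict) -> bool:
    """Return True if the connection should stay open after this message."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value.lower() for name, value in headers.items()}
    tokens = {token.strip() for token in normalized.get("connection", "").split(",")}
    if http_version == "HTTP/1.1":
        # HTTP/1.1 keeps the connection open unless "close" is requested.
        return "close" not in tokens
    # HTTP/1.0 closes by default; "keep-alive" opts in to reuse.
    return "keep-alive" in tokens
```

With this sketch, an HTTP/1.0 request stays open only when it carries `Connection: keep-alive`, while an HTTP/1.1 request stays open unless it carries `Connection: close`.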

② Bandwidth saving

  • HTTP1.1 supports sending only the header information first (without any body). If the server decides that the client has permission to make the request, it returns 100; otherwise it returns 401;
    • Only after the client receives 100 does it start sending the request body to the server;
    • So when the server returns 401, the client does not have to send the request body, which saves bandwidth;
  • In addition, HTTP1.1 supports sending only part of the content. When the client already has part of a resource, it only needs to request the rest from the server. This is the basis of resumable file transfers.
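
The partial-content idea can be sketched as follows. Both helper names are hypothetical, and the toy server handles only the simple `bytes=start-` and `bytes=start-end` forms; it is an illustration of the Range mechanism, not a complete implementation.

```python
def resume_header(bytes_already_saved: int) -> dict:
    """Build the request header that asks only for the missing tail of a file."""
    return {"Range": f"bytes={bytes_already_saved}-"}


def serve_range(content: bytes, range_value: str):
    """Toy server side: return (status, Content-Range value, body)."""
    _unit, _, spec = range_value.partition("=")
    start_text, _, end_text = spec.partition("-")
    start = int(start_text)
    # An omitted end means "through the last byte"; both offsets are inclusive.
    end = int(end_text) if end_text else len(content) - 1
    body = content[start:end + 1]
    return 206, f"bytes {start}-{end}/{len(content)}", body
```

A client that already saved the first 4 bytes of a 10-byte file would send `Range: bytes=4-` and receive status 206 with only the remaining 6 bytes.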

③ Host field (virtual hosts)

  • With HTTP1.1, virtual sites can be set up on a web server (for example, Tomcat); that is, multiple virtual sites on the web server can share the same IP address and port;
  • HTTP1.0 has no Host field, while HTTP1.1 supports it.
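
As a rough sketch of why the Host field matters, the server below picks a site by host name even though every site shares one IP address and port. The `SITES` mapping, its domain names, and its directory paths are made up for illustration.

```python
# Hypothetical virtual sites that share one IP address and port.
SITES = {
    "blog.example.com": "/var/www/blog",
    "shop.example.com": "/var/www/shop",
}


def document_root(headers: dict):
    """Pick the site directory from the Host header, or None if unknown."""
    host = headers.get("Host", "")
    # Drop an optional ":port" suffix and ignore letter case.
    hostname = host.split(":")[0].lower()
    return SITES.get(hostname)
```

Without a Host header (as in HTTP1.0) the lookup has nothing to go on, which is why one IP address could serve only one site.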

(2) Main differences between HTTP1.1 and HTTP2.0

① Multiplexing

  • In HTTP1.1, browser clients limit the number of simultaneous requests to the same domain name; requests beyond the limit are blocked.
  • HTTP2.0 uses multiplexing to process multiple requests concurrently on the same connection, and the number of concurrent requests can be several orders of magnitude larger than in HTTP1.1.
  • Of course, HTTP1.1 can open a few more TCP connections to handle more concurrent requests, but creating a TCP connection is itself expensive.

② Header compression

  • HTTP1.1 does not support header compression, whereas HTTP2.0 compresses headers with the HPACK algorithm, so the data is smaller and travels faster over the network.

③ Server push

  • When we request data from a web server that supports HTTP2.0, the server proactively pushes some resources the client will need along with the response, so the client does not have to create another connection and send another request to fetch them. This mechanism is well suited to loading static resources.
  • So where do the pushed resources go? They are stored locally on the client, which then loads them directly from local storage without going over the network, so it is much faster.

3. HTTP2.0

(1) Goals set for the HTTP2.0 project

  • Page load time (PLT) reduced by 50%.

  • There is no need for the site author to change anything.

  • Minimize deployment complexity without requiring network infrastructure changes.

  • Work with the open source community to develop this new protocol.

  • Collect real performance data to verify the validity of experimental protocols.

(2) HTTP2.0 features

① Multiplexing (request and response multiplexing)

In HTTP1.1, the browser client limits the number of simultaneous requests (connections) to the same domain name, and requests beyond the limit are blocked.

HTTP2.0 uses multiplexing to process multiple requests concurrently on the same connection: multiple request and response messages can be in flight simultaneously over a single HTTP/2 connection.

② Binary framing layer

At the heart of all HTTP/2 performance enhancements lies the new binary framing layer, which defines how HTTP messages are encapsulated and transmitted between clients and servers.

HTTP2.0 splits all transmitted information into smaller messages and frames and encodes them in binary.
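
To show what "binary frame" means in practice, the sketch below packs and unpacks the 9-byte HTTP/2 frame header: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier. The function names are my own, and the sketch ignores everything beyond the header layout (settings, padding, size limits).

```python
import struct


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Prefix a payload with a 9-byte HTTP/2 frame header."""
    length = len(payload)
    header = struct.pack(
        ">BHBBI",
        length >> 16,            # top 8 bits of the 24-bit length
        length & 0xFFFF,         # low 16 bits of the length
        frame_type,              # e.g. 0x0 = DATA, 0x1 = HEADERS
        flags,                   # e.g. 0x1 = END_STREAM on a DATA frame
        stream_id & 0x7FFFFFFF,  # the highest bit is reserved
    )
    return header + payload


def unpack_frame_header(data: bytes):
    """Return (length, type, flags, stream_id) from the first 9 bytes."""
    high, low, frame_type, flags, stream_id = struct.unpack(">BHBBI", data[:9])
    return (high << 16) | low, frame_type, flags, stream_id & 0x7FFFFFFF
```

Because every frame carries its stream identifier, frames from different requests can be interleaved on one connection and reassembled on the other side.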

③ Header compression

Each HTTP transfer carries a set of headers that describe the transferred resource and its properties.

HTTP/2 uses the HPACK compression format to compress request and response header metadata:

  • Header fields can be encoded with a static Huffman code, which reduces the size of each transfer.

  • Both the client and the server maintain and update an indexed list of previously seen header fields (in other words, a shared compression context), which is then used as a reference to encode previously transmitted values efficiently.
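
The indexed-list idea can be illustrated with a toy model. This is not real HPACK (there is no static table, Huffman coding, or eviction); the class `ToyHeaderTable` is a made-up name that only shows how a shared list lets a repeated header be replaced by a small index.

```python
class ToyHeaderTable:
    """Toy shared index list; a simplification, not the HPACK wire format."""

    def __init__(self):
        self.entries = []  # previously seen (name, value) pairs

    def encode(self, headers):
        encoded = []
        for pair in headers:
            if pair in self.entries:
                # Already seen: send only its index.
                encoded.append(self.entries.index(pair))
            else:
                # First time: send the literal pair and remember it.
                self.entries.append(pair)
                encoded.append(pair)
        return encoded

    def decode(self, encoded):
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self.entries[item])
            else:
                self.entries.append(item)
                headers.append(item)
        return headers
```

On the first request every header is sent literally; on the second identical request the encoder emits only `[0, 1]`, and the decoder restores the full headers from its own copy of the list.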

④ Server push

In addition to the response to the initial request, HTTP/2 allows the server to push additional resources to the client without the client explicitly requesting them, making it ideal for loading static resources.

⑤ Stream prioritization

After an HTTP message is split into many independent frames, frames from multiple streams can be multiplexed, and the order in which the client and server interleave and deliver these frames becomes a key factor in performance.

To handle this, the HTTP/2 standard allows each stream to have an associated weight and dependency:

Each stream can be assigned an integer weight between 1 and 256, and each stream can have an explicit dependency on another stream.
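
As a small worked example of weights, sibling streams are meant to receive resources in proportion to their weights. The helper `share_bandwidth` below is hypothetical and ignores dependencies entirely; it only shows the proportional split.

```python
def share_bandwidth(weights: dict, total: float) -> dict:
    """Split `total` among sibling streams in proportion to their weights."""
    for weight in weights.values():
        if not 1 <= weight <= 256:
            raise ValueError("stream weight must be between 1 and 256")
    weight_sum = sum(weights.values())
    return {stream_id: total * weight / weight_sum
            for stream_id, weight in weights.items()}
```

For example, two sibling streams with weights 12 and 4 would ideally receive three quarters and one quarter of the available capacity.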

⑥ One connection per origin

Each stream is split into frames that can be interleaved and prioritized. Therefore, all HTTP/2 connections are persistent and only one connection per origin is required, which brings many performance benefits.

⑦ Flow control

Flow control is a mechanism that prevents the sender from sending more data than the receiver wants or is able to process.

When is flow control used? Typically, the receiver may be very busy, under high load, or willing to allocate only a fixed amount of resources to a particular stream. In these cases flow control is used.
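
Here is a minimal sketch of window-based flow control, assuming a single stream and a hypothetical `FlowControlledSender` class. In HTTP/2 the receiver enlarges the window with WINDOW_UPDATE frames; the sketch models only the bookkeeping, not the frames themselves.

```python
class FlowControlledSender:
    """Sends at most `window` bytes until the receiver grants more credit."""

    def __init__(self, window: int):
        self.window = window

    def send(self, data: bytes) -> bytes:
        # Only the part that fits in the current window may be sent.
        sendable = data[:self.window]
        self.window -= len(sendable)
        return sendable

    def window_update(self, increment: int) -> None:
        # The receiver has processed data and grants more credit.
        self.window += increment
```

A sender with a 4-byte window can push 4 bytes, then must wait until the receiver grants more credit before sending the rest.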

4. Talk about HTTP caching (browser caching)

(1) What is cache? What’s the point?

Definition: Caching is a technique for saving a copy of a resource and using it directly on the next request.

Function:

  • Can significantly improve the performance of your website and application.
  • Reduced wait time and network traffic.
  • Reduces the time required to display the resource representation.
  • Makes the page load faster.
  • Alleviates server pressure and improves performance.

(2) Do you know of any caching methods?

  • Browser cache
  • Proxy cache
  • Gateway cache
  • CDN cache
  • Reverse proxy cache

(3) Cache location

1) Service Worker

The Service Worker’s cache differs from other built-in caching mechanisms in that it gives us control over which files are cached, how the cache is matched, and how the cache is read, and the cache is persistent.

2) Memory Cache

Reads are fast, but the cache is short-lived: once the tab is closed, the in-memory cache is released.

3) Disk Cache

Reads are slower than from memory, but the disk cache has the advantage in capacity and in how long entries can be kept.

4) Push Cache

Push Cache belongs to HTTP/2. It exists only within a session, is released once the session ends, and keeps entries only for a short time.

(4) How does HTTP caching work

HTTP caching is divided into strong (mandatory) caching and negotiated caching.

1) Strong cache

With the strong cache, files are read directly from the cache and no request is sent.

2) Negotiated cache

With the negotiated cache, the file is already cached, but whether it may be read from the cache has to be negotiated with the server, depending on the fields set in the request and response headers. Unlike the strong cache, the negotiated cache requires a request to be sent.

3) Fields related to the strong cache

Cache-Control: a general header field used to specify caching directives in HTTP requests and responses.

The strong cache fields are Expires and Cache-Control. If both are present, Cache-Control takes precedence over Expires.

  • Cache request directives

Cache-Control: no-cache, no-store, max-age=<seconds>, max-stale[=<seconds>], min-fresh=<seconds>, no-transform, only-if-cached

  • Cache response directives

Cache-Control: public, private, no-cache, no-store, no-transform, proxy-revalidate, max-age=<seconds>, s-maxage=<seconds>, must-revalidate

  • Cache-Control directive descriptions

Directive | Description
public | All content may be cached (by both clients and proxy servers)
private | Content is cached only in private caches (clients may cache it, proxy servers may not)
no-cache | The cache must check with the server whether the response has changed before using it to satisfy later requests for the same URL. If a suitable validation token (ETag) exists, no-cache triggers one round trip to validate the cached response and avoids the download when the resource has not changed
no-store | Nothing is stored in any cache or in temporary Internet files
must-revalidate / proxy-revalidate | Once the cached content is stale, the request must be sent to the server/proxy for revalidation
max-age=xxx (xxx is a number) | Cached content expires after xxx seconds. This option is available only in HTTP1.1 and takes precedence when used together with Last-Modified
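
The precedence between Cache-Control and Expires can be sketched as a freshness calculation. The function `freshness_lifetime` is my own illustration and handles only the directives discussed here; a real browser cache considers more (for example the Age header and heuristic freshness).

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers: dict) -> int:
    """Seconds a response may be served from the strong cache (0 = must ask)."""
    raw = headers.get("Cache-Control", "")
    directives = [part.strip().lower() for part in raw.split(",")]
    if "no-store" in directives or "no-cache" in directives:
        return 0
    for directive in directives:
        name, _, value = directive.partition("=")
        if name == "max-age" and value.isdigit():
            # max-age wins over Expires when both are present.
            return int(value)
    if "Expires" in headers and "Date" in headers:
        delta = (parsedate_to_datetime(headers["Expires"])
                 - parsedate_to_datetime(headers["Date"]))
        return max(0, int(delta.total_seconds()))
    return 0
```

A response carrying both `Cache-Control: max-age=60` and an Expires value one hour ahead is treated as fresh for 60 seconds, not 3600.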

4) Fields related to the negotiated cache

The negotiated cache fields come in pairs: Last-Modified / If-Modified-Since and ETag / If-None-Match.
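
A rough server-side sketch of the negotiation follows. The names `make_etag` and `respond` are hypothetical, the ETag is simply a truncated hash of the body, and If-Modified-Since is compared by plain string equality, which is a simplification of real date comparison.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a validator from the content (one possible choice of ETag)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes, last_modified: str):
    """Return (status, validator headers, payload) for a conditional GET."""
    etag = make_etag(body)
    validators = {"ETag": etag, "Last-Modified": last_modified}
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When If-None-Match is present it takes precedence.
        not_modified = if_none_match == etag
    else:
        not_modified = request_headers.get("If-Modified-Since") == last_modified
    if not_modified:
        return 304, validators, b""   # tell the client to use its cache
    return 200, validators, body
```

The first request has no validators and gets 200 with the full body; a later request that echoes the ETag in If-None-Match gets 304 with an empty body as long as the content is unchanged.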

5) How the negotiated cache takes effect

  • First request from browser:

  • Second request from browser:

6) Browser caches: how is it decided which one to use, and when?

Browser caches are divided into the strong cache and the negotiated cache. When a client requests a resource, the cache is consulted as follows:

  • First, the browser uses some of the resource's HTTP headers to determine whether the strong cache is hit. If it is, the resource is read directly from the local cache without sending a request to the server.
  • When the strong cache misses, the client sends a request to the server, and the server uses other request headers to verify whether the resource hits the negotiated cache; this is called HTTP revalidation. If it hits, the server returns a response without the resource body, telling the client to take the resource from its cache, and the client then loads it from the cache.
  • What the strong and negotiated caches have in common is that the server does not return the resource on a cache hit. The difference is that the strong cache sends no request to the server, while the negotiated cache does. When the negotiated cache also misses, the server sends the resource back to the client.
  • When the page is force-refreshed with Ctrl+F5, it is loaded directly from the server, skipping both the strong cache and the negotiated cache.
  • When the page is refreshed with F5, the strong cache is skipped, but the negotiated cache is still checked.
  • A more detailed article:

  • Do you know 304? An illustrated guide to the strong cache and the negotiated cache

  • Link: juejin.cn/post/697452…

5. Common HTTP header fields

(1) General header

Field | Description
Request URL | The requested address
Request Method | The request method
Status Code | The status code returned
Remote Address | The remote address requested

(2) Response header

Field | Description
Cache-Control | The caching mechanism that the recipient should follow
Connection | The connection mode, for example keep-alive
Content-Encoding | The compression method the web server used for the response (gzip, deflate)
Content-Language | The language of the object the web server is returning
Content-Length | The length of the object the web server is returning
Content-Range | Which part of the entire object the response contains, for example: Content-Range: bytes
Content-Type | The type of the object the web server is returning, for example: Content-Type: application/xml
ETag | Tells the client the entity tag, a string that uniquely identifies a resource; the value can be strong or weak
Expires | When the web server considers the entity to expire
Last-Modified | When the entity was last modified on the web server
Set-Cookie | Cookie information used to start state management
Location | The target URL, used with redirection

(3) Request header

Field | Description
Accept | Acceptable content types for the response
Accept-Charset | Acceptable character sets
Accept-Encoding | Acceptable encodings for the response content
Accept-Language | Languages the browser accepts
Accept-Datetime | An acceptable time-based version of the response content
Authorization | Authentication information for a resource that requires HTTP authentication
Cache-Control | Specifies whether caching is used for the current request/response
Connection | The type of connection the client (browser) prefers to use
Cookie | An HTTP cookie previously set by the server via Set-Cookie
Content-MD5 | The binary MD5 hash (a checksum) of the request body, Base64-encoded
Content-Length | The length of the request body in octets (8-bit bytes)
Content-Type | MIME type of the request body (for POST and PUT requests)
Referer | The source of the request, that is, the URL of the page that sent the request
Expect | Indicates that the client requires the server to perform a specific behavior
From | The email address of the user who initiated the request
If-Modified-Since | Allows a 304 Not Modified response if the resource has not been modified since the given time
If-None-Match | Conditional request: checks whether the ETag held by the client differs from the resource's current ETag. If they differ, the server returns 200 with the resource; if they match, it returns 304 Not Modified and the resource can be read from the local cache
If-Unmodified-Since | A response is sent only if the entity has not been modified since the given time
Max-Forwards | Limits the number of times the message can be forwarded by proxies and gateways
Range | Requests a portion of an entity; byte offsets start at 0
User-Agent | The browser identity string
Upgrade | Asks the server to upgrade to a higher protocol version

(4) Cookie

1) Header fields serving cookies

Header field name | Description | Header type
Set-Cookie | Cookie information used to start state management | Response header field
Cookie | Cookie information received from the server | Request header field

  • Attributes of the Set-Cookie field:

Attribute | Description
NAME=VALUE | The name given to the cookie and its value (required)
expires=DATE | The validity period of the cookie, during which the browser may send it. If not specified, the cookie lasts until the browser is closed
path=PATH | The directory on the server that limits where the cookie is sent (defaults to the directory of the document if not specified)
domain=domain name | The domain name the cookie applies to (defaults to the domain of the server that created the cookie if not specified)
Secure | The cookie is sent only over secure HTTPS connections
HttpOnly | Restricts the cookie so that it cannot be accessed by JavaScript

  • Cookie:

When the client wants HTTP state management support, it includes the cookies it received from the server in its requests to tell the server so. If multiple cookies were received, they can all be sent together.
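
Both directions can be sketched with Python's standard `http.cookies` module. The wrapper names `build_set_cookie` and `parse_cookie_header` are my own, and the cookie name and value in the usage note are placeholders.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name: str, value: str, path: str = "/",
                     secure: bool = True, http_only: bool = True) -> str:
    """Build the value of a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = path
    cookie[name]["secure"] = secure        # send only over HTTPS
    cookie[name]["httponly"] = http_only   # hide from JavaScript
    return cookie[name].OutputString()


def parse_cookie_header(header_value: str) -> dict:
    """Parse the value of a Cookie request header into a dict."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {key: morsel.value for key, morsel in cookie.items()}
```

For instance, `build_set_cookie("sid", "abc123")` yields a value containing `sid=abc123` together with the Path, Secure, and HttpOnly attributes, and the browser later sends back something like `Cookie: sid=abc123; theme=dark`, which `parse_cookie_header` turns into a dictionary.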

6. HTTP status codes

Status code | Description
1XX | Informational
2XX | Success: the request was processed successfully
3XX | Redirection
4XX | Client error
5XX | Server error

The common ones are 200 (OK), 404 (resource not found), 304 (not modified, so the cached copy is used), 500 (server error), and so on. Details follow:

Status code | Meaning | Description
100 | Continue | The client should continue with its request
101 | Switching Protocols | The server switches protocols at the client's request. It can only switch to a more advanced protocol, for example a newer version of HTTP
200 | OK | The request succeeded. Typically used for GET and POST requests
201 | Created | The request succeeded and a new resource was created
202 | Accepted | The request has been accepted, but processing is not complete
203 | Non-Authoritative Information | The request succeeded, but the returned meta information comes from a copy rather than from the origin server
204 | No Content | The server processed the request successfully but returned no content. The browser keeps displaying the current document without updating the page
205 | Reset Content | The server processed the request successfully, and the user agent (for example, a browser) should reset the document view. This code can be used to clear the browser's form fields
206 | Partial Content | The server successfully processed a partial GET request
300 | Multiple Choices | The requested resource is available at multiple locations; a list of resource characteristics and addresses can be returned for the user agent (for example, a browser) to choose from
301 | Moved Permanently | The requested resource has been permanently moved to a new URI. The response includes the new URI, and the browser redirects to it automatically. All future requests should use the new URI
302 | Found | Temporarily moved. Similar to 301, but the resource is moved only temporarily, so the client should continue to use the original URI
304 | Not Modified | The requested resource has not been modified, and the server returns no resource body with this status code. Clients usually cache resources they have accessed and send a header indicating that they only want the resource if it was modified after a specified date
305 | Use Proxy | The requested resource must be accessed through a proxy
400 | Bad Request | The client request has a syntax error that the server cannot understand
401 | Unauthorized | The request requires user authentication
402 | Payment Required | Reserved for future use
403 | Forbidden | The server understands the client's request but refuses to execute it
404 | Not Found | The server could not find the resource (web page) the client requested. With this code, a web designer can set up a custom page that says "the resource you requested could not be found"
405 | Method Not Allowed | The method in the client request is not allowed
503 | Service Unavailable | The server is temporarily unable to process the request because of overload or maintenance. The length of the delay can be given in the server's Retry-After header
504 | Gateway Time-out | The server acting as a gateway or proxy did not receive a response from the upstream server in time
505 | HTTP Version Not Supported | The server does not support the HTTP version used in the request and cannot process it
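
The five classes can be derived from the first digit of the code. The sketch below relies on Python's standard `http.HTTPStatus` enum for the reason phrases; the `CLASSES` mapping and the `describe` helper are my own illustration.

```python
from http import HTTPStatus

# Class names keyed by the first digit of the status code.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return 'code phrase (class)' for a standard status code."""
    return f"{code} {HTTPStatus(code).phrase} ({CLASSES[code // 100]})"
```

Calling `describe(304)` gives "304 Not Modified (redirection)", which is a handy reminder that 304 belongs to the 3XX family even though no jump to another page takes place.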

7. HTTP request methods and scenarios

(1) HTTP request methods

  • GET: requests the specified page information and returns the entity body;
  • POST: asks the server to accept the enclosed document as a new subordinate of the resource identified by the URL;
  • HEAD: similar to a GET request, except that the response contains no body; used to obtain the headers;
  • OPTIONS: lets the client query the server's capabilities, such as which request methods it supports;
  • PUT: transfers a file;
  • DELETE: deletes a file;
  • TRACE: traces the path of the request;
  • CONNECT: asks a proxy to establish a connection using a tunneling protocol.

(2) The difference between GET and HEAD

  • The HEAD method is the same as GET, except that the server does not return a message body in the response. The meta information in the HTTP headers of a HEAD response should be the same as that sent in response to a GET request. This method can be used to obtain meta information about the entity implied by the request without transferring the entity body itself.
  • This method is typically used to test hypertext links for validity, accessibility, and recent modification.
  • The response to a HEAD request can be cached, because the information it contains can be used to update a previously cached entity for that resource. If the new field values indicate that the cached entity differs from the current entity (for example, in Content-Length, Content-MD5, ETag, or Last-Modified), the cache must treat the cache entry as expired.

(3) The difference between GET and POST

  • GET passes parameters through the URL, while POST puts them in the body. (In the HTTP protocol the URL is part of the request line at the start of the request, so its size is limited in practice.)
  • Parameters passed in the URL of a GET request are limited in length, while POST parameters are not. The reason is given above.
  • GET is harmless when the browser goes back, whereas POST submits the request again.
  • GET requests are actively cached by the browser, whereas POST requests are not unless configured manually.
  • GET is less secure than POST because the parameters are exposed directly in the URL, so it should not be used to transmit sensitive information.
  • As for the data type of the parameters, GET accepts only ASCII characters, while POST has no such restriction.
  • GET requests can only be URL-encoded (application/x-www-form-urlencoded), while POST supports multiple encodings.
  • GET produces one TCP packet, while POST produces two. For GET, the browser sends the HTTP header and data together, and the server responds with 200 (returning data). For POST, the browser first sends the header, the server responds with 100 Continue, the browser then sends the data, and the server responds with 200 OK (returning data). Note that this two-step behavior depends on the client implementation and is not required by HTTP itself.
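
The first difference, where the parameters travel, can be shown without sending anything over the network. The sketch below only constructs request objects with the standard `urllib` modules; `build_get` and `build_post` are hypothetical helper names, and the URL in the usage note is a placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_get(base_url: str, params: dict) -> Request:
    """GET: the parameters are appended to the URL as a query string."""
    return Request(f"{base_url}?{urlencode(params)}", method="GET")


def build_post(base_url: str, params: dict) -> Request:
    """POST: the same parameters travel in the request body instead."""
    body = urlencode(params).encode("ascii")
    return Request(
        base_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

With `{"q": "http cache", "page": 2}` and `https://example.com/search`, the GET request exposes `?q=http+cache&page=2` in its URL, while the POST request keeps the URL clean and carries the same string as its body.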

(4) Why do complex cross-origin requests need a preflight?

  • Complex requests can have side effects on the server.
  • For example, DELETE and PUT both modify data on the server. Before such a request is made, the browser first asks the server whether the domain of the current page is on the server's allow list. The browser sends the actual request only after the server permits it; otherwise, it does not send the actual request.
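
A simplified sketch of the browser's decision is shown below. It covers only the method, header name, and Content-Type checks; the real rules in the Fetch standard have further conditions, and `needs_preflight` is a name I made up for illustration.

```python
SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SIMPLE_HEADERS = {"accept", "accept-language", "content-language", "content-type"}
SIMPLE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def needs_preflight(method: str, headers: dict) -> bool:
    """Return True if a cross-origin request should be preceded by OPTIONS."""
    if method.upper() not in SIMPLE_METHODS:
        return True
    for name, value in headers.items():
        lowered = name.lower()
        if lowered not in SIMPLE_HEADERS:
            return True  # custom headers make the request "complex"
        if lowered == "content-type":
            media_type = value.split(";")[0].strip().lower()
            if media_type not in SIMPLE_CONTENT_TYPES:
                return True
    return False
```

Under these rules a plain GET goes straight through, while a DELETE, a PUT, or a POST with `Content-Type: application/json` triggers a preflight first.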

8. HTTP request process

(1) Collection of questions

  • What happens after you enter a URL in the browser address bar
  • The URL rendering process
  • Parsing the parameters in a URL (write code)
  • The process from URL input to page display
  • The HTML parsing and rendering process
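
For the "parse the parameters in a URL" question, one possible answer is sketched below using Python's standard `urllib.parse`. The function name `parse_url_params` and the sample URL are my own; repeated keys are returned as lists, which is a design choice rather than a requirement.

```python
from urllib.parse import parse_qs, urlsplit


def parse_url_params(url: str) -> dict:
    """Return the query parameters of a URL as a dict."""
    query = urlsplit(url).query  # the part between "?" and "#"
    parsed = parse_qs(query, keep_blank_values=True)
    # Unwrap single values; keep repeated keys as lists.
    return {key: values[0] if len(values) == 1 else values
            for key, values in parsed.items()}
```

Percent-encoded values are decoded automatically, so `q=hello%20world` comes back as "hello world", and the fragment after `#` is ignored.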

(2) Problem solving

  • The browser performs DNS domain name resolution on the requested URL and finds the real IP address.

  • Based on this IP address, find the corresponding server and initiate the TCP three-way handshake.

  • After a TCP connection is established, an HTTP request is sent.

  • The server responds to the HTTP request, and the browser gets the HTML code;

  • The browser parses the HTML code and requests resources (such as JS, CSS, images, etc.) in the HTML code.

    Note: get the HTML code before you can find these resources;

  • The browser renders the page to the user;

  • The server closes the TCP connection.

(3) Supplement

After understanding the HTTP request process, you need to understand:

① How DNS resolves a domain name;

② The TCP three-way handshake;

③ Why is the handshake three-way?

④ Why are HTTP requests implemented on top of TCP?

⑤ The TCP four-way wave;

⑥ Why is the wave four-way?

⑦ Why does establishing a connection take a three-way handshake while closing it takes a four-way wave?

⑧ What if the connection has been established, but the client suddenly fails?

⑨ HTTP request methods.

9. HTTP rendering steps

The steps for rendering a page after the HTTP response arrives are:

  • ① Parse the HTML file and build the DOM tree;
  • ② Parse the CSS files and build the CSSOM tree (the CSS rule tree);
  • ③ Combine the DOM tree with the CSSOM tree to construct the render tree;
  • ④ Reflow: calculate the layout information of each node according to the render tree;
  • ⑤ Repaint: paint the entire page based on the calculated information.

(3) HTTPS protocol

1. Advantages and disadvantages of HTTPS

(1) Advantages

1) Send data to the correct client

With the HTTPS protocol, users and servers can be authenticated to ensure that data is sent to the correct client and server.

2) Safer

HTTPS is a network protocol that uses SSL and HTTP to encrypt transmission and authenticate identity. It is more secure than HTTP and protects data from theft and alteration during transmission, ensuring data integrity.

3) Increase the cost of man-in-the-middle attacks

HTTPS is the most secure solution under the current architecture, and while it is not absolutely secure, it significantly increases the cost of man-in-the-middle attacks.

4) Higher search rankings

Google adjusted its search algorithm in 2014 so that websites encrypted with HTTPS rank higher in search results.

Baidu also announced its support for HTTPS sites in 2018, stating that HTTPS would affect search ranking as one of the signals of a high-quality site.

(2) Disadvantages

1) Page rendering takes more time

Due to SSL, the HTTPS handshake phase is time-consuming and can increase the page load time by nearly 50%.

2) Increased costs

SSL certificates cost money, and more powerful certificates cost more.

3) HTTPS connection caching is not as efficient as HTTP

HTTPS connection caching is not as efficient as HTTP, increases data overhead and power consumption, and even compromises existing security measures.

4) SSL certificates usually require IP binding

SSL certificates usually need to be bound to an IP address, and without SNI support multiple domain names cannot share one IP address. IPv4 resources cannot support that level of consumption.

5) Limitations

The HTTPS protocol also has a limited range of encryption and has little effect on hacker attacks, denial of service attacks and server hijacking. Most importantly, the SSL certificate credit chain system is not secure, especially in cases where some countries can control the CA root certificate, man-in-the-middle attacks are just as feasible.

2. HTTPS access process

A collection of questions:

  • HTTPS handshake process
  • HTTPS request process
  • HTTPS encryption and decryption

Brief explanation:

  • The client accesses the web server using an HTTPS URL and asks to establish an SSL connection with the web server.
  • After receiving the client's request, the web server sends a copy of the website's certificate information (the certificate contains the public key) to the client.
  • The client browser and the web server negotiate the security level of the SSL connection, that is, the level of encryption.
  • The client browser creates a session key according to the agreed security level, encrypts it with the website's public key, and sends it to the website.
  • The web server decrypts the session key with its own private key.
  • The web server uses the session key to encrypt its communication with the client.

Detailed explanation:

  1. The client initiates an HTTPS request

The user enters an HTTPS URL into the browser, which connects to port 443 of the server.

  2. Server configuration

This refers to the digital certificate mentioned above.

  3. The server sends the certificate

After receiving the request from the client, the web server sends a copy of the website's certificate information (including the public key) to the client.

  4. The client parses the certificate

The client checks the certificate and verifies that the public key is valid. If there is a problem, a warning is displayed. If there is no problem, the client generates a random value (the session key) and encrypts it with the public key from the certificate.

  5. The client transmits the encrypted information

The client sends the encrypted random value (the session key) to the server, which will decrypt it.

  6. The server decrypts the information

The server decrypts the message with its private key to obtain the random value (the session key), and then uses that value to encrypt content symmetrically. Symmetric encryption means that the information to be returned is mixed with the random value by an algorithm, so the data cannot be recovered unless the random value is known.

  7. The server transmits the encrypted information

The server sends the symmetrically encrypted information to the client.

  8. The client decrypts the information

The client decrypts the message sent by the server with the random value (the session key) it generated earlier, and obtains the decrypted content.
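The steps above can be modelled in a few lines with the built-in crypto module of Node.js. This is only a toy sketch of the flow as described here, not real TLS: it assumes an RSA key pair standing in for the certificate's keys and AES-256-GCM as the symmetric cipher, and the helper names encryptWithKey and decryptWithKey are made up for illustration.

```javascript
const crypto = require('crypto');

// Server: a key pair standing in for the one behind the certificate (assumption).
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

// Client: generate a random session key and encrypt it with the server's public key.
const sessionKey = crypto.randomBytes(32);
const wrappedKey = crypto.publicEncrypt(publicKey, sessionKey);

// Server: recover the session key with the private key.
const serverSessionKey = crypto.privateDecrypt(privateKey, wrappedKey);

// Both sides now share the same key and switch to symmetric encryption.
function encryptWithKey(key, text) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const data = Buffer.concat([cipher.update(text, 'utf8'), cipher.final()]);
  return { iv, data, tag: cipher.getAuthTag() };
}

function decryptWithKey(key, box) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, box.iv);
  decipher.setAuthTag(box.tag);
  return Buffer.concat([decipher.update(box.data), decipher.final()]).toString('utf8');
}

// Server encrypts a reply; client decrypts it with the key it generated earlier.
const reply = encryptWithKey(serverSessionKey, 'hello from the server');
console.log(decryptWithKey(sessionKey, reply)); // prints "hello from the server"
```

Only the short session key goes through the slow asymmetric step; everything after that uses the fast symmetric cipher.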

3. Why is HTTPS secure

Compared with HTTP, HTTPS adds TLS/SSL, a security protocol layer that sits between TCP and HTTP.

TLS/SSL relies on three basic algorithms: hash, symmetric encryption, and asymmetric encryption. The functions of these three algorithms are as follows:

  • Hash functions verify the integrity of information;
  • Symmetric encryption algorithms use the negotiated secret key to encrypt data;
  • Asymmetric encryption implements identity authentication and key negotiation.

4. How to optimize HTTPS performance?

(1) HTTPS access speed optimization

1) Set up HSTS

The server returns an HSTS (Strict-Transport-Security) HTTP response header. Once the browser has received this header, for a period of time it internally redirects requests to https://www.baidu.com by default, whether the user types www.baidu.com or http://www.baidu.com.
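As a small illustration, the header value could be assembled like this. The function name buildHstsHeader and the one-year max-age are assumptions made for this sketch; max-age and includeSubDomains are the standard directives of the Strict-Transport-Security header.

```javascript
// Build the value of a Strict-Transport-Security response header.
// maxAgeSeconds: how long the browser should remember to use HTTPS only.
function buildHstsHeader(maxAgeSeconds, includeSubDomains) {
  let value = 'max-age=' + maxAgeSeconds;
  if (includeSubDomains) {
    value += '; includeSubDomains';
  }
  return value;
}

// One year, applied to subdomains as well (example values).
console.log('Strict-Transport-Security: ' + buildHstsHeader(31536000, true));
```

A server would attach this header to its HTTPS responses; browsers ignore it when it arrives over plain HTTP.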

2) Session resume

Session resume, as the name implies, reuses an earlier session to simplify the handshake. It has two benefits:

  • Lower CPU consumption, because the asymmetric key-exchange computation is not needed.
  • Faster access, because a second full handshake is not needed, which saves one RTT and the associated computing time.

3) Enable OCSP stapling in Nginx

When the browser sends the Client Hello message, it includes a certificate status request extension. On seeing this extension, the server returns the OCSP response directly to the browser, which completes the certificate status check. Because the browser no longer needs to query the CA's site for the certificate status, this feature can significantly improve access speed.

4) Use SPDY or HTTP/2

SPDY's biggest feature is multiplexing, which allows multiple HTTP requests to be sent concurrently over the same connection, unlike HTTP/1.x, where requests on a connection can only be sent sequentially, one by one.

HTTP/2 supports multiplexing to the same effect.

Current implementations of SPDY and HTTP/2 use HTTPS by default. Both SPDY and HTTP/2 keep the existing HTTP semantics and APIs and are almost transparent to web applications.

5) False start

The principle of False Start is to send application-layer data together with the client_key_exchange message, which saves one RTT.

(2) HTTPS computing performance optimization

1) Prefer ECC (elliptic-curve cryptography) algorithms

For the same security strength, ECC needs much shorter keys than RSA or DH, which rely on ordinary discrete-logarithm style computation, so it is much faster.

Key sizes of equivalent strength (in bits):

Symmetric key size | RSA and DH key size | ECC key size
80 | 1024 | 160
112 | 2048 | 224
128 | 3072 | 256
192 | 7680 | 384
256 | 15360 | 521

Symmetric key algorithm: AES, DES, RC4

Asymmetric encryption algorithms: RSA, DH, and ECC

2) Use the latest version of OpenSSL

OpenSSL is an open source software library package that applications can use to securely communicate and avoid eavesdropping.

In general, the new version of OpenSSL is faster and more secure than the old version.

3) Hardware acceleration scheme

  • SSL dedicated accelerator card.

  • GPU-based SSL acceleration.

4) TLS remote proxy calculation

πŸ–οΈ 2. Browser storage

1. Browser storage

Feature | cookie | localStorage | sessionStorage | indexedDB
Data life cycle | Generally generated by the server; an expiration time can be set | Persists until it is cleared | Cleared when the page is closed | Persists until it is cleared
Storage size | 4K | 5M | 5M | Unlimited
Communication with the server | Carried in the header of every request, which affects request performance | Not involved | Not involved | Not involved

Note: a cookie is meant for communicating with the server rather than for storage. If you need to read and write cookies, wrap the access in your own API.

localStorage has its own getItem and setItem methods, which makes it very convenient to use.

Notes on localStorage:

  • localStorage can only store strings. To store and read JSON data, use JSON.stringify() and JSON.parse().

  • setItem can fail, for example when storage is disabled, so use try...catch to catch the exception.
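Both notes can be combined into a small wrapper. The sketch below is one possible shape, not a standard API: the names saveJSON, loadJSON, and createMemoryStorage are hypothetical, and the storage object is passed in as a parameter so that the same code works with window.localStorage in a browser or with the in-memory stand-in used here.

```javascript
// Serialize a value to JSON and store it; returns false instead of throwing.
function saveJSON(storage, key, value) {
  try {
    storage.setItem(key, JSON.stringify(value));
    return true;
  } catch (e) {
    // setItem can throw, e.g. when storage is disabled or the quota is exceeded.
    return false;
  }
}

// Read a value back and parse it; returns the fallback when missing or invalid.
function loadJSON(storage, key, fallback) {
  try {
    const raw = storage.getItem(key);
    return raw === null ? fallback : JSON.parse(raw);
  } catch (e) {
    return fallback;
  }
}

// Minimal in-memory stand-in with the same getItem/setItem shape (for this demo only).
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: function (k) { return data.has(k) ? data.get(k) : null; },
    setItem: function (k, v) { data.set(k, String(v)); }
  };
}

const demoStorage = createMemoryStorage();
saveJSON(demoStorage, 'user', { name: 'zhe' });
console.log(loadJSON(demoStorage, 'user', null).name); // prints "zhe"
```

In a browser the calls would look like saveJSON(window.localStorage, 'user', { name: 'zhe' }).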

2. Cookie, localStorage, and sessionStorage

(1) What are cookie, localStorage and sessionStorage?

1) cookie

  • A cookie is a very specific thing: a piece of data that can be stored persistently in the browser. It is simply a data storage feature implemented by the browser.

  • The cookie is generated by the server and sent to the browser, which saves it as key-value pairs in a text file in some directory. The next time the same website is requested, the cookie is sent back to the server.

  • The expiration time is set when the cookie is created. If no expiration time is set, the cookie lives for the duration of the browser session and disappears when the browser window is closed; such cookies are called session cookies. If an expiration time is set, the cookie remains valid until that time, even if the window or browser is closed.

  • Session cookies are generally kept in memory rather than on disk, although the specification does not mandate this. If an expiration time is set, the browser saves the cookie to disk, so it stays valid across browser restarts until the expiration time passes. Different browsers handle in-memory cookies differently.

  • You can set a cookie with document.cookie = "key=value". Cookie values are key-value pairs: setting an existing key overwrites its value, while setting a new key adds another pair.
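Because document.cookie returns all pairs as one string, reading a single value usually goes through a small helper. The following is a minimal sketch; parseCookies and getCookie are hypothetical names, and the cookie string is passed in explicitly so the code also runs outside a browser.

```javascript
// Turn a cookie string such as "theme=dark; lang=en" into a plain object.
function parseCookies(cookieString) {
  const result = {};
  cookieString.split(';').forEach(function (pair) {
    const index = pair.indexOf('=');
    if (index === -1) return; // skip empty or malformed fragments
    const key = decodeURIComponent(pair.slice(0, index).trim());
    result[key] = decodeURIComponent(pair.slice(index + 1).trim());
  });
  return result;
}

// Look up one cookie by name; returns undefined when it is not present.
function getCookie(cookieString, name) {
  return parseCookies(cookieString)[name];
}

console.log(getCookie('theme=dark; lang=en', 'lang')); // prints "en"
```

In a browser, the first argument would simply be document.cookie.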

2) localStorage

  • It is always valid and persists even after the window or browser is closed, so it is used for persistent data;
  • It is shared across all same-origin windows and does not expire, whether or not the window or browser is closed.

3) sessionStorage

  • A form of browser storage.

  • It is valid only until the current browser window is closed, so it cannot persist data.

  • Within the same browser, a page opened by navigating from the current page can share it, but a page opened directly in a new window cannot.

(2) Similarities and differences between Cookie, localStorage and sessionStorage

1) The similarities of the three are as follows:

  • All three are stored on the browser side and are subject to the same-origin policy.

2) The difference between the three is:

  • Communication with the server differs:

    Cookie data is always carried in same-origin HTTP requests (even when it is not needed), so cookies travel back and forth between the browser and the server. sessionStorage and localStorage are only stored locally and are never sent to the server automatically.

    Cookie data also has the concept of a path, which can restrict a cookie to a specific path.

  • Storage size limits differ:

    Cookie data cannot exceed 4K, and because every HTTP request carries the cookies, cookies are only suitable for small data such as a session id;

    sessionStorage and localStorage are also limited in size, but are far larger than cookies: 5M or more.

  • Data validity periods differ:

    sessionStorage: valid only until the current browser window is closed;

    localStorage: always valid; it is kept even after the window or browser is closed and is therefore used for persistent data;

    Cookie: valid until the configured expiration time, even if the window or browser is closed.

  • Scopes differ:

    sessionStorage is not shared across different browser windows, even for the same page;

    localStorage and cookies are shared across all same-origin windows;

3. Use of cookies

(1) Save the user login status

For example, storing the user ID in a cookie so that the user does not need to log in again the next time they visit the page is a feature that is now available in many forums and communities.

Cookies can also be given an expiration time; once it passes, the cookie disappears automatically. For this reason a system can ask users how long they want to stay logged in, with common options such as one month, three months, or a year.

(2) Tracking user behavior

A weather website, for example, can display local weather based on the region the user selects. Choosing the location on every visit would be tedious; with cookies the site becomes much more user-friendly. The system can remember the region chosen last time and automatically show the weather for that region the next time the page is opened.

Because everything happens in the background, such pages feel as if they were customized for a particular user and are very convenient to use. If the site lets users change the skin or layout, cookies can record the user's choices, such as background color and resolution, so that the interface style from the last visit is preserved on the next visit.

4. Session and Token

(1) Session

Here's an example:

  • session literally means a conversation. When you are talking to someone, how do you know that you are talking to Zhang San and not to someone else? There must be something about the other person (looks, height, and so on) that tells you he is Zhang San.
  • A session works similarly: the server needs to know who is currently sending it a request.
  • To make this distinction, the server assigns each client a different "identity", which is what we usually call the sessionId. Every time the client sends a request to the server it carries this identity, so the server knows who the request came from.
  • As for how the client saves this identity: browser clients use a cookie by default in most cases, but localStorage or sessionStorage can also be used to store it, depending on your needs.
  • Note that within one session, opening the same page twice still counts as the same session.
  • The server uses the session to save the user's information temporarily on the server; after the user leaves the site, the session is destroyed.
  • This way of storing user information is safer than a cookie, but sessions have a defect: if the web servers are load balanced, the session is lost when the next request arrives at a different server.

To sum up, session:

  • When the program needs to create a session for a client's request, the server first checks whether the request already contains a session identifier (called the sessionId). If it does, a session has already been created for this client, and the server uses the sessionId to retrieve the corresponding session (if none can be found, it creates a new one). If the request does not contain a sessionId, the server creates a session for the client and generates a sessionId associated with it. The sessionId should be a string that does not repeat and whose pattern is hard to guess or forge. It is returned to the client in the response for saving. The client can keep the sessionId in a cookie, or in localStorage or sessionStorage, so that it can be sent back to the server on later requests (with a cookie, the browser does this automatically according to the cookie rules).
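The lookup-or-create logic in this summary can be sketched as follows. This is an assumed in-memory model for illustration only: the function name getOrCreateSession and the Map-based store are hypothetical, and a real server would also expire sessions and, as noted above, share them between load-balanced machines.

```javascript
const crypto = require('crypto');

// In-memory session store: sessionId -> session data (lost on restart; demo only).
const sessions = new Map();

// Return the existing session for a sessionId, or create a new one.
function getOrCreateSession(sessionId) {
  if (sessionId && sessions.has(sessionId)) {
    return { sessionId: sessionId, session: sessions.get(sessionId), created: false };
  }
  // A random, hard-to-guess identifier: 16 random bytes as 32 hex characters.
  const newId = crypto.randomBytes(16).toString('hex');
  const session = {};
  sessions.set(newId, session);
  return { sessionId: newId, session: session, created: true };
}

// First request: no sessionId yet, so a session is created.
const first = getOrCreateSession(undefined);
first.session.userName = 'Zhang San';

// Later request carrying the sessionId: the same session is found again.
const second = getOrCreateSession(first.sessionId);
console.log(second.created, second.session.userName); // prints: false Zhang San
```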

(2) Token

  • Token-based authentication is ubiquitous on the web. In most internet companies that use web APIs, tokens are the best way to handle authentication for multiple users.

  • Token-based authentication offers the following features for your application:

    • Stateless and scalable;
    • Support for mobile devices;
    • Cross-program calls;
    • Security.
  • Most of the APIs and web applications you see use tokens, for example Facebook, Twitter, Google, and GitHub.

  • A more detailed article is linked below 👇

  • To solve browser storage problems, you have to understand cookies, localStorage and sessionStorage

  • Link: juejin.cn/post/697340…
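To show what "stateless" means in practice, here is a minimal sketch of a signed token. It is a simplified, assumed format (base64 payload plus an HMAC-SHA256 signature), not JWT or any particular library's API, and the names signToken and verifyToken are hypothetical. The server keeps only the secret; everything else travels inside the token.

```javascript
const crypto = require('crypto');

// Create a token: base64-encoded JSON payload + '.' + HMAC-SHA256 signature (hex).
function signToken(payload, secret) {
  const body = Buffer.from(JSON.stringify(payload)).toString('base64');
  const signature = crypto.createHmac('sha256', secret).update(body).digest('hex');
  return body + '.' + signature;
}

// Verify a token: returns the payload when the signature matches, otherwise null.
function verifyToken(token, secret) {
  const parts = token.split('.');
  if (parts.length !== 2) return null;
  const expected = crypto.createHmac('sha256', secret).update(parts[0]).digest('hex');
  const given = Buffer.from(parts[1]);
  const wanted = Buffer.from(expected);
  // Compare in constant time; lengths must match before timingSafeEqual is called.
  if (given.length !== wanted.length || !crypto.timingSafeEqual(given, wanted)) {
    return null;
  }
  return JSON.parse(Buffer.from(parts[0], 'base64').toString('utf8'));
}

const token = signToken({ userId: 42 }, 'demo-secret');
console.log(verifyToken(token, 'demo-secret')); // prints { userId: 42 }
console.log(verifyToken(token, 'other-secret')); // prints null
```

Unlike a session, nothing has to be looked up on the server, which is why any machine behind a load balancer can verify the token.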

🏜️

1. What is same-origin policy?

The same-origin policy is a browser security policy: two web addresses can access each other only when their protocol, domain name, and port are all the same. If any of the protocol, domain name, or port differs, the browser restricts scripts on the page from reading or manipulating resources that belong to the other origin.
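The comparison itself is easy to express with the standard URL class. The helper name isSameOrigin is an assumption made for this sketch; browsers perform the check internally.

```javascript
// Two URLs are same-origin when protocol, host name, and port all match.
function isSameOrigin(urlA, urlB) {
  const a = new URL(urlA);
  const b = new URL(urlB);
  return a.protocol === b.protocol &&
    a.hostname === b.hostname &&
    a.port === b.port;
}

console.log(isSameOrigin('http://www.a.com/a.html', 'http://www.a.com/b/c.html')); // true: only the path differs
console.log(isSameOrigin('http://www.a.com/a.html', 'https://www.a.com/a.html')); // false: protocol differs
console.log(isSameOrigin('http://www.a.com/a.html', 'http://script.a.com/a.html')); // false: domain differs
console.log(isSameOrigin('http://www.a.com/a.html', 'http://www.a.com:8080/a.html')); // false: port differs
```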

2. Why do browsers have same-origin policies?

If there is no same-origin policy, others can easily obtain the cookie information of our website or conduct DOM operations on the webpage.

This is very dangerous, especially for cookies: a cookie contains the sessionId, which is an important credential for the session with the server. If someone obtains the cookie, data theft and other consequences may follow.

3. What does the same-origin policy restrict?

  • Data stored in the browser, such as localStorage, cookies, and IndexedDB, cannot be accessed across domains by scripts;
  • Scripts cannot operate on the DOM of a different domain;
  • Ajax cannot be used to request data from a different domain.

4. Cross-domain problem solutions

(1) Cross domain through JSONP

A collection of questions:

  • The principle of JSONP
  • How to communicate securely with JSONP

1) Principle of JSONP

JSONP (JSON with Padding) is a "usage pattern" of JSON that allows web pages to request data from other domains.

The XMLHttpRequest object is subject to the same-origin policy, but the <script> element is not. Using this open policy of the <script> element, web pages can get JSON data dynamically generated by other origins.

The data fetched with JSONP is not JSON but arbitrary JavaScript, which is run by the JavaScript interpreter rather than parsed by a JSON parser.

As a result, in Chrome every JSONP GET request is shown with type js (script), not XHR.

2) JSONP consists of two parts: the callback function and the data

The callback function is the function, defined on the current page, that is called when the response arrives.

The data is the JSON data passed to the callback function as its argument.

// Callback that the JSONP response will call with the data
function handleResponse(response) {
  console.log('The response data is: ' + response.data);
}
var script = document.createElement('script');
script.src = 'http://www.baidu.com/json/?callback=handleResponse';
document.body.insertBefore(script, document.body.firstChild);
// Expected response body: handleResponse({"data": "zhe"})
// How it works:
// 1. The page sends the request through a script tag.
// 2. The server reads the callback parameter (handleResponse)
//    and wraps the JSON data in a call to it: handleResponse({"data": "zhe"})
// 3. The returned code is executed on the current page,
//    which completes the cross-domain communication.

3) Disadvantages:

  • Only GET requests can be used.
  • Event listeners such as success and error cannot be registered, so it is hard to tell whether a JSONP request failed.
  • JSONP loads and runs code from another domain, so it is vulnerable to cross-site request forgery attacks and its security cannot be guaranteed.
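On the server side, the response is just the JSON data wrapped in a call to the requested callback. The sketch below shows one cautious way to build it; buildJsonpResponse is a hypothetical helper, and restricting the callback name to a plain identifier is an assumption made here to avoid echoing arbitrary script back to the page.

```javascript
// Build the body of a JSONP response: callbackName(<json>);
function buildJsonpResponse(callbackName, data) {
  // Accept only simple identifiers, so the parameter cannot smuggle in extra code.
  if (!/^[A-Za-z_$][\w$]*$/.test(callbackName)) {
    throw new Error('Invalid callback name');
  }
  return callbackName + '(' + JSON.stringify(data) + ');';
}

console.log(buildJsonpResponse('handleResponse', { data: 'zhe' }));
// prints handleResponse({"data":"zhe"});
```

A request such as ?callback=alert(1)// would be rejected by the identifier check instead of being reflected into the response.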

(2) Cross domain by modifying document.domain (same primary domain)

1) Prerequisites:

Both domains must belong to the same base domain, and the protocol and port must be identical; otherwise document.domain cannot be used. In other words, this technique only works across subdomains.

Within the scope of the root domain, the domain property can be set to a parent domain. For example, a page in the aaa.xxx.com domain can set domain to xxx.com, but cannot set it to xxx.org or com.

2) Example

For example: www.a.com/a.html and HTTP…

① In www.a.com/a.html:

document.domain = 'a.com';
var ifr = document.createElement('iframe');
ifr.src = 'http://www.script.a.com/b.html';
ifr.style.display = 'none'; // hide the iframe
document.body.appendChild(ifr);
ifr.onload = function () {
  var doc = ifr.contentDocument || ifr.contentWindow.document;
  // doc can be manipulated here
  ifr.onload = null;
};

② In www.script.a.com/b.html:

document.domain = 'a.com';

Once both HTML pages have set document.domain to the same value with JavaScript, they can access each other.

(3) Use window.name to cross domains

1) How does window.name work across domains?

A cross-domain HTML file is loaded in an iframe (usually created dynamically). That HTML file assigns the string it wants to pass to the requester to window.name, and the requester then reads the window.name value as the response.

2) What it relies on

The cross-domain capability of the iframe tag;

The fact that the window.name value persists after the document is reloaded (the maximum allowed size is about 2M).

3) Example

For example: www.a.com/a.html and HTTP…

① In www.a.com/a.html:

<script>
  var iframe = document.createElement('iframe');
  iframe.style.display = 'none'; // hide the iframe

  var state = 0; // prevents the page from being reloaded indefinitely
  iframe.onload = function () {
    if (state === 1) {
      // Second load: the iframe is same-origin now, so window.name can be read
      console.log(iframe.contentWindow.name);
      // Clean up the created iframe
      iframe.contentWindow.document.write('');
      iframe.contentWindow.close();
      document.body.removeChild(iframe);
    } else if (state === 0) {
      state = 1;
      // First load: b.html has set window.name. Point the iframe at a blank
      // proxy.html on the current domain. Reading the name before doing so fails with:
      // Blocked a frame with origin "http://www.a.com" from accessing a cross-origin frame.
      iframe.contentWindow.location = 'http://www.a.com/proxy.html';
    }
  };
  iframe.src = 'http://www.b.com/b.html';
  document.body.appendChild(iframe);
</script>

② In www.b.com/b.html:

<script>
  window.name = 'Contents to be transmitted';
</script>

(4) Use HTML5's newly introduced window.postMessage method

1) The window.postMessage method

window.postMessage is a new HTML5 feature that can be used to send messages to other Window objects. Note that the MessageEvent is dispatched only after all pending scripts have finished executing. If postMessage is called in the middle of a function, the message is not delivered until that function and the code queued after it have finished running.

2) Example

① a.com/index.html:

<iframe id="ifr" src="http://b.com/index.html"></iframe>
<script type="text/javascript">
    window.onload = function() {
         var ifr = document.getElementById('ifr');
         // Writing 'http://b.com/c/proxy.html' here has the same effect;
         // postMessage will not be delivered if it is written as 'http://c.com'
         var targetOrigin = 'http://b.com';
         ifr.contentWindow.postMessage('I was there!', targetOrigin);
    };
</script>

② b.com/index.html:

<script type="text/javascript">
    window.addEventListener('message', function(event) {
        // Check where the message came from using the origin property
        if (event.origin === 'http://a.com') {
            alert(event.data);   // pops up "I was there!"
            alert(event.source); // a reference to the window object of a.com/index.html
            // Because of the same-origin policy, event.source cannot be used
            // to access the contents of that window object
        }
    }, false);
</script>
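The origin check on the receiving side is the part that is easiest to get wrong, so it can help to isolate it in a small factory. This is a hedged sketch: createMessageHandler is a hypothetical name, and the events in the demo are plain objects standing in for real MessageEvent instances.

```javascript
// Create a 'message' listener that only accepts events from trusted origins.
function createMessageHandler(allowedOrigins, onMessage) {
  return function (event) {
    if (allowedOrigins.indexOf(event.origin) === -1) {
      return false; // ignore messages from unknown origins
    }
    onMessage(event.data, event.origin);
    return true;
  };
}

// Demo with plain objects in place of MessageEvent.
const received = [];
const handler = createMessageHandler(['http://a.com'], function (data) {
  received.push(data);
});
handler({ origin: 'http://a.com', data: 'I was there!' });
handler({ origin: 'http://evil.example', data: 'ignored' });
console.log(received); // prints [ 'I was there!' ]
```

In a page it would be registered with window.addEventListener('message', handler, false).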

(5) CORS

1) What is CORS?

Cross-Origin Resource Sharing (CORS) is a browser technology specification. It gives web servers a way to let sandboxed scripts from different domains access their resources, relaxing the browser's same-origin policy while keeping cross-domain data transfer secure. Modern browsers apply CORS in API containers such as XMLHttpRequest to reduce the risks of cross-origin HTTP requests. Unlike JSONP, CORS supports other HTTP methods in addition to GET.

2) The server generally needs to add one or more of the following response headers:

Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: POST, GET, OPTIONS
Access-Control-Allow-Headers: X-PINGOTHER, Content-Type
Access-Control-Max-Age: 86400

3) Cross-domain requests do not carry cookies by default. To carry cookies, configure both sides as follows (when credentials are allowed, Access-Control-Allow-Origin must name a specific origin rather than *):

// Server response header
"Access-Control-Allow-Credentials": true
// Ajax setting on the client
"withCredentials": true
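A server typically decides these headers per request, based on the Origin header it receives. The following sketch shows one possible shape; buildCorsHeaders and the allow-list parameter are assumptions for illustration, while the header names are the standard CORS response headers listed above.

```javascript
// Build CORS response headers for a request, or return {} when the origin is not trusted.
function buildCorsHeaders(requestOrigin, allowedOrigins, allowCredentials) {
  if (allowedOrigins.indexOf(requestOrigin) === -1) {
    return {}; // no CORS headers: the browser will block the cross-origin read
  }
  const headers = {
    // Echo the specific origin; '*' is not allowed together with credentials.
    'Access-Control-Allow-Origin': requestOrigin,
    'Access-Control-Allow-Methods': 'POST, GET, OPTIONS',
    'Access-Control-Allow-Headers': 'Content-Type',
    'Access-Control-Max-Age': '86400',
    'Vary': 'Origin'
  };
  if (allowCredentials) {
    headers['Access-Control-Allow-Credentials'] = 'true';
  }
  return headers;
}

console.log(buildCorsHeaders('http://a.com', ['http://a.com'], true));
```

The Vary: Origin header tells caches that the response depends on the requesting origin.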

4) In IE, CORS is implemented with XDR (XDomainRequest)

var xdr = new XDomainRequest();
xdr.onload = function () {
  console.log(xdr.responseText);
};
xdr.open('get', 'http://www.baidu.com');
// ...
xdr.send(null);

5) In other browsers, CORS is implemented in XHR

var xhr = new XMLHttpRequest();
xhr.onreadystatechange = function () {
  if (xhr.readyState == 4) {
    // 2xx means success; 304 means the cached copy is still valid
    if (xhr.status >= 200 && xhr.status < 300 || xhr.status == 304) {
      console.log(xhr.responseText);
    }
  }
};
xhr.open('get', 'http://www.baidu.com');
// ...
xhr.send(null);

6) Implement CORS across browsers

function createCORS(method, url) {
  var xhr = new XMLHttpRequest();
  if ('withCredentials' in xhr) {
    // XMLHttpRequest with CORS support
    xhr.open(method, url, true);
  } else if (typeof XDomainRequest != 'undefined') {
    // Fall back to XDomainRequest in old versions of IE
    xhr = new XDomainRequest();
    xhr.open(method, url);
  } else {
    xhr = null;
  }
  return xhr;
}
var request = createCORS('get', 'http://www.baidu.com');
if (request) {
  request.onload = function () {
    // ...
  };
  request.send();
}

(6) Create script dynamically

Script tags are not restricted by the same-origin policy.

function loadScript(url, func) {
  var head = document.head || document.getElementsByTagName('head')[0];
  var script = document.createElement('script');
  script.src = url;
  script.onload = script.onreadystatechange = function () {
    // readyState only exists in old IE; other browsers just fire onload
    if (!this.readyState || this.readyState == 'loaded' || this.readyState == 'complete') {
      func();
      script.onload = script.onreadystatechange = null;
    }
  };
  head.insertBefore(script, head.firstChild);
}
window.baidu = {
  sug: function (data) {
    console.log(data);
  }
};
loadScript('http://suggestion.baidu.com/su?wd=w', function () {
  console.log('loaded');
});
// Where is the content we requested?
// The content loaded by the script can be seen under Sources in the Chrome debug panel

(7) Use location.hash across domains

1) Principle:

The idea is to use location.hash to pass values.

2) Steps:

Suppose the file cs1.html under the domain a.com needs to communicate with the file cs2.html under the domain cnblogs.com.

① cs1.html automatically creates a hidden iframe whose src points to the cs2.html page under the cnblogs.com domain.

② cs2.html responds to the request and then passes data back by modifying the hash value of cs1.html.

③ Meanwhile, a timer on cs1.html checks at intervals whether location.hash has changed; if it has, it reads the hash value.

Note: because the two pages are not in the same domain, IE and Chrome do not allow cs2.html to change parent.location.hash, so a proxy iframe under the a.com domain is needed.

The code is as follows:

First, the cs1.html file from a.com:

function startRequest() {
  var ifr = document.createElement('iframe');
  ifr.style.display = 'none';
  ifr.src = 'http://www.cnblogs.com/lab/cscript/cs2.html#paramdo';
  document.body.appendChild(ifr);
}
function checkHash() {
  try {
    var data = location.hash ? location.hash.substring(1) : '';
    if (console.log) {
      console.log('Now the data is ' + data);
    }
  } catch (e) {}
}
setInterval(checkHash, 2000);

cs2.html under the domain cnblogs.com:

// Simulate a simple parameter-handling operation
switch (location.hash) {
  case '#paramdo':
    callBack();
    break;
  case '#paramset':
    // do something...
    break;
}

function callBack() {
  try {
    parent.location.hash = 'somedata';
  } catch (e) {
    // The security mechanisms of IE and Chrome do not allow parent.location.hash to be modified,
    // so use an intermediate proxy iframe under the a.com domain
    var ifrproxy = document.createElement('iframe');
    ifrproxy.style.display = 'none';
    ifrproxy.src = 'http://a.com/test/cscript/cs3.html#somedata'; // note: this file is under the a.com domain
    document.body.appendChild(ifrproxy);
  }
}

cs3.html under the domain a.com:

// parent.parent is in the same domain as this page, so its location.hash can be changed

parent.parent.location.hash = self.location.hash.substring(1);

(8) WebSocket

A collection of questions:

  • What protocol does real-time collaborative editing use? WebSocket
  • What is WebSocket and how is it different from polling?
  • How is a WebSocket connection set up? What is its relationship to HTTP?
  • Does WebSocket have an origin restriction?

1) What is WebSocket?

WebSocket is a browser API whose goal is to provide full-duplex, bidirectional communication over a single persistent connection. (The same-origin policy does not apply to WebSocket.)

2) How WebSocket works

The client sends an HTTP request to the server asking to upgrade the protocol. The server then switches protocols and returns a response to the client. From that point on, the established connection is upgraded from HTTP to WebSocket.
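Part of that upgrade exchange is a small challenge and response: the client sends a Sec-WebSocket-Key header, and the server proves that it understands WebSocket by answering with a Sec-WebSocket-Accept header derived from it. The sketch below computes that value as defined in RFC 6455; the function name computeAcceptKey is an assumption, and the sample key is the one used in the RFC's own example.

```javascript
const crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Sec-WebSocket-Accept = base64( SHA-1( Sec-WebSocket-Key + GUID ) )
function computeAcceptKey(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest('base64');
}

// Sample key from RFC 6455, section 1.3.
console.log(computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ=='));
// prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The server returns this value with status 101 Switching Protocols, after which both sides speak the WebSocket framing protocol on the same TCP connection.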

3) When can WebSocket be used?

It works only with servers that support the WebSocket protocol.

var socket = new WebSocket('ws://www.baidu.com'); // http -> ws; https -> wss
socket.onopen = function () {
  // send only after the connection has been opened
  socket.send('hello WebSocket');
};
socket.onmessage = function (event) {
  var data = event.data;
};

4) polling

  • Polling is when the client repeatedly asks the server if there is any new information.
  • Compatibility: Short polling > Long polling > WebSocket.
  • Performance: Websocket > Long poll > Short poll.

(9) Nginx reverse proxy

1) What is an Nginx reverse proxy?

  • It does not require cooperation from the target server, but you need to set up an intermediate Nginx server to forward the requests.
  • It has to be configured at the operations level, and the requested resources may be outside our control (third party), so this approach cannot serve as a general solution.

⛺ Front-end security

1. Cross-site scripting attacks (XSS)

(1) What is XSS

Cross-site scripting (XSS) attacks exploit vulnerabilities in web pages to inject malicious code, so that the injected code is executed in the user's browser when the page loads.

(2) Attack types of XSS

There are three types of XSS:

  • Non-persistent cross-site (also called reflected)
  • Persistent cross-site (also called stored)
  • DOM-based cross-site

1) Non-persistent cross-site (reflected)

① Attack procedure

  • The attacker constructs a special URL that contains malicious code.
  • When the user opens the URL containing the malicious code, the web server takes the malicious code out of the URL, splices it into the HTML, and returns it to the browser.
  • When the user's browser receives the response, it parses it and executes the malicious code mixed in.
  • Malicious code steals user data and sends it to the attacker's website, or impersonates the user's behavior and calls the target website interface to perform the operations specified by the attacker.

② Attack scenario

Reflected XSS (also known as non-persistent XSS) vulnerabilities are common in features that pass parameters through URLs, such as site search and redirects.

③ Attack method

Because the user has to actively open the malicious URL for the attack to take effect, attackers often combine several techniques to lure users into clicking.

Reflected XSS can also be triggered through the content of a POST request, but the trigger conditions are stricter (a form submission page has to be constructed and the user has to be led to click it), so this is very rare.

2) Persistent cross-site (stored)

① Attack procedure

  • The attacker submits malicious code to the database of the target website.
  • When the user opens the target website, the web server takes the malicious code out of the database, splices it into the HTML, and returns it to the browser.
  • When the user's browser receives the response, it parses it and executes the malicious code mixed in.
  • Malicious code steals user data and sends it to the attacker's website, or impersonates the user's behavior and calls the target website interface to perform the operations specified by the attacker.

② Attack scenario

Stored XSS attacks (also known as persistent XSS) are common in website features that save user data, such as forum posts, product reviews, and user messages.

③ Harm

Stored XSS is one of the most dangerous types of cross-site scripting. It is more stealthy than reflected and DOM-based XSS, and therefore more harmful, because it does not require the user to trigger it manually.

Any web program that allows users to store data may have a stored XSS vulnerability. Once an attacker submits a piece of XSS code, the server receives and stores it, and every visitor who opens the affected page is attacked.

3) DOM-based cross-site

① Attack procedure

  • The attacker constructs a special URL that contains malicious code.
  • The user opens the URL containing the malicious code.
  • After the user's browser receives the response and parses it, the front-end JavaScript takes the malicious code out of the URL and executes it.
  • Malicious code steals user data and sends it to the attacker’s website, or impersonates the user’s behavior and calls the target website interface to perform the operations specified by the attacker.

② Harm

The DOM represents the objects in HTML, XHTML, and XML documents, and lets programs and scripts dynamically access and update the content, structure, and style of a document. DOM-based XSS does not need the server to take part in building the response: triggering it depends entirely on DOM parsing in the browser, so preventing DOM-based XSS is entirely the responsibility of the front end. Be careful!

Summary:

The differences between reflective and memory types are:

Stored XSS malicious code is stored in the database, reflective XSS malicious code is stored in the URL.

The first two differences of DOM are:

DOM TYPE XSS attack, the extraction and execution of malicious code is completed by the browser side, which is a security vulnerability of front-end JavaScript itself, while the other two XSS are security vulnerabilities of the server side.

Comparison of the three:

Type | Storage area | Insertion point
Reflected XSS | URL | HTML
Stored XSS | Back-end database | HTML
DOM-based XSS | Back-end database / front-end storage / URL | Front-end JavaScript

(3) How to defend against XSS

Wherever there is input data, there can be XSS hazards.

1) Set HttpOnly

After the HttpOnly attribute is set on a cookie, JavaScript cannot read that cookie, so an injected script cannot steal it through document.cookie.
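As a minimal sketch, the server sets the attribute in the Set-Cookie response header. The helper name `buildSessionCookie` and the cookie name `sid` are assumptions made for illustration.

```javascript
// Hypothetical helper: builds the value of a Set-Cookie response header.
function buildSessionCookie(name, value) {
  const parts = [
    name + '=' + encodeURIComponent(value),
    'Path=/',
    'HttpOnly' // hides the cookie from document.cookie in the browser
  ];
  return parts.join('; ');
}

console.log(buildSessionCookie('sid', 'abc123'));
// sid=abc123; Path=/; HttpOnly
```

With Node’s built-in http module, this string would be passed to `res.setHeader('Set-Cookie', ...)`.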

2) Escape strings

Most XSS attacks use the places where data is input and output as their attack points, so data should be filtered at those points.

This covers data entering and leaving the front end as well as data entering and leaving the back end.

So what is data filtering, and how is it done?

Data filtering checks that input such as an email address, phone number, user name, or password matches the required format. The front end is not the only place for this: the back end must run the same checks, because otherwise an attacker can bypass the normal input flow and send data straight to the server through its interface.
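A format check of this kind might look like the sketch below. The field names and the two patterns are assumptions chosen for illustration, not rules from any particular site, and the same function would need to run on the back end as well.

```javascript
// Hypothetical format rules; real projects define their own.
const RULES = {
  email: /^[^\s@]+@[^\s@]+\.[^\s@]+$/,  // something@something.tld, no spaces
  username: /^[A-Za-z0-9_]{3,16}$/      // 3 to 16 letters, digits, or underscores
};

// Return the names of the fields that are missing or badly formatted.
function findInvalidFields(input) {
  return Object.keys(RULES).filter(
    (field) => typeof input[field] !== 'string' || !RULES[field].test(input[field])
  );
}

console.log(findInvalidFields({ email: 'tom@site.example', username: 'tom_01' }));
// []
console.log(findInvalidFields({ email: 'tom@site.example', username: '<script>' }));
// [ 'username' ]
```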

Beyond format checks, a filtering function can be encapsulated to escape the characters attackers commonly use, so that the browser does not parse the input as script code:

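// Named escapeHtml so it does not shadow the deprecated global escape().
function escapeHtml(str) {
  // "&" must be replaced first so later entities are not double-escaped.
  str = str.replace(/&/g, '&amp;');
  str = str.replace(/</g, '&lt;');
  str = str.replace(/>/g, '&gt;');
  str = str.replace(/"/g, '&quot;');
  str = str.replace(/'/g, '&#39;');
  str = str.replace(/`/g, '&#96;');
  str = str.replace(/\//g, '&#x2F;');
  return str;
}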

3) Whitelist

When displaying rich text, escaping every character is not an option, because it would also strip the formatting that is needed. In this case whitelist filtering is usually adopted: only the tags and attributes on an approved list are kept. Blacklist filtering is also possible, but there are too many tags and attributes to block, so whitelist filtering is recommended.
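One simple way to sketch whitelist filtering is to escape everything first and then restore only an approved set of attribute-free tags. The tag list and function names below are assumptions for illustration; a real project would more likely rely on a well-tested sanitizing library, since this sketch drops all attributes and does not check that tags are balanced.

```javascript
// Hypothetical whitelist of formatting tags that are allowed through.
const ALLOWED_TAGS = ['b', 'i', 'em', 'strong', 'p', 'br'];

// Escape every character that could start markup or an entity.
function escapeHtml(str) {
  return String(str)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Escape the whole input, then turn back only bare whitelisted tags
// such as <b> or </p>. Tags carrying attributes stay escaped.
function sanitizeRichText(input) {
  const escaped = escapeHtml(input);
  const pattern = new RegExp(
    '&lt;(/?)(' + ALLOWED_TAGS.join('|') + ')&gt;',
    'gi'
  );
  return escaped.replace(
    pattern,
    (match, slash, tag) => '<' + slash + tag.toLowerCase() + '>'
  );
}

console.log(sanitizeRichText('<b>hi</b><script>alert(1)</script>'));
// <b>hi</b>&lt;script&gt;alert(1)&lt;/script&gt;
```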

  • A more detailed article is available here 👇
  • Discussion on Web front-end security policy XSS and CSRF, and how to prevent?
  • Link: juejin.cn/post/697269…

2. Cross-site request forgery (CSRF)

(1) What is CSRF

Cross-site request forgery, also known as a one-click attack or session riding and commonly abbreviated CSRF or XSRF, is an attack that hijacks a user into performing unintended actions on a web application where the user is currently logged in. For example, the attacker lures the victim to a third-party website, and that website sends a cross-site request to the attacked website. Because the victim already holds login credentials for the attacked website, the request passes the back-end user authentication, and the attacker thereby impersonates the user to perform an operation on the attacked website.

(2) CSRF attack process

To complete a CSRF attack, the victim must complete two steps in sequence:

  • Log in to trusted site A, which stores a cookie locally.
  • Visit dangerous site B without logging out of A.

Looking at this, you might say, “If I don’t meet one of these two conditions, I won’t be attacked by CSRF.” That is true, but you can’t guarantee that the following won’t happen:

  • You can’t guarantee that, after logging in to one website, you won’t open another tab and visit another site.
  • You can’t guarantee that your local cookie expires as soon as you close the browser and that your last session has ended. (In fact, closing the browser does not end a session, but most people mistakenly think it is equivalent to logging out or ending the session.)
  • The so-called dangerous site may actually be a trusted, frequently visited site that has other vulnerabilities.

(3) Characteristics of CSRF

  • The attack is generally launched from a third-party site, not from the site being attacked, so the attacked site cannot stop the attack from being launched.
  • The attack does not steal data directly; it uses the victim’s login credentials on the attacked site to submit operations in the victim’s name.
  • Throughout the process the attacker never obtains the victim’s login credentials; the attacker only makes use of them.

(4) CSRF attack scenario

Cross-site requests can be made in a variety of ways:

  • Image URLs, hyperlinks, CORS, form submissions, and so on. Some of these requests can be embedded directly in third-party forums and articles, which makes them hard to trace.
  • CSRF is usually cross-domain, because an external domain is usually easier for an attacker to control. However, if the site’s own domain has easily exploited features, such as forums and comment areas that allow posting images and links, the attack can be carried out directly within the same domain, and such an attack is more dangerous.

(5) Common attack types of CSRF

1) CSRF of GET type

A GET-type CSRF is very simple to exploit and requires only one HTTP request. It is typically used as follows:

<img src="http://bank.example/withdraw?amount=10000&for=hacker" >

After the victim visits a page containing this img, the browser automatically sends an HTTP request to http://bank.example/withdraw?amount=10000&for=hacker. bank.example then receives a cross-domain request that carries the victim’s login information.

2) CSRF of POST type

This type of CSRF is typically exploited with an automatically submitted form, such as:

<form action="http://bank.example/withdraw" method="POST">
    <input type="hidden" name="account" value="xiaoming" />
    <input type="hidden" name="amount" value="10000" />
    <input type="hidden" name="for" value="hacker" />
</form>
<script> document.forms[0].submit(); </script>

When the victim visits the page, the form is submitted automatically, which simulates a user performing a POST operation.

POST-type attacks have slightly stricter requirements than GET-type attacks, but they are still not complex. Any personal website, blog, or site to which an attacker has uploaded a page can be the source of an attack, so a back-end interface cannot base its security on accepting only POST requests.

3) CSRF of link type

Link-type CSRF is less common. Compared with the other two types, where the victim is hit simply by opening the page, this type requires the user to click a link. It usually takes the form of a malicious link embedded in an image posted to a forum, or of an advertisement that lures the user. Attackers usually use exaggerated wording to trick users into clicking, for example:

<a href="http://test.com/csrf/withdraw.php?amount=1000&for=hacker" target="_blank">Breaking news!!</a>

(6) How to defend against CSRF

1) Verification code

A verification code (CAPTCHA) forces the user to interact with the application before the final request can be completed. This method contains CSRF well, but the user experience is poor.

2) Referer check

A Referer check has the lowest cost, but it is not guaranteed to be 100% effective: the server does not always receive the Referer header, and in older browsers the Referer can be forged.
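A Referer check could be sketched as below. The trusted origin list and the function name are assumptions, and this strict variant simply rejects requests whose header is missing, which a real site may or may not want. Comparing the parsed origin, rather than searching for a substring, keeps look-alike hosts such as `bank.example.evil.test` from passing.

```javascript
// Hypothetical list of origins that are allowed to call the interface.
const TRUSTED_ORIGINS = ['https://bank.example'];

// Decide whether the Referer header of a request comes from a trusted origin.
function isTrustedReferer(referer) {
  if (!referer) {
    return false; // header missing: rejected in this strict sketch
  }
  try {
    // Compare scheme + host + port exactly.
    return TRUSTED_ORIGINS.includes(new URL(referer).origin);
  } catch (err) {
    return false; // the header is not a parseable URL
  }
}

console.log(isTrustedReferer('https://bank.example/account'));      // true
console.log(isTrustedReferer('https://bank.example.evil.test/pay')); // false
console.log(isTrustedReferer(undefined));                            // false
```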

3) token

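Token verification is generally regarded as the most appropriate CSRF defense: the server issues a random token that a third-party site cannot read, and rejects any request that does not carry it.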
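A minimal sketch of token verification in Node.js follows. The function names are assumptions, and storing the token in the session and embedding it in the page (for example in a hidden form field) is left out; only token creation and comparison are shown.

```javascript
const { randomBytes, timingSafeEqual } = require('node:crypto');

// Issue an unpredictable token; the server would keep it in the user's
// session and embed it in the pages it serves.
function issueCsrfToken() {
  return randomBytes(32).toString('hex');
}

// Compare the token sent with the request against the one in the session.
function isValidCsrfToken(sessionToken, requestToken) {
  if (typeof sessionToken !== 'string' || typeof requestToken !== 'string') {
    return false;
  }
  const expected = Buffer.from(sessionToken);
  const received = Buffer.from(requestToken);
  // timingSafeEqual throws when lengths differ, so check the length first.
  if (expected.length !== received.length) {
    return false;
  }
  return timingSafeEqual(expected, received);
}

const token = issueCsrfToken();
console.log(isValidCsrfToken(token, token));    // true
console.log(isValidCsrfToken(token, 'forged')); // false
```

A page on the attacker’s site can make the browser send the cookie, but it cannot read the token, so the forged request fails the check.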

(7) Difference between CSRF and XSS

  • Generally speaking, CSRF is often carried out with the help of XSS, and CSRF is also commonly called XSRF. (A CSRF request can also be issued in other ways, for example directly from the command line.)
  • In essence, XSS is a code injection problem and CSRF is an HTTP problem. XSS happens because unfiltered content leads the browser to execute the attacker’s input as code. CSRF happens because the browser automatically attaches cookies when it sends an HTTP request, and a typical website keeps its session in a cookie (token validation avoids this).
  • A more detailed article is available here 👇
  • Discussion on Web front-end security policy XSS and CSRF, and how to prevent?
  • Link: juejin.cn/post/697269…

🏞️ other questions

1. Which Chrome debugging tools have you used?

  • Elements panel (inspect page elements with the mouse);
  • Console panel (print messages and prompts);
  • Sources panel (debug JS);
  • Network panel (view the status of network requests, interface calls, and so on).

2. Understanding the browser kernel

(1) It is mainly divided into two parts: the rendering engine and the JS engine

  • Rendering engine: fetches the page content (HTML, CSS, images, and so on), computes how the page should be displayed, and outputs the result to a monitor or printer. Different browser kernels interpret page syntax differently, so the same page can render differently.
  • JS engine: parses and executes JavaScript to give the page its dynamic behavior.

(2) Kernels of some browsers

  • IE: Trident kernel
  • Firefox: Gecko kernel
  • Safari: WebKit kernel
  • Opera: formerly the Presto kernel; Opera has since switched to Blink, the kernel of Google Chrome
  • Chrome: Blink (based on WebKit, developed jointly by Google and Opera Software)

🏑 6. Conclusion

This article has systematically worked through the browser topics that come up in front-end interviews, from the basics of HTTP, to browser caching, to cross-domain and front-end security issues.

That brings this article to an end. I hope it helps you ~

If you think something should be added, or you spot a mistake in the details, feel free to leave a message in the comment section or contact me on WeChat (VX): MondayLaboratory, and I will correct it promptly ~

Let’s make this interview material more complete so that it benefits more people who are preparing!

Finally, I hope everyone who reads this article gets the offer they want ~🥂🥂🥂

🐣 Egg One More Thing

🏷️ PDF

πŸ‘‰ wechat public account Monday laboratory, click the navigation bar below the interview column briefly view the keyword to obtain ~

🏷️ Update address

πŸ‘‰ offer comes to the interview column

🏷 ️ set pieces

  • If you think this article is helpful to you, you might as well like to support yo ~~πŸ˜‰
  • That’s all for this article! See you next time! πŸ‘‹ πŸ‘‹ πŸ‘‹