Preface
As we all know, browser principles are a perennial topic in front-end interviews, because the subject is so broad. From basic HTTP knowledge to cross-domain issues and front-end security, everything touches on how the browser works. Naturally, it is also one of the most heavily tested areas in interviews.
In this article, I go through the browser-related questions I encountered while preparing for interviews and summarize them systematically. Let's start the HTTP learning journey.
1. HTTP and HTTPS
(1) The relationship between HTTP and HTTPS
1. What are HTTP and HTTPS?
HTTP: HyperText Transfer Protocol
HTTPS: HyperText Transfer Protocol Secure
2. The difference between HTTP and HTTPS
Item | HTTP | HTTPS |
---|---|---|
Full name | HyperText Transfer Protocol | HyperText Transfer Protocol Secure |
Default port | 80 | 443 |
Transmission | Plaintext | Encrypted |
Security | Relatively poor: traffic is easy to eavesdrop on, impersonate, and tamper with | Relatively good: protects against eavesdropping, impersonation, and tampering |
Response speed | Faster: connection setup takes 3 packets (TCP handshake) | Slower: connection setup takes about 12 packets (3 for TCP plus 9 for the SSL handshake) |
Cost | Lower | Higher: a certificate is required, and commercial certificates must be purchased |
Connection caching | Relatively efficient | Less efficient, which increases data overhead and power consumption |
Colloquial answer:
HTTP, the HyperText Transfer Protocol, is the most widely used network protocol on the Internet. It is a request/response standard between client and server, used to transfer hypertext from a WWW server to the local browser, and it is designed to make that transfer efficient and reduce network traffic. HTTPS is an HTTP channel designed for security; put simply, it is the secure version of HTTP. An SSL layer is added beneath HTTP, and the security of HTTPS rests on SSL. (Answers what HTTP and HTTPS are)
HTTP connections are simple and stateless, and the transmitted data is not encrypted, that is, it is plaintext. Netscape designed SSL to encrypt the data transmitted over HTTP. HTTPS is therefore a network protocol built from HTTP plus SSL that provides encrypted transmission and identity authentication, and it is more secure than HTTP. (Answers the plaintext versus encrypted question)
HTTP transmits information in plaintext, whereas HTTPS is a secure protocol encrypted with SSL that requires a certificate, which adds cost. The two also use different default ports: 80 for HTTP and 443 for HTTPS. (Answers the port number question)
HTTPS can authenticate the user and the server, ensuring that data is sent to the correct client and server. When a client communicates with a web server over HTTPS, it accesses the server with an HTTPS URL and asks it to establish an SSL connection. After receiving the request, the web server sends the website's certificate. The client and the web server then negotiate the security level of the SSL connection, that is, the encryption level. Once both sides agree on the security level, the client creates a session key, encrypts it with the website's public key, and sends it to the website. The web server decrypts the session key with its own private key and uses it to encrypt the communication with the client. (Answers how an HTTPS connection is established)
HTTPS is recommended: compared with an equivalent HTTP site, a site served over HTTPS tends to rank higher in search results.
The HTTPS handshake is time-consuming; it is commonly cited as lengthening page load time by about 50% and increasing power consumption by 10% to 20%. HTTPS caching is less efficient than HTTP caching, which increases data overhead. SSL certificates can also cost money. Traditionally an SSL certificate had to be bound to an IP address, so multiple domain names could not share the same IP address (the SNI extension has since relaxed this restriction). (Answers the disadvantages of HTTPS)
Small tips:
- WWW is short for World Wide Web.
- Why port 80? Port 80 is the default port of the HTTP protocol. When you enter a web address, the browser (except IE) fills in the protocol for you, so when you type baidu.com, you are actually visiting baidu.com:80.
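The default-port tip above can be illustrated with a small sketch. The helper name effective_port and the DEFAULT_PORTS table are assumptions made for this example; only urlsplit comes from the Python standard library.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed above.
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url: str) -> int:
    """Return the port a client would connect to for `url` (illustrative helper)."""
    parts = urlsplit(url)
    # urlsplit only reports a port when one is written in the URL,
    # so fall back to the scheme's default otherwise.
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]

print(effective_port("http://baidu.com"))        # 80
print(effective_port("https://baidu.com"))       # 443
print(effective_port("http://baidu.com:8080/"))  # 8080
```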
(2) HTTP protocol
1. Basic content of HTTP/1.0, HTTP/1.1, and HTTP/2
(1) HTTP/1.0
Introduction date: HTTP/1.0 was introduced in 1996.
Main contents:
- HTTP/1.0 provides only the most basic authentication, in which the user name and password are sent unencrypted;
- HTTP/1.0 supports only short connections: every exchange goes through the TCP three-way handshake and four-way wave, which is inefficient;
- HTTP/1.0 uses only the If-Modified-Since and Expires headers as its cache validation criteria;
- HTTP/1.0 does not support resumable transfer: the full data is sent every time;
- HTTP/1.0 assumes that each computer is bound to only one IP address, so virtual hosts are not supported.
(2) HTTP/1.1
Introduction date: HTTP/1.1 was first published in 1997 (RFC 2068) and revised in 1999 (RFC 2616).
Main contents:
- HTTP/1.1 uses digest authentication (hash algorithms such as MD5), so the password is not sent in plaintext;
- HTTP/1.1 uses long (persistent) connections by default: a connection is established once, carries multiple requests and responses, and is closed once when the transfers are complete. This is controlled by the Connection header (Connection: keep-alive or Connection: close);
- HTTP/1.1 supports resumable transfer through the Range request header;
- HTTP/1.1 supports virtual hosts: multiple virtual hosts can reside on a single physical server and share the same IP address.
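To make the HTTP/1.1 points above concrete, here is a minimal sketch that serializes a request by hand, showing where the Host, Connection, and Range headers appear. The function build_request and the host and path used are illustrative assumptions, and nothing is sent over the network.

```python
from typing import Optional, Tuple

def build_request(host: str, path: str, keep_alive: bool = True,
                  byte_range: Optional[Tuple[int, int]] = None) -> bytes:
    """Serialize a minimal HTTP/1.1 GET request (illustrative helper)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",  # lets several virtual hosts share one IP address
        f"Connection: {'keep-alive' if keep_alive else 'close'}",
    ]
    if byte_range is not None:
        start, end = byte_range
        # Ask only for part of the resource (resumable transfer).
        lines.append(f"Range: bytes={start}-{end}")
    # The header section ends with an empty line.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

request = build_request("example.com", "/video.mp4", byte_range=(1000, 1999))
print(request.decode("ascii"))
```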
(3) HTTP/2
Introduction date: HTTP/2 was standardized in 2015.
Main contents:
- Header compression: headers are compressed with the HPACK algorithm.
  Why introduce header compression? In HTTP/1.1, header fields such as Cookie, Accept, Server, and Range can take up hundreds or even thousands of bytes, while the body is sometimes only a few dozen bytes (a "heavy head, light body").
- Binary format: HTTP/2 uses a binary format closer to TCP/IP and abandons the ASCII text format, which improves parsing efficiency;
- Enhanced security: in practice HTTP/2 generally runs over HTTPS;
- Multiplexing: multiple requests can be in flight concurrently on a single connection.
Small tips:
Memory points: authentication algorithm, connections, cache headers, resumable transfer, virtual hosts.
2. Differences between HTTP/1.0, HTTP/1.1, and HTTP/2
(1) Main differences between HTTP/1.0 and HTTP/1.1
① Long connections
- HTTP/1.0 needs the keep-alive parameter to ask the server for a long connection, while HTTP/1.1 supports long connections by default.
- Why use long connections? HTTP runs on top of TCP/IP, and creating a TCP connection requires a three-way handshake. That has a cost, and performance suffers if the connection has to be re-established for every exchange. It is therefore better to keep one long connection open and send multiple requests over it.
② Bandwidth saving
- HTTP/1.1 supports sending only the header information (without any body). If the server considers the client authorized to make the request, it returns 100; otherwise it returns 401.
  - Only after receiving 100 does the client start sending the request body to the server;
  - If the server returns 401, the client does not need to send the request body, which saves bandwidth.
- In addition, HTTP/1.1 supports transferring only part of the content. When the client already has part of a resource, it only needs to request the rest from the server. This is the basis of resumable file transfer.
③ Host field (virtual hosts)
- HTTP/1.1 allows virtual sites to be set up on a web server (for example Tomcat): multiple virtual sites on one web server can share the same IP address and port;
- HTTP/1.0 has no Host field, while HTTP/1.1 supports it.
(2) Main differences between HTTP/1.1 and HTTP/2
① Multiplexing
- In HTTP/1.1, the browser limits the number of simultaneous requests to the same domain name; requests beyond the limit are blocked.
- HTTP/2 uses multiplexing to process multiple requests concurrently on the same connection, and the number of concurrent requests can be orders of magnitude larger than in HTTP/1.1.
- Of course, HTTP/1.1 can open several more TCP connections to handle more concurrent requests, but creating a TCP connection is itself expensive.
② Header compression
- HTTP/1.1 does not support compression of header data. HTTP/2 compresses headers with the HPACK algorithm, so the data is smaller and travels faster over the network.
③ Server push
- When we request data from a web server that supports HTTP/2, the server can also push some resources the client will need along with the response, so the client does not have to open a connection and send another request to fetch them. This mechanism is well suited to loading static resources.
- So where do the pushed resources go? They are stored on the client side, and the client loads them directly from local storage without going to the network, which is much faster.
3. HTTP/2
(1) Goals set for the HTTP/2 project (these goals originate from SPDY, the experimental protocol HTTP/2 grew out of)
- Reduce page load time (PLT) by 50%.
- Require no changes to content from website authors.
- Minimize deployment complexity and avoid changes to network infrastructure.
- Develop the new protocol together with the open source community.
- Collect real performance data to validate the experimental protocol.
(2) HTTP/2 features
① Multiplexing (request and response multiplexing)
In HTTP/1.1, the browser limits the number of simultaneous requests (connections) to the same domain name, and requests beyond the limit are blocked.
HTTP/2 uses multiplexing to process multiple requests concurrently on the same connection: multiple request and response messages can be in flight at the same time over a single HTTP/2 connection.
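As a rough illustration of multiplexing, the sketch below cuts two messages into small chunks tagged with a stream ID, interleaves them on one simulated connection, and reassembles them on the other side. It is a toy model under simplified assumptions (tuples instead of real frames, round-robin ordering); the function names are invented for this example.

```python
from collections import defaultdict
from itertools import zip_longest

def split_into_frames(stream_id: int, payload: bytes, size: int = 4):
    """Cut a message into (stream_id, chunk) pairs of at most `size` bytes."""
    return [(stream_id, payload[i:i + size]) for i in range(0, len(payload), size)]

def interleave(*streams):
    """Round-robin the frames of several streams onto one simulated connection."""
    wire = []
    for group in zip_longest(*streams):
        wire.extend(frame for frame in group if frame is not None)
    return wire

def reassemble(wire):
    """Receiver side: rebuild each message from its stream ID."""
    messages = defaultdict(bytes)
    for stream_id, chunk in wire:
        messages[stream_id] += chunk
    return dict(messages)

# Client-initiated HTTP/2 streams use odd IDs, hence 1 and 3.
wire = interleave(
    split_into_frames(1, b"GET /index.html"),
    split_into_frames(3, b"GET /app.js"),
)
print(wire[:4])          # frames of stream 1 and stream 3 alternate
print(reassemble(wire))  # both messages come out intact
```

Because every chunk carries its stream ID, neither message has to wait for the other to finish, which is the point of multiplexing over a single connection.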
② Binary framing layer
At the heart of all HTTP/2 performance enhancements is the new binary framing layer, which defines how HTTP messages are encapsulated and transmitted between client and server.
HTTP/2 splits all transmitted information into smaller messages and frames (binary frames) and encodes them in binary.
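The following sketch packs and unpacks the 9-byte HTTP/2 frame header (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream identifier). The helper names pack_frame and unpack_frame are assumptions for this example; it handles a single frame in isolation and is not a full HTTP/2 implementation.

```python
import struct

DATA = 0x0        # frame type for DATA frames
END_STREAM = 0x1  # flag: last frame of the stream

def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Build one frame: 9-byte header followed by the payload."""
    length = len(payload)
    if length >= 1 << 24:
        raise ValueError("payload does not fit in a 24-bit length field")
    header = struct.pack(">I", length)[1:]  # keep the low 3 bytes
    # The top bit of the stream identifier is reserved, so mask it off.
    header += struct.pack(">BBI", frame_type, flags, stream_id & 0x7FFFFFFF)
    return header + payload

def unpack_frame(data: bytes):
    """Parse one frame and return (type, flags, stream_id, payload)."""
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags, stream_id = struct.unpack(">BBI", data[3:9])
    return frame_type, flags, stream_id & 0x7FFFFFFF, data[9:9 + length]

frame = pack_frame(DATA, END_STREAM, 1, b"hello")
print(frame.hex())
print(unpack_frame(frame))
```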
③ Header compression
Every HTTP transfer carries a set of headers that describe the transferred resource and its properties.
HTTP/2 uses the HPACK compression format to compress request and response header metadata:
- Transmitted header fields can be encoded with a static Huffman code, which reduces the size of each individual transfer.
- Both the client and the server maintain and update an indexed list of previously seen header fields (in other words, a shared compression context), which is then used as a reference to efficiently encode values that were transmitted before.
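The shared index list can be sketched as follows. This is a deliberately simplified model and not real HPACK: there is no static table, no Huffman coding, and no eviction, and the class name HeaderTable is invented for the example. It only shows how both sides stay in sync so that repeated header fields shrink to small integers.

```python
class HeaderTable:
    """Toy shared compression context: one instance per side of the connection."""

    def __init__(self):
        self.entries = []  # header fields seen so far, in order

    def encode(self, headers):
        out = []
        for field in headers:
            if field in self.entries:
                # Seen before: send only its index.
                out.append(self.entries.index(field))
            else:
                # New field: send it literally and remember it.
                self.entries.append(field)
                out.append(field)
        return out

    def decode(self, encoded):
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self.entries[item])
            else:
                self.entries.append(item)
                headers.append(item)
        return headers

client, server = HeaderTable(), HeaderTable()
first = [(":method", "GET"), (":path", "/"), ("user-agent", "demo")]
second = [(":method", "GET"), (":path", "/app.js"), ("user-agent", "demo")]

encoded_first = client.encode(first)    # all literals
encoded_second = client.encode(second)  # mostly indexes
print(encoded_second)
```

Only the changed :path field is sent in full on the second request; the receiver recovers the rest from its own copy of the table.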
④ Server push
In addition to the response to the original request, HTTP/2 allows the server to push additional resources to the client without the client explicitly requesting them, which makes it well suited to loading static resources.
⑤ Stream prioritization
Once HTTP messages are split into many independent frames, frames from multiple streams can be multiplexed, and the order in which the client and server interleave and deliver these frames becomes a key factor in performance.
To address this, the HTTP/2 standard allows each stream to have an associated weight and dependency:
Each stream can be assigned an integer weight between 1 and 256, and each stream can be given an explicit dependency on another stream.
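A simple way to read the weights is as proportional shares among streams that depend on the same parent. The sketch below assumes exactly that; the function name share_bandwidth and the numbers are illustrative, and dependencies between streams are left out.

```python
def share_bandwidth(weights: dict, total: float) -> dict:
    """Split `total` among sibling streams in proportion to their weights."""
    for stream_id, weight in weights.items():
        if not 1 <= weight <= 256:
            raise ValueError(f"stream {stream_id}: weight must be between 1 and 256")
    total_weight = sum(weights.values())
    return {sid: total * w / total_weight for sid, w in weights.items()}

# Stream 1 has weight 12 and stream 3 has weight 4,
# so stream 1 should receive three quarters of the resources.
print(share_bandwidth({1: 12, 3: 4}, 100.0))  # {1: 75.0, 3: 25.0}
```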
⑥ One connection per origin
Each stream is split into frames that can be interleaved and prioritized. As a result, all HTTP/2 connections are persistent, and only one connection per origin is required, which brings many performance benefits.
⑦ Flow control
Flow control is a mechanism that prevents the sender from sending more data than the receiver wants or is able to process.
When is flow control used? Typically when the receiver is busy, under heavy load, or only willing to allocate a fixed amount of resources to a particular stream.
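HTTP/2 flow control is window based: the sender may only send as many bytes as the receiver has advertised, and the receiver reopens the window with WINDOW_UPDATE. The class below is a minimal sketch of that bookkeeping for one stream; the class and method names are assumptions for this example, and 65,535 is the protocol's default initial window size.

```python
class FlowControlWindow:
    """Sender-side view of one flow-control window."""

    def __init__(self, initial: int = 65_535):
        self.available = initial

    def send(self, data: bytes) -> bytes:
        """Send as much as the window allows and return the unsent remainder."""
        allowed = min(len(data), self.available)
        self.available -= allowed
        return data[allowed:]

    def window_update(self, increment: int) -> None:
        """The receiver grants `increment` more bytes."""
        self.available += increment

window = FlowControlWindow(initial=10)
pending = window.send(b"x" * 25)  # only 10 bytes go out
print(len(pending), window.available)
window.window_update(10)          # receiver is ready for more
pending = window.send(pending)    # another 10 bytes go out
print(len(pending), window.available)
```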
4. HTTP caching (browser caching)
(1) What is caching, and what is it for?
Definition: Caching is a technique that saves a copy of a resource and reuses it directly on the next request.
Benefits:
- Significantly improves the performance of websites and applications.
- Reduces wait time and network traffic.
- Reduces the time needed to display a resource.
- Makes pages load faster.
- Relieves pressure on the server.
(2) What kinds of caches do you know?
- Browser cache
- Proxy cache
- Gateway cache
- CDN cache
- Reverse proxy cache
(3) Cache location
1) Service Worker
The Service Worker cache differs from the browser's other built-in caching mechanisms in that it lets us control which files are cached, how the cache is matched, and how it is read, and the cache is persistent.
2) Memory Cache
Reads are fast, but the cache is short-lived: once the tab is closed, the in-memory cache is released.
3) Disk Cache
Reads are slower, but it has the advantage in capacity and storage duration.
4) Push Cache
Push Cache belongs to HTTP/2. It exists only within a session and is released once the session ends, so resources are cached only briefly.
(4) How HTTP caching works
HTTP caching is divided into strong (forced) caching and negotiated caching.
1) Strong cache
On a strong cache hit, the file is read directly from the cache without sending a request.
2) Negotiated cache
With the negotiated cache, the file is already cached, but whether it may be read from the cache must be negotiated with the server, depending on the fields set in the request and response headers. Unlike the strong cache, the negotiated cache always sends a request.
3) Strong cache fields
Cache-Control: a general header field used to specify caching directives in HTTP requests and responses.
The strong cache fields are Expires and Cache-Control. If both are present, Cache-Control takes precedence over Expires.
- Cache request directives
Cache-Control: no-cache, no-store, max-age=<seconds>, max-stale[=<seconds>], min-fresh=<seconds>, no-transform, only-if-cached
- Cache response directives
Cache-Control: public, private, no-cache, no-store, no-transform, proxy-revalidate, max-age=<seconds>, s-maxage=<seconds>, must-revalidate
- Cache-Control directive descriptions
Directive | Description |
---|---|
public | The response may be cached by any cache (both the client and proxy servers) |
private | The response may be stored only in a private cache (the client can cache it, proxy servers cannot) |
no-cache | The cache must check with the server whether the response has changed before using it to satisfy later requests for the same URL. If a suitable validation token (ETag) exists, no-cache triggers one round trip to validate the cached response, and the download is avoided if the resource has not changed |
no-store | Nothing is stored in any cache or in temporary Internet files |
must-revalidate / proxy-revalidate | Once the cached content has become stale, the request must be sent to the server (or proxy) for revalidation |
max-age=xxx (xxx is a number) | The cached content expires after xxx seconds. This directive is available only in HTTP/1.1; when used together with Last-Modified, it has the higher priority |
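The strong-cache rules above can be sketched as a freshness check: parse Cache-Control, prefer max-age, and fall back to Expires only when max-age is absent. The function names and the header dictionary layout are assumptions for this example, and several directives (s-maxage, must-revalidate, and so on) are intentionally left out.

```python
from email.utils import parsedate_to_datetime

def parse_cache_control(value: str) -> dict:
    """Turn 'public, max-age=60' into {'public': True, 'max-age': '60'}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.lower()] = arg.strip('"') if arg else True
    return directives

def is_fresh(headers: dict, stored_at: float, now: float) -> bool:
    """Strong-cache check: can the stored copy be used without asking the server?

    `stored_at` and `now` are Unix timestamps in seconds.
    """
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc or "no-cache" in cc:
        return False
    if "max-age" in cc:
        # Cache-Control takes precedence over Expires.
        return now - stored_at < int(cc["max-age"])
    if "Expires" in headers:
        return now < parsedate_to_datetime(headers["Expires"]).timestamp()
    return False

headers = {
    "Cache-Control": "public, max-age=60",
    "Expires": "Wed, 21 Oct 2015 07:28:00 GMT",
}
print(is_fresh(headers, stored_at=1000.0, now=1030.0))  # True: 30 s old
print(is_fresh(headers, stored_at=1000.0, now=1061.0))  # False: older than 60 s
```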
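4) Negotiated cache fields
The negotiated cache fields are Last-Modified / If-Modified-Since and ETag / If-None-Match.
5) How the negotiated cache takes effect
- First request from the browser: there is no cached copy, so the server returns the resource together with its validators (Last-Modified and/or ETag), and the browser stores both.
- Second request from the browser: the browser sends the validators back in If-Modified-Since / If-None-Match. If the resource is unchanged, the server answers 304 Not Modified with no body and the browser uses its cached copy; otherwise the server answers 200 with the new resource.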
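The two requests can be sketched from the server's point of view. The ETag scheme here (a truncated SHA-256 of the body) and the function names are assumptions for illustration; real servers choose their own validators.

```python
import hashlib

def make_etag(body: bytes) -> str:
    # Hypothetical ETag scheme: a quoted, shortened hash of the body.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(request_headers: dict, body: bytes):
    """Return (status, response_headers, payload) for a GET request."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The client's copy is still current: no body is sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body

page = b"<h1>hello</h1>"

# First request: no validator yet, so the full resource comes back.
status1, headers1, payload1 = respond({}, page)

# Second request: the browser echoes the ETag in If-None-Match.
status2, headers2, payload2 = respond({"If-None-Match": headers1["ETag"]}, page)

# After the resource changes, the old ETag no longer matches.
status3, _, payload3 = respond({"If-None-Match": headers1["ETag"]}, b"<h1>bye</h1>")

print(status1, status2, status3)  # 200 304 200
```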
6) Browser caching: which cache is used, and when?
Browser caching is divided into the strong cache and the negotiated cache. When a client requests a resource, the cache is consulted as follows:
- First, the browser checks some of the resource's HTTP headers to see whether the strong cache is hit. If it is, the resource is read directly from the local cache and no request is sent to the server.
- If the strong cache misses, the client sends a request to the server, and the server uses other request headers to check whether the negotiated cache is hit; this is called HTTP revalidation. On a hit, the server responds but does not return the resource; instead it tells the client to take the resource from its cache, and the client does so after receiving the response.
- What the strong cache and the negotiated cache have in common is that the server does not return the resource on a hit. The difference is that the strong cache sends no request to the server, while the negotiated cache does. If the negotiated cache also misses, the server sends the resource back to the client.
- When the page is force-refreshed with Ctrl+F5, resources are loaded directly from the server, skipping both the strong cache and the negotiated cache.
- When the page is refreshed with F5, the strong cache is skipped, but the negotiated cache is still checked.
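The decision flow above, including the two refresh modes, fits in one small function. This is a sketch of the described behavior only; the function name, the reload values, and the three result labels are invented for this example.

```python
def cache_action(has_copy: bool, is_fresh: bool, reload: str = "none") -> str:
    """Decide how the browser obtains a resource.

    reload: 'none' (normal navigation), 'f5' (refresh), 'ctrl+f5' (forced refresh).
    Returns 'use-cache' (strong cache hit, no request),
            'revalidate' (conditional request, server may answer 304),
            or 'fetch' (plain request, server returns the resource).
    """
    if not has_copy or reload == "ctrl+f5":
        return "fetch"
    if reload == "none" and is_fresh:
        return "use-cache"
    return "revalidate"

print(cache_action(True, True))             # use-cache
print(cache_action(True, False))            # revalidate
print(cache_action(True, True, "f5"))       # revalidate
print(cache_action(True, True, "ctrl+f5"))  # fetch
```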
For more detail, see this article:
Do you know 304? An illustrated guide to the strong cache and the negotiated cache
Link: juejin.cn/post/697452…
5. Common HTTP header fields
(1) General headers
Field | Description |
---|---|
Request URL | The requested URL |
Request Method | The request method |
Status Code | The status code returned |
Remote Address | The remote address the request was sent to |
(2) Response headers
Field | Description |
---|---|
Cache-Control | The caching mechanism that should be followed for this response |
Connection | Connection handling, for example keep-alive |
Content-Encoding | The compression method the web server used (gzip, deflate) |
Content-Language | The language of the object in the response |
Content-Length | The length of the object in the response |
Content-Range | Which part of the whole object the response contains, for example: Content-Range: bytes |
Content-Type | The type of the object in the response, for example: Content-Type: application/xml |
ETag | Tells the client the entity tag: a string that uniquely identifies a version of the resource. Values can be strong or weak |
Expires | When the web server considers the entity to expire |
Last-Modified | When the entity was last modified on the web server |
Set-Cookie | Cookie information used to start state management |
Location | The target URL, used with redirection |
(3) Request headers
Field | Description |
---|---|
Accept | Content types acceptable for the response |
Accept-Charset | Acceptable character sets |
Accept-Encoding | Acceptable content encodings for the response |
Accept-Language | Natural languages the browser accepts |
Accept-Datetime | Acceptable version of the response content, by time |
Authorization | Credentials for HTTP authentication of the requested resource |
Cache-Control | Specifies whether caching may be used for the current request/response |
Connection | The type of connection the client (browser) prefers |
Cookie | An HTTP cookie previously set by the server via Set-Cookie |
Content-MD5 | The MD5 hash of the request body (an integrity check), Base64-encoded |
Content-Length | The length of the request body in octets (8-bit bytes) |
Content-Type | MIME type of the request body (for POST and PUT requests) |
Referer | The URL of the page from which the request was made |
Expect | Indicates that the client requires a particular behavior from the server |
From | The email address of the user who initiated the request |
If-Modified-Since | Allows a 304 Not Modified response if the resource has not been modified since the given time |
If-None-Match | Conditional request: compares the ETag sent by the client with the resource's current ETag. If they differ, the server returns 200 with the resource; if they match, it returns 304 Not Modified and the client reads from its local cache |
If-Unmodified-Since | The server sends the response only if the entity has not been modified since the given time |
Max-Forwards | Limits the number of times the message can be forwarded by proxies and gateways |
Range | Requests only part of an entity; byte offsets start at 0 |
User-Agent | The browser's identification string |
Upgrade | Asks the server to switch to another (typically newer) protocol |
(4) Cookie
1) Header fields used for cookies
Header field name | Description | Header type |
---|---|---|
Set-Cookie | Cookie information used to start state management | Response header field |
Cookie | Cookie information received from the server | Request header field |
- Attributes of the Set-Cookie field:
Attribute | Description |
---|---|
NAME=VALUE | The name of the cookie and its value (required) |
expires=DATE | The cookie's expiry date (if not specified, the cookie lasts only until the browser is closed) |
path=PATH | The directory on the server to which the cookie applies (defaults to the directory of the document if not specified) |
domain=DOMAIN | The domain name to which the cookie applies (defaults to the domain name of the server that created the cookie if not specified) |
Secure | The cookie is sent only over secure HTTPS connections |
HttpOnly | Restricts the cookie so that it cannot be accessed by JavaScript |
- Cookie field:
When the client wants HTTP state management, it includes the cookies it received from the server in its requests to tell the server. If several cookies were received, they can all be sent together.
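The round trip between Set-Cookie and Cookie can be tried out with the standard library's http.cookies module. The cookie name session_id and its value are made up for this sketch, and no real server or browser is involved.

```python
from http.cookies import SimpleCookie

# Server side: describe a cookie and render the Set-Cookie header value.
jar = SimpleCookie()
jar["session_id"] = "abc123"
jar["session_id"]["path"] = "/"
jar["session_id"]["secure"] = True
jar["session_id"]["httponly"] = True
set_cookie_value = jar["session_id"].OutputString()
print("Set-Cookie:", set_cookie_value)

# Client side: parse what the server sent ...
received = SimpleCookie()
received.load(set_cookie_value)

# ... and send back only name=value pairs in the Cookie request header.
cookie_header = "; ".join(
    f"{name}={morsel.value}" for name, morsel in received.items()
)
print("Cookie:", cookie_header)
```

The attributes (Path, Secure, HttpOnly) tell the browser how to store and scope the cookie; they are not echoed back to the server.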
6. HTTP status codes
Status code | Description |
---|---|
1XX | Informational |
2XX | Success: the request was processed successfully |
3XX | Redirection |
4XX | Client error |
5XX | Server error |
The most common ones are 200 (OK), 304 (resource not modified, use the cache), 404 (resource not found), and 500 (server error). A fuller list follows:
Status code | Meaning | Use |
---|---|---|
100 | Continue | The client should continue with its request |
101 | Switching Protocols | The server switches protocols at the client's request. It may only switch to a more advanced protocol, for example a newer version of HTTP |
200 | OK | The request succeeded. Typically used for GET and POST requests |
201 | Created | The request succeeded and a new resource was created |
202 | Accepted | The request has been accepted, but processing is not complete |
203 | Non-Authoritative Information | The request succeeded, but the meta information returned comes from a copy rather than the origin server |
204 | No Content | The server processed the request successfully but returned no content. The browser can keep displaying the current document without updating the page |
205 | Reset Content | The server processed the request successfully, and the user agent (for example, a browser) should reset the document view. This code can be used to clear the browser's form fields |
206 | Partial Content | The server successfully processed a partial GET request |
300 | Multiple Choices | The requested resource is available at multiple locations; a list of resource characteristics and addresses can be returned for the user agent (for example, a browser) to choose from |
301 | Moved Permanently | The requested resource has been permanently moved to a new URI. The response includes the new URI and the browser redirects to it automatically. Future requests should use the new URI |
302 | Found | Similar to 301, but the resource has only moved temporarily. The client should continue to use the original URI |
304 | Not Modified | The requested resource has not been modified, so the server returns no resource body with this status code. Clients typically cache resources they have accessed and send a header indicating that they only want the resource if it has been modified after a specified date |
305 | Use Proxy | The requested resource must be accessed through a proxy |
400 | Bad Request | The client request has a syntax error that the server cannot understand |
401 | Unauthorized | The request requires user authentication |
402 | Payment Required | Reserved for future use |
403 | Forbidden | The server understands the request but refuses to execute it |
404 | Not Found | The server could not find the resource (web page) the client requested. A web designer can use this code to serve a custom "the resource you requested could not be found" page |
405 | Method Not Allowed | The method in the client request is not allowed |
503 | Service Unavailable | The server is temporarily unable to process client requests because of overload or maintenance. The length of the delay can be given in the server's Retry-After header |
504 | Gateway Timeout | The server, acting as a gateway or proxy, did not receive a timely response from the upstream server |
505 | HTTP Version Not Supported | The server does not support the HTTP version used in the request and cannot complete the processing |
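As a quick way to check a code against the classes and names in the tables above, the sketch below combines the first digit of the code with the reason phrase from Python's http.HTTPStatus enum. The CLASSES mapping and the describe function are assumptions for this example.

```python
from http import HTTPStatus

# The first digit of a status code gives its class.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def describe(code: int) -> str:
    """Return e.g. '404 Not Found (client error)' for a known status code."""
    category = CLASSES[code // 100]
    phrase = HTTPStatus(code).phrase  # standard reason phrase
    return f"{code} {phrase} ({category})"

for code in (200, 304, 404, 503):
    print(describe(code))
```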
7. HTTP request methods and usage scenarios
(1) HTTP request methods
- GET: requests the specified page and returns the entity body;
- POST: asks the server to accept the enclosed document as a new subordinate of the resource identified by the URL;
- HEAD: like GET, except that the response carries no body; used to obtain the headers;
- OPTIONS: lets the client query the server's capabilities, such as which request methods it supports;
- PUT: transfers a file;
- DELETE: deletes a file;
- TRACE: traces the request path;
- CONNECT: asks a proxy to establish a tunnel.
(2) The difference between GET and HEAD
- The HEAD method is identical to GET except that the server does not return a message body in the response. The meta information in the HTTP headers of a HEAD response should be the same as that sent in response to a GET request. The method can be used to obtain meta information about the entity implied by the request without transferring the entity body itself.
- It is typically used to test hypertext links for validity, accessibility, and recent modification.
- The response to a HEAD request may be cacheable, because the information it contains can be used to update a previously cached entity for that resource. If the new field values indicate that the cached entity differs from the current entity (for example a change in Content-Length, Content-MD5, ETag, or Last-Modified), the cache must treat the cache entry as stale.
(3) The difference between GET and POST
- GET passes parameters in the URL, while POST puts them in the body. (The HTTP protocol itself sets no URL length limit, but browsers and servers do in practice, so the size is limited.)
- Parameters passed in the URL of a GET request are limited in length, while POST parameters are not. The reason is given in the first point.
- GET is harmless when the browser navigates back, while POST submits the request again.
- GET requests are actively cached by the browser, while POST requests are not unless configured manually.
- GET is less secure than POST because the parameters are exposed directly in the URL, so it should not be used to transmit sensitive information.
- As for parameter data types, GET accepts only ASCII characters, while POST has no restriction.
- GET requests can only be URL-encoded (application/x-www-form-urlencoded), while POST supports multiple encodings.
- It is often said that GET produces one TCP packet and POST produces two. For GET, the browser sends the HTTP header and data together and the server responds with 200 (returning data). For POST, the browser first sends the header, the server responds with 100 Continue, the browser then sends the data, and the server responds with 200 OK (returning data). This behavior depends on the client and is not mandated by HTTP.
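The first difference, where the parameters travel, can be seen by building both kinds of request without sending them. The URL https://example.com/search and the parameter names are placeholders chosen for this sketch.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"q": "http cache", "page": "2"}
encoded = urlencode(params)  # 'q=http+cache&page=2'

# GET: parameters are part of the URL, and there is no body.
get_req = Request("https://example.com/search?" + encoded)

# POST: the same parameters travel in the request body instead.
post_req = Request(
    "https://example.com/search",
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(get_req.get_method(), get_req.full_url, get_req.data)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

urllib picks the method from the presence of a body: a Request with data defaults to POST, one without defaults to GET.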
(4) Why do complex cross-domain requests need a preflight?
- Complex requests can have side effects on the server.
- For example, DELETE and PUT both modify data on the server. So before the real request is made, the browser first asks the server whether the current page's domain is on the server's allow list. Only if the server grants permission does the browser send the actual request; otherwise it does not.
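Both halves of the preflight can be sketched roughly as follows: the browser decides whether a request is simple enough to skip the preflight, and the server answers the OPTIONS preflight based on its allow list. This is a simplified model (custom request headers and credentials are ignored), and the allow list, origins, and function names are hypothetical.

```python
from typing import Optional

SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SIMPLE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}

def needs_preflight(method: str, content_type: Optional[str] = None) -> bool:
    """Browser side: does this cross-origin request require a preflight?"""
    if method.upper() not in SIMPLE_METHODS:
        return True
    if content_type is None:
        return False
    # Ignore parameters such as '; charset=utf-8'.
    media_type = content_type.split(";")[0].strip().lower()
    return media_type not in SIMPLE_CONTENT_TYPES

# Hypothetical server-side allow list of page origins.
ALLOWED_ORIGINS = {"https://app.example.com"}

def handle_preflight(origin: str) -> dict:
    """Server side: headers returned for an OPTIONS preflight request."""
    if origin not in ALLOWED_ORIGINS:
        return {}  # no CORS headers, so the browser blocks the real request
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, POST, PUT, DELETE",
    }

print(needs_preflight("GET"))                       # False
print(needs_preflight("DELETE"))                    # True
print(needs_preflight("POST", "application/json"))  # True
print(handle_preflight("https://app.example.com"))
print(handle_preflight("https://evil.example.net"))
```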
8. HTTP request process
(1) Typical questions
- What happens after you enter a URL in the browser address bar?
- The URL-to-render process
- Parse the parameters in a URL (write code)
- The process from entering a URL to the page being displayed
- The HTML parsing and rendering process
(2) Answer
- The browser performs DNS resolution on the requested URL to find the real IP address.
- Using this IP address, it finds the corresponding server and initiates the TCP three-way handshake.
- After the TCP connection is established, it sends the HTTP request.
- The server responds to the HTTP request, and the browser receives the HTML.
- The browser parses the HTML and requests the resources it references (JS, CSS, images, and so on). Note: these resources can only be discovered after the HTML has been received.
- The browser renders the page and presents it to the user.
- The server closes the TCP connection.
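One of the questions listed above asks for code that parses the parameters in a URL. A minimal sketch using Python's standard library is shown here; the function name parse_params and the sample URL are assumptions, and in an interview the same idea is usually written in JavaScript.

```python
from urllib.parse import parse_qs, urlsplit

def parse_params(url: str) -> dict:
    """Return the query parameters of `url` as a dictionary.

    A parameter that appears once maps to a string;
    a repeated parameter maps to a list of strings.
    """
    query = urlsplit(url).query
    parsed = parse_qs(query, keep_blank_values=True)  # values are percent-decoded
    return {
        name: values[0] if len(values) == 1 else values
        for name, values in parsed.items()
    }

url = "https://example.com/list?page=2&tag=a&tag=b&q=hello%20world&debug="
print(parse_params(url))
```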
(3) Supplement
After understanding the HTTP request process, you also need to understand:
① How DNS domain names are resolved;
② The TCP three-way handshake;
③ Why three handshakes are needed;
④ Why HTTP requests are implemented on top of TCP;
⑤ The TCP four-way wave;
⑥ Why four waves are needed;
⑦ Why establishing a connection takes a three-way handshake while closing it takes a four-way wave;
⑧ What happens if the connection is established but the client suddenly fails;
⑨ HTTP request methods.
9. Page rendering steps
After the HTML arrives over HTTP, the browser renders the page as follows:
- ① Parse the HTML and build the DOM tree;
- ② Parse the CSS and build the CSSOM tree (CSS rule tree);
- ③ Combine the DOM tree and the CSSOM tree to construct the render tree;
- ④ Reflow (layout): compute the geometry of each node based on the render tree;
- ⑤ Repaint (paint): draw the entire page based on the computed information.
(3) HTTPS protocol
1. Advantages and disadvantages of HTTPS
(1) Advantages
1) Data reaches the right party
With HTTPS, the user and the server can be authenticated, ensuring that data is sent to the correct client and server.
2) More secure
HTTPS is a network protocol built from SSL and HTTP that provides encrypted transmission and identity authentication. It is more secure than HTTP: it protects data from being stolen or altered in transit and so ensures data integrity.
3) Raises the cost of man-in-the-middle attacks
HTTPS is the most secure solution under the current architecture. It is not absolutely secure, but it significantly raises the cost of man-in-the-middle attacks.
4) Higher search rankings
Google adjusted its search algorithm in 2014 so that sites encrypted with HTTPS rank higher in search results than equivalent HTTP sites.
Baidu also announced its support for HTTPS sites in 2018, stating that HTTPS would be treated as one of the quality signals affecting search ranking.
(2) Disadvantages
1) Pages take longer to load
Because of SSL, the HTTPS handshake phase is time-consuming and is commonly cited as increasing page load time by nearly 50%.
2) Higher cost
SSL certificates cost money, and more capable certificates cost more.
3) Connection caching is less efficient than with HTTP
HTTPS connection caching is less efficient than HTTP's, which increases data overhead and power consumption, and existing security measures can be affected as well.
4) SSL certificates traditionally require IP binding
Traditionally an SSL certificate had to be bound to an IP address, so multiple domain names could not share one IP address, and IPv4 addresses are too scarce to support that. (Servers and browsers that support the SNI extension are no longer subject to this restriction.)
5) Limitations
The encryption scope of HTTPS is limited; it does little against hacking, denial-of-service attacks, or server hijacking. Most importantly, the trust chain behind SSL certificates is not fully secure: where a party can control a CA root certificate, as some countries can, man-in-the-middle attacks remain feasible.
2. HTTPS access process
A collection of questions:
- HTTPS handshake process
- HTTPS request process
- HTTPS encryption and decryption
Brief explanation:
- The client accesses the Web server using an https URL and asks to establish an SSL connection with the Web server.
- After receiving the request from the client, the Web server sends a copy of the website's certificate information (the certificate contains the public key) to the client.
- The client browser and the Web server negotiate the security level of the SSL connection, that is, the level of information encryption.
- The client browser creates the session key according to the mutually agreed security level, encrypts the session key with the website's public key, and transmits it to the website.
- The Web server decrypts the session key using its own private key.
- The Web server uses the session key to encrypt its communication with the client.
Detailed explanation:
- The client initiates an HTTPS request
The user enters an HTTPS URL into the browser, which connects to port 443 of the server.
- Server configuration
The server must be configured with a digital certificate, as mentioned above.
- The server sends the certificate
After receiving the request from the client, the Web server sends a copy of the website's certificate information (including the public key) to the client.
- The client parses the certificate
The client checks the certificate and verifies whether the public key is valid. If there is a problem, a warning is displayed. If there is no problem, the client generates a random value (the session key) and encrypts it with the public key from the certificate.
- The client transmits the encrypted information
The client sends the encrypted random value (the session key) to the server.
- The server decrypts the information
The server decrypts it with its private key to obtain the random value (the session key), and then uses that value to symmetrically encrypt the content. Symmetric encryption means that the information is combined with the session key by an algorithm, so that it cannot be read unless the session key is known.
- The server transmits the encrypted information
The server sends the encrypted information back to the client.
- The client decrypts the information
The client decrypts the message sent by the server with the random value (the session key) it generated earlier, and obtains the decrypted content.
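The exchange described above can be sketched with Node's built-in crypto module. This is only a simplified illustration, not a TLS implementation: a freshly generated RSA key pair stands in for the certificate's key pair, certificate validation is skipped, and the helper names `encrypt` and `decrypt` are hypothetical.

```javascript
const crypto = require('node:crypto');

// Server side: an RSA key pair stands in for the key pair behind the certificate.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

// Client side: generate a random session key and encrypt it with the server's public key.
const sessionKey = crypto.randomBytes(32);
const wrappedKey = crypto.publicEncrypt(publicKey, sessionKey);

// Server side: recover the session key with the private key.
const serverSessionKey = crypto.privateDecrypt(privateKey, wrappedKey);

// Both sides now share the session key and switch to symmetric encryption (AES-256-GCM here).
function encrypt(key, plaintext) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const data = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return { iv, data, tag: cipher.getAuthTag() };
}

function decrypt(key, { iv, data, tag }) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(data), decipher.final()]).toString('utf8');
}

// The server encrypts a response; the client decrypts it with the key it generated.
const message = encrypt(serverSessionKey, 'hello over HTTPS');
const received = decrypt(sessionKey, message);
console.log(received);
```

Only the small session key goes through the slow asymmetric step; the actual traffic uses the much cheaper symmetric cipher.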
3. Why is HTTPS secure
Compared with HTTP, HTTPS adds TLS/SSL, a security protocol layer that sits between TCP and HTTP.
TLS/SSL relies on three basic algorithms: hash, symmetric encryption, and asymmetric encryption. The functions of these three algorithms are as follows:
- Verify the integrity of information based on hash function;
- Symmetric encryption algorithm uses negotiated secret keys to encrypt data.
- Asymmetric encryption implements identity authentication and key negotiation.
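As a rough illustration of the first point, a keyed hash (HMAC) lets the receiver detect whether a message was altered in transit. The sketch below uses Node's crypto module; the function names `computeMac` and `isIntact` and the sample messages are hypothetical, and real TLS derives its keys from the handshake rather than generating them like this.

```javascript
const { createHmac, randomBytes, timingSafeEqual } = require('node:crypto');

// Compute a keyed hash (HMAC-SHA256) over the message.
function computeMac(key, message) {
  return createHmac('sha256', key).update(message).digest();
}

// Recompute the MAC on the receiving side and compare the two in constant time.
function isIntact(key, message, mac) {
  const expected = computeMac(key, message);
  return expected.length === mac.length && timingSafeEqual(expected, mac);
}

const macKey = randomBytes(32); // stands in for a key agreed during the handshake
const mac = computeMac(macKey, 'transfer 100 to Alice');

console.log(isIntact(macKey, 'transfer 100 to Alice', mac));   // true
console.log(isIntact(macKey, 'transfer 900 to Mallory', mac)); // false
```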
4. How to optimize HTTPS performance?
(1) HTTPS access speed optimization
1) Set up HSTS
The server returns an HSTS HTTP response header (Strict-Transport-Security). After the browser receives the header, for a period of time it internally redirects requests to https://www.baidu.com by default, whether the user types www.baidu.com or http://www.baidu.com.
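A minimal sketch of how a server might build this header is shown below. The helper name `buildHstsHeader` and its option names are hypothetical; only the header name and the `max-age`, `includeSubDomains` and `preload` directives are standard.

```javascript
// Build the name and value of the HSTS response header.
function buildHstsHeader({ maxAgeSeconds, includeSubDomains = false, preload = false }) {
  if (!Number.isInteger(maxAgeSeconds) || maxAgeSeconds < 0) {
    throw new RangeError('maxAgeSeconds must be a non-negative integer');
  }
  const directives = [`max-age=${maxAgeSeconds}`];
  if (includeSubDomains) directives.push('includeSubDomains');
  if (preload) directives.push('preload');
  return ['Strict-Transport-Security', directives.join('; ')];
}

// One year, applied to subdomains as well.
const [hstsName, hstsValue] = buildHstsHeader({ maxAgeSeconds: 31536000, includeSubDomains: true });
console.log(`${hstsName}: ${hstsValue}`);
```

The `max-age` value is the "period of time" during which the browser keeps upgrading requests to HTTPS on its own.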
2) Session resumption
Session resumption, as the name implies, reuses a previous session to simplify the handshake.
It reduces CPU consumption because the asymmetric key exchange calculation is not required. It also improves access speed: no second full handshake is needed, which saves an RTT and computing time.
3) Enable OCSP stapling in Nginx
When the browser sends a Client Hello message, it carries a certificate status request extension. On seeing the extension, the server returns the OCSP response directly to the browser, which completes the certificate status check. Since the browser does not need to query the CA site for the certificate status, this feature can significantly improve access speed.
4) Use SPDY or HTTP2
SPDY's biggest feature is multiplexing, which allows multiple HTTP requests to be sent together over the same connection, unlike HTTP/1.x, where requests on a connection can only be sent sequentially one by one.
HTTP2 supports multiplexing to the same effect.
Current implementations of SPDY and HTTP2 use the HTTPS protocol by default. Both SPDY and HTTP2 support existing HTTP semantics and APIs and are almost transparent to Web applications.
5) False start
The principle of False Start is to send application data together with the client_key_exchange message, which saves one RTT.
(2) HTTPS computing performance optimization
1) Prefer ECC (elliptic curve cryptography)
ECC arithmetic is much faster than ordinary discrete-logarithm calculation at the same security level. The following table compares key sizes of equivalent strength:
Symmetric key size (bits) | RSA and DH key size (bits) | ECC key size (bits) |
---|---|---|
80 | 1024 | 160 |
112 | 2048 | 224 |
128 | 3072 | 256 |
192 | 7680 | 384 |
256 | 15360 | 521 |
Symmetric key algorithm: AES, DES, RC4
Asymmetric encryption algorithms: RSA, DH, and ECC
2) Use the latest version of OpenSSL
OpenSSL is an open source software library package that applications can use to securely communicate and avoid eavesdropping.
In general, the new version of OpenSSL is faster and more secure than the old version.
3) Hardware acceleration scheme
- SSL dedicated accelerator card.
- GPU SSL acceleration.
4) TLS remote proxy calculation
2. Browser storage
1. Browser storage
Feature | cookie | localStorage | sessionStorage | indexedDB |
---|---|---|---|---|
Data life cycle | Generally generated by the server; an expiration time can be set | Persists until it is cleared | Cleared when the page is closed | Persists until it is cleared |
Data store size | About 4 KB | About 5 MB | About 5 MB | No fixed limit |
Communication with the server | Carried in the request header every time, which affects request performance | Not involved | Not involved | Not involved |
Note: cookies are designed for communicating with the server rather than for storage; if you need to read and write them, wrap the API yourself.
localStorage has its own getItem and setItem methods and is very convenient to use.
Notes on localStorage:
- localStorage can only store strings. Store and read JSON data with JSON.stringify() and JSON.parse().
- setItem may fail (for example, when storage is disabled), so use try...catch to catch the exception.
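Both notes can be combined in a small wrapper. The sketch below is one possible shape, not a standard API: `createJsonStore` and `createMemoryStorage` are hypothetical names, and the in-memory object only stands in for `window.localStorage` so that the example also runs outside a browser.

```javascript
// Wrap a Storage-like object so values are JSON-encoded and failures do not throw.
function createJsonStore(storage) {
  return {
    set(key, value) {
      try {
        storage.setItem(key, JSON.stringify(value));
        return true;
      } catch (err) {
        return false; // storage disabled or quota exceeded
      }
    },
    get(key, fallback = null) {
      try {
        const raw = storage.getItem(key);
        return raw === null ? fallback : JSON.parse(raw);
      } catch (err) {
        return fallback; // unreadable or not valid JSON
      }
    },
  };
}

// In-memory stand-in for localStorage; in a browser, pass window.localStorage instead.
function createMemoryStorage() {
  const items = new Map();
  return {
    getItem: (key) => (items.has(key) ? items.get(key) : null),
    setItem: (key, value) => { items.set(key, String(value)); },
  };
}

const store = createJsonStore(createMemoryStorage());
store.set('user', { id: 1, name: 'Tom' });
console.log(store.get('user').name);
console.log(store.get('missing', 'default value'));
```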
2. Cookie, localStorage, and sessionStorage
(1) What are cookie, localStorage and sessionStorage?
1) cookie
- A cookie is a very specific thing: a kind of data that can be stored persistently in the browser. It is simply a data storage feature implemented by the browser.
- The cookie is generated by the server and sent to the browser, which saves it as key-value pairs in a text file in some directory. The cookie is sent back to the server the next time the same website is requested.
- The expiration time is set when the cookie is created. If no expiration time is set, the cookie lives for the browser session and disappears when the browser window is closed; such cookies are called session cookies. If an expiration time is set, the cookie remains valid until then, even if the window or browser is closed.
- Session cookies are generally kept in memory rather than on disk, although the specification does not mandate this. Cookies with an expiration time are saved to disk, so they remain valid after the browser is closed and reopened, until the expiration time passes. Different browsers handle in-memory cookies differently.
- You can use document.cookie = "key=value" to set a cookie. Cookie values exist as key-value pairs. Setting an existing key overwrites its value; setting a different key adds a new pair.
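Since cookies have no getItem or setItem of their own, a small wrapper usually builds and parses the cookie string. The helpers below are a hypothetical sketch (the names `serializeCookie` and `parseCookies` are not a browser API); in a browser the first result would be assigned to `document.cookie` and the second function would be given `document.cookie` as input.

```javascript
// Build a string suitable for assigning to document.cookie.
function serializeCookie(name, value, { maxAge, path = '/' } = {}) {
  let cookie = `${encodeURIComponent(name)}=${encodeURIComponent(value)}; path=${path}`;
  if (typeof maxAge === 'number') cookie += `; max-age=${maxAge}`;
  return cookie;
}

// Turn a "k1=v1; k2=v2" string (the format document.cookie returns) into an object.
function parseCookies(cookieString) {
  const result = {};
  for (const pair of cookieString.split(';')) {
    const index = pair.indexOf('=');
    if (index === -1) continue; // skip empty or malformed fragments
    const key = decodeURIComponent(pair.slice(0, index).trim());
    result[key] = decodeURIComponent(pair.slice(index + 1).trim());
  }
  return result;
}

console.log(serializeCookie('theme', 'dark mode', { maxAge: 3600 }));
console.log(parseCookies('theme=dark%20mode; lang=en'));
```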
2) localStorage
- It is always valid and persists after the window or browser is closed, so it is used for persistent data;
- It is shared among same-origin windows.
3) sessionStorage
- A form of browser storage.
- It is valid only until the current browser window is closed and cannot persist beyond that.
- Within the same browser, a page opened by jumping from the current page can share the data, whereas a page opened directly in a new window cannot.
(2) Similarities and differences between Cookie, localStorage and sessionStorage
1) The similarities of the three are as follows:
- All are stored on the browser side and are subject to the same-origin policy.
2) The differences between the three are:
- Communication with the server:
Cookie data is always carried in same-origin HTTP requests (even when it is not needed), so cookies travel back and forth between the browser and the server. sessionStorage and localStorage are stored only locally and are never sent to the server automatically.
Cookie data also has the concept of a path, which can restrict a cookie to a specific path.
- Storage size limits:
Cookie data cannot exceed about 4 KB, and because every HTTP request carries the cookies, they are only suitable for small data such as a session id;
sessionStorage and localStorage also have size limits, but they are much larger than for cookies, reaching 5 MB or more.
- Data validity periods:
sessionStorage: valid only until the current browser window closes;
localStorage: always valid; it persists even when the window or browser is closed, so it is used for persistent data;
cookie: valid until the configured expiration time, even if the window or browser is closed.
- Scope:
sessionStorage is not shared across different browser windows, even for the same page;
localStorage and cookies are shared across all same-origin windows.
3. Use of cookies
(1) Save the user login status
For example, the user ID can be stored in a cookie so that the user does not need to log in again on the next visit; many forums and communities offer this feature.
A cookie can also have an expiration time; once it passes, the cookie disappears automatically. Sites therefore often ask users how long they want to stay logged in: common options are one month, three months, a year, and so on.
(2) Tracking user behavior
A weather website, for example, displays local weather based on the region the user selects. Choosing the location on every visit is tedious; with cookies, the site becomes much more user-friendly: it remembers the region chosen last time and automatically shows that region's weather when the page is opened again.
Because everything happens in the background, such pages feel as if they were customized for a particular user and are very convenient to use. If the site lets users change the skin or layout, cookies can record the user's options, such as background color and resolution, so that the look of the last visit is restored on the next visit.
4. Session and Token
(1) Session
Here's an example:
- session literally means "conversation". It is a bit like talking to someone: how do you know you are talking to Zhang San and not somebody else? There must be something about the other person (looks, height, etc.) that tells you he is Zhang San. A session is similar: the server needs to know who is currently sending it a request.
- To make this distinction, the server assigns each client a different "identity", which is what we usually call the sessionId. Every time the client sends a request to the server, it carries this "identity", so the server knows who the request came from.
- As for how the client saves this "identity": browser clients use a cookie by default in most cases, but localStorage and sessionStorage can also be used to store it, depending on your needs.
- It is important to note that a session corresponds to one conversation: even if the same page is opened twice, it is considered the same session.
- The server uses the session to save the user's information temporarily on the server; the session is destroyed after the user leaves the site.
- This way of storing user information is safer than a cookie, but session has a defect: if the Web server is load balanced, the session is lost when the next request arrives at a different server.
To sum up, session:
- When the program needs to create a session for a client request, the server first checks whether the request already contains a session identifier (known as the sessionId). If it does, a session has previously been created for this client, so the server retrieves the session corresponding to that sessionId and uses it (if it cannot be retrieved, a new one is created). Otherwise, if the request does not contain a sessionId, the server creates a session for the client and generates a sessionId associated with it.
- The value of the sessionId should be a string that neither repeats nor follows a pattern that is easy to find and imitate. The sessionId is returned to the client in this response to be saved.
- The sessionId can be saved in a cookie, or in localStorage or sessionStorage, so that it can be sent to the server during later interactions (with a cookie, the browser sends it automatically according to the cookie rules).
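The lookup-or-create logic summarized above might look like the following sketch. `SessionStore` and its `resolve` method are hypothetical names, the store is a plain in-memory Map (which is exactly why it breaks under load balancing), and expiry is left out.

```javascript
const { randomBytes } = require('node:crypto');

class SessionStore {
  constructor() {
    this.sessions = new Map(); // sessionId -> session data, kept in server memory
  }

  // Return the session for a known sessionId, or create a new session and id.
  resolve(sessionId) {
    if (sessionId && this.sessions.has(sessionId)) {
      return { sessionId, session: this.sessions.get(sessionId), created: false };
    }
    // 32 random bytes as hex: not repeated in practice and hard to guess.
    const newId = randomBytes(32).toString('hex');
    const session = {};
    this.sessions.set(newId, session);
    return { sessionId: newId, session, created: true };
  }
}

const sessions = new SessionStore();
const first = sessions.resolve(undefined);        // first request: no sessionId yet
first.session.userName = 'Zhang San';
const again = sessions.resolve(first.sessionId);  // later request carrying the sessionId
console.log(again.created, again.session.userName);
```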
(2) Token
- Token-based authentication is ubiquitous in the Web world. In most Internet companies that use Web APIs, tokens are a common way to handle authentication for multiple users.
- Token-based authentication has the following characteristics:
- Stateless and scalable;
- Support for mobile devices;
- Cross-program calls;
- Security.
- Most of the APIs and Web applications you see use tokens, for example Facebook, Twitter, Google and GitHub.
A detailed article is available here:
To solve the browser storage problem, have to understand cookies,localStorage and sessionStorage
Links: juejin. Cn/post / 697340…
3. Same-origin policy and cross-domain
1. What is same-origin policy?
The same-origin policy is a browser security policy: two Web addresses can access each other's resources only when their protocol, domain name, and port are all the same. If any of the protocol, domain name, or port differs, the browser prevents scripts on the page from accessing content that belongs to the other origin.
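The three-part comparison can be sketched with the standard URL class. The function name `isSameOrigin` is hypothetical, and the example addresses reuse the a.com style hosts from this article.

```javascript
// Two URLs share an origin only if protocol, host name and port all match.
function isSameOrigin(urlA, urlB) {
  const a = new URL(urlA);
  const b = new URL(urlB);
  // URL normalizes default ports (80 for http, 443 for https) to an empty string.
  return a.protocol === b.protocol && a.hostname === b.hostname && a.port === b.port;
}

console.log(isSameOrigin('http://www.a.com/a.html', 'http://www.a.com:80/dir/b.html')); // true
console.log(isSameOrigin('http://www.a.com/a.html', 'https://www.a.com/a.html'));       // false: protocol
console.log(isSameOrigin('http://www.a.com/a.html', 'http://script.a.com/a.html'));     // false: domain
console.log(isSameOrigin('http://www.a.com/a.html', 'http://www.a.com:8080/a.html'));   // false: port
```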
2. Why do browsers have same-origin policies?
Without the same-origin policy, other sites could easily read our website's cookies or manipulate the DOM of our pages.
This is very dangerous, especially for cookies: they contain the sessionID, an important credential for the session with the server. If someone else obtains the cookie, it may lead to data theft and other consequences.
3. What is restricted by the same-origin policy?
- Data stored in the browser, such as localStorage, Cookie and IndexedDB, cannot be accessed across domains through scripts;
- Scripts cannot operate on the DOM of a different domain;
- Ajax cannot be used to request data from a different domain.
4. Cross-domain problem solutions
(1) Cross domain through JSONP
A collection of questions:
- The principle of JSONP
- How JSONP communicates securely
1) Principle of JSONP
JSONP (JSON with Padding) is a "usage pattern" of JSON that allows web pages to request data from other domains.
The XMLHttpRequest object is subject to the same-origin policy, but the <script> element is not. Using this open policy of the <script> element, web pages can get JSON data dynamically generated by other origins.
The data fetched with JSONP is not JSON but arbitrary JavaScript, which is run by the JavaScript interpreter rather than parsed by a JSON parser.
As a result, JSONP GET requests sent by Chrome show up with type js rather than XHR.
2) JSONP consists of two parts: the callback function and the data
The callback function is the function on the current page that should be called when the response arrives.
The data is the JSON data passed into the callback function, that is, the callback's argument.
function handleResponse(response) {
  console.log('The response data is: ' + response.data);
}
var script = document.createElement('script');
script.src = 'http://www.baidu.com/json/?callback=handleResponse';
document.body.insertBefore(script, document.body.firstChild);
/* The server responds with: handleResponse({"data": "zhe"}) */
// The principle is as follows:
// 1. We send the request through a script tag.
// 2. The server uses the request parameters (json, handleResponse)
//    to generate the response body: handleResponse({"data": "zhe"})
// 3. The returned code is executed in the current page and calls handleResponse.
// The cross-domain communication is now complete.
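The server's half of the exchange can be sketched as a function that wraps the JSON data in a call to the requested callback. `buildJsonpBody` is a hypothetical helper, not part of any framework; restricting the callback name to a plain identifier is one simple precaution against script injection through the `callback` parameter. The last lines only simulate what the browser does when it executes the returned script.

```javascript
// Server side: wrap the data in a call to the callback named in the query string.
function buildJsonpBody(callbackName, data) {
  // Accept only a plain identifier so the parameter cannot smuggle in extra code.
  if (!/^[A-Za-z_$][\w$]*$/.test(callbackName)) {
    throw new Error('Invalid callback name');
  }
  return `${callbackName}(${JSON.stringify(data)});`;
}

const body = buildJsonpBody('handleResponse', { data: 'zhe' });
console.log(body);

// Client side (simulated): running the returned script calls the page's callback.
let receivedData;
const handleResponse = (response) => { receivedData = response.data; };
new Function('handleResponse', body)(handleResponse);
console.log(receivedData);
```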
3) Disadvantages:
- Only GET requests can be used.
- You cannot register success, error and similar event listeners, so it is not easy to determine whether a JSONP request failed.
- JSONP loads and executes code from another domain, so it is vulnerable to cross-site request forgery attacks and its security cannot be guaranteed.
(2) Cross domain by modifying document.domain (same primary domain)
1) Prerequisites:
Both pages must belong to the same base domain, and the protocol and port must match; otherwise document.domain cannot be used for cross-domain access. It therefore only works across subdomains.
Within the scope of the root domain, the domain property may be set to a parent domain. For example, in the aaa.xxx.com domain you can set domain to xxx.com, but not to xxx.org or com.
2) Example
For example: www.a.com/a.html and HTTP…
① In www.a.com/a.html:
document.domain = 'a.com';
var ifr = document.createElement('iframe');
ifr.src = 'http://www.script.a.com/b.html';
ifr.style.display = 'none'; // hide the iframe
ifr.onload = function () {
  var doc = ifr.contentDocument || ifr.contentWindow.document;
  // The iframe's document can now be manipulated through doc
  ifr.onload = null;
};
document.body.appendChild(ifr);
② In www.script.a.com/b.html:
document.domain = 'a.com';
In both HTML pages, use JS to set document.domain = 'xxx.com' to the same value so that the pages can access each other.
(3) Cross domain using window.name
1) How does window.name work across domains?
window.name works by loading a cross-domain HTML file in an iframe (usually created dynamically). That HTML file assigns the string it wants to pass to the requester to window.name, and the requester then reads the window.name value as the response.
2) What it relies on
The cross-domain capability of the iframe tag;
The ability of the window.name value to persist after the document is reloaded (the maximum allowed size is about 2M).
3) Example
For example: www.a.com/a.html and HTTP…
① a.html:
<script>
  var iframe = document.createElement('iframe');
  iframe.style.display = 'none'; // hide the iframe
  var state = 0; // prevents the page from being reloaded indefinitely
  iframe.onload = function () {
    if (state === 1) {
      console.log(JSON.parse(iframe.contentWindow.name));
      // Clean up the iframe that was created
      iframe.contentWindow.document.write('');
      iframe.contentWindow.close();
      document.body.removeChild(iframe);
    } else if (state === 0) {
      state = 1;
      // Point the iframe back to a page in the current domain (a blank proxy page is enough);
      // otherwise reading window.name fails with:
      // Blocked a frame with origin "http://www.a.com/a.html" from accessing a cross-origin frame.
      iframe.contentWindow.location = 'http://www.a.com/a.html';
    }
  };
  iframe.src = 'http://www.b.com/b.html';
  document.body.appendChild(iframe);
</script>
② In b.com/b.html:
<script>
  // JSON-encode the value because a.html reads it with JSON.parse
  window.name = JSON.stringify('Contents to be transmitted');
</script>
(4) Use the window.postMessage method introduced in HTML5
1) The window.postMessage method
A feature introduced in HTML5 that can be used to send messages to other Window objects. Note that the MessageEvent is dispatched only after all pending scripts have finished executing: if postMessage is called while a function is running, the message is delivered only after that function and any remaining pending handlers have completed.
2) Example
① a.com/index.html:
<iframe id="ifr" src="http://b.com/index.html"></iframe>
<script type="text/javascript">
  window.onload = function () {
    var ifr = document.getElementById('ifr');
    // Writing 'http://b.com/c/proxy.html' here has the same effect;
    // if it is written as 'http://c.com', the message will not be delivered
    var targetOrigin = 'http://b.com';
    ifr.contentWindow.postMessage('I was there!', targetOrigin);
  };
</script>
② b.com/index.html:
<script type="text/javascript">
  window.addEventListener('message', function (event) {
    // Determine the message source using the origin attribute
    if (event.origin == 'http://a.com') {
      alert(event.data); // pops up "I was there!"
      alert(event.source); // a reference to the window object of a.com/index.html
      // Because of the same-origin policy, event.source cannot be used to access that window's contents
    }
  }, false);
</script>
(5) CORS
1) What is CORS
Cross-Origin Resource Sharing (CORS) is a browser technology specification. It lets a Web service declare that scripts from other domains may access its resources, relaxing the browser's same-origin policy in a controlled way while keeping cross-domain data transfer secure. Modern browsers apply CORS in API containers such as XMLHttpRequest to reduce the risks of cross-origin HTTP requests. Unlike JSONP, CORS supports other HTTP request methods in addition to GET.
2) The server generally needs to add one or more of the following response headers:
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: POST, GET, OPTIONS
Access-Control-Allow-Headers: X-PINGOTHER, Content-Type
Access-Control-Max-Age: 86400
3) Cross-domain requests do not carry cookies by default. To carry cookies, configure the following (note that in this case Access-Control-Allow-Origin must name a specific origin and cannot be *):
// Server response header
"Access-Control-Allow-Credentials": true
// Ajax setting on the client
"withCredentials": true
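On the server, the headers from points 2) and 3) are usually computed per request. The function below is a sketch under assumptions: `buildCorsHeaders`, the allow-list and the option names are hypothetical, and a real server would attach the returned headers to its response object.

```javascript
// Compute CORS response headers for a request coming from requestOrigin.
function buildCorsHeaders(requestOrigin, { allowedOrigins, allowCredentials = false }) {
  // Origins that are not on the allow-list get no CORS headers at all.
  if (!allowedOrigins.includes(requestOrigin)) return {};

  const headers = {
    // Echo the specific origin; "*" is not accepted together with credentials.
    'Access-Control-Allow-Origin': requestOrigin,
    'Access-Control-Allow-Methods': 'POST, GET, OPTIONS',
    'Access-Control-Allow-Headers': 'Content-Type',
    'Access-Control-Max-Age': '86400',
    // Tell caches that the response depends on the request's Origin header.
    'Vary': 'Origin',
  };
  if (allowCredentials) headers['Access-Control-Allow-Credentials'] = 'true';
  return headers;
}

const corsOptions = { allowedOrigins: ['http://a.com'], allowCredentials: true };
console.log(buildCorsHeaders('http://a.com', corsOptions));
console.log(buildCorsHeaders('http://evil.example', corsOptions));
```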
4) The implementation of CORS in IE is XDR (XDomainRequest)
var xdr = new XDomainRequest();
xdr.onload = function () {
  console.log(xdr.responseText);
};
xdr.open('get', 'http://www.baidu.com');
// ...
xdr.send(null);
5) Other browsers implement it in XHR
var xhr = new XMLHttpRequest();
xhr.onreadystatechange = function () {
  if (xhr.readyState == 4) {
    if ((xhr.status >= 200 && xhr.status < 300) || xhr.status == 304) {
      console.log(xhr.responseText);
    }
  }
};
xhr.open('get', 'http://www.baidu.com');
// ...
xhr.send(null);
6) Implement CORS across browsers
function createCORS(method, url) {
  var xhr = new XMLHttpRequest();
  if ('withCredentials' in xhr) {
    // Standard browsers: XHR supports CORS
    xhr.open(method, url, true);
  } else if (typeof XDomainRequest != 'undefined') {
    // Old IE: fall back to XDomainRequest
    xhr = new XDomainRequest();
    xhr.open(method, url);
  } else {
    xhr = null; // CORS is not supported
  }
  return xhr;
}
var request = createCORS('get', 'http://www.baidu.com');
if (request) {
  request.onload = function () {
    // ...
  };
  request.send();
}
(6) Create script dynamically
Script tags are not restricted by the same-origin policy.
function loadScript(url, func) {
  var head = document.head || document.getElementsByTagName('head')[0];
  var script = document.createElement('script');
  script.src = url;
  script.onload = script.onreadystatechange = function () {
    if (!this.readyState || this.readyState == 'loaded' || this.readyState == 'complete') {
      func();
      script.onload = script.onreadystatechange = null;
    }
  };
  head.insertBefore(script, head.firstChild);
}
window.baidu = {
  sug: function (data) {
    console.log(data);
  }
};
loadScript('http://suggestion.baidu.com/su?wd=w', function () {
  console.log('loaded');
});
// Where is the content we requested?
// The content brought in by the script can be seen under Sources in the Chrome debug panel
(7) Use location.hash across domains
1) Principle:
The idea is to use location.hash to pass values.
2) Steps:
Suppose the file cs1.html under the domain a.com needs to communicate with the file cs2.html under the domain cnblogs.com.
① cs1.html first creates a hidden iframe automatically; the src of the iframe points to the cs2.html page under the cnblogs.com domain.
② cs2.html responds to the request and then passes the data by modifying the hash value of cs1.html.
③ Meanwhile, a timer on cs1.html checks at intervals whether the value of location.hash has changed; if it has, the hash value is read.
Note: Since the two pages are not in the same domain, IE and Chrome do not allow modifying parent.location.hash, so a proxy iframe under the a.com domain is needed.
The code is as follows:
First, the cs1.html file under a.com:
function startRequest() {
  var ifr = document.createElement('iframe');
  ifr.style.display = 'none';
  ifr.src = 'http://www.cnblogs.com/lab/cscript/cs2.html#paramdo';
  document.body.appendChild(ifr);
}
function checkHash() {
  try {
    var data = location.hash ? location.hash.substring(1) : '';
    if (console.log) {
      console.log('Now the data is ' + data);
    }
  } catch (e) {}
}
setInterval(checkHash, 2000);
cs2.html under the domain cnblogs.com:
// Simulate a simple parameter handling operation
switch (location.hash) {
  case '#paramdo':
    callBack();
    break;
  case '#paramset':
    // do something...
    break;
}
function callBack() {
  try {
    parent.location.hash = 'somedata';
  } catch (e) {
    // The security mechanisms of IE and Chrome do not allow modifying parent.location.hash,
    // so use an intermediate proxy iframe that loads a page from a.com
    var ifrproxy = document.createElement('iframe');
    ifrproxy.style.display = 'none';
    ifrproxy.src = 'http://a.com/test/cscript/cs3.html#somedata'; // note that this file is under the "a.com" domain
    document.body.appendChild(ifrproxy);
  }
}
cs3.html under the domain a.com:
// Because parent.parent belongs to the same domain as this page, its location.hash can be changed
parent.parent.location.hash = self.location.hash.substring(1);
(8) WebSocket
A collection of questions:
- What protocol does real-time collaborative editing use? WebSocket
- What is WebSocket and how is it different from polling?
- How is a WebSocket connection set up? What is its relationship to HTTP?
- Is WebSocket subject to origin restrictions?
1) What is WebSocket?
Web Sockets are a browser API that aims to provide full-duplex, bidirectional communication over a single persistent connection. (The same-origin policy does not apply to Web Sockets.)
2) How Web Sockets work
The client sends an HTTP request to the server asking to upgrade the protocol. The server switches protocols and responds to the client, and from then on the connection is upgraded from HTTP to Web Socket.
3) When can WebSocket be used?
It works only with servers that support the Web Socket protocol.
var socket = new WebSocket('ws://www.baidu.com'); // http -> ws; https -> wss
socket.onopen = function () {
  // send only after the connection is open
  socket.send('hello WebSocket');
};
socket.onmessage = function (event) {
  var data = event.data;
};
4) Polling
- Polling is when the client repeatedly asks the server whether there is any new information.
- Compatibility: short polling > long polling > WebSocket.
- Performance: WebSocket > long polling > short polling.
(9) Nginx reverse proxy
1) What is Nginx?
- It does not require cooperation from the target server, but you need to set up a relay nginx server to forward the requests.
- It needs to be changed at the operations level, and the requested resources may not be within our control (third party), so this approach cannot be used as a general solution.
4. Front-end security
1. Cross-site scripting attacks XSS
(1) What is XSS
Cross-site scripting (XSS) attacks exploit vulnerabilities in web pages to inject malicious code, so that the user's browser executes the injected code when the page is loaded.
(2) Attack types of XSS
There are three types of XSS:
- Non-persistent XSS (also called reflected XSS)
- Persistent XSS (also called stored XSS)
- DOM-based XSS
1) Non-persistent XSS (reflected)
① Attack procedure
- The attacker constructs a special URL that contains malicious code.
- When the user opens the URL containing the malicious code, the web server takes the malicious code out of the URL, splices it into the HTML, and returns it to the browser.
- The user's browser receives the response, parses it, and executes the malicious code mixed into it.
- The malicious code steals user data and sends it to the attacker's website, or impersonates the user and calls the target website's interfaces to perform the operations specified by the attacker.
② Attack scenario
Reflected XSS (also known as non-persistent XSS) vulnerabilities are common in features that pass parameters through URLs, such as site search and redirects.
③ Attack mode
Because the user has to actively open the malicious URL for the attack to take effect, attackers often combine various means to lure users into clicking.
Reflected XSS can also be triggered by the contents of a POST, but the trigger conditions are stricter (a form submission page has to be constructed and the user lured into clicking), so it is very rare.
2) Persistent XSS (stored)
① Attack procedure
- The attacker submits malicious code to the database of the target website.
- When the user opens the target website, the web server takes the malicious code out of the database, splices it into the HTML, and returns it to the browser.
- The user's browser receives the response, parses it, and executes the malicious code mixed into it.
- The malicious code steals user data and sends it to the attacker's website, or impersonates the user and calls the target website's interfaces to perform the operations specified by the attacker.
② Attack scenario
Stored XSS (also known as persistent XSS) is common in site features that save user data, such as forum posts, product reviews and user messages.
③ Hazards
It is one of the most dangerous types of cross-site scripting. It is more covert than reflected and DOM-based XSS and more harmful, because it does not need to be triggered manually by the user.
Any Web application that allows users to store data may have a stored XSS vulnerability. Once an attacker submits a piece of XSS code, it is received and stored by the server, and the attack is triggered for every visitor who opens the affected page.
3) DOM-based XSS
① Attack procedure
- The attacker constructs a special URL that contains malicious code.
- The user opens the URL containing the malicious code.
- The user's browser receives the response, parses it and executes it; the front-end JavaScript takes the malicious code out of the URL and executes it.
- The malicious code steals user data and sends it to the attacker's website, or impersonates the user and calls the target website's interfaces to perform the operations specified by the attacker.
② Hazards
The DOM represents the objects in HTML, XHTML and XML documents, and lets programs and scripts dynamically access and update a document's content, structure and style. DOM-based XSS does not need the server to take part in parsing the response: it is triggered entirely by DOM handling on the browser side, so preventing it is completely the responsibility of the front end. Take particular care here.
Summary:
The difference between reflected and stored XSS:
Stored XSS keeps the malicious code in the database; reflected XSS keeps it in the URL.
The difference between DOM-based XSS and the first two:
In a DOM-based XSS attack, the extraction and execution of the malicious code is done on the browser side, so it is a security vulnerability of the front-end JavaScript itself, whereas the other two are server-side security vulnerabilities.
Comparison of the three:
Type | Storage area | Insertion point |
---|---|---|
Reflected XSS | URL | HTML |
Stored XSS | Back-end database | HTML |
DOM-based XSS | Back-end database / front-end storage / URL | Front-end JavaScript |
(3) How to defend against XSS
Wherever there is input data, there can be XSS hazards.
1) Set HttpOnly
After the HttpOnly attribute is set in the cookie, the JS script cannot read the cookie information.
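As a sketch, the value of a Set-Cookie response header carrying this attribute could be assembled as follows. The helper name `buildSessionCookie` and the cookie name are hypothetical; `Secure` and `SameSite=Lax` are additional standard attributes included here as an assumption about sensible defaults, not something this article prescribes.

```javascript
// Build a Set-Cookie header value that scripts cannot read (HttpOnly).
function buildSessionCookie(name, value, { maxAgeSeconds } = {}) {
  const parts = [
    `${name}=${encodeURIComponent(value)}`,
    'Path=/',
    'HttpOnly',     // hides the cookie from document.cookie
    'Secure',       // only sent over HTTPS
    'SameSite=Lax', // limits sending on cross-site requests
  ];
  if (Number.isInteger(maxAgeSeconds)) parts.push(`Max-Age=${maxAgeSeconds}`);
  return parts.join('; ');
}

console.log(buildSessionCookie('sessionId', 'abc123', { maxAgeSeconds: 3600 }));
```

HttpOnly does not stop an injected script from running; it only keeps that script from reading the session cookie.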
2) Escape strings
Most XSS attacks use data input and output as their attack points, so data should be filtered at those points.
This covers the input and output of front-end data as well as the input and output of back-end data.
So what is data filtering, and how is data filtered?
Data filtering checks input formats such as email, phone number, user name and password, so that input conforms to the specified format. This is not only a front-end job; the back end must perform the same checks. Without back-end filtering, an attacker can bypass the normal input flow and send data directly to the server through the relevant interface.
Therefore, a filtering function can be wrapped to escape the characters attackers commonly use, so that the browser does not parse them as script code.
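// Replace characters that have special meaning in HTML with their entities.
function escape(str) {
  str = str.replace(/&/g, '&amp;'); // must run first so later entities are not double-escaped
  str = str.replace(/</g, '&lt;');
  str = str.replace(/>/g, '&gt;');
  str = str.replace(/"/g, '&quot;');
  str = str.replace(/'/g, '&#39;');
  str = str.replace(/`/g, '&#96;');
  str = str.replace(/\//g, '&#x2F;');
  return str;
}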
3) Whitelist
When displaying rich text, you cannot escape every character, because that would also strip out the formatting you need. In this case whitelist filtering is usually used. A blacklist is also possible, but because there are too many tags and tag attributes that would need to be filtered, a whitelist is recommended.
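One very simplified way to sketch the whitelist idea is to escape everything first and then restore only a fixed set of attribute-free tags. The tag list and the function names below are assumptions chosen for illustration; a production rich-text editor would normally rely on a dedicated, well-tested sanitizer rather than a regular expression like this.

```javascript
// Assumed whitelist: simple formatting tags, allowed only without attributes.
const ALLOWED_TAGS = ['b', 'i', 'em', 'strong', 'p', 'br'];

// Hypothetical helper: escape the characters that are meaningful in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Escape the whole input, then turn back only whitelisted opening/closing tags.
function sanitizeRichText(input) {
  const escaped = escapeHtml(input);
  const pattern = new RegExp(
    '&lt;(/?)(' + ALLOWED_TAGS.join('|') + ')&gt;',
    'gi'
  );
  return escaped.replace(pattern, '<$1$2>');
}

console.log(sanitizeRichText('<b>hi</b><script>alert(1)</script>'));
console.log(sanitizeRichText('<b onclick="steal()">hi</b>'));
```

Because a tag is restored only when it matches the whitelist exactly, anything carrying attributes (such as `onclick`) or an unlisted tag name stays escaped.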
- For more detail, see the following article:
- Discussion on Web front-end security policies: XSS and CSRF, and how to prevent them
- Link: juejin.cn/post/697269…
2. Cross-site request forgery (CSRF)
(1) What is CSRF
Cross-site request forgery, usually abbreviated CSRF or XSRF and also known as a one-click attack or session riding, is an attack that hijacks a user into performing unintended actions on a web application they are currently logged in to. For example, the attacker lures the victim to a third-party website, which then sends a cross-site request to the attacked website. Because the victim already holds login credentials for the attacked website, the request passes the back-end user verification and performs an operation on that website in the victim's name.
(2) CSRF attack process
For a CSRF attack to succeed, the victim must complete two steps in order:
- Log in to trusted site A, which generates a cookie locally.
- Visit dangerous site B without logging out of A.
Reading this, you might say, "If I fail to meet either of these two conditions, I won't be hit by CSRF." That is true, but you cannot guarantee that the following will never happen:
- You cannot guarantee that, after logging in to one website, you will never open another tab and visit another site.
- You cannot guarantee that your local cookie expires immediately and your last session ends when you close the browser. (In fact, closing the browser does not end a session, but most people mistakenly believe it is equivalent to logging out or ending the session.)
- The so-called attacking site may itself be a trusted, frequently visited site that has other vulnerabilities.
(3) Characteristics of CSRF
- The attack is generally launched from a third-party site, not from the attacked site, so the attacked site cannot prevent the attack from being launched.
- The attack does not steal data directly; it uses the victim's login credentials on the attacked site to submit operations in the victim's name.
- Throughout the process the attacker never obtains the victim's login credentials; the attacker only "borrows" them.
(4) CSRF attack scenarios
Cross-site requests can be made in many ways: image URLs, hyperlinks, CORS, form submissions, and so on. Some of these requests can be embedded directly in third-party forums and articles, which makes them hard to trace.
CSRF is usually cross-domain, because external domains are usually easier for an attacker to control. However, if the site's own domain has features that are easy to exploit, such as forums or comment areas where users can post images and links, the attack can be carried out directly within that domain, and such an attack is even more dangerous.
(5) Common attack types of CSRF
1) CSRF of GET type
Exploiting a GET-type CSRF is very simple and needs only a single HTTP request. It is typically done like this:
<img src="http://bank.example/withdraw?account=xiaoming&amount=10000&for=hacker" >
After the victim visits a page containing this img, the browser automatically sends an HTTP request to http://bank.example/withdraw?account=xiaoming&amount=10000&for=hacker, and bank.example receives a cross-domain request that carries the victim's login information.
2) CSRF of POST type
This type of CSRF is typically exploited using an auto-submitted form, such as:
<form action="http://bank.example/withdraw" method=POST>
<input type="hidden" name="account" value="xiaoming" />
<input type="hidden" name="amount" value="10000" />
<input type="hidden" name="for" value="hacker" />
</form>
<script> document.forms[0].submit(); </script>
When the victim visits the page, the form is submitted automatically, which is equivalent to simulating a POST operation by the user.
A POST-type attack is generally a little harder to set up than a GET-type one, but it is still not complicated. Any personal website, blog, or site that hackers have uploaded pages to may be the source of an attack, so back-end interfaces cannot rely on accepting only POST requests for their security.
3) CSRF of link type
Link-type CSRF is less common. Unlike the other two types, where the user is hit simply by opening the page, it requires the user to click a link. It usually takes the form of a malicious link embedded in an image posted on a forum, or an advertisement that lures users in. Attackers typically use exaggerated wording to trick users into clicking, for example:
<a href="http://test.com/csrf/withdraw.php?amount=1000&for=hacker" target="_blank">
Big news!!
</a>
(6) How to defend against CSRF
1) Verification code (CAPTCHA)
A verification code forces the user to interact with the application before the final request can be completed. This method curbs CSRF effectively, but the user experience is poor.
2) Referer check
The server checks the Referer header to see whether the request came from a trusted page. This method has the lowest cost, but it is not guaranteed to be 100% effective: the server does not always receive a Referer, and in some older browsers the Referer can be forged.
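A minimal sketch of such a check might look like the following. The trusted origin `https://bank.example` reuses the example domain from the attack samples above, and the function name `isTrustedReferer` is made up for this illustration. Rejecting requests that arrive without a Referer is a policy assumption; a real site may need to decide differently.

```javascript
// Assumed list of origins that are allowed to send state-changing requests.
const TRUSTED_ORIGINS = new Set(['https://bank.example']);

// Returns true only when the Referer header parses to a trusted origin.
function isTrustedReferer(refererHeader) {
  if (!refererHeader) return false; // assumption: a missing Referer is rejected
  try {
    return TRUSTED_ORIGINS.has(new URL(refererHeader).origin);
  } catch {
    return false; // a malformed header is treated as untrusted
  }
}

console.log(isTrustedReferer('https://bank.example/transfer'));
console.log(isTrustedReferer('https://bank.example.evil.com/page'));
```

Comparing the parsed origin, rather than searching for a substring, avoids being fooled by look-alike hosts such as `bank.example.evil.com`.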
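3) Token
Token verification is widely recognized as the most suitable CSRF defense. The server issues a random token that must accompany every state-changing request; because a third-party site cannot read that token, it cannot forge a valid request.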
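The token idea can be sketched in a few lines of Node. The function names and the plain `session` object are assumptions for illustration; how the token is stored and how it reaches the page (hidden form field, custom header, and so on) depends on the framework in use.

```javascript
const crypto = require('crypto');

// Generate an unpredictable token to store in the user's server-side session.
function createCsrfToken() {
  return crypto.randomBytes(32).toString('hex');
}

// Compare the token kept in the session with the one submitted by the client.
function isValidCsrfToken(sessionToken, submittedToken) {
  if (typeof sessionToken !== 'string' || typeof submittedToken !== 'string') {
    return false;
  }
  const expected = Buffer.from(sessionToken);
  const actual = Buffer.from(submittedToken);
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (expected.length !== actual.length) return false;
  return crypto.timingSafeEqual(expected, actual);
}

// Hypothetical session object standing in for real session storage.
const session = { csrfToken: createCsrfToken() };

console.log(isValidCsrfToken(session.csrfToken, session.csrfToken)); // legitimate request
console.log(isValidCsrfToken(session.csrfToken, 'forged-value'));    // forged request
```

A page on a third-party site can make the browser send the cookie, but it has no way to read the token, so its forged request fails this check.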
(7) Difference between CSRF and XSS
- Generally speaking, CSRF is often carried out by means of XSS, and CSRF is also frequently called XSRF. (CSRF can also be carried out in other ways, for example by issuing a request directly from the command line.)
- In essence, XSS is a code injection problem, while CSRF is an HTTP problem. XSS happens because unfiltered content makes the browser execute the attacker's input as code. CSRF happens because the browser automatically attaches cookies when it sends an HTTP request, and a typical website keeps its session in a cookie (token verification avoids this problem).
- For more detail, see the following article:
- Discussion on Web front-end security policies: XSS and CSRF, and how to prevent them
- Link: juejin.cn/post/697269…
Other questions
1. Which Chrome debugging tools have you used?
- Elements panel (inspecting elements);
- Console panel (printing messages);
- Sources panel (debugging JS);
- Network panel (viewing the status of network requests and interface calls).
2. Understanding the browser kernel
(1) It is mainly divided into two parts: the rendering engine and the JS engine
- Rendering engine: responsible for fetching the content of the page (HTML, CSS, images, and so on), working out how the page should be displayed, and outputting it to the screen or a printer. Different browser kernels interpret page syntax differently, so the rendered results also differ.
- JS engine: parses and executes JavaScript to give the page its dynamic behavior.
(2) Kernels of some browsers
- IE: Trident kernel
- Firefox: Gecko kernel
- Safari: WebKit kernel
- Opera: formerly the Presto kernel; Opera has now switched to Blink, the kernel of Google Chrome
- Chrome: Blink (based on WebKit, jointly developed by Google and Opera Software)
6. Conclusion
This article has systematically gone through the browser topics that come up in front-end interviews, from the basics of HTTP, to browser caching, to cross-domain and front-end security issues.
That is the end of this article. I hope it helps you!
If you think something should be added, or you spot a small mistake, you are welcome to leave a message in the comment section or contact me on WeChat (VX): MondayLaboratory, and I will correct it promptly.
Let's make this interview material more complete so that it benefits more people who are preparing!
Finally, I wish everyone who reads this article gets the offer they want!
One More Thing
PDF
Follow the WeChat public account "Monday laboratory" and check the interview column in the bottom navigation bar for the keyword to obtain it.
Update address
The "offer comes" interview column
Extras
- If you found this article helpful, please give it a like to show your support.
- That's all for this article! See you next time!