DNS Resolution Process

1. The browser first searches its own DNS cache. If the corresponding IP address of the domain name is found, the resolution is complete. Otherwise, go to the next step

2. Search the local operating system's hosts file (domain name hijacking works by modifying the hosts-file mapping between a domain name and an IP address). If a match is found, resolution is complete. Otherwise, go to the next step

3. Query the LDNS (local domain name server), usually located in your city and not far from you; roughly 80% of domain name resolutions end here

4. The LDNS queries a root DNS server, which returns to the LDNS the address of the top-level domain name server (gTLD server, e.g., the server for .com)

5. The LDNS sends a query to the gTLD server at that address, and the gTLD server returns the address of a Name Server, which is the registered authoritative domain name server of the website

6. The Name Server returns the target IP address to the LDNS based on the mapping information

7.LDNS caches the domain name and the corresponding IP address

8. The LDNS returns the resolution result to the user. The client also caches the domain-name-to-IP mapping, and the domain name resolution process is complete
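The lookup chain above is what the operating system's resolver walks through. A minimal sketch using Python's standard library, which delegates to that resolver (caches, hosts file, then the configured DNS server), might look like this:

```python
import socket

# getaddrinfo consults the same chain described above: local caches,
# the hosts file, and finally the configured DNS server.
def resolve(hostname):
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, (address, port))
    return sorted({info[4][0] for info in infos})

print(resolve("localhost"))  # typically ['127.0.0.1']
```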

TCP three-way handshake

First handshake: Both the client and the server start in the CLOSED state. When the client intends to establish a TCP connection, it sends a connection request segment to the server with the synchronization bit SYN=1 in the header and a chosen initial sequence number seq = x. The client enters the SYN-SENT state.

Second handshake: After receiving the connection request segment, the server sends a confirmation segment to the client if it agrees to establish the connection, with the SYN and ACK bits set to 1, ack = x + 1, and its own sequence number seq = y. The server enters the SYN-RCVD state.

Third handshake: After receiving the server's confirmation, the client sends an acknowledgment back. This segment has ACK=1, acknowledgment number ack = y + 1, and sequence number seq = x + 1. The client enters the ESTABLISHED state; once the server receives this confirmation it also enters the ESTABLISHED state. The three-way handshake is complete and data can be sent.
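The handshake itself is performed by the kernel inside `connect()` and `accept()`; this loopback sketch (port chosen by the OS) just observes both sides reaching ESTABLISHED:

```python
import socket
import threading

# The three-way handshake happens inside connect()/accept():
# client SYN -> server SYN+ACK -> client ACK.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

def accept_one():
    conn, _ = server.accept()   # server side: SYN-RCVD -> ESTABLISHED
    conn.close()

t = threading.Thread(target=accept_one)
t.start()

client = socket.create_connection(("127.0.0.1", port))  # SYN-SENT -> ESTABLISHED
established = client.getpeername()[1] == port           # peer is visible only after the handshake
client.close()
t.join()
server.close()
print(established)
```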

TCP four-way wave (connection release)

First wave: Both the client and the server are in the ESTABLISHED state. The client sends a connection release segment to the server, stops sending data, and actively closes the TCP connection. The segment has the stop control bit FIN set to 1 and seq = u; the client enters the FIN-WAIT-1 state.

Second wave: After receiving the connection release segment, the server sends an acknowledgment with ack = u + 1 and its own sequence number seq = v, then enters the CLOSE-WAIT state. The TCP connection is now half-closed. After receiving this acknowledgment, the client enters the FIN-WAIT-2 state.

Third wave: When the server has no more data to send to the client, its application process notifies TCP to release the connection. The server's connection release segment has FIN=1, sequence number seq = w, and ack = u + 1. The server enters the LAST-ACK state.

Fourth wave: After receiving the server's connection release segment, the client sends an acknowledgment with ACK=1, ack = w + 1, and seq = u + 1, then enters the TIME-WAIT state. After waiting 2MSL (set by a timer), the client enters the CLOSED state and the TCP connection is terminated.

TCP/IP four-tier model

The application layer

The transport layer

The network layer

Network interface layer

TCP/IP five-tier model

The application layer

The transport layer

The network layer

Data link layer

The physical layer

OSI seven layer model

From top to bottom:

Application layer: provides services for users. Common protocols include HTTP, SNMP, FTP, SMTP, and DNS

Presentation layer: handles data representation, encryption, and compression, e.g., ASCII

Session layer: establishes and terminates sessions and determines whether data needs to be transferred over the network, e.g., SQL, NFS

Transport layer: provides end-to-end interfaces: segmentation into packets, transport protocols, port addressing, and error checking, e.g., TCP, UDP

Network layer: IP addressing and routing, e.g., IP, ICMP, ARP

Data link layer: frames data and transmits addressed frames between MAC addresses, e.g., IEEE 802.3/802.2, HDLC, PPP (Point-to-Point Protocol), SLIP

Physical layer: transmits data over physical media in the form of binary bit streams, e.g., RS-232, RJ-45

TCP congestion control

TCP's four congestion control algorithms: 1. Slow start 2. Congestion avoidance 3. Fast retransmit 4. Fast recovery

When cwnd < ssthresh, use slow start (exponential growth: 1, 2, 4, 8, 16, ...)

When cwnd > ssthresh, stop using slow start and use the congestion avoidance algorithm instead (linear growth, +1 per RTT)

When cwnd = ssthresh, either the slow start algorithm or the congestion avoidance algorithm may be used
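The growth rules above can be simulated in a few lines. This toy model (window in MSS units, ignoring loss events) shows the exponential-then-linear shape:

```python
# Toy model of cwnd growth: exponential below ssthresh (slow start),
# linear at or above it (congestion avoidance). Units are MSS per RTT.
def cwnd_trace(rounds, ssthresh, cwnd=1):
    trace = [cwnd]
    for _ in range(rounds):
        if cwnd < ssthresh:
            cwnd *= 2   # slow start: double every RTT
        else:
            cwnd += 1   # congestion avoidance: +1 every RTT
        trace.append(cwnd)
    return trace

print(cwnd_trace(6, ssthresh=16))  # [1, 2, 4, 8, 16, 17, 18]
```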

HTTP

HTTP common status code

Grouped by the first digit: 1xx informational, 2xx success, 3xx redirection, 4xx client error, 5xx server error

Status code Meaning
200 OK The request succeeded; typically used for GET and POST requests
301 Moved Permanently The requested resource has been permanently moved to a new URI, which is returned
302 Found Temporary move; the resource is only temporarily moved, and the client should continue to use the original URI
304 Not Modified The requested resource has not been modified; the server returns this status code without returning the resource
400 Bad Request Syntax error in the client request that the server cannot understand (e.g., data submitted by the front end matches no entity on the back end)
401 Unauthorized The current request requires user authentication
403 Forbidden The server received the request but refuses to execute it
404 Not Found The server could not find the requested resource
500 Internal Server Error The server hit an internal error and could not complete the request

What are HTTP headers?

  • General headers: carry information common to both requests and responses, e.g., Date indicates when the message was created
  • Request headers: specific to request messages, e.g., Cookie and If-Modified-Since
  • Response headers: specific to response messages, e.g., Set-Cookie and Last-Modified
  • Entity headers: describe the entity body, e.g., Allow lists the executable request methods, Content-Type describes the body's media type, and Content-Encoding describes the body's encoding

Methods supported by HTTP

Method Role
GET Requests the specified page information and returns the response body; typically used for reading data
POST Submits data to the specified resource and asks the server to process it
HEAD Gets only the server's response headers; often used by clients to inspect a resource without fetching its body
OPTIONS Asks the server to return all HTTP request methods supported by the resource; often used by clients to probe the server's capabilities
PUT Uploads its latest content to the specified resource location
DELETE Asks the server to delete the resource identified by the request URI
CONNECT Turns the connection into a tunnel; often used to relay SSL-encrypted traffic through an unencrypted HTTP proxy
TRACE Asks the server to echo back the request it received; often used for testing or diagnosing HTTP requests

What is the difference between GET and POST?

GET and POST both run over TCP connections. However, due to HTTP conventions and browser/server restrictions, they differ in practice:

  • GET parameters are passed in the URL; POST parameters are placed in the request body
  • Parameters passed in a GET URL are limited in length; POST has no such limit (the HTTP protocol itself specifies none; the limits come from browsers and servers)
  • GET request parameters can only be URL-encoded; POST requests support several encodings
  • GET request parameters are kept intact in the browser history; POST parameters are not retained
  • GET generates one TCP packet; POST generates two
  • For a GET request, the browser sends the headers and data together and the server responds with 200 OK; for a POST request, the browser sends the headers, the server responds with 100 Continue, the browser sends the data, and the server responds with 200 OK
  • Caching: a GET request resembles a lookup, so its results can be cached and reused without hitting the database each time; POST requests typically modify or delete data and must reach the server, so they are not cached
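The first difference is easy to see by constructing both requests with the standard library; the URL and parameters below are made-up examples:

```python
from urllib.parse import urlencode

params = {"q": "tcp handshake", "page": "1"}

# GET: parameters travel in the URL's query string
get_url = "http://example.com/search?" + urlencode(params)

# POST: the same parameters go into the request body instead
post_body = urlencode(params).encode("utf-8")

print(get_url)    # http://example.com/search?q=tcp+handshake&page=1
print(post_body)  # b'q=tcp+handshake&page=1'
```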

New features of HTTP2.0

  • In practice deployed over HTTPS, so transport is encrypted by default
  • Binary framing layer: splits all transmitted information into smaller messages and frames encoded in binary (the core of HTTP/2's performance gains)
  • Multiplexing: sends requests and responses simultaneously over a shared TCP connection, built on the binary framing layer; HTTP messages are broken into independent frames without breaking message semantics, sent interleaved, and reassembled at the other end using the stream ID and headers
  • Server push: the server returns multiple responses for one client request, pushing additional resources to the client in advance; pushed resources can be cached (push follows the same-origin policy and is always triggered by a client request)

HTTP caching mechanism

  1. There are two types of caching, depending on the header content of the response
  2. Strong cache (status code 200): the browser reads the file from its local cache and returns it without sending any request to the server (related fields: Cache-Control, Expires)
  3. Negotiated cache (status code 304): the browser sends a request to the server to ask whether the cached copy is still usable (Last-Modified/If-Modified-Since, ETag/If-None-Match)
  4. Related headers: Cache-Control, Expires, Last-Modified/If-Modified-Since, ETag/If-None-Match
  5. Process: the browser checks the strong cache first; if it misses or has expired, it falls back to the negotiated cache

What are the parts of an HTTP request

1. Request message (request line/request header/blank line/request data)

  • The request line

    Request method field, URL field, and HTTP protocol version

    For example, GET /index.html HTTP/1.1

    The GET method appends data after the URL, so the parameters it can pass are limited

    Request method:

    GET, POST, HEAD, PUT, DELETE, OPTIONS, TRACE, CONNECT

  • Request header (key value form)

User-Agent: indicates the type of browser that generated the request.

Accept: list of content types the client can recognize.

Host: indicates the host (domain name, and optionally port) of the server being requested

Content-Type: the MIME type of the request body (used in POST and PUT requests)

  • A blank line

Send carriage return and newline characters to inform the server that there are no more headers below

  • The request data

In the POST method, the data is sent as key-value pairs

Request data is not used in the GET method, only in the POST method. The POST method suits situations where the user needs to fill out a form. The request headers most commonly associated with request data are Content-Type and Content-Length.

Such as:

GET /sample.jsp HTTP/1.1
Accept: image/gif, image/jpeg, */*
Accept-Language: zh-cn
Connection: Keep-Alive
Host: localhost
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
Accept-Encoding: gzip, deflate

username=jinqiao&password=1234
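The four-part structure (request line, headers, blank line, body) can be recovered mechanically; a minimal illustrative parser:

```python
# Split a raw HTTP request into request line, headers, and body.
# The blank line (\r\n\r\n) separates the head from the body.
def parse_request(raw):
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, path, version = lines[0].split()
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, path, version, headers, body

raw = (
    "POST /login HTTP/1.1\r\n"
    "Host: localhost\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "\r\n"
    "username=jinqiao&password=1234"
)
method, path, version, headers, body = parse_request(raw)
print(method, path, headers["Host"], body)
```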

2. Response message (status line/response header/blank line/response body)

  • The status line

    It consists of three parts: HTTP version + status code + text description of status code

    For example, HTTP/1.1 200 OK

  • Response headers

    Contains server type, date, length, content type, etc

    Server: Apache Tomcat/5.0.12

    Date: Mon, 6 Oct 2003 13:13:33 GMT

    Content-Type: text/html

    Last-Modified: Mon, 6 Oct 2003 13:23:42 GMT

    Content-Length: 112

  • A blank line

  • In response to the body

    This is the HTML page or JSON data returned by the server

Let’s talk about HTTP caching

HTTP caching means that when a client requests a resource, the request first goes through the browser cache; if the browser holds a copy of the requested resource, it can be fetched directly from the browser cache rather than from the origin server.

Caches can be classified in two ways. By whether a request to the server is required: strong cache and negotiated cache; a valid strong cache needs no interaction with the server at all, while a negotiated cache always interacts with the server, whether or not the cached copy turns out to be valid. By whether the cache can be used by a single user or multiple users: private cache and shared cache.

Benefits of using HTTP caching:

1. Reduces redundant data transmission and saves bandwidth costs.

2. Relieves server pressure, greatly improving website performance.

3. Speeds up page loading on the client.

HTTP cache scheme:

1. MD5 hash cache (fingerprinting resource file names with a content hash so that updated files get new URLs)

2. CDN cache (CDN edge nodes cache the data; when the browser sends a request, the CDN judges and handles it on behalf of the origin site)

HTTP caching mechanism

Strong cache

Cache-Control: max-age=xxx, public: both the client and proxy servers may cache the resource. Within xxx seconds the client reads the cache directly (status code 200); an HTTP request is sent to the server only if the user refreshes

Cache-Control: max-age=xxx, private: only the client may cache the resource; proxy servers do not. Within xxx seconds the client reads the cache directly (status code 200)

Cache-Control: max-age=xxx, immutable: within xxx seconds the client reads the cache directly (status code 200) and does not send an HTTP request to the server even when the user refreshes

Cache-Control: no-cache: skips the strong cache but does not prevent the negotiated cache; normally the negotiated cache is only used once the strong cache expires, but with no-cache set, the strong cache is skipped and the negotiated cache is used directly.

Cache-Control: no-store: do not cache at all. Neither the client nor intermediaries cache the resource, so there is no strong cache and no negotiated cache.

Negotiated cache

Send a request -> check whether the cached resource is expired -> expired -> ask the server -> the server compares and finds the resource unchanged -> returns status 304 -> the client reuses its cached copy of the resource.

When the server finds that the resource really has changed, the flow is:

Send a request -> check whether the cached resource is expired -> expired -> ask the server -> the server compares and finds the resource changed -> returns status 200 with the resource -> the client stores it (as on first receipt) together with the max-age from Cache-Control, the ETag, the Last-Modified value, and so on.

In short: if the resource has not changed, 304 is returned and the browser reads its local cache; if it has changed, 200 is returned along with the latest resource.
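The server-side decision in the flows above boils down to comparing validators. A sketch (the header names are real; the function itself is illustrative):

```python
# Negotiated-cache check: compare the client's validators against the
# resource's current ETag / modification time.
def revalidate(request_headers, current_etag, current_mtime):
    inm = request_headers.get("If-None-Match")
    if inm is not None and inm == current_etag:
        return 304   # unchanged: the client reuses its cached copy
    ims = request_headers.get("If-Modified-Since")
    if ims is not None and ims == current_mtime:
        return 304
    return 200       # changed (or no validators): send the full resource

print(revalidate({"If-None-Match": '"abc"'}, '"abc"', None))  # 304
print(revalidate({"If-None-Match": '"old"'}, '"new"', None))  # 200
```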

HTTPS

1. How HTTPS works

  1. The client sends an HTTPS request through the URL to request the server to establish an SSL link
  2. After receiving the request, the server returns the public key certificate
  3. The client verifies whether the public key certificate is valid. If the verification fails, a warning message is displayed. If the authentication passes, the pseudo-random number generator is used to generate the session key, and then the session key is encrypted with the public key of the certificate and sent to the server
  4. The server decrypts the session key through its own private key. At this point, both the client and the server hold the same session key
  5. The communication between the server and client is encrypted using the session key

2. HTTPS encryption mode

HTTPS uses asymmetric encryption to transfer a symmetric key, which is used by the server and client to encrypt and decrypt the sent and received data. The transmitted data is symmetrically encrypted.

  • Symmetric encryption (e.g., DES, AES): encryption and decryption use the same key (fast)
  • Asymmetric encryption (e.g., RSA): the sender encrypts with the public key and the receiver decrypts with the private key (secure)
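The hybrid scheme can be illustrated with textbook RSA and deliberately tiny numbers (completely insecure, purely for illustration): the client encrypts its chosen session key with the server's public key, and only the private key can recover it.

```python
# Textbook RSA with tiny primes -- NOT secure, illustration only.
p, q = 61, 53
n = p * q        # public modulus (3233)
e = 17           # public exponent: the public key is (n, e)
d = 2753         # private exponent: e * d = 1 (mod (p-1)*(q-1))

session_key = 42                        # symmetric key chosen by the client
ciphertext = pow(session_key, e, n)     # client: encrypt with the public key
recovered = pow(ciphertext, d, n)       # server: decrypt with the private key

print(recovered == session_key)  # True: both sides now share the session key
```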

Advantages and disadvantages of HTTPS

advantages

  • It can carry out information encryption, integrity check and identity authentication, which largely avoids the risk of information eavesdropping, information tampering and information hijacking.

disadvantages

  • The handshake phase is time-consuming, lengthening page load time and increasing power consumption
  • HTTPS caching is less efficient than HTTP caching and increases data overhead
  • SSL certificates cost money, and stronger certificates cost more
  • An SSL certificate was traditionally bound to one IP address, so multiple domain names could not share an IP (SNI now relaxes this), and IPv4 addresses are too scarce for one IP per site

The difference between HTTPS and HTTP

HTTP: Hypertext Transfer Protocol, an application-layer protocol running on top of TCP, used to transfer hypertext from WWW servers to local browsers

HTTPS: HTTP + SSL/TLS, a secure version of HTTP that adds an SSL layer to provide encrypted transmission and identity authentication

The differences:

  • Data transmitted over HTTP is unencrypted, i.e., plaintext; HTTPS is a secure protocol with SSL-encrypted transmission
  • HTTPS requires an SSL certificate; HTTP does not
  • The HTTP port number is 80; the HTTPS port number is 443
  • HTTP works purely at the application layer; HTTPS inserts the SSL/TLS layer between the application layer and the transport layer

5. HTTPS man-in-the-middle attack and its defense

MITM (man-in-the-middle) attack: the attacker inserts themselves into the communication as an invisible relay. The attacker sees all traffic between the two parties and can arbitrarily add, delete, and modify the communication content without either party's knowledge.

Guarantee of Communication Process Security (bottom-up)

  1. Correctness of public key: Asymmetric encryption is used for communication between the two parties. In asymmetric encryption, the private key is not transferred, but the public key is made public.
  2. Digital certificate: The public key is provided by the peer party at the beginning of communication, but can easily be replaced by an intermediary. Therefore, when sending the public key, you must also provide the corresponding digital certificate to verify that the public key is from the peer party rather than the intermediary.
  3. Upper-layer CA certificate correctness: The upper-layer CA issues a digital certificate to an individual or organization. The upper-layer CA uses its private key to sign the personal certificate to ensure that the public key of the certificate is not tampered with.
  4. The private key of the root certificate is not leaked or tampered with: The upper-level CA certificate is also issued by the upper-level CA. The trust chain continues to the root certificate, which is self-signed.
  5. Devices are not maliciously modified before being distributed to consumers: root certificates are generally distributed through operating systems rather than networks; The initial operating system should be distributed in a raw, face-to-face manner. Therefore, hardware manufacturers cooperate with certification authorities to build the root certificate of the certification authority into the operating system of the device before delivery.

HTTPS, SSL, and TLS

Secure Sockets Layer (SSL) is the encryption layer underlying HTTPS: it sits between HTTP and TCP and encrypts the data transmitted over TCP, so HTTPS is short for HTTP + SSL/TLS over TCP. After SSL reached version 3.0, the Internet Engineering Task Force (IETF) standardized it, adding a few mechanisms (while keeping it almost identical to SSL 3.0) and renaming the standardized protocol TLS 1.0 (Transport Layer Security). TLS can thus be regarded as SSL 3.1, a more secure upgrade to SSL. Because the term SSL is more widely used, security certificates are still commonly called SSL certificates, but what you buy today (e.g., from Symantec) is really a TLS certificate, with a choice of ECC, RSA, or DSA keys.

TLS/SSL is a specification for encrypted channels

HTTP1/2/3

HTTP/1.1 was released in 1997. Web pages used to be mostly text but are now rich media; since the protocol was designed for the former, it performs poorly in many modern scenarios


Head-of-line blocking means that when one request in a sequential request queue is blocked for some reason, all subsequent requests are blocked as well, so the client cannot receive data. People have tried the following workarounds:

Allocate the resources of one page across different domain names to raise the connection limit. Chrome allows at most six TCP persistent connections per domain by default. With a persistent connection, only one request can be processed in the pipe at a time; all other requests block until the current one completes. Moreover, if 10 requests arrive at once for the same domain, four of them are queued until an in-flight request completes.

Disadvantages of HTTP/1: high latency (head-of-line blocking), large HTTP headers with much duplication across thousands of request and response messages, and insecure plaintext transmission

SPDY protocol: Google's revamp of HTTP/1, which among other things compresses headers

HTTP/2: the IETF-standardized successor to SPDY

HTTP/2's new features: binary framing (bringing some transport-style framing into the application layer), header compression with the dedicated HPACK algorithm (compression ratios of 50% to 90%), and multiplexing (solving the browser's per-domain request limit)

HTTP2 summary:

All communication with the same domain name is done over a single connection.

A single connection can host any number of two-way data streams.

Each data stream is sent as messages, which in turn consist of one or more frames; frames can be sent out of order because they are reassembled according to the stream identifier in each frame's header.

This feature greatly improves performance:

The same domain name occupies only one TCP connection, and multiple requests and responses are sent in parallel over it. Downloading all of a page's resources therefore needs only one slow start and avoids the bandwidth competition caused by multiple TCP connections.

Multiple requests/responses are sent in parallel and interleaved, without affecting each other.
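The interleave-then-reassemble behavior described above can be sketched in a few lines (the stream IDs and payloads are made-up examples):

```python
# Frames from two streams interleaved on one connection, then
# regrouped by the stream ID carried in each frame header.
frames = [
    (1, "GET /sty"),
    (3, "GET /app"),
    (1, "le.css"),
    (3, ".js"),
]

def reassemble(frames):
    streams = {}
    for stream_id, chunk in frames:
        streams.setdefault(stream_id, []).append(chunk)
    return {sid: "".join(chunks) for sid, chunks in streams.items()}

print(reassemble(frames))  # {1: 'GET /style.css', 3: 'GET /app.js'}
```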

In HTTP/2, each request can have a priority value of 31 bits. 0 indicates the highest priority. A higher value indicates a lower priority. With this priority value, clients and servers can take different policies when dealing with different streams to optimally send streams, messages, and frames.

Server push changes the traditional request-response model to some extent. The server is no longer limited to passively answering requests; it can create streams on its own. When the browser requests just the HTML, the server can send the JS and CSS files it will likely need in advance, reducing waiting latency. This is called "Server Push" (also "Cache Push"). The server can push proactively, but the client chooses whether to accept: if the server pushes a resource the browser has already cached, the browser can reject it by sending RST_STREAM. Push also follows the same-origin policy, so the server cannot push third-party resources to the client without confirmation from both sides.

For compatibility, HTTP/2 keeps HTTP/1's "plaintext" option: data may still be transmitted without encryption, and encrypted communication is not mandatory, though the data is carried in binary framing rather than human-readable text.

But since HTTPS has become the trend, and major browsers like Chrome and Firefox have publicly declared that they only support encrypted HTTP/2, the de facto HTTP/2 is encrypted. While HTTP/2 solves many problems of earlier versions, one big problem remains, caused mainly by the underlying TCP protocol. HTTP/2's main drawbacks are the following:

TCP and TCP+TLS connection establishment delay

HTTP/2 is transmitted over TCP, and HTTPS uses TLS for secure transport. TLS requires its own handshake, so establishing a connection involves two handshake delays: the TCP handshake and the TLS handshake.

TCP head-of-line blocking is not completely resolved

We mentioned earlier that in HTTP/2, multiple requests run in a single TCP pipe. However, when packet loss occurs, HTTP/2 can perform worse than HTTP/1. To ensure reliable transmission, TCP has a packet-loss retransmission mechanism: a lost packet must be retransmitted and acknowledged. When a packet is lost under HTTP/2, the entire TCP connection waits for the retransmission, blocking all requests on that connection. Under HTTP/1.1, multiple TCP connections can be open, so only one of them is affected while the rest keep transmitting data.

Google was aware of these problems when it launched SPDY, so it created the UDP-based "QUIC" protocol, making HTTP run over QUIC instead of TCP. This "HTTP over QUIC" is the next major version of HTTP, HTTP/3. It is a qualitative leap over HTTP/2 that truly solves the head-of-line blocking problem.

As mentioned above, QUIC is based on UDP, which is connectionless and needs no handshake or teardown at all, so it is faster than TCP. QUIC also implements reliable transmission, guaranteeing that data reaches its destination, and introduces "streams" and "multiplexing" similar to HTTP/2: an individual stream is ordered and may block on packet loss, but other streams are unaffected. Specifically, the QUIC protocol has the following characteristics:

It implements flow control and transmission reliability similar to TCP.

Although UDP does not provide reliable transmission, QUIC adds a layer on top of UDP to ensure reliable transmission of data. It provides packet retransmission, congestion control, and other features found in TCP.

It implements fast connection establishment.

Because QUIC is UDP-based, it can establish connections with 0-RTT or 1-RTT, meaning QUIC can start sending and receiving data almost immediately, which greatly improves first-page load speed. 0-RTT connection establishment is arguably QUIC's biggest performance advantage over HTTP/2.

Integrated TLS encryption function.

QUIC currently uses TLS 1.3, which has several advantages over earlier TLS versions, the most important being the reduced number of RTTs spent on the handshake.

Multiplexing completely solves the TCP head-of-line blocking problem

Unlike TCP, QUIC enables multiple independent logical data streams over the same physical connection. Each stream is transmitted separately, which eliminates TCP head-of-line blocking.

  • HTTP/1.1 has two major drawbacks: inadequate security and poor performance.
  • HTTP/2 is fully compatible with HTTP/1 and is a “more secure HTTP, faster HTTPS”. Technologies such as header compression and multiplexing can make full use of bandwidth and reduce latency, thus greatly improving the Internet experience.
  • QUIC is based on UDP and is the underlying transport protocol of HTTP/3. It builds on UDP while borrowing the essence of TCP to achieve a fast yet reliable protocol.

CSRF and XSS attack and defense means

The name of the way defense
CSRF(Cross-site request Forgery) The attacker steals the user’s identity and sends only requests under the user’s name Validate the HTTP Referer field, add the token to the request header and validate
XSS (Cross-site scripting attacks) Attackers embed malicious JS scripts in the page and attack users when they browse the page Verify the data submitted by users, filter the input content, and set important cookies to HTTP only
SQL injection Use SQL statements to log in without an account or even tamper with the database Check variable data types and formats, filter special symbols, bind variables using precompiled statements (best)
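The prepared-statement defense is easy to demonstrate with sqlite3: the classic `' OR '1'='1` payload succeeds against string concatenation but fails against bound parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret')")

payload = "' OR '1'='1"

# Vulnerable: the payload rewrites the WHERE clause
unsafe = conn.execute(
    "SELECT * FROM users WHERE name = '' AND password = '%s'" % payload
).fetchall()

# Safe: bound parameters treat the payload as a literal string
safe = conn.execute(
    "SELECT * FROM users WHERE name = ? AND password = ?", ("", payload)
).fetchall()

print(len(unsafe), len(safe))  # 1 0
```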

How cookies protect against XSS attacks:

Add Set-Cookie to the HTTP response header with the following attributes:

HttpOnly: this attribute prevents JS scripts from accessing the cookie via document.cookie

Secure: this attribute tells the browser to send the cookie only when the request is made over HTTPS
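Both attributes can be set with the standard library when building the Set-Cookie header:

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "abc123"
cookie["session"]["httponly"] = True  # document.cookie cannot read it
cookie["session"]["secure"] = True    # sent only over HTTPS

header = cookie.output(header="Set-Cookie:")
print(header)  # includes "HttpOnly" and "Secure"
```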

Principle of CDN

The full name of CDN is Content Delivery Network. A CDN adds a cache layer between users and servers, mainly implemented by taking over DNS: the user's request is routed to a cache node that holds the origin server's data, thereby speeding up network access.