A network model

As technology developed, computers came into ever wider use and computer communication flourished. Each information technology company with its own computing system defined its own communication rules, so heterogeneous computers could not communicate with one another, which greatly hindered the development of network communication. To solve this problem, the International Organization for Standardization (ISO) developed the OSI model, which defines standards for interconnecting different computers. The OSI model divides the work of network communication into seven layers: the physical layer, data link layer, network layer, transport layer, session layer, presentation layer, and application layer.

The seven-layer model is a design-level concept in which each layer has fixed responsibilities and functions. Layering brings clarity and functional independence, but too many layers add complexity: even when a layer’s functions are not needed, its context must still be constructed, which consumes system resources. Practical implementations therefore simplify the seven layers into four: the application layer, transport layer, network layer, and network interface layer. The models and protocols of these layers are collectively referred to as the TCP/IP protocol suite.

As can be seen from the figure above, the TCP/IP model merges the application layer, presentation layer, and session layer of the OSI model into a single application layer, and merges the data link layer and physical layer of the OSI model into the network interface layer (also called the network access layer).

The figure above also lists the protocols of the TCP/IP protocol stack that belong to each layer and the relationships between them. For example, DNS runs over both TCP and UDP; FTP, HTTP, and TELNET run over TCP; NTP, TFTP, and SNMP run over UDP; and TCP and UDP both run over IP.

| OSI layer | Function | TCP/IP protocol family |
| --- | --- | --- |
| Application layer | File transfer, e-mail, file services, virtual terminals | TFTP, HTTP, SNMP, FTP, SMTP, DNS, RIP, Telnet |
| Presentation layer | Data formatting, code conversion, data encryption | None |
| Session layer | Controls the dialogue between applications; for example, distributing data to the right software | ASAP, TLS, SSH, ISO 8327 / CCITT X.225, RPC, NetBIOS, ASP, Winsock, BSD Sockets |
| Transport layer | Basic end-to-end data transmission | TCP, UDP |
| Network layer | Defines IP addresses and routing; for example, forwarding data between different devices | IP, ICMP, RIP, OSPF, BGP, IGMP |
| Data link layer | Defines the basic format of the data, how it is transmitted, and how it is identified | SLIP, CSLIP, PPP, ARP, RARP, MTU |
| Physical layer | Transmits data in binary form over physical media | ISO 2110, IEEE 802 |

When one of our websites doesn’t work, I usually ping the site first.

Ping is arguably the best-known use of ICMP, which is part of the TCP/IP protocol suite. The ping command checks whether a host is reachable over the network, which helps us analyze and locate network faults.

The HTTP protocol

The URI and URL

Each Web server resource has a name so that clients can indicate which resource they are interested in. This name is called a Uniform Resource Identifier (URI). URIs are like postal addresses on the Internet: they uniquely identify and locate information resources around the world.

The Uniform Resource Locator (URL) is the most common form of resource identifier. A URL describes the specific location of a resource on a specific server.

Today almost all URIs are URLs.

The second form of URI is the Uniform Resource Name (URN). A URN serves as a unique name for a specific piece of content, independent of where the resource currently resides.
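As a small illustration of how a URL pins down a location, the following sketch splits one into its parts with Python’s standard `urllib.parse`. The helper name `describe_url` and the path and query string are made up for this example; only the host name comes from this article.

```python
from urllib.parse import urlsplit

def describe_url(url: str) -> dict:
    """Split a URL into the pieces a client needs to locate the resource."""
    parts = urlsplit(url)
    default_ports = {"http": 80, "https": 443}
    return {
        "scheme": parts.scheme,                  # how to talk to the server
        "host": parts.hostname,                  # which server
        "port": parts.port or default_ports.get(parts.scheme),
        "path": parts.path or "/",               # which resource on that server
        "query": parts.query,
    }

# Hypothetical path and query, used only for illustration.
info = describe_url("http://www.lazyegg.net/index.html?lang=en")
print(info)
```

When the URL carries no explicit port, the sketch falls back to the scheme’s usual default.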

Structure of HTTP messages

Transactions and messages

How does a client transact with a Web server and its resources over HTTP? An HTTP transaction consists of a request command sent from the client to the server and a response result sent back from the server to the client. This communication takes place through formatted blocks of data called HTTP messages.

HTTP transaction

message

HTTP/1.x messages are plain text, not binary. Messages sent from a Web client to a Web server are called request messages; messages sent from the server to the client are called response messages.

An HTTP message consists of three parts:

  • The start line
  • The header fields
  • The body
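The three parts can be seen by splitting a raw message by hand. This is a minimal sketch, assuming a well-formed HTTP/1.x message with CRLF line endings; the function name `parse_http_message` and the sample response are invented for illustration, and a real parser has to handle many more cases (repeated headers, chunked bodies, and so on).

```python
def parse_http_message(raw: bytes):
    """Split a raw HTTP/1.x message into start line, header fields, and body."""
    # A blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return start_line, headers, body

# Hand-written sample response, used only for illustration.
raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
start_line, headers, body = parse_http_message(raw_response)
print(start_line)
print(headers)
print(body)
```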

Common HTTP header fields

A. General header fields (used by both request and response messages)

These can appear in either a request message or a response message.

  • Date: the time at which the message was created
  • Connection: manages the connection
  • Cache-Control: controls caching
  • Transfer-Encoding: the transfer encoding applied to the message body

B. Request header fields (used in request messages)

Header fields sent in request messages from the client to the server. They provide additional information about the request, information about the client, and the client’s preferences for the response content.

  • Host: the server on which the requested resource resides
  • Accept: the media types the client can process
  • Accept-Charset: the character sets the client accepts
  • Accept-Encoding: the content encodings the client accepts
  • Accept-Language: the natural languages the client accepts

C. Response header fields (used in response messages)

Header fields returned in response messages from the server to the client. They add extra information about the response and may ask the client to supply additional information.

  • Accept-Ranges: whether the server accepts byte-range requests
  • Location: the URI to which the client should be redirected
  • Server: information about the HTTP server software

D. Entity header fields (used for the entity part of request and response messages)

Header fields that describe the entity carried by a request or response message, such as when the resource content was updated.

  • Allow: the HTTP methods the resource supports
  • Content-Type: the media type of the entity body
  • Content-Encoding: the encoding applied to the entity body
  • Content-Language: the natural language of the entity body
  • Content-Length: the size of the entity body in bytes
  • Content-Range: the position of this part within the full entity body; used in responses to partial (range) requests

Methods

The HTTP protocol defines many methods for interacting with the server. The four most basic are GET, POST, PUT, and DELETE. A URL describes a resource on the network, and the GET, POST, PUT, and DELETE methods roughly correspond to the four basic operations on that resource: reading, creating, updating, and deleting (in common REST practice POST creates and PUT creates or replaces, though usage varies). The most common are GET and POST. GET is used to obtain or query resource information, and POST is used to submit data that creates or updates resource information.

GET, HEAD, PUT, POST, TRACE, OPTIONS, DELETE

GET vs. POST

GET and POST are two commonly used HTTP methods. The differences between them mainly include the following five aspects:

  1. In terms of function, GET is generally used to obtain resources from the server, and POST is generally used to update resources on the server.
  2. From a REST perspective, GET is idempotent: reading the same resource repeatedly has the same effect as reading it once. POST is not idempotent, because repeating the request may change the resource again each time. Further, GET does not change resources on the server, whereas POST does.
  3. In terms of request parameters, the data of a GET request is appended to the URL, so it travels in the request line of the HTTP message: a ? separates the URL from the data, and parameters are joined with &. Letters and digits are sent as they are; other characters are encoded in the application/x-www-form-urlencoded format. Spaces become +, and Chinese or other non-ASCII characters are percent-encoded byte by byte, for example %E4%BD%A0%E5%A5%BD, where XX in %XX is the hexadecimal value of one byte of the character’s encoding. A POST request places the submitted data in the body of the HTTP request.
  4. In terms of security, POST exposes less than GET: the data submitted by a GET request appears in clear text in the URL (and therefore in browser history and server logs), whereas the parameters of a POST request are carried in the request body. Without HTTPS, however, both are transmitted unencrypted.
  5. In terms of request size, the length of a GET request is limited by the URL length that the browser or server accepts, so only a small amount of data can be sent. The protocol places no limit on the size of a POST request, although servers usually configure one.
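The encoding described in point 3 can be reproduced with Python’s standard `urllib.parse`. This is a minimal sketch: the parameter names and the `/search` path are made up, and `urlencode` is assumed to use its default UTF-8 encoding.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields, used only for illustration.
params = {"q": "hello world", "greeting": "你好"}

# application/x-www-form-urlencoded: space -> "+", non-ASCII -> %XX per UTF-8 byte.
query = urlencode(params)
print(query)

# GET: the encoded data is appended to the URL after "?".
get_url = "http://www.lazyegg.net/search?" + query

# POST: the same encoded data travels in the request body instead.
post_body = query.encode("ascii")

# Decoding on the server side restores the original values.
decoded = parse_qs(query)
print(decoded)
```

The same encoded string serves both methods; only its position in the message differs.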

HTTP request line: method + request URI + protocol version

HTTP response status line: protocol version + status code + reason phrase

Status code

Each HTTP response packet returns with a status code. A status code is a three-digit code that tells the client whether the request was successful or whether additional action is required.

  • 1xx: Informational. The server has received the request and the client should continue.
  • 2xx: Success. The server received and processed the client’s request successfully.
  • 3xx: Redirection. The server returns redirection information, and the client must take further action to complete the request.
  • 4xx: Client error. The request contains invalid content or cannot be fulfilled.
  • 5xx: Server error. The server failed to process a valid request properly.

  • 200 OK: The request sent from the client was processed normally and the result is returned.
  • 204 No Content: The request sent by the client to the server was processed successfully, but the response message contains no entity body (there is no resource to return).
  • 206 Partial Content: The client made a range request and the server successfully carried out this partial GET. The response contains the part of the entity specified by Content-Range.
  • 301 Moved Permanently: The requested resource has been permanently assigned a new URL, which should be used from now on.
  • 302 Found: Temporary redirect. The requested resource has temporarily been assigned a new URL, and the client is expected to use the new URL for this request.
  • 303 See Other: The requested resource has been assigned a new URL, which should be fetched with the GET method.
  • 304 Not Modified: The client sent a conditional request (a GET request containing an If-Match, If-Modified-Since, If-None-Match, If-Range, or If-Unmodified-Since header) and the server allows access to the resource, but the condition was not satisfied. The response contains no entity body.
  • 400 Bad Request: The request message contains a syntax error.
  • 401 Unauthorized: The request requires HTTP authentication.
  • 403 Forbidden: The server refuses the access (for example, because the client lacks permission).
  • 404 Not Found: The requested resource cannot be found on the server. This status is also used when the server rejects the request but does not want to give a reason.
  • 500 Internal Server Error: An error occurred while the server was executing the request, possibly a bug in the Web application or a temporary fault.
  • 503 Service Unavailable: The server is temporarily overloaded or down for maintenance and cannot process requests.
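Because the class of a status code is given by its first digit, a client can classify any code it receives, even one it does not recognize. The sketch below uses Python’s standard `http.HTTPStatus` for the reason phrases; the helper names `status_class` and `describe` are invented for this example.

```python
from http import HTTPStatus

def status_class(code: int) -> str:
    """Map a three-digit status code to its class using the first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")

def describe(code: int) -> str:
    """Combine the standard reason phrase with the status class."""
    return f"{code} {HTTPStatus(code).phrase}: {status_class(code)}"

for code in (200, 206, 301, 304, 404, 500, 503):
    print(describe(code))
```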

HTTP is an application layer protocol. Instead of worrying about the nuts and bolts of network communication, HTTP relegates the details to TCP/IP, a universal and reliable Internet transport protocol.

Before the HTTP client can send a message to the server, it must establish a TCP/IP connection to the server using the server’s Internet Protocol (IP) address and port number. The URL can give the IP address directly, as in http://207.200.21.11:80/index.html, or give a host name that is resolved to an IP address through the Domain Name Service (DNS), as in http://www.lazyegg.net.

How a browser displays a simple HTML resource from a remote server over HTTP:

  • The browser extracts the server’s host name from the URL.
  • The browser translates the host name into the server’s IP address.
  • The browser extracts the port number (if any) from the URL.
  • The browser establishes a TCP connection with the Web server.
  • The browser sends an HTTP request message to the server.
  • The server sends an HTTP response message back to the browser.
  • The connection is closed and the browser displays the document.
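The steps above can be sketched with a raw socket. This is an illustrative sketch, not how a real browser is implemented: the function name `fetch` is made up, the request is deliberately minimal, and a tiny local test server (`HelloHandler`, also hypothetical) stands in for the remote Web server so the example runs without Internet access.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

def fetch(url: str) -> bytes:
    parts = urlsplit(url)
    host = parts.hostname                    # extract the host name from the URL
    ip = socket.gethostbyname(host)          # translate the host name into an IP address
    port = parts.port or 80                  # extract the port number (default 80)
    request = (
        f"GET {parts.path or '/'} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")
    with socket.create_connection((ip, port), timeout=5) as sock:  # TCP connection
        sock.sendall(request)                # send the HTTP request message
        chunks = []
        while True:                          # read the response until the server closes
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)                  # connection is closed on leaving the block

class HelloHandler(BaseHTTPRequestHandler):
    """Stand-in Web server that returns a small HTML document."""
    def do_GET(self):
        body = b"<html>hello</html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):            # keep the example output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), HelloHandler)   # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
response = fetch(f"http://127.0.0.1:{server.server_port}/index.html")
server.shutdown()
server.server_close()
print(response.split(b"\r\n")[0])
```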

Protocol version

  • HTTP / 0.9

    The original version of the HTTP protocol. It supports only the GET method and can only request resources in HTML format.

  • HTTP / 1.0

    • Added the POST and HEAD request methods.
    • Content is no longer limited to HTML as in 0.9: Content-Type supports multiple data formats through MIME (Multipurpose Internet Mail Extensions) types such as text/html and image/jpeg.
    • Caching is supported, so that a client can reuse a cached copy of the same resource within a specified period of time.
    • The format of HTTP requests and responses also changed. In addition to the data section, every message must include headers (HTTP headers) that describe metadata. Other new features include status codes, multiple character set support, multi-part types, authorization, caching, and content encoding.
    • However, in version 1.0 each TCP connection can carry only one request. Once the server has responded, the connection is closed, and the next request requires a new TCP connection; keep-alive is not supported by default.
  • HTTP / 1.0 +

    In the mid-1990s, to meet the rapid growth of the World Wide Web, many popular Web clients and servers quickly added features to HTTP, including persistent keep-alive connections, virtual hosting support, and proxy connection support, as unofficial de facto standards. This informal HTTP extension is often referred to as HTTP/1.0+

  • HTTP / 1.1

    • HTTP/1.1 was released in 1997 and is still a mainstream version of the protocol.
    • Persistent connections were introduced: TCP connections are not closed by default and can be reused by multiple requests, without having to declare Connection: keep-alive.
    • Pipelining was introduced, which allows clients to send multiple requests over the same TCP connection without waiting for each response, further improving the efficiency of the protocol.
    • Methods were added, including PUT, OPTIONS, and DELETE. (PATCH was defined later as an extension, in RFC 5789.)
    • Drawback: HTTP is stateless, so all information must be attached to each request. Many header fields are repeated from request to request, wasting bandwidth and affecting speed.
  • HTTP/2

    • HTTP/2 was published in 2015 (RFC 7540); its adoption was limited at first.
    • HTTP/2 is a fully binary protocol. Both headers and data bodies are binary and are collectively referred to as “frames”: header frames and data frames.
    • A single TCP connection is multiplexed: both the client and the server can send multiple requests or responses at the same time over one connection, without having to answer them one by one in order, which avoids head-of-line blocking at the HTTP level. This two-way, real-time communication is called multiplexing.
    • HTTP/2 allows the server to send unsolicited resources to clients, known as server push.
    • Header compression was introduced: header fields are compressed with HPACK before they are sent.

What are the main differences between HTTP 1.0 and HTTP 1.1?

  1. Cache handling: HTTP/1.0 mainly uses the If-Modified-Since and Expires headers to decide whether a cached copy is still valid. HTTP/1.1 introduces more cache control mechanisms, such as entity tags (ETag), If-Unmodified-Since, If-Match, and If-None-Match.
  2. Bandwidth optimization and use of network connections: HTTP/1.0 wastes bandwidth in some cases. For example, when the client needs only part of an object, the server still sends the whole object, and resuming an interrupted download is not supported. HTTP/1.1 introduces the Range request header field, which allows a client to request only part of a resource; the response code is 206 (Partial Content). This lets developers make better use of bandwidth and connections.
  3. Error notification: HTTP/1.1 adds 24 new error status codes. For example, 409 (Conflict) indicates that the request conflicts with the current state of the resource, and 410 (Gone) indicates that a resource on the server has been permanently deleted.
  4. Host header handling: HTTP/1.0 assumes that each server is bound to a unique IP address, so the URL in the request message does not carry the host name. With the development of virtual hosting, however, one physical server can host multiple virtual hosts (multi-homed Web servers) that share the same IP address. HTTP/1.1 requires request messages to include the Host header field; a request without it is rejected with an error (400 Bad Request).
  5. Persistent connections: HTTP/1.1 supports persistent connections and pipelining, which deliver multiple HTTP requests and responses over a single TCP connection and so reduce the cost and latency of establishing and closing connections. Connection: keep-alive behavior is enabled by default in HTTP/1.1, which largely makes up for HTTP/1.0 creating a new connection for every request.

To transmit a message, HTTP streams the message data, in order, through an open TCP connection. TCP takes the data stream, splits it into small chunks called segments, and transports the segments across the Internet inside IP packets. All this work is handled by the TCP/IP software; the HTTP programmer sees none of it.

A computer can have several TCP connections open at any one time. TCP keeps all these connections straight by means of port numbers.

An IP address will connect you to the right computer, and a port number will connect you to the right application.

HTTP transactions are delayed for several main reasons.

  1. The client first needs to determine the IP address and port number of the Web server from the URI. If the host name in the URI has not been visited recently, converting it to an IP address through the DNS resolution system can take tens of seconds.
  2. Next, the client sends a TCP connection request to the server and waits for the server to send back a connection acceptance reply. Every new TCP connection incurs a connection setup delay. It is usually a second or two at most, but it can add up quickly when there are hundreds of HTTP transactions.
  3. Once the connection is established, the client sends the HTTP request over the newly established TCP pipe. When the data arrives, the Web server reads the request message from the TCP connection and processes the request.
  4. The Web server then sends back the HTTP response, which also takes time.

Processing of HTTP connections

Serial connection

Suppose you have a Web page with three embedded images. The browser needs to initiate four HTTP transactions to display this page: one for the top-level HTML page and three for the embedded images. If each transaction requires a new connection and the transactions run serially, the connection delays and slow-start delays add up.

Parallel connection

Sends concurrent HTTP requests over multiple TCP connections.

HTTP allows clients to open multiple connections and perform multiple HTTP transactions in parallel. In this example, four embedded images are loaded in parallel, and each transaction has its own TCP connection.

Parallel connections are not always faster, however. A single HTTP transaction to a fast server could easily use up all the available modem bandwidth. If multiple objects are loaded in parallel, each competes for the limited bandwidth and loads proportionally more slowly, so the performance gain is small or nonexistent.

Even when they do not actually speed up the page, parallel connections often make the page feel faster to the user, because several component objects appear on the screen at the same time and the user can see the loading progress. People perceive a page as loading faster when a lot is happening across the screen, even if a stopwatch shows that the whole page took longer to download.

Parallel connections also have some disadvantages:

  • Each transaction opens/closes a new connection, consuming time and bandwidth.
  • Due to the slow start nature of TCP, the performance of each new connection is degraded.
  • The number of parallel connections that can be opened is limited in practice.

Persistent connection

Reuse TCP connections to eliminate connection and shutdown latency.

Web clients often open connections to the same site, and an application that initiates an HTTP request to a server is likely to make more requests to that server in the near future (for example, to fetch inline images). This property is called site locality.

HTTP/1.1 (and various enhanced versions of HTTP/1.0) allows HTTP devices to keep TCP connections open after a transaction has finished, in order to reuse existing connections for future HTTP requests.

Non-persistent connections are closed at the end of each transaction. Persistent connections remain open between transactions until the client or server decides to close them.

Persistent connections reduce latency and connection establishment overhead, keep connections in a tuned state (past TCP slow start), and reduce the potential number of open connections. However, persistent connections must be managed carefully, or a large number of idle connections can accumulate and consume resources on both clients and servers.

Using persistent connections in conjunction with parallel connections is probably the most efficient approach. Today, many Web applications open a small number of parallel connections, each of which is a persistent connection. There are two types of persistent connections: older HTTP/1.0+ “keep-alive” connections, and modern HTTP/1.1 “persistent” connections.

The keep-alive header is only a request to keep the connection active. After a keep-alive request is made, the client and server are not obliged to honor it. They can close idle keep-alive connections at any time and may limit the number of transactions handled on a keep-alive connection. The Keep-Alive header is completely optional, but it can only be used when Connection: keep-alive is also present.

HTTP/1.1 phased out support for keep-alive connections, replacing them with an improved design called persistent connections.

Unlike HTTP/1.0+ keep-alive connections, HTTP/1.1 persistent connections are enabled by default. Unless otherwise specified, HTTP/1.1 assumes that all connections are persistent. To close a Connection after a transaction, an HTTP/1.1 application must explicitly add a Connection: close header to the message. This is an important difference from previous versions of the HTTP protocol, where Keepalive connections were either optional or not supported at all.
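Connection reuse can be observed with Python’s standard `http.client`, which keeps an HTTP/1.1 connection open between requests. In this sketch a small local server (`PortEchoHandler`, a name invented here) replies with the client’s TCP port number; if three requests report the same port, they all travelled over one TCP connection. The paths `/a`, `/b`, `/c` are arbitrary.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class PortEchoHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets http.server keep the connection open by default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Report the client's source port so connection reuse is visible.
        body = f"client port {self.client_address[1]}".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # needed to delimit the body
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass

server = HTTPServer(("127.0.0.1", 0), PortEchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
bodies = []
for path in ("/a", "/b", "/c"):
    conn.request("GET", path)          # no Connection header: persistent by default
    bodies.append(conn.getresponse().read())   # read fully before the next request
conn.close()                           # the client decides when to close
server.shutdown()
server.server_close()
print(bodies)
```

Sending `Connection: close` with a request would instead make the server close the connection after that response.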

Pipelined connection

Concurrent HTTP requests are made over a shared TCP connection.

HTTP/1.1 allows optional request pipelining on persistent connections. Multiple requests can be queued before the responses arrive. While the first request is travelling across the network to a server halfway around the world, the second and third requests can already be on their way.

There are several restrictions on pipelined connections:

  • Pipelining should not be used if the HTTP client cannot verify that the connection is persistent.
  • HTTP responses must be sent back in the same order as the requests. HTTP messages carry no sequence numbers, so if responses arrive out of order there is no way to match them to their requests.
  • The HTTP client must be prepared for the connection to be closed at any time and must be ready to resend any outstanding pipelined requests.
  • HTTP clients should not pipeline requests that have side effects (such as POST).
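The restrictions above amount to simple bookkeeping on the client side. The sketch below models only that bookkeeping and does no network I/O; the class `PipelinedClient` and its method names are hypothetical, not part of any library.

```python
from collections import deque

class PipelinedClient:
    """Tracks outstanding pipelined requests (bookkeeping only, no sockets)."""

    SAFE_METHODS = ("GET", "HEAD")     # requests without side effects

    def __init__(self):
        self.outstanding = deque()     # requests sent but not yet answered

    def send(self, method: str, path: str) -> None:
        if method not in self.SAFE_METHODS:
            raise ValueError(f"{method} has side effects; do not pipeline it")
        self.outstanding.append((method, path))

    def on_response(self, response):
        # Responses carry no sequence number, so the only possible match
        # is the oldest unanswered request (first in, first out).
        request = self.outstanding.popleft()
        return request, response

    def on_connection_closed(self):
        # Everything still unanswered must be sent again on a new connection.
        to_retry = list(self.outstanding)
        self.outstanding.clear()
        return to_retry

client = PipelinedClient()
client.send("GET", "/a")
client.send("GET", "/b")
first = client.on_response("response-1")
retry = client.on_connection_closed()
print(first, retry)
```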

HTTPS

Shortcomings of HTTP:

  1. Communication is in plaintext with no encryption (content is vulnerable to eavesdropping).
  2. The identity of the communicating party is not verified (impersonation is easy).
  3. The integrity of a message cannot be verified (content may have been tampered with).

The HTTP protocol itself is very simple: whoever sends a request gets a response. Because the communicating party is never confirmed, several hazards arise.

  • The client cannot determine whether the Web server that returns the response is the server it intended to send the request to. It could be a disguised Web server.
  • The server cannot determine whether the client that receives the response is the client it intended to respond to. It could be a disguised client.
  • There is no way to determine whether the communicating party has access rights. Some Web servers hold important information and want to grant communication permission only to specific users.
  • There is no way to determine where a request came from or who sent it. Even meaningless requests are accepted.
  • Denial of Service (DoS) attacks that flood the server with requests cannot be prevented.

Therefore, HTTP is not suitable for transmitting sensitive information such as credit card numbers, passwords, and other payment details.

To address these shortcomings of HTTP, another protocol is needed: HTTPS, the Hypertext Transfer Protocol over Secure Sockets Layer. To secure data transmission, HTTPS adds SSL (Secure Sockets Layer) underneath HTTP. SSL relies on certificates to verify the identity of the server and encrypts the communication between the browser and the server.

HTTP used in combination with SSL (Secure Socket Layer) is HTTPS

Comparison between HTTP and HTTPS

Data transmitted through HTTP is unencrypted, that is, plain text. Therefore, it is very insecure to transmit private information through HTTP. To ensure that private data can be encrypted and transmitted, Netscape designed the Secure Sockets Layer (SSL) protocol to encrypt data over HTTP, giving birth to HTTPS. To put it simply, HTTPS is a network protocol that uses SSL and HTTP to encrypt transmission and authenticate identities. It is more secure than HTTP.

The main differences between HTTPS and HTTP are as follows:

  1. HTTPS requires a certificate obtained from a CA. Certificates often cost money, although some CAs issue them free of charge.
  2. HTTP is the hypertext transfer protocol, and information is transmitted in plain text. HTTPS is a secure transfer protocol encrypted with SSL.
  3. HTTP and HTTPS establish connections in different ways and use different default ports: 80 for the former and 443 for the latter.
  4. HTTP connections are simple and stateless; HTTPS is a network protocol built from SSL and HTTP that provides encrypted transmission and identity authentication, which makes it more secure than HTTP.

Symmetric and asymmetric encryption

There are two main encryption methods: one is shared key encryption (symmetric key encryption) and the other is public key encryption (asymmetric key encryption)

Shared key encryption (symmetric key encryption)

Encryption and decryption use the same key. Common symmetric encryption algorithms include DES, AES, and 3DES.

The drawback is that the encryption key must also be sent to the other party, and it may be stolen during transmission. How can this problem be solved?

Public key (asymmetric key)

Public-key encryption uses a pair of asymmetric keys. One is called the private key and the other the public key. The private key must not be disclosed to anyone, while the public key can be distributed freely. Information encrypted with the public key can be decrypted only with the private key. Common asymmetric encryption algorithms include RSA and ECC.

That is, the sender encrypts the message with the other party’s public key, and the other party decrypts the received ciphertext with its own private key.

Symmetric encryption uses the same key for encryption and decryption, so it is fast. However, because the key has to be transmitted over the network, its security is not high.

Asymmetric encryption uses a pair of keys, a public key and a private key, so its security is high, but encryption and decryption are slow.

To solve this problem, HTTPS uses a mixture of symmetric encryption and asymmetric encryption.

SSL/TLS

SSL stands for Secure Sockets Layer. It was designed by Netscape in the mid-1990s.

The SSL protocol was designed to solve the problem of insecure HTTP transmission. By 1999, SSL had become the de facto standard on the Internet because of its widespread use. That was the year the IETF standardized SSL. After standardization, the name was changed to TLS, short for Transport Layer Security.

Many articles refer to these two together (SSL/TLS) because they can be seen as different phases of the same thing.

The basic idea of SSL/TLS is to adopt public key encryption (public-key encryption), which means that the client first asks the server for the public key, and then encrypts information with the public key. After receiving the ciphertext, the server decrypts it with its own private key.

But there are two problems.

  • How can we ensure that the public key has not been tampered with?

    Solution: put the public key in a digital certificate. The public key can be trusted as long as the certificate is trusted.

  • Public-key encryption requires too much computation. How can the time it consumes be reduced?

    Solution: for each session, the client and server generate a session key, which is used to encrypt the information. Because the session key is used for symmetric encryption, the operation is very fast; the server’s public key is used only to encrypt the session key itself, which reduces the time spent on encryption.

Thus, the basic process of the SSL/TLS protocol looks like this:

  1. The server sends its asymmetric-encryption public key to the client.
  2. The client encrypts the symmetric encryption key with the public key received from the server and sends the result to the server.
  3. The server decrypts the ciphertext with its own private key to obtain the symmetric encryption key.
  4. Both sides use the symmetric encryption key to encrypt and decrypt the messages they exchange.
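The four steps can be traced with a toy example. Everything in this sketch is an assumption made for illustration: the RSA key is built from two small, publicly known Mersenne primes with no padding, and the "symmetric cipher" is a simple XOR with a SHA-256 keystream. Neither is secure, and real TLS uses vetted algorithms and a more elaborate handshake. The names `symmetric_xor`, `session_key`, and the sample message are made up.

```python
import hashlib
import secrets

# Toy RSA key pair (insecure: the primes are tiny and publicly known).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent (Python 3.8+)

def symmetric_xor(key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 keystream.
    The same call encrypts and decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(f"{key}:{counter}".encode("ascii")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# 1. The server sends its public key to the client.
n, e = N, E

# 2. The client picks a symmetric session key and encrypts it with the public key.
session_key = secrets.randbelow(n - 2) + 2
wrapped_key = pow(session_key, e, n)

# 3. The server recovers the session key with its private key.
recovered_key = pow(wrapped_key, D, N)

# 4. Both sides now encrypt and decrypt traffic with the fast symmetric key.
message = b"GET /account HTTP/1.1"
ciphertext = symmetric_xor(session_key, message)      # client encrypts
plaintext = symmetric_xor(recovered_key, ciphertext)  # server decrypts
print(plaintext)
```

Only the short session key passes through the slow public-key operation; the bulk data uses the fast symmetric one.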

HTTPS has a “handshake” before the request.

The key used to encrypt the data is agreed on during the handshake. During the handshake, the website sends its SSL certificate to the browser. The SSL certificate is similar to an ID card: it is the HTTPS website’s proof of identity. Data encrypted with the certificate’s public key can be decrypted only with the private key that was generated when the certificate was applied for. Before generating the key, the browser therefore checks that the domain name being visited matches the domain name bound to the certificate, and also validates the authority that issued the certificate. If validation fails, the browser displays a certificate error.
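Python’s standard `ssl` module performs the same two checks when a client uses the default context. The sketch below only inspects the context settings; the helper `open_verified` is a hypothetical wrapper that shows where the checks happen, and it is not called here because it needs Internet access.

```python
import socket
import ssl

# The default client context validates the issuing authority and the domain name.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # certificate chain must validate
print(context.check_hostname)                    # name must match the certificate

def open_verified(host: str, port: int = 443) -> dict:
    """Hypothetical helper: run the handshake and return the peer certificate.
    Raises ssl.SSLCertVerificationError if either check fails."""
    with socket.create_connection((host, port), timeout=5) as sock:
        # server_hostname is the name compared against the certificate.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```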

Certificates

In fact, we use many types of certificates, and SSL certificates are just one of them. The certificate format is defined by the X.509 standard. An SSL certificate carries a public key and is a Public Key Infrastructure (PKI) certificate.

The common certificates are as follows:

  1. SSL certificates, used to encrypt HTTP, which then becomes HTTPS.
  2. Code signing certificates, used to sign binaries such as Windows kernel drivers, Firefox plug-ins, Java code, and so on.
  3. Client certificates, used to encrypt messages.
  4. Two-factor certificates, the type used in the USB keys for e-Banking Pro.

These certificates are issued by a certified Certificate Authority (CA). The types of certificates that can be applied for differ between enterprises and individuals, and prices vary. Certificates issued by a trusted CA are themselves trusted. For an SSL certificate, if the website being visited matches the website bound to the certificate, the browser passes the authentication without showing an error.

HTTP + Encryption + Authentication + Integrity Protection = HTTPS

Why does the server send the certificate to the client

So many services on the Internet use certificates to prove their identity that the client (operating system, browser, etc.) cannot have all of those certificates built in. Each server therefore has to send its own certificate to the client.

Why does the client validate the received certificate

Man-in-the-middle attack

Client <------------ Attacker <------------ Server
          (forges certificate to intercept request)

How does the client verify the received certificate

To answer this question, we need to introduce Digital Signature.

+---------------------+
| A digital signature |
| (not to be confused |
| with a digital      |
| certificate)        |          +----------------+                            +-------------------+
| is a mathematical   |--hash--> | message digest |--private key encryption--> | digital signature |
| technique used      |          +----------------+                            +-------------------+
| to validate the     |
| authenticity and    |
| integrity of a      |
| message, software   |
| or digital document.|
+---------------------+

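A digital signature is generated by hashing the message and then encrypting the resulting digest with the sender’s private key.

Suppose messages are passed between Bob, Susan, and Pat. Susan sends a message to Bob together with its digital signature. When Bob receives it, he hashes the message himself, decrypts the signature with Susan’s public key, and compares the two digests. If they match, the message really came from Susan and was not altered on the way.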

+---------------------+
| A digital signature |
| (not to be confused |
| with a digital      |
| certificate)        |                            +----------------+
| is a mathematical   |-----------hash-----------> | message digest |
| technique used      |                            +----------------+
| to validate the     |                                    |
| authenticity and    |                                 compare
| integrity of a      |                                    |
| message, software   |                                    |
| or digital document.|                                    |
+---------------------+                                    |
+-------------------+                              +----------------+
| digital signature |---public key decryption----> | message digest |
+-------------------+                              +----------------+
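
The sign-then-verify flow can be tried out with a small Python sketch. Everything here is an assumption made for illustration: the fixed primes, the exponent 65537, and the helper names `digest`, `sign`, and `verify` are hypothetical, and real signatures use a vetted library with proper padding rather than textbook RSA.

```python
import hashlib

# Toy RSA key pair for Susan (assumption: fixed Mersenne primes, for
# illustration only; never use such keys or unpadded RSA in practice).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537                  # Susan's public key
D = pow(E, -1, (P - 1) * (Q - 1))    # Susan's private exponent


def digest(message: bytes) -> int:
    """Hash the message and reduce the digest so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Susan: encrypt the message digest with the private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Bob: decrypt the signature with the public key and compare digests."""
    return pow(signature, E, N) == digest(message)


message = b"Hello Bob, this is Susan."
signature = sign(message)
```

Calling `verify(message, signature)` succeeds, while verifying any altered message against the same signature fails, because the recomputed digest no longer matches the decrypted one.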

That assumes, of course, that Bob knows Susan’s public key. More importantly, the public key, like the message itself, cannot be sent directly to Bob over an insecure network. This is where Certificate Authorities (CAs) come in. There are not many CAs, and Bob’s client has the certificates of all trusted CAs built in. The CA digitally signs Susan’s public key (and other information) to generate a certificate.

After Susan sends the certificate to Bob, Bob verifies the certificate’s signature using the public key in the CA’s certificate.

Bob trusts the CA and the CA trusts Susan, so Bob trusts Susan. This is how a chain of trust is formed.

In fact, Bob’s client has the CA’s root certificate built in. In HTTPS, the server sends the certificate chain to the client.

How HTTPS works

  1. The Client accesses the Server using an HTTPS URL and requests an SSL connection with the Server.
  2. The Server returns the pre-configured public key certificate to the Client.
  3. The Client authenticates the public key certificate. It checks, for example, whether the certificate is within its validity period, whether the certificate matches the site requested by the Client, whether the certificate appears in the certificate revocation list (CRL), and whether its parent certificate is valid. The last check is recursive and continues until the Root certificate (built into the operating system or into the Client) is verified. If verification passes, the process continues; if it fails, a warning message is displayed.
  4. The Client uses a pseudo-random number generator to generate a symmetric key for encryption, encrypts the symmetric key with the public key from the certificate, and sends the encrypted key to the Server.
  5. The Server decrypts the message using its own private key to obtain the symmetric key. At this point, both the Client and Server have the same symmetric key.
  6. The Server encrypts plaintext A with the symmetric key and sends it to the Client.
  7. The Client decrypts the ciphertext of the response using the symmetric key and obtains plaintext A.
  8. The Client initiates an HTTPS request again and encrypts plaintext B using the symmetric key. The Server decrypts the ciphertext using the symmetric key to obtain plaintext B.
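
As a rough illustration of the client side of this flow, Python’s standard `ssl` module can build a context that already enforces the chain and host name checks from step 3. The helper name `fetch_peer_certificate` is hypothetical, and certificate revocation checking is not covered by this sketch.

```python
import socket
import ssl

# A default client context loads the system's trusted root certificates,
# requires a valid certificate chain, and checks the host name.
context = ssl.create_default_context()


def fetch_peer_certificate(host: str, port: int = 443) -> dict:
    """Open a TLS connection and return the server certificate as a dict.

    The handshake raises ssl.SSLCertVerificationError if the chain or the
    host name does not validate.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

The function is only defined here, not called, so the sketch runs without network access; calling it with a real host name performs the full handshake.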

The advantages of HTTPS

HTTPS is not completely secure: organizations that control root certificates and encryption algorithms can still carry out man-in-the-middle attacks. However, HTTPS is still the most secure solution in the current architecture, and it has the following benefits:

  1. HTTPS authenticates users and servers to ensure that data is sent to the right clients and servers.
  2. HTTPS is a network protocol that uses SSL and HTTP to encrypt transmission and authenticate identity. It is more secure than HTTP and protects data from theft and alteration during transmission, ensuring data integrity.
  3. HTTPS is the most secure solution under the current architecture, and while it is not absolutely secure, it significantly increases the cost of man-in-the-middle attacks.
  4. Google tweaked its search engine in August 2014, saying that “HTTPS encrypted sites will rank higher in search results than comparable HTTP sites.”

The disadvantages of HTTPS

Although HTTPS has great advantages, it also has disadvantages compared with HTTP:

  1. The HTTPS handshake is time-consuming, which lengthens page loading time by nearly 50% and increases power consumption by 10% to 20%.
  2. HTTPS connection caching is not as efficient as HTTP caching, which increases data overhead and power consumption, and existing security measures can also be affected.
  3. SSL certificates cost money, and more capable certificates cost more, so personal and small websites generally do not use them unless they need to.
  4. SSL certificates have traditionally needed to be bound to IP addresses, so multiple domain names could not share the same IP address, and IPv4 resources cannot support such consumption. The Server Name Indication (SNI) extension, supported by modern clients, lifts this restriction.
  5. HTTPS protects only a limited scope: it does little against hacker attacks, denial-of-service attacks, and server hijacking. Most importantly, the SSL certificate trust chain is not absolutely secure: where a country can control a CA root certificate, man-in-the-middle attacks remain feasible.

Switch from HTTP to HTTPS

If you need to switch your website from HTTP to HTTPS, how do you do that?

You need to change all the links in your page, such as JS, CSS, images, etc., from HTTP to HTTPS. For example, change www.baidu.com to https://www.baidu….

BTW, even when switching from HTTP to HTTPS, it is recommended to keep HTTP available. To stay compatible with both during the switch, remove the scheme (http: or https:) from the links in the page, so that each link automatically follows the scheme of the page that contains it. For example, change www.baidu.com to //www.baidu….
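
How a scheme-less link is resolved can be checked with Python’s `urllib.parse.urljoin`, which follows the same URL resolution rules as browsers. The host names below are placeholders chosen for the example.

```python
from urllib.parse import urljoin

# A link without a scheme inherits the scheme of the page that contains it.
link = "//cdn.example.com/app.js"

over_http = urljoin("http://www.example.com/index.html", link)
over_https = urljoin("https://www.example.com/index.html", link)

print(over_http)   # http://cdn.example.com/app.js
print(over_https)  # https://cdn.example.com/app.js
```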

Cookie/Session

Cookie

What is a Cookie and how is it used?

Because HTTP is a stateless protocol, if there is no mechanism to save the user access status when a client accesses a Web application through a browser, the operation of the application cannot be continuously traced. For example, when a user adds an item to the cart, the Web application must keep the cart state while the user browses other items so that the user can continue adding items to the cart.

Cookies are a browser storage mechanism that can be used to maintain a session between the client and the server. Because sessions are discussed next, it is worth emphasizing that a cookie stores the session information on the client, whereas a session stores it on the server.

The common login case illustrates how cookies are used:

  1. First, the user initiates a login request to the server from the client browser.
  2. After a successful login, the server returns the logged-in user’s information to the client browser in a cookie.
  3. When the client browser receives the response containing the cookie, it saves the cookie locally (either in memory or on disk, depending on usage).
  4. On subsequent visits to the Web application, the client browser sends the local cookie along, so that the server can retrieve the user information from the cookie.
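
The two halves of this exchange can be sketched with Python’s standard `http.cookies` module. The cookie name `user` and the value `alice` are made up for the example, and no real HTTP traffic is involved.

```python
from http.cookies import SimpleCookie

# Step 2: the server builds a Set-Cookie header for the login response.
response_cookie = SimpleCookie()
response_cookie["user"] = "alice"
response_cookie["user"]["path"] = "/"
response_cookie["user"]["httponly"] = True
set_cookie_header = response_cookie.output(header="Set-Cookie:")

# Step 4: on a later request the browser sends "Cookie: user=alice",
# and the server parses the header value to recover the user.
request_cookie = SimpleCookie()
request_cookie.load("user=alice")
user = request_cookie["user"].value
```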

Session

What is a session and what are the mechanisms for implementing it?

A session is a mechanism for maintaining a session between the client and the server. However, unlike cookies, which store session information locally on the client, a session stores the session information on the server.

We also use the login case as an example to explain the process of using session:

  1. First, the user initiates a login request in the client browser
  2. After a successful login, the server saves the user information on the server and returns a unique session id to the client browser.
  3. The client browser stores this unique session identifier
  4. The client browser will carry the unique session identifier with it when accessing the Web application later, so that the server can find user information based on this unique identifier.
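
A minimal sketch of this flow keeps the user information in a server-side dictionary and hands only a random identifier to the browser. The names `sessions`, `login`, and `current_user` are hypothetical, and a real application would also expire sessions.

```python
import secrets

# Server-side session store: session id -> user information.
sessions = {}


def login(username: str) -> str:
    """Step 2: save the user on the server and return a unique session id."""
    session_id = secrets.token_urlsafe(32)
    sessions[session_id] = {"username": username}
    return session_id


def current_user(session_id: str):
    """Step 4: look up the user from the id the browser sent back."""
    return sessions.get(session_id)


session_id = login("alice")
```

Only the opaque identifier ever leaves the server; a forged or unknown identifier simply finds no entry.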

If you return a unique session identifier to the client browser and save it for future access, isn’t that a cookie?

Yes. A session is just a session mechanism, and in many Web applications it is implemented on top of cookies. That is, it only uses the cookie to carry the session identifier, not to hold the session data. The session data itself is kept on the server, unlike the cookie mechanism, which keeps the session data on the client.

Further, a session is a mechanism for maintaining a session between a server and a client, and it can be implemented in different ways. Take the popular mini programs as an example of one way to implement a session:

  1. First, after the user logs in, the login information needs to be saved on the server; here we can use Redis. For example, a userToken is generated for the user and stored in Redis with the userId as the key and the userToken as the value, and the userToken is returned to the mini program.
  2. The mini program receives the userToken, caches it, and sends it along whenever it accesses the back-end service.
  3. In subsequent requests, the server only needs to compare the userToken sent by the mini program with the userToken in Redis to determine the user’s login status.
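
The token comparison can be sketched as follows. To keep the example self-contained, a small in-memory class stands in for Redis; `FakeRedis`, `login`, and `is_logged_in` are hypothetical names, and a real service would use a Redis client and set an expiry on each key.

```python
import hmac
import secrets


class FakeRedis:
    """In-memory stand-in for Redis (assumption: only get/set are needed)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


store = FakeRedis()


def login(user_id: str) -> str:
    """Step 1: generate a userToken, store it under the userId, return it."""
    user_token = secrets.token_hex(16)
    store.set(user_id, user_token)
    return user_token


def is_logged_in(user_id: str, user_token: str) -> bool:
    """Step 3: compare the token sent by the client with the stored one."""
    saved = store.get(user_id)
    return saved is not None and hmac.compare_digest(saved, user_token)


user_token = login("user-42")
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not reveal how many leading characters of a guessed token were correct.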

Differences between session and cookie

  1. Cookies are a storage mechanism provided by browsers that can be used to maintain a session between a client and a server.
  2. A session is a mechanism for maintaining a session between a client and a server, implemented either through cookies or by other means.
  3. If the session is implemented with cookies, only the session identifier is saved in the client browser.
  4. The session data provided by the session mechanism is stored on the server.

Cookies and sessions are both used to track the identity of a browser user, but they are used in different scenarios.

Cookies are typically used to store user information, for example:

  1. We save the logged-in user’s information in a cookie. The next time you visit the website, the page can automatically fill in some basic login information for you.
  2. Cookies can also keep you logged in, so that you do not need to log in again the next time you visit the website. This works because a token can be stored in the cookie when the user logs in; on the next visit, the server only needs to look up the user by the token value (for security reasons, the token is usually regenerated on each new login).
  3. After logging in, you do not need to log in again to access other pages of the website.

The main purpose of a Session is to record user status through the server. A typical scenario is a shopping cart. When you add an item to the cart, the system doesn’t know which user is doing it because the HTTP protocol is stateless. After creating a specific Session for a particular user, the server can identify that user and keep track of that user.

Cookies are stored on the client, while sessions are stored on the server, so sessions are relatively more secure. If you must store sensitive information in a cookie, do not write it in plain text. It is better to encrypt the information and decrypt it on the server when it is needed.

DNS protocol

Domain Name System (DNS)

The domain name space was designed to provide a hierarchical name space.

The information in the domain name space is distributed across multiple hosts called DNS servers, which are themselves organized hierarchically.

  • The root server

    The root server typically does not store any information about domains itself. Instead, it delegates its authority to other servers and keeps references to those servers. There are multiple root servers, each of which covers the entire domain name space. They are distributed all over the world.

  • The primary server

    Stores the file for the zone it governs. It is responsible for creating, maintaining, and updating the zone file, which it stores on local disk.

  • The secondary server

    Neither creates nor updates zone files. It obtains its zone file from the primary server or from another secondary server. If updates are required, they must be made by the primary server.

Name and address resolution

DNS was designed as a client-server program. A host that needs to map addresses to names or names to addresses invokes a DNS client called a resolver. The resolver accesses the nearest DNS server and issues a mapping request. If the DNS server has this information, it satisfies the resolver’s request. Otherwise, the server either refers the resolver to another server or asks another server to provide the information.

Iterative resolution

When the root DNS server receives an iterative query from the local DNS server, it either returns the IP address being queried or tells the local DNS server which top-level domain DNS server to query next. The local DNS server repeats this step with each server it is referred to until it obtains the IP address of the domain name, and then returns the result to the host that initiated the query.
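
The referral loop can be simulated without any network access. In the sketch below the three "servers", their names, and the address 192.0.2.10 are invented for illustration; real resolution exchanges DNS messages with actual name servers.

```python
# Hypothetical in-memory servers: each one either knows the final answer
# or refers the caller to the next server for a given name suffix.
SERVERS = {
    "root": {"referrals": {"com": "tld-com"}, "answers": {}},
    "tld-com": {"referrals": {"example.com": "ns-example"}, "answers": {}},
    "ns-example": {"referrals": {}, "answers": {"www.example.com": "192.0.2.10"}},
}


def query(server: str, name: str):
    """Ask one server: returns an answer, a referral, or not_found."""
    zone = SERVERS[server]
    if name in zone["answers"]:
        return ("answer", zone["answers"][name])
    for suffix, next_server in zone["referrals"].items():
        if name == suffix or name.endswith("." + suffix):
            return ("referral", next_server)
    return ("not_found", None)


def resolve_iteratively(name: str):
    """Local DNS server: follow referrals from the root until answered."""
    server, path = "root", []
    while True:
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "not_found":
            return None, path
        server = value  # referral: ask the next server ourselves


address, path = resolve_iteratively("www.example.com")
```

Here `path` records which servers were asked, in order: the root, then the top-level domain server, then the authoritative server that finally returns the address.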

To improve efficiency, a cache stores recently resolved mappings. When the same client or another client asks for the same mapping, it can be answered quickly from the cache. The cache periodically removes mappings whose TTL has expired.
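
A TTL-based cache of this kind might look like the sketch below. The class name `DnsCache` and the injectable `clock` parameter are assumptions made for the example, and expired entries are dropped lazily at lookup time here rather than by a periodic sweep.

```python
import time


class DnsCache:
    """Maps a domain name to (address, expiry time) and honours the TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the expiry can be tested
        self._entries = {}       # name -> (address, expires_at)

    def put(self, name: str, address: str, ttl: float) -> None:
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name: str):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]   # TTL expired: forget the mapping
            return None
        return address
```

A resolver would call `get` first and fall back to a real query, followed by `put`, only when the cache returns `None`.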