"HTTP and HTTPS" dry 1.2W words [absolute value]

Back home it has started to snow by now, while here in Shenzhen I get a friendly reminder: "Take care of yourself, and be careful not to get heatstroke!" (Have some dignity, Shenzhen, it's winter…)

This blog is mainly about the practical content in the book Illustrated HTTP, covering HTTP and HTTPS. The series may run for about half a month. I hope it helps you with the networking problems you find difficult and with the awkward moments you may face in interviews. You are welcome to follow and like the post; with your support I will keep posting more practical content to quench your thirst!

Overview

There are not many books on network protocols. Two well-known ones are HTTP: The Definitive Guide and TCP/IP Illustrated. Both are certainly authoritative, but beginners will find them hard to follow and hard to learn from. You can also read another blog post I wrote about networking.

Illustrated HTTP is easy to understand. It presents each case to the reader vividly and makes the reading feel less dry.

This blog post revisits Illustrated HTTP and gives a fairly thorough explanation of HTTP and HTTPS. With it, you will not need to spend time reading the book itself, and you can use the time you save to summarize and learn other things. Here we go:

1. Network Basics

The Web uses a protocol named HTTP (HyperText Transfer Protocol) as the specification for the whole series of steps from the client to the server. A protocol is an agreed set of rules. In other words, the Web communicates on top of the HTTP protocol.

1.1 TCP/IP, the Foundation of the Network

Commonly used networks operate on the basis of the TCP/IP protocol family, and HTTP is one member of that family.

1.1.1 TCP/IP protocol family

For computers and network devices to communicate with each other, both sides must follow the same rules. An important aspect of the TCP/IP protocol family is layering. It is divided into four layers: the application layer, the transport layer, the network layer, and the data link layer.

The application layer

The application layer covers the communication involved in providing application services to users. The TCP/IP protocol family includes a range of common application services. FTP and DNS are two of them, and HTTP also belongs to this layer.

The transport layer

The transport layer provides the application layer above it with data transfer between two computers on a network connection. It has two different protocols: TCP (Transmission Control Protocol) and UDP (User Datagram Protocol).

The network layer

The network layer handles the packets that flow over the network. A packet is the smallest unit of data transmitted over a network. This layer determines the path by which a packet reaches the other computer and delivers the packet to it. When traffic passes through several computers or network devices on the way, the role of the network layer is to choose one transmission route among the many options.

The link layer

The link layer handles the hardware side of connecting to the network. This includes the operating system, hardware device drivers, the NIC (Network Interface Card, commonly known as a network card), and physically visible parts such as optical fiber. Everything in the hardware category falls within the scope of the link layer.

1.1.2 TCP/IP Communication Flow

When the TCP/IP protocol family is used for network communication, data passes through the layers in order. The sender works down from the application layer, and the receiver works up from the link layer.

Using HTTP as an example: first, the client, as the sender, issues an HTTP request at the application layer to view a Web page.

Then, to make transmission easier, the transport layer (TCP) splits the data received from the application layer (the HTTP request message) into segments, marks each segment with a sequence number and port number, and forwards it to the network layer.

The network layer (IP) adds the MAC address of the communication destination and forwards the data to the link layer.

The server on the receiving end receives data at the link layer and sends it to the upper layer in sequence, all the way to the application layer. HTTP requests sent by clients are received only when they are transmitted to the application layer.
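The sender-down, receiver-up flow described above can be sketched as a toy model in Python. This is only an illustration: the bracketed header strings are made-up placeholders, not real TCP, IP, or Ethernet header formats, and the function names are hypothetical.

```python
# Toy model of TCP/IP encapsulation: each layer wraps the payload from the
# layer above with its own header, and the receiver unwraps in reverse order.
# The bracketed header strings are illustrative placeholders, not real formats.
LAYERS = ["TCP", "IP", "Ethernet"]  # transport, network, link

def encapsulate(http_message: str) -> str:
    """Sender side: go down the stack, prepending one header per layer."""
    data = http_message
    for layer in LAYERS:
        data = f"[{layer} header]{data}"
    return data

def decapsulate(frame: str) -> str:
    """Receiver side: go up the stack, stripping the outermost header first."""
    data = frame
    for layer in reversed(LAYERS):
        prefix = f"[{layer} header]"
        if not data.startswith(prefix):
            raise ValueError(f"expected {layer} header")
        data = data[len(prefix):]
    return data

request = "GET /index.html HTTP/1.1\r\nHost: hackr.jp\r\n\r\n"
frame = encapsulate(request)
print(frame.split("GET")[0])  # [Ethernet header][IP header][TCP header]
```

The application layer on the receiving side only sees the original HTTP request once every lower layer has removed its own header.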

The whole flow chart is as follows:

1.2 DNS, the Service Responsible for Domain Name Resolution

The DNS service is an application-layer protocol like HTTP. It provides domain name to IP address resolution service.
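As a small illustration of name-to-address resolution, the sketch below asks the operating system's resolver for the addresses of a host name using Python's standard socket module. The helper name resolve is hypothetical, and localhost is used only so that the example does not depend on an external network.

```python
import socket

def resolve(hostname: str) -> list[str]:
    """Return the distinct IP addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})

addresses = resolve("localhost")
print(addresses)
```

Replacing "localhost" with a real domain name would trigger an actual DNS lookup through whatever resolver the machine is configured to use.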

1.3 Relationship between Protocols and HTTP

The following figure shows the role each protocol plays in HTTP communication. Imagine visiting the Web page hackr.jp/xss/.

2. The Simple HTTP Protocol

This section mainly describes the structure of the HTTP protocol, mainly using HTTP/1.1.

2.1 Communication is achieved through the exchange of request and response

Here’s a concrete example:

The GET at the beginning of the line indicates the type of access requested from the server and is called the method. The string /index.html that follows identifies the resource to be accessed. The final HTTP/1.1 is the HTTP version number and tells the server which HTTP protocol features the client uses. Taken together, this is a request for access to the /index.html resource on an HTTP server.

The request message is composed of the request method, request URI, protocol version, optional request header fields, and the content entity.

The response message consists of the protocol version, status code, reason phrase, optional response header fields, and the entity body.
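To make the structure of the first line of each message concrete, here is a minimal Python sketch that splits a request line and a status line into their parts. The function and class names are hypothetical, and the sketch ignores header fields and bodies.

```python
from typing import NamedTuple

class RequestLine(NamedTuple):
    method: str
    uri: str
    version: str

def parse_request_line(line: str) -> RequestLine:
    """Split 'METHOD URI VERSION' into its three space-separated parts."""
    parts = line.strip().split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, uri, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected protocol version: {version!r}")
    return RequestLine(method, uri, version)

def parse_status_line(line: str) -> tuple[str, int, str]:
    """Split 'VERSION CODE REASON'; the reason phrase may contain spaces."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason

print(parse_request_line("GET /index.html HTTP/1.1"))
print(parse_status_line("HTTP/1.1 200 OK"))
```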

2.2 HTTP is a protocol that does not save state

HTTP is a stateless protocol: the protocol itself does not keep any state of the communication between requests and responses. In other words, at the protocol level, HTTP does not persist requests or responses that have already been sent. Here is a diagram to make this easier to understand.

HTTP/1.1 is a stateless protocol, but cookie technology was introduced to provide the desired state-keeping functionality. With cookies, state can be managed.

2.3 HTTP Request Methods

2.3.1 GET

The GET method requests access to the resource identified by a URI. The server parses the specified resource and returns the response content. That is, if the requested resource is text, it is returned as is; if it is a program such as a CGI script, the output of executing it is returned.

2.3.2 POST

The POST method is used to transfer an entity body.

Although GET can also carry an entity body, it is generally not used that way. POST is functionally similar to GET, but its primary purpose is not to retrieve the response body.

2.3.3 PUT

The PUT method is mainly used to transfer files.

As with uploading a file over FTP, the file content is placed in the body of the request message and saved to the location specified by the request URI. However, the PUT method in HTTP/1.1 has no authentication mechanism of its own, so anyone could upload files, which is a security problem. Web sites whose architecture follows the REST style may still open up the PUT method.

An example of request-response using the PUT method

2.3.4 HEAD

HEAD: obtain the message headers

The HEAD method is the same as the GET method, except that the response does not include the message body. It is used to verify the validity of a URI and to check the date and time a resource was updated.

An example of request-response using the HEAD method

2.3.5 DELETE

The DELETE method is used to delete files. It is the opposite of PUT: it deletes the resource specified by the request URI.

An example of request-response using the DELETE method

2.3.6 OPTIONS

The OPTIONS method is used to query the supported methods for the resource specified by the request URI.

An example of request-response using the OPTIONS method

2.3.7 TRACE

The TRACE method has the Web server loop the request it received back to the client.

With TRACE, the client can check how the request it sent was processed or modified along the way. A request headed for the origin server may be relayed through proxies, and the TRACE method is used to confirm the sequence of operations that took place along the connection.

2.3.8 CONNECT

CONNECT: request a tunnel connection through a proxy

The CONNECT method asks the proxy server to establish a tunnel so that TCP communication can be carried out through it. It is mainly used to send traffic encrypted with the SSL and TLS protocols through the network tunnel.

The CONNECT method format is as follows:

An example of request-response using the CONNECT method

2.4 Cookie state management

HTTP is a stateless protocol: it does not manage the state of previous requests and responses, so a request cannot be processed based on what happened before. Cookie technology was introduced to resolve this tension while keeping the protocol itself stateless. Cookies control client state by carrying cookie information in request and response messages.

The server tells the client to save a cookie through the Set-Cookie header field in its response message. The next time the client sends a request to that server, it automatically adds the cookie value to the request message.

1. The first request [no Cookie information]

2. The second request [with Cookie information]

The above shows the situation of Cookie interaction. The contents of HTTP request packets and response packets are as follows:

Request message [status without Cookie information]
Response message [Cookie information generated by the server]
Request message [automatically sends saved Cookie information]
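The three messages listed above can be imitated with Python's standard http.cookies module. This is a sketch of the client-side bookkeeping only: the cookie name sid and its value are made up for illustration, and the helper names are hypothetical.

```python
from http.cookies import SimpleCookie

def cookies_from_response(set_cookie_value: str) -> dict[str, str]:
    """Client side: extract name/value pairs from one Set-Cookie header value."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    return {name: morsel.value for name, morsel in jar.items()}

def cookie_header(stored: dict[str, str]) -> str:
    """Client side: build the Cookie request header value from the stored pairs."""
    return "; ".join(f"{name}={value}" for name, value in stored.items())

# The response from the server carries Set-Cookie (the sid value is made up).
stored = cookies_from_response("sid=1342077140226724; path=/")
# The next request automatically carries the saved value back.
print("Cookie: " + cookie_header(stored))  # Cookie: sid=1342077140226724
```

Attributes such as path are kept by the client to decide when to send the cookie; only the name=value pair travels back in the Cookie header.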

3. HTTP packet information

3.1 HTTP message

The information used for HTTP interaction is called HTTP packets. HTTP packets sent by the requesting end (client) are called request packets, and those sent by the responding end (server) are called response packets.

HTTP messages are broadly divided into a header and a body, separated by the first blank line (CR+LF). A message body is not always required.
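The header/body split can be shown in a few lines of Python. This is a minimal sketch with a hand-written response; the function name split_message is hypothetical, and real parsers handle many more details (for example, Content-Length and chunked bodies).

```python
def split_message(raw: bytes) -> tuple[list[str], bytes]:
    """Split an HTTP message into its header lines and its (possibly empty) body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # the first blank line ends the header
    return head.decode("iso-8859-1").split("\r\n"), body

raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>hello</p>\n"
)
header_lines, body = split_message(raw)
print(header_lines[0])  # HTTP/1.1 200 OK
print(len(body))        # 13
```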

3.2 Request Packets and Response Packets

3.2.1 Request Message

3.2.2 Response Packets

4. HTTP Status Codes

HTTP status codes express the result returned for the client's HTTP request, indicating whether the server processed it normally or reporting that an error occurred.

4.1 Status Codes Report the Result Returned by the Server

The status code is responsible for describing the returned request results when the client sends a request to the server. The status code lets the user know whether the server handled the request normally or if an error occurred.

Category of the status code
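The first digit of a status code determines its class. The sketch below encodes the five standard classes in Python; the short descriptions are my own paraphrases, and the function name is hypothetical.

```python
# The five status code classes, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: ("Informational", "the request was received and is being processed"),
    2: ("Success", "the request was processed normally"),
    3: ("Redirection", "further action is needed to complete the request"),
    4: ("Client Error", "the server cannot process the request as sent"),
    5: ("Server Error", "the server failed while processing the request"),
}

def status_class(code: int) -> str:
    """Map a three-digit status code to the name of its class."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return STATUS_CLASSES[code // 100][0]

for code in (200, 204, 301, 404, 503):
    print(code, status_class(code))
```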

4.2 2XX Success

A 2XX response indicates that the request was processed normally.

4.2.1 200 OK

This status code indicates that the server processed the client's request normally. In the response message, the information returned along with the status code depends on the method. For example, when the GET method is used, the entity of the requested resource is returned as the response. When the HEAD method is used, only the headers for the requested resource are returned, without a message body.

4.2.2 204 No Content

This status code indicates that the server received and successfully processed the request, but the response message contains no entity body, and returning one is not allowed. For example, when a 204 response comes back after the browser sends a request, the page displayed by the browser is not updated.

4.2.3 206 Partial Content

This status code indicates that the client made a range request and the server successfully executed that partial GET request. The response message contains the entity content for the range specified by Content-Range.

4.3 3XX Redirection

The 3XX response results indicate that the browser needs to perform some special processing to properly handle the request.

4.3.1 301 Moved Permanently

Permanent redirect. This status code indicates that the requested resource has been assigned a new URI and that the URI to which the resource now refers should be used later. That is, if the URI corresponding to the resource is already bookmarked, it should be saved again as indicated in the Location header field.

4.3.2 302 Found

Temporary redirection. This status code indicates that the requested resource has been assigned a new URI and that the user is expected to access it through the new URI for now. It is similar to 301 Moved Permanently, but the resource represented by a 302 status code has moved only temporarily, not permanently. In other words, the URI the resource has moved to may change again in the future. For example, if the user has saved the URI as a bookmark, the bookmark is not updated as it would be for a 301 status code; it keeps the URI of the page that returned the 302 status code.

4.3.3 303 See Other

This status code indicates that, because another URI exists for the requested resource, the GET method should be used to retrieve it from that URI.

The 303 status code has the same function as 302 Found, but differs in that it explicitly states that the client should use the GET method to obtain the resource.

4.3.4 304 Not Modified

This status code indicates that the client sent a conditional request and the server would allow access to the resource, but the condition was not met. A 304 response does not contain any response body. Although 304 is classified under 3XX, it has nothing to do with redirection.

4.4 4XX Client Errors

The 4XX response results indicate that the client is the cause of the error.

4.4.1 400 Bad Request

This status code indicates that the request message contains a syntax error. When it occurs, the request content must be corrected and sent again. In addition, the browser treats this status code the same way as 200 OK.

4.4.2 401 Unauthorized

This status code indicates that the request requires authentication information through HTTP authentication (BASIC or DIGEST). If a request has already been made once, it indicates that user authentication failed.

4.4.3 403 Forbidden

This status code indicates that access to the requested resource was denied by the server. It is not necessary for the server to give a detailed reason for the rejection, but if it is desired, the reason can be described in the body of the entity so that the user can see it.

4.4.4 404 Not Found

This status code indicates that the requested resource could not be found on the server. In addition, it can be used when the server rejects the request without giving a reason.

4.5 5XX Server Error

A 5XX response indicates that an error occurred on the server itself.

4.5.1 500 Internal Server Error

This status code indicates that an error occurred on the server side while executing the request. It could also be a Web application bug or some temporary glitch.

4.5.2 503 Service Unavailable

This status code indicates that the server is temporarily overloaded or down for maintenance and cannot process requests at the moment. If it is known in advance how long this will last, it is best to write that into the Retry-After header field and return it to the client.

5. HTTP Headers

5.1 HTTP Message Headers

HTTP request and response packets must contain the HTTP header, which provides information for the client and server to process the request and response respectively.

HTTP request packet

In a request, the HTTP message consists of the method, URI, HTTP version, and HTTP header fields.

The following example is the header of a request message sent when accessing hackr.jp.

HTTP response packet

In the response, the HTTP packet consists of the HTTP version, status code (number and reason phrase), and HTTP header field.

The following example is the header of the response message returned when hackr.jp/ was requested earlier.

Among many fields in packets, the HTTP header field contains the most abundant information. The header field exists in both the request and response packets and contains information related to HTTP packets.

5.2 HTTP header Fields

5.2.1 Transmitting Important Information in the HTTP header Field

The HTTP header field is one of the elements that make up an HTTP message. In HTTP communication between client and server, header fields are used in both requests and responses to pass additional important information. They give the browser and the server information such as the size of the message body, the language used, and authentication information.

5.2.2 HTTP header Field Structure

HTTP header fields consist of header field names and field values separated by colons (:).

Header field name: field value

For example, the Content-Type field in the HTTP header indicates the media type of the message body.

Content-Type: text/html

In the example above, the header field name is Content-Type and the string text/html is the field value.
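The name-colon-value structure is simple enough to parse by hand. The Python sketch below splits one header line at the first colon; the function name is hypothetical, and lowercasing the name is a design choice that reflects the fact that header field names are case-insensitive.

```python
def parse_header_field(line: str) -> tuple[str, str]:
    """Split 'Name: value' at the first colon and trim surrounding whitespace."""
    name, colon, value = line.partition(":")
    if not colon or not name.strip():
        raise ValueError(f"not a header field: {line!r}")
    # Field names are case-insensitive, so normalize them for lookups.
    return name.strip().lower(), value.strip()

print(parse_header_field("Content-Type: text/html"))  # ('content-type', 'text/html')
```

Splitting only at the first colon matters because field values such as dates contain colons themselves.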

5.2.3 Four Types of HTTP Header Fields

HTTP header fields are classified into the following four types according to how they are actually used.

General Header Fields: headers used by both request and response messages.
Request Header Fields: headers used when the client sends a request message to the server. They supplement the request with additional information, information about the client, the priority of response content, and so on.
Response Header Fields: headers used when the server returns a response message to the client. They supplement the response with additional information and may also ask the client to supply additional information.
Entity Header Fields: headers used for the entity part of request and response messages. They supplement entity-related information such as when the resource content was updated.

5.2.4 HTTP/1.1 Header Field Overview

1. General header field

2. Request header field

3. Response header field

4. Entity header field

5.3 HTTP/1.1 General Header Fields

General header fields are the headers used by both request and response messages.

5.3.1 Cache-Control

You can control how caching works by specifying directives in the Cache-Control header field.

Directive parameters are optional, and multiple directives are separated by commas. Cache-Control directives can be used in both requests and responses.

Cache-Control: private, max-age=0, no-cache 
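A Cache-Control value like the one above can be broken into its directives with a short Python function. This is a simplified sketch: the function name is hypothetical, and it splits on every comma, so it would mishandle quoted arguments that themselves contain commas.

```python
from typing import Optional

def parse_cache_control(value: str) -> dict[str, Optional[str]]:
    """Parse a Cache-Control value into {directive: argument or None}."""
    directives: dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, equals, argument = part.partition("=")
        # Directives without '=' (such as no-cache) carry no argument.
        directives[name.strip().lower()] = argument.strip().strip('"') if equals else None
    return directives

print(parse_cache_control("private, max-age=0, no-cache"))
# {'private': None, 'max-age': '0', 'no-cache': None}
```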

Cache-Control directives at a glance

Cache request directives

Cache response directives

public directive

Cache-Control: public

Specifying the public directive makes it explicit that the cache may also be used for other users.

private directive

Cache-Control: private

When the private directive is specified, the response is intended for one specific user only, the opposite of the public directive. The cache server may cache the resource for that particular user, but for requests from other users the proxy server will not return the cached copy.

no-cache directive

Cache-Control: no-cache

The purpose of the no-cache directive is to prevent an expired resource from being returned from the cache. If a request sent by the client contains the no-cache directive, the client will not accept a cached response. The cache server in the middle must then forward the client's request to the origin server.

no-store directive

Cache-Control: no-store

Using the no-store directive implies that the request (and its corresponding response) or the response contains confidential information.

Directives specifying cache lifetime and validation

s-maxage directive

Cache-Control: s-maxage=604800 (unit: seconds)

The s-maxage directive works the same way as the max-age directive, except that it applies only to shared cache servers that serve multiple users. In other words, the directive has no effect on a cache that repeatedly returns responses to the same single user.

max-age directive

Cache-Control: max-age=604800 (unit: seconds)

When a client sends a request that contains the max-age directive, it accepts the cached resource if the time the resource has been cached is less than the specified value.

In addition, when max-age is set to 0, the cache server usually has to forward the request to the origin server. When the server returns a response containing the max-age directive, the cache server does not need to reconfirm the validity of the resource; the max-age value is the maximum length of time the resource may be kept as a cache.

min-fresh directive

Cache-Control: min-fresh=60 (unit: seconds)

The min-fresh directive asks the cache server to return a cached resource only if it will remain fresh for at least the specified time. For example, with min-fresh set to 60 seconds, a cached resource that would expire within the next 60 seconds cannot be returned as the response.

max-stale directive

Cache-Control: max-stale=3600 (unit: seconds)

The max-stale directive indicates that the client will accept a cached resource even after it has expired. If no value is given, the client accepts the response no matter how much time has passed. If a value is given, the client still accepts the response as long as it has been stale for no longer than the specified time.
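The request directives max-age, min-fresh, and max-stale can be combined into one decision: may the cache answer from its stored copy? The Python sketch below is one way to express that rule under simplifying assumptions. The function and parameter names are hypothetical, all times are in seconds, and a max-stale without a value can be modeled by passing float("inf").

```python
from typing import Optional

def cache_may_answer(age: float, lifetime: float, *,
                     max_age: Optional[float] = None,
                     min_fresh: float = 0.0,
                     max_stale: float = 0.0) -> bool:
    """Return True if a stored response satisfies the request's directives.

    age      -- seconds the response has spent in the cache
    lifetime -- seconds the response stays fresh (e.g. from the response's max-age)
    """
    if max_age is not None and age > max_age:
        return False                         # older than the client accepts
    if age <= lifetime:
        return lifetime - age >= min_fresh   # must stay fresh a while longer
    return age - lifetime <= max_stale       # stale, but within the allowance

print(cache_may_answer(30, 60))                  # True
print(cache_may_answer(30, 60, min_fresh=60))    # False
print(cache_may_answer(90, 60, max_stale=3600))  # True
print(cache_may_answer(10, 60, max_age=0))       # False
```

When the function returns False, the cache server would have to forward the request to the origin server instead of answering on its own.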

only-if-cached directive

Cache-Control: only-if-cached

Using the only-if-cached directive means that the client asks for the target resource to be returned only if the cache server has it cached locally. In other words, the directive requires the cache server not to reload the response or revalidate the resource. If the cache server has no usable local copy for the request, it returns status code 504 Gateway Timeout.

5.3.2 Connection

The Connection header field serves two purposes.

Controlling header fields that are no longer forwarded by proxies
Managing persistent connections

Controlling header fields that are no longer forwarded by proxies

Between the client sending a request and the server returning a response, the Connection header field can be used to name header fields (hop-by-hop headers) that a proxy should not forward any further.

Managing persistent connections

5.3.3 Via

The Via header field is used to track the transmission path of request and response messages between the client and the server. It is used not only to trace how messages are forwarded, but also to avoid request loops. Therefore, each proxy that a message passes through must append its own entry to this header field.

5.4 Request Header Field

The request header field is used in the request packet sent from the client to the server to supplement the additional information of the request, client information, and the priority of the response.

5.4.1 Accept

The Accept header field informs the server of the media types that the user agent can handle and the relative priority of the media types. You can specify multiple media types at once using the type/subtype form.

5.4.2 Accept-Charset

The Accept-Charset header field tells the server which character sets the user agent supports and their relative priority. Multiple character sets can be specified at once. As with the Accept header field, the quality value q can be used to indicate relative priority.
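To illustrate how q values express priority, the Python sketch below parses a comma-separated value such as the one used in Accept or Accept-Charset and sorts the entries from most to least preferred. The function name is hypothetical; an entry without an explicit q is treated as q=1.0, which is the default weight.

```python
def parse_quality_list(value: str) -> list[tuple[str, float]]:
    """Parse 'a, b;q=0.8' into [(token, q), ...] sorted by descending q."""
    items = []
    for part in value.split(","):
        token, *params = [piece.strip() for piece in part.split(";")]
        if not token:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in params:
            name, _, number = param.partition("=")
            if name.strip().lower() == "q":
                q = float(number)
        items.append((token, q))
    # sorted() is stable, so entries with equal q keep their original order.
    return sorted(items, key=lambda item: item[1], reverse=True)

print(parse_quality_list("iso-8859-5, unicode-1-1;q=0.8"))
# [('iso-8859-5', 1.0), ('unicode-1-1', 0.8)]
```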

5.4.3 Accept-Encoding

5.5 Response Header Fields

Response header fields are used in the response message the server returns to the client. They supplement the response with additional information, information about the server, and additional requirements placed on the client.

5.5.1 Accept-Ranges

Accept-Ranges: bytes

The Accept-Ranges header field tells the client whether the server can handle range requests, which ask for only part of a resource on the server. Two field values can be specified: bytes when range requests can be handled, and none when they cannot.

5.5.2 Age

Age: 600

The Age header field tells the client how long ago the origin server created the response. The field value is in seconds. If the server creating the response is a cache server, the Age value is the time that has passed since the cached response was last validated with the origin server. A proxy must add the Age header field when it creates a response.

5.5.3 Location

The Location header field directs the recipient of the response to a resource at a location different from the request URI. On receiving a response that contains Location, almost all browsers will automatically try to access the redirect target it names.

5.6 Entity header Field

The entity header field is the header used by the entity part contained in the request message and response message, and is used to supplement entity-related information such as the update time of the content.

5.6.1 Allow

Allow: GET, HEAD

The Allow header field tells the client all the HTTP methods supported by the resource identified by the Request-URI. When the server receives an HTTP method it does not support, it responds with status code 405 Method Not Allowed and also returns every supported HTTP method in the Allow header field.

5.6.2 Content-Type

Content-Type: text/html; charset=UTF-8

The Content-Type header field specifies the media type of the object in the entity body. As with the Accept header field, the field value is given in type/subtype form.

5.6.3 Expires

Expires: Wed, 04 Jul 2020 08:26:05 GMT

The Expires header field tells the client the date and time at which the resource expires. When a cache server receives a response containing Expires, it answers requests from its cache and keeps that copy of the response until the time given in the Expires field value. Once that time has passed, the cache server turns to the origin server for the resource when a request arrives.

When the origin server does not want the cache server to cache the resource, it is best to write the same time in the Expires field as in the Date header field.

5.6.4 Last-Modified

Last-Modified: Wed, 23 May 2020 09:59:55 GMT

The Last-Modified header field indicates when the resource was last modified. In general, the value is the time at which the resource identified by the Request-URI was modified. However, when dynamic data is processed, for example by a CGI script, the value may instead be the time at which the data was last modified.

5.7 Header Fields That Serve Cookies

5.7.1 Set-Cookie

Set-Cookie: status=enable; expires=Tue, 05 Jul 2020 07:26:31

When the server is about to start managing the client's state, it announces various pieces of information in advance.

Attributes of the Set-Cookie field

5.7.2 Cookie

Cookie: status=enable

The Cookie header field tells the server that, when the client wants HTTP state management support, it will include in the request the cookie it received from the server. When several cookies have been received, they can likewise be sent together.

That was a lot of theory. Hang in there: when you feel very tired, it means you are walking uphill. Keep going!

6. HTTPS: Ensuring Web Security

6.1 Disadvantages of HTTP

HTTP has the following major shortcomings:

The integrity of a message cannot be proven, so it may have been tampered with
Communication uses plaintext (no encryption), so the content can be eavesdropped on
The identity of the communicating party is not verified, so impersonation is possible

6.2 HTTP + Encryption + Authentication + Integrity Protection = HTTPS

HTTPS is often used on Web login pages and on shopping checkout pages. When communicating over HTTPS, the URL starts with https:// instead of http://. In addition, when a browser visits a Web site where HTTPS communication is in effect, a lock symbol appears in the browser's address bar. How HTTPS is displayed varies from browser to browser.

6.2.1 HTTPS Is HTTP Wrapped in an SSL Shell

HTTPS is not a new protocol at the application layer. It is simply HTTP with its communication interface replaced by the SSL (Secure Socket Layer) and TLS (Transport Layer Security) protocols. Normally, HTTP communicates directly with TCP. When SSL is used, HTTP communicates with SSL first, and SSL then communicates with TCP. In short, HTTPS is HTTP wearing the shell of the SSL protocol.

With SSL, HTTP gains the encryption, certificate, and integrity protection features of HTTPS.

SSL is independent of HTTP, so other protocols that run on the application layer, such as SMTP and Telnet, can also be used with it. SSL is the most widely used network security technology in the world today.

6.2.2 Public Key Encryption for Exchanging Keys

SSL uses a type of encryption called public-key cryptography.

Encryption and decryption both use keys. Ciphertext cannot be decrypted without the key; conversely, anyone who holds the key can decrypt it. If an attacker obtains the key, the encryption loses its meaning.

1. The dilemma of shared-key encryption

Encryption in which the same key is used to encrypt and decrypt is called shared-key encryption (common key cryptosystem), also known as symmetric-key encryption.

If the shared key is used for encryption, you must also send the key to the peer party. But how do you do it safely? When a key is forwarded over the Internet, if the communication is monitored the key can fall into the hands of an attacker, thus losing the purpose of encryption. You also have to secure the keys you receive.

Key sending problems:

2. Public-key encryption using two keys

Public-key encryption solves the difficulty of shared-key encryption.

Public-key encryption uses a pair of asymmetric keys. One is called the private key and the other the public key. As the names imply, the private key must not be known to anyone else, whereas the public key can be published freely and obtained by anyone.

With public-key encryption, the sender encrypts the message with the other party's public key. After receiving the encrypted message, the other party decrypts it with its own private key. In this way, the private key used for decryption never has to be sent, and there is no need to worry about the key being eavesdropped on and stolen by an attacker.

3. HTTPS uses the hybrid encryption mechanism

HTTPS uses a hybrid of shared-key encryption and public-key encryption. If keys could be exchanged securely, one might consider communicating with public-key encryption alone. However, public-key encryption is slower than shared-key encryption.

Therefore, the two are combined to make the most of their respective strengths: public-key encryption is used for the key exchange, and shared-key encryption is used afterwards, once communication is established, for exchanging messages.
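The hybrid idea can be sketched in Python with a deliberately tiny example. Everything here is a toy for illustration and must not be used for real security: the RSA key pair uses small textbook primes (61 and 53), the shared-key cipher is a simple XOR with a SHA-256-derived keystream, and all names are hypothetical. Real HTTPS relies on vetted TLS libraries.

```python
import hashlib
import secrets

# Toy RSA key pair built from tiny textbook primes (p=61, q=53). NOT secure.
N, E, D = 3233, 17, 2753  # public key is (N, E); private key is (N, D)

def keystream_xor(key: int, data: bytes) -> bytes:
    """Toy shared-key cipher: XOR the data with a SHA-256-derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(f"{key}:{counter}".encode()).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Client: pick a random shared key and send it encrypted with the server's public key.
shared_key = secrets.randbelow(N - 2) + 2
encrypted_key = pow(shared_key, E, N)

# Server: recover the shared key with its private key.
recovered_key = pow(encrypted_key, D, N)

# Both sides now use the fast shared-key cipher for the actual messages.
ciphertext = keystream_xor(shared_key, b"GET /index.html HTTP/1.1")
print(keystream_xor(recovered_key, ciphertext))  # b'GET /index.html HTTP/1.1'
```

Only the short shared key passes through the slow public-key step; the message itself is handled by the faster shared-key cipher, which is the point of the hybrid scheme.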

4. Certificate proving the correctness of the public key

Public-key encryption still has a problem: by itself it cannot prove that a public key is genuine. For example, when you are about to start public-key encrypted communication with a server, how do you prove that the public key you received is the one issued by the intended server? The real public key may have been replaced by an attacker in transit.

To solve this problem, we can use public key certificates issued by a digital certificate authority (CA, Certificate Authority) and its related bodies.

A digital certificate authority acts as a third party trusted by both client and server. Its business process is as follows. First, the server operator submits its public key to the digital certificate authority. After verifying the applicant’s identity, the authority digitally signs the submitted public key, then issues the signed public key bound into a public key certificate.

The server sends this public key certificate to the client so that public-key encrypted communication can take place. A public key certificate may also be called a digital certificate, or simply a certificate. The client that receives the certificate uses the certificate authority’s public key to verify the digital signature on it. Once verification succeeds, the client knows two things: first, that the authority which certified the server’s public key is a genuine, valid digital certificate authority; second, that the server’s public key is trustworthy.

The certificate authority’s public key must itself reach the client securely. Transferring it safely over the network is very difficult, so most browser vendors ship their browsers with the public keys of common certificate authorities built in.
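The sign-then-verify flow described above can be sketched as follows. This is a toy model under stated assumptions: the CA key is an insecure textbook RSA key made from two known Mersenne primes, the "certificate" is just a dictionary, and names such as `issue_certificate` and `verify_certificate` are hypothetical. Real certificates are X.509 structures checked by a TLS library.

```python
import hashlib

# Toy RSA key pair for the certificate authority (illustration only).
CA_P, CA_Q = 2**61 - 1, 2**89 - 1
CA_N = CA_P * CA_Q
CA_E = 65537                                      # CA public key, built into the browser
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))     # CA private key, known only to the CA


def digest(subject: str, server_public_key: str) -> int:
    """Hash the certificate contents down to a number smaller than the CA modulus."""
    data = f"{subject}|{server_public_key}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % CA_N


def issue_certificate(subject: str, server_public_key: str) -> dict:
    """CA side: sign the server's public key with the CA private key."""
    signature = pow(digest(subject, server_public_key), CA_D, CA_N)
    return {"subject": subject, "public_key": server_public_key, "signature": signature}


def verify_certificate(cert: dict) -> bool:
    """Client side: check the signature using only the CA public key."""
    expected = digest(cert["subject"], cert["public_key"])
    return pow(cert["signature"], CA_E, CA_N) == expected


cert = issue_certificate("example.com", "server-public-key-bytes")
forged = dict(cert, public_key="attacker-public-key-bytes")   # key swapped in transit
```

With this sketch, `verify_certificate(cert)` succeeds, while the `forged` certificate fails because the signature no longer matches the swapped key.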

5. Secure communication mechanism of HTTPS

To better understand HTTPS, let’s look at HTTPS communication steps.

Step 1: The client sends a Client Hello packet to start SSL communication. The packet contains the SSL version supported by the client and a list of cipher suites (the encryption algorithms used, key lengths, and so on).

Step 2: If the server can carry out SSL communication, it responds with a Server Hello packet. As with the client, the packet contains the SSL version and a cipher suite. The server’s cipher suite is selected from the list received from the client.

Step 3: Then the server sends a Certificate packet. The message contains a public key certificate.

Step 4: The server sends a Server Hello Done packet to notify the client that the initial phase of the SSL handshake negotiation is complete.

Step 5: After the first phase of the SSL handshake, the client responds with a Client Key Exchange packet. The packet contains a random string called the pre-master secret, which is used for encrypting the communication. The packet is encrypted with the public key from Step 3.

Step 6: The client sends a Change Cipher Spec packet. It tells the server that everything sent after this packet will be encrypted with keys derived from the pre-master secret.

Step 7: The client sends a Finished packet. The packet contains a checksum over all packets exchanged so far. Whether the handshake negotiation succeeds is judged by whether the server can correctly decrypt this packet.

Step 8: The server also sends a Change Cipher Spec packet.

Step 9: The server also sends a Finished packet.

Step 10: Once the server and client have exchanged Finished packets, the SSL connection is established, and the communication is now protected by SSL. Application layer communication starts from here, that is, the client sends an HTTP request.

Step 11: Application layer communication continues, that is, the server sends an HTTP response.

Step 12: Finally, the client closes the connection. To disconnect, it sends a close_notify packet. After this step, a TCP FIN packet is sent to close the TCP connection.

Below is a diagram of the process. It illustrates the whole process of establishing HTTPS communication when only a server-side public key certificate (server certificate) is used.

Maybe a lot of people don’t understand how the CA, the server, and the client actually work together. It took me about 2 hours to put together a picture with text for everyone.
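In practice you rarely write the handshake steps yourself; a TLS library (TLS is the successor of SSL) performs them when the socket is wrapped. A minimal sketch with Python’s standard `ssl` module follows. The function names `make_client_context` and `fetch_head` are made up for this example, and the host you pass in is up to you.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client settings: trust the system's CA certificates and verify the server."""
    # create_default_context() enables certificate and host name verification.
    return ssl.create_default_context()


def fetch_head(host: str, port: int = 443) -> bytes:
    """Open a TCP connection, run the handshake, then send one HTTP request."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as tcp:
        # wrap_socket() carries out the handshake (hello messages, certificate
        # check, key exchange) before returning.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls.sendall(request.encode("ascii"))   # application data, now encrypted
            return tls.recv(4096)
```

Calling `fetch_head` with a host name of your choice needs network access; if the server’s certificate cannot be verified against the built-in CA list, `wrap_socket` raises an `ssl.SSLError` subclass instead of returning.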

Extension 1: Is SSL slow?

HTTPS has a drawback: communication becomes slower when SSL is used.

There are two kinds of SSL slowness. One is slower communication. The other is slower processing, because a lot of CPU and memory is consumed.

Compared with plain HTTP, the network load can make communication 2 to 100 times slower. Besides establishing the TCP connection and sending HTTP requests and responses, the SSL handshake has to be carried out, which inevitably increases the overall traffic.

The other point is that SSL has to encrypt everything. Both the server and the client must encrypt and decrypt, so HTTPS consumes more of their hardware resources than HTTP does, and the load increases.

There is no fundamental solution to the speed problem, so hardware such as SSL accelerators is used to ease it. This hardware is dedicated to SSL communication and can perform SSL computation several times faster than software. The SSL accelerator handles only the SSL processing, which takes that load off the server.

Extension 2: Why not always use HTTPS?

If HTTPS is so secure, why don’t all Web sites use HTTPS all the time?

One reason is that encrypted communication consumes more CPU and memory than plain-text communication. If every communication is encrypted, a considerable amount of resources is consumed, and the number of requests a single machine can handle is bound to fall.

For heavily visited Web sites in particular, the load created by encryption should not be underestimated. So rather than encrypting all content, sites encrypt only when information needs to be hidden, in order to save resources.

Certificates are essential for HTTPS communication, and they must be purchased from a certificate authority (CA). The price varies somewhat from one authority to another. Typically, a one-year certificate costs tens of thousands of yen (10,000 yen is now about 600 yuan). Money, money, money, hey, hey, hey

Seven, authentication to confirm the identity of the visiting user

Some Web pages are meant to be viewed only by specific people, or even just by you. To achieve this, authentication is essential. Let’s learn about the authentication mechanisms.

HTTP authentication methods

BASIC authentication
DIGEST authentication
SSL client authentication
FormBase authentication (form-based authentication)

7.1 BASIC authentication

BASIC authentication is an authentication method that has been defined since HTTP/1.0, and some Web sites still use it today. It is an authentication method carried out between the Web server and the client.

Step 1: When the requested resource requires BASIC authentication, the server returns the status code 401 Authorization Required together with a response containing the WWW-Authenticate header field. This field contains the authentication scheme (BASIC) and the security realm string (realm) of the Request-URI.

Step 2: The client that receives status code 401 must send its user ID and password to the server to pass BASIC authentication. The string it sends consists of the user ID and the password joined by a colon (:) and then Base64 encoded.

Step 3: The server that receives the request containing the Authorization header field verifies the authentication information. If verification succeeds, it returns a response containing the resource at the Request-URI.

BASIC authentication uses Base64 encoding, but encoding is not encryption: it can be decoded without any additional information. In other words, when BASIC authentication is used over an unencrypted line such as HTTP, the user ID and password are effectively sent in plain text, and if the line is wiretapped they are very likely to be stolen.

BASIC authentication is not flexible enough in use and does not reach the level of security most Web sites expect, so it is not commonly used.
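A short sketch shows both halves of the point: how the client builds the header value in Step 2, and how anyone who sees it can reverse it. The function names and the sample user ID and password ("guest" / "guest") are assumptions for the example.

```python
import base64
from typing import Tuple


def basic_authorization(user_id: str, password: str) -> str:
    """Build the value of the Authorization header for BASIC authentication."""
    token = base64.b64encode(f"{user_id}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def decode_basic(header_value: str) -> Tuple[str, str]:
    """What an eavesdropper can do: recover the credentials with no key at all."""
    _scheme, token = header_value.split(" ", 1)
    user_id, password = base64.b64decode(token).decode("utf-8").split(":", 1)
    return user_id, password


header = basic_authorization("guest", "guest")   # "Basic Z3Vlc3Q6Z3Vlc3Q="
stolen = decode_basic(header)                    # ("guest", "guest")
```

Nothing secret is needed for `decode_basic`, which is why BASIC authentication only makes sense on top of an encrypted line such as HTTPS.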

7.2 DIGEST authentication

To make up for the weaknesses of BASIC authentication, DIGEST authentication has been available since HTTP/1.1. DIGEST authentication uses a challenge/response scheme and, unlike BASIC, does not send the password in plain text.

In a challenge/response scheme, one party first sends an authentication request to the other, then uses the challenge code it receives back to compute a response code. Finally, it returns the response code to the other party for authentication.

Step 1: When a resource that requires authentication is requested, the server returns the status code 401 Authorization Required together with a response containing the WWW-Authenticate header field. This field contains the temporary challenge code (a random number, the nonce) needed for challenge/response authentication.

Step 2: The client that receives status code 401 sends back a request whose Authorization header field contains the information required for DIGEST authentication.

Step 3: The server that receives the request containing the Authorization header field verifies the authentication information. If verification succeeds, it returns a response containing the resource at the Request-URI. In this case, some information about the successful authentication is written to the Authentication-Info header field.
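The response code in Step 2 is a hash computed from the password and the nonce, so the password itself never goes over the wire. The sketch below uses the simplest form of the calculation (MD5, without the optional qop, nc, and cnonce values that real clients usually add). The function names and sample values are assumptions for illustration.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Compute the response code sent in the Authorization header."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")   # who the user is
    ha2 = md5_hex(f"{method}:{uri}")                  # what is being requested
    return md5_hex(f"{ha1}:{nonce}:{ha2}")            # tied to this server challenge


# The server sent the nonce in WWW-Authenticate; the client answers with a hash.
first = digest_response("guest", "DIGEST", "guest", "GET", "/digest/", "nonce-1")
second = digest_response("guest", "DIGEST", "guest", "GET", "/digest/", "nonce-2")
```

Because the nonce is part of the hash, `first` and `second` differ even though the password is the same, so a captured response cannot simply be replayed against a new challenge.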

DIGEST authentication offers a higher level of security than BASIC authentication, but it is still weak compared with HTTPS client authentication. It provides protection against password eavesdropping, but none against user impersonation.

Like BASIC authentication, DIGEST authentication is not very flexible in use and still falls short of the high level of security most Web sites want, so its range of application is also limited.

7.3 SSL Client Authentication

SSL client authentication uses an HTTPS client certificate to complete authentication. With client certificate authentication (explained in the HTTPS chapter), the server can confirm whether the access comes from a registered client.

7.3.1 Procedure for SSL Client Authentication

To use SSL client authentication, the client certificate must first be distributed to the client and installed there.

Step 1: On receiving a request for a resource that requires authentication, the server sends a Certificate Request packet asking the client to provide its client certificate.

Step 2: After the user selects the client certificate to send, the client sends the certificate information to the server in a Client Certificate packet.

Step 3: After verifying the client certificate, the server obtains the client’s public key from it and starts HTTPS encrypted communication.
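On the server side, asking for a client certificate is usually a configuration switch in the TLS library. Here is a minimal sketch with Python’s standard `ssl` module; the function name `make_server_context` and its three file path parameters are hypothetical, and you would supply your own certificate files.

```python
import ssl
from typing import Optional


def make_server_context(cert_file: Optional[str] = None,
                        key_file: Optional[str] = None,
                        client_ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Server settings that make the handshake demand a client certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # CERT_REQUIRED makes the server send a certificate request and reject
    # clients that do not present a certificate it can verify.
    context.verify_mode = ssl.CERT_REQUIRED
    if cert_file:
        # The server's own certificate and private key.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file:
        # The CA that issued the client certificates distributed beforehand.
        context.load_verify_locations(cafile=client_ca_file)
    return context
```

A server built on this context refuses the handshake for any client that has not installed a certificate issued by the CA in `client_ca_file`.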

7.3.2 SSL client authentication uses two-factor authentication

In most cases, SSL client authentication does not rely on the certificate alone, but is combined with form-based authentication (explained later) to form two-factor authentication. Two-factor authentication means that the authentication process requires not only the password as one factor, but also some other piece of information from the applicant as a second factor, and uses the two in combination.

In other words, the first factor, the SSL client certificate, authenticates the client computer, and the other factor, the password, confirms that it is the user in person who is acting. With two-factor authentication, you can confirm that the user himself is accessing the server from the correct computer.

7.4 Form-based Authentication

Form-based authentication is not defined in the HTTP protocol. The client sends its login information (credentials) to the Web application on the server, and authentication is based on the result of verifying those credentials. I’m not going to go into that.

Eight, protocols that add functions on top of HTTP

8.1 SPDY to eliminate HTTP bottlenecks

Google released SPDY (pronounced SPeeDY) in 2010 with the goal of solving HTTP performance bottlenecks and shortening Web page load times (by 50%).

8.1.1 HTTP Bottlenecks

To display updated content as close to real time as possible, content has to be pushed to the client’s screen as soon as it is updated on the server. Simple as this may seem, HTTP is not up to the task.

To detect content updates on the server using HTTP, the client has to check with the server frequently. If nothing on the server has changed, that communication is wasted.
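A tiny simulation makes the waste visible. It is a sketch under simple assumptions: the server’s content is modelled as a version number sampled once per polling interval, and the function name `poll` and the sample timeline are made up for this example. No real network traffic is involved.

```python
from typing import Iterable, Tuple


def poll(server_versions: Iterable[int], client_version: int = 0) -> Tuple[int, int]:
    """Simulate one HTTP request per interval; return (updates, wasted_requests)."""
    updates = 0
    wasted = 0
    for version in server_versions:       # each entry costs one request/response
        if version != client_version:
            client_version = version      # something new came back
            updates += 1
        else:
            wasted += 1                   # full round trip, nothing new
    return updates, wasted


# Content version seen by the server at ten consecutive polling intervals.
timeline = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
updates, wasted = poll(timeline)          # 2 useful requests, 8 wasted ones
```

In this timeline only two of the ten requests bring back anything new. Polling less often cuts the waste but delays the update, which is the trade-off described above.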

8.1.2 Does SPDY eliminate the Web bottleneck?

To use SPDY, nothing special has to be changed on the Web content side, but both the Web browser and the Web server have to be modified to some extent to support it. Several Web browsers have already been adapted to SPDY. Web servers have also deployed it experimentally, but introducing the technology into real Web sites is not going well.

This is because SPDY basically just multiplexes the communication of a single domain name (IP address), so when a Web site uses resources from several domain names the improvement is limited.

SPDY is certainly an effective technique for eliminating HTTP bottlenecks, but many Web site problems are not caused solely by HTTP bottlenecks. To speed up the Web itself there are other areas worth examining more closely, such as improving the way Web content is written.

8.2 WebSocket for Full-duplex Communication using a Browser

Communicating with Ajax and Comet can speed up Web browsing. However, as long as HTTP is used for the communication, the bottleneck cannot be removed completely. WebSocket is a set of new protocols and APIs designed to solve these problems.

8.2.1 WebSocket design and functions

WebSocket is a standard for full-duplex communication between Web browsers and Web servers. The WebSocket protocol is standardized by the IETF, and the WebSocket API is standardized by the W3C. WebSocket technology, which is still under development, mainly aims to solve the problems caused by the shortcomings of XMLHttpRequest as used in Ajax and Comet.

8.2.2 WebSocket Protocols and Features

Once the WebSocket protocol communication connection is established between the Web server and the client, all subsequent communication depends on this special protocol. Data in any format, such as JSON, XML, HTML or images, can be sent to each other during communication.

Let’s list the main features of the WebSocket protocol.

1. Push function

The server can push data to the client. This way, the server can send data directly without waiting for the client to request it.

2. Reduce traffic

Once a WebSocket connection is established, it is meant to stay connected. Compared with HTTP, not only is the total overhead of establishing connections reduced, but traffic is also lower because the WebSocket header is small.
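To get a feel for how small the WebSocket header is, the sketch below builds an unmasked text frame (the kind a server sends to a client) following the frame layout in RFC 6455, and compares its overhead with a small set of HTTP response headers. The function name, the sample message, and the sample HTTP headers are assumptions for the example.

```python
import struct


def websocket_text_frame(payload: bytes) -> bytes:
    """Build an unmasked server-to-client text frame."""
    header = bytes([0x81])                    # FIN bit set, opcode 0x1 (text)
    length = len(payload)
    if length < 126:
        header += bytes([length])             # length fits in the second byte
    elif length < 2**16:
        header += bytes([126]) + struct.pack("!H", length)   # 16-bit length follows
    else:
        header += bytes([127]) + struct.pack("!Q", length)   # 64-bit length follows
    return header + payload


message = b'{"price": 42}'
frame = websocket_text_frame(message)
websocket_overhead = len(frame) - len(message)

# A deliberately small set of HTTP response headers for the same body.
http_headers = (b"HTTP/1.1 200 OK\r\n"
                b"Content-Type: application/json\r\n"
                b"Content-Length: 13\r\n\r\n")
http_overhead = len(http_headers)
```

For this short message the frame header is 2 bytes, while even these minimal HTTP headers take several dozen bytes, and real responses usually carry many more header fields.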

8.3 Long-awaited HTTP/2.0

The current mainstream standard, HTTP/1.1, has not been revised since RFC 2616 was published in 1999. With technologies like SPDY and WebSocket emerging, it is hard to say that HTTP/1.1 is still a suitable protocol for today’s Web.

HTTP/2.0 is being discussed around seven major technologies, and at this stage (August 13, 2012) the discussion mostly favors adopting techniques from a few candidate protocols. But, uh, the discussion is still going on, so major changes cannot be ruled out.

Nine, Web attack techniques

9.1 Attack Technologies Against the Web

The simple HTTP protocol itself has no security problems, so the protocol itself is hardly ever the target of attack. The targets are the servers and clients that use HTTP, and resources such as the Web applications running on the servers.

9.1.1 HTTP does not provide necessary security features

Developers need to design and develop their own authentication and session management functions to meet the security requirements of Web applications. Designing your own means that all sorts of implementations appear. As a result, the level of security is uneven, and Web applications still in operation hide all kinds of security bugs that attackers can easily abuse.

9.1.2 Requests can be tampered with on the client side

Attacks against Web applications can be launched by loading attack code into HTTP request packets. The attack code is passed in through URL query fields, forms, HTTP headers, and cookies. If the Web application has a security vulnerability at that point, internal information may be stolen, or the attacker may gain administrative rights.

9.1.3 Attack Mode Against Web Applications

There are two attack modes against Web applications:

Active attacks
Passive attacks

Active attacks

An active attack is an attack mode in which the attacker accesses the Web application directly and passes in attack code. Since this mode attacks resources on the server directly, the attacker needs to be able to access those resources.

Representative active attacks are SQL injection and OS command injection.
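SQL injection is easy to reproduce safely with an in-memory SQLite database. The sketch below is an assumption-laden toy: the `users` table, the sample account, and both login functions are invented for the example. It contrasts a query built by string concatenation with a parameterized one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")


def login_vulnerable(name: str, password: str) -> list:
    """Builds the SQL by pasting user input into the statement."""
    query = f"SELECT name FROM users WHERE name = '{name}' AND password = '{password}'"
    return conn.execute(query).fetchall()


def login_safe(name: str, password: str) -> list:
    """Passes user input as parameters, so it is always treated as data."""
    query = "SELECT name FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchall()


attack = "' OR '1'='1"                       # attack code sent in the password field
print(login_vulnerable("alice", attack))     # the WHERE clause is now always true
print(login_safe("alice", attack))           # no row matches this odd password
```

In the vulnerable version the attacker’s quote characters change the structure of the statement itself; in the parameterized version the same input is only ever compared as a string.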

Passive attacks

Using the user’s identity to attack an enterprise’s internal network

With a passive attack, it is possible to attack enterprise networks that cannot be reached directly from the Internet. Once a user walks into the trap set by the attacker, any network the user can access, even the enterprise intranet, can be attacked.

9.2 Security Vulnerabilities caused by session management negligence

Session management is a necessary function for managing user status. However, if session management is not properly implemented, user authentication status may be stolen.

9.2.1 Session hijacking

Session hijacking means that an attacker obtains a user’s session ID by some means and uses it illegitimately to masquerade as that user.

For Web applications with an authentication function, the mainstream approach is to manage authentication state with a session management mechanism based on session IDs. The session ID is recorded on the client, in a cookie for example, and the server manages a one-to-one mapping between session IDs and authentication state.

Here are several ways an attacker can obtain a session ID.

Inferring the session ID when it is produced by a non-standard generation method
Stealing the session ID through eavesdropping or an XSS attack
Forcing a session ID on the user through a session fixation attack
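The first item in the list is the easiest to picture in code. In the sketch below, `CounterSessionIds` is a made-up example of a home-grown generator whose next value can be inferred, and `random_session_id` shows one common alternative using Python’s `secrets` module. Both names are assumptions for illustration.

```python
import secrets


class CounterSessionIds:
    """Home-grown generator: each ID is just the previous one plus one."""

    def __init__(self, start: int = 1000) -> None:
        self._next = start

    def new_id(self) -> str:
        self._next += 1
        return f"SID{self._next}"


def random_session_id() -> str:
    """Unpredictable ID from the operating system's secure random source."""
    return secrets.token_urlsafe(32)


weak = CounterSessionIds()
attacker_id = weak.new_id()                          # attacker logs in normally
guess = "SID" + str(int(attacker_id[3:]) + 1)        # and infers the next ID
victim_id = weak.new_id()                            # the next user gets exactly that ID
```

Here `guess` equals `victim_id`, so the attacker can present the victim’s session ID without ever seeing it. With `random_session_id()` there is no pattern to extend.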

9.2.2 Session fixation attacks

Whereas session hijacking steals the target’s session ID, a session fixation attack forces the user to use a session ID chosen by the attacker. It is a passive attack.
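A common countermeasure, which is not part of the text above and is offered here only as a hedged sketch, is to issue a fresh session ID at the moment of login so that any ID planted beforehand becomes useless. The in-memory `sessions` dictionary and the function names are hypothetical.

```python
import secrets
from typing import Dict, Optional

# session ID -> authenticated user name (None while not logged in)
sessions: Dict[str, Optional[str]] = {}


def new_session() -> str:
    """Create an unauthenticated session, as a site does on the first visit."""
    session_id = secrets.token_urlsafe(32)
    sessions[session_id] = None
    return session_id


def login(old_session_id: str, user: str) -> str:
    """Authenticate the user and move the session to a brand-new ID."""
    sessions.pop(old_session_id, None)       # the pre-login ID stops working
    fresh_id = secrets.token_urlsafe(32)
    sessions[fresh_id] = user
    return fresh_id


def current_user(session_id: str) -> Optional[str]:
    return sessions.get(session_id)


planted_id = new_session()                   # attacker obtains this ID and plants it
victim_id = login(planted_id, "alice")       # victim logs in through the planted session
```

After the login, `current_user(planted_id)` returns `None`, so the ID the attacker knows never becomes an authenticated session.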

9.3 Password Cracking

Password cracking is an attack that works out a password in order to get past authentication. It is not limited to Web applications; other systems (such as FTP and SSH) are targets too. This section covers password cracking against Web applications that have an authentication function.

There are two ways to crack a password.

Password trial and error over the network
Decryption of encrypted passwords (the attacker has already broken into the system and obtained encrypted or hashed password data)

Besides password cracking, authentication can also be bypassed with SQL injection, and password information can be stolen with methods such as cross-site scripting.
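To see why short passwords and plain hashes are a problem once hashed password data has leaked, here is a small sketch that tries every 4-digit PIN against a SHA-256 hash. The PIN, the use of unsalted SHA-256, and the function names are assumptions for illustration; everything runs locally on made-up data.

```python
import hashlib
import itertools
import string
from typing import Optional, Tuple


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def brute_force(target_hash: str, alphabet: str, length: int) -> Tuple[Optional[str], int]:
    """Try every candidate of the given length; return (password, attempts)."""
    attempts = 0
    for candidate in itertools.product(alphabet, repeat=length):
        attempts += 1
        guess = "".join(candidate)
        if sha256_hex(guess) == target_hash:
            return guess, attempts
    return None, attempts


leaked_hash = sha256_hex("4821")                       # stands in for stolen data
found, attempts = brute_force(leaked_hash, string.digits, 4)
```

The whole key space is only 10,000 candidates, so the PIN falls out almost instantly (here on attempt 4,822). Longer passwords and deliberately slow, salted password hashes raise the cost of each guess.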

9.4 DoS attack

A DoS (Denial of Service) attack is an attack that stops a running service. It is sometimes called a service stop attack or a denial-of-service attack. The targets of DoS attacks are not limited to Web sites; they also include network devices and servers.

There are two kinds of DoS attack.

Concentrating access requests so that resources are overloaded; once resources are exhausted, the service effectively stops.
Stopping the service by attacking a security vulnerability.

The first kind, a DoS attack by concentrated access requests, simply means sending a large number of legitimate requests. The server has difficulty telling normal requests from attack requests, so this kind of DoS attack is hard to prevent.

A DoS attack launched from multiple computers is called a DDoS (Distributed Denial of Service) attack. DDoS attacks usually use infected computers as the attacker’s springboard.

Conclusion

I spent about half a month on this blog, extracting the best content of Illustrated HTTP and adding my own understanding and feelings. I hope it will be helpful to you. I also hope you will like, follow, and comment, so we can make progress together!!

But when you want to give up, that’s the time to test your endurance and “pull you out” of a crowd of mediocre people. The harder it is, the more you are on the way up.
