Difference between HTTP and HTTPS
HTTP stands for Hypertext Transfer Protocol. It is a protocol and specification for transmitting hypertext data, such as text, pictures, audio, and video, between two points in the computer world.
The name HTTP breaks down into three parts: Hypertext, Transfer, and Protocol.
- Hypertext is more than just text: it also covers pictures, audio, and video, as well as hyperlinks that jump to another resource when you click on a piece of text or an image.
- Transfer: all of the above can be collectively referred to as data, and transfer is the process in which data moves from one end system to another through a series of physical media. We usually call the party that sends the packet the requester, and the party that receives the binary packet the responder.
- Protocol refers to the rules for transferring and managing information on networks (including the Internet). Just as people need to follow certain rules to communicate with each other, computers need to follow certain rules to communicate with each other. These rules are called protocols, or more precisely network protocols.
Speaking of HTTP, the TCP/IP network model has to be mentioned. It is usually described as a five-layer model (application, transport, network, link, and physical layers), as shown in the figure below
It can also be divided into four layers, in which the link layer and the physical layer are combined into the network interface layer
Another is the OSI seven-layer network model, which adds a presentation layer and a session layer to the five-layer model
The full name of HTTPS is Hypertext Transfer Protocol Secure. From its name, we can see that HTTPS is more secure than HTTP. In fact, HTTPS is not a new application-layer protocol. It is a combination of HTTP + TLS/SSL, and security is what TLS/SSL provides.
In other words, HTTPS is HTTP running on top of SSL/TLS.
So, what are the main differences between HTTP and HTTPS?
- The simplest difference: in the address bar, an HTTP URL begins with http://, while an HTTPS URL begins with https://
http://www.cxuanblog.com/
https://www.cxuanblog.com/
- HTTP is a protocol without encryption. Its traffic is easy for attackers to monitor, data is easy to steal, and the sender and receiver are easy to forge. HTTPS is a secure protocol that addresses these problems by combining a key exchange algorithm, a signature algorithm, a symmetric encryption algorithm, and a digest algorithm.
- The default port for HTTP is 80 and the default port for HTTPS is 443.
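The scheme and default-port differences above can be sketched in a few lines of Python. This is only an illustration: the helper names `effective_port` and `is_encrypted` are made up here, and the port table simply restates the defaults from the list.

```python
from urllib.parse import urlsplit

# Default ports named in the list above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the explicit port of a URL, or the default port of its scheme."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


def is_encrypted(url: str) -> bool:
    """True when the URL uses the https scheme (HTTP over TLS/SSL)."""
    return urlsplit(url).scheme == "https"
```

For instance, `effective_port("https://www.cxuanblog.com/")` falls back to 443 because the URL carries no explicit port.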
Differences between HTTP GET and POST
HTTP defines many methods. GET and POST are the two most commonly used, and the vast majority of everyday HTTP requests use one of them, so it is worth understanding both in more depth.
- The GET method is generally used to request a resource. For example, when you type www.cxuanblog.com in the browser address bar, a GET request is sent; its main feature is that it asks the server to return a resource. The POST method is generally used to submit a <form>. GET is equivalent to a pull operation and POST is equivalent to a push operation.
- The GET method is less safe for sensitive data, because the request parameters are appended to the URL when the request is sent, which makes it easy for attackers to steal your information and use it for damage or forgery.
/test/demo_form.asp?name1=value1&name2=value2
The POST method puts parameters in the request body, where they do not show up in the address bar. Without HTTPS, the body still travels in plaintext.
POST /test/demo_form.asp HTTP/1.1
Host: w3schools.com

name1=value1&name2=value2
- A GET request carries its parameters in the URL, and browsers and servers limit URL length in practice, whereas a POST request places parameters and values in the message body, with no requirement on data length.
- GET requests are actively cached by browsers, whereas POST requests are not, unless configured manually.
- GET requests are harmless when you go back or forward in the browser, while POST requests resubmit the form.
- It is often said that a GET request produces one TCP packet while a POST request produces two. For GET, the browser sends the HTTP headers and data together, and the server responds with 200 (returning data). For POST, the browser sends the headers first, the server responds with 100 Continue, the browser then sends the data, and the server responds with 200 OK (returning data). Note that this two-step exchange only happens when the client sends an Expect: 100-continue header, so it depends on the client and is not a property of POST itself.
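To make the "parameters in the URL" versus "parameters in the body" difference concrete, here is a minimal sketch that builds both request messages as text. The function names are made up for this example, and the host and path are just the ones from the sample above; nothing is sent over the network.

```python
from urllib.parse import urlencode


def build_get_request(host: str, path: str, params: dict) -> str:
    """GET: the parameters are appended to the URL as a query string."""
    query = urlencode(params)
    return f"GET {path}?{query} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def build_post_request(host: str, path: str, params: dict) -> str:
    """POST: the same parameters travel in the message body instead."""
    body = urlencode(params)
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body.encode('utf-8'))}\r\n"
        "\r\n"
        f"{body}"
    )
```

Printing both results side by side shows that the POST request line no longer contains name1 or name2, although the body is still readable by anyone who can see the traffic.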
What is a stateless protocol? HTTP is stateless; how is this solved?
A stateless protocol means that the protocol keeps no memory of earlier transactions. For example, a client may close the browser after requesting a web page and then start the browser again to log in to the site, but the server does not know that the client closed the browser in between.
HTTP is a stateless protocol that has no memory of user actions. Most users probably don't believe that: after entering a username and password once, they usually don't have to re-enter them the next time they visit the site. That is not HTTP's doing, though. What does it is a mechanism called cookies, which gives the browser the ability to remember.
If your browser allows cookies (in Chrome you can check this at chrome://settings/content/cookies), its memory chip is powered on. When you send a request to the server, the server sends you back an authentication message. When the server receives the request for the first time, it creates a session space (a Session object), generates a sessionId, and asks the client to set a cookie through the Set-Cookie: JSESSIONID=XXXXXXX response header. After receiving the response, the client sets a cookie with JSESSIONID=XXXXXXX locally. The cookie expires at the end of the browser session.
From then on, every time the client sends a request to the same website, the request header carries the cookie (including the sessionId). The server reads the value named JSESSIONID from the Cookie header and so obtains the sessionId of the request. In this way, your browser has the ability to remember.
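The session flow described above can be sketched roughly as follows. This is a toy model, not how any particular server implements it: `SESSIONS` and `handle_request` are hypothetical names, the store lives in memory, and the "visits" counter is only there to show that state survives between requests.

```python
import secrets
from http.cookies import SimpleCookie

# Server-side session store: sessionId -> per-user data (in memory, for illustration).
SESSIONS = {}


def handle_request(cookie_header):
    """Return (Set-Cookie value or None, session dict) for one incoming request."""
    cookie = SimpleCookie(cookie_header or "")
    morsel = cookie.get("JSESSIONID")
    if morsel is not None and morsel.value in SESSIONS:
        # Known client: the Cookie header carried a valid sessionId.
        return None, SESSIONS[morsel.value]
    # First request: create the session and ask the client to store its id.
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {"visits": 0}
    return f"JSESSIONID={session_id}", SESSIONS[session_id]
```

The first call returns a value to put in a Set-Cookie header; passing that value back as the Cookie header on the next call finds the same session again.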
Another way is to use JWT (JSON Web Token), which is also a mechanism that lets your browser remember. Unlike the session approach, where the state is kept on the server, a JWT carries the information on the client, and it is widely used in single sign-on scenarios. JWT has two characteristics
- JWT information is stored on the client instead of in server memory. In other words, the token itself can be verified directly, without looking up a session on the server; after authentication, the token is sent to the server with each request. This saves server resources, and the same token can be verified multiple times.
- JWT supports cross-domain authentication. Cookies are only effective within a single domain or its subdomains; if they are used to access a third node, they are blocked. JWT solves this problem: a JWT can authenticate the user across multiple nodes, which is what we call cross-domain authentication.
Differences between UDP and TCP
Both TCP and UDP reside in the transport layer of the computer network model and are responsible for transferring data generated by the application layer. Let’s talk about the characteristics and differences between TCP and UDP
What is UDP
UDP stands for User Datagram Protocol. It speeds up communication by skipping the so-called handshake, so a host on the network can start sending data before the receiver has agreed to communicate.
A datagram is a transport unit associated with a packet-switched network.
UDP has the following characteristics
- UDP can support bandwidth-intensive applications that tolerate packet loss
- UDP is characterized by low latency
- UDP can send a large number of packets
- DNS lookup, an application-layer protocol, is built on top of UDP.
What is TCP
TCP stands for Transmission Control Protocol. It governs how your computer connects to other hosts on the Internet and how data is transferred between them. A TCP connection is established through a three-way handshake, which is used to initiate and confirm the connection. Once the connection is established, data can be sent, and when the data transfer is complete, the connection is torn down by closing the virtual circuit.
The main features of TCP are as follows
- TCP ensures that a connection is established and that packets are delivered
- TCP supports error retransmission
- TCP supports congestion control and can delay transmission in case of network congestion
- TCP provides checksums to detect corrupted packets.
Difference between TCP and UDP
The following table lists some differences between TCP and UDP for you to understand and remember.
TCP | UDP |
---|---|
TCP is a connection-oriented protocol | UDP is a connectionless protocol |
TCP establishes a connection before sending data | UDP sends data directly without establishing a connection |
TCP reassembles packets in the order they were sent | UDP packets have no fixed sequence and are independent of each other |
TCP transmission is slower | UDP transmission is faster |
The TCP header is at least 20 bytes | The UDP header is only 8 bytes |
TCP is heavyweight, requiring a three-way handshake to establish a connection before sending any user data | UDP is lightweight, with no connection tracking, message ordering, etc. |
TCP performs error checking and error recovery | UDP also checks for errors, but simply discards erroneous packets |
TCP acknowledges received data | UDP has no acknowledgments |
TCP uses a handshake, with SYN, SYN-ACK, and ACK | UDP has no handshake |
TCP is reliable because it ensures that data is delivered to the destination | UDP does not guarantee that data is delivered to the destination |
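The header sizes in the table can be checked by packing the fixed header fields with Python's struct module. This is just a sketch: the helper names are invented, checksums are left at 0 rather than computed, and the TCP layout shown is the header without options.

```python
import struct

# UDP header: source port, destination port, length, checksum (4 x 16 bits).
UDP_HEADER = struct.Struct("!HHHH")

# TCP header without options: source port, destination port, sequence number,
# acknowledgment number, data offset/reserved byte, flags byte, window,
# checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIBBHHH")


def pack_udp_header(src_port: int, dst_port: int, payload_len: int) -> bytes:
    """Pack a UDP header; the checksum is left at 0 in this sketch."""
    return UDP_HEADER.pack(src_port, dst_port, UDP_HEADER.size + payload_len, 0)


def pack_tcp_header(src_port: int, dst_port: int, seq: int, ack: int, flags: int) -> bytes:
    """Pack a TCP header; checksum and urgent pointer are left at 0."""
    data_offset = (TCP_HEADER.size // 4) << 4  # header length in 32-bit words
    return TCP_HEADER.pack(src_port, dst_port, seq, ack, data_offset, flags, 65535, 0, 0)
```

The extra 12 bytes in TCP are exactly the fields UDP does without: sequence and acknowledgment numbers, flags, window, and the urgent pointer.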
TCP three-way handshake and four-way wave
The TCP three-way handshake and the four-way wave are also popular interview questions, which correspond to the TCP connection and release process, respectively. Here’s a quick look at these two processes
TCP three-way handshake
Before we look at the process, we need to understand a few concepts
Message type | Description |
---|---|
SYN | Used to initiate and establish a connection |
ACK | Acknowledges the SYN message received from the peer |
SYN-ACK | Carries the local side's own SYN together with an ACK for the SYN received earlier |
FIN | Used to close the connection |
- SYN: The full name is Synchronize Sequence Numbers. It is the handshake signal used by TCP/IP to establish a connection, and it is the first signal sent when a TCP connection is set up between a client and a server. When the client sends a SYN message, it places a random value X in the segment as its sequence number.
- SYN-ACK: After receiving the SYN, the server opens a connection for the client and sends a SYN-ACK. The acknowledgment number is set to one more than the received sequence number, X + 1, and the server picks another random sequence number, Y, for the packet.
- ACK: Short for acknowledgment, indicating that the data sent has been received correctly. Finally, the client sends an ACK to the server. Its sequence number is set to the received acknowledgment value, X + 1, and its acknowledgment number is set to Y + 1.
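The sequence-number arithmetic in the three messages can be written out as a tiny simulation. Nothing here touches a real socket; `Segment` and `three_way_handshake` are names invented for this sketch, and X and Y are picked at random unless you pass them in.

```python
import random
from dataclasses import dataclass
from typing import Optional

SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(x: Optional[int] = None, y: Optional[int] = None):
    """Return the SYN, SYN-ACK, and ACK segments for initial numbers X and Y."""
    x = random.getrandbits(32) if x is None else x
    y = random.getrandbits(32) if y is None else y
    syn = Segment("SYN", seq=x)
    syn_ack = Segment("SYN-ACK", seq=y, ack=(syn.seq + 1) % SEQ_SPACE)
    ack = Segment("ACK", seq=syn_ack.ack, ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]
```

Calling `three_way_handshake(100, 300)` gives a SYN with seq 100, a SYN-ACK with seq 300 and ack 101, and a final ACK with seq 101 and ack 301.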
To use a real-life example, take Xiao Ming as the client and Xiao Hong as the server
- Xiao Ming calls Xiao Hong. Once the call goes through, Xiao Ming says: hello, can you hear me? This is equivalent to requesting a connection.
- Xiao Hong responds to Xiao Ming: I can hear you, can you hear me? This is equivalent to the response to the request.
- Xiao Ming hears Xiao Hong's response and says: yes. This confirms the connection. After that, Xiao Ming and Xiao Hong can talk and exchange messages.
TCP four-way wave
The connection termination phase uses a four-way wave, in which each end of the connection terminates independently. Let's describe the process.
- First, the client application decides to terminate the connection (the server can also choose to disconnect). This causes the client to send a FIN to the server and enter the FIN_WAIT_1 state. While in FIN_WAIT_1, the client waits for an ACK from the server.
- Second, when the server receives the FIN message, it immediately sends an ACK message to the client.
- When the client receives the ACK from the server, it enters the FIN_WAIT_2 state and waits for the server's FIN message.
- Some time after sending the ACK, the server sends its own FIN message to tell the client that the server side can be closed as well.
- When the client receives the FIN message from the server, it moves from FIN_WAIT_2 to the TIME_WAIT state. A client in TIME_WAIT can re-send the ACK to the server in case the ACK is lost. How long a client stays in TIME_WAIT depends on the implementation; after waiting for some time, the connection is closed and all resources on the client (including port numbers and buffered data) are released.
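The client-side states in the steps above form a small state machine, which can be sketched as a lookup table. The event labels ("send FIN", "recv ACK", and so on) are made up for this example; ESTABLISHED and CLOSED are the usual TCP names for the states before and after the sequence.

```python
# Client-side transitions during an active close, keyed by (state, event).
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",  # the client replies with the final ACK
    ("TIME_WAIT", "timeout"): "CLOSED",
}


def close_from_client(events):
    """Replay a list of events and return every state the client passes through."""
    state = "ESTABLISHED"
    history = [state]
    for event in events:
        state = CLIENT_TRANSITIONS[(state, event)]
        history.append(state)
    return history
```

Feeding it the four events in order walks through FIN_WAIT_1, FIN_WAIT_2, and TIME_WAIT before reaching CLOSED; an event that is not valid in the current state raises a KeyError.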
Again, you can use the phone call example above to describe it
- Xiao Ming says to Xiao Hong: I have said everything I wanted to say, so I am going to hang up.
- Xiao Hong says: received, but I still have some things to say.
- After a few seconds, Xiao Hong has finished as well and says: I am done, you can hang up now.
- After Xiao Ming receives the message, he waits for a while and then hangs up the phone.
Briefly describe the differences between HTTP 1.0/1.1/2.0
HTTP 1.0
HTTP 1.0 was introduced in 1996, and since then its popularity has been phenomenal.
- HTTP 1.0 provides only the most basic authentication, in which the username and password are not encrypted, so they are easy for others to snoop on.
- HTTP 1.0 was designed around short-lived connections: every transfer of data goes through TCP's three-way handshake and four-way wave, which is inefficient.
- HTTP 1.0 only uses If-Modified-Since and Expires in the headers as criteria for cache invalidation.
- HTTP 1.0 does not support resumable transfers, which means the whole page or resource is sent every time.
- HTTP 1.0 assumes that each server is bound to a single IP address, so the URL in the request message does not carry the hostname.
HTTP 1.1
HTTP 1.1 came three years after HTTP 1.0, in 1999, with the following changes
- HTTP 1.1 adds digest authentication.
- HTTP 1.1 uses persistent connections by default. A persistent connection is established once and can carry multiple transfers; the connection only has to be closed once, after the transfers are complete. How long the connection is kept open can be set with keep-alive in the request header.
- HTTP 1.1 adds ETag, If-Unmodified-Since, If-Match, If-None-Match, and other cache-control headers to control cache invalidation.
- HTTP 1.1 supports resumable transfers, implemented with the Range header.
- HTTP 1.1 supports virtual hosting, where multiple virtual hosts (multi-homed web servers) can exist on a single physical server and share a single IP address.
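To show what resuming a transfer with Range looks like on the server side, here is a simplified sketch. The function name `serve_range` is invented, and only the simple `bytes=start-end` and `bytes=start-` forms are handled; anything else falls back to sending the full body.

```python
def serve_range(resource: bytes, range_header: str):
    """Return (status, Content-Range value or None, body) for a Range header."""
    unit, _, spec = range_header.partition("=")
    start_text, _, end_text = spec.partition("-")
    start_text, end_text = start_text.strip(), end_text.strip()
    if unit.strip() != "bytes" or not start_text.isdigit():
        return 200, None, resource  # unsupported form: send the whole resource
    start = int(start_text)
    end = int(end_text) if end_text.isdigit() else len(resource) - 1
    end = min(end, len(resource) - 1)
    if start > end:
        # Requested range lies outside the resource.
        return 416, f"bytes */{len(resource)}", b""
    return 206, f"bytes {start}-{end}/{len(resource)}", resource[start:end + 1]
```

A client that already holds the first four bytes of a ten-byte resource would send `Range: bytes=4-` and receive status 206 with only the remaining six bytes.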
HTTP 2.0
HTTP 2.0 is a standard released in 2015, with the following major changes
- Header compression: HTTP 1.1 messages frequently carry fields such as User-Agent, Cookie, Accept, Server, and Range, which can take up hundreds or even thousands of bytes, whereas the body is often only tens of bytes, so the headers end up heavy. HTTP 2.0 compresses them with the HPACK algorithm.
- Binary format: HTTP 2.0 uses a binary format closer to TCP/IP and drops the ASCII text format to improve parsing efficiency.
- Stronger security: since security has become a top priority, HTTP 2.0 generally runs over HTTPS.
- Multiplexing: requests share a connection. Each request corresponds to an ID, so multiple requests can be in flight on one connection.
Please describe the common HTTP headers
This is an open question because there are many HTTP headers. Here are just a few examples. For details, please refer to my other article
Mp.weixin.qq.com/s/XZZR0945I…
There are four types of HTTP headers: general headers, entity headers, request headers, and response headers. They are introduced in turn below.
General headers
Three general headers are commonly used: Date, Cache-Control, and Connection
Date
Date is a general header that can appear in both requests and responses. Its basic form is as follows
Date: Wed, 21 Oct 2015 07:28:00 GMT
GMT is Greenwich Mean Time, which is eight hours behind Beijing Time
Cache-Control
Cache-Control is a general header that can appear in both requests and responses. It takes many directives. Although it is a general header, some directives are specific to requests and some to responses. The main categories are cacheability, expiration, revalidation and reloading, and other directives
Connection
Connection determines whether the network connection is closed after the current transaction (a three-way handshake and a four-way wave) completes. Connection has two values. One is a persistent connection, where the network connection is not closed after a transaction completes
Connection: keep-alive
The other is a non-persistent connection, where the network connection is closed after a transaction completes
Connection: close
Other common headers for HTTP1.1 are as follows
Entity headers
Entity headers are HTTP headers that describe the content of the message body. They are used in both HTTP requests and responses. Content-Length, Content-Language, and Content-Encoding are entity headers.
- Content-Length: this entity header indicates the size of the entity body, in bytes, sent to the receiver.
- Content-Language: this entity header describes the natural language(s) of the intended audience of the content.
- Content-Encoding: a trickier one. This entity header is used to compress the media type; it indicates which encoding has been applied to the entity.
Common content encodings include gzip, compress, deflate, and identity. This header can be used in both request and response messages
Accept-Encoding: gzip, deflate // Request header
Content-Encoding: gzip // Response header
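The Accept-Encoding / Content-Encoding exchange can be sketched with Python's gzip module. This is a simplification under stated assumptions: `encode_body` is a made-up helper, it only knows about gzip, and it ignores quality values such as `gzip;q=0`.

```python
import gzip


def encode_body(body: bytes, accept_encoding: str):
    """Compress with gzip when the client lists it; otherwise send the body as is."""
    offered = {
        item.split(";")[0].strip().lower() for item in accept_encoding.split(",")
    }
    if "gzip" in offered:
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body
```

The receiver looks at Content-Encoding in the response and, when it says gzip, calls `gzip.decompress` on the body to get the original bytes back.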
Here are some entity header fields
Request headers
Host
The Host header specifies the domain name of the server (for virtual hosts) and, optionally, the TCP port number on which the server listens. If no port number is given, the default port for the requested service is automatically used (for example, 80 is automatically used for requesting an HTTP URL).
Host: developer.mozilla.org
Accept, Accept-Language, and Accept-Encoding are request headers used for content negotiation.
Referer
The HTTP Referer field is part of the request header. When a browser sends a request to a web server, it usually includes the Referer, telling the server which page the request was linked from, so that the server can use this information in its processing.
Referer: https://developer.mozilla.org/testpage.html
If-Modified-Since
If-Modified-Since is usually used together with If-None-Match to verify the validity of local resources held by a proxy or client. The date and time a resource was last updated can be determined from the Last-Modified header field.
The server responds with 200 if the resource has been updated since the time given in If-Modified-Since, and with 304 if it has not.
If-Modified-Since: Mon, 18 Jul 2016 02:36:04 GMT
If-None-Match
The If-None-Match HTTP request header makes the request conditional. For the GET and HEAD methods, the server sends back the requested resource with status 200 only if the resource's current ETag does not match any of the listed values. For other methods, the request is processed only if the ETag of the existing resource does not match any of the listed values.
If-None-Match: "c561c68d0ba92bbeb8b0fff2a9199f722e3a621a"
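Here is a rough sketch of how a server might answer a conditional GET with If-None-Match. It makes several assumptions: the ETag is simply a SHA-1 hex digest of the body (servers are free to compute ETags however they like), the helper names are invented, and weak validators (`W/"..."`) are not handled.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the body; using SHA-1 here is just one choice."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def conditional_get(body: bytes, if_none_match=None):
    """Return (status, headers, body), answering 304 when the client's copy is current."""
    etag = make_etag(body)
    if if_none_match is not None:
        candidates = [value.strip() for value in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body
```

The client stores the ETag from the first 200 response and sends it back in If-None-Match; as long as the resource is unchanged it gets an empty 304 and reuses its cached copy.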
Accept
The Accept request header tells the server which MIME types the client understands
Accept-Charset
The Accept-Charset request header tells the server which character sets the client can accept.
Common character sets are: UTF-8, a Unicode character encoding; ISO-8859-1, the character encoding of the Latin alphabet
Accept-Language
The Accept-Language header field tells the server which natural languages (Chinese, English, etc.) the user agent can handle, along with their relative priority. Multiple natural languages can be specified at once.
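The relative priority in Accept-Language is expressed with q values, and a server can sort on them. The sketch below is a simplified parser with an invented name; it treats a missing q as 1.0, an unparsable q as 0, and keeps the original order for ties.

```python
def parse_accept_language(header: str):
    """Return language tags ordered from most to least preferred."""
    entries = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        if not tag:
            continue
        quality = 1.0  # a tag without q= has the highest priority
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        entries.append((-quality, position, tag.strip()))
    return [tag for _, _, tag in sorted(entries)]
```

For a header such as `zh-CN,zh;q=0.9,en;q=0.8`, the server would try zh-CN first, then zh, then en.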
That covers the request headers in general; a separate article will go through all of them in detail. Here is a summary of response headers, based on HTTP 1.1
Response headers
Access-Control-Allow-Origin
A response may include the Access-Control-Allow-Origin header, which specifies an origin and tells the browser to allow that origin to access the resource.
Keep-Alive
Keep-Alive sets parameters for a persistent connection; you can use it to specify how long the connection is kept alive.
Server
The Server header contains information about the software used by the origin server to process the request.
Overly verbose and detailed Server values should be avoided, because they may reveal internal implementation details that make it easier for attackers to discover and exploit known security vulnerabilities. For example
Server: Apache/2.4.1 (Unix)
Set-Cookie
Set-Cookie is used by the server to send cookies, such as the sessionId, to the client.
Transfer-Encoding
The Transfer-Encoding header field specifies the encoding used to transmit the message body.
In HTTP/1.1, transfer encoding applies only to chunked transfer encoding.
X-Frame-Options
HTTP header fields are extensible, so web servers and browsers use a variety of non-standard header fields in practice.
The X-Frame-Options header field is an HTTP response header used to control whether web content may be displayed inside a frame on another website. Its main purpose is to prevent clickjacking attacks.
Here is a summary of the response headers, based on HTTP 1.1
What happens when you enter the URL in the address bar
This is also a frequently asked interview question. Let's take a look at what happens from the moment you type in the URL to the moment the response comes back.
- First, you enter the URL you want to visit in your browser.
- Then, the browser checks, based on the URL you entered, whether the domain name is already in a local DNS cache. Different browsers handle DNS caching differently. If the browser has cached the domain name you want to access, it returns the IP address directly. If it is not cached, the browser makes a system call to look in the host's hosts file for a matching IP address. If there is one, that IP address is returned; if not, a DNS query is sent out to the network.
Let's start with what DNS is. There are two ways to identify a host on the Internet: by hostname and by IP address. People prefer to remember names, but routers on the communication path prefer fixed-length, hierarchical IP addresses. So a service is needed to translate hostnames into IP addresses, and this service is provided by DNS. The full name of DNS is Domain Name System. DNS is a distributed database implemented by a hierarchy of DNS servers. DNS runs on UDP and uses port 53.
DNS is a hierarchical database whose main levels are root DNS servers, top-level DNS servers, and authoritative DNS servers.
In addition, there is another important kind of DNS server: the local DNS server. Strictly speaking, the local DNS server does not belong to the hierarchy above, but it is crucial. Each Internet Service Provider (ISP), such as a residential ISP or an institutional ISP, has a local DNS server. When a host connects to an ISP, the ISP gives the host the IP addresses of one or more local DNS servers. Users can easily find the IP address of their DNS server by looking at the network connection settings. When a host sends a DNS request, the request goes to the local DNS server, which acts as a proxy and forwards the request into the DNS server hierarchy.
If the local DNS server fails to find the destination IP address, the local DNS sends a DNS query to the root DNS server.
Note: DNS involves two types of query, recursive queries and iterative queries. I did not find the difference spelled out in "Computer Networking: A Top-Down Approach", so what follows is my understanding from material found online.
A recursive query is used when the queried server does not tell the requester which server to ask next: it carries out the remaining lookups itself and returns the final answer.
An iterative query is used when the queried server, for example the root DNS server, tells the local DNS server which top-level DNS server it needs to ask next, and the local DNS server then sends that query itself.
After the chain root DNS server > top-level DNS server > authoritative DNS server has been followed, the authoritative server tells the local DNS server the destination IP address, and the local DNS server tells the user's host the IP address to access.
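The iterative path through root, top-level, and authoritative servers can be mimicked with a few dictionaries. Everything in this sketch is made up for illustration: the server names, the zone data, and the address (taken from the 192.0.2.0/24 documentation range); no real DNS traffic is involved.

```python
# Toy zone data; every name and address here is hypothetical.
ROOT = {"com": "tld-com"}
TLD = {"tld-com": {"example.com": "auth-example"}}
AUTHORITATIVE = {"auth-example": {"www.example.com": "192.0.2.10"}}


def resolve_iteratively(hostname: str, cache: dict):
    """Act like a local DNS server: check the cache, then ask each level in turn."""
    if hostname in cache:
        return cache[hostname], ["cache"]
    labels = hostname.split(".")
    steps = ["root"]
    tld_server = ROOT[labels[-1]]  # the root refers us to a top-level server
    steps.append(tld_server)
    auth_server = TLD[tld_server][".".join(labels[-2:])]  # the TLD refers us onward
    steps.append(auth_server)
    address = AUTHORITATIVE[auth_server][hostname]  # the final answer
    cache[hostname] = address
    return address, steps
```

The first lookup of www.example.com walks all three levels; a second lookup with the same cache dictionary is answered from the cache without contacting any server.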
- Step 3: The browser needs to establish a TCP connection with the target server, which requires the three-way handshake. For details about the handshake, see the earlier section.
- After the connection is established, the browser sends an HTTP GET request, including the URL, to the target server. Since HTTP 1.1, persistent connections are used by default, so a single handshake is enough to transfer data multiple times.
- If the target is a simple page, the server returns it directly. Large sites, however, often do not return the page for the hostname directly but redirect instead. The status code returned is then a redirection code starting with 3, such as 301 or 302. After receiving the redirect response, the browser finds the redirect address in the Location field of the response message and starts again from the first step.
- The browser then resends the request to the new URL, and the server returns status code 200 OK, indicating that it can handle the request, along with the response message.
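The redirect step can be sketched against a fake site held in a dictionary. The URLs, the response table, and the function name are all hypothetical, and only 301 and 302 are treated as redirects here.

```python
# Hypothetical responses keyed by URL: (status code, Location header or None, body).
FAKE_SITE = {
    "http://example.com/": (301, "https://www.example.com/", ""),
    "https://www.example.com/": (200, None, "<html>home</html>"),
}


def fetch_following_redirects(url: str, max_redirects: int = 5):
    """Follow Location headers until a non-redirect response arrives."""
    visited = [url]
    for _ in range(max_redirects + 1):
        status, location, body = FAKE_SITE[url]
        if status in (301, 302) and location:
            url = location  # start over with the address from Location
            visited.append(url)
            continue
        return status, body, visited
    raise RuntimeError("too many redirects")
```

Starting from the plain http address, the sketch visits two URLs and ends with 200 OK from the second one, mirroring the last two steps of the list.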
How HTTPS works
We have described how HTTP works; here is how HTTPS works. As we know, HTTPS is not a new protocol but rather HTTP combined with SSL/TLS.
So, when we talk about the HTTPS handshake, we are really talking about the SSL/TLS handshake.
TLS is an encryption protocol designed to secure communication over the Internet. A TLS handshake is the process that starts a TLS-encrypted communication session. During the TLS handshake, the communicating parties exchange information with each other, agree on a cipher suite, and exchange session keys.
A TLS handshake occurs every time a user navigates to a specific website over HTTPS and sends a request. In addition, TLS handshakes also occur whenever any other communication uses HTTPS, including API calls and DNS queries over HTTPS.
The TLS handshake process varies according to the type of key exchange algorithm used and the cipher suites supported by both parties. We discuss this process in terms of RSA asymmetric encryption. The whole TLS communication flow chart is as follows
- Before communication, the TCP three-way handshake is performed. After it completes, the TLS handshake begins.
- ClientHello: The client sends a hello message to the server to initiate the handshake. The message contains the TLS versions the client supports (TLS 1.0, TLS 1.2, TLS 1.3), the cipher suites the client supports, and a string of random bytes, the client random.
- ServerHello: After the client sends its hello message, the server replies with a message containing the server's SSL certificate, the cipher suite chosen by the server, and a random number generated by the server.
- Authentication: The server sends a Certificate message containing its public key certificate, and the client checks the SSL certificate against the certificate authority that issued it. Finally, the server sends ServerHelloDone as the end of its response to the hello request. The first part of the handshake is over.
- Encryption stage: After the first phase of the handshake is complete, the client sends ClientKeyExchange in response. This message contains a key string called the premaster secret, encrypted with the public key from the certificate above; the server decrypts the premaster secret with its private key. The client then sends ChangeCipherSpec, telling the server that its following messages will be encrypted, and then Finished, telling the server that its side of the handshake is done.
The session key is derived from the premaster secret, which was sent encrypted with the public key from the certificate.
- Secure asymmetric encryption is established: The server then sends its own ChangeCipherSpec and Finished to tell the client that decryption is complete, which completes the RSA-based handshake.
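In application code you rarely drive these messages yourself; a TLS library performs the handshake for you. The sketch below uses Python's ssl module: `make_client_context` and `negotiated_parameters` are names invented here, the TLS 1.2 minimum is just a choice for the example, and the second function needs network access. The library also negotiates the key exchange on its own, and TLS 1.3 no longer offers the RSA exchange described above.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client context that verifies the server certificate and its hostname."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_parameters(host: str, port: int = 443):
    """TCP handshake first, then the TLS handshake inside wrap_socket."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            # Protocol version and cipher suite agreed on during the handshake.
            return tls_sock.version(), tls_sock.cipher()
```

`socket.create_connection` corresponds to the TCP three-way handshake, and the hello, certificate, and key exchange messages all happen inside `wrap_socket` before it returns.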