Foreword

Before we get into the topic, let’s start with a question.

What happens when you type rainbowinpaper.com into your browser and press Enter?

This question may get roughly the same answer from everyone, but I believe that answer will itself raise questions: some parts always seem a little hazy. So instead of settling for the usual summary, let’s try to answer as concretely as possible, without skipping any details.

This question can be probed to almost unlimited depth, which makes it a good test of how solid your web fundamentals are, and since I’m a curious person, I’m going to dig as far down the chain of knowledge as I can. After all, as programmers we don’t have to know how to fix computers, but when someone asks us how the Internet works, we should have a reasonably accurate answer.

This article uses the web request process as an example to walk through how the browser and the various protocols of the computer network work together: how packets are sent and received, and how the browser parses the received files, from entering the URL and hitting Enter to the final page being presented to you. I’ll try to explain what happens at each step of a request from your computer to the target server, and hopefully give you a fresh perspective on the question.

I originally intended to write a single article leading into the browser topic, but the more I dug while organizing the material, the more reference articles piled up, and a finished version would run to tens of thousands of words and still not explain everything clearly. So I will publish it as a series of articles, so that each one can be read and digested on its own. Of course, if there are any errors or omissions in the article, comments are welcome and I will amend it. And if anything is still unclear after reading, please raise it in the comments section; I will summarize all the questions at the end of the article to help later readers.

Cut to the chase.

According to HTTP: The Definitive Guide, the answer to the question above can be summarized as follows:

  • The browser resolves the server hostname from the URL
  • The browser translates the server host name into the server IP address
  • The browser resolves the port number (if any) from the URL
  • The browser establishes a TCP connection with the Web server
  • The browser sends an HTTP request packet to the server
  • The server returns an HTTP response packet to the browser
  • Close the connection and the browser displays the document

For a modern browser this summary is not detailed enough, but the overall process is correct. I have also summarized it in a diagram, and we will follow this outline to explain the process in detail.

But before we get to that, we need to understand the concept of network layering.

Network layering

The earliest network layering most of us encounter is probably the OSI seven-layer reference model.

OSI model data transfer:

When a packet is sent, each layer wraps the data in its own header; when a packet is received, those headers are unwrapped layer by layer, step by step, until what reaches the corresponding application is the original data.

Typically, teaching uses a five-layer reference model:

| Layer | Role | Protocols | Data unit |
| --- | --- | --- | --- |
| Application layer | Supports a variety of network applications | FTP, SMTP, HTTP | Message |
| Transport layer | Process-to-process data transfer | TCP, UDP | Segment |
| Network layer | Routing and forwarding of packets from the source host to the destination host | IP, ICMP, OSPF | Datagram |
| Data link layer | Assembles datagrams passed down from the network layer into frames | Ethernet, PPP | Frame |
| Physical layer | Bit transmission | | Bit |

In practice, we still use the TCP/IP model:

| Layer | Protocol stack |
| --- | --- |
| Application layer | HTTP, FTP, DNS |
| Internet layer (network layer) | IP, ARP |
| Transport layer | TCP, UDP |
| Network interface layer (data link layer / network access layer) | Ethernet, ATM, Frame Relay |

TCP/IP merges the application, presentation, and session layers into a single application layer, and the physical and data link layers into the network interface layer.

TCP/IP is not just TCP and IP, but a whole suite of protocols including FTP, SMTP, TCP, UDP, IP, and ARP.

Therefore, an HTTP request goes through the following protocols: HTTP, TCP, IP, and ARP.

The browser parses the URL

Before we talk about the URL, let’s meet its family:

  • URI: Uniform Resource Identifier
  • URL: Uniform Resource Locator
  • URN: Uniform Resource Name

I won’t expand on the detailed definitions here, as they are well-worn territory; if you want the specifics, see the related articles listed at the end.

URL consists of six parts: protocol, host, port, path, query parameters, and anchor point.

It’s worth noting that the browser automatically completes the URL: for example, if I type rainbowinpaper.com, what is actually accessed is http://rainbowinpaper.com/.

The browser knows the following from the URL:

  • Protocol “http”

    Using HTTP

  • Resource “/”

    The requested resource is the home page (index)
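As a sketch of this parsing step, Python’s standard urllib.parse can split a URL into the six parts mentioned above (the fuller example URL below is invented; a real browser’s parser is more forgiving, but the pieces are the same):

```python
from urllib.parse import urlsplit

url = "http://rainbowinpaper.com:80/path/page?key=value#section"
parts = urlsplit(url)

print(parts.scheme)    # protocol: http
print(parts.hostname)  # host: rainbowinpaper.com
print(parts.port)      # port: 80
print(parts.path)      # path: /path/page
print(parts.query)     # query parameters: key=value
print(parts.fragment)  # anchor: section
```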

Did you enter a URL or a search keyword?

When the protocol or host name is invalid, the browser sends the text entered in the address bar to the default search engine. In most cases, when text is passed to a search engine, the URL carries a specific string of characters that tell the search engine that the search came from that particular browser.

Converting non-ASCII Unicode characters

  • The browser checks whether the input contains any characters outside a-z, A-Z, 0-9, - or .
  • Here the host name is rainbowinpaper.com, so there are no non-ASCII characters; if there were, the browser would apply Punycode encoding to the hostname part
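Python’s built-in idna codec can illustrate the Punycode step (the non-ASCII hostname below is just an example, not a real site):

```python
# The idna codec applies Punycode to any label containing non-ASCII
# characters; pure-ASCII labels pass through unchanged.
print("rainbowinpaper.com".encode("idna"))  # b'rainbowinpaper.com' (unchanged)
print("bücher.example".encode("idna"))      # b'xn--bcher-kva.example'
```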

Check the HSTS list

  • The browser checks its built-in list of “preloaded HSTS (HTTP Strict Transport Security)” sites, which request that the browser use only HTTPS for connections
  • If the site is in the list, the browser will use HTTPS instead of HTTP; otherwise, the original request will be sent using HTTP
  • Note that even if a site is not on the HSTS list, it can still ask the browser to apply the HSTS policy: after the browser makes its first HTTP request to the site, the site returns a response telling the browser to send subsequent requests only over HTTPS. However, that very first HTTP request may expose the user to a downgrade attack, which is why modern browsers preload the HSTS list.

After entering the URL, the browser parses the protocol, host, port, path, and other information and constructs an HTTP request.

```
// The request method is GET, the path is the root path, and the HTTP version is 1.1
GET / HTTP/1.1
```

However, the HTTP request may not be sent because the browser first checks the local cache for information consistent with the request header.

Browser cache

HTTP caching mechanism

The browser cache mechanism, also known as the HTTP cache mechanism, is based on the cache identifiers in HTTP packets; these identifiers are described in more detail below. Before we look at the browser caching mechanism, let’s take a quick look at HTTP packets. There are two types of HTTP packets:

  1. HTTP Request packet. The format is: request line, then HTTP headers (general headers, request headers, and entity headers), then the request body (only requests such as POST carry a body).

The first line is the request line, followed by the HTTP headers.

  2. HTTP Response packet. The format is: status line, then HTTP headers (general headers, response headers, and entity headers), then the response body.

The first line is the status line, followed by the HTTP headers.
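The two message shapes can be illustrated by splitting a hand-written response into its parts (the message below is invented for illustration):

```python
# An invented response message: status line, headers, blank line, body.
raw = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Cache-Control: max-age=600\r\n"
    "\r\n"
    "<html>hello</html>"
)

head, body = raw.split("\r\n\r\n", 1)            # blank line separates head and body
status_line, *header_lines = head.split("\r\n")  # first line is the status line
headers = dict(line.split(": ", 1) for line in header_lines)

print(status_line)               # HTTP/1.1 200 OK
print(headers["Cache-Control"])  # max-age=600
print(body)                      # <html>hello</html>
```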

Cache process analysis

The browser communicates with the server in a request-response pattern: the browser initiates an HTTP request and the server responds to it. After the browser sends a request to the server for the first time and gets the result, it decides whether to cache the result according to the cache identifiers in the HTTP headers of the response packet. If so, the request result and the cache identifiers are stored in the browser cache. The simple process is as follows:

From the process above we can see:

1. Each time the browser initiates a request, it first looks in the browser cache for a cached result and cache identifier for that request

2. Each time the browser receives a response, it stores the result and the cache identifier in the browser cache

These two conclusions are the key to the browser cache mechanism, which ensures that every request is cached and read. Once we understand the browser cache rules, all the problems will be solved.

To help you understand, we divide the caching process into two parts depending on whether an HTTP request must be re-sent to the server: the mandatory cache and the negotiated cache.
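The decision flow between the two can be sketched roughly like this (function and field names here are illustrative, not a real browser API):

```python
import time

def decide(cache_entry, now):
    """Which path a request takes: the mandatory cache is consulted
    first; if it has expired, fall back to the negotiated cache."""
    if cache_entry is None:
        return "full request"                      # nothing cached yet
    if now < cache_entry["expires_at"]:
        return "mandatory cache hit (no request)"  # still fresh
    if "etag" in cache_entry or "last_modified" in cache_entry:
        return "conditional request (negotiate)"   # server answers 304 or 200
    return "full request"                          # stale, no identifiers

now = time.time()
entry = {"expires_at": now + 600, "etag": '"abc123"'}
print(decide(entry, now))        # mandatory cache hit (no request)
print(decide(entry, now + 601))  # conditional request (negotiate)
print(decide(None, now))         # full request
```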

Mandatory cache

Mandatory caching is the process of looking up the request result in the browser cache and deciding whether to use the cached result based on its caching rules. When the browser sends a request to the server, the server returns the cache rules in the HTTP headers of the response packet along with the request result. The fields that control the mandatory cache are Expires and Cache-Control, and Cache-Control has a higher priority than Expires.

Here I use a table to compare the difference between the two fields:

| | Expires | Cache-Control |
| --- | --- | --- |
| Protocol version | HTTP/1.0 | HTTP/1.1 |
| Field value format | Mon, 16 Apr 2021 00:00:00 GMT | public, private, no-cache, no-store, max-age=600, ... |
| Where it appears | Response headers returned by the server | Both response headers and request headers |
| Drawback | Server time and browser time may differ, making the expiry time unreliable | The max-age duration still expires eventually |

Cache-Control has a higher priority than Expires, so when both are present caching follows the value of Cache-Control. max-age=600 means that if the request is made again within 600 seconds, the cached result is used: the mandatory cache takes effect.

Note: Cache-Control is a better option than Expires when you can’t be sure the client’s clock is synchronized with the server’s, which is why only Cache-Control takes effect when both exist.
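A minimal sketch of that priority rule, assuming header values shaped like the table above:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def freshness_lifetime(headers, response_time):
    """How long a response stays fresh, in seconds.
    Cache-Control: max-age takes priority over Expires."""
    for directive in headers.get("Cache-Control", "").split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):
            return int(directive.split("=", 1)[1])
    if "Expires" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        return (expires - response_time).total_seconds()
    return 0  # no cache rule: not fresh

t0 = datetime(2021, 4, 16, 0, 0, 0, tzinfo=timezone.utc)
# max-age wins even though Expires is also present:
print(freshness_lifetime(
    {"Cache-Control": "public, max-age=600",
     "Expires": "Fri, 16 Apr 2021 01:00:00 GMT"}, t0))   # 600
# Expires alone: lifetime is Expires minus the response time:
print(freshness_lifetime(
    {"Expires": "Fri, 16 Apr 2021 01:00:00 GMT"}, t0))  # 3600.0
```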

Now that we understand the mandatory caching process, let’s think about it a little more broadly:

Where does the browser cache reside, and how do I determine if the mandatory cache is in effect in the browser?

Here we take a request to the blog as an example. A request whose status code is shown in gray indicates that the mandatory cache was used. The Size column of the request indicates where the cached copy came from: from memory cache or from disk cache.

Verify in the browser:

Visit rainbowinpaper.com/ -> 200 -> close the blog tab -> reopen -> 200 (from disk cache) -> refresh -> 200 (from memory cache)

Following this order, you can look at the Network panel of the developer tools to see the specific request information.

It is not hard to see that in the final refresh step, both the memory cache and the disk cache are used.

To solve this problem, we need to understand from memory cache and from disk cache, as follows:

  1. From memory cache: The memory cache has two characteristics, namely fast access and timeliness:

    1. Fast read: The memory cache directly stores the compiled and parsed files into the memory of the process, occupying certain memory resources of the process, and facilitating the fast read of the next run.
    2. Timeliness: Once the process is shut down, its memory is emptied.
  2. From disk cache: The cache is directly written to the disk file. To read the cache, I/O operations are performed on the disk file stored in the cache and the cache content is parsed again. The reading of the cache is complex and slower than that of the memory cache.

In the browser, JS and image files are stored directly in the memory cache after being parsed, so when refreshing the page they only need to be read from the memory cache. CSS files are stored on disk, so every page render reads them from the disk cache.

At this point, if the browser request does not hit the mandatory cache, the browser will find out if the negotiated cache exists.

The negotiated cache

The negotiated cache is the process in which, after the mandatory cache has expired, the browser sends a request to the server carrying the cache identifier, and the server decides whether the cached copy may still be used. There are two main outcomes:

  1. Negotiation cache takes effect, return 304
  2. Failed to negotiate cache, return 200 and request result

Similarly, the identifiers for the negotiated cache are returned to the browser in the HTTP headers of the response packet along with the request result. The fields controlling the negotiated cache come in two pairs: Last-Modified / If-Modified-Since and ETag / If-None-Match, where ETag / If-None-Match has the higher priority.

  1. Last-Modified is returned by the server when it responds to a request, giving the time the resource file was last modified on the server.
  2. If-Modified-Since is sent by the client when it makes the request again, carrying the Last-Modified value from the previous response; it tells the server when the resource was last modified as of the previous request. When the server receives a request containing the If-Modified-Since field, it compares that value with the resource’s last modification time on the server. If the resource was modified after the If-Modified-Since time, the resource is returned with status code 200; otherwise, 304 is returned, indicating that the resource has not been updated and the cached file can continue to be used.
  3. ETag is a unique identifier for the current resource file (generated by the server), returned when the server responds to a request.
  4. If-None-Match is sent by the client when it makes the request again, carrying the ETag value from the previous response; it tells the server the unique identifier returned for this resource last time. When the server receives a request containing If-None-Match, it compares that value with the resource’s current ETag on the server. If they match, 304 is returned, indicating the resource has not been updated and the cached file is still used; if they differ, the resource file is returned with status code 200.
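The server-side check can be sketched as follows (a toy model: ETags are plain strings and modification times are plain numbers, for simplicity):

```python
def respond(request_headers, resource_etag, resource_mtime):
    """Sketch of the server-side negotiated-cache check.
    ETag / If-None-Match has priority over Last-Modified / If-Modified-Since."""
    etag = request_headers.get("If-None-Match")
    if etag is not None:
        return 304 if etag == resource_etag else 200
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        # modified after the client's copy? then send a fresh 200
        return 200 if resource_mtime > since else 304
    return 200

print(respond({"If-None-Match": '"v1"'}, '"v1"', 100))   # 304: cache still valid
print(respond({"If-None-Match": '"v0"'}, '"v1"', 100))   # 200: resource changed
print(respond({"If-Modified-Since": 100}, '"v1"', 100))  # 304: not modified since
print(respond({"If-Modified-Since": 50}, '"v1"', 100))   # 200: modified since
```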

Caching flow diagram

Conclusion

The handling of the browser cache ends here; now it’s time to make the network request.

DNS

DNS protocol

Because IP addresses are hard to remember and reveal nothing about the name or nature of the organization at that address, people designed domain names, and use the DNS (Domain Name System) protocol to map domain names and IP addresses to each other. This lets people access the Internet conveniently by name instead of remembering strings of machine-readable IP numbers. Mapping a domain name to an IP address is called forward resolution, and mapping an IP address to a domain name is called reverse resolution.

DNS consists of query requests and query responses, and the two message structures are basically the same. DNS can be carried over UDP or TCP, on port 53. In most cases DNS uses UDP: TCP is used for zone transfers, and UDP for everything else.
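For a concrete picture of the query message that would travel over UDP to port 53, here is a rough sketch of building one by hand following the RFC 1035 layout (the query ID is arbitrary):

```python
import struct

def build_dns_query(hostname, query_id=0x1234):
    """Sketch of a DNS query message: a 12-byte header, then the QNAME
    as length-prefixed labels, then QTYPE=A and QCLASS=IN."""
    # header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, AN/NS/AR=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"                                  # zero-length root label ends the name
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question

packet = build_dns_query("rainbowinpaper.com")
print(len(packet))  # 36: 12-byte header + 20-byte QNAME + 4 bytes QTYPE/QCLASS
# This datagram would then be sent over UDP to port 53 of the DNS server.
```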

Lookup DNS cache

The browser checks caches in four places at this stage. The first place is the browser’s own cache of DNS records.

Browsers keep DNS records for websites you have visited for a fixed period of time, so this is the first place the DNS query runs. The browser first checks whether the URL has a corresponding DNS record in its cache giving the IP address of the target host.

The second place the browser checks is the operating system cache. If the DNS record is not in the browser cache, the browser makes a system call to the operating system (for example, gethostbyname on Windows).

The third place the browser checks is the router cache. If the DNS record is not on your computer, the query goes to the router the computer is connected to, which often maintains its own DNS cache.

If the connected router has no DNS record either, the query moves on to the ISP cache: the cache of the local DNS server run by your local communication service provider. The ISP maintains its own DNS servers, and the essence of caching DNS records is to reduce request time and respond quickly. Once you have visited certain sites, your ISP’s DNS server may have cached those records for quick access next time (note: the query between a host and the local DNS server is recursive).

If none of the preceding four steps yields a DNS record, there is no DNS cache, and a full DNS query must be made to find the IP address of the target host (note: the queries between the local DNS server and the other DNS servers are iterative, to avoid putting excessive pressure on the root DNS servers).

Above we introduced two concepts: recursive query and iterative query.

(1) Recursive query: the local machine sends a query request to the local DNS server and waits for the final result. If the local DNS server cannot resolve the name itself, it acts as a DNS client and queries other DNS servers on the machine’s behalf, until the final IP address is obtained.

(2) Iterative query: the local DNS server queries the root DNS server, the root DNS server tells it where to query next, and it queries again, acting as a client toward each successive server.

The DNS server

First, the local computer must know the IP address of a DNS server, otherwise it cannot get on the Internet: only through a DNS server can it learn the IP address behind a domain name. The DNS server’s IP address may be dynamic, assigned by the gateway each time the machine joins the network via the DHCP mechanism (described later); or it may be a fixed address configured in advance.

There are several public DNS servers available, the most famous being Google’s 8.8.8.8.

DNS is a distributed system of domain name servers. Each DNS server only maintains part of the mapping between domain names and IP addresses; no single server maintains all of them.

There are roughly three classes of DNS server: root DNS servers, top-level domain (TLD) DNS servers, and authoritative DNS servers.

  • Root DNS servers. As of 10:33 am on September 28, 2021, the root server system consisted of 1,404 instances run by 12 independent root server operators. The 13 root DNS server identities are operated by these 12 independent organizations; a list of root DNS servers and organizations can be found at root-servers.org/. The root DNS servers provide the IP addresses of the TLD servers.
  • Top-level domain (TLD) DNS servers. There are TLD servers or server clusters for each top-level domain such as com, org, net, edu, and gov, and for all the country domains such as uk, fr, ca, and jp. For a list of all top-level domains, see tld-list.com/. The TLD servers provide the IP addresses of the authoritative DNS servers.
  • Authoritative DNS servers. Every organization with publicly accessible hosts on the Internet, such as web servers and mail servers, must provide publicly accessible DNS records that map those host names to IP addresses. The organization’s authoritative DNS server houses these DNS records.

In addition to the three classes above, another important kind of DNS server that is not part of the DNS hierarchy is the local DNS server, the default resolver. The local DNS server is the default DNS server configured on the PC for resolution; common examples are the local DNS services of telecom carriers such as China Telecom and China Unicom, Google, Alibaba, and so on.

Domain name hierarchy

How does the DNS server know the IP address of every domain name? The answer is hierarchical query.

Take www.bilibili.com as an example: www is the third-level domain, bilibili is the second-level domain, and com is the top-level domain.

Not only that, but at the end of all domain names, there’s actually a root domain name.

For example, the fully qualified form of www.bilibili.com is www.bilibili.com.root, written www.bilibili.com. with a trailing dot. The .root part is omitted because the root domain is the same for all domain names.

The level below the root domain is called a top-level domain (TLD), such as .com and .net. The next level down is called a second-level domain (SLD), such as .bilibili in www.bilibili.com; this is the level users can register. The level below that is the host name, such as www in www.bilibili.com, also known as the third-level domain; it is assigned by users to servers in their own domain, however they like.

To summarize, the hierarchy of domain names is as follows.

Host name . second-level domain . top-level domain . root domain

DNS record types

The correspondence between a domain name and an IP address is called a “record”. Records are divided into different types depending on the usage scenario.

Common DNS record types are as follows.

(1) A: Address, which returns the IP Address pointed to by the domain name.

(2) NS: Name Server, which returns the address of the Server that stores the next-level domain Name information. The record can only be set to a domain name, not an IP address.

(3) MX: Mail eXchange, which returns the address of the server that receives emails.

(4) CNAME: Canonical Name, which returns another domain Name, i.e. the current query is a jump to another domain Name.

(5) PTR: Pointer Record, used only to query a domain name from an IP address.

Here is the resolution configuration for my own domain name rainbowinpaper.com:

There is also a TTL setting, short for Time To Live: the length of time for which DNS servers may cache the resolution record.

The hierarchical query

The DNS server performs hierarchical query based on the domain name level.

To be clear, each level of domain has its own NS records. The NS records point to the DNS servers for that level of domain, and those servers know the various records of the next level down.

Hierarchical search is to search NS records of domain names at each level from the root domain name to the final IP address. The process is as follows.

  1. Query the root DNS server for the NS records and A records of the top-level DNS server
  2. Query the top-level DNS server for the NS records and A records of the second-level DNS server
  3. Get the IP address of the host name from the second-level DNS server
  4. The local DNS server returns the obtained IP address to the operating system and caches it itself
  5. The operating system returns the IP address to the browser and caches it itself
  6. At this point, the browser has the IP address corresponding to the domain name, and caches it as well

If you look closely at the procedure above, you may notice there is no mention of how a DNS server knows the IP addresses of the root DNS servers. The NS records and IP addresses of the root servers essentially never change, so they are built into every DNS server.

Using the built-in root server IP addresses, the DNS server sends query requests to all of them, asking for the NS record of the top-level domain com in www.baidu.com. The first root server to reply is cached, and future requests go only to that server, and so on down the hierarchy.
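The hierarchical walk above can be simulated with a toy zone table standing in for the real servers (every server name and IP address below is made up):

```python
# Each "server" only knows where to send you next; the resolver walks
# down from the root following NS referrals until it finds an A record.
ZONES = {
    "root":        {"com.":              ("ns", "tld-server")},
    "tld-server":  {"bilibili.com.":     ("ns", "auth-server")},
    "auth-server": {"www.bilibili.com.": ("a", "203.0.113.10")},
}

def resolve(name, server="root"):
    """Iteratively follow NS referrals until an A record is found."""
    for suffix, (kind, value) in ZONES[server].items():
        if name.endswith(suffix):
            if kind == "a":
                return value             # final answer: the IP address
            return resolve(name, value)  # referral to the next level's server
    raise LookupError(name)

print(resolve("www.bilibili.com."))  # 203.0.113.10
```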

It is worth noting that DNS query packets pass through many routers and devices before reaching servers such as the root domain name. Each device or router uses a routing table to determine which path is the fastest choice for packets to reach their destination. If you go further here, you will get involved in routing algorithms, and if you are interested in routing algorithms, you can look them up. I won’t dig any deeper.

Conclusion

As we mentioned earlier, the browser cache, operating system cache, router cache, and local DNS server are all consulted along the way. Once DNS resolution is complete, we have the IP address of the target domain, and we can establish the TCP connection.

A TCP connection

Socket and stream socket

In a network, an endpoint is identified by the combination of sender and receiver sockets. A socket uniquely identifies a host and a process on that host.

Socket = (host IP address, port number)

When the browser gets the IP address of the destination server and the port number given in the URL (the default HTTP port number is 80 and HTTPS port number is 443), it calls the system library function socket to request a TCP stream socket.

The request is first sent to the transport layer, where it is encapsulated into a TCP segment. The destination port is added to the header, and the source port is selected from the dynamic port range of the system kernel (ip_local_port_range on Linux).

What is a Socket?

Take an example: A and B chat on QQ. QQ is an independent application, so the chat involves two sockets: one on A’s computer and one on B’s. When A says to B, “Let’s go and have fun this weekend!”, this sentence is a piece of data that is first stored in the socket on A’s computer. Once A’s QQ and B’s QQ are successfully connected, A’s socket sends this data to B’s computer, where it is stored in the socket on B’s computer before B has even seen it. The socket then delivers the data to B.

Why do we need more sockets during data transmission?

A: Because different applications correspond to different sockets, and sockets ensure that QQ’s data does not wander off and end up in MSN on a whim: the socket contents of the QQ and MSN applications are completely different.

So what is inside a Socket?

A socket address! A socket address is a data structure; taking the TCP transport protocol as our example, it contains four kinds of data: the address type, the port number, the IP address, and padding bytes.
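A minimal sketch of a socket pair in action, using the loopback interface so the “server” and “client” live in one process:

```python
import socket

# A listening "server" socket and a connecting "client" socket on the
# loopback interface; each endpoint is an (IP address, port) pair.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
server.listen(1)
host, port = server.getsockname()  # the server's (IP address, port) pair

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((host, port))       # the TCP three-way handshake happens here
conn, peer = server.accept()       # peer is the client's (IP, port) pair

client.sendall("Let's go and have fun this weekend!".encode())
data = conn.recv(1024)             # the data arrives via the peer socket
print(data.decode())

client.close(); conn.close(); server.close()
```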

One caveat here: Chrome allows at most six TCP connections to the same domain at any one time; any further requests have to wait.

TCP protocol

The transport layer has two protocols:

Transmission Control Protocol (TCP): a connection-oriented, reliable, byte-stream-based transport layer communication protocol.

User Datagram Protocol (UDP).

Characteristics

  1. TCP is a connection-oriented (virtual) transport layer protocol.

  2. Each TCP connection can have only two endpoints, and each TCP connection can be point-to-point only.

  3. TCP provides reliable delivery: no errors, no loss, no duplication, and in-order arrival. In short: reliable and ordered, nothing lost, nothing duplicated.

  4. TCP provides full duplex communication.

    Send cache: Data ready to be sent & data sent but not acknowledged.

    Receive cache: Data that arrives sequentially but has not yet been read by the receiving application & data that does not arrive sequentially.

  5. TCP is byte-stream-oriented.

    Stream: A sequence of bytes flowing into or out of a process.

    TCP treats the data handed over by an application as just a series of unstructured byte streams.

TCP header format:

Three-way handshake

The establishment of a TCP connection requires a three-way handshake:

  • SYN: Synchronous Establishes connection

    If the value is 1, the request is for establishing a connection and does not carry application layer data.

  • ACK: Acknowledgement

    If the value is 1, the acknowledgment number field is valid; during the handshake, it responds to the request whose SYN is 1.

  • Seq: indicates the sequence number

    Its initial value is a random number.

  • Ack: Confirm the sequence number

    Indicates that all serial numbers before this serial number have been confirmed and data from the beginning of this serial number is expected to be obtained.

  1. At the beginning, both sides are in the CLOSED state. Then the server starts listening on a port and enters the LISTEN state, ready to accept an external TCP connection and waiting for a client’s connection request.
  2. The client sends a connection request segment to the server, with the synchronization bit SYN = 1 set, and chooses an initial sequence number, written seq = x. A SYN segment may not carry data, but it consumes one sequence number. The client then enters the SYN-SENT state.
  3. After receiving the client’s request, the server must acknowledge the client’s segment: it sets both the SYN and ACK bits to 1 in the acknowledgment segment, with acknowledgment number ack = x + 1 and its own initial sequence number seq = y. The server allocates caches and variables for this TCP connection and returns the acknowledgment segment to the client, agreeing to the connection. Note that this segment also cannot carry data, but it likewise consumes a sequence number. The server then enters the SYN-RECEIVED state.
  4. After receiving the server’s response, the client must acknowledge the connection as well: it sets ACK to 1, with sequence number seq = x + 1 and acknowledgment number ack = y + 1. According to TCP, this segment may or may not carry data; if it carries no data, the next segment still uses seq = x + 1. The client allocates caches and variables for this TCP connection and returns the acknowledgment to the server; this segment may carry data. The client then enters the ESTABLISHED state.
  5. After receiving the client’s acknowledgment, the server also enters the ESTABLISHED state.

On the third handshake, you can carry data. The first two handshakes cannot carry data.

If the first two handshakes could carry data, an attacker would only need to stuff a large amount of data into the SYN packet of the first handshake to attack the server, forcing it to spend time and memory processing that data and increasing the risk of attack.

By the third handshake, the client is already in the ESTABLISHED state, so both sides have confirmed that sending and receiving work properly.
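The handshake can be summarized as events driving two tiny state machines; this is only the bookkeeping from the steps above, not real TCP:

```python
# Each (state, event) pair maps to the next state, following the
# numbered steps above.
TRANSITIONS = {
    ("CLOSED",       "passive open"):           "LISTEN",
    ("CLOSED",       "send SYN"):               "SYN-SENT",
    ("LISTEN",       "recv SYN, send SYN+ACK"): "SYN-RECEIVED",
    ("SYN-SENT",     "recv SYN+ACK, send ACK"): "ESTABLISHED",
    ("SYN-RECEIVED", "recv ACK"):               "ESTABLISHED",
}

def step(state, event):
    return TRANSITIONS[(state, event)]

server = step("CLOSED", "passive open")
client = step("CLOSED", "send SYN")              # first handshake
server = step(server, "recv SYN, send SYN+ACK")  # second handshake
client = step(client, "recv SYN+ACK, send ACK")  # third handshake
server = step(server, "recv ACK")
print(client, server)  # ESTABLISHED ESTABLISHED
```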

What happens if the third handshake fails?

  • The server is in the SYN-RCVD state; if it does not receive the client’s ACK packet, it will resend the SYN+ACK packet
  • If the server still receives no ACK from the client after repeatedly resending SYN+ACK, it sends an RST packet to forcibly close the connection

The four-way wave

Of course, besides the three-way handshake that establishes a TCP connection, there is a corresponding four-way wave for closing one. When the data transfer is complete, the connection can be closed; the close can be initiated by either the server or the client. When the connection ends, the “resources” (caches and variables) in the hosts are released.

Again, let’s go straight to the process:

  • Stop bit FIN: if FIN = 1, the sender has finished sending its data and requests that the connection be released.
  1. At first, both sides are in the ESTABLISHED state. The client sends a connection release segment, stops sending data, and actively closes the TCP connection, entering the FIN-WAIT-1 state. Note that the connection is now half-closed on the client side: the client can no longer send data to the server, only receive it.

     FIN = 1, seq = u
  2. After receiving this segment, the server sends an acknowledgment segment to the client and enters the CLOSE-WAIT state. The client receives the server’s acknowledgment and enters the FIN-WAIT-2 state.

     ACK = 1, seq = v, ack = u + 1
  3. After the server finishes sending its data, it sends a connection release segment to the client, enters the LAST-ACK state, and actively closes its side of the TCP connection.

     FIN = 1, ACK = 1, seq = w, ack = u + 1
  4. After receiving the server’s connection release segment, the client enters the TIME-WAIT state and sends back an acknowledgment segment. The client then closes the connection completely once a timeout timer reaches 2MSL (twice the Maximum Segment Lifetime).

     ACK = 1, seq = u + 1, ack = w + 1

     If the client receives no retransmitted segment from the server within the 2MSL wait, the connection is closed normally; if the server retransmits its FIN (meaning the final ACK was lost), the client resends the ACK and restarts the timer. This is why the client must wait long enough.

The meaning of waiting 2MSL

What happens if you don’t wait?

If the client does not wait and simply leaves, the server may still have packets in flight destined for the client. If the client's port has meanwhile been taken over by a new application, that application will receive stale packets from the old connection, causing confusion. It is therefore safest to wait until any packets from the server have died out on the network before the port is reused.

So, one MSL is not enough, why wait for 2 MSL?

  • One MSL ensures that the last ACK sent by the actively closing side in the four-way wave can reach the peer
  • A second MSL ensures that, if that ACK is lost, the client can still receive the FIN the peer retransmits

That’s what waiting 2MSL is all about.
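The TIME-WAIT state is visible to programmers: a server restarted within 2MSL may find its port still occupied by the dying connection. A minimal Python sketch of the usual remedy, the SO_REUSEADDR socket option (the loopback address and port 0 are arbitrary choices here):

```python
import socket

# A freshly closed listening port can linger in TIME-WAIT for 2MSL.
# SO_REUSEADDR lets a restarting server rebind such a port without
# waiting the full 2MSL, a common and safe practice for servers.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
print(s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0)  # True
s.close()
```

Without the option, rebinding a port in TIME-WAIT typically fails with "address already in use".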

That covers TCP connection management, which interviews tend to focus on. Next, let's look at how TCP sequence numbers are used and how reliable transmission is actually achieved.

TCP packet

TCP packet size

Ethernet frames have a fixed maximum size, originally 1518 bytes and later raised to 1522 bytes: 1500 bytes of payload plus 22 bytes of header information.

Inside the Ethernet payload sits the IP packet, which has its own header of at least 20 bytes, so the IP payload is at most 1480 bytes.

The TCP segment sits in the IP payload and also needs at least a 20-byte header, so the maximum TCP payload is 1480 − 20 = 1460 bytes, known as the Maximum Segment Size (MSS). Because the IP and TCP headers often carry optional extra fields, the actual TCP payload is typically around 1400 bytes.

Thus a single 1500-byte message needs two TCP packets. A major improvement in HTTP/2 is header compression, which often lets an HTTP request fit in a single TCP packet instead of being split across several, improving speed.

TCP packet Number (SEQ)

At roughly 1400 bytes per packet, any large transfer must be split into many packets; a 10MB file, for example, needs more than 7100 packets.
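The arithmetic behind these figures is easy to verify; a quick sketch (assuming the 1460-byte MSS derived earlier and a 10MB file):

```python
import math

MTU = 1500                      # Ethernet payload limit
MSS = MTU - 20 - 20             # minus the IP header and TCP header
assert MSS == 1460

file_size = 10 * 1024 * 1024    # a 10 MB file
packets = math.ceil(file_size / MSS)
print(packets)                  # 7183 -> "more than 7100 packets"
```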

When sending packets, TCP numbers each packet (SEQ) so that the receiver can restore them in order. In case of packet loss, the numbering also reveals which packet is missing.

The number of the first packet (the initial sequence number) is random; for simplicity, call it packet number 1. If this packet's payload is 100 bytes, the next packet's number must be 101. Each packet thus effectively carries two numbers: its own, and (implied by its length) the next packet's. This lets the recipient restore the packets to the original order.
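The numbering scheme can be sketched as a toy function (purely illustrative; real TCP numbers the bytes inside the kernel):

```python
def segment(data: bytes, mss: int, isn: int):
    """Split a byte stream into (seq, payload) segments.
    Every byte is numbered; the next segment's seq is this
    segment's seq plus this payload's length."""
    out = []
    seq = isn
    for i in range(0, len(data), mss):
        chunk = data[i:i + mss]
        out.append((seq, chunk))
        seq += len(chunk)
    return out

# 250 bytes, 100-byte segments, initial sequence number 1
segs = segment(b"x" * 250, mss=100, isn=1)
print([(s, len(p)) for s, p in segs])   # [(1, 100), (101, 100), (201, 50)]
```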

TCP packet assembly

After receiving the TCP packet, the operating system does the assembly restore. Applications do not process TCP packets directly.

Applications do not need to worry about the details of data communication. The data an application cares about is carried inside TCP payloads in the application's own format (such as HTTP).

TCP provides no mechanism for indicating the size of the original file; that is dictated by the application-layer protocol. HTTP, for example, has a Content-Length header that indicates the size of the body. From the operating system's point of view, it simply keeps receiving TCP packets and assembling them in order, one at a time.

The operating system does not process the data inside the TCP packet. Once the TCP packets are assembled, they are handed over to the application. The TCP packet contains a port parameter, which is used to specify which application is forwarded to listen on the port.

TCP reliable transmission

Network layer: provides best effort delivery, unreliable transmission.

Transport layer: Use TCP to achieve reliable transmission.

Reliable: Ensure that the byte stream read by the receiver process from the cache is exactly the same as the byte stream sent by the sender.

TCP implements reliable transmission mechanism:

  1. Checksum: the same as the UDP checksum, computed over the segment plus a pseudo-header.

  2. Sequence number: every byte of the stream is numbered; the sequence-number field holds the number of the first byte of the segment.

  3. Acknowledgement: the ACK number tells the sender the next byte the receiver expects.

  4. Retransmission: Indicates that the TCP sender retransmits the sent packet segment if it does not receive the confirmation within the specified time (retransmission time).

    Timeout retransmission

    TCP uses an adaptive algorithm that dynamically adjusts the retransmission timeout based on measured round-trip times (RTTs).

    What if the retransmission timeout is too long and recovery comes slowly? Redundant ACKs provide a faster signal.

    Redundant ACK (Redundant acknowledgement)

    Whenever an out-of-order segment arrives whose sequence number is higher than expected, the receiver sends a redundant (duplicate) ACK carrying the number of the next byte it still expects.

    Suppose the sender has sent segments 1, 2, 3, 4, and 5, and segment 2 is lost:

    The receiver gets segment 1 and returns an acknowledgement for it (ACK number = the first byte of segment 2)

    The receiver gets segment 3 but still acknowledges only segment 1 (ACK number = the first byte of segment 2)

    The receiver gets segment 4 and again acknowledges only segment 1 (ACK number = the first byte of segment 2)

    The receiver gets segment 5 and again acknowledges only segment 1 (ACK number = the first byte of segment 2)

    After receiving three duplicate ACKs for segment 1, the sender concludes that segment 2 is lost and retransmits it (fast retransmit)
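The walk-through above can be simulated with a toy cumulative-ACK receiver (a sketch, not real TCP; segment sizes follow the 100-byte example used earlier):

```python
def cumulative_acks(expected_first, arriving_segments):
    """Receiver side of cumulative acknowledgement.
    Each arriving segment is (seq, length); the ACK sent back is
    always the number of the next byte expected in order."""
    next_expected = expected_first
    buffered = {}          # out-of-order segments held back
    acks = []
    for seq, length in arriving_segments:
        if seq == next_expected:
            next_expected += length
            # deliver any buffered segments that now fit in order
            while next_expected in buffered:
                next_expected += buffered.pop(next_expected)
        elif seq > next_expected:
            buffered[seq] = length
        acks.append(next_expected)   # a duplicate ACK if nothing advanced
    return acks

# segments of 100 bytes each; the segment at seq 101 is lost
acks = cumulative_acks(1, [(1, 100), (201, 100), (301, 100), (401, 100)])
print(acks)   # [101, 101, 101, 101] -> three duplicate ACKs for byte 101
```

The three duplicates at the end are exactly the trigger for fast retransmit.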

Fast retransmit solves only the timeout problem; it still faces a hard choice: retransmit just the first missing segment, or everything after it? In the example above, should the sender resend only segment 2, or segments 2, 3, 4, and 5?

The sender cannot tell exactly which segments triggered the three duplicate ACKs for 2. Perhaps it sent 20 segments, and 6, 10, and 20 arrived out of order. It may therefore retransmit everything from 2 to 20 (which is what some TCP implementations actually do). This makes fast retransmit a double-edged sword.

A better approach is Selective Acknowledgment (SACK), which adds a SACK option to the TCP header.

SACK (selective determination)

  • Tell the sender which data is missing and which data has been received in advance
  • Enable TCP to resend only the lost segments rather than all subsequent packets.

Flow control

The goal is to slow the sender down, to allow the receiver to receive in time.

TCP uses the sliding window mechanism for traffic control.

During communication, the receiver dynamically adjusts the sender's window according to the size of its own receive buffer; this is rwnd (the receive window). The sender's actual send window is the minimum of rwnd and cwnd (the congestion window).


send window = Min{ receive window rwnd, congestion window cwnd }

Suppose host A sends data to host B. When the connection is established, B tells A: "my rwnd is 400 (bytes)." Assume each segment is 100 bytes and the initial sequence number is 1.

Now suppose B has advertised a zero window, so A stops sending. If B later sends a segment advertising a new, nonzero window and that segment is lost, do A and B end up waiting on each other forever?

The answer is no.

TCP sets a persistent timer for each connection, which starts as soon as one of the TCP connections receives a zero window notification from the other.

When the persistence timer expires, the sender transmits a zero-window probe segment; on receiving the probe, the receiver replies with its current window value.

If the window remains 0, the sender resets the persistence timer. This avoids the situation in the above problem.

It is worth noting that TCP provides reliable transmission and flow control by combining the continuous ARQ protocol and the sliding window protocol, techniques that originated at the data link layer.

ARQ, or Automatic retransmission reQuest, is one of the error correction protocols of the data link layer and transport layer in the OSI model. It implements reliable information transfer on the basis of unreliable services by using two mechanisms: acknowledgement and timeout. If the sender does not receive an acknowledgement frame within a period of time after sending it, it usually resends it. ARQ includes stop wait ARQ and continuous ARQ.

Simply put, stop-and-wait ARQ sends one packet, waits for its acknowledgement, sends the next only after the acknowledgement arrives, and retransmits on timeout. Its drawback is obvious: channel utilization is very low.

Continuous ARQ protocol sends multiple groups of packets continuously, and then realizes reliable transmission through accumulative confirmation.

Continuous ARQ protocol is usually used in combination with the sliding window protocol. The sender needs to maintain a sending window, as shown in the following figure:

The sliding window protocol maintains a sliding window between the sender and the receiver, with the sender as the sending window and the receiver as the receiving window, and this window can slide forward as time changes. It allows the sender to send multiple packets without waiting for confirmation. TCP’s sliding window is in bytes.

As shown in the figure below, the send side involves four kinds of data: data that has been sent and acknowledged (outside the send window and removed from the send buffer); data that has been sent but not yet acknowledged (inside the send window); data that is allowed to be sent but has not been sent yet (inside the send window); and data in the buffer that is not yet allowed to be sent (outside the send window).

The receive side likewise has four kinds: data that has been received, acknowledged, and delivered to the host (outside the receive window and the receive buffer); data received out of order (inside the receive window); data that is allowed to be received but has not yet arrived (inside the receive window); and data that is not allowed to be received (outside the receive window).

Rules:

(1) Any data that has been sent must be retained until its acknowledgement arrives, in case a timeout forces retransmission.

(2) The send window can slide forward some sequence numbers only when sender A receives an acknowledgement segment from the receiver.

(3) If data sent by A remains unacknowledged after a period of time (controlled by a timeout timer), A uses the Go-Back-N (GBN) protocol: it goes back to the last acknowledged position and resends everything from there.
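Rule (3), Go-Back-N, can be illustrated with a toy simulation (a deliberately simplified model: no real timers, each listed packet is lost exactly once, and the receiver discards out-of-order packets as a GBN receiver does):

```python
def go_back_n(num_packets, window, lost):
    """Toy Go-Back-N sender over a lossy channel.
    Packets in `lost` are dropped on their first transmission; on
    timeout the sender resends everything from the oldest
    unacknowledged packet. Returns the log of transmissions."""
    base, next_seq = 0, 0
    lost = set(lost)
    sent_log = []
    delivered = 0           # receiver accepts only in-order packets
    while base < num_packets:
        # fill the window
        while next_seq < base + window and next_seq < num_packets:
            sent_log.append(next_seq)
            if next_seq in lost:
                lost.discard(next_seq)      # dropped, but only once
            elif next_seq == delivered:
                delivered += 1              # in order: accepted
            next_seq += 1
        if delivered > base:
            base = delivered                # cumulative ACK slides window
        else:
            next_seq = base                 # timeout: go back N
    return sent_log

# 5 packets, window of 3, packet 1 lost once:
print(go_back_n(5, 3, [1]))   # [0, 1, 2, 3, 1, 2, 3, 4]
```

Note how packets 2 and 3 are sent twice even though only packet 1 was lost; that waste is exactly what SACK-style selective retransmission avoids.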

Congestion control

Why congestion control when you have flow control?

After the client and server establish a connection, each side's send window is determined by both the flow-control window and the congestion window. But those windows reflect only the capabilities of the two hosts; the congestion level of the network as a whole also determines whether data arrives. Congestion control can therefore be seen as a gentleman's agreement by which all hosts jointly maintain the network's traffic.

The conditions for congestion are: total demand for resources > available resources

Congestion control is a global process:

  • It involves all hosts and routers
  • It concerns all factors related to degraded network transmission performance
  • It is maintained through the joint effort of all participants

Four congestion control algorithms:

Slow start and congestion avoidance

Fast retransmission and fast recovery

Assumptions:

  1. Data is sent in one direction, and only confirmation is sent in the other.

  2. The receiver always has enough cache space, so the size of the send window depends on congestion.

    Congestion window: a window value set by the sender based on its estimate of network congestion, reflecting the network's current capacity.

Slow start and congestion avoidance

Initially one segment is sent; each time acknowledgements arrive, the window doubles (exponential growth) until it reaches the slow-start threshold (ssthresh), after which growth becomes additive. If packet loss occurs, the network is considered congested: the window falls back to the initial value, a new (halved) threshold is set, and the process repeats.

Fast retransmission and fast recovery

Fast retransmit and fast recovery are an upgraded version of the above: on congestion, instead of falling all the way back to the initial value, the sender drops to the new threshold and continues with additive growth from there.
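The window dynamics described above can be sketched numerically (a toy Reno-style model; the ssthresh value and the loss round are arbitrary example inputs):

```python
def cwnd_trace(rounds, ssthresh, loss_rounds):
    """Congestion window per RTT round: slow start doubles cwnd
    until ssthresh, then congestion avoidance adds 1 per round.
    On loss, ssthresh is halved and (fast-recovery style) cwnd
    restarts from the new ssthresh rather than from 1."""
    trace, cwnd = [], 1
    for r in range(rounds):
        if r in loss_rounds:
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh          # fast recovery: resume from threshold
        trace.append(cwnd)
        cwnd = cwnd * 2 if cwnd < ssthresh else cwnd + 1
    return trace

# 10 rounds, initial ssthresh 8, one loss in round 6:
print(cwnd_trace(10, ssthresh=8, loss_rounds={6}))
# [1, 2, 4, 8, 9, 10, 5, 6, 7, 8]
```

The trace shows the characteristic sawtooth: exponential rise, linear climb, then a drop to the halved threshold rather than to 1.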

Thinking further

So far we've covered HTTP caching, DNS resolution, TCP connections, and HTTP requests. Most interviews won't go much beyond congestion control and reliable transmission; the rest is network-engineer territory. But since we want to understand how our data actually travels over the network, let's dig a little deeper, starting with a question:

In a LAN, our phone and computer use LAN IPs assigned by the router. So when our computer requests a page, what gets filled in as the source IP? Even given a source IP and a destination IP, how does our request travel across the network and reach the destination server quickly and accurately? And when a response comes back, how is it decided whether it should go to the phone or the computer?

Seen this way, we have only learned a few common protocols from the upper and middle layers of the network stack. To understand a complete network round trip, we need to go further. IP datagrams and the ARP process go beyond "browser" networking knowledge, but they are genuinely interesting; I'll keep things as simple as possible, and if you're not interested you can skip ahead to the HTTP request section.

The IP datagram

The TCP packet segment is sent to the network layer. The network layer adds an IP header to it, which contains the IP address of the destination server and the IP address of the local host, and encapsulates it into an IP datagram. Here we take IPv4 as an example to introduce the content of the IP protocol.

Important protocols in the network layer: IP protocol, is one of the most important protocols in the TCP/IP system.

Four protocols work alongside IP: the Address Resolution Protocol (ARP), the Reverse Address Resolution Protocol (RARP), the Internet Control Message Protocol (ICMP), and the Internet Group Management Protocol (IGMP).

IP datagram format

  • Version: IPv4/IPv6

  • Type of service: represents the packet's priority; the kernel processes queued packets in priority order

  • Identification, flags, and fragment offset: related to fragmentation rules, since a datagram longer than the network MTU (maximum transmission unit) must be fragmented

  • TTL (time to live): the datagram's lifespan; it is decremented by 1 at each router and the datagram is discarded when it reaches 0. ping, commonly used to check network conditions, relies on ICMP; tools such as traceroute deliberately set small TTLs to trigger ICMP error packets and probe the network path

  • The protocol field identifies the protocol carried in the payload, such as TCP or UDP. Some field values:

    Protocol      ICMP  IGMP  TCP  EGP  IGP  UDP  IPv6  ESP  OSPF
    Field value      1     2    6    8    9   17    41   50    89

MTU: indicates the upper limit of data that can be encapsulated by a data frame at the link layer. The MTU of the Ethernet is 1500 bytes.

So when IP datagrams exceed this value, they are fragmented.

Datagram            Total length
Original datagram   3820
Fragment 1          1420
Fragment 2          1420
Fragment 3          1020
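The table above can be reproduced by a small fragmentation sketch (assuming, as the example does, a link MTU of 1420 bytes and a 20-byte IP header):

```python
def fragment(total_length, mtu, header=20):
    """Split an IP datagram (total_length includes the 20-byte
    header) into fragments for a link with the given MTU. Every
    fragment payload except the last must be a multiple of 8
    bytes, because the fragment offset is counted in 8-byte units."""
    data = total_length - header
    max_payload = (mtu - header) // 8 * 8
    frags, offset = [], 0
    while data > 0:
        payload = min(max_payload, data)
        frags.append({"total_length": payload + header,
                      "offset": offset // 8,       # in 8-byte units
                      "more_fragments": data > payload})
        offset += payload
        data -= payload
    return frags

for f in fragment(3820, mtu=1420):
    print(f)
# total lengths 1420, 1420, 1020 -- matching the table above
```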

IP packet transfer process over the Internet

  • The source host creates a packet
  • The destination address is placed in the packet header
  • The packet is sent to a neighboring router
  • The router receives the packet
  • Using the destination address, the router selects the next router and forwards the packet
  • When the packet reaches the router nearest the destination, that router delivers it to the destination host

After IP grouping, enter the data link layer for data frame packaging:

IPv4 address

IP address: a globally unique 32-bit (4-byte) identifier that identifies an interface of a host or router.


IP address = { <network number>, <host number> }

11011111.00000001.00000001.00000001 = 223.1.1.1 (dotted decimal)

IPv4 classification

(Hint: Category A in the figure should be 1-127)

IPv6 Extension

Motivation for reform:

  1. The finite address space defined by IPv4 will be exhausted
  2. As the number of devices increases, the address configuration needs to be automated and simplified
  3. Internet backbone routers must maintain ever-larger routing tables under a flat routing mechanism
  4. IP-layer security is required, but IPSec over IPv4 never became widespread
  5. Demand for better real-time QoS(quality of Service) support

Difficulty of reform: the whole Internet depends on IP, and that inertia means that changing IP means changing the entire Internet.

IPv6 features:

  1. Huge, hierarchically allocated address space: 128-bit addresses with multi-level subnet allocation
  2. Simplified headers and flexible extensions
  3. Authentication and encryption at the network layer: Fully supports IPSec, enabling users to encrypt data and verify packets
  4. QoS requirements: A router can identify packets of the same data flow
  5. Better support for mobility: addresses can be obtained dynamically for the duration of a call/session
  6. Efficient hierarchical addressing and routing structure: aggregated addresses; fragmentation is performed only by the sending host
  7. Enhanced multicast and flow control
  8. Automatic configuration technology
  9. New protocol for neighboring node interaction: Neighbor discovery protocol implements the interaction management of neighboring nodes (nodes on the same link)

Transition from IPv4 to IPv6:

  1. Dual-stack mechanism: Runs both IPv4 and IPv6 stacks on the same device
  2. Tunneling mechanism: In a gateway site equipped with dual protocol stacks, IPv6 packets are encapsulated in IPv4 packets and transmitted over the IPv4 network. After arriving at the end of the tunnel, the packets are unsealed and restored to IPv6 packets
  3. Translation mechanism: The translation gateway translates IP header addresses between IPv4 and IPv6 networks and translates packets according to different protocols

Subnets and subnet masks

A common home subnet such as 192.168.0.x can address 256 hosts, yet usually only a handful of devices are connected, wasting many IP addresses. So how can subnets be divided more sensibly? And how does our router know whether a requested destination IP is an external address or a LAN address? This is where subnetting and the subnet mask come in.

Let’s take a look at the weaknesses of classified IP addresses:

  1. IP address space utilization is sometimes low.
  2. The two-tier IP address is not flexible.

subnetting

After subnets are created for an organization, the organization still acts as a network. That is, networks outside the organization cannot see subnets within the organization.

Subnet mask

Suppose we now need only four addresses in a subnet: 192.168.0.0 to 192.168.0.3

We mark the part of the network number as 1 and the part of the host number as 0 to get a subnet mask: 255.255.255.252

How to determine whether an IP address is the current network segment:

As the figure shows, ANDing the address with the subnet mask yields a different network number, so the IP does not belong to the current network segment.

The default class A subnet mask is 255.0.0.0; a subnet can hold up to about 16.77 million hosts.

The default class B subnet mask is 255.255.0.0; a subnet can hold up to about 65,000 hosts.

The default class C subnet mask is 255.255.255.0; a subnet can hold up to 254 hosts.

But a full subnet mask is long, and it is hard to read off the number of host bits at a glance. CIDR notation solves this:

192.168.0.0/30: the part before the slash is the network identifier/network address; /30 means the subnet mask has 30 leading 1s. This is clearer and easier to write.

If a host in 192.168.0.0/30 wants to communicate with 8.8.8.8, it first checks the network segment: ANDing 8.8.8.8 with the 30-bit mask gives the network number 8.8.8.8, which differs from our network identifier 192.168.0.0. Different network numbers mean the two hosts are not on the same segment.
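Python's standard ipaddress module performs exactly this check; a quick sketch using the /30 example above:

```python
import ipaddress

net = ipaddress.ip_network("192.168.0.0/30")
print(net.netmask)                                  # 255.255.255.252
print(ipaddress.ip_address("192.168.0.2") in net)   # True
print(ipaddress.ip_address("8.8.8.8") in net)       # False

# The same check done by hand: AND the address with the mask
mask = int(net.netmask)
same = (int(ipaddress.ip_address("8.8.8.8")) & mask) == int(net.network_address)
print(same)                                         # False: different segment
```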

Here’s another example to understand:

A few more words

We usually buy broadband from a carrier, which you can picture as one enormous router: signing up is like being assigned a private IP under it and reaching the Internet through the carrier's proxying. The carrier divides the users in each region into different subnets to use IP addresses efficiently.

After all, IPv4 addresses are limited. If every phone, computer, and connected device had its own unique public IP address, the existing IPv4 address pool would be far from sufficient.

Here is my own home Internet information:

You can see the subnet mask is 255.255.255.255, meaning the subnet contains exactly one address: 117.60.208.89. In effect, when the carrier assigned me an IP it created a subnet holding a single address, because one is all I need; this lets the carrier squeeze the most out of every address in the available IPv4 pool. Note that this is not my router's address; it is the address I use to access the Internet, and traffic may pass through many routers before reaching my home.

Take a look at my WiFi connection:

The router, by contrast, is generous with its allocations: the subnet mask shows the current subnet can hold more than 200 devices. Since every request is proxied by the router and then translated onto the carrier-assigned IP anyway, it doesn't matter if some of the allocated addresses go unused.

DHCP protocol

Dynamic Host Configuration Protocol DHCP is an application-layer protocol that uses client/server mode. The client and server communicate with each other in broadcast mode based on UDP. To put it simply, the router we use in daily life allocates IP, subnet mask and other information through DHCP when we plug in the network cable or connect the mobile phone to the wireless.

DHCP provides a plug and play networking mechanism. Hosts can dynamically obtain IP addresses, subnet masks, default gateways, DNS server names, and IP addresses from servers, enabling address reuse, mobile users to join the network, and IP address renewal.

  1. The host broadcasts a DHCP discover message: "Is there a DHCP server?", trying to find a server on the network and obtain an IP address from it.
  2. A DHCP server broadcasts a DHCP offer message: "Yes!" The server offers the host an IP address and related configuration on a first-come, first-served basis.
  3. The host broadcasts a DHCP request message: "May I use the IP address you offered?", formally requesting the address from the server.
  4. The DHCP server broadcasts a DHCP acknowledgement message: "Go ahead!", formally assigning the IP address to the host.

Network Address Translation (NAT)

Network Address Translation (NAT): NAT software is installed on a router that connects a private network to the Internet. Such a router is called a NAT router and has at least one valid external (global) IP address. NAT comes in two flavors, SNAT and DNAT: SNAT translates private IP addresses to public ones, and DNAT translates public IP addresses to private ones.

When a computer on the LAN sends a data request to an extranet server, the router with NAT capability changes the source IP address in our data request packet to an extranet IP address. The extranet server receives the data request and returns the data to our extranet IP address. The router receives the data and forwards it to us, so the LAN host can connect to the Internet.

To be more specific, NAT includes not only IP addresses but also port numbers for address replacement. To be specific, in the packets that we request to connect to the extranet server, besides the source and destination IP addresses, there are also source and destination port numbers.

The destination port number is fixed, such as 21 or 80. But the source port number is randomly generated (randomly assigned by the browser when it creates the TCP connection). When a packet reaches the NAT device, the private IP address is replaced with a public IP address, and the port number is also replaced with a port number randomly generated by NAT.

NAT port numbers correspond to hosts on the LAN, and the NAT device maintains a table of port numbers and hosts. When the extranet server returns data to the NAT device, the NAT device uses the port number in the returned data packet to find the host on the LAN and forwards the data.
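The port-mapping table the NAT device maintains can be sketched as follows (a toy model; the public IP reuses the example address from earlier, and the port range is made up):

```python
import itertools

# Toy NAT table: (LAN ip, LAN port) <-> public port on the router.
PUBLIC_IP = "117.60.208.89"        # example public address from the text
_ports = itertools.count(40000)    # made-up NAT port range
_out, _in = {}, {}

def translate_outgoing(lan_ip, lan_port):
    """Rewrite a LAN source (ip, port) to (public ip, NAT port)."""
    key = (lan_ip, lan_port)
    if key not in _out:
        nat_port = next(_ports)    # allocate a fresh public-side port
        _out[key] = nat_port
        _in[nat_port] = key
    return PUBLIC_IP, _out[key]

def translate_incoming(nat_port):
    """Find which LAN host a returning packet belongs to."""
    return _in[nat_port]

pub = translate_outgoing("192.168.0.101", 52311)
print(pub)                          # ('117.60.208.89', 40000)
print(translate_incoming(pub[1]))   # ('192.168.0.101', 52311)
```

Responses arriving at the router carry the NAT port, which the reverse table maps back to the right host on the LAN.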

As mentioned above, the private IP allocated by our router passes through multiple layers of NAT before the traffic is finally forwarded onto the Internet by a device holding a public IP address.

conclusion

IP datagrams, IPv4, subnet partitioning, subnet masks, NAT, DHCP. Now let’s put these together:

First of all, a typical home computer connects through a router to the line provided by the carrier, then reaches the Internet using the IP address, subnet mask, and other information allocated by DHCP.

TCP segments are handed to the network layer, where the IP protocol turns them into IP datagrams. Depending on the TCP segment size and the network adapter's MTU, the data may be split across multiple IP datagrams. These are passed to the data link layer and encapsulated into frames, which the physical layer converts into signals and sends to the router.

By matching the destination address with the subnet mask, the router knows whether the data is destined for the Intranet or the Internet. Then, the router replaces the source address and port in the data through NAT, and also saves the mapping cache of the request in the router for delivering the response to the source host on the Intranet.

The router then sends the data on toward the next router; but which one? The ARP process answers that question.

The ARP process

IP addresses belong to the network layer, but IP addresses need to be exchanged across different physical networks during transmission. In this case, if a host wants to send a frame to another host, it is not enough to know its IP address, but also needs to know its valid “hardware address”.

The IP datagram then goes to the link layer, which adds a frame header to the packet containing the MAC address of the local built-in network card and the MAC address of the gateway (local router). As mentioned earlier, if the kernel does not know the MAC address of the gateway, it must perform an ARP broadcast to query its address.

In essence, ARP solves the problem of where the next hop goes, and it runs automatically.

ARP protocol

ARP (Address Resolution Protocol) provides the mapping between 32-bit IPv4 addresses and 48-bit Ethernet MAC (hardware) addresses.

ARP cache: Each host has an ARP cache, which stores an IP address-physical address mapping table and dynamically updates it.

The MAC address must be used when transmitting the data frame on the link of the real network.

ARP: Maps the IP address of a host or router to a MAC address. ARP belongs to IP layer (network layer) in TCP/IP model and data link layer in OSI model.

arp -a [host address]: view the ARP cache
arp -d [host address]: delete an ARP cache entry
arp -s <host address> <MAC address>: add a static cache entry (stored long-term; retention time varies by system)

ARP protocol usage process:

Check the ARP cache first. If a matching entry exists, write the MAC address into the frame and send it. If not, broadcast an ARP request encapsulated in a frame with the destination MAC ff-ff-ff-ff-ff-ff. On receiving the request, the target host unicasts an ARP reply to the source host, which writes the mapping into its ARP cache (entries are refreshed roughly every 10-20 minutes).
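The lookup-then-broadcast flow can be sketched as a toy model (the MAC address and helper names here are made up for illustration):

```python
# Toy ARP resolution: consult the cache, otherwise "broadcast"
# a who-has query and cache the answer.
arp_cache = {}

def broadcast_who_has(ip, network):
    """Stand-in for an ARP broadcast to ff-ff-ff-ff-ff-ff:
    only the owner of `ip` on the segment replies."""
    return network.get(ip)

def resolve(ip, network):
    if ip in arp_cache:             # cache hit: no broadcast needed
        return arp_cache[ip]
    mac = broadcast_who_has(ip, network)
    if mac is not None:
        arp_cache[ip] = mac         # cache the mapping (real entries
    return mac                      # expire after 10-20 minutes)

lan = {"192.168.0.1": "a4:56:02:11:22:33"}   # made-up gateway MAC
print(resolve("192.168.0.1", lan))  # broadcast, then cached
print(resolve("192.168.0.1", lan))  # answered from the cache
```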

Four typical ARP scenarios are as follows:

  1. The sender is a host, and the IP packet needs to be sent to another host on the network. In this case, ARP is used to find the destination physical address.
  2. The sender is the host, and the IP datagram is sent to a host on another LAN. ARP first finds a router on the LAN, and the router does the rest of the work.
  3. The sender is a router, and the IP packet needs to be sent to another host on the local network. In this case, ARP is used to find the destination physical address.
  4. The sender is a router, and the IP datagram is sent to a host on another LAN. ARP first finds another router on the LAN, and the router does the rest of the work.

To summarize the above four cases plus ARP caching:

  • The destination IP address and its own IP address reside on the same network segment

    • The ARP cache has the MAC address of the destination IP address: it is directly sent to the physical address.
    • Arp cache Does not have the MAC address of the destination IP address: Sends ARP broadcasts to request the MAC address of the destination IP address, caches the MAC address, and sends data to report the MAC address.
  • The destination IP address and its own IP address are on different network segments

    In this case, you need to send the packet to the default gateway, so you need to obtain the MAC address of the gateway.

    • The ARP cache has the MAC address of the default gateway: IP data is directly reported to the default gateway and then forwarded to the Internet by the gateway.
    • The ARP cache does not have the MAC address of the default gateway: it still sends an ARP broadcast request for the MAC address of the default gateway, caches the address, and reports data to the gateway.

ARP certainly earns its keep: the world contains many different networks that use different forms of physical address. Heterogeneous networks can interoperate because IP gives addresses a unified format, and ARP resolves those addresses to physical ones. The whole ARP process is carried out automatically by software; users never perceive it.

Up to this point we have a general picture of how our data traverses the network and of the many protocols involved. Of course, I have described only one simplified transmission pass, and only down to the network layer; below that lie data-link framing and the physical conversion into electrical and optical signals. Still, we have laid a foundation of lower-level knowledge. Returning to the browser's view of computer networks, let's talk about the last major protocol in this article: HTTP.

The HTTP request

World Wide Web Overview

The World Wide Web (WWW) is a large-scale, on-line information repository/data space, which is a collection of countless network sites and Web pages.

Resources (text, video, audio) are accessed through Uniform Resource Locators (URLs), which uniquely identify them.

Users click hyperlinks to retrieve resources, which are delivered by the HyperText Transfer Protocol (HTTP). The Web works in client/server fashion: the client is the user's browser, and the server runs on the host where the web documents reside. The Web uses the HyperText Markup Language (HTML), which lets page designers easily link from one page to another and have them displayed on the user's screen.

The World Wide Web originated at CERN, the European particle physics laboratory in Geneva. It is the emergence of WWW technology that made the Internet develop at a speed beyond imagination: within a short ten years, this TCP/IP-based technology became the largest information system on an Internet that had already been developing for decades. Its success is attributed to its simplicity and practicality. Behind the WWW stands a family of protocols and standards that make it work, including the HTTP hypertext transfer protocol.

In 1990, HTTP became the supporting protocol for the WWW.

HTTP history

HTTP / 0.9

  • There is only one command GET
  • Response type: hypertext only
  • There is no header information describing the data
  • The server closes the TCP connection after sending the packet

The 1991 prototype version of HTTP was called HTTP/0.9. The protocol had a number of serious design flaws and should only be used to interact with legacy clients. HTTP/0.9 supports only the GET method; it has no support for MIME types for multimedia content, HTTP headers, or version numbers. It was originally defined to fetch simple HTML objects and was soon superseded by HTTP/1.0.

HTTP / 1.0

  • Support POST, HEAD and other request methods
  • Adds status codes and headers
  • Multi-character set support, multi-part sending, permissions, caching, etc
  • Response: No longer limited to hypertext (MIME type support)

1.0 was the first widely used version of HTTP. HTTP/1.0 adds version numbers, various HTTP headers, some additional methods, and handling of multimedia objects. HTTP/1.0 made possible Web pages with vivid images and interactive tables that helped make the World Wide Web widely accepted.

HTTP / 1.0 + & HTTP / 1.1

  • Persistent connections. The TCP three-way handshake happens once, before the first request; when all the data has been sent, the server signals that it has nothing more to send, and the client closes the connection (tearing down TCP)
  • Supported methods: GET, HEAD, POST, PUT, DELETE, TRACE, OPTIONS
  • Significant performance optimizations and feature enhancements: chunked transfer, compression/decompression, content negotiation, virtual hosting (a host with a single IP address serving multiple domain names), faster responses, and bandwidth savings through improved caching

In the mid-1990s, many popular Web clients and servers were rapidly adding features to HTTP to meet the needs of the rapidly expanding and commercially successful World Wide Web. Many of these features, including persistent keep-alive connections, virtual hosting support, and proxy connection support, have been added to HTTP and become unofficial de facto standards. This informal HTTP extension is often referred to as HTTP/1.0+.

HTTP/1.1 focuses on correcting structural flaws in HTTP design, clarifying semantics, introducing important performance optimizations, and removing some bad features. HTTP/1.1 also includes support for more complex Web applications and deployment styles that were developing in the late 1990s.

HTTP/1.1 protocol deficiencies

  • Only one connection can correspond to one request at a time
  • Most browsers allow up to six concurrent connections for the same domain name
  • Only clients are allowed to initiate requests
  • There can only be one response to one request
  • Headers are transmitted repeatedly in multiple requests for the same session
  • This typically adds 500 to 800 bytes of overhead per transfer
  • If cookies are used, the added overhead can sometimes reach thousands of bytes

SPDY

If the browser is Google's (Chrome), instead of retrieving the page with plain HTTP it will first ask the server whether it can use SPDY.

SPDY (pronounced “SPeeDY”) is a TCP-based session layer protocol developed by Google to minimize network latency, speed up the network, and optimize the user experience.

SPDY is not intended as an alternative to HTTP, but rather an enhancement of the HTTP protocol. Features of the new protocol include data stream multiplexing, request prioritization, and HTTP header compression.

SPDY is a precursor to HTTP/2. In September 2015, Google announced that it was removing support for SPDY and embracing HTTP/2.

HTTP 2

HTTP/2.0 aims at asynchronous connection multiplexing, header compression, and request/response pipelining, while maintaining backward compatibility with HTTP/1.1 semantics. One bottleneck of HTTP implementations is that concurrency depends on opening multiple connections. HTTP pipelining mitigated this problem, but achieved only partial multiplexing; in practice, browsers could not enable pipelining because of interference from intermediaries.

HTTP/2.0 is used only for https:// URLs on the open Internet, while http:// URLs continue to use HTTP/1.1. The goal is to increase the use of encryption on the open Internet and provide strong protection against active attacks.

As of June 2019, 36.5 percent of websites around the world support HTTP/2, according to W3Techs.

The following two websites can compare HTTP/1.1 and HTTP/2 speeds

  • www.http2demo.io/
  • http2.akamai.com/demo

There is a lot more to learn about HTTP/2; here are some more details.

HTTP 3

  • QUIC “Quick UDP Internet Connections”
  • Reduce RTT and increase data interaction speed through high link utilization efficiency
  • On the basis of high efficiency, ensure safety requirements
  • Solve the adaptation problem in the actual network environment

The main improvements in HTTP 3 are at the transport layer. No more heavy TCP connections at the transport layer. Now, everything goes through UDP.

The history of Internet communication is, in a sense, the history of mankind's struggle with RTT. RTT is short for Round Trip Time: colloquially, the time a message takes to travel there and back.

Establishing a TCP connection takes 1.5 RTT.

Each HTTP request/response exchange costs 1 RTT.

So an HTTP interaction over a fresh TCP connection costs 2.5 RTT in total.

A TLS-based HTTPS request adds four more message flights during the handshake phase, so the connection takes 4.5 RTT.

Before persistent connections existed, requesting a page that contains an image required two TCP connections: 9 RTT.

Once TCP connections can be reused, the time for additional requests is reduced by 4.5 RTT.

The QUIC protocol developed by Google integrates TCP's reliable transmission mechanisms, TLS's encryption, and HTTP/2's stream multiplexing; its page load time is 2.5 RTT and the wait for reconnection is 1 RTT.

HTTP/3 is built on QUIC, with an overall page load time of 2 RTT.
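The totals above are simple addition; making the arithmetic explicit (the RTT constants come straight from the figures in the text):

```python
tcp_handshake = 1.5   # RTT to establish a TCP connection
http_exchange = 1.0   # RTT for one HTTP request/response
tls_handshake = 2.0   # extra RTT the text attributes to the TLS handshake

http_total = tcp_handshake + http_exchange                   # plain HTTP: 2.5 RTT
https_total = tcp_handshake + tls_handshake + http_exchange  # HTTPS: 4.5 RTT
page_with_image = 2 * https_total                            # two connections: 9 RTT

print(http_total, https_total, page_with_image)  # 2.5 4.5 9.0
```

Reusing a connection skips the handshake terms for every request after the first, which is the whole argument for persistent connections later in this article.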

The HTTP protocol

Hypertext Transfer Protocol (HTTP) :

HTTP is an application-layer protocol, a simple request-response protocol that usually runs on top of TCP. It specifies what messages the client might send to the server and what responses it might get. HTTP was originally designed to provide a way to publish and receive HTML pages, with URIs identifying specific resources, and then using HTTP to transmit data in a wide range of formats, not just HTML.

Characteristics

  • Simple and fast: when a client requests a service from a server, it only needs to send the request method and path. Common request methods are GET, HEAD, and POST, each specifying a different kind of interaction between client and server. Because the HTTP protocol is simple, HTTP server programs can be small, and communication is fast.
  • Flexible: HTTP allows the transfer of any type of data object. The type being transferred is marked by Content-Type, the header field HTTP uses to indicate the content type.
  • Connectionless: connectionless means limiting each connection to processing a single request. The server disconnects from the client after processing the request and receiving the client's reply, which saves transmission time.
  • Stateless: HTTP is a stateless protocol, meaning it has no memory of previous transactions. If earlier information is needed for later processing, it must be retransmitted, which can increase the amount of data transferred per connection; on the other hand, the server responds faster when it does not need earlier information.

HTTP Packet Format

The structure of HTTP request packets and response packets is basically the same and consists of three parts:

  • Start line: describes basic information about the request or response
  • Header fields: key-value pairs that describe the message in more detail
  • Message body: the actual data being transferred, which is not necessarily plain text; it can be binary data such as images or video

Format of the request line message

METHOD URI VERSION

Request method + space + request target + space + version number + newline

Response message format

VERSION STATUSCODE REASON

Version number + space + status code + space + status description + newline
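The two start-line formats above can be parsed with a few lines, splitting on single spaces as described. This is purely illustrative, not a full implementation:

```python
def parse_start_line(line):
    """Split an HTTP start line into its three space-separated parts.
    Returns ('request', method, target, version) for a request line or
            ('response', version, status, reason) for a status line."""
    first, second, rest = line.split(" ", 2)
    if first.startswith("HTTP/"):          # status lines begin with the version
        return ("response", first, int(second), rest)
    return ("request", first, second, rest)

print(parse_start_line("GET /index.html HTTP/1.1"))
print(parse_start_line("HTTP/1.1 200 OK"))
```

Note that `split(" ", 2)` keeps any spaces inside the reason phrase ("Not Found") intact, which is why the status description is allowed to contain spaces while the other fields are not.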

The HTTP header fields

Header fields take the key-value form, with the key and value separated by a colon (:) and each field terminated by a CRLF newline. For example:

Cache-control: max-age=600

Here the key is Cache-control and the value is max-age=600.

HTTP header fields are very flexible: besides standard headers such as Host and Connection, you can add arbitrary custom headers, which gives the HTTP protocol endless room for extension.

Precautions for header fields

  • Field names are case-insensitive and must not contain spaces. They may contain hyphens (-) but not underscores (_). A field name must be immediately followed by a colon (:), with no space before it; the value after the colon may be preceded by multiple spaces.
  • The order of fields carries no meaning; they can be arranged arbitrarily without affecting semantics.
  • In principle fields must not repeat, unless the field's own semantics allow it, as with Set-Cookie.
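The rules above (colon separator, case-insensitive names, trimmed values, repeats only for fields like Set-Cookie) can be condensed into a tiny header parser. This is a simplification for illustration, not an RFC-complete implementation:

```python
def parse_headers(raw):
    """Parse 'Name: value' lines into a dict keyed by lowercased names.
    Repeated fields (e.g. Set-Cookie) are collected into a list."""
    headers = {}
    for line in raw.split("\r\n"):
        if not line:
            continue
        name, _, value = line.partition(":")
        name = name.lower()      # field names are case-insensitive
        value = value.strip()    # spaces after the colon are allowed
        if name in headers:      # a repeat: keep every occurrence
            prev = headers[name]
            headers[name] = (prev if isinstance(prev, list) else [prev]) + [value]
        else:
            headers[name] = value
    return headers

h = parse_headers("Cache-Control: max-age=600\r\nSet-Cookie: a=1\r\nSet-Cookie: b=2")
print(h["cache-control"])  # 'max-age=600'
print(h["set-cookie"])     # ['a=1', 'b=2']
```

Lowercasing the name on the way in is what makes `Cache-Control` and `cache-control` land in the same slot, matching the case-insensitivity rule above.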

Common header field

There are many header fields in the HTTP protocol, but they can be divided into four general categories:

  • Request fields: header fields that appear in requests, e.g.: Host, Referer.
  • Response fields: header fields that appear in responses, e.g.: Server, Date.
  • General fields: can appear in both requests and responses, e.g.: Content-Type, Connection.
  • Entity fields: describe the message body, e.g.: Content-Length, Content-Encoding.

In summary, any string that meets the above conditions can be sent over a TCP connection as a request or response message. Browsers and HTTP servers are technologies that implement and apply the HTTP protocol. For example, the browser caching mechanism discussed earlier is really the HTTP caching mechanism: the browser implements the cache-related header fields defined by HTTP, so the two are the same concept.

Testing it out

On Linux, the nc (netcat) command can be used to open a TCP connection to a server.

Enter the following command:

nc www.baidu.com 80

Enter the following request line and press Enter twice to see whether you receive the response for Baidu's home page:

GET / HTTP/1.1

Why hit Enter twice?

There should be a blank line between the request header and the message body, even if there is no message body to send.
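The bytes nc sends can be assembled by hand, which makes the blank-line rule visible. No network access is involved here, and the Host header is an addition of this sketch (HTTP/1.1 normally requires one):

```python
# Build the same request nc sends. The empty strings before the final join
# produce the blank line (an extra CRLF) that terminates the header section.
request = "\r\n".join([
    "GET / HTTP/1.1",
    "Host: www.baidu.com",
    "",   # blank line: end of headers, even when there is no body
    "",
])
print(repr(request))
# The request ends with '\r\n\r\n': the two Enter presses in nc.
```

If you wanted to actually send it, `socket.create_connection(("www.baidu.com", 80))` followed by `sendall(request.encode())` would play the role of nc.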

HTTPS

HyperText Transfer Protocol Secure (HTTPS)

  • These are commonly called HTTP over TLS, HTTP over SSL, or HTTP Secure
  • First proposed by Netscape in 1994
  • The default port number for HTTPS is 443 (HTTP is 80)

It is important to note that HTTPS does not change HTTP itself much; rather, a TLS handshake is added after the TCP three-way handshake.

SSL / TLS

HTTPS encrypts packets using SSL and TLS on the basis of HTTP, providing reasonable protection against eavesdropping and man-in-the-middle attacks.

SSL/TLS can also be used for other protocols, such as:

  • FTP → FTPS
  • SMTP → SMTPS

Transport Layer Security (TLS) protocol

  • Formerly known as the Secure Sockets Layer (SSL) protocol

Historical Version Information

  • SSL 1.0: never released publicly due to serious security vulnerabilities
  • SSL 2.0: 1995; deprecated in 2011
  • SSL 3.0: 1996; deprecated in 2015
  • TLS 1.0: 1999
  • TLS 1.1: 2006
  • TLS 1.2: 2008
  • TLS 1.3: 2018

At what layer does SSL/TLS work? Between the application layer and the transport layer: the application hands its data to TLS, which encrypts it before passing it down to TCP.

OpenSSL

OpenSSL is an open-source implementation of the SSL/TLS protocols. The project started in 1998 and supports Windows, Mac, and Linux.

  • OpenSSL comes with Linux and Mac
  • Windows requires a download and installation

Common commands

  • Generate a private key: openssl genrsa -out mj.key
  • Generate the public key: openssl rsa -in mj.key -pubout -out mj.pem

You can use OpenSSL to build your own CA and issue your own certificate, which is called “self-signed certificate”.

The cost of HTTPS

  • Cost of certificate

  • Encryption and decryption calculation

  • The access speed is reduced

Some companies use HTTPS for requests that contain sensitive data, and keep HTTP for others.

HTTPS communication process

The basic idea of the SSL/TLS protocol is public-key encryption: the client first requests the public key from the server and uses it to encrypt its information; after receiving the ciphertext, the server decrypts it with its private key.

But there are two problems.

(1) How to ensure that the public key is not tampered with?

Solution: Put the public key in the digital certificate. The public key is trusted as long as the certificate is trusted.

(2) Public-key encryption is computationally expensive; how can the time it consumes be reduced?

Solution: for each session, the client and server generate a "session key" and use it to encrypt the information. Because the session key uses symmetric encryption, it is very fast, while the server's public key is used only to encrypt the session key itself, greatly reducing the time spent on encryption.

Therefore, the basic process of the SSL/TLS protocol can be divided into three stages:

  • TCP three-way handshake

  • The TLS connection

    1. The client requests and verifies the server's public key
    2. The two parties negotiate to generate a "session key"
  • The two parties use the "session key" for encrypted HTTP communication

Details of the handshake phase

The handshake phase involves four communications, so let’s look at each one. Note that all communication in the “handshake phase” is in clear text.

Client sends a request (ClientHello)

First, the client (usually a browser) makes a request to the server to encrypt the communication. This is called a ClientHello request.

In this step, the client mainly provides the following information to the server.

(1) Supported protocol version, such as TLS version 1.0.

(2) A client-generated random number, which is later used to generate the "session key".

(3) Supported encryption methods, such as RSA public key encryption.

(4) Supported compression methods.

Note that the message the client sends does not include the server's domain name. In theory, therefore, a server can serve only one website; otherwise it could not tell which site's digital certificate to present to the client. This is why a server usually hosts only one digital certificate.

For virtual-host users this is of course very inconvenient. In 2006, the TLS protocol added a Server Name Indication (SNI) extension, allowing the client to tell the server which domain name it is requesting.

Server response (ServerHello)

When the server receives the client's request, it sends back a response called a ServerHello. The server's response contains the following.

(1) Confirmation of the encrypted-communication protocol version, such as TLS 1.0. If the versions the browser supports do not match the server's, the server turns off encrypted communication.

(2) A server-generated random number, which is later used to generate the "session key".

(3) Confirm the encryption method used, such as RSA public key encryption.

(4) Server certificate.

In addition to the above information, if the server needs to verify the identity of the client, it will include a request for a “client certificate.” For example, financial institutions tend to allow only authenticated customers to connect to their networks, and will provide a USB key containing a client certificate to a regular customer.

Client Response

After receiving a response from the server, the client first validates the server certificate. If the certificate is not issued by a trusted authority, or the domain name in the certificate is inconsistent with the actual domain name, or the certificate has expired, a warning is displayed to the visitor and he or she can choose whether to continue the communication.

If there is nothing wrong with the certificate, the client retrieves the server's public key from it and then sends the following three pieces of information to the server.

(1) A random number, encrypted with the server's public key to prevent eavesdropping.

(2) Notification of encoding change, indicating that subsequent information will be sent using encryption methods and keys agreed by both parties.

(3) A notification that the client handshake is finished. This item also carries the hash of all the content sent so far, for the server to verify.

The random number in item (1) is the third random number to appear in the handshake phase; it is also called the "pre-master key". With it, the client and the server both hold the same three random numbers, and each side independently generates the same "session key" for the session using a previously agreed encryption method.

Dog250 explains why three random numbers are needed to generate the session key:

"Both the client and the server need random numbers so that the generated key is different every time. Since SSL certificates are static, a randomness factor must be introduced to ensure the randomness of the negotiated key.

For the RSA key-exchange algorithm, the pre-master key is itself a random number; together with the random numbers from the Hello messages, the three random numbers are fed through a key exporter to finally produce a symmetric key.

The pre-master secret exists because the SSL protocol does not trust every host to generate a truly random number. If a host's random number is weak, the pre-master secret could be guessed, so using the pre-master secret alone as the key would be inappropriate. A new random factor must therefore be introduced: a key generated jointly by the client, the server, and the pre-master secret is far harder to guess. One pseudo-random value may not be random at all, but three pseudo-random values together come very close to random, and each added degree of freedom increases the randomness by more than one degree."
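As an intuition aid, the "three random numbers in, one session key out" idea can be sketched with HMAC. This is a deliberately simplified stand-in for illustration, not the real TLS PRF (which uses a specific labeled construction, e.g. P_SHA256 in TLS 1.2):

```python
import hashlib
import hmac
import os

def derive_session_key(client_random, server_random, pre_master):
    """Toy key derivation: mix all three random inputs with HMAC-SHA256.
    Real TLS runs them through its own PRF with fixed labels instead."""
    return hmac.new(pre_master, client_random + server_random,
                    hashlib.sha256).digest()

client_random = os.urandom(32)   # sent in the clear in ClientHello
server_random = os.urandom(32)   # sent in the clear in ServerHello
pre_master    = os.urandom(48)   # sent encrypted under the server's public key

# Both sides hold the same three values, so both derive the same key,
# even though they never send the session key itself over the wire.
k_client = derive_session_key(client_random, server_random, pre_master)
k_server = derive_session_key(client_random, server_random, pre_master)
assert k_client == k_server
```

Because the two Hello randoms travel in plain text, the secrecy of the derived key rests entirely on the pre-master key, which only the holder of the server's private key can recover.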

In addition, if the server requested a client certificate in the previous step, the client sends the certificate and related information in this step.

Final response from the server

After receiving the pre-master key (the third random number) from the client, the server computes the session key for this session. It then sends the client the following final messages.

(1) Notification of encoding change, indicating that subsequent information will be sent using the encryption method and key agreed by both parties.

(2) Notification indicating that the handshake phase of the server is over. This item is also the hash value of all the previously sent content for the client to verify.

At this point the whole handshake phase is over. The client and server then enter encrypted communication, using the ordinary HTTP protocol but encrypting the content with the "session key".

Persistent and non-persistent connections

A TCP connection can send HTTP requests normally once the three-way handshake completes, but let's consider another question:

Should a TCP connection be established for each HTTP request sent?

The answer, of course, is no.

HTTP has two states: persistent connection and non-persistent connection.

  • Non-persistent connections: in HTTP/1.0 the Connection header field defaults to close, so a TCP connection is established and torn down for every request.
  • Persistent connections: in HTTP/1.1 the Connection header field defaults to keep-alive, so the connection can be reused; as long as neither end explicitly disconnects, the TCP connection stays open.

In HTTP/1.1 all connections are persistent by default. HTTP/1.0 did not standardize this; some servers implemented persistent connections through non-standard means, but support was not guaranteed. The advantage of persistent connections is that they reduce the overhead of repeatedly establishing and tearing down TCP connections and lighten the load on the server; the time saved lets HTTP requests and responses finish earlier, so web pages display faster.

Note that it is not TCP that decides whether the connection persists; the upper-layer HTTP protocol controls when TCP disconnects.
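Those version-dependent defaults can be written down as a small decision function. This is a simplification of the HTTP/1.0 and HTTP/1.1 rules described above, not a complete implementation:

```python
def connection_persists(version, connection_header=None):
    """Decide whether the TCP connection stays open after a response.
    HTTP/1.1 defaults to persistent unless 'close' is sent;
    HTTP/1.0 defaults to closing unless 'keep-alive' is sent."""
    token = (connection_header or "").lower()
    if version == "HTTP/1.1":
        return token != "close"
    if version == "HTTP/1.0":
        return token == "keep-alive"
    return False  # unknown version: assume non-persistent

print(connection_persists("HTTP/1.1"))                # True  (default keep-alive)
print(connection_persists("HTTP/1.1", "close"))       # False
print(connection_persists("HTTP/1.0"))                # False (default close)
print(connection_persists("HTTP/1.0", "keep-alive"))  # True
```

The function mirrors the point just made: persistence is negotiated at the HTTP layer, and TCP simply obeys whichever side decides to close.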

Concurrent requests

Now that TCP persistence is implemented, let’s think about another question:

Can multiple HTTP requests on a TCP connection be sent in parallel?

In HTTP/1.1, a single TCP connection can process only one request at a time: the lifetimes of two HTTP requests on the same connection cannot overlap, and the next request cannot be sent until the previous one has received its response.

But there are still ways to implement concurrent requests:

  • Pipelining allows multiple HTTP requests to be sent at once, without waiting for the previous response, but browsers disable it by default for the following reasons:

    • Some proxy servers do not handle HTTP pipelining correctly
    • Head-of-line blocking: after a TCP connection is established, suppose the client sends several requests over it in succession. By the standard, the server must return results in the order the requests were received. If the first request takes a long time to process, every later response has to wait behind it, causing a block.
  • Multiplexing

    Because pipelining in HTTP/1.1 was never really usable, the multiplexing feature was introduced in HTTP/2.0:

    • In HTTP/2.0 there are two very important concepts: frames and streams. A frame is the smallest unit of data, and every frame identifies which stream it belongs to; a stream is a flow of data made up of multiple frames.
    • Multiplexing means that multiple streams can exist in one TCP connection. In other words, multiple requests can be in flight at once, and the peer can tell which request a frame belongs to from the identifier it carries. This technique avoids the head-of-line blocking of older HTTP versions and greatly improves transmission performance.

    • Multiple requests are sent in parallel, interleaved, without affecting one another
    • Multiple responses are sent in parallel, interleaved, without interfering with one another
    • A single connection carries multiple requests and responses in parallel
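A completion-time thought experiment makes the difference concrete. The processing times below are invented, and "multiplexed" is idealized as each response finishing on its own schedule:

```python
# Three responses on one connection; the first is slow. Times are arbitrary units.
processing = [10, 1, 1]

# Pipelining: responses must come back in request order, so everything
# queues behind the slow first response (head-of-line blocking).
pipelined_finish = []
t = 0
for p in processing:
    t += p
    pipelined_finish.append(t)      # finishes at 10, 11, 12

# Multiplexing: responses are interleaved as frames on independent streams,
# so (idealized) each finishes when its own processing is done.
multiplexed_finish = sorted(processing)   # fast ones finish first: 1, 1, 10

print(pipelined_finish, multiplexed_finish)
```

With pipelining the two cheap responses pay a 10-unit tax for being behind the slow one; with multiplexing they do not, which is exactly the head-of-line blocking problem HTTP/2 set out to fix (at the HTTP layer; TCP-level blocking remains, which is part of QUIC's motivation).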

Relationship between HTTP and TCP

Here we can see that HTTP is a top-level, application-layer protocol: it specifies what conforming software must do, and it does not need to care how data is transmitted across the network, how reliability is guaranteed, or how congestion is handled, because the lower-level TCP solves all of those problems. As mentioned earlier, TCP maintains send and receive queues, so HTTP only has to read from and write to those queues to send or receive the data it wants. The connection established by the TCP three-way handshake is not a physical connection but a virtual one; its essence is the set of resources (such as memory and process state) that the client and server allocate for it. After a Socket call completes the three-way handshake, the prepared HTTP request message is placed in the send queue and handed to TCP for the rest of the journey.

Across the whole layered network structure, then, HTTP cooperates with TCP, and TCP with IP: each lower layer provides a service to the layer above, each upper layer hands well-packaged data down, and each layer logically communicates only with its counterpart at the same level on the other side. From the current layer's point of view, the data passed down from above is simply a payload; it is carried, not parsed.

Conclusion

Now look back at the summary diagram at the beginning of this article and see whether you can follow the whole process of network communication and what the browser actually does during it. The protocols below TCP are implemented at the operating-system level; what we can actually operate on is TCP and the protocols above it, which is why front-end network optimization starts at the socket. Whether through caching or persistent connections, the goal of optimization is one thing: reducing network latency. Figuring this out gives us a more complete picture of the underlying layers, and therefore a deeper understanding of HTTP and of where it is heading.

Of course, this covers only the networking part of a request. The heart of the browser remains the V8 engine's parsing and the page rendering process; we will return to the browser itself for an in-depth look at that part. See you in the next article...

Questions

If you have any questions, feel free to ask them; they will be recorded and answered here.

If the strong cache is not hit, is the negotiation cache or DNS request processed next?

The negotiation cache is handled next.

If the DNS request hits the browser cache or computer cache, is the DNS request still complete?

No. In that case the DNS request does not need to be constructed at all.

Why is the data "sliced" into multiple segments at the transport layer, rather than leaving fragmentation to the network layer on its way down to the data link layer?

Because it improves retransmission performance. To be clear: reliable transmission is controlled at the transport layer. Without transport-layer segmentation, any data loss would force the entire transport-layer message to be retransmitted; with segmentation, only the lost segments need to be resent.
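The retransmission argument is easy to quantify. The message and segment sizes below are hypothetical (1460 bytes is a typical TCP MSS on Ethernet):

```python
message_bytes = 100_000
segment_bytes = 1_460    # a typical TCP maximum segment size on Ethernet

# Ceiling division: how many segments the message is sliced into.
segments = -(-message_bytes // segment_bytes)

# One lost piece without segmentation: resend the whole message.
cost_unsegmented = message_bytes
# One lost piece with segmentation: resend just the missing segment.
cost_segmented = segment_bytes

print(segments, cost_unsegmented, cost_segmented)  # 69 100000 1460
```

A single loss costs 1460 bytes instead of 100,000, and the gap only widens as messages grow, which is why retransmission control lives at the layer that also does the slicing.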

Reference articles

what-happens-when-zh_CN

What happens when you type google.com into your browser and press Enter?

A 16,000-word set of soul-searching browser questions: how many can you answer?

The difference between URIs, URLs, and URNs

Thoroughly understand the browser caching mechanism (HTTP caching mechanism)

Browser caching – Understand strong and negotiated caching thoroughly

Introduction to DNS Principles

Hyperdetailed DNS protocol resolution

DNS (Domain name Resolution Protocol) details

The soul of TCP, solidify your network infrastructure

Interview: Should a TCP connection be established for each HTTP request (non-persistent/persistent)

Introduction to TCP in high concurrency architecture

TCP continuous ARQ protocol and sliding window protocol

Introduction to TCP

NAT is a secret for private IP addresses to access public IP addresses

How does NAT find hosts in a LAN

Understanding ARP

Computer network learning: packet forwarding and routing, ARP protocol

Study notes on network protocol from beginning to bottom principles

Overview of the SSL/TLS protocol operation mechanism

Reference books

HTTP: The Definitive Guide

Reference videos

Wang Dao Computer postgraduate entrance examination computer network

TCP/IP network communication Socket programming introduction

IPv4 address and subnet mask

2021 Network protocol entry to mastery