This is the “HTTP for Front-End Engineers” series. A senior engineer once said that front-end interviews at large companies demand more of HTTP than of CSS, which shows that the importance of HTTP should not be underestimated. This is the first article in the series: “From TCP/UDP to DNS Resolution”.
Read more about my interview series.
Writing plan
- From TCP/UDP to DNS resolution
- The HTTP protocol
- Web servers (Nginx/Caddy)
- HTTPS (symmetric/asymmetric encryption, SSL)
- JWT (I use it in the back end of my own blog, so I’m writing an article about it)
- SPDY / HTTP/2 / WebSockets
- Network attacks
- Cross-origin requests
- Caching mechanisms
- How browsers work
- Final chapter: what happens from entering the URL to rendering the page
The history of HTTP
HTTP/0.9 dates from 1990. HTTP had not yet been established as a formal standard at the time, so the name simply denotes the version that preceded HTTP/1.0.
HTTP/1.0 was published as the first standard in May 1996. It is documented in RFC 1945 (Hypertext Transfer Protocol -- HTTP/1.0).
HTTP/1.1 was published in June 1999 and is documented in RFC 2616 (Hypertext Transfer Protocol -- HTTP/1.1). At the time of writing it is still the most mainstream version of the protocol.
HTTP/2 was officially released in May 2015 and is documented in RFC 7540 (Hypertext Transfer Protocol Version 2). Its main features are ① binary framing instead of plaintext messages, ② multiplexing, ③ relief of head-of-line blocking at the HTTP layer, ④ request prioritization, and ⑤ server push. WebSocket is sometimes listed with these, but it is a separate protocol (RFC 6455), not an HTTP/2 feature.
According to W3Techs, as of 2019/04/22, HTTP/2’s global share was 36%. My personal blog has supported HTTP/2 since its launch.
TCP/IP traffic
The five-layer TCP/IP model
Before explaining how data flows in TCP/IP communication, let’s review the five layers of the TCP/IP model.
Application layer: Determines the communication activities involved in providing application services to users. The TCP/IP protocol family includes many common application services, such as FTP, DNS, and HTTP.
Transport layer: Serves the application layer above it by providing data transfer between two computers on a network connection. There are two different protocols at this layer: Transmission Control Protocol (TCP) and User Datagram Protocol (UDP).
Network layer: Handles the packets that flow over the network. A packet is the smallest unit of data transmitted over a network. This layer decides the path by which packets reach the other computer. When the two ends communicate through multiple computers or network devices, the role of the network layer is to select one transmission route among the many options.
Data link layer: Builds on the bitstream service of the physical layer to establish a data link between adjacent nodes, uses error control to provide error-free transmission of data frames over the channel, and coordinates the sequence of operations on each link. The unit of data at this layer is the frame.
Physical layer: Sits directly on the physical communication medium. As the interface between the system and the medium, it provides transparent bitstream transmission between data link entities. Only this layer performs real physical communication; the communication at every other layer is virtual.
TCP/IP data transfer flow
The client makes an HTTP request at the application layer (HTTP protocol).
To facilitate transmission, the transport layer (TCP) splits the data received from the application layer (the HTTP request message) into segments, marks each one with a sequence number and port numbers, and forwards them to the network layer.
The network layer (IP) adds the source and destination IP addresses and forwards the packet to the data link layer, which adds the MAC address of the next hop as the link-level destination. At this point, the request is ready to be sent to the server.
When the server receives the data at the link layer, it passes the data up through each layer in turn until it reaches the application layer. Only when the data reaches the application layer has the server actually received the client’s HTTP request.
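The flow above can be sketched as a toy model in which each layer wraps the payload handed down from the layer above, and the receiver unwraps in the reverse order. This is only an illustration: the bracketed “headers”, the addresses, and the function names `encapsulate` and `decapsulate` are made up for the example and are not real wire formats.

```python
# Toy model of encapsulation. The bracketed headers are placeholders,
# not the real TCP/IP/Ethernet header layouts.

def encapsulate(http_request: bytes) -> bytes:
    """Wrap an application-layer message as it travels down the stack."""
    segment = b"[TCP src=50000 dst=80 seq=1]" + http_request   # transport layer
    packet = b"[IP src=192.0.2.1 dst=192.0.2.2]" + segment     # network layer
    frame = b"[ETH dst=aa:bb:cc:dd:ee:ff]" + packet            # data link layer
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Strip the three toy headers again, one layer at a time."""
    payload = frame
    for _ in range(3):  # link, network, transport
        payload = payload[payload.index(b"]") + 1:]
    return payload


request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
frame = encapsulate(request)
print(frame[:27])                     # the outermost (link-layer) header comes first
print(decapsulate(frame) == request)  # True
```

The point of the sketch is the ordering: the header added last on the sending side is the first one the receiving side sees and removes.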
What is a MAC address?
A Media Access Control (MAC) address, also known as a LAN address, Ethernet address, or physical address, identifies a network interface on a local network. The Address Resolution Protocol (ARP) is used to resolve addresses: given an IP address, it finds the corresponding MAC address.
The following figure shows the intranet IP and MAC addresses of a PC. In a terminal on macOS, run ifconfig and look for en0 to see the local Ethernet information.
So what is a MAC address? We know that IP addresses are mutable and can be assigned to a device in various ways, such as DHCP, PPP, or static configuration. A MAC address normally does not change: each network interface is given a unique identifier when it is manufactured, and that identifier is its MAC address.
Here’s an example: if you order takeout at work at noon, the delivery address is your company’s address; when you get home at night and order takeout, you write your home address (the IP address is dynamic). But wherever you order food, the name and phone number on the order are always your own (the MAC address).
The delivery guy brings lunch to the company’s front door, but you’re not the only one there (multiple devices share the same broadcast network), so he finds you by your name and phone number.
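A MAC address is a 48-bit value that is usually written as six hexadecimal octets. As a small sketch, the helper below (the name `format_mac` is just for this example) renders such a value in that notation. The standard-library call `uuid.getnode()` returns the hardware address of one interface as a 48-bit integer, but it may fall back to a random number when no address can be read, so treat its output as illustrative only.

```python
import uuid


def format_mac(value: int) -> str:
    """Render a 48-bit integer as six colon-separated hex octets."""
    raw = value.to_bytes(6, "big")
    return ":".join(f"{octet:02x}" for octet in raw)


print(format_mac(0x001A2B3C4D5E))  # 00:1a:2b:3c:4d:5e
print(format_mac(uuid.getnode()))  # this machine's address (or a random fallback)
```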
UDP protocol
The User Datagram Protocol (UDP) is a simple, datagram-oriented transport protocol. In the TCP/IP model, UDP provides a simple interface above the network layer and below the application layer. UDP offers only unreliable delivery: once it has handed the application’s data to the network layer, it keeps no copy of it. On top of the IP datagram, UDP adds only multiplexing (port numbers) and a checksum.
Its characteristics are as follows:
- UDP lacks reliability. It provides no acknowledgement numbers, sequence numbers, or timeout-and-retransmit mechanism. UDP datagrams may be duplicated or reordered on the network. That is, UDP does not guarantee that a datagram reaches its destination, that datagrams arrive in order, or that each datagram arrives only once.
- The UDP header has low overhead and contains the following fields:
  - Two 16-bit port numbers: source port and destination port.
  - The length of the entire datagram.
  - The checksum of the entire datagram, used to detect errors in the header and data.
- UDP is connectionless. The UDP client and server do not need any long-lived relationship beforehand, and no handshake is required to create a connection before datagrams are sent.
- UDP supports not only unicast, but also multicast and broadcast.
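To make the header fields concrete, here is a minimal sketch that packs and unpacks the 8-byte UDP header with Python’s `struct` module. The function names and the port numbers are chosen for the example; the checksum is left at 0, which in UDP over IPv4 means “not computed”.

```python
import struct

# Four 16-bit fields in network byte order:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_header(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Return the 8-byte header for the given payload (checksum 0 = not computed)."""
    length = UDP_HEADER.size + len(payload)  # the length field covers header + data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum)


def parse_udp_header(datagram: bytes) -> dict:
    """Split a datagram into its header fields and payload."""
    src, dst, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {
        "src_port": src,
        "dst_port": dst,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }


payload = b"hello"
datagram = build_udp_header(5353, 53, payload) + payload
print(len(datagram) - len(payload))  # 8
print(parse_udp_header(datagram))
```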
UDP-based protocols include:
- Domain Name System (DNS)
- Simple Network Management Protocol (SNMP)
- Dynamic Host Configuration Protocol (DHCP)
- Routing Information Protocol (RIP)
- Bootstrap Protocol (BOOTP)
- Trivial File Transfer Protocol (TFTP)
TCP protocol
The Transmission Control Protocol (TCP), defined in IETF RFC 793, is a connection-oriented, reliable transport-layer protocol built on a byte-stream service. A byte-stream service splits a large block of data into segments for convenient transmission.
- TCP provides a connection-oriented, reliable byte-stream service.
- In a TCP connection, only two parties communicate with each other. Broadcast and multicast cannot be used with TCP.
- TCP uses checksums, acknowledgements, and retransmission to ensure reliable transmission.
- TCP numbers the data it sends and uses cumulative acknowledgement to ensure that data is delivered in order and without duplication.
- TCP uses a sliding window for flow control and dynamically adjusts the window size for congestion control.
TCP packet
Port numbers: The source and destination port numbers identify different application processes on the same computer. The source and destination ports in the TCP header, together with the source and destination IP addresses in the IP header, uniquely identify a TCP connection.
- Source port number: The source port and source IP address identify the return address of the segment.
- Destination port number: The destination port identifies the receiving application on the destination computer.
Sequence number: The sequence number of the first data byte carried in this segment. In a TCP stream, every byte has a sequence number. For example, if a segment has sequence number 300 and carries 100 bytes of data, the next segment has sequence number 400. Sequence numbers keep TCP transmission in order.
Acknowledgement number: The sequence number of the next byte the receiver expects, which also means that all data before that number has been received correctly. The acknowledgement number is valid only when the ACK flag is 1. For example, the first SYN segment sent when establishing a connection has ACK=0.
Header length: Because the header may contain options, the length of the TCP header is variable. Without any options, it is 20 bytes. The field is 4 bits wide and counts 32-bit words, so its maximum value is binary 1111, or 15 in decimal; 15 × 32 / 8 = 60, so the maximum header length is 60 bytes. The header length is also called the data offset, because it gives the offset at which the data starts within the segment.
Reserved: Reserved for future use; currently set to 0.
Control bit | Description |
---|---|
URG | Urgent pointer flag: 1 means the urgent pointer is valid; 0 means the urgent pointer is ignored. |
ACK (acknowledgement) | 1 means the acknowledgement number is valid. 0 means the segment carries no acknowledgement and the acknowledgement number field is ignored. |
PSH | Push flag: 1 means the receiver should deliver the segment to the application as soon as possible instead of queuing it in the buffer. |
RST | Reset flag: resets a connection that has failed because of a host crash or another error. It is also used to reject invalid segments and connection requests. |
SYN (synchronize) | Used to establish a connection and synchronize sequence numbers. A connection request has SYN=1 and ACK=0 (no piggybacked acknowledgement); the connection reply carries an acknowledgement, so SYN=1 and ACK=1. |
FIN (finish) | Finish flag, used to release the connection: 1 means the sender has no more data to send and is closing its side of the data stream. |
Window: The window size tells the sender how much buffer space the receiver has available, which controls the rate at which the sender transmits and so achieves flow control. The window size is a 16-bit field, so the maximum window is 65535 bytes (unless the window scale option is negotiated).
Checksum: A 16-bit ones’ complement checksum, not a parity bit. It is calculated in 16-bit words over the entire TCP segment, covering the TCP header and TCP data (plus a pseudo-header taken from the IP header). The sender computes and stores it, and the receiver verifies it.
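As a sketch of how this kind of checksum is computed, the function below implements the 16-bit ones’ complement sum described in RFC 1071 over an arbitrary byte string. It is a simplification: a real TCP implementation also feeds in a pseudo-header (source and destination IP addresses, protocol number, and TCP length), which is omitted here. The function name is chosen for the example.

```python
def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of all 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into the low 16 bits
    return ~total & 0xFFFF


sample = bytes.fromhex("0001f203f4f5f6f7")  # worked example from RFC 1071
checksum = internet_checksum(sample)
print(hex(checksum))  # 0x220d

# The receiver sums the data together with the checksum; a result of 0 means "no error detected".
print(internet_checksum(sample + checksum.to_bytes(2, "big")))  # 0
```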
Urgent pointer: Valid only when the URG flag is 1. The urgent pointer is a positive offset that is added to the value in the sequence number field to give the sequence number of the last byte of urgent data. TCP urgent mode is a way for the sender to send urgent data to the other end.
Options and padding: The most common option is the Maximum Segment Size (MSS). Each side usually specifies it in the first segment of the connection (the segment with SYN=1 that establishes the connection). It states the maximum segment length that the local end can accept. Options need not be a multiple of 32 bits long, so padding is added: extra zero bits that make the TCP header a multiple of 32 bits.
Data: The data part of a TCP segment is optional. When a connection is established or terminated, only TCP headers are exchanged. A header without data is also used to acknowledge received data when one side has nothing to send, and segments without data are sent in many timeout-handling cases as well.
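The fixed 20-byte part of the header can be decoded with `struct`. The following is a minimal sketch that ignores options and does not verify the checksum; the function name, the sample ports, and the sample sequence number are chosen for the example.

```python
import struct

# Fixed part of the TCP header (20 bytes, network byte order):
# src port, dst port, sequence number, acknowledgement number,
# data offset + reserved + flags, window, checksum, urgent pointer.
TCP_FIXED = struct.Struct("!HHIIHHHH")

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed header fields; options are skipped, not interpreted."""
    src, dst, seq, ack, offset_flags, window, checksum, urgent = TCP_FIXED.unpack_from(segment)
    data_offset = offset_flags >> 12   # header length in 32-bit words
    header_len = data_offset * 4       # header length in bytes (20..60)
    flags = [name for name, bit in FLAG_BITS.items() if offset_flags & bit]
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags, "window": window,
        "checksum": checksum, "urgent": urgent,
        "payload": segment[header_len:],
    }


# A hand-built SYN segment: data offset 5 (no options), only the SYN bit set.
syn = TCP_FIXED.pack(50000, 80, 300, 0, (5 << 12) | 0x02, 65535, 0, 0)
info = parse_tcp_header(syn)
print(info["header_len"], info["flags"])  # 20 ['SYN']
```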
Establishing a TCP connection (three-way handshake)
- Can I connect you?
- You can.
- Then I’m off.
Emmmmm, I’ve been single for so long that even a three-way handshake catches my eye.
The three-way handshake means that establishing a TCP connection takes three packets exchanged between client and server. Its purpose is to connect to the specified port on the server, establish the TCP connection, synchronize the sequence and acknowledgement numbers of both sides, and exchange TCP window size information. In socket programming, the three-way handshake is triggered when the client calls connect().
- First handshake
  The client sends the server a segment with SYN=1, naming the server port it wants to connect to and its initial sequence number x (seq=x). After sending it, the client enters the SYN_SENT state.
- Second handshake
  The server replies with a SYN/ACK segment: SYN=1, ACK=1. The server chooses its own initial sequence number (ISN) y and places it in seq (seq=y), and sets the acknowledgement number to the client’s ISN plus 1 (ack=x+1). After sending it, the server enters the SYN_RCVD state.
- Third handshake
  After receiving the SYN/ACK, the client sends a segment with ACK=1, ack=y+1, and seq=x+1. The client enters the ESTABLISHED state once it has sent this segment, and the server enters ESTABLISHED when it receives it. This completes the three-way handshake.
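The sequence and acknowledgement numbers in the three steps can be traced with a small simulation. This sketch only models the numbers (x is the client’s initial sequence number, y the server’s); it does not open any real connection, and the function name and dictionary layout are chosen for the example.

```python
SEQ_MODULUS = 2 ** 32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments exchanged for client ISN x and server ISN y."""
    syn = {"from": "client", "flags": "SYN", "seq": x, "ack": None}
    syn_ack = {"from": "server", "flags": "SYN,ACK", "seq": y,
               "ack": (syn["seq"] + 1) % SEQ_MODULUS}      # the SYN consumed one number
    ack = {"from": "client", "flags": "ACK",
           "seq": (x + 1) % SEQ_MODULUS,
           "ack": (syn_ack["seq"] + 1) % SEQ_MODULUS}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(x=100, y=500):
    print(segment)
# client SYN      seq=100
# server SYN,ACK  seq=500  ack=101
# client ACK      seq=101  ack=501
```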
Closing a TCP connection (four-way wave)
- Client: I’m going to sleep
- Server: Well, go to sleep, good night
- Server: I’m going to bed, too
- Client: Good night, sweet dreams
TAT, shitty.
- First wave
  The client calls close() and sends the server a segment with FIN=1 and seq=u to indicate that it has finished sending data. The client then enters the FIN_WAIT_1 state. Under TCP, a FIN segment consumes one sequence number even if it carries no data.
- Second wave
  After receiving the client’s FIN, the server replies with an ACK segment: ACK=1, ack=u+1, and its own sequence number seq=v. This tells the client that its close request has been received but that the server is not ready to close yet. The connection is now half-closed: the client has no more data to send, but the server may still send data. After sending the ACK, the server enters the CLOSE_WAIT state.
  After receiving the ACK, the client enters the FIN_WAIT_2 state and waits for the server’s FIN.
- Third wave
  When the server is ready to close, it sends the client a segment with FIN=1 and ack=u+1. Because the server may have sent more data while the connection was half-closed, assume its sequence number is now seq=w. After sending the FIN, the server enters the LAST_ACK state.
- Fourth wave
  After receiving the server’s FIN, the client sends an ACK segment with ACK=1, ack=w+1, and seq=u+1, and enters the TIME_WAIT state. While it waits, it may have to retransmit this ACK if the server’s FIN arrives again.
  The TCP connection is not yet released at this point: the client enters the CLOSED state only after waiting 2 × MSL (Maximum Segment Lifetime).
  The server enters the CLOSED state as soon as it receives the client’s ACK and, like the client, releases its TCB (transmission control block) to end the connection. The server therefore finishes closing the TCP connection earlier than the client.
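The state changes on both sides can be written down as two small transition tables. This is a sketch of the normal close path only, with the client closing first; the event labels and the function name `run` are invented for the example, and cases such as simultaneous close are left out.

```python
# (current state, event) -> next state, for the side that closes first.
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
}

# The same idea for the side that closes second.
SERVER_TRANSITIONS = {
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(transitions: dict, events: list, state: str = "ESTABLISHED") -> list:
    """Apply the events in order and return every state visited."""
    path = [state]
    for event in events:
        state = transitions[(state, event)]  # KeyError means an unexpected event
        path.append(state)
    return path


print(run(CLIENT_TRANSITIONS, ["send FIN", "recv ACK", "recv FIN, send ACK", "2MSL timeout"]))
# ['ESTABLISHED', 'FIN_WAIT_1', 'FIN_WAIT_2', 'TIME_WAIT', 'CLOSED']
print(run(SERVER_TRANSITIONS, ["recv FIN, send ACK", "send FIN", "recv ACK"]))
# ['ESTABLISHED', 'CLOSE_WAIT', 'LAST_ACK', 'CLOSED']
```

The client path has one more step than the server path (the 2MSL wait), which is why the server reaches CLOSED first.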
Comparison between TCP and UDP
Aspect | UDP | TCP |
---|---|---|
Connection | Connectionless | Connection-oriented |
Reliability | Unreliable transmission; no flow control or congestion control | Reliable transmission; uses flow control and congestion control |
Communication pattern | One-to-one, one-to-many, many-to-one, and many-to-many | One-to-one only |
Transmission unit | Message-oriented (datagrams) | Byte-stream oriented |
Header overhead | Small, only 8 bytes | Minimum 20 bytes, maximum 60 bytes |
Typical use | Real-time applications (IP telephony, video conferencing, live streaming, etc.) | Applications that need reliable transmission, such as file transfer |
DNS
DNS Message Format
The request packet and the reply packet returned by the DNS server are in the same format:
- Transaction ID (2 bytes): The ID of the DNS message. The request and its reply carry the same value, so it can be used to match a DNS reply to the request it answers.
- Flags (2 bytes): Eight fields, as shown below:

Field | Description |
---|---|
QR (1 bit) | Query/response flag. 0 indicates a query and 1 indicates a response. |
opcode (4 bits) | 0 indicates a standard query, 1 an inverse query, 2 a server status request; 3-15 are reserved. |
AA (1 bit) | Authoritative answer. Meaningful only in responses: set when the responding server is an authoritative server for the queried domain name. |
TC (1 bit) | Truncated: the message was longer than the permitted length and has been cut off. |
RD (1 bit) | Recursion desired. Set in the request and copied into the reply. If set, it asks the DNS server to resolve the name recursively. Support for recursive queries is optional. |
RA (1 bit) | Recursion available. Set or cleared in the reply to indicate whether the server supports recursive queries. |
Z (3 bits) | Reserved. |
RCODE (4 bits) | Response code: 0 indicates no error, 2 a server failure, and 3 a name error. |

- Question section
  - QNAME: The name being queried, encoded as a sequence of labels. Each label is one unsigned 8-bit length byte followed by that many bytes, and a zero byte ends the name, so the field has variable length.
  - QTYPE: An unsigned 16-bit integer giving the record type being queried.
  - QCLASS: An unsigned 16-bit integer giving the class being queried.
- Answer / Authority / Additional sections
  All three have the same format, as follows:
  - NAME: The name of the resource record.
  - TYPE: The record type.
  - CLASS: The class of RDATA.
  - TTL: How long the resource record may be cached. 0 means it may only be used for the current transaction and must not be cached.
  - RDLENGTH: The length of RDATA.
  - RDATA: A variable-length string that holds the record itself. Its format depends on TYPE and CLASS. For example, if TYPE is A and CLASS is IN, RDATA is a 4-byte IPv4 address.
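The message layout above can be exercised with a short sketch that builds a query and decodes the flags field. Two things here go beyond the description above and come from RFC 1035: the header also carries four 16-bit counters (questions, answers, authority records, additional records), and the flag bit positions follow the order QR, opcode, AA, TC, RD, RA, Z, RCODE. The function names, the transaction ID 0x1234, and the domain are chosen for the example.

```python
import struct


def encode_qname(domain: str) -> bytes:
    """Encode 'example.com' as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in domain.rstrip(".").split("."):
        encoded = label.encode("ascii")
        out += bytes([len(encoded)]) + encoded
    return out + b"\x00"


def build_query(domain: str, txid: int = 0x1234, qtype: int = 1, qclass: int = 1) -> bytes:
    """Build a standard query (QTYPE 1 = A, QCLASS 1 = IN) with RD set."""
    flags = 0x0100  # QR=0, opcode=0, RD=1
    header = struct.pack("!HHHHHH", txid, flags, 1, 0, 0, 0)  # one question, no records
    question = encode_qname(domain) + struct.pack("!HH", qtype, qclass)
    return header + question


def decode_flags(flags: int) -> dict:
    """Split the 16-bit flags field into its eight parts."""
    return {
        "QR": (flags >> 15) & 0x1,
        "opcode": (flags >> 11) & 0xF,
        "AA": (flags >> 10) & 0x1,
        "TC": (flags >> 9) & 0x1,
        "RD": (flags >> 8) & 0x1,
        "RA": (flags >> 7) & 0x1,
        "Z": (flags >> 4) & 0x7,
        "RCODE": flags & 0xF,
    }


print(encode_qname("example.com"))      # b'\x07example\x03com\x00'
print(len(build_query("example.com")))  # 29
print(decode_flags(0x8183))             # a response with RD=1, RA=1, RCODE=3 (name error)
```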
DNS Resolution Record
If you have ever built a website, you must be familiar with DNS resolution records. Let’s review the table below.
Type | Mnemonic | Description |
---|---|---|
1 | A | Obtains an IPv4 address from a domain name |
2 | NS | Names the authoritative name servers for a domain (common) |
5 | CNAME | Sets a domain name alias (common) |
6 | SOA | Start of authority |
11 | WKS | Well-known services |
12 | PTR | Translates an IP address into a domain name |
13 | HINFO | Host information |
15 | MX | Mail exchange (common) |
28 | AAAA | Obtains an IPv6 address from a domain name (common) |
252 | AXFR | Request to transfer an entire zone |
255 | ANY | Request for all records |
Domain name Resolution Process
- The browser first checks its own cache for an IP address already resolved for the domain name. If it finds one, resolution ends. The browser cache is bounded by the entry expiration time and the cache size.
- If the browser cache has no entry, the browser asks the operating system, which checks its own cache and the local hosts file.
- The router may also have a cache.
- If none of these steps finds the address, the request goes to the local DNS server (LDNS), the DNS servers (usually two) assigned to you by your ISP. In most cases, the domain name is resolved here. Here are the two DNS servers provided to me by Cloudflare.
- If the LDNS has no answer, it asks a root server. The root server returns to the local DNS server the address of the top-level domain (TLD) server responsible for the name. gTLD servers handle generic top-level domains such as .com and .org (.cn is a country-code top-level domain). The often-quoted figure of 13 refers to the root server names (a through m), each served by many machines, not to gTLD servers.
- The local DNS server sends a request to the TLD server, which returns the address of a name server. This is usually the name server of the registrar where your domain name is registered, such as NameCheap, GoDaddy, or Wanwang.
- The name server looks up its table mapping domain names to IP addresses. Normally it finds the destination IP record for the domain name and returns it to the local DNS server together with a TTL.
- The local DNS server caches the IP address according to the TTL and returns the result to the client. The client in turn caches the IP address in the local system cache according to the TTL. At this point, domain name resolution is complete.
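The lookup order above (nearest cache first, then upstream, with every level caching by TTL) can be sketched as a chain of small caches. Everything here is hypothetical scaffolding for illustration: `TTLCache`, `resolve`, and `fake_upstream` are made-up names, `fake_upstream` stands in for the whole root → TLD → name server walk, and the address 192.0.2.10 is from a range reserved for documentation. Time is passed in explicitly so the example is deterministic.

```python
class TTLCache:
    """One cache level (browser, OS, local DNS) that honours record TTLs."""

    def __init__(self, name: str):
        self.name = name
        self._entries = {}  # domain -> (ip, expiry time in seconds)

    def get(self, domain: str, now: float):
        entry = self._entries.get(domain)
        if entry is None:
            return None
        ip, expires_at = entry
        if now >= expires_at:  # the record has outlived its TTL
            del self._entries[domain]
            return None
        return ip

    def put(self, domain: str, ip: str, ttl: float, now: float) -> None:
        self._entries[domain] = (ip, now + ttl)


def resolve(domain: str, caches: list, query_upstream, now: float):
    """Check each cache in order; on a full miss, ask upstream and fill every cache."""
    for cache in caches:
        ip = cache.get(domain, now)
        if ip is not None:
            return ip, cache.name
    ip, ttl = query_upstream(domain)
    for cache in caches:
        cache.put(domain, ip, ttl, now)
    return ip, "upstream"


def fake_upstream(domain: str):
    """Stand-in for the root/TLD/name-server lookups: returns (ip, ttl)."""
    return "192.0.2.10", 60


caches = [TTLCache("browser"), TTLCache("os"), TTLCache("local-dns")]
print(resolve("example.com", caches, fake_upstream, now=0))   # ('192.0.2.10', 'upstream')
print(resolve("example.com", caches, fake_upstream, now=30))  # ('192.0.2.10', 'browser')
print(resolve("example.com", caches, fake_upstream, now=61))  # ('192.0.2.10', 'upstream')
```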
Introduction to the CDN
CDN stands for Content Delivery Network. It redirects each user request to the nearest service node in real time, based on combined information such as network traffic, the connectivity and load of each node, the distance to the user, and response time, so that users can access websites faster.
In layman’s terms, users used to fetch resources from your own server, and now they fetch them from the CDN’s cache servers. In practice, we only need to point the DNS resolution of the domain name at the name servers provided by the CDN provider.
I’ve been using the free version of Cloudflare, which is awesome. It offers HTTP/2, IPv6, and Brotli (a compression algorithm that typically compresses text better than gzip).
A typical CDN system consists of the following two parts:
- Distribution service system
  The basic element of the distribution service system is the cache device, which synchronizes content from the origin site, responds to user requests, and quickly serves locally cached content to the user.
- Load balancing system
  Schedules every incoming request and decides the final address that the user actually accesses. It has two levels: global server load balancing (GSLB) and local server load balancing (SLB). GSLB chooses the service node by proximity to give the user the most suitable cache device; SLB balances load among the devices within a node.
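How a global load balancer might weigh proximity against load can be illustrated with a deliberately simple scoring rule. This is an assumption made for the example, not how any particular CDN works: the `EdgeNode` type, the `pick_node` function, the 0.9 load cutoff, and the score `rtt_ms * (1 + load)` are all invented here.

```python
from dataclasses import dataclass


@dataclass
class EdgeNode:
    name: str
    rtt_ms: float  # measured round-trip time from the user to this node
    load: float    # 0.0 = idle, 1.0 = saturated


def pick_node(nodes: list, max_load: float = 0.9) -> EdgeNode:
    """Prefer nearby nodes, but skip overloaded ones when an alternative exists."""
    candidates = [n for n in nodes if n.load < max_load] or list(nodes)
    # Hypothetical score: latency inflated by current load; lower is better.
    return min(candidates, key=lambda n: n.rtt_ms * (1 + n.load))


nodes = [
    EdgeNode("tokyo", rtt_ms=20, load=0.95),     # closest, but nearly saturated
    EdgeNode("osaka", rtt_ms=28, load=0.30),
    EdgeNode("frankfurt", rtt_ms=240, load=0.10),
]
print(pick_node(nodes).name)  # osaka
```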
A few interview questions
Three handshakes and four waves
See above.
How does TCP ensure reliable delivery?
- To facilitate transmission, TCP splits large data into segments that it manages individually.
- When TCP sends a segment, it starts a timer and waits for the destination to acknowledge receipt. If the acknowledgement does not arrive in time, the segment is resent.
- When TCP receives data from the other end, it sends an acknowledgement. The acknowledgement is not sent immediately; it is usually delayed by a fraction of a second.
- TCP verifies data integrity with a checksum. If a segment arrives with a bad checksum, TCP discards it and does not acknowledge it (TCP has no negative-acknowledgement flag); the sender then times out and retransmits the segment.
- TCP assigns a sequence number to every byte to keep transmission in order. If segments arrive out of order, TCP reorders them by sequence number.
- IP datagrams may be duplicated, so the TCP receiver discards duplicate data.
- TCP provides flow control. Each side of a TCP connection has a fixed amount of buffer space, and the receiver only allows the other end to send as much data as its buffer can accept. TCP uses a variable-size sliding window protocol for flow control.
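The reordering and de-duplication steps can be sketched as a small receiver-side function. It is a simplification under stated assumptions: segments never partially overlap, sequence numbers do not wrap around, and the name `reassemble` and the sample segments are made up for the example. The second return value is the next byte expected, which is exactly what a cumulative acknowledgement would carry.

```python
def reassemble(segments, initial_seq: int):
    """Rebuild the byte stream from (seq, payload) pairs in arrival order."""
    pending = {}            # out-of-order segments waiting for the gap to close
    next_seq = initial_seq  # next byte expected (the cumulative ack number)
    stream = bytearray()
    for seq, payload in segments:
        if seq < next_seq or seq in pending:
            continue  # duplicate: already delivered or already buffered
        pending[seq] = payload
        while next_seq in pending:  # deliver everything that is now contiguous
            chunk = pending.pop(next_seq)
            stream += chunk
            next_seq += len(chunk)
    return bytes(stream), next_seq


arrivals = [
    (306, b"WOR"),  # arrives early and is buffered
    (300, b"HEL"),
    (303, b"LO "),
    (303, b"LO "),  # duplicate, discarded
    (309, b"LD"),
]
print(reassemble(arrivals, initial_seq=300))  # (b'HELLO WORLD', 311)
```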
Why three handshakes? Why not two?
To ensure reliable data transmission, both parties in a TCP connection must maintain a sequence number that identifies which data has been received. The three-way handshake is the necessary step in which the two sides tell each other their initial sequence numbers and confirm that the other side has received them.
With only two handshakes, at most the initial sequence number of the connection initiator can be confirmed; the sequence number chosen by the other side would never be acknowledged.
What if the connection has been established, but the client suddenly fails?
TCP has a keep-alive timer so that the server does not wait forever and waste resources if the client fails. The server resets this timer every time it receives data from the client; the timer is usually set to two hours. If no data arrives from the client within two hours, the server sends a probe segment and then repeats it every 75 seconds. If 10 probes in a row get no response, the server assumes the client has failed and closes the connection.
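With the figures quoted above, the time it takes to notice a dead client can be worked out directly. The sketch below treats the result as a rough estimate: actual defaults differ between operating systems, and the function name and parameter names are chosen for the example.

```python
def keepalive_detection_seconds(idle_s: int = 7200, interval_s: int = 75, probes: int = 10) -> int:
    """Idle time before the first probe plus one interval per unanswered probe."""
    return idle_s + interval_s * probes


total = keepalive_detection_seconds()
hours, remainder = divmod(total, 3600)
minutes, seconds = divmod(remainder, 60)
print(total)                    # 7950
print(hours, minutes, seconds)  # 2 12 30
```

So with these values a silently failed client is detected after roughly 2 hours 12 minutes, which is why applications that need faster detection often add their own heartbeat.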
DNS Resolution Process (key)
See above.
Finally
Stay tuned for the next article, which will take a thorough look at the HTTP protocol.