Author: Kong Lingtao

Since 2015, the QUIC protocol has been undergoing standardization at the IETF and has been implemented by major vendors in China and abroad. QUIC has many advantages, such as 0-RTT connection establishment and support for connection migration, and it will become the underlying transport protocol of the next-generation Internet protocol, HTTP/3. For these reasons, the Ant Group Alipay client team and access gateway team began deploying QUIC in mobile payment, overseas acceleration, and other scenarios in the second half of 2018.

This article is an overview of how QUIC has been deployed at Ant. We chose an overview because the QUIC protocol is too complex: measured against existing protocols, QUIC is roughly HTTP + TLS + TCP, so it cannot be covered in detail in a single article. Instead, we present the main points of the deployment, organized into the following parts:

  • QUIC background: a brief but comprehensive introduction to QUIC-related background knowledge;
  • Scheme selection and design: a detailed look at how Ant's deployment scheme takes a new approach and elegantly supports many QUIC features, including connection migration;
  • Deployment scenarios: the two scenarios in which QUIC is deployed at Ant, namely the Alipay client link and the overseas acceleration link;
  • Key technologies: the core problems we had to solve while deploying QUIC and the solutions we used, including "supporting connection migration", "improving the 0-RTT ratio", "supporting lossless UDP upgrades", and "intelligent link selection on the client";
  • Key technology patents.

Background on QUIC

Because readers come from different backgrounds, we first introduce some QUIC background knowledge. If you are interested in more of the protocol's design details, see the relevant drafts: datatracker.ietf.org/wg/quic/doc…

First, what is QUIC?

Put simply, QUIC (Quick UDP Internet Connections) is a secure and reliable transport protocol encapsulated in UDP. Its goal is to replace TCP, with TLS built in, as the standard secure transport protocol. The following figure shows the position of QUIC in the protocol stack. HTTP carried over QUIC has been further standardized as HTTP/3.

Second, why QUIC?

Before QUIC, TCP carried more than 90% of Internet traffic and seemed to work fine, so why was a revolutionary protocol like QUIC needed? The main reason is that TCP, after several decades of development, suffers from "protocol rigidity" (ossification), which shows up in the following ways:

  1. Rigidity in network devices: some firewalls and NAT devices treat packets as attacks when TCP introduces a new feature, such as a new TCP option, so the new feature cannot work across old network devices.
  2. Rigidity in operating systems: upgrading the network stack of an operating system is difficult, so some TCP features cannot evolve quickly.

In addition, once the application-layer protocols had been optimized to TLS 1.3 and HTTP/2, optimization of the transport layer was put on the agenda as well. QUIC builds on TCP, keeps its essence, discards its dross, and has the following core advantages:

A brief history of the QUIC ecosystem

Below are some important milestones of QUIC from its creation until now. In 2021, QUIC v1 was published as an RFC, ending the period in which many divergent versions coexisted.

With the QUIC background covered, we now turn to the deployment at Ant. For ease of description, we summarize Ant's QUIC work as "one, two, three, four": one deployment framework, two deployment scenarios, three patents protecting our innovations, and four key technologies.

One deployment framework

Ant's access gateway is built on multi-process NGINX (internally known as Spanner, the "spanner" for protocol offloading), and UDP poses many challenges in a multi-process programming model, lossless upgrades being a typical example. To design a complete framework, before deployment we carefully considered the convenience, extensibility, and performance of running the server on the cloud, and designed the following framework to support the different deployment scenarios:

The framework has two components:

  1. QUIC LB component: developed on top of NGINX's layer-4 UDP Stream module. It routes packets based on the server information carried in the QUIC DCID, which is what makes connection migration possible;
  2. NGINX QUIC server: we developed NGINX_QUIC_MODULE, in which each worker listens on two types of ports:
    • Base port: every worker uses the same port number, listens on it with reuseport, and exposes it to the QUIC LB to receive the packets of the first RTT from the client. The DCID of these packets is generated by the client and carries no routing information.
    • Working port: every worker uses a different port number. This is the port that does the real work, receiving the QUIC packets after the first RTT. The DCID of these packets is generated by the server process and carries the server's information.

The framework currently supports the following capabilities:

  1. Full support for QUIC connection migration, including CID updates during migration, entirely in user mode and without modifying the kernel;
  2. Support for lossless QUIC upgrades and other operation and maintenance needs, entirely in user mode and without modifying the kernel;
  3. Support for true 0-RTT and a higher 0-RTT ratio.

How these capabilities are achieved is described later.

Two deployment scenarios

Our two deployment scenarios, from near to far, are as follows:

Scenario 1: Deployment on the Alipay mobile client

The following is a schematic diagram of the deployment architecture. The Alipay mobile client carries HTTP requests over QUIC. The requests pass through QUIC LB and other layer-4 gateways and are forwarded to Spanner (the layer-7 gateway that Ant developed in-house on top of NGINX). On Spanner, we proxy each QUIC request into a TCP request and send it to the service gateway (RS).

The specific choices are as follows:

  • The supported QUIC version is gQUIC Q46;
  • NGINX QUIC MODULE supports QUIC access and proxying to TCP;
  • All RPC requests are supported, including mobile payments, funds, and Ant Forest;
  • There are two ways to select the QUIC link:
    • In Backup mode, the client falls back to the QUIC link when the TCP link is unavailable.
    • In Smart mode, TCP and QUIC race each other. If TCP performs worse than QUIC, the QUIC link is proactively used for the next request.

In this scenario, the benefits of using QUIC include:

  1. When the client's connection migrates, the link keeps serving without interruption;
  2. The client saves the time of the TCP three-way handshake when it first initiates a connection;
  3. On weak networks, QUIC's transmission control can improve transmission performance.

Scenario 2: Deployment for overseas acceleration

Since 2018, Ant Group has been developing its own overseas dynamic acceleration platform, AGNA (Ant Global Network Accelerator), to replace the acceleration services provided by third-party vendors. With a Local Proxy (LP) deployed overseas and a Remote Proxy (RP) deployed in China, AGNA sends overseas users' requests back to the origin in China over an accelerated link between LP and RP. As shown in the figure below, we deployed QUIC on the link between LP and RP.

At the overseas access point (LP), each TCP connection is proxied onto a stream of a QUIC connection. At the domestic access point (RP), each QUIC stream is proxied back into a TCP connection. A long-lived QUIC connection is used between LP and RP.

In this scenario, the benefits of using QUIC include:

  1. Streams on the long-lived QUIC connection carry the TCP requests, so a cross-sea connection does not have to be established every time;
  2. On cross-sea networks, QUIC's transmission control can improve transmission performance.

Three key patents

So far, we have filed patents to protect some of the innovations made during the deployment, and we actively share our results in the IETF standardization process. The patents include:

Patent 1

For the overseas back-to-origin acceleration method in deployment scenario 2, which uses a layer-4 proxy over QUIC streams, we sought patent protection and proposed "a link acceleration method based on QUIC protocol proxy". This patent has now been granted as a United States patent, patent number: CN110213241A.

Patent 2

We filed a patent for the QUIC LB component of our deployment framework and proposed "a stateless, consistent and distributed QUIC load balancing device"; the application is still being processed. Because QUIC LB supports connection migration in the QUIC protocol well, the IETF QUIC WG currently has a draft on QUIC LB. We have taken part in discussing and drafting it, and we will continue to bring the resulting scheme to cloud products.

Patent 3

We filed a patent for our lossless UDP upgrade method and proposed "a lossless upgrade scheme for QUIC server"; the application is still being processed. Lossless UDP upgrade is a hard problem in the industry. Some current approaches need an extra hop in user mode, which costs a lot of performance. Our scheme solves the problem within our deployment framework; its details are described under the key technologies below.

Four key technologies

Throughout the deployment, we designed solutions to several core problems, which resulted in four key technologies:

Technical point 1: Gracefully supporting connection migration

First, the problem of connection migration. As mentioned above, one of QUIC's important features is support for connection migration. Connection migration means that a long-lived connection survives a change of the client's 5-tuple, for example when the client switches networks from 4G to Wi-Fi or when NAT rebinding occurs: QUIC can carry the connection state over to the new 5-tuple. QUIC can do this partly because it is built on connectionless UDP and partly because it identifies a connection by a unique CID rather than by the 5-tuple.

The following figure is a schematic diagram of connection migration in QUIC. When the client's egress address switches from A to B, the CID stays the same, so the QUIC server can still look up the corresponding session state.

The theory is appealing, but deploying it is hard. In an end-to-end deployment, load balancing devices are introduced, and every mechanism that relies on a 5-tuple hash for forwarding or session association breaks during connection migration. Take LVS as an example: after a connection migrates, LVS's reliance on the 5-tuple makes it address a different server. Even if LVS addresses the right server, the kernel associates the arriving packet with a process by the 5-tuple, so addressing still fails. In addition, the IETF draft requires the CID to be updated when a connection migrates, which makes a scheme that forwards purely on the CID impractical as well.

To solve this problem, we designed the deployment framework introduced earlier. Here we simplify and abstract the scheme. The overall idea is as follows:

1. For layer-4 load balancing, we designed a QUIC LoadBalancer mechanism:

  • We extended the QUIC CID with some fields (ServerInfo) that encode the IP address and working port of the QUIC server;
  • During connection migration, the QUIC LoadBalancer routes on the ServerInfo in the CID, avoiding the problems caused by 5-tuple-based session association;
  • When the CID needs to be updated, the ServerInfo in the new CID stays unchanged. This avoids the inconsistent addressing that would occur if the back end were selected by a hash of the CID alone.

2. For the multi-process mode of the QUIC server, we broke away from NGINX's inherent model, in which multiple workers listen on the same port, and designed a multi-port listening mechanism. Each worker is isolated on its own working port, and the port information is carried in the CID of the packet returned in response to the first Initial packet. The advantages of this approach are as follows:

  • Whether or not the connection migrates, QUIC LB can forward packets to the correct process based on ServerInfo;
  • The common industry solution is to modify the kernel, turning reuseport into "reuse CID" so that the kernel selects the process based on the CID. Although eBPF and similar means may make this possible later, we believe a mechanism that modifies the kernel depends too heavily on the underlying system, which hinders large-scale deployment and operation of the solution, especially on the public cloud;
  • Using separate ports also helps solve the lossless UDP upgrade problem in multi-process mode, which is described in Technical point 3.
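
The ServerInfo-based routing described above can be illustrated with a minimal Python sketch. The CID layout here (a 4-byte IPv4 address, a 2-byte working port, then random bytes) is an assumption made purely for illustration, as are the names `new_cid` and `route`; the article does not specify the real layout, and a production scheme would likely also protect ServerInfo from being read or forged.

```python
import ipaddress
import os
import struct
from typing import Tuple

# Hypothetical layout: 4-byte IPv4 address + 2-byte working port, then random bytes.
SERVER_INFO_LEN = 6


def new_cid(server_ip: str, working_port: int, random_len: int = 6) -> bytes:
    """Server side: build a CID whose prefix is ServerInfo and whose suffix is random."""
    info = ipaddress.IPv4Address(server_ip).packed + struct.pack("!H", working_port)
    return info + os.urandom(random_len)


def route(dcid: bytes) -> Tuple[str, int]:
    """LB side: pick the back end from ServerInfo in the DCID, ignoring the 5-tuple."""
    ip = str(ipaddress.IPv4Address(dcid[:4]))
    (port,) = struct.unpack("!H", dcid[4:SERVER_INFO_LEN])
    return ip, port


# A CID update issues a new CID with fresh random bytes but the same ServerInfo.
old_cid = new_cid("10.0.0.7", 8443)
updated_cid = new_cid("10.0.0.7", 8443)
print(route(old_cid), route(updated_cid))
```

Because `route` never looks at the client's address, a change of 5-tuple or a CID update does not change which server and working port receive the packet.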

Technical point 2: Improving the 0-RTT handshake ratio

First, the principle of QUIC 0-RTT. As mentioned earlier, QUIC completes both the transport-layer handshake and the cryptographic-layer handshake in 0-RTT. TLS 1.3 itself supports 0-RTT for the cryptographic-layer handshake, so that part is no surprise. How does QUIC support 0-RTT for the transport-layer handshake? Consider the purpose of the transport-layer handshake: the server verifies that the client really is the party that wants to connect and that its address is not spoofed, which prevents attacks with forged source addresses. In TCP, the server relies on the final ACK of the three-way handshake to verify the client: only a real client receives the SYN-ACK from the server and replies to it.

QUIC also needs to verify the source address of the handshake; otherwise it is exposed to UDP DDoS attacks. How does QUIC do this? It relies on the Source Address Token (STK) mechanism. As with TLS, a QUIC 0-RTT handshake builds on a previous connection to the same server, so a genuine first connection still needs one RTT to obtain the STK. The principle is shown in the figure below:

  1. Similar to the Session Ticket mechanism, the server encrypts the client's address and the current timestamp with its own key to generate an STK.
  2. The server sends the STK to the client, and the client presents it in the next handshake. Because the STK cannot be tampered with, the server decrypts it with its own key. If the address it contains matches the address the client is handshaking from and the time is within the validity period, the client is trusted and the connection can be established.
  3. Because the client has no STK the first time it handshakes, the server replies with a REJ message that rejects this handshake and carries the STK.
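
The issue-and-verify cycle above can be sketched in Python. This is a simplified illustration, not the gQUIC token format: the article says the server encrypts the token, whereas this sketch only authenticates it with an HMAC (standard library only), which is enough to show the tamper check, the address check, and the validity check. `KEY`, `VALIDITY_SECONDS`, `make_stk`, and `check_stk` are hypothetical names.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional

KEY = b"cluster-wide-secret"   # hypothetical; a real key would come from a key service
VALIDITY_SECONDS = 24 * 3600   # hypothetical validity period
TAG_LEN = 32                   # length of an HMAC-SHA256 tag


def make_stk(client_ip: str, now: Optional[float] = None) -> bytes:
    """Server: bind the client's address and the issue time into a token."""
    ts = int(time.time() if now is None else now)
    body = struct.pack("!Q", ts) + client_ip.encode()
    return hmac.new(KEY, body, hashlib.sha256).digest() + body


def check_stk(stk: bytes, client_ip: str, now: Optional[float] = None) -> bool:
    """Server: accept only an untampered, fresh token issued to this address."""
    tag, body = stk[:TAG_LEN], stk[TAG_LEN:]
    if len(body) < 8:
        return False
    expected = hmac.new(KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False                      # tampered or issued under another key
    (ts,) = struct.unpack("!Q", body[:8])
    current = time.time() if now is None else now
    if not 0 <= current - ts <= VALIDITY_SECONDS:
        return False                      # outside the validity period
    return body[8:] == client_ip.encode() # address must match the handshake address


stk = make_stk("203.0.113.5", now=1000)
print(check_stk(stk, "203.0.113.5", now=1500))   # same address, still fresh
print(check_stk(stk, "198.51.100.9", now=1500))  # address changed
```

The second call fails because the token is bound to the address it was issued for, which is the limitation that matters most on mobile networks.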

In theory, as long as the client caches the STK and presents it in the next handshake, the server can verify it directly, which gives 0-RTT at the transport layer. In practice, there are two problems:

  1. The STK is encrypted by a particular server, so if the client is routed to a different server next time, that server must be able to recognize it as well.
  2. The STK encodes the client's address from the previous connection. If the client's address has changed by the next handshake, verification fails. This is very likely on mobile clients, especially in IPv6 scenarios, where the client's egress address changes often.

Now a brief look at our solution. The first problem is easy to solve: we only need to ensure that all machines in the cluster use the same secret key to generate STKs. For the second problem, our solution is as follows:

  1. We added a Client ID to the STK. The Client ID is generated by the client through the Wireless Bodyguard black box and is globally unique, similar to a device's SIMID. The client passes it to the server in an encrypted transport parameter, and the server includes it in the STK;
  2. If STK verification fails because the client's IP address has changed, the server verifies the Client ID instead. A client's ID never changes, so verification succeeds as long as the client is genuine. To guard against leaked Client IDs, Client ID verification is rate limited.
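
The fallback logic in these two steps can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the token is taken to be already decrypted and verified, and the class names, the sliding-window rate limiter, and its default limits are hypothetical choices rather than details from the article.

```python
import time
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Deque, Dict, Optional


@dataclass
class StkPayload:
    """Fields recovered from an STK after its integrity has been verified."""
    client_ip: str
    client_id: str
    issued_at: float


class StkValidator:
    def __init__(self, validity: float = 86400.0,
                 max_fallbacks: int = 3, window: float = 60.0) -> None:
        self.validity = validity            # hypothetical STK lifetime in seconds
        self.max_fallbacks = max_fallbacks  # hypothetical limit per Client ID per window
        self.window = window
        self._fallbacks: Dict[str, Deque[float]] = defaultdict(deque)

    def accept(self, stk: StkPayload, src_ip: str, presented_id: str,
               now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        if not 0 <= now - stk.issued_at <= self.validity:
            return False                    # token expired
        if stk.client_ip == src_ip:
            return True                     # normal path: address still matches
        if stk.client_id != presented_id:
            return False                    # address changed and the ID does not match
        # Address changed but the Client ID matches: allow, subject to rate limiting.
        recent = self._fallbacks[presented_id]
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.max_fallbacks:
            return False
        recent.append(now)
        return True
```

Rate limiting only the fallback path keeps the common case (unchanged address) free of extra bookkeeping while bounding how often a leaked Client ID could be replayed from new addresses.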

Technical point 3: Supporting lossless QUIC upgrades

Lossless UDP upgrade is known to be an industry-wide challenge. A lossless upgrade means that, on a reload or a binary update, the old process can finish processing the data on its existing connections and then exit gracefully. Taking NGINX as an example, TCP handles lossless upgrades in two main steps:

  1. The old process first closes its listening socket, then closes each connection socket once all requests on that connection have completed.
  2. The new process inherits the listening socket from the old process and starts accepting new requests.

UDP cannot be upgraded losslessly in the same way because it has only a listening socket and nothing like TCP's per-connection sockets. All packets are sent and received on that one socket, which causes the following problems during a hot upgrade:

  1. During a hot upgrade, the old process forks a new process, which inherits the listening socket and starts calling recvmsg on it.
  2. If the old process closes the listening socket, it can no longer receive packets that are still in flight, so it cannot exit gracefully.
  3. If the old process keeps listening, both the old and new processes receive packets from new connections, so the old process can never exit.

Now the solutions. The industry has some methods for this problem. For example, the process ID is carried in each packet, and a packet that arrives at the wrong process is forwarded between the old and new processes. For performance reasons at the access layer, we do not want packets to take that extra hop. Building on our deployment architecture, we designed the following lossless upgrade scheme based on multi-port rotation. Put simply, the old and new processes listen on different groups of ports, and the port is carried in the CID, so QUIC LB can forward packets to the new or old process by port. To simplify operation and maintenance, the ports rotate: after N reloads, port selection wraps around to the beginning. The steps are shown below:

  1. During a lossless upgrade, the base port of the old process is closed so that it no longer accepts first Initial packets. This is analogous to closing the TCP listening socket.
  2. The working port of the old process keeps working and receives the remaining traffic of the old process's connections.
  3. The base port of the new process starts working. It receives first Initial packets and opens new connections, analogous to the TCP listening socket.
  4. The working port of the new process = (I + 1) mod N, where N is the number of old and new process generations that can coexist across reloads. For example, N = 4 means that four generations (Old, New1, New2, and New3) can coexist. I is the port number of the previous process. The increment is 1 because there is only one worker here; with M workers, the increment is M.
  5. Connections accepted on the new process's base port are then forwarded by the load balancer to the new process's working port.
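
The rotation rule in step 4 can be sketched in Python. The single-worker case follows the formula given above; for M workers the article only says to add M, so wrapping at M × N port slots, as well as the function names and the `first_port` offset, are assumptions made for this sketch.

```python
from typing import List


def next_port_index(prev_index: int, workers: int = 1, generations: int = 4) -> int:
    """Port index a worker uses after a reload.

    With one worker this is (I + 1) mod N. With M workers the step is M; wrapping
    at M * N slots is an assumption of this sketch.
    """
    return (prev_index + workers) % (workers * generations)


def working_ports(first_port: int, generation: int,
                  workers: int, generations: int) -> List[int]:
    """Working ports of all workers in a given process generation."""
    slot = generation % generations          # wraps around after N reloads
    return [first_port + slot * workers + w for w in range(workers)]


# One worker, N = 4: indexes cycle 0, 1, 2, 3, 0, ...
index = 0
for _ in range(5):
    print(index, end=" ")
    index = next_port_index(index)
print()

# Two workers, N = 4, hypothetical first working port 9000.
print(working_ports(9000, generation=0, workers=2, generations=4))
print(working_ports(9000, generation=5, workers=2, generations=4))
```

Since each generation owns a distinct group of ports, the load balancer can tell old and new processes apart by the port encoded in the CID, and a port group is reused only after N reloads.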

Technical point 4: Intelligent link selection on the client

Although QUIC was deployed with good intentions, new things rarely develop smoothly. QUIC is based on UDP, and carriers treat UDP less favorably than TCP:

  1. UDP traffic is often throttled when bandwidth is tight;
  2. Some firewalls drop UDP packets outright;
  3. NAT gateways keep UDP sessions alive for only a short time.

Meanwhile, we observed that phones from different manufacturers differ in how well they support UDP, so blindly switching all traffic to QUIC during rollout could produce unexpected results. For this reason, on the client we designed the TCP and QUIC links introduced earlier to back each other up. As shown in the figure below, we measure the RTT, packet loss rate, request completion time, error rate, and other metrics of the TCP and QUIC links in real time and score the two links with a quantitative method. The link is then chosen by score, which avoids the problems of depending on a single link.
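
A score-based choice of this kind might look like the following Python sketch. The article does not disclose the scoring method, so the weights, the `margin` that keeps the client on TCP unless QUIC is clearly better, and all names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LinkStats:
    rtt_ms: float          # measured round-trip time
    loss_rate: float       # packet loss rate, 0.0 to 1.0
    completion_ms: float   # average request completion time
    error_rate: float      # request error rate, 0.0 to 1.0


def score(stats: LinkStats) -> float:
    """Lower is better. The weights are hypothetical, chosen only for illustration."""
    return (stats.rtt_ms + stats.completion_ms
            + 1000.0 * stats.loss_rate + 5000.0 * stats.error_rate)


def choose_link(tcp: LinkStats, quic: LinkStats, margin: float = 0.1) -> str:
    """Use QUIC only when it beats TCP by more than `margin`; otherwise stay on TCP."""
    return "quic" if score(quic) < score(tcp) * (1.0 - margin) else "tcp"


weak_tcp = LinkStats(rtt_ms=200, loss_rate=0.05, completion_ms=800, error_rate=0.02)
good_quic = LinkStats(rtt_ms=120, loss_rate=0.01, completion_ms=500, error_rate=0.01)
blocked_quic = LinkStats(rtt_ms=120, loss_rate=0.9, completion_ms=500, error_rate=0.8)

print(choose_link(weak_tcp, good_quic))
print(choose_link(weak_tcp, blocked_quic))
```

The margin is a simple form of hysteresis: without it, two links with nearly equal scores could cause the client to flip between them on every measurement.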

Summary

This article introduced the deployment scheme, the deployment scenarios, and some key technologies of QUIC at Ant. On the technology side, it described how we proposed the QUIC LB component and the multi-port listening mechanism to elegantly support QUIC connection migration, lossless upgrades of the QUIC server, and more. With this scheme, our access gateway does not depend on changes to the underlying kernel, unlike common industry practice. This greatly eases deployment of the solution, especially on the public cloud. Beyond connection migration, we also proposed a 0-RTT enhancement scheme and an intelligent client-side link selection scheme to maximize the benefits of QUIC on mobile clients. To date, QUIC has been running smoothly on the Alipay mobile client and the global acceleration link, and it has brought good business benefits.

Future plans

Over the past two years, we built mainly on the community's gQUIC, took full advantage of the QUIC protocol, tailored it to Ant's business characteristics to maximize the benefits on mobile clients, creatively proposed several solutions, and actively promoted them to the community and the IETF. As Ant develops and explores more businesses, and as HTTP/3 and QUIC are about to become standards, we will keep exploring the value of QUIC in the following directions:

  1. Take advantage of QUIC being implemented at the application layer to design a unified QUIC transmission control framework that adapts to business types and network types, optimizing the network experience for each kind of business and network;
  2. Switch from gQUIC to IETF QUIC to further the adoption of standard HTTP/3 at Ant;
  3. Contribute the technical points of Ant's QUIC LB to the IETF QUIC LB work, so that it eventually evolves into the standard QUIC LB;
  4. Explore and deploy MPQUIC (multipath QUIC) to maximize the benefits on mobile clients;
  5. Continue QUIC performance optimization using UDP GSO, eBPF, io_uring, and other kernel technologies;
  6. Explore opportunities for QUIC to carry east-west traffic on the intranet.
