A few days ago I read an article by Dr. Turgay Celik, a senior engineer at OpsGenie (a company focused on Dev & Ops), about how they initially adopted Nginx as the gateway for their monolithic application and then introduced other components as they moved to a microservices architecture.

I like to get to the bottom of the technology I work with, so when the article did not go into the design ideas I wanted to see, I dug around for more information. Since the article also touched on Spring Cloud, which I was working with at the time, I decided to write this piece and try to explain, from the perspective of design, why there is such a performance difference.

The technologies involved

This article covers Nginx, Zuul, Spring Cloud, and Linkerd (Envoy and Undertow are also candidate API gateways, but they are out of scope here). Let's start with what an API gateway is.

API gateway

API gateways emerged together with the microservices architecture. Different microservices generally have different network addresses, and an external client may need to call the interfaces of multiple services to complete a single business requirement. If clients communicate directly with each microservice, the following problems arise:

  1. The client requests different microservices multiple times, increasing the complexity of the client.
  2. Cross-domain requests exist and are complicated to process in certain scenarios.
  3. Authentication is complex and each service requires independent authentication.
  4. It is difficult to refactor: as the project iterates, microservices may need to be repartitioned. For example, multiple services might be combined into one, or one service split into several. Such refactoring is hard to carry out if clients communicate directly with the microservices.
  5. Some microservices may use firewall/browser-unfriendly protocols, making direct access difficult.

These issues can be addressed with an API gateway: a middle layer between clients and services through which all external requests pass first. In other words, the microservices themselves can focus on business logic, while cross-cutting concerns such as security, performance, and monitoring are handled by the gateway; this improves business flexibility without sacrificing security. A typical architecture is shown in the figure:
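To make the middle-layer idea concrete, here is a minimal sketch of a gateway that authenticates a request once and routes it to a back-end service by path prefix. All service names, addresses, and the token value are hypothetical; a real gateway would forward the request over HTTP rather than return a URL.

```python
# Minimal API-gateway sketch: one entry point that authenticates a request
# once, then routes it to the right back-end service by path prefix.
# Service names and the token value are made up for illustration.

BACKENDS = {
    "/orders": "http://orders-service:8080",
    "/users": "http://users-service:8080",
}

def route(path, token):
    """Return (status, target) for a request, or an error tuple."""
    if token != "secret-token":           # auth handled once, at the gateway
        return (401, "unauthorized")
    for prefix, backend in BACKENDS.items():
        if path.startswith(prefix):
            return (200, backend + path)  # forward to the matching service
    return (404, "no route")

print(route("/orders/42", "secret-token"))  # routed to the orders service
print(route("/orders/42", "bad-token"))     # rejected at the gateway
```

Clients only ever see the gateway's address; repartitioning the services behind `BACKENDS` does not break them.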

The advantages of using the API gateway are as follows:

  • Easy to monitor. Monitoring data can be collected at the gateway and pushed to external systems for analysis.
  • Easy to authenticate. Authentication can be done on the gateway and then requests can be forwarded to the back-end microservice instead of authentication in each microservice.
  • Reduce the number of interactions between the client and each microservice.

Nginx

Nginx consists of a core and modules. The core is small and concise, and the work it does is simple: based on the configuration file, it maps a client request to a location block, and the directives configured in that location then invoke the appropriate modules to do the actual work.

The following diagram shows the general flow of an HTTP request:

Nginx modules are compiled directly into the binary, i.e. statically. When Nginx starts, its modules are loaded automatically, unlike Apache, where modules are first compiled into shared object (.so) files and the configuration file specifies whether to load them. When a request is being processed, several Nginx modules may participate, but only one module ultimately handles a given request.

After Nginx starts, there is one master process and multiple worker processes, which interact through interprocess communication, as shown in the figure. The blocking point of a worker process is in I/O multiplexing calls such as select() or epoll_wait(), where it waits for read/write events to occur. Nginx handles requests in an asynchronous, non-blocking manner, which means a single worker can service thousands of requests simultaneously; the number of concurrent requests a worker can handle is limited only by available memory. Moreover, by design there is almost no synchronization locking between worker processes handling concurrent requests, and workers usually do not sleep. When the number of worker processes equals the number of CPU cores (ideally with each worker bound to a specific core), the cost of interprocess switching is minimal.
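The multiplexing loop described above can be illustrated in a few lines of Python using the standard `selectors` module, which wraps epoll/kqueue/select underneath. This is a sketch of the model, not Nginx's actual code; the in-process socket pairs stand in for client connections.

```python
# Sketch of the event-driven model an Nginx worker uses: one process
# watches many connections and wakes only when one of them is ready.
import selectors
import socket

sel = selectors.DefaultSelector()
a1, b1 = socket.socketpair()   # two in-process "connections"
a2, b2 = socket.socketpair()

for conn in (b1, b2):
    conn.setblocking(False)                    # non-blocking, like Nginx
    sel.register(conn, selectors.EVENT_READ)

a1.send(b"ping-1")                             # only connection 1 has data

echoed = []
for key, _ in sel.select(timeout=1):           # blocks like epoll_wait()
    echoed.append(key.fileobj.recv(1024))      # handle only ready sockets

print(echoed)  # [b'ping-1'] -- the loop woke only for the ready connection
for s in (a1, b1, a2, b2):
    s.close()
```

One such loop per worker, with workers pinned to cores, is what lets Nginx serve thousands of connections without one thread per connection.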

Netflix Zuul

Zuul is Netflix's open-source microservices gateway component; it works together with Eureka, Ribbon, Hystrix, and other components. At the heart of Zuul is a series of filters that perform the following functions:

  • Authentication and security: Identify the authentication requirements for each resource and reject those that do not comply with the requirements.
  • Insight and monitoring: Track meaningful data and statistics at the edge to produce an accurate view of production.
  • Dynamic routing: Dynamically routing requests to different back-end clusters.
  • Stress test: Incrementally increase traffic to the cluster to understand performance.
  • Load allocation: Allocate capacity for each load type and discard requests that exceed the limit.
  • Static response processing: Part of the response is built directly at the edge to avoid forwarding to the internal cluster.
  • Multi-region resiliency: Route requests across AWS regions to diversify the use of Elastic Load Balancing (ELB) and bring the edge of the system closer to its users.
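The filter model behind the list above can be sketched language-agnostically. Below is a toy Python version (not Zuul's actual Java API) of "pre", "route", and "post" filters run in order, where a pre filter can short-circuit the request:

```python
# Toy version of Zuul's filter chain: "pre" filters run before routing,
# a "route" filter forwards the request, "post" filters touch the response.
# The context dict and filter names are illustrative, not Zuul's real API.

def auth_pre(ctx):
    if not ctx.get("token"):
        ctx["response"] = (401, "rejected by pre filter")
        ctx["halt"] = True                     # stop the chain early

def route_filter(ctx):
    ctx["response"] = (200, "served by " + ctx["backend"])

def timing_post(ctx):
    status, body = ctx["response"]
    ctx["response"] = (status, body + " [timed]")

FILTERS = {"pre": [auth_pre], "route": [route_filter], "post": [timing_post]}

def handle(ctx):
    for phase in ("pre", "route", "post"):
        for f in FILTERS[phase]:
            f(ctx)
            if ctx.get("halt"):
                return ctx["response"]
    return ctx["response"]

print(handle({"token": "t", "backend": "orders-service"}))
print(handle({"backend": "orders-service"}))   # no token: stopped in "pre"
```

New behavior (authentication, stress testing, static responses) is added by dropping another filter into a phase, which is why Zuul's feature list above is so long.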

These are features that Nginx does not have, because Netflix created Zuul to solve many cloud-specific problems (in particular, implementing these features across AWS regions) rather than to be just a reverse proxy like Nginx. Of course, you can use only Zuul's reverse-proxy feature, which I won't describe here.

Zuul 1 is built on the Servlet framework, as shown in the figure, and uses a blocking, multithreaded approach: one thread handles one connection request. Under high internal latency or device failures, this can cause the number of live connections and threads to surge.

The big difference in Zuul 2 is that it runs on an asynchronous, non-blocking framework with one event-loop thread per CPU core handling all requests and responses. The lifecycle of requests and responses is handled through events and callbacks, which reduces the number of threads and is therefore cheaper. Since a connection's data stays on the same CPU, CPU-level caches can be reused, and the latency spikes and retry storms mentioned earlier are mitigated because queued connections and events are much lighter to hold than threads and avoid the cost of thread switching. This change should significantly improve performance, and we will see the effect in the test section later.

We are talking about API gateway performance, but high availability matters too. A brief look at Zuul's high-availability options is worthwhile: since all external requests to back-end microservices flow through Zuul, production deployments usually run Zuul in a highly available configuration to avoid a single point of failure. There are generally two deployment options:

1. The Zuul client is registered with Eureka Server

This is the simpler case: multiple Zuul nodes register with Eureka Server, which is no different from making any other service highly available. As the diagram below shows, when the Zuul clients are themselves registered with Eureka Server, you only need to deploy multiple Zuul nodes. A client automatically queries the Zuul server list from Eureka Server and then requests the Zuul cluster through a load-balancing component such as Ribbon.
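The client-side load balancing step can be sketched as follows. This is a toy Ribbon-style round-robin picker, not Ribbon's actual API; the node addresses are made up, and a real client would periodically refresh the list from Eureka.

```python
# Sketch of client-side load balancing: the client fetches the Zuul server
# list (here hard-coded; in reality from Eureka) and rotates through it.
import itertools

def make_balancer(servers):
    """Return a function that yields servers round-robin."""
    ring = itertools.cycle(servers)
    return lambda: next(ring)

zuul_nodes = ["zuul-1:8080", "zuul-2:8080", "zuul-3:8080"]  # from Eureka
pick = make_balancer(zuul_nodes)

picks = [pick() for _ in range(4)]
print(picks)  # wraps around after the third node
```

Because each client balances on its own, no extra load-balancer hardware sits in front of the Zuul cluster in this scheme.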

2. The Zuul client cannot register with the Eureka Server

If the client is a mobile app, it cannot register with Eureka Server as in option 1. In this case, Zuul can be made highly available with an additional load balancer such as Nginx, HAProxy, or F5.

As shown, the Zuul client sends its request to the load balancer, which forwards it to one of the Zuul nodes behind it, thus making Zuul highly available.

Spring Cloud

Although Spring Cloud carries the word "Cloud," it is not a cloud-computing solution but a set of tools built on top of Spring Boot for rapidly building common patterns in distributed systems.

Applications developed using Spring Cloud are very suitable for deployment on Docker or PaaS, so they are also called Cloud native applications. Cloud native can be simply understood as a software architecture for a cloud environment.

Since it is a toolset, it must contain many tools. Let’s look at the following diagram:

Since only API gateway comparisons are involved here, I won’t go through the other tools.

Spring Cloud integrates Zuul, and from Zuul's own perspective there are no major changes. However, Spring Cloud wraps Zuul with its own components and provides far more functionality than bare Netflix Zuul, so results may differ in the comparison.

The Linkerd service mesh

I think Dr. Turgay Celik chose Linkerd as one of the objects of comparison because Linkerd provides a flexible service mesh for cloud-native applications. A service mesh provides lightweight, high-performance network proxies along with support for microservices frameworks.

Linkerd is an open-source RPC proxy for microservices, built directly on Finagle, Twitter's internal core library for managing communication between services. Virtually every Twitter online service is built on Finagle, which supports millions of RPC calls per second. Linkerd is designed to help users simplify operations under a microservices architecture: a dedicated infrastructure layer for service-to-service communication that handles time-sensitive traffic.

Like Spring Cloud, Linkerd offers load balancing, circuit breaking, service discovery, dynamic request routing, retries and deadlines, TLS, HTTP gateway integration, transparent proxying, gRPC, distributed tracing, operational features, and more, adding another option for microservices-framework technology selection. Since I have not yet used Linkerd, I cannot analyze it at the architecture level for now; I will supplement this later and make my own technology selection.

Performance test results

Dr. Turgay Celik's article used ab, Apache's HTTP server benchmarking tool. Note that since the tests ran on the Amazon (AWS) public cloud, the results may differ from those on your own physical machines.

In the experiment, two machines were started, a client and a server, with the services under test installed on the server; the client then accessed resources through each of the gateways in turn. The test setup is shown in the figure below:

Dr. Turgay Celik conducted the test in three environments:

  1. Single CPU core, 1 GB RAM: used to compare the Nginx reverse proxy and Zuul (discarding the first run and averaging the rest);
  2. Dual CPU cores, 8 GB RAM: used to compare the Nginx reverse proxy and Zuul (discarding the first run and averaging the rest);
  3. 8 CPU cores, 32 GB RAM: used to compare the Nginx reverse proxy, Zuul (discarding the first run and averaging the rest), Spring Cloud Zuul, and Linkerd.

In the test process, 200 parallel threads were used to send a total of 10,000 requests. The command template is as follows:

ab -n 10000 -c 200 http://<server-address>/<path to resource>

Note: Since Dr. Turgay Celik's test was based on Zuul 1, the poor results do not truly reflect the performance of the current Zuul version. I will run my own experiments and publish the results in a follow-up article.

According to the results above, in the single-core environment Zuul had the worst performance (950.57 requests/s) and direct access the best (6,519.68 requests/s); compared with direct access, the Nginx reverse proxy lost about 25% of throughput (4,888.24 requests/s). In the dual-core environment, Nginx performed nearly three times better than Zuul (6,187.14 requests/s versus 2,099.93 requests/s). In the strongest environment (8 cores), direct access, Nginx, and Zuul were close, but Spring Cloud Zuul managed only 873.14 requests/s, probably because of its overall internal overhead.
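The relative figures can be re-derived from the quoted throughput numbers (a quick sketch; rounding may differ slightly from the percentages stated in the article):

```python
# Recomputing the relative figures from the throughput numbers quoted above.
direct = 6519.68       # requests/s, direct access, single core
nginx_1core = 4888.24  # requests/s, Nginx reverse proxy, single core

loss = (direct - nginx_1core) / direct
print(f"Nginx reverse-proxy loss vs direct access: {loss:.1%}")  # ~25.0%

nginx_2core = 6187.14  # requests/s, dual core
zuul_2core = 2099.93   # requests/s, dual core
ratio = nginx_2core / zuul_2core
print(f"Nginx vs Zuul on two cores: {ratio:.2f}x")               # ~2.95x
```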

The final conclusion

From a product perspective, API gateways are responsible for service request routing, composition, and protocol translation. All requests from clients first pass through the API gateway, which then routes each request to the appropriate microservice. The gateway often processes a request by calling multiple microservices and combining the results, and it can translate between web protocols (such as HTTP and WebSocket) and the web-unfriendly protocols used internally. The choice of gateway technology therefore matters for the whole system.
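The composition responsibility mentioned above can be sketched in a few lines. The two "services" here are stub functions standing in for real HTTP calls, and all names and data are hypothetical:

```python
# Sketch of API composition at the gateway: one client request fans out to
# two back-end services, and the gateway merges the results into one reply.

def user_service(user_id):
    """Stub for the user microservice."""
    return {"id": user_id, "name": "Ada"}

def order_service(user_id):
    """Stub for the order microservice."""
    return [{"order": 1, "total": 30}, {"order": 2, "total": 12}]

def profile_endpoint(user_id):
    """Gateway endpoint: calls two microservices, returns one response."""
    user = user_service(user_id)
    orders = order_service(user_id)
    return {**user, "orders": orders, "order_count": len(orders)}

print(profile_endpoint(7))
```

Without the gateway, the client would make both calls itself, which is exactly the extra interaction the gateway is meant to remove.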

From what I understand of the four designs, Zuul 1 is similar to Nginx in that each I/O operation is handed to a worker thread and the request thread blocks until the work completes, but Nginx is implemented in C while Zuul is implemented in Java, and the JVM is slow on first load. Zuul 2 should be a big improvement over Zuul 1. In addition, Zuul performed poorly in the first test run but much better from the second run on, probably due to just-in-time (JIT) compilation. Linkerd itself is a resource-sensitive gateway design, so comparing it with other gateway implementations in a generic environment may be inaccurate.

About the author

Mingyao Zhou graduated from Zhejiang University with a master's degree in engineering. He has 13 years of software development experience, 10 years of technical management experience, and 4 years of distributed software development experience, and has filed 17 invention patents. He is the author of Big Talk Java Performance Optimization, Deep Understanding of JVM & G1 GC, and How Technical Leadership Programmers Can Lead Teams. WeChat ID: Michael_tec; WeChat official account: "Uncle Mike says at 10 every night".

Link to the original English article: engineering.opsgenie.com/comparing-a…

Thanks to Guo Lei for planning and reviewing this article.
