Nginx is a well-known, widely deployed high-performance server. Its performance comes from an excellent architecture design, whose main elements are: modular design, an event-driven architecture, multi-stage asynchronous processing of requests, a management process with multiple worker processes, and a memory pool design. Each is described below.

Modular Design

A highly modular design is the foundation of Nginx’s architecture: in Nginx, everything apart from a small amount of core code is a module.

There are five types of modules in Nginx: core modules, configuration modules, event modules, HTTP modules, and mail modules. The relationship among them is as follows:

Among the five types, the configuration modules and core modules are closely tied to the Nginx framework. The event modules are the foundation on which the HTTP and mail modules are built. The HTTP modules and mail modules have a similar “status”: both sit closer to the application level.
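To make the “everything is a module” idea concrete, here is a minimal sketch of what a module descriptor could look like. It is illustrative only: the field names are assumptions made for this example, not Nginx’s actual ngx_module_t definition, although the five type constants mirror Nginx’s real NGX_CORE_MODULE, NGX_CONF_MODULE, NGX_EVENT_MODULE, NGX_HTTP_MODULE, and NGX_MAIL_MODULE.

```c
/* Simplified sketch of the "everything is a module" idea.
 * Field names are illustrative; Nginx's real ngx_module_t
 * carries considerably more bookkeeping. */
#include <stddef.h>

typedef enum {
    MODULE_CORE,   /* cf. Nginx's NGX_CORE_MODULE  */
    MODULE_CONF,   /* cf. Nginx's NGX_CONF_MODULE  */
    MODULE_EVENT,  /* cf. Nginx's NGX_EVENT_MODULE */
    MODULE_HTTP,   /* cf. Nginx's NGX_HTTP_MODULE  */
    MODULE_MAIL    /* cf. Nginx's NGX_MAIL_MODULE  */
} module_type;

typedef struct {
    const char  *name;         /* module name                         */
    module_type  type;         /* which of the five classes it is in  */
    void        *ctx;          /* type-specific context/interface     */
    int        (*init)(void);  /* hook invoked by the framework       */
} module_t;

/* The small core only walks an array of such descriptors and calls
 * their hooks; all real behavior lives in the modules themselves. */
```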

Event-Driven Architecture

In an event-driven architecture, event sources generate events, an event collector gathers and distributes them, and event handlers process them (each handler first registers with the collector the event types it wants to handle).

For the Nginx server, events are generally generated by network adapters and disks, and Nginx’s event modules are responsible for collecting and distributing them. Any module may be an event consumer: it first registers the event types it is interested in with the event module, and when such an event occurs, the event module dispatches it to that module for processing.
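As a rough illustration of this registration-and-dispatch pattern, here is a minimal epoll-based collector and distributor, assuming Linux. It is a sketch of the idea, not Nginx’s actual code; the conn_t structure and its handler fields are assumptions made for the example.

```c
/* Minimal sketch of an epoll-based event collector/distributor.
 * Not Nginx's actual code: registration in Nginx goes through the
 * event module's own structures. */
#include <sys/epoll.h>

typedef struct conn_s {
    int  fd;
    void (*read_handler)(struct conn_s *);   /* consumer callbacks */
    void (*write_handler)(struct conn_s *);
} conn_t;

/* A consumer module registers the event types it is interested in. */
static void register_events(int epfd, conn_t *c, unsigned events) {
    struct epoll_event ev = { .events = events, .data.ptr = c };
    epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

/* The collector gathers ready events and distributes each one to
 * the consumer that registered for it. */
static void event_loop(int epfd) {
    struct epoll_event evs[512];
    for (;;) {
        int n = epoll_wait(epfd, evs, 512, -1);
        if (n < 0) continue;
        for (int i = 0; i < n; i++) {
            conn_t *c = evs[i].data.ptr;
            if ((evs[i].events & EPOLLIN)  && c->read_handler)  c->read_handler(c);
            if ((evs[i].events & EPOLLOUT) && c->write_handler) c->write_handler(c);
        }
    }
}
```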

For traditional Web servers (such as Apache), event-driven behavior is often limited to establishing and closing TCP connections. Once a connection is established, nothing up to its closure is event driven any more; processing degrades into a batch mode in which each operation executes sequentially. Each request therefore occupies system resources from the moment the connection is established and does not release them until it closes.

This pattern of requests sitting on server resources while waiting to be processed wastes those resources badly. As the figure below shows, traditional Web servers tend to treat a whole process or thread as the event consumer: once an event generated by a request is handed to a process, that process is occupied by the request until the request completes. A typical example is Apache’s synchronous, blocking multi-process mode.

A simple model of how traditional Web servers handle events (rectangles represent processes):
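For contrast, the traditional model can be reduced to a few lines: a synchronous, blocking per-connection loop in which the process is pinned to one connection from accept to close. This is a deliberately bare sketch; request parsing and error handling are omitted.

```c
/* Sketch of the traditional blocking model: the process is tied up
 * by one connection from accept() until close(). */
#include <unistd.h>
#include <sys/socket.h>

static void serve_blocking(int listen_fd) {
    char buf[4096];
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);   /* blocks */
        ssize_t n = read(fd, buf, sizeof(buf));   /* blocks */
        if (n > 0)
            write(fd, buf, (size_t)n);            /* blocks; echo stands in for real work */
        close(fd);  /* resources held for the whole request lifetime */
    }
}
```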

Nginx’s event-driven architecture handles work differently from traditional Web servers: it does not use processes or threads as event consumers; event consumers can only be modules. Only the event collector and distributor is entitled to occupy process resources: as it distributes an event, it calls the event consumer module, which briefly uses the process the distributor currently occupies.

The figure below lists five different events. After the event collector and distributor process gathers the five events in order during one round of processing, it uses the current process to distribute each event, invoking the corresponding event consumer to handle it. This distribution and invocation is, of course, also ordered.

Nginx’s simple model for handling events

As the figure above shows, when handling request events, Nginx’s event consumers are only called briefly by the event distributor process. This design improves network performance and the user’s perceived request latency: every user request event is responded to promptly, and the throughput of the whole server increases because events are handled in a timely way.

Of course, this also imposes a requirement: no event consumer may block, or it would occupy the event distributor process for a long time and other events would go unanswered. Nginx is non-blocking precisely because its modules satisfy this requirement.
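In practice this means every descriptor a consumer touches is put into non-blocking mode. One standard way to do that (a generic POSIX sketch, not necessarily how Nginx itself does it) is with fcntl:

```c
/* Put a descriptor into non-blocking mode with fcntl(2), so a
 * read()/write() that cannot proceed returns EAGAIN immediately
 * instead of stalling the distributor process. */
#include <fcntl.h>

static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```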

Multi-Stage Asynchronous Processing of Requests

Multi-stage asynchronous processing is closely tied to the event-driven architecture; it can only be implemented on top of one.

Multi-stage asynchronous processing divides the handling of a request into multiple stages according to how events are triggered, so that each stage can be set off by the event collector and distributor.

The stages of processing an HTTP request for a static file and the events that trigger each stage are as follows:

In this example the request is divided into seven stages, some of which may repeat. Because of large response bodies, unstable network speeds, and so on, a single request for a static resource can in practice break down into the hundreds or thousands of stage executions suggested by the figure above.

Asynchronous processing and multi-stage division complement each other: only when a request is split into multiple stages is asynchronous processing possible at all. When an event is distributed to an event consumer, the consumer’s handling of that event amounts to processing just one stage of the request.

When can the next stage be handled? Only when the kernel says so: the next time the relevant event occurs, an event dispatcher such as epoll delivers the notification, and the event consumer is called again.
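The pattern can be sketched as a per-request state machine. The stage names below are invented for the illustration and are not Nginx’s actual phase list; the point is that each event notification advances the request by exactly one stage and then hands the process straight back to the distributor.

```c
/* Illustrative sketch of multi-stage processing: each event
 * notification advances the request by one stage, then returns
 * control to the event loop. Stage names are made up for the
 * example; they are not Nginx's actual phases. */
typedef enum {
    ST_READ_REQUEST_LINE,
    ST_READ_HEADERS,
    ST_OPEN_FILE,
    ST_SEND_HEADERS,
    ST_SEND_BODY_CHUNK,   /* may repeat many times */
    ST_DONE
} stage_t;

typedef struct {
    int     fd;
    stage_t stage;
    /* ... parse state, file handle, offsets ... */
} request_t;

/* Called by the event distributor when fd becomes readable/writable. */
static void on_event(request_t *r) {
    switch (r->stage) {
    case ST_READ_REQUEST_LINE:
        /* read what is available; if the line is complete, advance */
        r->stage = ST_READ_HEADERS;
        break;
    case ST_SEND_BODY_CHUNK:
        /* write one chunk; stay in this stage until the kernel buffer
         * fills (EAGAIN), then wait for the next writable event */
        break;
    /* ... other stages elided ... */
    default:
        break;
    }
    /* return immediately: the next stage runs on the next event */
}
```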

Management Process and Multi-Worker Process Design

After Nginx starts, there is one master process and multiple worker processes. The master process mainly manages the worker processes: it receives signals from the outside world, sends signals to all worker processes, monitors their running state, and starts worker processes.

The worker processes handle request events from clients. The workers are peers: they compete equally for client requests and are independent of one another, and a given request is processed in exactly one worker process. The number of worker processes is configurable and is usually set equal to the number of CPU cores on the machine, for reasons tied to the event-processing model. The Nginx process model can be represented by the following figure:

Check the Nginx processes on the server (for example with ps -ef | grep nginx; the listing shows one master process and several worker processes):
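A minimal sketch of this master/worker pattern is shown below. It is greatly simplified: worker_loop is an assumed function standing in for the worker’s event loop, and Nginx’s real master additionally handles signals, graceful reloads, and binary upgrades.

```c
/* Greatly simplified master/worker sketch: fork one worker per CPU
 * core and restart any worker that dies. */
#include <unistd.h>
#include <sys/wait.h>

extern void worker_loop(void);   /* assumed: runs the event loop */

int main(void) {
    long ncpu = sysconf(_SC_NPROCESSORS_ONLN);
    for (long i = 0; i < ncpu; i++) {
        if (fork() == 0) { worker_loop(); _exit(0); }
    }
    for (;;) {                    /* master: monitor the workers */
        pid_t dead = wait(NULL);  /* blocks until a worker exits */
        if (dead > 0 && fork() == 0) {
            worker_loop();        /* respawn a replacement worker */
            _exit(0);
        }
    }
}
```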

This design brings the following advantages:

1. Take advantage of the concurrent processing capabilities of multi-core systems

Modern operating systems support multi-core CPU architectures, which let multiple processes run on different CPU cores at the same time. Because all of Nginx’s worker processes are completely equal, they can make full use of those cores, which improves network performance and reduces request latency.

2. Load balancing

The worker processes balance load among themselves through inter-process communication: an arriving request is more likely to be allocated to a lightly loaded worker process. This, too, improves network performance and reduces request latency to some extent (see the sketch after this list).

3. The management process is responsible for monitoring the status of the worker processes and managing their behavior

The management process consumes few system resources; it is there simply to start, stop, monitor, and otherwise control the worker processes. First, this improves the reliability of the system: when a worker process runs into trouble, the management process can start a new worker so that system performance does not degrade.

Second, the management process supports upgrading the program and modifying configuration items while the Nginx service is running. This design makes dynamic scalability and dynamic customization easier to achieve.
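Regarding the load balancing in point 2: one concrete mechanism Nginx uses is an accept mutex, where a worker that is already busy temporarily stops competing for new connections. The sketch below is simplified from that idea rather than copied from Nginx’s source; the lock and accept helpers are assumed externs.

```c
/* Simplified sketch of the accept-mutex load-balancing idea: a
 * worker with few free connection slots sits out a few rounds, so
 * lighter-loaded workers win the new connections instead. */
extern int  try_lock_accept_mutex(void);  /* assumed shared-memory lock */
extern void unlock_accept_mutex(void);
extern void accept_new_connections(void);

static int accept_disabled;               /* > 0 means "back off" */

static void maybe_accept(int total_slots, int free_slots) {
    /* Nginx uses a similar formula: stop competing once fewer than
     * one eighth of the connection slots remain free. */
    accept_disabled = total_slots / 8 - free_slots;

    if (accept_disabled > 0) {
        accept_disabled--;                /* skip this round, retry later */
        return;
    }
    if (try_lock_accept_mutex()) {        /* only the winner accepts */
        accept_new_connections();
        unlock_accept_mutex();
    }
}
```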

Memory Pool Design

To avoid memory fragmentation, to request memory from the operating system less often, and to lower the development complexity of its many modules, Nginx uses a simple memory pool. Its main job is to merge what would be many separate memory requests to the system into one, which greatly reduces CPU consumption while also cutting down memory fragmentation.

As a result, there is usually a simple, separate memory pool for each request (for example, one allocated per TCP connection); at the end of the request the entire pool is destroyed, returning all of the memory it allocated in one go.
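A toy version of such a pool is sketched below. It is deliberately simplified: Nginx’s real ngx_pool_t also chains additional blocks, special-cases large allocations, and runs cleanup handlers, none of which appear here.

```c
/* Toy request-scoped memory pool: one malloc'd block carved up by
 * bumping a pointer; destroying the pool frees everything at once. */
#include <stdlib.h>
#include <stddef.h>

typedef struct {
    char *start, *cur, *end;
} pool_t;

pool_t *pool_create(size_t size) {
    pool_t *p = malloc(sizeof(*p) + size);
    if (!p) return NULL;
    p->start = p->cur = (char *)(p + 1);
    p->end   = p->start + size;
    return p;
}

void *pool_alloc(pool_t *p, size_t n) {
    n = (n + sizeof(void *) - 1) & ~(sizeof(void *) - 1); /* align */
    if (p->cur + n > p->end) return NULL;  /* a real pool grows here */
    void *mem = p->cur;
    p->cur += n;
    return mem;
}

/* One free() returns every per-request allocation at once. */
void pool_destroy(pool_t *p) { free(p); }
```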

This design makes module development much simpler, because a module does not need to worry about releasing memory after allocating it. It also reduces request latency, since memory is allocated fewer times. Meanwhile, by reducing memory fragmentation it improves the effective utilization of memory and the number of concurrent connections the system can handle, further enhancing network performance.