From: I like the three frameworks | coordinating editor: le le
Link: my.oschina.net/u/3906190/blog/1859060
Nginx is a free, open-source, high-performance HTTP server and reverse proxy, known for its performance, stability, rich feature set, simple configuration, and low resource consumption. Besides serving web content, it can act as a load balancer and HTTP cache.
Many high-profile sites use Nginx, such as Netflix, GitHub, SoundCloud, and MaxCDN.
1. Overall architecture of Nginx
1.1. The master process
When Nginx starts, it spawns two types of processes: one master process and one or more worker processes (on Windows, currently only one worker).
The master process does not handle network requests itself. It manages the worker processes and is responsible for the following three tasks:
- Loading the configuration
- Starting the worker processes
- Upgrading without interrupting service (hot upgrade)
Therefore, when Nginx is started and you look at the operating system’s process list, you can see at least two Nginx processes.
1.2. Worker processes
It is the worker processes that actually handle network requests and send responses. On Unix-like systems, Nginx can be configured with multiple workers, and each worker process can handle thousands of network requests concurrently.
1.3. Modular design
Nginx worker processes are built from a core and a set of functional modules. The core is responsible for maintaining a run-loop and executing module functions at the appropriate stages of request processing, for example network read/write, storage read/write, content transformation, outbound filtering, and forwarding requests to upstream servers.
This modular design also lets us select and adapt the functional modules as needed and compile Nginx into a server with a specific feature set.
1.4. Event-driven model
The asynchronous, non-blocking event-driven model is the key to Nginx’s high concurrency and high performance. It also benefits from the event-notification and I/O improvements in Linux, Solaris, and BSD-like operating systems, such as epoll, kqueue, and event ports.
1.5. Proxy design
The proxy design runs deep in Nginx: whether handling HTTP, FastCGI, memcached, Redis, or other network requests and responses, it essentially uses a proxy mechanism. Nginx is therefore inherently a high-performance proxy server.
2. Modular design of Nginx
A highly modular design is the foundation of Nginx’s architecture. The server is decomposed into multiple functional modules, each responsible only for its own feature, and the modules strictly follow the principle of “high cohesion, low coupling.”
2.1. Core modules
The core module is an essential module for the normal operation of the Nginx server, providing error logging, configuration file parsing, event-driven mechanism, process management and other core functions.
2.2. Standard HTTP module
The standard HTTP module provides functions related to HTTP protocol parsing, such as port configuration, web page encoding Settings, HTTP response header Settings, and so on.
2.3. Optional HTTP modules
The optional HTTP module extends the standard HTTP functionality to allow Nginx to handle special services such as Flash multimedia transport, GeoIP request resolution, network transport compression, and SSL support.
2.4. Mail service module
The mail service modules are mainly used to support Nginx’s mail proxying, including support for the POP3, IMAP, and SMTP protocols.
2.5. Third-party modules
Third-party modules extend the Nginx server with developer-defined functionality, such as JSON support or embedded Lua scripting.
3. Nginx request processing
Nginx is a high-performance web server capable of handling large numbers of concurrent requests. It combines a multi-process mechanism with an asynchronous mechanism based on asynchronous, non-blocking I/O. The next sections introduce Nginx’s multi-process mechanism and its asynchronous non-blocking mechanism.
3.1. Multi-process mechanism
Whenever the server receives a client connection, the master process has a worker process establish the connection and interact with the client; when the connection is closed, that child process’s work ends.
The advantage of using separate processes is that they are independent of each other and need no locking, which avoids the performance cost of locks, simplifies programming, and lowers development cost.
Furthermore, independent processes cannot corrupt one another. If one process exits abnormally, the others keep working, and the master process quickly starts a new worker process, so the service is not interrupted and risk is minimized.
The disadvantage is that when the operating system creates a child process it must perform work such as copying memory, which costs some time and resources; under a large number of requests this can degrade system performance.
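The fork-per-task model described above can be sketched in a few lines of Python. This is a toy illustration of the master/worker split, not Nginx code; the names `spawn_worker` and `handle_request` are invented for the example, and it assumes a Unix-like OS where `os.fork()` is available.

```python
import os

def handle_request(payload: bytes) -> bytes:
    # Stand-in for real request processing inside a worker.
    return payload.upper()

def spawn_worker(payload: bytes) -> bytes:
    r, w = os.pipe()            # channel for the worker's result
    pid = os.fork()             # the master forks a worker child
    if pid == 0:                # child: act as the worker
        os.close(r)
        os.write(w, handle_request(payload))
        os.close(w)
        os._exit(0)             # the worker exits when its work ends
    os.close(w)                 # parent: act as the master
    result = os.read(r, 1024)
    os.close(r)
    os.waitpid(pid, 0)          # reap the child, as Nginx's master does
    return result

print(spawn_worker(b"hello"))   # b'HELLO'
```

The memory-copying overhead mentioned above is exactly the cost of the `os.fork()` call; this is why Nginx pre-forks a fixed pool of workers rather than forking per request.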
3.2. Asynchronous non-blocking mechanisms
Each worker process can handle multiple client requests in an asynchronous, non-blocking manner.
When a worker process receives a request from a client, it calls IO to process it. If it does not get a result immediately, it processes another request (i.e., non-blocking). In the meantime, the client does not need to wait for a response and can do something else (i.e., asynchronous).
When the IO call returns a result, the worker process is notified; it then suspends the transaction it is currently handling and responds to the client’s request.
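The non-blocking half of this mechanism is easy to demonstrate: a read with no data ready fails immediately instead of blocking, so the worker can move on to other requests. A minimal sketch using a Unix socket pair, purely illustrative and unrelated to Nginx’s own code:

```python
import socket

a, b = socket.socketpair()
a.setblocking(False)            # switch the socket to non-blocking mode

try:
    a.recv(1024)                # nothing has been sent yet
    ready = True
except BlockingIOError:
    ready = False               # no result yet: go handle other work

b.send(b"response")             # later, the I/O completes
data = a.recv(1024)             # now the read succeeds immediately

print(ready, data)              # False b'response'
```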
4. Nginx event-driven model
In Nginx’s asynchronous non-blocking mechanism, a worker process calls IO and then processes other requests. When the IO call returns, the worker process is notified.
For such system calls, the Nginx server mainly relies on its event-driven model.
Nginx’s event-driven model is built from three parts: an event collector, an event sender, and an event handler:
- Event collector: collects the worker process’s various I/O requests;
- Event sender: sends I/O events to the event handler;
- Event handler: responds to the various events.
The event sender puts each request into a list of pending events and uses non-blocking I/O to invoke the event handler to process the request.
This approach is called I/O multiplexing, and it commonly comes in three flavors: the select model, the poll model, and the epoll model.
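The collector/sender/handler flow above can be sketched with Python’s standard `selectors` module, which picks the best multiplexing backend for the OS (epoll on Linux, kqueue on BSD/macOS, select as a fallback). A socket pair stands in for a client connection; the registration data string is invented for the example.

```python
import selectors
import socket

sel = selectors.DefaultSelector()       # epoll/kqueue/select, chosen per OS
a, b = socket.socketpair()
a.setblocking(False)

# Event collector: register interest in "a is readable".
sel.register(a, selectors.EVENT_READ, data="client-conn")

b.send(b"ping")                         # an I/O event arrives

# Event sender: wait for ready events and dispatch each one.
results = []
for key, events in sel.select(timeout=1):
    # Event handler: respond to the readable socket.
    results.append((key.data, key.fileobj.recv(1024)))

print(results)                          # [('client-conn', b'ping')]
sel.close()
```

A single loop like this is how one worker process watches thousands of connections at once with only one thread of execution.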
5. Nginx process processing model
The Nginx server uses the master/worker multi-process model. Its startup and execution flow is as follows:
- After the master process starts, it receives and handles external signals in a for loop
- The master process creates worker child processes with the fork() function; each child runs its own for loop to receive and process events for the Nginx server
It is generally recommended to set the number of worker processes equal to the number of CPU cores. That way there is no flood of child-process creation and management work, and the cost of processes competing for CPU resources and of process switching is avoided.
Nginx also provides a CPU affinity binding option to take advantage of multi-core CPUs: a worker process can be bound to a specific core, so that the CPU cache is not invalidated by process switching.
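The idea behind these two settings can be shown with a toy computation of per-worker affinity bitmasks, mirroring what Nginx’s `worker_processes` and `worker_cpu_affinity` directives express in its configuration file. The helper `affinity_masks` is invented for this sketch, not an Nginx API.

```python
import os

def affinity_masks(n_workers: int, n_cpus: int) -> list[str]:
    # Worker i is pinned to core (i mod n_cpus); each mask has one bit
    # set, written as a binary string the way nginx.conf writes it.
    return [format(1 << (i % n_cpus), f"0{n_cpus}b") for i in range(n_workers)]

workers = os.cpu_count() or 1           # common advice: one worker per core
print(affinity_masks(4, 4))             # ['0001', '0010', '0100', '1000']
```

With one worker per core and one bit per mask, each worker stays on its own core and keeps its CPU cache warm.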
Each request is handled by one and only one worker process. The worker processes are forked from the master process: the master first creates the socket to listen on (listenfd) and then forks multiple workers.
When a new connection arrives, the listenfd of every worker process becomes readable. To ensure that only one process handles the connection, the workers compete for the accept_mutex before registering a read event on listenfd; the process that grabs the mutex registers the event and calls accept() to accept the connection.
Once a worker process has accepted the connection, it reads the request, parses it, processes it, generates the response, returns it to the client, and finally closes the connection. That is a complete request.
A request is thus handled entirely by, and only within, a single worker process.
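The accept_mutex idea can be simulated with threads standing in for worker processes: many workers compete for one lock, and only the holder “accepts” the next pending connection, so each connection is handled exactly once. This is an illustration of the mutual-exclusion pattern, not how Nginx implements it (Nginx uses a shared-memory lock across processes).

```python
import threading
import queue

pending = queue.Queue()                  # stands in for the shared listenfd
accept_mutex = threading.Lock()
handled = []

def worker(wid: int) -> None:
    while True:
        with accept_mutex:               # only the mutex holder may accept
            try:
                conn = pending.get_nowait()
            except queue.Empty:
                return                   # no more connections: worker idles
        handled.append((wid, conn))      # read/parse/respond outside the lock

for c in range(100):                     # 100 incoming "connections"
    pending.put(c)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads: t.start()
for t in threads: t.join()

print(len(handled))                      # 100
```

Note that the lock guards only the accept step; the actual request processing happens outside it, so the workers still run in parallel.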
While the Nginx server is running, the master process and the worker processes need to interact. This interaction relies on pipes implemented over sockets.
5.1. The main process interacts with the worker process
This pipe differs from an ordinary pipe: it is a one-way channel from the master process to a worker process, carrying the master’s instructions to the worker, worker process IDs, and so on. The master process itself communicates with the outside world via signals, and each child process can receive signals and handle the corresponding events.
5.2. Worker processes interact with worker processes
This interaction works essentially like the master-worker interaction, but is done indirectly through the master process, since the worker processes are isolated from each other.
So when worker process W1 needs to send an instruction to worker process W2, it first finds W2’s process ID, then writes the instruction into the channel that points to W2; W2 receives the message and takes the corresponding action.
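The routing just described can be sketched as a master that owns one one-way channel per worker, so W1 reaches W2 by having a message written into the channel the master keeps for W2. Queues stand in for the socket-based pipes, and the `Master.route` helper is invented for this sketch.

```python
import queue

class Master:
    def __init__(self, worker_ids):
        # One one-way channel (master -> worker) per worker process.
        self.channels = {wid: queue.Queue() for wid in worker_ids}

    def route(self, src: str, dst: str, instruction: str) -> None:
        # Look up the target worker's channel and deliver the instruction.
        self.channels[dst].put((src, instruction))

master = Master(["w1", "w2"])
master.route("w1", "w2", "reload")        # W1 -> master's channel for W2
msg = master.channels["w2"].get_nowait()  # W2 reads from its own channel
print(msg)                                # ('w1', 'reload')
```

Because every channel is owned by the master, workers never need direct connections to one another, which keeps them fully isolated.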
Summary
This article gave an overview of the Nginx server’s overall architecture, including its modular design, its multi-process and asynchronous non-blocking request handling, and its event-driven model.
This theoretical background makes it easier to understand Nginx’s design ideas and is a great help when studying Nginx further.