The Nginx process model
After starting, Nginx runs one master process and multiple worker processes
The master process mainly manages the worker processes: it receives signals from the outside world and acts on them. For example, `kill -HUP pid` is commonly used to gracefully restart Nginx or reload its configuration, with no interruption of service. On receiving HUP, the master first reloads the configuration file, then starts new worker processes and signals all the old workers to exit. The new workers begin accepting requests as soon as they start, while each old worker stops accepting new requests once it gets the master's signal and exits after finishing every request it is still handling. Sending signals directly to the master is the older way of operating Nginx; since version 0.8, Nginx has provided command-line options for easier management, such as `./nginx -s reload` to reload the configuration and `./nginx -s stop` to stop Nginx
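To make the reload protocol concrete, here is a minimal sketch of a master loop reacting to SIGHUP. It only illustrates the mechanism described above; the steps in the comments stand in for real Nginx logic:

```c
#include <signal.h>
#include <unistd.h>

/* Minimal sketch of the master's graceful-reload flow; real Nginx
 * keeps far more state in its master cycle. */
static volatile sig_atomic_t got_hup = 0;

static void on_hup(int sig) { (void)sig; got_hup = 1; }

int main(void) {
    struct sigaction sa;
    sa.sa_handler = on_hup;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGHUP, &sa, NULL);

    for (;;) {
        pause();                 /* sleep until a signal arrives */
        if (got_hup) {
            got_hup = 0;
            /* 1. re-read the configuration file                  */
            /* 2. fork new workers that use the new config        */
            /* 3. kill(old_worker_pid, SIGQUIT) for each old one:
             *    they stop accepting, drain in-flight requests,
             *    then exit                                        */
        }
    }
}
```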
Each worker process is forked from the master. The master first creates the socket that needs to listen (listenfd) and then forks multiple worker processes, so the listenfd of every worker becomes readable when a new connection arrives. To ensure that only one process handles a given connection, all workers compete for accept_mutex before registering a read event on listenfd; the process that grabs the mutex registers the listenfd read event and calls accept in that event handler to take the connection. Once a worker has accepted a connection, it reads the request, parses it, processes it, generates the response data, returns it to the client, and finally closes the connection. In other words, a request is handled entirely by a worker process, and only within that one process
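The classic pre-fork pattern this describes looks roughly like the sketch below. It is a bare-bones illustration (no error handling, no accept_mutex, port 8080 chosen arbitrarily), not Nginx code:

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define NWORKERS 4

int main(void) {
    /* master: create, bind and listen on the socket first... */
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);            /* example port */
    bind(listenfd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listenfd, 128);

    /* ...then fork the workers; each one inherits listenfd */
    for (int i = 0; i < NWORKERS; i++) {
        if (fork() == 0) {                  /* worker process */
            for (;;) {
                /* real Nginx competes for accept_mutex here */
                int conn = accept(listenfd, NULL, NULL);
                if (conn < 0) continue;
                /* read, parse and process the request, write
                 * the response, then disconnect */
                close(conn);
            }
        }
    }
    for (;;) pause();                       /* master: manage workers */
}
```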
Advantages of the worker process model
- Each worker is an independent process and needs no locks for its own work, which saves the overhead of locking and also makes programming and troubleshooting much easier
- Independent processes do not affect each other: after one process exits, the others keep working and the service is not interrupted, and the master process quickly starts a new worker
- Nginx handles requests in an asynchronous, non-blocking manner, so it can process thousands of requests concurrently. Apache, in its commonly used mode, dedicates a worker thread to each request, so at a concurrency of several thousand there are thousands of threads handling requests at once. That puts great pressure on the operating system: the threads' memory footprint is large, context switching between them costs a lot of CPU, and performance degrades badly
Nginx works in asynchronous non-blocking mode
- Nginx uses an asynchronous, non-blocking event-handling mechanism. Concretely, system calls such as select/poll/epoll/kqueue provide a way to monitor multiple events at the same time. They block, but a timeout can be set: within the timeout they return as soon as some event is ready
- Take epoll as an example: when an event is not ready, we register it in epoll; when it becomes ready, we do the read or write
- Because there is only one thread, only one request is processed at any instant, and the worker simply keeps switching between requests. It switches only because an asynchronous event is not ready and control is yielded voluntarily, so the switch costs essentially nothing. Compared with multithreading, this kind of event handling has a big advantage: no threads are created, each request takes very little memory, there is no context switching, and event handling is very lightweight. Any amount of concurrency causes no needless waste of resources (context switches); more concurrency just occupies more memory (a minimal epoll loop is sketched after this list)
- Nginx recommends setting the number of workers to the number of CPU cores, because more workers only make processes compete for CPU and cause unnecessary context switches. Moreover, to exploit multi-core hardware, Nginx provides a CPU-affinity binding option, which can pin a process to a particular core so that its CPU cache is not invalidated by process migration. Small optimizations like this are common in Nginx, such as treating a 4-byte string as a single int when comparing 4 characters, to save CPU instructions
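Here is a minimal epoll event loop of the shape described above. It is a generic sketch, not Nginx's event module; the handler logic lives in the comments:

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Minimal event loop: register fds with epoll, then handle whichever
 * become ready. Accepting/reading/writing is left as comments. */
void event_loop(int listenfd) {
    int epfd = epoll_create1(0);
    struct epoll_event ev, events[64];

    ev.events = EPOLLIN;
    ev.data.fd = listenfd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, listenfd, &ev);

    for (;;) {
        /* blocks until some event is ready (or a timeout, if set) */
        int n = epoll_wait(epfd, events, 64, -1);
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == listenfd) {
                /* new connection: accept it, then EPOLL_CTL_ADD it */
            } else if (events[i].events & EPOLLIN) {
                /* socket readable: read and parse the request */
            } else if (events[i].events & EPOLLOUT) {
                /* socket writable: send pending response data */
            }
        }
    }
}
```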
Timer handling mechanism in Nginx code
Since calls such as epoll_wait allow a timeout when waiting, Nginx uses that timeout to implement timers. In Nginx, timer events are kept in a red-black tree that maintains them in order. Before entering epoll_wait, Nginx fetches the earliest expiry time of all timer events from the tree and uses it to compute the epoll_wait timeout. So when no event arrives and no signal interrupts the call, epoll_wait times out, which means the nearest timer event is due; Nginx then checks all timed-out events, marks their status as timed out, and handles them before processing network events
When we write Nginx code, the first thing a network-event callback usually does is check for a timeout, and only then handle the network event
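The pattern reads roughly like this sketch; find_nearest_timer() and expire_timers() are hypothetical stand-ins for Nginx's red-black-tree lookups:

```c
#include <sys/epoll.h>

long find_nearest_timer(void);   /* ms until the earliest timer (hypothetical) */
void expire_timers(void);        /* mark and run all due timers (hypothetical) */

void process_events_and_timers(int epfd) {
    struct epoll_event events[64];

    /* the nearest timer expiry becomes the epoll_wait timeout */
    long timeout_ms = find_nearest_timer();
    int n = epoll_wait(epfd, events, 64, (int)timeout_ms);

    /* n == 0 means no network event arrived: the nearest timer is
     * due. Either way, timed-out events are flagged first... */
    expire_timers();

    /* ...and each network-event handler checks its own timeout
     * flag before doing any real work */
    for (int i = 0; i < n; i++) {
        /* handle events[i] */
    }
}
```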
The concept of connection
- A connection in Nginx is an encapsulation of a TCP connection, covering the connection socket plus its read event and write event (a simplified sketch of such a structure follows this list)
- By encapsulating connections, Nginx makes it easy to handle anything connection-related, such as establishing connections and sending and receiving data
- HTTP request handling in Nginx is built on top of connection, which is why Nginx can serve not only as a web server but also as a mail server
- You can also interact with any back-end service using the connections Nginx provides
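A greatly simplified picture of what such a wrapper holds; the names here are illustrative, while the real ngx_connection_t carries many more fields:

```c
/* Illustrative connection wrapper: a socket plus its two events. */
typedef struct my_event_s my_event_t;   /* hypothetical event type */

typedef struct {
    int         fd;      /* the connected socket                    */
    my_event_t *read;    /* read event: the socket became readable  */
    my_event_t *write;   /* write event: the socket became writable */
    void       *data;    /* protocol object built on top, e.g. an   */
                         /* HTTP request or a mail session          */
} my_connection_t;
```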
How Nginx handles the lifecycle of a connection
- At startup, Nginx parses the configuration file to determine the ports and IP addresses it must listen on. The master process initializes the listening socket (creates the socket, sets options such as SO_REUSEADDR, binds it to the specified IP address and port, and listens), then forks multiple child processes, which compete to accept new connections
- When a client initiates a connection to Nginx and completes the three-way handshake with the server, one of Nginx's child processes accepts the socket, creates an ngx_connection_t structure to encapsulate the connection, sets up the read and write event handlers, and adds the read and write events so it can exchange data with the client
- Finally, Nginx or the client actively closes the connection; at this point the connection ends its life
- Nginx can also act as a client and request data from other servers (as the upstream module does); connections to other servers are likewise wrapped in ngx_connection_t. As a client, Nginx obtains an ngx_connection_t structure, creates a socket and sets its properties (such as non-blocking), adds read/write events, calls connect/read/write, and finally closes the connection and releases the ngx_connection_t (the client-side flow is sketched below)
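In raw syscall form, the client-side flow looks roughly like this; it skips Nginx's connection object and all error handling:

```c
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sketch of connecting to an upstream server without blocking. */
int connect_upstream(struct sockaddr_in *backend) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    /* set the socket non-blocking before connecting */
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

    /* on a non-blocking socket connect() typically returns -1 with
     * errno == EINPROGRESS; completion shows up as a write event */
    connect(fd, (struct sockaddr *)backend, sizeof(*backend));

    /* ...register read/write events for fd, read()/write() when
     * they fire, then close(fd) and release the connection object */
    return fd;
}
```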
Maximum number of connections for worker processes
- Each worker process in Nginx has an upper limit on the number of connections, and this is different from the system limit on fds. In the operating system, `ulimit -n` gives the maximum number of fds a process may open, i.e. nofile. Since each socket connection consumes one fd, nofile limits the maximum number of connections our process can hold, which in turn directly limits the maximum concurrency our program can support
- Nginx sets the maximum number of connections each worker supports through worker_connections. If this value is greater than nofile, the actual maximum number of connections is nofile, and Nginx prints a warning
- Internally, Nginx manages this with a connection pool. Each worker process has an independent pool whose size is worker_connections. What the pool stores are not established connections but simply an array of worker_connections ngx_connection_t structures. Nginx keeps all free ngx_connection_t entries in a linked list called free_connections: each time a connection is acquired, one is taken from the free list, and it is put back when it is no longer needed (sketched after this list)
- worker_connections is the maximum number of connections per worker process, so the maximum number of connections one Nginx instance can hold is worker_connections * worker_processes. For HTTP requests serving local resources, the maximum supported concurrency is worker_connections * worker_processes; for HTTP as a reverse proxy it is worker_connections * worker_processes / 2, because each proxied request occupies two connections, one to the client and one to the back-end service
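For example, with worker_processes 4 and worker_connections 1024, one instance can hold 4 * 1024 = 4096 connections, i.e. roughly 2048 concurrent requests when reverse proxying. The free-list idea itself is small; here is a sketch with names of my own choosing, not Nginx's actual code:

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct conn_s {
    struct conn_s *next;   /* links the free entries together  */
    int            fd;     /* the socket once the slot is used */
} conn_t;

static conn_t *pool;               /* array of worker_connections slots */
static conn_t *free_connections;   /* head of the free list             */

void pool_init(size_t worker_connections) {
    pool = calloc(worker_connections, sizeof(conn_t));
    for (size_t i = 0; i + 1 < worker_connections; i++)
        pool[i].next = &pool[i + 1];        /* chain every slot */
    free_connections = &pool[0];
}

conn_t *get_connection(void) {              /* take one from the list */
    conn_t *c = free_connections;
    if (c) free_connections = c->next;
    return c;                               /* NULL: pool exhausted */
}

void free_connection(conn_t *c) {           /* put it back when done */
    c->next = free_connections;
    free_connections = c;
}
```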
Fair competition among worker processes
- Multiple worker processes compete freely for client connections. If one process gets more chances to accept, its free connections run out sooner; without some control in advance, when it then accepts a new TCP connection and cannot obtain a free connection, it has no way to hand the connection over to another process, and the TCP connection ends up aborted without ever being handled
- To keep the competition fair, Nginx only lets the process holding accept_mutex register the accept event; in other words, Nginx controls whether a process registers the accept event. Nginx uses a variable named ngx_accept_disabled to decide whether to compete for the accept_mutex lock at all. Its value is computed as one eighth of the total number of connections of a single Nginx process minus the number of free connections remaining, so ngx_accept_disabled is greater than 0 only when fewer than one eighth of the connections remain free, and the fewer that remain, the larger the value. While ngx_accept_disabled is greater than 0, the worker does not try to acquire the accept_mutex lock and instead decrements ngx_accept_disabled by 1; each time execution reaches this point it subtracts 1, until the value drops below 0 again
- By not acquiring accept_mutex, a worker gives up its chance to take new connections. The fewer free connections it has, the larger ngx_accept_disabled becomes and the more chances it yields, giving the other processes more opportunity to acquire the lock. In this way, Nginx keeps connections balanced across the connection pools of its worker processes (see the sketch below)
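The balancing logic fits in a few lines. The variable names below follow the text above, but this is a simplification of what Nginx actually does, not its source:

```c
extern long connection_n;        /* total connections of this worker */
extern long free_connection_n;   /* free entries left in its pool    */

long ngx_accept_disabled;

void try_accept_new_connections(void) {
    /* positive only when less than 1/8 of the connections are free */
    ngx_accept_disabled = connection_n / 8 - free_connection_n;

    if (ngx_accept_disabled > 0) {
        /* busy worker: skip the lock this round and let the
         * other workers win accept_mutex instead */
        ngx_accept_disabled--;
    } else {
        /* compete for accept_mutex; the winner registers the
         * accept event on listenfd and accepts new connections */
    }
}
```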