Nginx is widely used at Internet companies. Its most important features are reverse proxying, load balancing, and caching, so it is worth both knowing how to use Nginx and understanding how it works internally.

Back-end components mainly follow one of three process models: Redis is single-process and single-threaded, memcached is single-process and multi-threaded, and Nginx is multi-process. Having now read Nginx, I have covered the whole set.

Nginx is developed in a modular way, with a core module, an event module, an HTTP module, and so on. To support multiple platforms, the event module wraps each platform's event mechanism, such as epoll on Linux and kqueue on macOS. The HTTP module is further split into submodules.

This article focuses on how nginx starts and how requests are executed, so it covers the following points:

  1. The Nginx startup process;
  2. Setting the important callback functions;
  3. How Nginx processes HTTP requests;
  4. Summary.

1. The Nginx startup process

Nginx is large enough that it is hard to read all of the code in a short time, and I have not read much of it myself, so what follows is an overview of Nginx from a macro perspective.

Reading straight through from the main function gets you most of the way, but nginx uses so many callback functions that the control flow is hard to follow by reading alone.

Therefore, you need GDB to set breakpoints and trace the calls.

To use GDB, you need to pass the -g option to GCC:

  1. In the nginx source tree, open auto/cc/conf and change ngx_compile_opt="-c" to ngx_compile_opt="-c -g".
  2. Run ./configure and make to compile and generate the executable file under objs.

Run the generated executable from a terminal. Nginx loads the default configuration file and runs as a daemon.

After nginx runs, you can debug it through GDB.

Run the following command to start GDB:

![](https://pic3.zhimg.com/80/v2-b647c719acccbb188f566a5204c6cc39_720w.jpg)

Then look up the nginx process IDs with the pidof command, as follows:

![](https://pic3.zhimg.com/80/v2-d5d0263686f4ef650877a9ede0e6ead9_720w.jpg)

Nginx starts one master process and one worker process by default, so the above command returns two process IDs, 8125 and 8126 on my host. The smaller one is the master process and the larger one is the worker process. Let's attach to the master process first:

![](https://pic2.zhimg.com/80/v2-731f4fd0cd47ad913d0f2fb249cf26fd_720w.jpg)

You can now debug the nginx master process directly. The bt command shows the master process's function stack:

![](https://pic4.zhimg.com/80/v2-2cac9850ccff5c73f4b2d296af702b00_720w.jpg)

When nginx starts, the master process starts first, beginning with the main function.

  1. The main function does some initialization: it processes the startup parameters, daemonizes the process, creates the PID file, and so on, and then calls ngx_master_process_cycle.
  2. The most important thing ngx_master_process_cycle does is start the child processes and then call sigsuspend, so the master process blocks waiting for a signal.

Therefore, the task of the master process is to start and manage the child processes. How does it manage them?

With signals. When the master process receives a signal, it passes it on to the worker processes, which then act according to which signal it was.

How does the master process pass signals on to the worker processes?

Through a pipe-like channel. The principle is the same as the communication mechanism between memcached's master thread and worker threads: each worker process has two file descriptors, fd[0] and fd[1], one for each end of the channel (nginx creates the pair with socketpair, so it is not a pipe in the strict sense).

The worker process adds its end of the channel to epoll event monitoring. When the master process receives a signal, it writes a flag to its end of each worker's channel. The worker process then gets a read event, reads the flag, and performs the corresponding operation.
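The channel mechanism described above can be sketched in a few lines of Python. This is only an illustration, not nginx's C code: the command codes CMD_QUIT and CMD_TERMINATE and the functions worker_loop and relay_command are names made up for this sketch. It assumes a Unix-like system, uses socket.socketpair for the two descriptors, and uses the selectors module in place of a direct epoll call.

```python
import os
import selectors
import socket

# Hypothetical command codes for this sketch (not nginx's real values).
CMD_QUIT = 1
CMD_TERMINATE = 2


def worker_loop(channel):
    """Worker side: monitor the channel for readability, then read one command."""
    sel = selectors.DefaultSelector()  # picks epoll on Linux, kqueue on macOS
    sel.register(channel, selectors.EVENT_READ)
    while True:
        for key, _events in sel.select():
            data = key.fileobj.recv(1)
            # An empty read means the master closed its end of the channel.
            return data[0] if data else 0


def relay_command(command):
    """Master side: fork a worker, send it one command, return what it received."""
    master_end, worker_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child (worker): report the received command through its exit code.
        code = 255
        try:
            master_end.close()
            code = worker_loop(worker_end)
        finally:
            os._exit(code)
    # Parent (master): write the flag to its end of the channel.
    worker_end.close()
    master_end.send(bytes([command]))
    _pid, status = os.waitpid(pid, 0)
    master_end.close()
    return os.WEXITSTATUS(status)
```

Calling relay_command(CMD_QUIT) forks one worker and returns the command that worker read from the channel. Passing the result back through the exit code is purely a convenience for the demo; a real worker would act on the command and keep running.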

So nginx accepts and processes client requests mainly in the worker processes. Let's take a look at the worker process function stack:

![](https://pic3.zhimg.com/80/v2-973fec57b452401f524e718d031f1ebb_720w.jpg)

Because the worker process is forked from the master process, its stack still contains the master process's frames. So let's start with frame #5:

  1. ngx_start_worker_processes calls ngx_spawn_process to start each child process and set up the channel through which the master process communicates with the worker process.
  2. ngx_spawn_process configures the communication channel between the master process and the worker process (for example, making it non-blocking) and starts the child process with fork. The child process calls the callback function passed in as an argument, ngx_worker_process_cycle, while the parent process goes on to record the worker's attributes.
  3. ngx_worker_process_cycle starts by calling ngx_worker_process_init to initialize the worker process. This includes setting the process priority, setting the maximum number of file descriptors the worker may open, setting the blocked signals, initializing all modules, and adding the master-worker channel to event monitoring for readable events. Then, in an infinite loop, ngx_worker_process_cycle calls ngx_process_events_and_timers, which runs the event loop;
  4. In ngx_process_events_and_timers, the worker first tries to acquire the lock on listenfd. If it gets the lock, it can accept client connections on listenfd; otherwise it cannot receive events from listenfd. It then calls ngx_process_events, which is ngx_epoll_process_events, to wait for events.
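The "fork, then run the callback passed in as an argument" pattern from steps 1 and 2 can be sketched as follows. The Python function names only echo the nginx ones for readability; the bodies are my own simplification, and the stand-in worker returns immediately instead of looping forever so that the example terminates.

```python
import os


def spawn_process(proc, data):
    """Fork a child that runs proc(data); the parent gets the child's PID back."""
    pid = os.fork()
    if pid == 0:
        code = 1
        try:
            code = proc(data)
        finally:
            os._exit(code)  # the child never returns into the parent's code
    return pid


def worker_process_cycle(worker_index):
    # A real worker would initialize itself and then loop on events forever.
    # This stand-in just returns an exit code derived from its index.
    return 10 + worker_index


def start_worker_processes(count):
    """Spawn `count` workers, wait for each one, and collect their exit codes."""
    pids = [spawn_process(worker_process_cycle, i) for i in range(count)]
    exit_codes = []
    for pid in pids:
        _pid, status = os.waitpid(pid, 0)
        exit_codes.append(os.WEXITSTATUS(status))
    return exit_codes
```

Passing the worker routine as a parameter is what lets one spawn function start different kinds of child processes with different entry points.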

The worker process is now ready and waits for clients to connect and send requests.

To avoid the thundering herd problem and to balance load across worker processes, the worker processes compete for a lock before accepting a client connection. The worker that obtains the lock can accept new client connections as well as handle client request events.

A worker that fails to obtain the lock only handles request events on the connections it already holds.
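A rough sketch of this locking rule, under simplifying assumptions: the Worker class, run_once, and the in-memory backlog are invented for illustration, and a threading.Lock stands in for the lock that nginx shares between processes. The point is only the non-blocking "try the lock, otherwise skip accepting" shape.

```python
import threading
from collections import deque


class Worker:
    """Hypothetical worker that accepts connections only while holding the lock."""

    def __init__(self, name, accept_mutex):
        self.name = name
        self.accept_mutex = accept_mutex
        self.connections = []

    def run_once(self, backlog):
        """One loop iteration: maybe accept, then serve the connections we own."""
        # Non-blocking attempt: a worker that loses the race does not wait.
        if self.accept_mutex.acquire(blocking=False):
            try:
                while backlog:
                    self.connections.append(backlog.popleft())
            finally:
                self.accept_mutex.release()
        # With or without the lock, existing connections are still served.
        return [f"{self.name} serves {conn}" for conn in self.connections]
```

For example, if the lock is currently held elsewhere, run_once leaves the backlog untouched and returns only the events for connections that worker already owns.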

2. Setting the important callback functions

Once the nginx master and worker processes are running, clients can send requests. Next, let's look at how nginx handles them.

When a client sends a request, it first establishes a connection through the TCP three-way handshake. Once the connection is established, the listenfd callback function runs. But what is the listenfd callback function? It is actually hard for newcomers to spot.

Here is the analysis.

Things like the listenfd callback function and the way modules are wired together are almost all set up at module initialization. The listenfd callback is set when the event module is initialized or when one of its setup functions is called. Likewise, the callback that runs after a client has connected and the server has received its request is set when the HTTP module is initialized or when one of its setup functions is called.

When the event module initializes, the ngx_event_process_init function is called. The most important code for this function is listed below:

![](https://pic4.zhimg.com/80/v2-d58c116e7408f2ca1e9453053df1e9ef_720w.jpg)

The for loop iterates over every listening socket. rev is the read event of the listenfd connection object; its callback function is set to ngx_event_accept, and then each listenfd is added to event monitoring as a readable event.

The definitions of ngx_add_conn and ngx_add_event look like this:

![](https://pic4.zhimg.com/80/v2-0340296b4c3d66282eff44202b66d473_720w.jpg)

ngx_add_conn and ngx_add_event resolve to function pointers stored in the ngx_event_actions structure.

In fact, ngx_event_actions is the key to nginx being cross-platform: different platforms use different event mechanisms, so ngx_event_actions is filled in differently on each.

For example, Linux uses epoll, so the ngx_event_actions structure is set when the epoll module loads, as in the first half of the code above. Let's look at the epoll module's actions.init function:

![](https://pic1.zhimg.com/80/v2-05b3bd8ddf8c84c8b5937d4dc098aec1_720w.jpg)

As you can see from the code, ngx_event_actions is set to ngx_epoll_module_ctx.actions. Then look at the structure:

![](https://pic4.zhimg.com/80/v2-9fc1d879cf78c0bfc48113fd18d3caf3_720w.jpg)

Therefore, calling ngx_add_conn and ngx_add_event actually calls ngx_epoll_add_connection and ngx_epoll_add_event respectively.

Similarly, on macOS, where the event mechanism is kqueue, the call goes to ngx_kqueue_add_event.

If poll is used, the call goes to ngx_poll_add_event, and so on.

Now let's look at the listenfd callback function, ngx_event_accept:

![](https://pic1.zhimg.com/80/v2-159eafc9467bdcc0861bdcaf7b166c1f_720w.jpg)

When a client connects to the server, the listenfd callback first calls accept to accept the client connection, and then takes a connection object from the connection pool to wrap the client socket.

Depending on which event mechanism is in use, ngx_add_conn(c) may be called to add the connection to event monitoring. Finally, the callback of the ngx_listening_t structure, ls->handler(c), is called to do further work on the client connection.

What is ls->handler(c)? The first time I looked at the code, I was stunned!!

Remember what we said earlier? The interface between modules is almost always set when a module is initialized or when some setup function of the module is called, so let’s look at what happens when an HTTP module is initialized.

The HTTP module does not set ls->handler in its module initialization function. It sets it in the ngx_http_block function, which runs when the "http" directive is read from the configuration:

![](https://pic4.zhimg.com/80/v2-c7fd5d985a7c6829afb50fb0e2035fc3_720w.jpg)

ls->handler is set to ngx_http_init_connection, which is the HTTP module's entry function for handling client HTTP requests.

At this point, we can see that when the server accepts a client, it first wraps the connection in an ngx_connection_t structure and then hands it to the HTTP module to execute the HTTP request.
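To make the wiring concrete, here is a small Python model of it. Everything in it is a stand-in: the Listening and Connection classes and the trace list are invented, and the function names merely echo the nginx ones. It shows the two setup steps (one at configuration time, one at event-module init) and the resulting call order when a client arrives.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Connection:
    fd: int
    trace: List[str] = field(default_factory=list)


@dataclass
class Listening:
    fd: int
    # Set by the HTTP side at configuration time (plays the role of ls->handler).
    handler: Optional[Callable[[Connection], None]] = None
    # Set by the event side at init time (plays the role of the read event handler).
    read_handler: Optional[Callable[["Listening", int], Connection]] = None


def http_init_connection(c):
    """Entry point of HTTP processing for a freshly accepted connection."""
    c.trace.append("http_init_connection")


def event_accept(ls, client_fd):
    """Accept callback: wrap the client socket, then hand it to ls.handler."""
    c = Connection(fd=client_fd)
    c.trace.append("event_accept")
    ls.handler(c)
    return c


def http_block(listeners):
    """Configuration time: the HTTP module installs its entry function."""
    for ls in listeners:
        ls.handler = http_init_connection


def event_process_init(listeners):
    """Event module init: every listening socket gets the accept callback."""
    for ls in listeners:
        ls.read_handler = event_accept
```

The event code never refers to the HTTP module by name. It only calls whatever was stored in the handler slot, which is why the connection between the two is hard to spot when reading the source.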

3. Nginx processes HTTP requests

Handling HTTP requests is one of the most important functions of Nginx, and one of the most complex. The execution process can be outlined as follows:

  1. Read and parse the request line;
  2. Read and parse the request headers;
  3. Run the most important part, multi-phase processing. Nginx divides request processing into 11 phases. Once nginx has read the request line and headers, it wraps them in the request structure ngx_http_request_t, and the handlers of each phase process the request based on that structure. Examples include URI rewriting, access control, path lookup, content generation, and logging;
  4. Return the result to the client.
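Steps 1 and 2 can be illustrated with a deliberately naive parser. Note the assumptions: this sketch receives the whole request head at once and splits strings, whereas nginx parses incrementally as bytes arrive; parse_request and the dictionary layout are made up for the example.

```python
def parse_request(raw):
    """Split a raw HTTP request head into its request line and headers."""
    head = raw.partition(b"\r\n\r\n")[0]
    lines = head.decode("latin-1").split("\r\n")

    # Step 1: the request line, e.g. "GET /index.html HTTP/1.1".
    method, uri, version = lines[0].split(" ", 2)

    # Step 2: the header lines, each of the form "Name: value".
    headers = {}
    for line in lines[1:]:
        name, _sep, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return {"method": method, "uri": uri, "version": version, "headers": headers}
```

The returned dictionary plays the role of the request structure that the later phases work on.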

Multi-phase processing is the most important part of the Nginx HTTP module, because it is also where third-party modules register themselves. For example, people have written third-party modules that combine Nginx with memcached for page caching, or that replace memcached with a Redis cluster.

Nginx multi-phase processing is also similar to middleware in Python and Golang web frameworks. The latter mainly use the decorator pattern to wrap a handler layer by layer, while Nginx combines the handlers of all phases in an array (like a linked list) and then executes them in order along that chain.
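The contrast between the two styles can be shown side by side. Both halves are toy code with invented names; the first mimics how a Python web framework might stack middleware with decorators, and the second mimics a flat handler chain that a loop walks in order.

```python
# Style 1: decorators wrap the handler layer by layer.
def logging_middleware(handler):
    def wrapped(request):
        request["trace"].append("log:before")
        response = handler(request)
        request["trace"].append("log:after")
        return response
    return wrapped


def auth_middleware(handler):
    def wrapped(request):
        request["trace"].append("auth")
        return handler(request)
    return wrapped


@logging_middleware
@auth_middleware
def decorated_view(request):
    request["trace"].append("view")
    return "hello"


# Style 2: a flat list of steps executed one after another.
def log_step(request):
    request["trace"].append("log")


def auth_step(request):
    request["trace"].append("auth")


def view_step(request):
    request["trace"].append("view")
    request["response"] = "hello"


HANDLER_CHAIN = [log_step, auth_step, view_step]


def run_chain(request):
    for step in HANDLER_CHAIN:
        step(request)
    return request.get("response")
```

In the decorator style the outer layer decides whether and when to call the inner one. In the chain style the driver loop owns the control flow, which makes it easy to insert a new handler at a chosen position without touching the others.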

Because I did not fully understand multi-phase processing, I followed an online tutorial and wrote the simplest possible third-party module, so that I could set breakpoints and observe how the HTTP phase functions execute. The steps are as follows:

  1. Create a new directory THM (third module) in the nginx directory, create a new directory foo (module foo) inside it, and create ngx_http_foo_module.c in the foo directory:
![](https://picb.zhimg.com/80/v2-5e9546c96d7c255f95f223896f53f74f_720w.jpg)
![](https://pic4.zhimg.com/80/v2-c24cb73f34f4719c8c765fbe0f8b62d1_720w.jpg)

Then create a config file in foo as well:

![](https://pic3.zhimg.com/80/v2-2a3e53ce1e3c17836404484d445095fd_720w.jpg)

With that, the simplest possible third-party module is written.

The two functions above are easy to understand. One is an initialization function that registers the module's handler in a phase; the other is the handler itself.

This example registers in NGX_HTTP_CONTENT_PHASE, so when request processing reaches that phase, the foo module's handler runs. Finally, recompile to generate the executable file.
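The registration step can be modeled as appending a handler to a per-phase list. This is a Python sketch, not the module's C code: new_phase_table, foo_init, and foo_handler are illustrative names, and the phase names are the eleven from recent nginx source as I recall them, so check them against the version you build.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The eleven HTTP phases, in processing order."""
    POST_READ = 0
    SERVER_REWRITE = 1
    FIND_CONFIG = 2
    REWRITE = 3
    POST_REWRITE = 4
    PREACCESS = 5
    ACCESS = 6
    POST_ACCESS = 7
    PRECONTENT = 8  # called TRY_FILES in older nginx releases
    CONTENT = 9
    LOG = 10


def new_phase_table():
    """One empty handler list per phase."""
    return {phase: [] for phase in Phase}


def foo_handler(request):
    """The module's handler: produces the content for the request."""
    return f"foo handled {request}"


def foo_init(phases):
    """The module's initialization function: register the handler in a phase."""
    phases[Phase.CONTENT].append(foo_handler)
```

After foo_init runs, the CONTENT list contains foo_handler and every other phase is still empty, which is all the registration function needs to accomplish.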

Next, use GDB to observe how an HTTP request executes. Set a breakpoint and look at the stack:

![](https://picb.zhimg.com/80/v2-5e1135594a97302d1d7ecf63cfe0e6f4_720w.jpg)

Here is a brief explanation of the functions above. The version I read is not the same as the version I ran, so treat this as a reference only:

  1. When a client sends a TCP connection request, ngx_epoll_process_events returns the readable event on listenfd and calls ngx_event_accept to accept the client connection, which is wrapped in an ngx_connection_t structure. Finally, ngx_http_init_connection is called to enter HTTP processing;
  2. After ngx_http_init_connection(ngx_connection_t *c), the stack shows ngx_http_wait_request_handler, which I could not find in the version of the source I read. In the version I read, ngx_http_init_request is called next to initialize the request structure ngx_http_request_t, and it calls ngx_http_process_request_line internally;
  3. ngx_http_process_request_line calls ngx_http_read_request_header to read the request line into a buffer, and then calls ngx_http_parse_request_line to parse it. Finally, it calls ngx_http_process_request_headers to process the request headers;
  4. ngx_http_process_request_headers calls ngx_http_read_request_header to read the request headers and ngx_http_parse_header_line to parse them. It then calls ngx_http_process_request_header to validate the headers as needed, and finally calls ngx_http_process_request to process the request;
  5. ngx_http_process_request calls ngx_http_handler(ngx_http_request_t *r), which in turn calls ngx_http_core_run_phases to perform the multi-phase processing;
  6. Finally, look at the multi-phase processing function ngx_http_core_run_phases:
![](https://pic4.zhimg.com/80/v2-bb030468c3234bd7a7661a6859bf74d9_720w.jpg)

In HTTP multi-phase processing, each phase may have one handler or several, and all handlers of the same phase share the same checker.

So the while loop above iterates over the handlers of all HTTP modules, and each handler processes the request based on the request structure ngx_http_request_t.

The checker function for NGX_HTTP_CONTENT_PHASE is ngx_http_core_content_phase, and the handler of module foo (ngx_http_foo_handler) is executed inside that checker function.
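The checker-and-handler loop can be approximated in Python as below. Treat it as a simplified model under my own assumptions: the return codes are plain strings, the generic checker always moves on to the next entry, and the names Request, PhaseHandler, and run_phases are invented. The real checkers handle more return codes and can jump between phases.

```python
from dataclasses import dataclass, field
from typing import Callable, List

OK, AGAIN, DECLINED = "OK", "AGAIN", "DECLINED"


@dataclass
class Request:
    uri: str
    phase_handler: int = 0  # index of the next entry to run
    trace: List[str] = field(default_factory=list)
    body: str = ""


@dataclass
class PhaseHandler:
    checker: Callable  # decides how to advance; shared within a phase
    handler: Callable  # the module's own function


def generic_phase(r, ph):
    """Checker for ordinary phases: run the handler, then move to the next entry."""
    ph.handler(r)
    r.phase_handler += 1
    return AGAIN


def content_phase(r, ph):
    """Checker for the content phase: stop once a handler produces the response."""
    if ph.handler(r) == DECLINED:
        r.phase_handler += 1
        return AGAIN
    return OK


def rewrite_handler(r):
    r.trace.append("rewrite")
    return DECLINED


def access_handler(r):
    r.trace.append("access")
    return DECLINED


def declining_handler(r):
    r.trace.append("declined")
    return DECLINED  # this content handler does not want the request


def foo_handler(r):
    r.trace.append("foo")
    r.body = "hello from foo"
    return OK


def run_phases(r, engine):
    """Walk the flat handler array until a checker reports the request is done."""
    while r.phase_handler < len(engine):
        ph = engine[r.phase_handler]
        if ph.checker(r, ph) == OK:
            return


ENGINE = [
    PhaseHandler(generic_phase, rewrite_handler),
    PhaseHandler(generic_phase, access_handler),
    PhaseHandler(content_phase, declining_handler),
    PhaseHandler(content_phase, foo_handler),
]
```

Running run_phases(Request("/foo"), ENGINE) visits the rewrite and access handlers, skips past the content handler that declines, and stops at foo_handler. The two content entries use the same checker, matching the rule that handlers in one phase share a checker.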

After the multi-stage processing is complete, the response is finally returned to the client.

4. Summary

This article is a macro-level analysis of how Nginx runs overall. Because this was my first time reading nginx, many places were confusing, so the article also serves as my notes. Next I plan to look more closely at multi-phase processing, since third-party modules are registered there too, and to become familiar with ngx + Lua module development.
