1 The concept

1.1 Definitions

Classic definitions in operating systems:

  • Process: the unit of resource allocation
  • Thread: the unit of scheduling
  • Coroutine: a scheduling unit implemented in user mode

Process scheduling

Note: what is commonly called process switching is, in fact, the scheduling of threads.

1.2 Relationships

A process can contain multiple threads, and a thread can contain multiple coroutines. However, it must be made clear that the coroutines within a thread run serially. On a machine with a multi-core CPU, multiple processes, or multiple threads within one process, can run in parallel; but the coroutines within a single thread are strictly serial (synchronous, running in turn), no matter how many CPU cores there are. Multiple coroutines can run in one thread, but they run one after another: while one coroutine is running, all the others must be suspended.

Comparison

Although coroutines are not in the same dimension as processes and threads, it is sometimes useful to compare them:

  1. A coroutine is neither a process nor a thread; it is essentially just a special kind of function, not in the same dimension as the other two.
  2. A process can contain multiple threads, and a thread can contain multiple coroutines.
  3. Although multiple coroutines in a thread can be switched between, they execute serially and only within that thread, so the CPU’s multi-core capability cannot be exploited.
  4. Like processes and threads, coroutines incur context-switching costs when they switch.

Context switching: a three-way comparison

|                       | Process | Thread | Coroutine |
|-----------------------|---------|--------|-----------|
| Switched by           | The operating system | The operating system | The user (programmer/application) |
| When it switches      | Decided by the OS scheduling policy; the user is unaware | Decided by the OS scheduling policy; the user is unaware | Decided by the user |
| What is switched      | Page global directory, kernel stack, hardware context | Kernel stack, hardware context | Hardware context |
| Where it is saved     | In the kernel stack | In the kernel stack | In the user’s own variables (user stack or heap) |
| Switching path        | User mode → kernel mode → user mode | User mode → kernel mode → user mode | User mode only (no trap into the kernel) |
| Switching efficiency  | Low | Medium | High |

1.3 Processes and Threads

  1. Address space: a thread is a unit of execution within a process, and a process has at least one thread. Threads share the address space of their process, while each process has its own address space. Threads can therefore read and write the same data structures and variables, which makes communication between threads easy; by contrast, inter-process communication (IPC) is harder and consumes more resources.
  2. Resource ownership: a process is the unit of resource allocation and ownership; threads in the same process share the resources of that process.
  3. A process is an independent unit of resource allocation and scheduling, while a thread is the basic unit of CPU scheduling.
  4. Both can be executed concurrently.
  5. A process is created with fork or vfork, while a thread is created with pthread_create. When a process terminates, all threads it owns are destroyed; this does not affect threads belonging to other processes.
  6. A thread has its own private thread control block (TCB), thread ID, register set, and hardware context, and a process has its own private process control block (PCB). These private attributes are not shared; they identify a particular process or thread.
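A minimal Python sketch of the address-space difference described above (it uses `os.fork`, so it is Unix-only; an illustrative toy, not a recommendation for real servers):

```python
import os
import threading

counter = [0]

def bump():
    counter[0] += 1

# Threads share the address space of their process,
# so an update made in one thread is visible to the others.
t = threading.Thread(target=bump)
t.start()
t.join()
assert counter[0] == 1

# fork() gives the child its own (copy-on-write) address space,
# so the child's update is invisible to the parent.
pid = os.fork()
if pid == 0:            # child
    bump()              # modifies only the child's copy
    os._exit(0)
os.waitpid(pid, 0)      # parent waits for the child
print(counter[0])       # still 1: the parent's copy is unchanged
```

This is exactly why threads can communicate through shared variables while processes need IPC.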

1.4 Coroutines

Coroutines, or cooperative routines, are based on the idea that a series of interdependent coroutines use the CPU in turn, with only one running at a time while the others lie dormant. A coroutine can pause its execution at some point during a run and later resume from exactly the point where it was suspended.

Coroutines can be thought of as a kind of user-space thread, with two major advantages over traditional threads:

  • Unlike threads, coroutines voluntarily give up the CPU and hand it to the next coroutine they choose, rather than being preempted by the system scheduler at arbitrary points. This makes coroutines simpler to use, and locking is unnecessary in most cases.
  • Compared with threads, coroutine switching is controlled by the program and happens in user space rather than kernel space, so the switching cost is minimal.
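The pause-and-resume behavior can be sketched with a Python generator, the simplest form of coroutine in Python (an illustrative toy, not any framework’s API):

```python
log = []

def co():
    log.append("step 1")
    yield                # suspend: voluntarily hand control back
    log.append("step 2")
    yield
    log.append("step 3")

c = co()
next(c)                           # run until the first yield
assert log == ["step 1"]          # paused after step 1
# ... the caller is free to run other coroutines here ...
next(c)                           # resume exactly at the suspension point
assert log == ["step 1", "step 2"]
```

Note that control is only ever transferred at the explicit `yield` points, which is what makes locking unnecessary in most cases.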

2 Web server examples

2.1 Multi-process single-thread model

In this model, a new process is spawned for each request, and each process is single-threaded. Advantages:

  • Programming is relatively easy; you usually don’t need to worry about locking or synchronizing resources.
  • Greater fault tolerance: unlike the multi-threaded model, one process crashing does not affect the other processes.
  • Kernel-guaranteed isolation: both data and errors are isolated.
  • Error isolation is particularly useful for native code written in languages such as C/C++: programs with a multi-process architecture tend to have some degree of self-recovery (the master daemon monitors all worker processes and restarts any that have died).

Disadvantages: process-switching overhead, and concurrency problems under intensive access.
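A minimal fork-per-request sketch in Python (Unix-only; purely illustrative of the model, not how any real server is implemented):

```python
import os
import socket

def handle(conn):
    # Stand-in for real request handling: echo the data back, uppercased.
    data = conn.recv(1024)
    conn.sendall(data.upper())
    conn.close()

def serve_one(server):
    # Accept one connection and fork a fresh process to handle it.
    conn, _ = server.accept()
    pid = os.fork()
    if pid == 0:              # child: serve this one request, then exit
        handle(conn)
        os._exit(0)
    conn.close()              # parent: the child owns the connection now
    os.waitpid(pid, 0)        # reap the child (a real server uses SIGCHLD)

server = socket.socket()
server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
server.listen(1)

client = socket.create_connection(server.getsockname())
client.sendall(b"hello")
serve_one(server)
reply = client.recv(1024)
print(reply)                  # b'HELLO'
```

Each request pays the cost of a fork and a process switch, which is exactly the overhead noted above.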

2.1.1 Nginx process model

Nginx uses a multi-process, single-thread & multiplexed I/O model. It works in multi-process mode with one thread per process (Nginx also supports a multi-thread mode, but multi-process is the primary one), and each thread can handle multiple client connections, which contributes to its concurrency performance. Nginx’s multi-process approach has many benefits. After Nginx starts, there is one master process and multiple worker processes.

The master process mainly manages the worker processes: it receives signals from the outside, forwards signals to all worker processes, and monitors the workers’ running status. When a worker process exits (abnormally), the master automatically starts a new worker process.

Basic network events are handled in the worker processes. The worker processes are peers: they compete equally for client requests and are independent of one another. A request is processed entirely within one worker process; a worker cannot process requests belonging to another worker. The number of worker processes is configurable and is usually set to match the number of CPU cores on the machine, a choice closely tied to Nginx’s process model and event-processing model.
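The master/worker monitoring-and-restart behavior can be sketched in Python with `os.fork` (Unix-only; a toy illustration of the pattern, not Nginx’s actual C implementation — real workers would run event loops instead of exiting):

```python
import os

def spawn_worker(n):
    pid = os.fork()
    if pid == 0:                  # worker process
        # A real worker would run an event loop here, competing
        # with its peers to accept connections. We just exit,
        # with a nonzero code to simulate an abnormal death.
        os._exit(n)
    return pid

# Conventionally one worker per CPU core; 4 is used here for the demo.
workers = {spawn_worker(i): i for i in range(4)}

restarted = 0
while workers:
    pid, status = os.wait()       # master monitors its workers
    workers.pop(pid)
    if os.WEXITSTATUS(status) != 0 and restarted == 0:
        # Abnormal exit: the master forks a replacement worker
        # (bounded to one restart so this demo terminates).
        restarted += 1
        workers[spawn_worker(0)] = 0
print("all workers reaped, restarts:", restarted)
```

The key point is that the master’s only jobs are forking, signaling, and reaping; all request handling stays in the workers.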

2.1.2 PHP-FPM process model

PHP-FPM adopts the master/worker process model (a blocking single-threaded model). When PHP-FPM starts, it reads its configuration file and creates one master process and several worker processes (the number of workers is determined by the configuration in php-fpm.conf). The worker processes are forked by the master process:

  • Master process: responsible for managing the worker processes and listening on the ports
  • Worker processes: handle the business logic (each worker process can handle only one request at a time)

The FastCGI protocol can be understood as a standard for reassembling requests forwarded by Nginx into a context that PHP programs can parse and recognize.

PHP-FPM’s process-management modes are dynamic, static, and ondemand; they are described below.

  • In dynamic mode, a number of worker processes are created when PHP-FPM starts. As the request load grows, additional workers are created dynamically; when the load drops, the dynamically created workers are destroyed. If the configured maximum is too large, a surge of requests will spawn a large number of workers, and frequent switching between processes wastes a lot of CPU.
  • In static mode, PHP-FPM creates the number of worker processes specified in the configuration file at startup and never increases or decreases it with the request load. Since each worker can handle only one request at a time, requests queue up when the load increases.
  • In ondemand mode, PHP-FPM starts without creating any worker processes; the master forks a child only when a request arrives. Under heavy load the master is kept very busy forking and consumes a lot of CPU time, so this mode is unsuitable for high-traffic environments.
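For concreteness, the three modes correspond to directives in a PHP-FPM pool configuration file; the directive names below are the real ones, but the values are illustrative, not recommendations:

```ini
; Excerpt from a PHP-FPM pool configuration (e.g. www.conf).
[www]
pm = dynamic                   ; or: static, ondemand
pm.max_children = 50           ; hard cap on worker processes (all modes)
pm.start_servers = 5           ; dynamic: workers created at startup
pm.min_spare_servers = 5       ; dynamic: keep at least this many idle workers
pm.max_spare_servers = 10      ; dynamic: destroy idle workers above this
pm.process_idle_timeout = 10s  ; ondemand: kill a worker idle this long
```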

Since each worker process started by PHP-FPM can handle only one request at a time, Nginx + PHP-FPM has long been criticized as a concurrency bottleneck.

2.2 Single-process multithreading model

In this model, the server starts a single process; each web application gets its own thread pool, each request to the application is handled by a thread from that pool (creating a new one when needed), and a thread may spawn child threads. Advantages:

  • Threads are fast to create, and data sharing between them is convenient and efficient.
  • Sharing data: multiple threads can share the same virtual address space, whereas sharing data between processes requires IPC mechanisms such as shared memory or semaphores.
  • Light context-switch overhead: no address-space switch, no register-set change, no TLB flush.
  • Heterogeneous services: if all tasks are compute-bound but their durations fluctuate (say, between 1 ms and 1 s rather than a fixed 1 s), multi-threading shows its advantage over multi-process by reducing the probability of simple tasks being starved behind complex ones.

Disadvantages: there is only one process, so a single fatal error can bring the whole process down. You can of course write a daemon to restart it, but while it is restarting your server is effectively dead.
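The thread-pool model can be sketched with Python’s `concurrent.futures` (an illustrative stand-in for the Java thread pools that servers like Tomcat actually use; `handle_request` is a made-up handler):

```python
from concurrent.futures import ThreadPoolExecutor

def handle_request(path):
    # All threads share the process's memory, so handlers can read
    # shared data directly -- but writes need synchronization.
    return f"response for {path}"

# One pool per application; each request runs on a pool thread,
# up to the pool's maximum thread count.
with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(handle_request, f"/page/{i}") for i in range(3)]
    results = [f.result() for f in futures]

print(results)
```

The pool caps concurrency at `max_workers`, which is the thread-pool analogue of the maximum thread count mentioned below.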

2.2.1 NIO mode of Tomcat

Tomcat has three working modes. In NIO mode, a single Tomcat process handles each request to an application in its own thread; when the idle threads are exhausted, new threads are created until the maximum thread count is reached.

This is a typical request-processing flow; green represents threads, blue represents data.

  1. The Acceptor thread accepts the request and takes a socket object from the socketCache (if none is cached, a new socket object is created; the cache exists to avoid object-creation overhead).
  2. The Acceptor thread marks a Poller object, assembles a PollerEvent, and places it in that Poller’s PollerEvent queue.
  3. The Poller thread takes the PollerEvent from the event queue and registers its socket with the Poller’s selector.
  4. The Poller thread waits until a read or write event occurs on a socket, then dispatches it to a SocketProcessor thread, which actually processes the request.
  5. After the SocketProcessor thread finishes processing the request, the socket object is reclaimed and put back into the socketCache.
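The acceptor/poller flow above can be sketched with Python’s `selectors` module (an illustrative stand-in for Tomcat’s Java NIO classes, with the acceptor, poller, and processor steps compressed into one thread; `process` is a made-up handler):

```python
import selectors
import socket

sel = selectors.DefaultSelector()   # the "Poller's" multiplexer (epoll on Linux)

server = socket.socket()
server.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
server.listen()

def process(conn):
    # Stand-in for the SocketProcessor: handle one ready socket.
    data = conn.recv(1024)
    conn.sendall(data.upper())

# Client side of the demo: connect and send a request.
client = socket.create_connection(server.getsockname())
client.sendall(b"ping")

# Acceptor step: accept the connection and register it with the selector.
conn, _ = server.accept()
conn.setblocking(False)
sel.register(conn, selectors.EVENT_READ)

# Poller step: wait for a read event, then dispatch to the processor.
for key, _events in sel.select(timeout=5):
    process(key.fileobj)

reply = client.recv(1024)
print(reply)                        # b'PING'
```

The point of the selector is that one poller thread can watch many registered sockets and only dispatch the ones that are actually ready.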

2.3 Process/thread + coroutine mode

Taking Swoole’s coroutine HTTP server as an example, the process model is as follows:

  • Master: the Master process is a multi-threaded process
  • Reactor threads:
    • Reactor threads are created inside the Master process
    • Responsible for maintaining client TCP connections, handling network I/O, processing protocols, and sending/receiving data
    • Execute no PHP code
    • Buffer, concatenate, and split the data sent by TCP clients into complete request packets
  • Worker processes:
    • Accept the request packets delivered by the Reactor threads and execute PHP callbacks to process the data
    • Generate response data and send it to the Reactor thread, which forwards it to the TCP client
    • Can run in asynchronous non-blocking mode or synchronous blocking mode
    • Workers run in multi-process mode
  • TaskWorker processes:
    • Accept tasks delivered by the Worker processes
    • Process the tasks and return the resulting data to the Worker process
    • Run in fully synchronous, blocking mode
    • TaskWorkers run in multi-process mode
  • Manager process: responsible for creating and reclaiming Worker/TaskWorker processes

Within each request-handling thread, however, work proceeds in coroutine mode. First, a quick look at the scheduler: a thread runs a scheduler, and several coroutines can be created on top of it; the scheduler is responsible for scheduling these coroutines, and it maintains an I/O multiplexer (epoll/select/poll) internally.

Now suppose we have three coroutines A, B, and C, each performing several I/O operations. These three coroutines run in the context of the same scheduler (thread) and use the CPU in turn.

Coroutine A runs first. When it issues an I/O operation that is not immediately ready, A registers the I/O event with the scheduler and voluntarily gives up the CPU. The scheduler then switches B onto the CPU and starts executing it. Likewise, when B encounters an I/O operation, it registers the event with the scheduler and voluntarily yields the CPU, and the scheduler switches C in. When all coroutines are “blocked”, the scheduler checks whether any registered I/O events have occurred or become ready. Suppose the event registered by coroutine B is now ready: the scheduler resumes B, which picks up exactly where it last gave up the CPU. The same goes for A and C.

Thus each individual coroutine uses a synchronous model, while the scheduler (thread) as a whole actually works asynchronously.
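The A/B/C scenario maps directly onto Python’s asyncio, whose event loop plays the scheduler role (an illustrative analogue; Swoole’s PHP coroutines behave similarly but with a different API):

```python
import asyncio

log = []

async def task(name, delay):
    log.append(f"{name} start")
    # An I/O wait: the coroutine registers a wakeup with the event
    # loop (the "scheduler") and voluntarily gives up the CPU.
    await asyncio.sleep(delay)
    log.append(f"{name} resume")

async def main():
    # A, B and C run in the same thread, taking turns on the CPU.
    await asyncio.gather(task("A", 0.03), task("B", 0.01), task("C", 0.02))

asyncio.run(main())
print(log)
# All three start in creation order; they then resume in the order
# their "I/O" became ready: B first, then C, then A.
```

Each `task` body reads as straight-line synchronous code, yet the thread as a whole interleaves all three, which is exactly the synchronous-inside/asynchronous-outside split described above.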


This article has analyzed how web servers work from the perspective of processes, threads, and coroutines. Corrections of any errors are welcome.

