This is the sixth article in the Netty series
In the previous article, we walked through a basic approach to building a server application with Netty, starting from a demo. From that demo we briefly described Netty's logical architecture and gained a preliminary understanding of concepts such as Channel, ChannelHandler, ChannelPipeline, EventLoop, and EventLoopGroup.
Review the logical architecture diagram.
Today we will study EventLoop and EventLoopGroup within that logical architecture and master the Netty thread model, one of the most essential pieces of Netty knowledge.
This article will take about 15 minutes to read and will focus on the following questions:
- What is the Reactor thread model?
- How do EventLoop and EventLoopGroup implement the Reactor thread model?
- In-depth optimization of Netty's threading model
- Thread model changes from Netty 3 to Netty 4
- What is the lock-free serialization of the Netty 4 threading model?
- Best practices from the threading model
1. What is the Reactor thread model?
Let's start by reviewing the I/O models introduced in Part 2 of the Netty series: blocking I/O (BIO), non-blocking I/O (NIO), I/O multiplexing, signal-driven I/O, and AIO. For I/O multiplexing, Java provides a dedicated NIO package that encapsulates the related system calls.
As mentioned in the previous article, we used Netty instead of using Java NIO packages directly because Netty helped us encapsulate many of the details of using NIO packages and made many optimizations.
One of the most famous is Netty’s Reactor Thread model.
If the pre-knowledge is not clear, you can go back to the previous articles:
"From network I/O to Netty: an in-depth look at I/O multiplexing" and "From I/O to Netty: crossing the Java NIO package".
Reactor model is an event-driven model.
In the Reactor thread model, a single thread listens using the Selector's select() method from the Java NIO package. When an event (accept, read, etc.) arrives, it is dispatched to the corresponding handler for processing.
A more explicit definition of the Reactor thread model would be:
```
Reactor thread model = Reactor (I/O multiplexing) + thread pool
```
The Reactor listens and allocates events, and the thread pool processes events.
Depending on the number of Reactors and the size of the thread pool, the Reactor model can be divided into three variants:
- Single-reactor single-thread model (thread pool of fixed size 1)
- Single-reactor multithreaded model
- Multi-Reactor multi-thread model (usually two: a primary Reactor and a secondary Reactor, i.e. the master-slave model)
1.1 Single-reactor single-thread model
The Reactor internally listens for connection events through selector and distributes them through dispatch when received.
- If the event is a connection request, the Reactor accepts the connection and creates a Handler to process subsequent events on that connection.
- Otherwise, the Handler bound to the connection processes the event: read => decode => compute => encode => send.
In this process, whether it is event listening, event distribution, or event processing, there is always only one thread doing everything.
Cons: Can’t hold up when there are too many requests. Because there is only one thread, the performance of multi-core CPU cannot be achieved. And once a Handler is blocked, the server is completely unable to handle other connection events.
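As a concrete illustration, the whole single-Reactor single-thread flow can be sketched with plain Java NIO (a toy echo round trip, not the series' demo; the self-driving client thread and the ephemeral port exist only to make the sketch runnable on its own):

```java
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

public class SingleReactorEcho {

    // Runs one accept+echo round on a single Reactor thread and returns the client's reply.
    public static String demo() throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port, demo only
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        final String[] reply = new String[1];
        Thread client = new Thread(() -> {            // drives the demo with one request
            try (Socket s = new Socket("127.0.0.1", port)) {
                s.getOutputStream().write("ping".getBytes(StandardCharsets.UTF_8));
                byte[] buf = new byte[4];
                int n = s.getInputStream().read(buf);
                reply[0] = new String(buf, 0, n, StandardCharsets.UTF_8);
            } catch (Exception e) { e.printStackTrace(); }
        });
        client.start();

        boolean served = false;
        while (!served) {                             // ONE thread does everything
            selector.select();                        // 1. event polling
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isAcceptable()) {             // 2. dispatch: connection event => accept
                    SocketChannel ch = server.accept();
                    ch.configureBlocking(false);
                    ch.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {        // 3. handle: read => compute => send
                    SocketChannel ch = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(64);
                    ch.read(buf);
                    buf.flip();
                    ch.write(buf);                    // echo back on the same thread
                    ch.close();
                    served = true;
                }
            }
            selector.selectedKeys().clear();
        }
        client.join();
        server.close();
        selector.close();
        return reply[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("client got: " + demo());
    }
}
```

Note how listening, accepting, and handling all share one thread: while the echo handler runs, no other connection can be accepted, which is exactly the weakness described above.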
1.2 Single-reactor multithreaded model
To improve performance, we can hand over complex event handlers to a thread pool, which can evolve into a single-reactor multithreaded model.
The main difference between this model and the first is that business processing is removed from the previous single thread and replaced with thread pool processing.
1) Reactor thread
The Reactor thread listens for client requests via select(), accepts new connections via accept(), and creates a Handler to respond to subsequent read and write events on each connection. The Handler here only responds to events; it hands the actual business processing to the Worker thread pool.
The Reactor thread itself handles only connection events and read/write events.
2) Worker thread pool
Process all business events, including (decode => compute => encode) procedures.
Make full use of multi-core machine resources to improve performance.
Disadvantages: in certain scenarios, a single Reactor thread that listens to and dispatches all client connections can itself become the bottleneck, for example with millions of concurrent client connections (Singles' Day sales, Spring Festival ticket rushes).
1.3 Multi-reactor multi-thread model
To take full advantage of the multi-core capability, two reactors can be built, which evolves into the master-slave Reactor threading model.
1) Main Reactor
The main Reactor listens on the server socket alone, accepts new connections, and registers the established SocketChannel with a designated sub-Reactor.
2) Sub-Reactor
The sub-Reactor adds the connection to its own queue for listening and creates a Handler for event processing. It performs read, write, and dispatch operations, leaving business processing to the Worker thread pool.
3) Worker thread pool processes all business events to make full use of multi-core machine resources and improve performance.
Easily handle millions of concurrent requests.
Disadvantages: Complex implementation.
But with Netty, everything is easier.
Netty encapsulates all of this for us and lets us quickly adopt the master-slave Reactor thread model (with lock-free serialization added in the Netty 4 implementation). The code is not posted here, but can be seen in the earlier demo.
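For reference, a minimal master-slave configuration with Netty's bootstrap API looks roughly like this (a sketch, not the demo's exact code; EchoServerHandler and port 8080 are placeholders):

```java
EventLoopGroup bossGroup = new NioEventLoopGroup(1);   // main Reactor: accepts connections
EventLoopGroup workerGroup = new NioEventLoopGroup();  // sub-Reactors: defaults to 2 * CPU cores

ServerBootstrap b = new ServerBootstrap();
b.group(bossGroup, workerGroup)                        // master-slave Reactor model
 .channel(NioServerSocketChannel.class)
 .childHandler(new ChannelInitializer<SocketChannel>() {
     @Override
     protected void initChannel(SocketChannel ch) {
         ch.pipeline().addLast(new EchoServerHandler()); // placeholder business handler
     }
 });
b.bind(8080).sync();
```

For a single-Reactor single-thread setup, you would instead pass one NioEventLoopGroup(1) as both the boss and worker group.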
2. How do EventLoop and EventLoopGroup implement the Reactor thread model?
We’ve seen the Reactor thread model above, and the core of it is:
```
Reactor thread model = Reactor (I/O multiplexing) + thread pool
```
Its operation mode consists of four steps:
- Connection registration: After a connection is established, a channel is registered with a selector
- Event polling: the selector polls (via the select() function) all registered channels for I/O events (multiplexing)
- Event distribution: Allocates ready I/O events to the corresponding thread for processing
- Event handling: Each worker thread executes event tasks
How does this model work in Netty?
This brings us to EventLoop and EventLoopGroup.
2.1 What is EventLoop
EventLoop is not unique to Netty, but is itself a generic program model for event waiting and processing. It is mainly used to solve the problem of high multithreaded resource consumption. Node.js, for example, uses EventLoop.
So, what is an EventLoop in Netty?
- An event handler for the Reactor model.
- A single thread.
- An EventLoop maintains a selector and a taskQueue, which handle “I/O events” and “tasks,” respectively.
A taskQueue is a multi-producer, single-consumer queue that ensures thread safety when multiple threads add tasks concurrently.
I/O events are events in selectionKey, such as Accept, Connect, read, and write.
Tasks include ordinary tasks and scheduled tasks.
- Common tasks: Add tasks to the taskQueue using the execute() method of NioEventLoop. For example, Netty encapsulates WriteAndFlushTask and submits it to taskQueue when writing data.
- Scheduled task: A scheduled task is added to the scheduledTaskQueue by calling the schedule() method of NioEventLoop for periodic execution of the task (such as heartbeat message sending). Tasks in the scheduled task queue are added to the common task queue for execution.
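As a rough fragment (assuming a channel obtained from an active connection; the "PING" payload and 30-second interval are made up), the two kinds of tasks are submitted like this:

```java
EventLoop loop = channel.eventLoop();

// Ordinary task: goes into the taskQueue and runs on the channel's own EventLoop thread
loop.execute(() -> System.out.println("running in " + Thread.currentThread().getName()));

// Scheduled task: e.g. periodic heartbeat sending (EventLoop is a ScheduledExecutorService)
loop.scheduleAtFixedRate(
        () -> channel.writeAndFlush("PING"),
        0, 30, TimeUnit.SECONDS);
```

Both end up being executed by the single EventLoop thread during its task-processing phase.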
A picture is worth a thousand words:
EventLoop runs single-threaded and performs three actions in a loop:
- Selector Event polling
- I/O event processing
- Task processing
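This three-step loop can be sketched in plain Java (a toy model: the selector step is stubbed out with a short sleep, and ConcurrentLinkedQueue stands in for Netty's MPSC queue):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Toy EventLoop: one thread looping over "poll I/O events" then "run queued tasks".
public class ToyEventLoop implements Runnable {
    private final Queue<Runnable> taskQueue = new ConcurrentLinkedQueue<>();
    private final Thread thread = new Thread(this, "toy-event-loop");
    private volatile boolean running = true;

    public void start() { thread.start(); }
    public void shutdown() { running = false; }
    public boolean inEventLoop() { return Thread.currentThread() == thread; }

    // Any thread may submit tasks; only the loop thread ever executes them.
    public void execute(Runnable task) { taskQueue.offer(task); }

    @Override
    public void run() {
        while (running || !taskQueue.isEmpty()) {
            pollIoEvents();                           // 1. selector.select() would go here
            Runnable task;
            while ((task = taskQueue.poll()) != null) {
                task.run();                           // 3. runAllTasks: drain the task queue
            }
        }
    }

    private void pollIoEvents() {
        // 2. processSelectedKeys: real code dispatches ready I/O events to the pipeline
        try { Thread.sleep(1); } catch (InterruptedException ignored) { }
    }
}
```

However many threads call execute(), every task body runs on the single loop thread, which is the property the later "lock-free serialization" section builds on.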
2.2 What is an EventLoopGroup
The EventLoopGroup is simple and can be simply understood as an “EventLoop thread pool.”
Tips:
Listening on a port is bound to only one EventLoop in the BossEventLoopGroup, so configuring multiple threads for the BossEventLoopGroup is useless unless you listen on multiple ports simultaneously.
2.3 Specific Implementation
Netty supports single-thread model, single-reactor multi-thread model, and multi-reactor multi-thread model through simple configuration.
Let’s take a look at Netty using EventLoop as an example.
Again, a picture is worth a thousand words:
Let’s take a look at the four steps of the Reactor thread model:
1) Connect to register
The Master EventLoopGroup has an EventLoop that is bound to a specific port for listening.
When a new connection triggers an Accept event, the connection is assigned to an EventLoop in the Slave EventLoopGroup during the I/O event processing phase of the current EventLoop to listen for subsequent events.
2) Event polling
The EventLoops in the Slave EventLoopGroup poll the channels bound to them through their Selectors to obtain all I/O events of the registered channels (multiplexing).
Of course, multiple EventLoops run in an EventLoopGroup, each with its own loop. The number of EventLoops is either the number of threads specified by the user or, by default, twice the number of CPU cores.
3) Event distribution
When an EventLoop in the slave EventLoopGroup receives an I/O event, it sends it to the corresponding ChannelPipeline for processing in the I/O event processing (processSelectedKeys) phase of the EventLoop.
Note that processing is still performed serially on the current thread.
4) Event handling
Handle I/O events in ChannelPipeline.
After the I/O event is processed, EventLoop consumes the tasks in the queue in the runAllTasks phase.
At this point, we can fully tease out the relationship between EventLoopGroup/EventLoop and the Reactor thread model.
Hmm, does something seem off?
That's right: as you may have noticed, the Slave EventLoopGroup is not a single

```
selector + thread pool
```

but rather multiple EventLoops, that is:

```
multiple selectors + multiple single threads
```
Why is that?
It’s time to dig deeper into the threading model optimization of Netty4.
3. In-depth optimization of Netty’s threading model
As mentioned above, each EventLoop is single-threaded and executes three actions in a loop:
- Selector Event polling
- I/O event processing
- Task processing
In the Slave EventLoopGroup there is no "selector + thread pool" mode, but rather a "multiple selectors + multiple single threads" model composed of multiple EventLoops. Why is this?
This is because what we are analyzing here is the Netty 4 threading model, which differs from the traditional Reactor model used in Netty 3.
3.1 Changes in the threading model of Netty3 and Netty4
Netty 3's threading model distinguishes between a read-event processing model and a write-event processing model.
Read events:
- ChannelHandlers for read events are executed by Netty's I/O thread (the counterpart of Netty 4's EventLoop).
- The I/O thread schedules and executes the Handler chain in the ChannelPipeline, up to the final business Handler.
- The final Handler wraps the message into a Runnable and executes it in the business thread pool; the I/O thread returns and continues with other I/O operations such as reads and writes.
Write events:
- Write events are handled by the calling thread, which may be an I/O thread or a business thread.
- If it is a business thread, that thread executes the ChannelHandlers in the ChannelPipeline.
- The last ChannelHandler pushes the encoded message onto the send queue, and the business thread returns.
- The Netty I/O thread then takes the message from the send queue and calls SocketChannel's write method to send it.
As can be seen, Netty 3's threading model follows a "selector + business thread pool" pattern.
Note that under this model the read and write paths are inconsistent; in particular, read and write events execute on different threads.
Netty 4's threading model, by contrast, adopts a "multiple selectors + multiple single threads" pattern.
Read events:
- I/O thread NioEventLoop reads datagrams from SocketChannel, posts ByteBuf to ChannelPipeline, and fires ChannelRead event;
- The I/O thread NioEventLoop calls the ChannelHandler chain until the message is posted to the business thread, and then the I/O thread returns to continue.
Write events:
- The business thread calls ChannelHandlerContext.write(Object msg) to send a message.
- The ChannelHandlerInvoker wraps the message into a task and places it in the EventLoop's MPSC task queue, and the business thread returns; the EventLoop then schedules and executes it uniformly in its loop.
- The I/O thread (EventLoop) takes the task from the MPSC queue and calls the ChannelPipeline to process the outbound event until the message is placed in the send queue, then wakes up the Selector to perform the actual write.
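The dispatch rule at the heart of this write path (run inline if already on the EventLoop thread, otherwise enqueue and return) can be sketched as follows; this is a simplification of the check Netty's pipeline contexts perform, not Netty's actual code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Simplified version of the inEventLoop() check made on every outbound operation.
public class WriteDispatch {
    private final Queue<Runnable> mpscQueue = new ConcurrentLinkedQueue<>();
    private final List<Object> sent = new ArrayList<>(); // only touched by the loop thread
    private volatile Thread loopThread;

    public boolean inEventLoop() { return Thread.currentThread() == loopThread; }

    public void write(Object msg) {
        Runnable task = () -> sent.add(msg);      // stand-in for encode + socket write
        if (inEventLoop()) {
            task.run();                           // I/O thread: execute directly
        } else {
            mpscQueue.offer(task);                // business thread: enqueue, then return
        }                                         // (real Netty also wakes up the Selector)
    }

    // One iteration of the EventLoop's runAllTasks phase.
    public List<Object> runAllTasks() {
        loopThread = Thread.currentThread();
        Runnable task;
        while ((task = mpscQueue.poll()) != null) task.run();
        return sent;
    }
}
```

Because the queue is drained by exactly one consumer, the send buffer never needs a lock even though many business threads may call write() concurrently.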
In Netty4, both reading and writing are handled uniformly through the I/O thread, also known as EventLoop.
Why did the threading model of Netty4 make this change? The answer is lock-free serial design.
3.2 What is lockless serialization of Netty4 thread model
Let’s start by looking at the problems with Netty3’s threading model:
- Inconsistent read/write thread models impose an additional development mental burden.
- When write operations are initiated by business threads, the business typically executes a flow concurrently on multiple threads from a pool, so several threads may operate on the same ChannelHandler at the same time. The ChannelHandler therefore needs concurrency protection, which greatly reduces development efficiency.
- Frequent thread context switching imposes additional performance costs.
The design of “lockless serialization” in Netty4 thread model solves these problems well.
A picture is worth a thousand words:
Event polling, message reading, encoding, and subsequent Handler execution are always performed sequentially within the I/O thread NioEventLoop, which means that there is no thread context switch in the whole process, avoiding performance degradation caused by multithreading competition, and avoiding the risk of concurrent modification of data.
On the surface, this serialized design looks CPU-inefficient and insufficiently concurrent. However, by adjusting the thread parameters of the Slave EventLoopGroup, multiple NioEventLoops can run at the same time, and these serialized threads operate in parallel with each other. This "locally lock-free" serial thread design performs better than the "one queue, multiple worker threads" model.
Summarize the advantages of Netty4 lock-free serialization design:
- An EventLoop handles all events throughout the life cycle of a channel. The I/O thread NioEventLoop is always responsible for reading from the message, encoding it, and subsequent execution by the Handler.
- Each EventLoop will have its own queue of tasks.
- There is no thread context switch and data is not at risk of being concurrently modified.
- For users, the unified read-write thread model also reduces the mental burden of use.
4. Best practices from a threading model
NioEventLoop’s lockless serialization is so well designed that it’s perfect?
No!
Netty 3's threading model may actually perform better in certain scenarios. For example, when encoding and other write operations are time-consuming, executing them concurrently on multiple business threads will certainly outperform executing them sequentially on a single EventLoop thread.
So, while single-threaded execution avoids thread switching, its weakness is that it cannot tolerate operations that take too long: once one event handler blocks, all subsequent events on that EventLoop are delayed, and events may even pile up.
So, the following two points should be noted as best practices for Netty4’s threading model:
- Do not perform time-consuming read or write operations in custom ChannelHandlers.
- Do not submit time-consuming tasks to the EventLoop's task queue.
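When business logic genuinely is slow (database calls, heavy encoding), the idiomatic Netty escape hatch is to bind the slow handler to a separate EventExecutorGroup so the EventLoop itself is never blocked. A configuration sketch (the group size of 16 and both handler names are illustrative):

```java
// A dedicated executor group for slow handlers, separate from the I/O EventLoops
EventExecutorGroup businessGroup = new DefaultEventExecutorGroup(16);

ch.pipeline()
  .addLast(new FastCodecHandler())                    // runs on the channel's EventLoop
  .addLast(businessGroup, new SlowBusinessHandler()); // runs on the business group instead
```

This keeps Netty 4's unified read/write model while moving the blocking work off the I/O thread.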
This article took a deep look at EventLoop in Netty's logical architecture and at the thread model, the most essential piece of Netty knowledge.
We started with the Reactor thread model and saw how Netty implements it with EventLoop.
We then examined Netty 4's thread model optimizations in detail, especially the lock-free serialization design.
Finally, based on the EventLoop thread model, we derived best practices for everyday Netty 4 development.
I hope you get a full understanding of EventLoop.
In addition, due to space constraints, there are two very important data structures in EventLoop that are not covered. Do you know what they are?
A follow-up article will analyze them separately; stay tuned.
Bibliography: Netty in Action
Thanks for reading to the end. Original writing is not easy, so please follow and like ~
Reorganizing knowledge fragments to build a Java knowledge graph: github.com/saigu/JavaK… (for easy access to the historical articles)