Asynchrony: How to avoid blocking in a single-threaded model?


One reason Redis is so widely used is its high-performance access. For that reason, we must pay attention to every factor that can affect Redis performance, such as command operations, system configuration, key mechanisms, and hardware configuration. We need to understand the specific mechanisms so that we can avoid performance anomalies where possible, and also prepare in advance for handling them when they occur.

Therefore, starting with this lesson, I will spend six lessons introducing five potential factors that affect Redis performance:

Blocking operations inside Redis;
CPU core and NUMA architecture impact;
Key Redis system configuration;
Redis memory fragmentation;
Redis buffer.

In this lesson, we will first learn about blocking operations inside Redis and how to deal with them.

In Lecture 3, we learned that Redis network IO and key-value pair reads and writes are handled by the main thread. If an operation on the main thread takes too long, it blocks the main thread. However, Redis handles not only the key-value create, read, update, and delete operations requested by clients, but also persistence operations that ensure reliability, data synchronization operations for master/slave replication, and so on. With so many operations, which ones actually cause blocking?

Don’t worry, I’m going to walk you through these operations and identify blocking operations.

What choke points do Redis instances have?

At runtime, a Redis instance interacts with many objects, and these different interactions involve different operations. Let’s take a look at the objects that interact with a Redis instance and the operations that occur when they interact.

Clients: network IO, key-value create, read, update, and delete operations, and database operations;
Disk: generating RDB snapshots, recording the AOF log, and rewriting the AOF log;
Master and slave nodes: the master generates and transfers the RDB file; the slave receives the RDB file, clears its database, and loads the RDB file;
Sharded cluster instances: transferring hash slot information to other instances and migrating data.

To help you understand, let me draw another diagram to show the relationship between these four types of interaction objects and specific operations.

Next, let’s take a look at which operations in each of these interaction objects cause blocking.

1. Choke points during interaction with clients

Network IO can sometimes be slow, but Redis uses IO multiplexing to prevent the main thread from waiting for a network connection or request to arrive, so network IO is not a factor that causes Redis to block.

Creating, reading, updating, and deleting key-value pairs is the main part of the interaction between Redis and its clients, and it is also the main task of the Redis main thread. Therefore, high-complexity create, read, update, and delete operations will definitely block Redis.

So how do you tell if the operation is complex? One of the most basic criteria is whether the operation is O(N).

Operations on collections in Redis usually have O(N) complexity, so we need to be careful when using them. Examples include full collection queries such as HGETALL and SMEMBERS, and aggregate statistical operations on collections such as intersection, union, and difference. These operations are the first Redis choke point: full collection queries and aggregate operations.

In addition, deleting a collection can itself cause blocking. You might wonder why deleting data, which seems so simple, would block the main thread.

In fact, the essence of a delete operation is freeing the memory occupied by the key-value pairs. Don’t underestimate this process: freeing the memory is only the first step. To manage memory more efficiently, when an application frees memory, the operating system inserts the freed blocks into a linked list of free memory blocks for later management and reallocation. This takes time and blocks the application that is freeing the memory. So if a large amount of memory is freed at once, the time spent operating on the free-block list grows, and the Redis main thread is blocked accordingly.

So when is a large amount of memory freed? When a large number of key-value pairs are deleted. The most typical case is deleting a collection that contains a large number of elements, also known as bigkey deletion. To give you a sense of bigkey deletion performance, I measured the time taken to delete collections with different numbers of elements, as shown in the following table:

From this table, we can draw three conclusions:

When the number of elements increases from 100,000 to 1 million, the deletion time of the four major collection types grows by a factor of 5 to nearly 20.
The larger the elements in a collection, the longer the deletion takes.
When deleting a collection with 1 million elements, the longest absolute deletion time is 1.98s (Hash type). Redis response times are typically on the order of microseconds, so an operation that takes nearly 2s inevitably blocks the main thread.

After this analysis, it is clear that bigkey deletion is the second Redis choke point. Deletion has a significant negative impact on Redis instance performance and is easy to overlook in real-world development, so be sure to pay attention to it.

Since frequently deleting key-value pairs is a potential choke point, flushing a database in Redis (with operations such as FLUSHDB and FLUSHALL) is also a potential blocking risk, because it deletes and frees all key-value pairs. So this is the third Redis choke point: emptying the database.

2. Choke points during interaction with disks

I single out Redis’s interactions with the disk mainly because disk IO is generally slow and costly and deserves special attention.

Fortunately, the Redis developers recognized the blocking effect of disk IO long ago, so they designed Redis to use child processes to generate RDB snapshot files and to perform AOF log rewrites. Because both operations are carried out by child processes, slow disk IO does not block the main thread.

However, when Redis records the AOF log directly, it persists the data to disk according to the configured write-back policy. A single synchronous write to disk takes about 1 to 2ms. If a large number of write operations must be recorded in the AOF log and written back synchronously, the main thread is blocked. This brings us to the fourth Redis choke point: synchronous AOF log writes.
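As a rough sanity check on these numbers: if every logged write had to wait for one synchronous disk write of 1 to 2ms, and nothing else took any time, the main thread could complete at most a few hundred to a thousand such writes per second. The helper name below is only illustrative, and the calculation ignores everything except the sync latency.

```python
def max_synced_writes_per_second(sync_latency_ms: float) -> float:
    """Upper bound on writes per second when each write waits for one disk sync."""
    return 1000.0 / sync_latency_ms


# The 1 ms and 2 ms latencies are the figures quoted in the text.
ceiling_at_1ms = max_synced_writes_per_second(1.0)
ceiling_at_2ms = max_synced_writes_per_second(2.0)
print(f"1 ms per sync: at most {ceiling_at_1ms:.0f} writes/s")
print(f"2 ms per sync: at most {ceiling_at_2ms:.0f} writes/s")
```

Compared with response times in the microsecond range, a ceiling of this size makes it clear why waiting for the disk on the main thread is a choke point.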

3. Choke points during interaction between master and slave nodes

In a master/slave cluster, the master needs to generate an RDB file and transfer it to the slave. During replication, the master creates and transfers the RDB file in a child process, so its main thread is not blocked. The slave, however, must flush its current database with the FLUSHDB command after it receives the RDB file, which hits the third choke point we just analyzed.

In addition, after clearing its current database, the slave also needs to load the RDB file into memory. The speed of this process is closely related to the size of the RDB file: the larger the file, the slower the load. Loading the RDB file is therefore the fifth Redis choke point.

4. Choke points during interaction between sharded cluster instances

Finally, when we deploy a Redis sharded cluster, the hash slot information assigned to each Redis instance needs to be exchanged between instances. In addition, when load balancing takes place or instances are added or removed, data is migrated between instances. However, the hash slot information is small and data migration is performed incrementally, so in general neither operation poses much risk of blocking the Redis main thread.

However, if you use the Redis Cluster solution and the data being migrated happens to be a bigkey, the main thread will block, because Redis Cluster migrates data synchronously. In Lecture 33, I’ll show you how different sharded cluster solutions deal with blocking caused by data migration. For now, just know that when no bigkey is involved, sharded cluster instances can interact without blocking the main thread.

Now that you know the key operations of Redis and the blocking operations involved, let’s summarize the five choke points we just found:

Full collection queries and aggregate operations;
Bigkey deletion;
Emptying the database;
Synchronous AOF log writes;
Loading the RDB file on the slave.

If these operations are performed on the main thread, the main thread will inevitably be unable to serve other requests for a long time. To avoid blocking operations, Redis provides an asynchronous thread mechanism: Redis starts child threads and hands some tasks to them to complete in the background, instead of executing them on the main thread. Operations performed this way do not block the main thread.

At this point, however, the question arises: Can all five blocking operations be performed asynchronously?

Which choke points can be executed asynchronously?

Before looking at the feasibility of asynchronous execution of blocking operations, let’s look at the requirements for asynchronous execution.

If an operation can be executed asynchronously, it is not an operation on the critical path of the Redis main thread. Let me explain what an operation on the critical path is: after the client sends the request to Redis, it waits for Redis to return a concrete data result.

This might be a little abstract, but let me draw a picture to illustrate it.

The main thread receives operation 1, and since operation 1 does not need to return concrete data to the client, the main thread can hand it off to a background child thread and simply return an “OK” result to the client. While the child thread is executing operation 1, the client sends operation 2 to the Redis instance. This time the client needs the data result of operation 2, and it stays in a waiting state until operation 2 returns that result.

In this case, operation 1 is not on the critical path because it does not return concrete data to the client, so it can be executed asynchronously by a background child thread. Operation 2 has to return its result to the client, so it is on the critical path and the main thread must execute it immediately.

Read operations are typical critical path operations for Redis, because after sending a read, the client waits for the data to come back before it continues processing. Redis’s first choke point, “full collection queries and aggregate operations”, consists of reads, so it cannot be executed asynchronously.

Let’s look at the delete operation. A delete does not need to return a concrete data result to the client, so it is not a critical path operation. The second choke point, “bigkey deletion”, and the third, “emptying the database”, both delete data and are not on the critical path. Therefore, we can use a background child thread to perform the deletion asynchronously.

For the fourth choke point, “AOF log synchronous write”, the Redis instance needs to make sure the operation records in the AOF log have been persisted to disk in order to guarantee data reliability. Although the instance has to wait for this operation, the operation does not return a concrete data result to the instance. So instead of having the main thread wait for the AOF log write to complete, we can start a child thread to perform the synchronous AOF log write.

Finally, let’s look at the “load RDB file on the slave” choke point. Before the slave can serve data access requests from clients, it must finish loading the RDB file. Therefore, this operation is also on the critical path, and the slave’s main thread must perform it.

Of Redis’s five choke points, all except “full collection queries and aggregate operations” and “loading the RDB file on the slave” involve operations that are not on the critical path. Therefore, we can use Redis’s asynchronous child thread mechanism to perform bigkey deletion, database emptying, and synchronous AOF log writes.
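The classification above can be written down as a small lookup table. This is only a restatement of the analysis in this lesson, not anything Redis itself exposes, and the names are made up for illustration.

```python
# True means the operation is on the critical path according to the analysis above.
ON_CRITICAL_PATH = {
    "full collection query and aggregation": True,
    "bigkey deletion": False,
    "emptying the database": False,
    "AOF log synchronous write": False,
    "loading the RDB file on the slave": True,
}

# Operations off the critical path are candidates for a background child thread.
async_candidates = [name for name, critical in ON_CRITICAL_PATH.items() if not critical]
must_stay_on_main_thread = [name for name, critical in ON_CRITICAL_PATH.items() if critical]

print("Can be handed to a background thread:", async_candidates)
print("Must run on the main thread:", must_stay_on_main_thread)
```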

So how does Redis implement the asynchronous child thread mechanism?

Asynchronous child threading mechanism

After the Redis main thread starts, it uses the pthread_create function provided by the operating system to create three child threads, which are responsible for asynchronously executing AOF log writes, key-value pair deletions, and file closes.

The main thread interacts with the child threads through a task queue in the form of a linked list. When it receives an operation that deletes a key-value pair or empties the database, the main thread wraps the operation into a task, puts it into the task queue, and then returns a completion message to the client indicating that the deletion is done.

However, the deletion has not actually been performed at this point. Only after the background child thread reads the task from the task queue does it start deleting the key-value pairs and freeing the corresponding memory. This kind of asynchronous deletion is therefore also called lazy free. Because the child thread does the work, the delete or flush operation does not block the main thread, which avoids the performance impact on the main thread.

Similar to lazy free, when the AOF log is configured with the everysec option, the main thread wraps the AOF log write into a task and also places it in the task queue. After the background child thread reads the task, it writes the AOF log itself, so the main thread does not have to wait for the AOF log write to finish.
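To make the hand-off concrete, here is a toy model of the task queue idea in Python. It is not Redis’s code (Redis is written in C), and the class and method names are invented for this sketch: the caller detaches the key and replies immediately, while a background thread does the expensive cleanup later.

```python
import queue
import threading


class LazyFreeStore:
    """Toy key-value store: unlink() detaches a key now and frees it later."""

    def __init__(self):
        self._data = {}
        self._tasks = queue.Queue()   # task queue shared with the worker thread
        self.freed_elements = 0       # updated only by the worker thread
        threading.Thread(target=self._worker, daemon=True).start()

    def set(self, key, value):
        self._data[key] = value

    def unlink(self, key):
        """Caller side: cheap detach, enqueue the expensive part, reply at once."""
        value = self._data.pop(key, None)
        if value is None:
            return 0
        self._tasks.put(value)
        return 1

    def _worker(self):
        """Background side: take tasks off the queue and release their contents."""
        while True:
            value = self._tasks.get()
            self.freed_elements += len(value)  # stand-in for freeing memory
            value.clear()
            self._tasks.task_done()

    def wait_for_background_work(self):
        self._tasks.join()


store = LazyFreeStore()
store.set("big:set", set(range(100_000)))
reply = store.unlink("big:set")       # returns before the elements are released
store.wait_for_background_work()      # only needed here to observe the result
print(reply, store.freed_elements)
```

The point of the design is that the cost seen by the caller no longer depends on how many elements the collection holds; that cost is paid later by the background thread.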

The following diagram shows the asynchronous child thread execution mechanism in Redis. Take another look at it to reinforce the idea.

Note that asynchronous key-value pair deletion and asynchronous database flushing are features provided since Redis 4.0, and Redis provides new commands for both operations.

Key-value pair deletion: when a collection type holds a large number of elements (for example, millions or tens of millions) that need to be deleted, I recommend using the UNLINK command.
Flushing the database: append the ASYNC option to the FLUSHDB and FLUSHALL commands to have a background child thread flush the database asynchronously, as shown below:

FLUSHDB ASYNC
FLUSHALL ASYNC
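From application code, the same commands are usually issued through a client library. The sketch below assumes the redis-py method names unlink, flushdb(asynchronous=True), and flushall(asynchronous=True); RecordingClient is a hypothetical stand-in that only records the command each call corresponds to, so the example runs without a server.

```python
class RecordingClient:
    """Hypothetical stand-in that records commands instead of contacting a server."""

    def __init__(self):
        self.commands = []

    def unlink(self, *keys):
        self.commands.append("UNLINK " + " ".join(keys))
        return len(keys)

    def flushdb(self, asynchronous=False):
        self.commands.append("FLUSHDB ASYNC" if asynchronous else "FLUSHDB")
        return True

    def flushall(self, asynchronous=False):
        self.commands.append("FLUSHALL ASYNC" if asynchronous else "FLUSHALL")
        return True


def drop_bigkeys(client, *keys):
    """Remove keys with UNLINK so their memory is reclaimed in the background."""
    return client.unlink(*keys)


def flush_in_background(client, all_databases=False):
    """Issue FLUSHALL ASYNC or FLUSHDB ASYNC depending on the flag."""
    if all_databases:
        return client.flushall(asynchronous=True)
    return client.flushdb(asynchronous=True)


client = RecordingClient()
unlinked = drop_bigkeys(client, "big:hash", "big:set")
flush_in_background(client)
flush_in_background(client, all_databases=True)
print(client.commands)
```

With a real connection, the same two helper functions would be called with a redis-py client object in place of RecordingClient.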

Summary

In this lesson, we learned about the four main types of objects a Redis instance interacts with at runtime: clients, disk, master/slave instances, and sharded cluster instances. Based on these four types, we identified five choke points that can affect Redis performance: full collection queries and aggregate operations, bigkey deletion, emptying the database, synchronous AOF log writes, and loading the RDB file on the slave.

Of these five choke points, bigkey deletion, emptying the database, and synchronous AOF log writes are not on the critical path and can be completed with the asynchronous child thread mechanism. Redis creates three child threads at runtime, and the main thread interacts with them through a task queue. Each child thread performs the asynchronous operation that matches the type of task it receives.

However, asynchronous deletion is only available in Redis 4.0 and later. If you are using an earlier version and need to delete a bigkey, here is a tip: first read the data with the SCAN command provided for the collection type, and then delete it. With SCAN, you read and delete only part of the data at a time, which avoids the main-thread blocking caused by deleting a large amount of data in one go.

For example, to delete a Hash bigkey, you can use HSCAN to fetch a batch of key-value pairs (say 200) from the Hash at a time, and then use HDEL to delete them. This spreads the deletion cost over multiple operations, so each one takes much less time and the main thread is not blocked.

Finally, I want to point out that full collection queries and aggregate operations, as well as loading the RDB file on the slave, are on the critical path and cannot be completed with asynchronous operations. For these two choke points, I have two small suggestions.

Full collection queries and aggregate operations: use the SCAN command to read the data in batches, and then perform the aggregate calculation on the client.
Loading the RDB file on the slave: keep the data size of the master at around 2 to 4GB so that the RDB file can be loaded reasonably quickly.
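The HSCAN plus HDEL approach can be sketched as follows. The function assumes a client object with redis-py style hscan and hdel methods; FakeHashClient is a hypothetical in-memory stand-in for those two calls so that the example runs without a server, and its cursor handling is deliberately simplified.

```python
class FakeHashClient:
    """Hypothetical in-memory stand-in for the two client calls used below."""

    def __init__(self, hashes):
        self.hashes = hashes
        self.hdel_calls = 0

    def hscan(self, key, cursor=0, count=10):
        remaining = self.hashes.get(key, {})
        batch = {field: remaining[field] for field in list(remaining)[:count]}
        # Cursor 0 signals that this batch covers everything that is left.
        next_cursor = 0 if len(batch) == len(remaining) else 1
        return next_cursor, batch

    def hdel(self, key, *fields):
        self.hdel_calls += 1
        removed = 0
        for field in fields:
            if self.hashes[key].pop(field, None) is not None:
                removed += 1
        return removed


def delete_big_hash(client, key, batch_size=200):
    """Delete a large hash in small batches so no single command runs for long."""
    deleted = 0
    cursor = 0
    while True:
        cursor, fields = client.hscan(key, cursor=cursor, count=batch_size)
        if fields:
            deleted += client.hdel(key, *fields)
        if cursor == 0:   # the scan has covered the whole hash
            break
    return deleted


client = FakeHashClient({"big:hash": {f"field:{i}": i for i in range(1000)}})
deleted = delete_big_hash(client, "big:hash")
print(deleted, client.hdel_calls)
```

Each HDEL call removes at most one batch of fields, so the work the server does per command stays small regardless of how large the hash was to begin with.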
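As an illustration of the first suggestion, the sketch below computes a set intersection on the client from batched reads instead of asking the server to do it in one command. It assumes a client with a redis-py style sscan_iter method; FakeSetClient is a hypothetical in-memory stand-in so that the example runs without a server.

```python
class FakeSetClient:
    """Hypothetical in-memory stand-in for a client's sscan_iter method."""

    def __init__(self, sets):
        self.sets = sets

    def sscan_iter(self, key, count=None):
        # A real server returns members batch by batch; here we simply yield them.
        yield from self.sets.get(key, set())


def intersect_on_client(client, key_a, key_b, batch_size=500):
    """Compute the intersection of two sets client-side from batched scans."""
    members_a = set(client.sscan_iter(key_a, count=batch_size))
    return {m for m in client.sscan_iter(key_b, count=batch_size) if m in members_a}


client = FakeSetClient({"users:a": {1, 2, 3, 4}, "users:b": {3, 4, 5}})
common = intersect_on_client(client, "users:a", "users:b")
print(sorted(common))
```

The trade-off is more round trips and more memory used on the client, in exchange for keeping each command on the server short.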

One question per lesson

As usual, I have a quick question for you: Do you think Redis writes (e.g., SET, HSET, SADD, etc.) are on the critical path?

Feel free to write down your thoughts and answers in the comments, and we can discuss them together. If you found today’s lesson helpful, you’re welcome to share it with more people, and I’ll see you next time.
