The paper
I believe you will often be asked an open question in the interview, why is Redis so fast? Before learning Redis systematically, I found the answer to this question on the Internet often because redis is based on memory, REDis IO multiplexing model, Redis is single-threaded with reduced context switching and so on. But these answers are too broad to make the interviewer’s eyes pop. So here’s a new look at what makes redis so fast at the data structure level, in terms of String objects.
What is the SDS
SDS — Simple Dynamic String in Redis
- The underlying implementation of the String object String is SDS
- SDS is used when Redis represents a string that can be modified
- The AOF buffer in the AOF module and the input buffer in the client state are implemented by SDS.
Definition of SDS
Struct SDSHDR {// Record the number of bytes used in the buF array // is equal to the length of the string held by the SDS int len; int free; // A byte array to hold the string char buf[]; }Copy the code
For example, an SDS data structure with a string “Kobe” added would look like this
- The free property value of 0 means that the SDS has not allocated any unused space, which will be useful in “memory preallocation” later
- Len’s attribute value is 4, indicating that the SDS holds a string of 4 bytes (excluding the last null character ‘\0’).
- The BUF property stores bytes of real data.
- Use ‘\0’ as the terminator, consistent with the C string representation
C string contest
The advantage of representing the end of a string with ‘\0’, as described above, is that you can directly reuse part of C’s string library. The difference with C strings, however, needs to be explained in detail.
Constant complexity gets the length of the string
As mentioned earlier in the string length section, the Len attribute directly represents the length of an SDS, rather than C, which requires traversing the string to calculate the length. In this way, SDS reduces the time complexity of obtaining string length to a constant level, that is, using space for time.
Prevent buffer overflow
For example, if we want to change the previous string ‘Kobe’ to ‘KobeBryant’ and call stract(s1, ‘Bryant’) to concatenate the string, if we forget to allocate enough space for the new string before concatenating, we will cause a buffer overflow and an accidental tampering of other data may occur. So what does Redis do?
The redis concatenation process looks like this. The call to stract(s1, ‘Bryant’) checks if the given string space is sufficient (len+free is used to check if there is enough space). If there is not enough space, it expands to enough space. An important step in performing the ‘Bryant’ operation is to also allocate 10 bytes of unused space to the SDS. The operation is shown below:
As you can see, not only has the length value changed to 10, but the free value has also changed to 10.
Reduces memory reallocation when modifying strings
In the actual business use of Redis, data is frequently modified and has strict requirements on speed. Therefore, every modification requires a memory reallocation, which is obviously undesirable. Therefore, Redis makes use of the free attribute, that is, the unused space adopts two optimization strategies of “memory pre-allocation” and “lazy space release”.
Memory preallocation
To sum up, this is the optimization of the way in which space is used for time in the growth operation. When redis expands, it not only allocates the space required for modification to SDS, but also allocates additional unused space to SDS. It can be understood as a possible future modification of the pre-allocation of memory. The policy logic for execution looks like this:
If the length of SDS (the len attribute value) is less than 1MB after modification, the program allocates the same amount of unused space as len, and the length attribute value and free attribute value of SDS are the same
If the length of SDS (the len attribute) is greater than or equal to 1MB after modification of SDS, the program sets 1MB of unallocated memory space.
Inert space release
The general idea is to release the shortening operation used to optimize the SDS string. Shortening ‘KobeBryant’ to ‘Kobe’ does not immediately use memory reallocation to reclaim the shortened extra byte space, but instead uses the free property to record it for later use. As shown in the figure below
Binary security
The BUf [] property stores binary data, so it can store binary data such as pictures, videos, audio, and compressed files.
Summarized below
- Redis represents SDS as a string in most cases.
- Compatible with some C string functions
- Constant complexity gets the length of the string
- Prevent buffer overflow
- Optimization strategies to reduce memory reallocation when modifying strings: “Memory preallocation”, “Lazy space free”
- Binary security
Ps: Today is 2021-8-24. Use the name “Kobe” as a way to honor him and hope that the legend will live on forever.