Advantages of SDS over C strings

1) Get the string length in constant time

A C string does not record its own length. To obtain it, the program must traverse the whole string and count each character until it reaches the null character at the end, which is an O(N) operation. SDS records its length in the len attribute, so the length can be read directly in O(1) time.
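The difference can be sketched in C. This is a minimal illustration, not the Redis source: struct sds_sketch and all function names are hypothetical, and the fixed-size buf is a simplification (a real SDS allocates its buffer dynamically).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified header with the len/free attributes described in these notes. */
struct sds_sketch {
    size_t len;    /* bytes in use, not counting the terminating null character */
    size_t free;   /* unused bytes still available in buf */
    char buf[64];  /* fixed size for illustration only */
};

/* O(N): walks the bytes until it meets the null character. */
size_t c_string_length(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* O(1): reads the stored length without touching buf. */
size_t sds_sketch_length(const struct sds_sketch *s) {
    assert(s != NULL);
    return s->len;
}

/* Stores text in the sketch and keeps len and free up to date. */
void sds_sketch_set(struct sds_sketch *s, const char *text) {
    assert(s != NULL && text != NULL);
    size_t n = strlen(text);
    assert(n < sizeof s->buf);
    memcpy(s->buf, text, n + 1);      /* copy the null character too */
    s->len = n;
    s->free = sizeof s->buf - 1 - n;
}
```

The cost of keeping len current is paid once on every modification, so each later length query is a single field read.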

2) Prevent buffer overflows

Because a C string does not record its length, string concatenation (strcat) assumes that the caller has already allocated enough memory for the destination string. If the destination does not have enough space and the adjacent memory holds other data, the appended bytes spill into that adjacent memory and overwrite its original contents, which is a buffer overflow. The SDS space allocation strategy eliminates this possibility: before an SDS is modified, the API first checks whether its space is sufficient for the modification. If it is not, the API automatically expands the space first and only then performs the modification.
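The check-then-expand behavior might look like the following sketch. The struct and the functions sds_sketch_init and sds_sketch_cat are hypothetical names used for illustration. The growth here is exact, with no extra space reserved, and the appended string is assumed not to point into the destination buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified SDS with a heap-allocated buffer. */
struct sds_sketch {
    size_t len;   /* bytes in use, not counting the null character */
    size_t free;  /* unused bytes available after the data */
    char *buf;    /* heap buffer of len + free + 1 bytes */
};

/* Creates the sketch from a C string. Returns 0 on success, -1 on failure. */
int sds_sketch_init(struct sds_sketch *s, const char *init) {
    assert(s != NULL && init != NULL);
    size_t n = strlen(init);
    s->buf = malloc(n + 1);
    if (s->buf == NULL) {
        return -1;
    }
    memcpy(s->buf, init, n + 1);
    s->len = n;
    s->free = 0;
    return 0;
}

/* Appends t, growing the buffer first when free space is insufficient.
 * t must not alias s->buf, because realloc may move the buffer. */
int sds_sketch_cat(struct sds_sketch *s, const char *t) {
    assert(s != NULL && t != NULL);
    size_t add = strlen(t);
    if (s->free < add) {
        /* Not enough room: expand before writing anything. */
        char *p = realloc(s->buf, s->len + add + 1);
        if (p == NULL) {
            return -1;  /* original buffer is left untouched */
        }
        s->buf = p;
        s->free = add;
    }
    memcpy(s->buf + s->len, t, add + 1);  /* includes the null character */
    s->len += add;
    s->free -= add;
    return 0;
}
```

Unlike strcat, the caller never has to size the destination in advance, because the append function owns the capacity check.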

3) Reduce the number of memory reallocations required to modify the string length

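Because a C string does not record its length, its underlying array is always exactly the number of characters plus one byte, with the last byte holding the terminating null character. Every operation that lengthens or shortens the string therefore requires a memory reallocation: concatenation must first extend the array, and truncation must afterwards release the space that is no longer used. If the reallocation is skipped, the former causes a buffer overflow and the latter causes a memory leak.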

The length of the SDS array is not necessarily the number of characters plus one; the array may contain unused bytes. SDS manages this space with two optimization strategies: space pre-allocation and lazy space release.

A) Space pre-allocation. When SDS expands its space, it allocates not only the space required by the modification but also additional unused space. If the length of the SDS after modification is less than 1MB, the same amount of unused space is allocated, so the actual length of the array is the modified length * 2 + 1, where the extra 1 byte stores the null character. If the length after modification is greater than or equal to 1MB, 1MB of unused space is allocated, so the actual length of the array is the modified length + 1MB + 1 byte.

B) Lazy space release. When an SDS string is shortened, the program does not immediately reallocate memory to reclaim the bytes that are no longer needed. Instead, it records their number in the free attribute so that they can be reused by a later operation.
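The arithmetic behind both strategies can be sketched as below. The names are hypothetical, and only the len and free counters are modeled. A real implementation would also perform the allocation and write the null character at the new end of the data.

```c
#include <assert.h>
#include <stddef.h>

/* The 1MB threshold described above; the macro name is an assumption. */
#define SKETCH_MAX_PREALLOC ((size_t)1024 * 1024)

/* Unused bytes to reserve when the string grows to new_len bytes. */
size_t sketch_prealloc_free(size_t new_len) {
    return new_len < SKETCH_MAX_PREALLOC ? new_len : SKETCH_MAX_PREALLOC;
}

/* Total array size: data + reserved space + 1 byte for the null character. */
size_t sketch_buf_size(size_t new_len) {
    return new_len + sketch_prealloc_free(new_len) + 1;
}

/* Only the two counters are modeled here. */
struct sds_counts {
    size_t len;
    size_t free;
};

/* Lazy release: shortening moves bytes from len to free, with no reallocation. */
void sketch_shorten(struct sds_counts *s, size_t new_len) {
    assert(s != NULL && new_len <= s->len);
    s->free += s->len - new_len;
    s->len = new_len;
}
```

For example, growing to 13 bytes gives an array of 13 * 2 + 1 = 27 bytes, while growing to 30MB gives 30MB + 1MB + 1 byte. Shortening an 11-byte string to 5 bytes leaves free at 6, so a later append of up to 6 bytes needs no reallocation.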

4) Binary safety

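The characters in a C string must conform to some encoding (such as ASCII), and the string cannot contain a null character anywhere except at the end, because the first null character encountered is treated as the end of the string. This means that C strings can only hold text data, not binary data such as images, audio, video, and compressed files. SDS uses the len attribute, not the null character, to decide where the data ends, so it can store arbitrary bytes.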
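A small sketch makes the limitation concrete. The struct and sds_sketch_set_bytes are hypothetical names, and the fixed-size buffer is a simplification. The data contains a null byte in the middle, so strlen stops early while the stored length still covers every byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified SDS; the free attribute is omitted for brevity. */
struct sds_sketch {
    size_t len;    /* number of data bytes, which may include null bytes */
    char buf[32];  /* fixed size for illustration only */
};

/* Copies exactly n bytes, so null bytes inside the data are preserved. */
void sds_sketch_set_bytes(struct sds_sketch *s, const void *data, size_t n) {
    assert(s != NULL && data != NULL && n < sizeof s->buf);
    memcpy(s->buf, data, n);
    s->buf[n] = '\0';  /* a terminator is still appended after the data */
    s->len = n;
}
```

Storing the 13 bytes of "Redis\0Cluster" this way gives len == 13, while strlen on the same buffer reports 5 because it stops at the embedded null byte.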

5) Compatible with some C string functions

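Because SDS still ends its data with a null character, it can reuse some functions from the C string library. For example, Redis doesn’t have to write its own function to compare SDS and C string values.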
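As a minimal sketch of this reuse, the buffer of a null-terminated SDS can be passed straight to strcmp from <string.h>. The struct and sds_sketch_equals_cstr are hypothetical names, and the approach only makes sense for text data with no null bytes inside it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified SDS holding null-terminated text. */
struct sds_sketch {
    size_t len;
    size_t free;
    char buf[32];  /* fixed size for illustration only */
};

/* Returns nonzero when the SDS text equals the C string t.
 * Reuses strcmp, which works because buf ends with a null character. */
int sds_sketch_equals_cstr(const struct sds_sketch *s, const char *t) {
    assert(s != NULL && t != NULL);
    return strcmp(s->buf, t) == 0;
}
```

For an SDS initialized as {5, 26, "Redis"}, the comparison with "Redis" succeeds and the comparison with "redis" fails, since strcmp is case-sensitive.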