The underlying implementation — the structure
A string in Redis is a modifiable string that exists in memory as a byte array.
struct SDS<T> {
    T capacity;     // Array capacity
    T len;          // Array length
    byte flags;     // Special flag bits; can be ignored here
    byte[] content; // The contents of the array
}
The author of this snippet is presumably using generics to show that the field width varies. Generics are not strictly necessary, though: the underlying array is always a byte array, and capacity and len must be integers whatever their width. Since SDS is itself a string, writing the content field as T[] content would not be appropriate either.
Is that really so?
String expansion
On initial allocation len and capacity are equal; expansion is triggered when an APPEND needs more room than capacity provides.
APPEND can be used to build a compact representation of a series of fixed-size samples, often referred to as a time series.
/*
 * s:   the string to append to
 * t:   the content to append
 * len: the length of the content to append
 */
sds sdscatlen(sds s, const void *t, size_t len) {
    size_t curlen = sdslen(s); // Length of the original string

    // Make room as needed. If capacity cannot hold the appended content,
    // the byte array is reallocated and the original contents are copied
    // into the new array.
    s = sdsMakeRoomFor(s, len);
    if (s == NULL) return NULL; // Out of memory
    memcpy(s+curlen, t, len);   // Copy the appended content into the byte array
    sdssetlen(s, curlen+len);   // Set the new length after the append
    s[curlen+len] = '\0';       // End with \0 so the string is easy to debug and print, and glibc string functions can be used directly
    return s;
}
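The grow-then-copy flow of sdscatlen can be tried in isolation. The following is a minimal sketch, not the Redis implementation: toy_str, toy_new, toy_cat, and toy_free are hypothetical names, the header is a plain struct rather than a prefix stored in front of the character pointer, and the doubling growth policy is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy string: NOT the Redis SDS layout. */
typedef struct {
    size_t len;       /* bytes in use, excluding the trailing '\0' */
    size_t capacity;  /* bytes available for content, excluding the trailing '\0' */
    char *buf;        /* capacity + 1 bytes, always NUL-terminated */
} toy_str;

/* Create a string with len == capacity (no spare room). Returns 0 on success. */
static int toy_new(toy_str *s, const char *init) {
    assert(s != NULL && init != NULL);
    size_t n = strlen(init);
    s->buf = malloc(n + 1);
    if (s->buf == NULL) return -1;
    memcpy(s->buf, init, n + 1);
    s->len = n;
    s->capacity = n;
    return 0;
}

/* Append len bytes from t, growing the buffer first if needed. Returns 0 on success. */
static int toy_cat(toy_str *s, const void *t, size_t len) {
    assert(s != NULL && t != NULL);
    if (s->capacity - s->len < len) {
        /* Assumed policy for this sketch: allocate twice the required size. */
        size_t newcap = (s->len + len) * 2;
        char *p = realloc(s->buf, newcap + 1);
        if (p == NULL) return -1;  /* the original buffer is still valid */
        s->buf = p;
        s->capacity = newcap;
    }
    memcpy(s->buf + s->len, t, len);
    s->len += len;
    s->buf[s->len] = '\0';
    return 0;
}

/* Release the buffer and reset the fields. */
static void toy_free(toy_str *s) {
    free(s->buf);
    s->buf = NULL;
    s->len = s->capacity = 0;
}
```

Starting from "abc" (len 3, capacity 3), appending "de" forces a reallocation and leaves spare room, so a following one-byte append reuses the existing buffer without reallocating.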
Why a generic T instead of int: when the string is short, len and capacity can be represented by a byte or a short, so Redis uses different structures for strings of different lengths to optimize memory.
The generic form is the author's abstraction. The actual Redis source chooses at run time among header types whose length fields are an unsigned char, short, int, or long long.
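How a header width could be picked from the string length can be sketched as follows. The names hdr_kind, pick_hdr, and hdr_size are hypothetical, and the thresholds simply reflect what fits in each unsigned width; the real selection logic lives in the Redis source and may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the width of the len/capacity fields. */
typedef enum { HDR_8, HDR_16, HDR_32, HDR_64 } hdr_kind;

/* Pick the narrowest unsigned width that can hold the string length. */
static hdr_kind pick_hdr(uint64_t string_len) {
    if (string_len <= UINT8_MAX)  return HDR_8;
    if (string_len <= UINT16_MAX) return HDR_16;
    if (string_len <= UINT32_MAX) return HDR_32;
    return HDR_64;
}

/* Header bytes for a kind: capacity + len (same width) + 1 flags byte. */
static size_t hdr_size(hdr_kind k) {
    switch (k) {
    case HDR_8:  return 2 * sizeof(uint8_t) + 1;
    case HDR_16: return 2 * sizeof(uint16_t) + 1;
    case HDR_32: return 2 * sizeof(uint32_t) + 1;
    case HDR_64: return 2 * sizeof(uint64_t) + 1;
    }
    assert(0 && "unknown header kind");
    return 0;
}
```

With the 1-byte variant the header costs 3 bytes, while the 8-byte variant would cost 17, which is why a short string benefits from the narrow form.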
Redis specifies that a string cannot exceed 512 MB. When a string is created, len equals capacity and no spare space is allocated, because most use cases never modify a string with APPEND.
embstr vs raw (two storage formats)
Redis strings can be stored in two formats. Very short strings are stored in embstr format; once the length exceeds 44 bytes, raw format is used.
Redis object header structure
All Redis objects have the following structure header:
struct RedisObject {
    int4 type;      // 4 bits
    int4 encoding;  // 4 bits
    int24 lru;      // 24 bits
    int32 refcount; // 4 bytes
    void *ptr;      // 8 bytes on a 64-bit system; points to where the object body is stored
} robj;
// 4*2/8 + 24/8 + 32/8 + 64/8 = 16
When the string is short, capacity and len each fit in 1 byte, so the SDS header takes 3 bytes (capacity, len, flags). The whole object then occupies 16 bytes (object header) + 3 bytes (SDS header) + the content length, plus one byte for the trailing \0.
Memory allocators such as jemalloc and tcmalloc hand out memory in sizes of 2, 4, 8, 16, 32, 64 bytes, and so on. To hold a complete embstr object, jemalloc allocates at least 32 bytes; if the string is slightly longer, it allocates 64 bytes.
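The 44-byte threshold follows from these numbers. Below is a small sketch of the arithmetic, assuming a 64-byte allocation, the 16-byte object header, a 3-byte SDS header, and one byte for the trailing \0. The constant and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum {
    OBJ_HEADER_BYTES = 16, /* type + encoding + lru + refcount + ptr */
    SDS_HEADER_BYTES = 3,  /* 1-byte capacity + 1-byte len + flags */
    NUL_BYTES        = 1   /* trailing '\0' */
};

/* Total bytes needed to store content_len bytes as one embedded object. */
static size_t embstr_total(size_t content_len) {
    return OBJ_HEADER_BYTES + SDS_HEADER_BYTES + content_len + NUL_BYTES;
}

/* Largest content that fits in an allocation of bucket bytes. */
static size_t embstr_max_content(size_t bucket) {
    size_t overhead = OBJ_HEADER_BYTES + SDS_HEADER_BYTES + NUL_BYTES;
    assert(bucket >= overhead);
    return bucket - overhead;
}
```

embstr_max_content(64) evaluates to 44 and embstr_max_content(32) to 12. A 45-byte string would need embstr_total(45) = 65 bytes, which no longer fits in the 64-byte allocation.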
Longer strings use raw. Unlike embstr, which places the object header and the SDS together in one contiguous block, raw allocates the SDS with its character array separately and points to it from the object header.
Errata
After APPEND, the storage format changes
In Redis 4.0.14, an embstr string is converted to raw as soon as APPEND is executed on it, no matter how short the original string or the appended content is.
SDS structure
struct sdshdr {
    // Number of bytes used in the buf array,
    // equal to the length of the string held by the SDS
    int len;
    // Number of unused bytes in the buf array
    int free;
    // Byte array that holds the string
    char buf[];
};
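A minimal sketch of how a header like this can be used, assuming the header and the content live in a single allocation and callers hold a pointer to buf. The helper names (old_sds_new, old_sds_hdr, old_sds_len, old_sds_avail, old_sds_free) are hypothetical and are not the Redis API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct sdshdr {
    int len;     /* bytes used in buf, excluding the trailing '\0' */
    int free;    /* unused bytes in buf */
    char buf[];  /* flexible array member holding the string */
};

/* Allocate header and content together; return a pointer to buf, or NULL. */
static char *old_sds_new(const char *init, int spare) {
    assert(init != NULL && spare >= 0);
    size_t n = strlen(init);
    struct sdshdr *sh = malloc(sizeof(struct sdshdr) + n + (size_t)spare + 1);
    if (sh == NULL) return NULL;
    sh->len = (int)n;
    sh->free = spare;
    memcpy(sh->buf, init, n + 1);  /* copies the trailing '\0' as well */
    return sh->buf;
}

/* Step back from buf to the header that precedes it. */
static struct sdshdr *old_sds_hdr(const char *s) {
    return (struct sdshdr *)(s - offsetof(struct sdshdr, buf));
}

static int old_sds_len(const char *s)   { return old_sds_hdr(s)->len; }
static int old_sds_avail(const char *s) { return old_sds_hdr(s)->free; }

/* Free the whole allocation, starting at the header. */
static void old_sds_free(char *s) {
    if (s != NULL) free(old_sds_hdr(s));
}
```

In this layout the capacity is not stored directly; it is len + free, and the length is read in constant time instead of by scanning for \0.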