preface

Hello, everyone. I am a little boy picking up snails. We all know that Redis is fast, with QPS of 100,000 (requests per second). Why is Redis so fast? This article will learn with you.

  • Public number: a boy picking up snails
  • Github address, thanks to every star

Memory-based implementation

We all know that memory reads and writes are much faster than disk reads and writes. Redis is a database based on memory storage, which saves disk I/O consumption compared to a database where data is stored on disk. MySQL and other disk databases need to establish indexes to speed up the query efficiency, while Redis data stored in memory, direct operation of memory, so it is very fast.

Efficient data structures

As we know, MySQL index selects B+ tree data structure for efficiency. In fact, reasonable data structure, is to make your application/application faster. Let’s take a look at Redis data structure & internal coding diagram:

SDS simple dynamic string

Struct SDSHDR {//SDS simple dynamic string len; // Record the space used in buF int free; Char buf[]; // The actual contents of the store}Copy the code

String length handling

In C language, to obtain the length of the string of the little boy picking up field snails, we need to iterate from the beginning, the complexity is O (n); In Redis, there is already a len field to record the length of the current string, which is O(1).

Reduce the number of memory reallocations

In C, modifying a string requires reallocation of memory, and the more frequently you modify it, the more frequently you allocate memory, which can cost performance. In Redis, SDS provides two optimization strategies: space pre-allocation and lazy space release.

Space preallocation

When SDS is simple dynamic string modification and space expansion, unused space is allocated in addition to the required memory space. Here’s the allocation rule:

  • After SDS modification, len length is less than 1M, then additional unused space of the same length as LEN will be allocated. For example, len=100. After reallocation, the actual length of buF becomes 100(used space)+100(extra space)+1(null characters)=201.
  • SDS changes len to a length greater than 1M, then the program allocates 1M unused space.

Inert space release

When SDS shortens, instead of reclaiming the excess memory space, it records the excess space with free. In the subsequent modification operation, the space in free is directly used to reduce the memory allocation.

The hash

Redis is a K-V in-memory database that uses a global hash to hold all key-value pairs. The entry element in the hash bucket holds *key and *value Pointers, where *key refers to the actual key and *value refers to the actual value.

Hash table lookups are very fast, somewhat like a HashMap in Java, which allows us to quickly find key-value pairs in O(1) time. First, calculate the hash value through the key, find the corresponding hash bucket location, locate the entry, and find the corresponding data in the entry.

Some of you might wonder: When you write a large amount of data into a hash table, you run into hash conflicts, and your efficiency slows down.

Hash conflict: Calculate the same hash value with different keys, resulting in the same hash bucket.

Redis uses chain hash to solve hash conflicts. Chained hash means that multiple elements in the same hash bucket are stored in a linked list, which are connected by Pointers in turn.

Some of you might wonder: elements in a hash collision chain can only be looked up by Pointers. When you insert a lot of data into the hash table, there will be more collisions, the longer the list of collisions will be, and the query will be less efficient.

To stay efficient, Redis rehashes the hash table, adding hash buckets to reduce collisions. To make rehash more efficient, Redis also uses two global hash tables by default, one for current use, called the primary hash table, and one for capacity expansion, called the standby hash table.

Jump table

Skip list is a unique data structure of Redis. It actually adds multi-level indexes on the basis of linked list to improve search efficiency. A simple schematic of skip lists is as follows:

  • Each layer has an ordered list, and the lowest list contains all the elements.
  • Skip lists support node lookup with average O (logN), worst O (N) complexity, and batch processing of nodes through sequential operations.

Ziplist ziplist

Ziplist is one of the underlying implementations of list and dictionary keys. It is a list composed of a series of specially encoded memory blocks. A Ziplist can contain multiple entries, and each entry can hold a character array or integer of limited length, as follows:

  • Zlbytes: records the memory bytes occupied by the compressed list
  • Zltail: indicates the offset from the tail node to the start node
  • Zllen: Records the number of nodes in the compressed list
  • EntryX: Compress each node contained in the list
  • Zlend: Special value 0xFF(255 decimal), used to mark the end of the compressed list

Because memory is allocated continuously, traversal is fast.

Reasonable data encoding

Redis supports multiple basic data types, each of which corresponds to a different data structure, and each data structure corresponds to a different encoding. To improve performance, Redis designers have concluded that data structures are best suited for coding.

Redis uses redisObject to represent the key value in the database. When we create a key value pair in Redis, we create at least two objects, one is the key object used as a key value pair, and the other is the value object of the key value pair.

Typedef struct redisObject{// type unsigned type:4; // Unsigned encoding:4; // A pointer to the underlying data structure void * PTR; / /... }robj;Copy the code

In redisObject, type corresponds to the object type, including String, List, Hash, Set, and zset. Encoding corresponds to encoding.

  • String: If a number is stored, the encoding is int; If a non-numeric string of 39 bytes or less is stored, it is embstr. If the value is larger than 39 bytes, it is raw encoding.
  • List: If the List has less than 512 elements and each element of the List has a value less than 64 bytes (default), use ziplist encoding, otherwise use LinkedList encoding
  • Hash: If the number of Hash elements is less than 512 and all values are less than 64 bytes, use ziplist, otherwise use Hashtable.
  • Set: IntSet encoding is used if all the elements in the collection are integers and the number of elements is less than 512, otherwise hashTable encoding is used.
  • Zset: ziplist encoding is used when the ordered set has less than 128 elements and the value of each element is less than 64 bytes, otherwise skiplist (skiplist) encoding is used

Rational threading model

Single-threaded model: Avoids context switching

Redis is single-threaded, which means that the network IO and key pair reads and writes in Redis are done by one thread. But other Redis functions, such as persistence, asynchronous deletion, cluster data synchronization, and so on, are actually performed by additional threads.

Redis’s single-threaded model eliminates unnecessary CPU context switching and contention lock consumption. Because it is single-threaded, if a command is executed too long (such as the hgetall command), it will block. Redis is an in-memory database for fast execution scenarios, so be careful with commands like lrange and smembers and hgetall.

What is context switching? Here’s a corn:

  • For example, if you are reading an English novel, you come to a page and find a word that you can’t pronounce, you bookmark it and look it up in the dictionary. After looking up the dictionary, you come back and start reading from the bookmark, and the process is very smooth.
  • If you read this book alone, you’ll be fine. But if you look it up in the dictionary, someone else looks through your book and runs away. When you come back, it’s not the page you were looking at, and you have to take the time to find your page.
  • A book, you a person how to read how label all right, but many people turn over, this book all kinds of marks are very messy. It may be a crude explanation, but the principle should be the same.

I/O multiplexing

What is I/O multiplexing?

  • I/O: network I/O
  • Multiplexing: Multiple network connections
  • Reuse: Reuse the same thread.
  • IO multiplexing is actually a synchronous IO model, which enables a thread to monitor multiple file handles. Once a file handle is ready, the application can be told to read or write the file. When no file handle is in place, the application blocks and the CPU is surrendered.

Redis uses EPoll as an implementation of I/O multiplexing technology that allows a single thread to efficiently process multiple connection requests. In addition, The event processing model of Redis converts the connection, read and write, and close in Epoll into events, so as not to waste too much time on network I/O.

Virtual memory mechanism

Redis directly built its own VM mechanism, unlike the general system will call system function processing, will waste a certain amount of time to move and request.

What is the virtual memory mechanism of Redis?

The virtual memory mechanism is to temporarily swap infrequently accessed data (cold data) from memory to disk, freeing up valuable memory space for other data that needs to be accessed (hot data). The VM function separates hot and cold data so that hot data is stored in the memory and cold data is saved to disks. This avoids the problem of slow access due to insufficient memory.

Reference and thanks

  • Redis VM mechanism
  • Why is Single-threaded Redis so fast?
  • Insight | Redis is single-threaded, but Redis why so soon?