No more preamble, straight to the notes.

0.1. Redis resolves hash collisions with separate chaining. Strings are stored as simple dynamic strings (SDS), a structure that grows automatically and records properties such as its own length.

0.2. Redis linked lists are doubly linked and acyclic, with head and tail pointers, a length counter, and polymorphic nodes (a void* pointer stores each node's value).

0.3. Redis uses progressive rehashing to avoid stalling the server: while a rehash is in progress, every add, delete, update, or lookup on the dictionary also migrates part of the data to the new hash table, and newly added entries go only into the new table.

0.4. The skip list is one of the underlying implementations of sorted sets. It uses two structures: one stores information about the skip list as a whole, the other represents a single skip list node. A node's level is a random number between 1 and 32. Within one skip list several nodes may share the same score, but every node's member object must be unique; when scores are equal, nodes are ordered by comparing their member objects.
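The text above only says a node's level is a random number between 1 and 32. As a rough illustration, here is a minimal Python sketch of the commonly described level-generation approach; the 0.25 promotion probability and the function name are my own assumptions, not something stated above.

```python
import random

MAX_LEVEL = 32   # skip list node levels range from 1 to 32
P = 0.25         # assumed promotion probability (illustrative only)

def random_level() -> int:
    """Return a random level in [1, MAX_LEVEL]; higher levels are exponentially rarer."""
    level = 1
    while random.random() < P and level < MAX_LEVEL:
        level += 1
    return level

print([random_level() for _ in range(10)])  # mostly 1s and 2s, occasionally higher
```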

0.5. The ziplist (compressed list) was designed to save memory. Each ziplist node can hold either an integer value or a byte array.

  1. Object encodings correspond to the following underlying structures:

    

int: integer
embstr: embstr-encoded simple dynamic string
raw: simple dynamic string
hashtable: dictionary
linkedlist: doubly linked list
ziplist: compressed list
intset: integer set
skiplist: skip list plus dictionary
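A quick way to see these encodings in practice is the OBJECT ENCODING command. The sketch below is only an illustration: it assumes a local Redis reachable through the redis-py client, the key names are made up, and newer servers may report successor encodings such as listpack or quicklist instead of ziplist.

```python
import redis

r = redis.Redis(decode_responses=True)  # assumes a local server on the default port

def encoding(key: str) -> str:
    # OBJECT ENCODING reports the underlying encoding of the value stored at key
    return r.execute_command("OBJECT", "ENCODING", key)

r.set("n", 123)               # small integer -> "int"
r.set("s", "hello")           # short string  -> "embstr"
r.set("big", "x" * 100)       # long string   -> "raw"
r.rpush("l", "a", "b", "c")   # small list    -> ziplist-family encoding
for k in ("n", "s", "big", "l"):
    print(k, encoding(k))
```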
  1. The encoding of a string object can be int, raw, or embstr.

    

int: an integer value that can be represented by a long.
embstr: a string of 32 bytes or less; floating-point numbers (stored via long double) and integers too large for a long are saved as strings and use this encoding when short enough. Modifying an embstr object converts it to raw.
raw: a string longer than 32 bytes; floating-point or integer values whose string form exceeds that limit also use this encoding.
  1. String object API

    

SET: save a value
GET: return the saved value, converting it to a string if it is a number
APPEND: append to the end of an existing string
INCRBY / DECRBY: add to or subtract from an integer value and save the integer result
INCRBYFLOAT: treat the value as a number, add a floating-point increment, and save the result
STRLEN: return the length of the string
SETRANGE: convert the object to a string if needed (for int and embstr encodings), then set the character at the given index
GETRANGE: convert the object to a string if needed (for int encoding), then return the characters in the given index range
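A usage sketch of these string commands, assuming a local server and the redis-py client (key names are arbitrary):

```python
import redis

r = redis.Redis(decode_responses=True)

r.set("greeting", "hello")            # SET
print(r.get("greeting"))              # GET         -> "hello"
r.append("greeting", " world")        # APPEND      -> "hello world"
print(r.strlen("greeting"))           # STRLEN      -> 11
r.setrange("greeting", 0, "H")        # SETRANGE    -> "Hello world"
print(r.getrange("greeting", 0, 4))   # GETRANGE    -> "Hello"

r.set("counter", 10)
r.incrby("counter", 5)                # INCRBY      -> 15
r.decrby("counter", 3)                # DECRBY      -> 12
r.incrbyfloat("counter", 0.5)         # INCRBYFLOAT -> 12.5
```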

    

  1. A list object is encoded as a ziplist only when both of the following conditions hold:

    

A. Every string element is 64 bytes or less.
B. The number of elements is 512 or less.
If either condition fails, the list is encoded as a linkedlist instead.

5. List API

    

LPOP: return the head element and remove it
RPOP: return the tail element and remove it
LINDEX: return the element at the given index
LLEN: return the length of the list
LINSERT: insert an element before or after a given pivot element
LREM: traverse the list and remove nodes equal to the given element
LSET: locate the given index and update the element there
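A usage sketch of the list commands above (redis-py client, made-up key names):

```python
import redis

r = redis.Redis(decode_responses=True)
r.delete("tasks")

r.rpush("tasks", "a", "b", "c")         # build the list: a, b, c
print(r.llen("tasks"))                  # LLEN    -> 3
print(r.lindex("tasks", 1))             # LINDEX  -> "b"
r.linsert("tasks", "before", "b", "x")  # LINSERT -> a, x, b, c
r.lset("tasks", 0, "a2")                # LSET    -> a2, x, b, c
r.lrem("tasks", 0, "x")                 # LREM    -> a2, b, c (count 0 removes all "x")
print(r.lpop("tasks"))                  # LPOP    -> "a2"
print(r.rpop("tasks"))                  # RPOP    -> "c"
```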

    

  1. A hash object has two underlying implementations: ziplist or hashtable. Conditions for using ziplist:

    

A. Every key and value is 64 bytes or less.
B. The number of key-value pairs is 512 or less.
If either condition fails, the hash is automatically converted to hashtable encoding.

    

ziplist: the two nodes that hold a key-value pair are always adjacent, and each new pair is appended to the end of the list.
hashtable: a dictionary holds the pairs; each dictionary key is a string object containing the hash key, and each dictionary value is a string object containing the hash value.
  1. The Hash object API

    

HSET: add a key-value pair
HGET: get the value for a key
HEXISTS: check whether a key exists
HDEL: delete the key-value pair for a key
HLEN: return the number of key-value pairs
HGETALL: traverse the entire hash object
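A usage sketch of the hash commands above (redis-py client, made-up key names):

```python
import redis

r = redis.Redis(decode_responses=True)
r.delete("user:1")

r.hset("user:1", "name", "alice")   # HSET
r.hset("user:1", "age", "30")
print(r.hget("user:1", "name"))     # HGET    -> "alice"
print(r.hexists("user:1", "age"))   # HEXISTS -> True
print(r.hlen("user:1"))             # HLEN    -> 2
print(r.hgetall("user:1"))          # HGETALL -> {'name': 'alice', 'age': '30'}
r.hdel("user:1", "age")             # HDEL
```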
  1. A set object can be encoded in two ways: intset or hashtable.

    

intset: used when the set contains only integers, with an integer set as the underlying structure.
hashtable: a dictionary as the underlying structure; the elements are stored as dictionary keys and all values are NULL.
Encoding conversion: intset is used only while (A) all elements are integers and (B) the number of elements does not exceed 512; if either condition is broken, the set is converted to hashtable encoding.
  1. Set object API

    

SADD: add elements
SCARD: return the set size
SISMEMBER: check whether an element is in the set
SMEMBERS: traverse the entire set
SRANDMEMBER: return a random element
SPOP: return a random element and remove it
SREM: remove elements equal to the given elements
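A usage sketch of the set commands above (redis-py client, made-up key names):

```python
import redis

r = redis.Redis(decode_responses=True)
r.delete("tags")

r.sadd("tags", "redis", "db", "cache")   # SADD
print(r.scard("tags"))                   # SCARD       -> 3
print(r.sismember("tags", "redis"))      # SISMEMBER   -> True
print(r.smembers("tags"))                # SMEMBERS    -> {'cache', 'db', 'redis'}
print(r.srandmember("tags"))             # SRANDMEMBER -> a random member, kept
print(r.spop("tags"))                    # SPOP        -> a random member, removed
r.srem("tags", "db")                     # SREM (no-op if "db" was already popped)
```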
  1. A sorted set object is stored using either ziplist or skiplist encoding.

    

ziplist: each element uses two adjacent ziplist nodes, the first holding the member and the second holding the score.
skiplist: a zset structure (a skip list plus a dictionary) is the underlying implementation; each skip list node holds one element, with the object attribute storing the member and the score attribute storing the score.
  1. Sorted set API

    

ZADD: add elements
ZCARD: return the number of elements
ZCOUNT: count the elements whose scores fall in a given range
ZRANGE: return all elements in a given range
ZREVRANGE: return all elements in a given range, in reverse order
ZRANK / ZREVRANK: return the rank of a given member (ascending / descending)
ZREM: delete nodes holding the given members
ZSCORE: get a member's score
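A usage sketch of the sorted set commands above (redis-py client, made-up key names):

```python
import redis

r = redis.Redis(decode_responses=True)
r.delete("scores")

r.zadd("scores", {"alice": 90, "bob": 75, "carol": 82})  # ZADD
print(r.zcard("scores"))                           # ZCARD    -> 3
print(r.zcount("scores", 80, 100))                 # ZCOUNT   -> 2
print(r.zrange("scores", 0, -1, withscores=True))  # ZRANGE ascending by score
print(r.zrevrange("scores", 0, -1))                # ZREVRANGE descending
print(r.zrank("scores", "alice"))                  # ZRANK    -> 2 (highest score)
print(r.zrevrank("scores", "alice"))               # ZREVRANK -> 0
print(r.zscore("scores", "bob"))                   # ZSCORE   -> 75.0
r.zrem("scores", "bob")                            # ZREM
```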
  1. Redis only implements memory sharing for string objects of integer type
  2. TTL and PTTL return the remaining lifetime of the key
  3. There are four ways to set the expiration time:

    

EXPIRE <key> <ttl>: set the key to expire after ttl seconds
PEXPIRE <key> <ttl>: set the key to expire after ttl milliseconds
EXPIREAT <key> <timestamp>: set the key to expire at the given UNIX timestamp in seconds
PEXPIREAT <key> <timestamp>: set the key to expire at the given UNIX timestamp in milliseconds

The first three are eventually converted to PEXPIREAT. The database maintains an expires dictionary whose keys are pointers to the key objects and whose values are the expiration timestamps, recording the relationship between keys and their expiration times.

  1. PERSIST removes a key's expiration time; TTL returns the remaining lifetime of the key in seconds, and PTTL returns it in milliseconds.
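A usage sketch of the expiration commands above (redis-py client, made-up key name):

```python
import time
import redis

r = redis.Redis(decode_responses=True)

r.set("session", "abc")
r.expire("session", 30)                      # EXPIRE: live for 30 seconds
print(r.ttl("session"), r.pttl("session"))   # TTL in seconds, PTTL in milliseconds

r.pexpire("session", 60_000)                 # PEXPIRE: 60 000 ms
r.expireat("session", int(time.time()) + 120)              # EXPIREAT (seconds timestamp)
r.pexpireat("session", int(time.time() * 1000) + 120_000)  # PEXPIREAT (ms timestamp)

r.persist("session")                         # PERSIST: remove the expiration
print(r.ttl("session"))                      # -1 means no expiration is set
```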

  2. Three ways to delete expired keys:

    

Timed deletion: when the expiration time is set, create a timer that deletes the key when it fires; memory-friendly but CPU-unfriendly.
Lazy deletion: the key is checked only when it is accessed, and deleted if expired; CPU-friendly but memory-unfriendly.
Periodic deletion: delete expired keys at intervals. Effectiveness depends on the duration and frequency of each run: if runs are too long or too frequent it degenerates into timed deletion and wastes CPU; if too short or too infrequent it degenerates into lazy deletion and leaks memory.
  1. Redis uses lazy deletion and periodic deletion;

    

For lazy deletion, an expiration-check function runs before every command that reads or writes a key: if the key has expired it is deleted first, and only then does the command proceed.
For periodic deletion, each run pulls out a random sample of keys for expiration checks and uses a progress cursor to record how far the scan has gone, so the next run can continue from there.
  1. Expired keys do not end up in newly generated RDB files. When loading an RDB file, a master server filters out expired keys; a slave server loads everything, but the expired keys are cleared when the slave next synchronizes with its master.

  2. Expired keys also have no lasting effect on the AOF file: once a key is removed by lazy or periodic deletion, an explicit delete command is appended to the AOF file. AOF rewriting, like RDB file generation, simply skips keys that have already expired.

  3. When the server runs in replication mode, deletion of expired keys on the slave is controlled by the master:

    

When the master deletes an expired key, it explicitly sends a DEL command to the slave.
When an expired key is read on the slave, the slave does not lazily delete it; it behaves as if the key had not expired.
The slave deletes the key only after receiving the DEL command from the master. This keeps master and slave data consistent.
  1. Redis data lives in memory, so it must also be saved to disk to avoid losing data on power failure or shutdown. There are two persistence methods: RDB and AOF. While the server is loading an RDB file it blocks until loading completes.

  2. RDB persistence uses SAVE or BGSAVE to generate an RDB file that saves the data.

SAVE blocks the server process while saving, so no client requests can be handled; BGSAVE forks a child process to do the saving, so the server is not blocked.

RDB loading happens automatically at server startup; there is no load command. Note that because the AOF file is usually more up to date, the server prefers the AOF file when both exist. While a BGSAVE is running, SAVE commands from clients are rejected so the parent and child processes do not compete for resources, and further BGSAVE commands are also rejected to avoid two child processes conflicting.

BGREWRITEAOF and BGSAVE cannot run at the same time: if BGSAVE is running, BGREWRITEAOF is delayed until it finishes; if BGREWRITEAOF is running, BGSAVE commands from clients are rejected. Both commands are carried out by child processes, and they are not allowed to run concurrently mainly for disk I/O performance reasons.

  1. The user can set multiple save criteria through the Save option, and BGSAVE is executed when any one of them is met.

    

Format: save <time> <ops>

Explanation: making at least ops modifications to the database within time seconds satisfies the BGSAVE condition.

  1. The server maintains a saveparams array whose elements are structures containing two properties:

    

the time interval and the number of modifications;

    

The server also maintains a dirty counter recording how many modifications have been made since the last successful save;

lastsave is a UNIX timestamp recording the time of the last successful save. serverCron runs every 100 ms to perform maintenance on the running server; one of its tasks is to check whether any save condition set by the save option is met and, if so, to execute BGSAVE, as sketched below.
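A rough sketch of that check with the structures reduced to plain Python; the field names, the helper function, and the example save lines are illustrative assumptions, not the actual serverCron code:

```python
import time
from dataclasses import dataclass

@dataclass
class SaveParam:
    seconds: int   # time window of the save condition
    changes: int   # modifications required within that window

# e.g. something like the common defaults: save 900 1 / save 300 10 / save 60 10000
saveparams = [SaveParam(900, 1), SaveParam(300, 10), SaveParam(60, 10000)]

dirty = 42                     # modifications since the last successful save
lastsave = time.time() - 400   # UNIX timestamp of the last successful save

def should_bgsave() -> bool:
    """Return True if any 'save <time> <ops>' condition is currently met."""
    elapsed = time.time() - lastsave
    return any(dirty >= p.changes and elapsed >= p.seconds for p in saveparams)

if should_bgsave():
    print("would trigger BGSAVE here")   # 42 changes in 400s satisfies 'save 300 10'
```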

  1. The RDB file structure is a bit more complicated, see page 125.

  2. RDB stores the data itself, while AOF stores the write commands that produced it; both achieve persistence.

  3. The server state contains an aof_buf buffer used to accumulate commands: each write command is converted to the protocol format and appended to the end of the buffer. For AOF writing and syncing, there are three option values that determine how aof_buf is flushed:

    

always: write and sync the buffer to the AOF file on every event loop iteration; the safest, but the slowest.
everysec: sync once per second; a compromise between efficiency and safety (the default).
no: never actively sync; when the data reaches disk is left to the operating system, which is the least safe.
  1. AOF rewriting iterates over the data currently in the database and generates the minimal set of commands needed to recreate it, so the rewritten file contains no redundant commands and is much smaller. The rewrite is done by a child process; to avoid losing writes that happen during the rewrite, the parent appends every new write command both to aof_buf and to an AOF rewrite buffer (writes go to a buffer before reaching disk). When the rewrite finishes, the contents of the rewrite buffer are appended to the new file, the new file is renamed, and it atomically replaces the existing AOF file, completing the swap.

  2. Redis uses the Reactor pattern with I/O multiplexing, which is why it is single-threaded yet efficient. Like single-threaded NIO, it has sockets, a multiplexer, a dispatcher, and the corresponding handlers; the I/O multiplexer queues ready sockets internally so they are delivered to the dispatcher in an orderly way. In order of performance, the available multiplexing implementations are evport, epoll, kqueue, and select.

  3. When a client writes to or closes its socket, the socket becomes readable and an AE_READABLE event is generated on the server. If a socket is both readable and writable at the same time, the readable event is processed first.

  4. Redis registers a number of file event handlers, including a connection-accept handler, a command-request handler, a command-reply handler, a replication handler, and so on. A complete request flows as follows (a minimal reactor sketch follows the list):

    

1. The server's listening socket has the connection-accept handler bound to it.
2. When a client connects, the accept handler runs, creates a socket for that client, and associates its AE_READABLE event with the command-request handler, ready to receive commands.
3. When a command arrives it is executed; the socket's AE_WRITABLE event is then associated with the command-reply handler, ready to write the reply back to the client.
4. Once the reply is written, the socket goes back to waiting for readable events.
P.S. This is essentially the same as Java NIO: everything is event-driven; the server socket listens on a port, and once a connection arrives a per-client socket is created and used for all further processing.
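A minimal Reactor-style sketch of the same flow using Python's selectors module; the port, the handler names, and the canned +OK reply are my own choices for illustration, not Redis code:

```python
import selectors
import socket

sel = selectors.DefaultSelector()

def handle_accept(server_sock):
    """The 'connection accept handler': register the new client for read events."""
    client, _ = server_sock.accept()
    client.setblocking(False)
    sel.register(client, selectors.EVENT_READ, handle_read)

def handle_read(client):
    """The 'command request handler': read a request and write back a reply."""
    data = client.recv(4096)
    if not data:                    # client closed its end of the socket
        sel.unregister(client)
        client.close()
        return
    client.send(b"+OK\r\n")         # stand-in for the command reply handler

server = socket.socket()
server.bind(("127.0.0.1", 7777))    # assumed port for the sketch
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, handle_accept)

while True:                         # the event loop: wait for events, then dispatch
    for key, _ in sel.select():
        key.data(key.fileobj)       # call the handler bound to the ready socket
```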
  1. There are two types of time events: one-shot timed events and periodic events.

The server assigns each time event a globally unique, increasing ID: a newer event always has a larger ID than an older one.

    

when: a millisecond-precision UNIX timestamp recording when the event should fire.
timeProc: the event handler, a function called when the event arrives.
Whether an event is one-shot or periodic depends on timeProc's return value: if it returns AE_NOMORE the event is one-shot and is deleted after firing once; otherwise it is periodic and the server updates the event's when attribute based on the return value.
  1. All time events are kept in an unordered linked list, with new events added at the head. Because the list is unordered, the whole list must be traversed to find the events that are due; this does not hurt performance because a server normally has only one or two time events, such as the serverCron function that checks server state.

  2. For event scheduling, the server calculates when the nearest time event is due and blocks waiting for file events for at most that long (the difference between that time and now). File events are therefore handled first; once all ready file events have been processed, the due time events are handled, and the loop repeats.

  3. Both time events and file events are processed in an orderly, atomic way; file event handlers give up control voluntarily when needed, so time events are not starved.

  4. For every client connected to the server, the server keeps a "client state" structure that records, among other things:

    

1. the socket descriptor
2. the client name
3. the client flags
4. a pointer to the database the client is currently using, and its number
5. the command the client is executing, its arguments, the argument count, and a pointer to the command's implementation function
6. the client's input buffer and output buffers
7. the client's replication state and the data structures replication needs
8. the data structures used when the client runs blocking list commands such as BRPOP and BLPOP
9. the client's transaction state and the data structures used by the WATCH command
10. the data structures used for publish/subscribe
11. the client's authentication flag
12. the time the client was created, the time of its last communication with the server, and the time its output buffer first exceeded the soft limit
  1. Client states are found by traversing the linked list that holds them. The two pseudo-clients used for loading AOF files and executing Lua scripts have an fd attribute of -1; real clients always have a non-negative fd. A client also has a name attribute, which makes individual clients easier to distinguish.

  2. The client's flags attribute records the client's role(s); see p165 for the details. Note that PUBSUB and SCRIPT LOAD are forced into the AOF file even though they modify no keys, because they do change server state.

  3. The input buffer grows and shrinks dynamically but may not exceed 1 GB, otherwise the client is closed. After parsing, the client state holds the command's arguments and the argument count; the server then looks up the corresponding execution function in the command table (a dictionary) and executes it.

  4. Each client has two output buffers: one of fixed size for short replies and one of variable size for larger replies. The variable-size buffer is really a linked list of string objects that together form a long reply. An authentication attribute records whether the client has authenticated, and two time attributes record when the client was created and when it last interacted with the server.

  5. When a normal client connects, its client state is appended to the end of the server's clients list. A client connection can be closed for many reasons: its reply is too large, its input is too large, it sends a malformed request, or it times out (timeouts do not apply while a master or slave is acting as a client for certain commands). As for pseudo-clients, the AOF pseudo-client is created when loading begins and closed when loading ends, while the Lua pseudo-client is created at server initialization and lives until the server shuts down.

  6. Command processing works like this: the client sends a command request, the server receives it, parses it, and hands it to the command executor.

  7. The executor looks up the command named by argv[0] in the command table (a dictionary) and stores the result in the client state's cmd property. The dictionary's keys are command names and its values are redisCommand structures that record each command's implementation details, including:

    

proc: a pointer to the function that implements the command
arity: the number of arguments the command takes; a value of -N means the command takes at least N arguments
milliseconds: the total time the server has spent executing this command

The preparatory actions are then performed, including:

    

1. check that the looked-up command actually exists
2. check the number of arguments
3. check whether the client has authenticated
4. if maxmemory is enabled, check the available memory first and reclaim memory if necessary
5. if the last BGSAVE reported an error, decide whether write commands may continue
6. if the client is in subscribe mode, only allow the subscription-related commands
7. if the server is loading data, only allow commands that carry the loading flag
8. if the server is executing a Lua script, or the client is inside a transaction, apply the corresponding special checks
9. if the server has MONITOR clients attached, send the command details to the monitors

The executor then calls the command's implementation through the client's cmd pointer, passing a pointer to the client state as the argument, since the client state holds everything the command needs. Finally some follow-up work is done, such as writing the slow log if needed, updating the command's milliseconds counter, propagating the command to the AOF file, and propagating it to the slave servers.
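A toy sketch of the lookup-and-check flow just described, with a tiny hand-made command table; the commands, handlers, and error strings are illustrative only:

```python
# Toy command table: name -> (implementation function, arity),
# mirroring the idea of looking up argv[0] in a dictionary.
def cmd_ping(argv):
    return "+PONG"

def cmd_echo(argv):
    return argv[1]

COMMANDS = {
    "ping": (cmd_ping, 1),   # exactly one argument (the command name itself)
    "echo": (cmd_echo, 2),   # exactly two arguments
}

def execute(argv):
    entry = COMMANDS.get(argv[0].lower())
    if entry is None:
        return f"-ERR unknown command '{argv[0]}'"
    proc, arity = entry
    # a negative arity would mean "at least -arity arguments"; these are exact
    if (arity >= 0 and len(argv) != arity) or (arity < 0 and len(argv) < -arity):
        return "-ERR wrong number of arguments"
    return proc(argv)

print(execute(["PING"]))           # +PONG
print(execute(["ECHO", "hello"]))  # hello
```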

  1. The server writes the reply into the client's output buffer, associates the socket's writable event with the reply handler, and sends the reply; the client then displays it.

  2. The serverCron function, which does the following:

    

1. Update the server's cached time (some code reads the time so often that a cache is kept; code that needs high precision still queries the system clock directly).
2. Update the LRU clock, and update the commands-processed-per-second estimate using sampling.
3. Update the server's peak memory usage record, and process the SIGTERM signal.
4. Manage client resources: release clients whose connections have timed out, and if a client's input buffer has grown too large, free it and recreate one of the default size so client input buffers do not consume too much memory.
5. Manage database resources.
6. Execute a delayed BGREWRITEAOF if one is pending.
7. Check the status of any running persistence operation.
8. Write the contents of the AOF buffer to the AOF file.
9. Close clients that need to be closed asynchronously.
10. Increment the cronloops counter.
  1. Server initialization: set default parameters; load configuration options such as the port and the number of databases; initialize server data structures such as server.clients and server.db; set up signal handlers, create shared objects, open the listening port, and create the time event for serverCron; prepare the AOF persistence machinery and initialize the background BIO module; restore the database state, using the AOF file if AOF is enabled and the RDB file otherwise; finally, start the event loop and get to work.

  2. The SLAVEOF command creates a master-slave relationship: sending SLAVEOF <master IP> <master port> to a server makes it a slave of that master, and the two servers then keep the same database state.

  3. In older versions, Redis replication consisted of two operations: synchronization, which brings the slave's state in line with the master's, and command propagation, which restores consistency whenever a write changes the master's state. Synchronization works like this: when SLAVEOF is executed, the slave sends SYNC to the master; the master runs BGSAVE to generate an RDB file and records all write commands issued from that point on; when BGSAVE finishes, the master sends the RDB file to the slave, which loads it; the master then sends the buffered write commands, after which the two are identical and command propagation keeps them consistent. The drawback is that a reconnection after a broken link still uses a full SYNC, which consumes a lot of system resources and is inefficient.

  4. In the new version, the PSYNC command is used instead of SYNC to perform synchronization during replication. PSYNC supports full and partial resynchronization modes.

Full resynchronization is used for the initial replication and follows essentially the same steps as SYNC. Partial resynchronization: the master records the write commands executed while the slave was disconnected and sends only those to the slave when it reconnects.

Partial resynchronization implementation details:

    

Both servers keep a replication offset: every time the master sends N bytes to the slaves it adds N to its offset, and every time a slave receives N bytes it adds N to its own offset, so comparing the two offsets shows whether they are in sync and how far behind the slave is.
The master also keeps a replication backlog buffer: a fixed-length FIFO queue (when it is full, the oldest data is pushed out as new data is pushed in), 1 MB by default. If the data from the slave's offset + 1 up to the master's offset is still in the backlog, the master replies +CONTINUE and performs a partial resynchronization; otherwise a full resynchronization is required. Sizing the backlog correctly is therefore important.
Each server has a run ID. During the first replication the master sends its run ID to the slave, which saves it; after a disconnection the slave sends the saved ID when it reconnects, so the master can tell whether partial resynchronization is possible.
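A hypothetical sketch of the partial-resynchronization decision, with the backlog modeled as a fixed-size byte buffer; the class and method names are mine, and the real server keeps considerably more state:

```python
from collections import deque

BACKLOG_SIZE = 1 * 1024 * 1024            # default backlog size: 1 MB

class Master:
    def __init__(self):
        self.offset = 0                               # master replication offset
        self.backlog = deque(maxlen=BACKLOG_SIZE)     # FIFO of recently sent bytes

    def propagate(self, data: bytes):
        """Send data to the slaves: bump the offset and remember the bytes."""
        self.offset += len(data)
        self.backlog.extend(data)

    def psync(self, slave_offset: int):
        """Decide between partial (+CONTINUE) and full resynchronization."""
        missing = self.offset - slave_offset
        if 0 <= missing <= len(self.backlog):
            buf = bytes(self.backlog)
            return "+CONTINUE", buf[len(buf) - missing:]  # resend only the missing bytes
        return "+FULLRESYNC", None                        # too far behind: full resync

m = Master()
m.propagate(b"SET k v\r\n")
print(m.psync(0))      # the slave's gap still fits in the backlog -> partial resync
```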
  1. PSYNC implementation:

    

If the slave has never replicated before, or SLAVEOF NO ONE was executed, it sends PSYNC ? -1, requesting a full resynchronization. Otherwise it sends PSYNC <runid> <offset>.
If the master replies +FULLRESYNC <runid> <offset>, a full resynchronization follows: runid is the master's run ID and offset is the master's current replication offset, which the slave adopts as its initial offset.
If the master replies +CONTINUE, a partial resynchronization follows and the master sends the slave the data it missed.
If the master replies -ERR, it is running a version older than 2.8 and does not recognize PSYNC.
  1. Implementation of replication

    

1. Set the IP address and port of the master server.
2. Establish the socket connection (just like ordinary network I/O).
3. Send PING: this checks that the socket can be read and written, whether the master can process requests normally, and what the state of the network is; a +PONG reply means everything is fine.
4. Authenticate, depending on whether the master and the slave have the authentication options set (four combinations in total).
5. Send the slave's listening port to the master.
6. Send PSYNC to synchronize. Before this step only the slave acted as a client; from this point on, master and slave each act as both client and server to the other.
7. The master propagates write commands to keep the slave in sync in real time.
  1. During command propagation, the slave sends REPLCONF ACK <offset> to the master once per second, where offset is the slave's current replication offset. This serves three purposes:

    

1. detecting the network state between master and slave;
2. helping implement the min-slaves options;
3. detecting lost commands: if network problems caused the slave to miss some of the data the master sent, the master can tell from the offset the slave reports and resend the missing part.
  1. The Sentinel mechanism monitors master and slave servers; when the master goes offline, one of its slaves is promoted to be the new master and continues handling requests.

  2. Sentinel is started with: redis-sentinel <path to sentinel.conf>. Startup goes through these steps:

    

1. Initialize the server.
2. Replace the code used by a regular Redis server with Sentinel-specific code.
3. Initialize the Sentinel state.
4. Initialize the list of monitored master servers from the given configuration file.
5. Create network connections to the master servers.

Sentinel is essentially a Redis server running in a special mode, so a server is initialized first, but there are a few differences. For example, Sentinel does not use a database, so it does not use an RDB or AOF file to load the database contents.

Commands that operate on database keys, such as SET, DEL, and FLUSHDB, are not used either;

transaction commands (MULTI, WATCH), scripting commands (EVAL), and the RDB/AOF persistence commands (SAVE, BGSAVE, BGREWRITEAOF) are not used;

the replication commands are used internally by Sentinel but cannot be issued by clients;

of the publish/subscribe commands, PUBLISH can only be used internally;

the file event handlers are used internally, though with slightly different handlers attached;

the time event handler is also used internally; in Sentinel mode serverCron additionally calls sentinelTimer.

Using Sentinel-specific code means replacing the regular server's command table with Sentinel's own; because that table contains no key-value commands, those commands cannot be used. A Sentinel client can therefore execute only the following commands:

    

PING
SENTINEL
INFO
SUBSCRIBE
UNSUBSCRIBE
PSUBSCRIBE
PUNSUBSCRIBE

These seven commands.

    

The server initializes a Sentinel state structure that holds all Sentinel-related state.
Its masters dictionary records the monitored master servers: each key is a master's name and the value is the corresponding sentinelRedisInstance structure. An instance structure can represent a master, a slave, or even another Sentinel.
Network connections to the master: Sentinel becomes a client of the master, but it creates two connections, a normal command connection and a subscription connection to the master's _sentinel_:hello channel.
Sentinel sends an INFO command to the master every 10 seconds to get information about the master and about the slaves attached to it, and sends INFO to those slaves as well to get their information. It also publishes messages to the master and slaves containing information about the Sentinel itself, and receives messages from the channel.
It updates its sentinels dictionary, because a monitored master may also be watched by other Sentinels, and this is how they learn about each other. When a Sentinel discovers a new Sentinel, it saves its information and creates a command connection to it, but not a subscription connection.
  1. Checking for subjective offline: if a monitored server gives no valid reply (+PONG, -LOADING, or -MASTERDOWN) within the configured time, Sentinel marks it as subjectively down.

    Sentinel then asks the other Sentinels monitoring that server; if enough of them agree it is down (the configured quorum), the server is judged objectively down and a failover is prepared.

    A leader Sentinel is then elected to perform the failover.

  2. Failover: pick a new master from the slaves of the offline master, make the remaining slaves replicate the new master, and mark the old master as a slave of the new master. SLAVEOF NO ONE is the command that turns a server into a master.

  3. CLUSTER MEET <ip> <port> adds the node at the given IP and port to the cluster of the server the client is connected to. A cluster-mode node is an ordinary server that additionally maintains clusterNode, clusterLink, and clusterState structures.

  4. clusterNode stores detailed information about a node, and its link attribute points to a clusterLink, a structure similar to redisClient that holds connection information such as the socket descriptor and input/output buffers. The difference is that redisClient is used for connections to clients while clusterLink is used for connections to other nodes. clusterState records the cluster state from the current node's point of view.

  5. Implementation principle of CLUSTER MEET:

    

1. Node A shakes hands with node B: it creates a clusterNode structure for B, saves it in its clusterState.nodes dictionary, and sends B a MEET message.
2. On receiving the message, node B likewise creates a clusterNode structure for A to store A's information.
3. Node B returns a PONG message.
4. Node A returns a PING message.
5. Node B receives the PING and the handshake is complete.
  1. The cluster stores data by sharding: the cluster's database is divided into 16384 slots, and every key belongs to exactly one of them.

    If every slot is handled by some node, the cluster is online; otherwise it is offline. Slots are assigned to a node with CLUSTER ADDSLOTS <slot> [slot ...], for example CLUSTER ADDSLOTS 0 1 2 3 4 ... 5000.

    clusterNode's slots attribute (a bit array) and its numslots attribute record which slots the node is responsible for; besides recording them, a node also tells the other nodes which slots it handles.

    clusterState also has a slots array of pointers, where slots[i] points to the node responsible for slot i (empty means the slot is unassigned). Keeping both kinds of records makes it efficient both to broadcast a node's own slot information and to look up which node handles a given slot.

  2. CLUSTER ADDSLOTS first iterates over the given slots to check whether any of them is already assigned; if so it returns an error, otherwise it iterates again and sets both the bit array and the pointer array. When a command is executed in the cluster, the server computes which slot the key belongs to (see the sketch below); if the slot is its own it handles the command, otherwise it returns a MOVED error directing the client to the correct node.
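The key-to-slot mapping itself is not spelled out above; Redis Cluster computes it as CRC16(key) mod 16384 (hash tags aside). A small self-contained sketch of that calculation, using the CRC-16/XMODEM variant:

```python
def crc16(data: bytes) -> int:
    """CRC-16/XMODEM (poly 0x1021, init 0), the checksum used for cluster slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of the 16384 cluster slots (hash tags not handled here)."""
    return crc16(key.encode()) % 16384

print(key_slot("user:1000"))   # some slot number in [0, 16383]
```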

  3. The node checks clusterState.slots[i]: if the entry is not the node itself, it replies MOVED <slot> <ip>:<port>, pointing the client at the responsible node.

    A cluster-mode client is usually connected to more than one node, so when it receives a MOVED error it simply switches to the socket for the right node and re-executes the command; if it has no connection yet, it connects first and then retries.

    A cluster node's database works the same way as a standalone database, except that only database 0 may be used. Resharding lets slots that are already assigned to one node be reassigned to another, along with their key-value pairs, and can be done online.

    Resharding is carried out by the cluster management tool redis-trib: Redis provides all the required commands, and redis-trib sends them to the source and target nodes. The steps are:

        

    1. redis-trib sends CLUSTER SETSLOT <slot> IMPORTING <source_id> to the target node, preparing it to import the slot's key-value pairs.
    2. redis-trib sends CLUSTER SETSLOT <slot> MIGRATING <target_id> to the source node, preparing it to migrate the slot to the target node.
    3. redis-trib sends CLUSTER GETKEYSINSLOT <slot> <count> to the source node to obtain up to count key names belonging to slot <slot>.
    4. For each key name obtained, redis-trib sends MIGRATE <target_ip> <target_port> <key_name> 0 <timeout> to the source node, which migrates the key atomically to the target node.
    5. Steps 3 and 4 repeat until every key in the slot has been migrated.
    6. redis-trib sends CLUSTER SETSLOT <slot> NODE <target_id> to any node in the cluster, which tells the whole cluster that the slot now belongs to the target node.

If the slot the client is operating on is in the middle of being migrated, the server returns an ASK error directing the client to the target node, where the client re-sends the command it wanted to execute.

  1. In the implementation, clusterState's importing_slots_from[i] records that the current node is importing slot i from another node, and migrating_slots_to[i] records that the node is migrating slot i to another node; it is in this situation that the client is told about the target node;

    the client first sends ASKING (which sets the client's asking flag) and then re-sends the command to the target node.

    The target node checks the client's asking flag: if it is set, it executes the command for this one time even though the slot is not formally its own yet; otherwise it returns a MOVED error.

    A MOVED error means responsibility for the slot has permanently moved, so the client keeps going to the new node from then on. An ASK error is normally a one-off: after the migration finishes, the new node owns the slot and subsequent requests find it there directly, so the ASK error occurs only once unless the slot is later moved again.

  2. Cluster nodes are divided into master and slave nodes. Masters handle slots; when a master fails, one of its slaves becomes the new master, with the other masters of the cluster taking part in electing it, and the new master takes over the old master's slots through slot transfer;

    When the old master comes back online, it becomes a slave of the new master. Sending the following command to a node:

        

    CLUSTER REPLICATE <node_id> 

    makes the receiving node a slave of node_id and starts replicating that master.

    Nodes send PING messages to detect faults: if a node does not reply in time it is marked PFAIL (probably failed). Nodes also constantly exchange this state information with one another.

    A node is marked FAIL once more than half of the masters consider it PFAIL;

    Next comes the failover: a new master is selected from the failed master's slaves and takes over, which involves a vote to elect the new master.

  3. Nodes communicate via messages, of which there are five main types:

        

    MEET: ask the receiver to join the sender's cluster;
    PING: check whether a node is online;
    PONG: reply to a PING;
    FAIL: tell all other nodes that some node has gone offline;
    PUBLISH: when a node receives a PUBLISH command it executes it and also forwards it to the other nodes, fanning out across the cluster like a chain reaction.

    A message is structured a bit like an HTTP message: it has a header carrying information about the sender, including the message length, the message type, the number of node-info entries carried in the body, the sender's configuration epoch (for a slave, its master's configuration epoch), the sender's name, the sender's current slot assignment, the name of the master it is currently replicating, the sender's port, the sender's flags, and the sender's view of the cluster state; after the header comes the message body (the content).

  4. MEET, PING, and PONG messages carry information about two randomly selected known nodes when they are sent;

    FAIL messages spread quickly, immediately informing all masters that a given master has gone offline.

    A PUBLISH sent to one node causes every node in the cluster to publish the message to the channel.

  5. If clients have subscribed to a channel, a message published to that channel is delivered to all of its subscribers;

    Besides SUBSCRIBE there is PSUBSCRIBE, which subscribes a client to a pattern: whenever a message is published to a channel, it is received both by the channel's subscribers and by subscribers of any pattern that matches the channel.

  6. When a client executes a SUBSCRIBE command to SUBSCRIBE to a channel or channels, a subscription relationship is established between the client and the channel to which it is subscribed.

    Redis stores all channel subscriptions in the server state's pubsub_channels dictionary: each key is a subscribed channel and the value is a linked list of all clients subscribed to that channel. Pattern subscriptions are stored by the server in a similar way.

    The pubsub_patterns attribute is a linked list whose nodes each represent one subscription relationship: the node's pattern attribute records the subscribed pattern and its client attribute records the client that subscribed to it.

  7. Commands to view subscription information:

    

1. PUBSUB CHANNELS [pattern]: return the channels currently being subscribed to (optionally matching a pattern)
2. PUBSUB NUMSUB [channel-1 ... channel-n]: return the number of subscribers for each given channel
3. PUBSUB NUMPAT: return the number of patterns currently subscribed to on the server
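A usage sketch of channel and pattern subscription plus the PUBSUB introspection commands, with the redis-py client (channel names are made up):

```python
import redis

r = redis.Redis(decode_responses=True)

p = r.pubsub()
p.subscribe("news")             # SUBSCRIBE to a channel
p.psubscribe("news.*")          # PSUBSCRIBE to a pattern

r.publish("news", "hello")      # delivered to the channel subscriber
r.publish("news.sports", "hi")  # delivered to the pattern subscriber

# Drain a few messages; the first ones are the subscribe confirmations.
for _ in range(6):
    msg = p.get_message(timeout=1)
    if msg:
        print(msg["type"], msg.get("channel"), msg.get("data"))

print(r.pubsub_channels())      # PUBSUB CHANNELS
print(r.pubsub_numpat())        # PUBSUB NUMPAT
```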
  1. Transactions provide a way to package multiple command requests and then execute them all at once, in order; during the transaction's execution the server does not interrupt it to serve other requests.

    The commands to implement the transaction function are:

        

    MULTI;
    EXEC;
    WATCH;

    and related commands such as DISCARD.

    The implementation of a transaction typically goes through three phases:

        

    1. transaction start; 2. command enqueueing; 3. transaction execution.
  2. The execution of the MULTI command marks the beginning of a transaction, which switches the client to the transaction state.

    Command enqueueing and execution:

        

    If the client sends EXEC, DISCARD, WATCH, or MULTI, the server executes it immediately; any other command is placed in the transaction queue and QUEUED is returned.
    Each client has its own transaction state, stored in the mstate property of the client state. The transaction state contains a transaction queue and a counter of enqueued commands; the queue is a multiCmd array in which each element holds one queued command's implementation function, its arguments, and the argument count.
    When the client sends EXEC, the server executes every command in the transaction queue and returns all of their results.
  3. The WATCH command is an optimistic lock that monitors any number of database keys and, when executing the EXEC command, checks to see if at least one of the monitored keys has been modified and, if so, rejects the transaction and returns an empty reply.

    Each database has a watched_keys dictionary: its keys are the database keys being monitored by WATCH, and each value is a linked list of the clients watching that key. Any command that modifies a watched key triggers a notification function which, using watched_keys, turns on a "dirty" flag for every client watching that key. When such a client later calls EXEC, the transaction is rejected because a watched key was modified (a usage sketch follows).
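A usage sketch of WATCH as an optimistic lock around MULTI/EXEC, using redis-py's pipeline (key name and amounts are made up):

```python
import redis

r = redis.Redis(decode_responses=True)
r.set("balance", 100)

with r.pipeline() as pipe:
    while True:
        try:
            pipe.watch("balance")              # WATCH the key
            current = int(pipe.get("balance")) # read while watching
            pipe.multi()                       # MULTI: start queueing commands
            pipe.set("balance", current + 10)
            pipe.execute()                     # EXEC: aborts if "balance" changed
            break
        except redis.WatchError:
            continue                           # a watched key changed: retry

print(r.get("balance"))                        # -> "110"
```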

  4. ACID properties of transactions:

    

Atomicity: either all of the queued commands are executed or none are, so Redis transactions are atomic.
Consistency: if the database is consistent before the transaction, it remains consistent afterwards regardless of whether the transaction succeeds.
Isolation: even with multiple tasks running against the database, transactions do not interfere with each other, and the result is the same as executing them serially.
Durability: transactions under RDB are not durable (RDB saving is only triggered when its conditions are met, so data can be lost if the server goes down); transactions are durable only with AOF enabled and appendfsync set to always.
  1. The SORT command sorts the elements of a list, set, or sorted set key. Its options include BY, ASC, DESC, ALPHA, LIMIT, STORE, and GET.

  2. Usage and implementation of the SORT command (a client-side usage sketch follows the block below):

    

SORT <key>: sorts a key whose elements are all numbers. The implementation creates a redisSortObject array as long as the number of elements; each entry has two attributes, an obj pointer and a double u. obj points to the element and u holds its numeric value, and the array is sorted by u.
SORT <key> ALPHA: sorts a key containing strings. A redisSortObject array is created, each obj points to one element, the array is sorted by comparing the strings obj points to, and the sorted order is returned.
SORT <key> BY <pattern>: sorts by the value of an external weight key. Each obj points to an element; the array is traversed, the matching weight key is found from the element and the pattern, the weight is converted to a double and stored in u, and the array is sorted by u to give the new order.
SORT <key> BY <pattern> ALPHA: the same as BY, except the weight keys hold strings, so the array is sorted by comparing those strings; the elements obj points to are then returned in that order.
SORT <key> [options] LIMIT <offset> <count>: offset is the number of sorted elements to skip and count is how many to return after skipping that many.
SORT <key> [options] GET <pattern>: instead of returning the sorted elements themselves, GET returns the values of external keys derived from them; after sorting, the array is traversed, the key matching the GET pattern is found for each element, and that key's value is returned. Multiple GET options return multiple values per element.
SORT <key> [options] STORE <new key>: saves the sort result under a new key. Implementation: sort; if the destination key exists, delete it; create a new list key; traverse the sorted array and push the elements into the list in order; then return the sorted result to the client.
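A usage sketch of the SORT options described above, with the redis-py client (all key names and weights are made up):

```python
import redis

r = redis.Redis(decode_responses=True)
r.delete("ids", "names", "ids:sorted")

r.rpush("ids", "3", "1", "2")
print(r.sort("ids"))                    # SORT            -> ['1', '2', '3']
print(r.sort("ids", desc=True))         # DESC            -> ['3', '2', '1']
print(r.sort("ids", start=0, num=2))    # LIMIT 0 2       -> ['1', '2']

r.rpush("names", "banana", "apple", "cherry")
print(r.sort("names", alpha=True))      # ALPHA           -> lexicographic order

# BY and GET: sort by external weight keys and fetch external values
for i in ("1", "2", "3"):
    r.set(f"weight_{i}", 4 - int(i))    # weight_1=3, weight_2=2, weight_3=1
    r.set(f"name_{i}", f"user{i}")
print(r.sort("ids", by="weight_*", get="name_*"))  # -> ['user3', 'user2', 'user1']

r.sort("ids", store="ids:sorted")       # STORE the sorted result in a new list key
```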
  1. The steps always run in this order: sort -> LIMIT the length -> GET the external keys -> STORE the result set -> return the result to the client; a later step never runs before an earlier one, and the order in which the options appear in the command does not change this execution order.

  2. Redis provides the SETBIT, GETBIT, BITCOUNT, and BITOP commands for handling bit arrays. SETBIT sets the bit at a given offset of a bit array; offsets are counted from 0, the underlying buffer stores the bits in reverse order, and the value written can be 0 or 1. Usage:

    

SETBIT <key> <offset> <0|1>: set the bit at the given offset
GETBIT <key> <offset>: return the bit at the given offset
BITCOUNT <key>: count the number of bits set to 1 in the bit array
BITOP <AND|OR|XOR|NOT> <destkey> <key> [key ...]: perform bitwise AND, OR, or XOR across several bit arrays, or NOT to invert a single bit array, storing the result in destkey
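A usage sketch of the bit-array commands above (redis-py client, made-up key names):

```python
import redis

r = redis.Redis(decode_responses=True)
r.delete("bits:a", "bits:b", "bits:dest")

r.setbit("bits:a", 2, 1)                          # SETBIT
r.setbit("bits:a", 7, 1)
print(r.getbit("bits:a", 7))                      # GETBIT   -> 1
print(r.bitcount("bits:a"))                       # BITCOUNT -> 2

r.setbit("bits:b", 7, 1)
r.bitop("AND", "bits:dest", "bits:a", "bits:b")   # BITOP AND across two arrays
print(r.bitcount("bits:dest"))                    # -> 1
r.bitop("NOT", "bits:dest", "bits:a")             # BITOP NOT takes a single source key
```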
  1. Redis uses string objects (SDS) to represent arrays of bits, and it also uses the manipulation functions of SDS structures to handle arrays of bits.

  2. The slow query log records commands whose execution time exceeds a configured limit. Two server options control it:

    

slowlog-log-slower-than: log command requests whose execution takes longer than the given number of microseconds
slowlog-max-len: the maximum number of slow query log entries the server keeps
  1. The SLOWLOG GET command displays slow query logs.

  2. Sending the MONITOR command turns the client into a monitor that receives and prints, in real time, every command request the server processes.
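A usage sketch of the slow log options and SLOWLOG GET with the redis-py client; the thresholds are arbitrary examples. MONITOR is easiest to try from a shell with `redis-cli monitor`.

```python
import redis

r = redis.Redis(decode_responses=True)

# Log any command slower than 10 000 microseconds, keep at most 128 entries.
r.config_set("slowlog-log-slower-than", 10000)
r.config_set("slowlog-max-len", 128)

for entry in r.slowlog_get(10):   # SLOWLOG GET: the most recent slow-log entries
    print(entry)
```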