Trust me and read on; if you gain nothing from it, you can come after me with a knife!

Preface

Let’s start with the CPU cache. The CPU cache is a small, fast memory that sits between the CPU and main memory and exchanges data with the CPU at high speed. It works like this: when the CPU wants to read a piece of data, it first looks it up in the cache; if found, the data is read immediately and handed to the CPU for processing. If not found, the data is read from the comparatively slow main memory instead, and at the same time the block containing it is transferred into the cache, so that future reads of that block come from the cache without touching main memory.

Why introduce a CPU cache at all? To explain, first understand how a program executes: it is loaded from the hard disk into memory, and then fed to the CPU for computation. Since memory and disk are far slower than the CPU, the CPU would otherwise have to wait on them at every step. Caching was introduced to resolve this mismatch: the cache runs at CPU speed, and the CPU reads from cache much faster than from memory, which improves overall system performance. Mainstream CPUs currently have L1 and L2 caches, and high-end CPUs even have an L3 cache.

That covers the CPU cache. With this background, the rest of the article should go down easily; if anything is still unclear, just keep reading.

Here we go! The caches we talk about in development serve the same purpose as the CPU cache above, but they are not the same thing. When we write a server program, the point of using a cache is to reduce the number of database accesses, easing the load on the database and improving application response time. Depending on the scenario, this plays out in many ways:

  • The program queries the database frequently, but the results are always the same; can the unchanging part of the results be cached?
  • A query takes a long time to compute; can its result be cached?
  • Some data, such as the current user’s profile, is needed on every page of the site; can it be cached?
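
To make the idea concrete, here is a minimal cache-aside sketch in Python. The names (`fetch_user_from_db`, `db_calls`) and the plain dict standing in for Redis are illustrative assumptions, not a real API; with redis-py you would call `r.get` / `r.setex` instead.

```python
import time

# Toy "database" whose queries we want to cache.
db_calls = 0

def fetch_user_from_db(user_id):
    global db_calls
    db_calls += 1          # count expensive "database" hits
    return {"id": user_id, "name": f"user-{user_id}"}

# A plain dict stands in for Redis here.
cache = {}
TTL = 60  # seconds a cached entry stays valid

def get_user(user_id):
    key = f"user:{user_id}"
    hit = cache.get(key)
    if hit is not None and hit[1] > time.time():
        return hit[0]                      # cache hit: no DB access
    value = fetch_user_from_db(user_id)    # cache miss: query the DB
    cache[key] = (value, time.time() + TTL)
    return value

get_user(1); get_user(1); get_user(2)
print(db_calls)  # → 2 (second read of user 1 was served from cache)
```

The second lookup of the same user never touches the database, which is exactly the pressure relief described above.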

There are many ways to implement a cache: the Cache in Google’s Guava library, distributed caches such as Redis and memcached, EHcache, or a hand-rolled cache (for example, one built on a static Map). Here we will cover the most commonly used one, Redis, with some simple operations along the way. Haven’t downloaded it yet? Don’t panic: grab it from download.redis.io/releases, and once that is done, see my Redis tutorial series for the setup steps.

Redis data types and memory principles

Redis is a high-performance key-value store written in C. It runs in memory but can be persisted to disk. It offers a variety of data structures, such as String, hash, list, set, and zset (sorted set), as well as some advanced ones: HyperLogLog, Geo, and Pub/Sub. Each type is backed by one or more internal encodings underneath (built on redisObject, SDS, and so on).

What makes Redis so strong?

  • High performance: Redis can read about 110,000 times per second and write about 81,000 times per second.
  • Rich data structures: String, list, dict, set, zset, and HyperLogLog for cardinality estimation.
  • Atomicity: individual Redis operations are atomic, meaning they either complete fully or not at all. Multiple operations can also be wrapped in a transaction between the MULTI and EXEC commands.
  • Persistence: both AOF and RDB are supported for data backup.
  • Rich features: Redis also supports pub/sub, notifications, key expiration, and more.

Introduction to Data Types

String(String)

A Redis string is a sequence of bytes. Redis strings are binary safe: they carry an explicit length and need no terminating character, so a single value can hold anything up to 512 MB. Example:

  • SET key value stores a string value under the given key

Hash (hash table)

A Redis hash is a collection of key-value pairs: a mapping between string fields and string values, which makes it well suited to representing objects.

Example:

  • HSET stores a field-value pair under a hash key (hset key field value)
  • HGET retrieves the value of a field in a hash (hget key field)

For instance, when Xiaoming’s record is stored as a Redis hash, what the hash really holds is the fields (age, score) with the values (18, 99).

List (list)

Redis lists are simple lists of strings, kept in insertion order. You can push elements onto the head or the tail of a list, and duplicate elements are allowed.

Example:

  • LPOP key removes one element from the left; RPOP key removes one element from the right
  • LLEN key returns the number of elements in the list, the equivalent of SELECT COUNT(*) in a relational database.
  • LRANGE key start stop returns all elements from index start to stop; Redis list indexes begin at 0.

Set (Set)

A Redis set is an unordered collection of strings in which no duplicates are allowed.

Example:

  • SADD key member adds a string element to the set stored at key; returns 1 on success, or 0 if the element was already in the set.
  • SMEMBERS key returns all elements of the set stored at key.

Sorted set (zset)

Redis sorted sets, like sets, are collections of string elements with no duplicate members. The difference is that each element is associated with a double-precision score, and Redis uses these scores to order the members from smallest to largest. Members of a sorted set are unique, but scores may repeat.

Example:

  • ZADD key score member adds one or more members with their scores to a sorted set
  • ZRANGE key start stop returns members by rank; 0 and -1 mean from the first element to the last (similar to the LRANGE command)
  • ZRANGEBYSCORE key min max returns members ordered by score within the given range
  • ZREMRANGEBYSCORE key min max deletes the members whose scores fall in the range
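
To see the sorted-set semantics in one place, here is a toy in-memory model. This is my own sketch of the behaviour, not how Redis implements zsets (the real thing uses a skiplist plus a hash table underneath).

```python
# A minimal model of a Redis sorted set (zset): members are unique, each
# carries a float score, and range queries go by rank or by score.
class ToyZSet:
    def __init__(self):
        self.scores = {}                 # member -> score

    def zadd(self, score, member):
        self.scores[member] = score      # re-adding a member updates its score

    def _sorted(self):
        # order by score, then by member (Redis orders score ties lexically)
        return sorted(self.scores.items(), key=lambda kv: (kv[1], kv[0]))

    def zrange(self, start, end):
        items = self._sorted()
        end = len(items) if end == -1 else end + 1   # -1 means "last element"
        return [m for m, _ in items[start:end]]

    def zrangebyscore(self, lo, hi):
        return [m for m, s in self._sorted() if lo <= s <= hi]

z = ToyZSet()
z.zadd(99, "math"); z.zadd(60, "english"); z.zadd(80, "physics")
print(z.zrange(0, -1))          # → ['english', 'physics', 'math']
print(z.zrangebyscore(70, 100)) # → ['physics', 'math']
```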

Redis memory model

I’m going to introduce Redis briefly, from the underlying memory layout up to the level of everyday use. If you want to reach the peak of your career (and marry rich and good-looking while you’re at it), read carefully, read obediently, and learn!

You can inspect memory usage with the INFO memory command. As an in-memory database, Redis keeps its data chiefly in RAM, and that memory breaks down into several parts:

  • Data: As a database, data is the most important part;
  • Memory required by the process itself: The main Redis process itself must use memory, such as code, constant pool, etc.
  • Buffer memory: buffer memory includes client buffer, replication backlog buffer, AOF buffer, etc.
  • Memory fragmentation: Memory fragmentation is generated when Redis allocates and reclaims physical memory.

Redis data memory

The figure above shows the data model built when executing SET hello world.

Redis is a database of key-value pairs. Each pair is held in a dictEntry, which contains pointers to the key and the value; its next pointer links to another dictEntry that hashes to the same bucket and is otherwise unrelated to this pair. Each type has at least two internal encodings. On one hand, this separates the interface from the implementation, so users are unaffected when encodings need to be added or changed; on the other, Redis can switch encodings to suit the usage scenario and improve efficiency.

However, regardless of type, Redis never stores a value directly: every value is wrapped in a redisObject. The redisObject is crucial, because object types, internal encodings, memory reclamation, shared objects, and other features all depend on it.

Another important structure in Redis is the SDS, short for Simple Dynamic String. Instead of using plain C strings (character arrays terminated by the null byte ‘\0’) as its default string representation, Redis uses SDS. Its layout is not covered here but will be described in detail in a later article.

Transactions and pipelines (Pipeline)

As we all know, a transaction refers to a complete action, either all of which is executed or nothing is done.

A (Atomicity): a transaction is the logical unit of work of a database; either all of the operations it contains are performed, or none are.

C (Consistency): executing a transaction must take the database from one consistent state to another. Consistency is closely related to atomicity.

I (Isolation): the execution of a transaction must not be disturbed by other transactions.

D (Durability): once a transaction commits, its changes to the data in the database are permanent.

Redis transactions

A brief introduction to transactions in Redis, starting with several Redis directives: MULTI, EXEC, DISCARD, WATCH, and UNWATCH. These five instructions form the basis of redis transaction processing.

1. MULTI starts a transaction and assembles its commands.
2. EXEC executes all commands in the transaction block.
3. DISCARD cancels the transaction, abandoning every command in the block.
4. WATCH monitors one or more keys; if any of them is changed by another command before the transaction executes, the transaction is aborted.
5. UNWATCH removes the monitoring that WATCH placed on all keys.

Redis transactions can execute multiple commands at once and go through three phases: start transaction, command enqueue, and transaction execution. And with the following three important guarantees:

1. Batch operations are placed in the queue cache before sending EXEC commands.

2. Upon receiving the EXEC command, the transaction is executed; if any command inside it fails, the remaining commands are still executed.

3. During transaction execution, command requests submitted by other clients will not be inserted into the transaction execution command sequence.

In the example above we see QUEUED: while a transaction is being assembled with MULTI, each command is cached in an in-memory queue, and the QUEUED reply means the command was successfully appended to that queue. At EXEC, all the queued commands are executed together as one transaction.

As for persistence: if AOF is enabled, then once the transaction has executed, its commands are written to disk with a single write (persistence is described below).

Errors during a transaction fall into two kinds: those that occur before EXEC is called, and those that occur after.

The problem before calling EXEC

An “error before calling EXEC” may be caused by a syntax error or by running out of memory. Whenever a command fails to be appended to the queue, Redis records it, and when the client later calls EXEC, Redis refuses to execute the transaction. (This has been the policy since 2.6.5; in earlier versions, Redis simply ignored commands that failed to enqueue and executed only those that succeeded.)

Redis then ruthlessly rejects the transaction because of those “previous errors”: (error) EXECABORT Transaction discarded because of previous errors.

Problems after calling EXEC

For “errors after EXEC is called,” Redis takes a completely different approach: it ignores them and carries on with the remaining commands in the transaction. Application-level errors are not something Redis itself needs to handle, so one failing command does not affect the execution of the commands that follow.

At this point many people will ask: didn’t you say one of the properties of a transaction is atomicity, that all of its operations execute or none do? Doesn’t this behaviour violate atomicity? Does it mean Redis transactions are not atomic?

The answer: Redis transactions do not support rollback. If a command fails during a transaction, an error is returned and the following commands still run. Because there is no rollback, a failing command only reports its own error to the client without affecting subsequent commands, which feels very different from a relational database such as MySQL, where transactions are atomic. Hence the assumption that Redis transactions lack atomicity.

In fact, under normal circumstances Redis transactions are atomic: either all commands execute or none do. Recall the “error before EXEC” case described above. After a transaction starts, each command the user enters is checked before being placed on the transaction queue; if the command does not exist or its arguments are wrong, an error is returned to the client and the client’s state is flagged. When the client later issues EXEC, the server simply refuses to run the transaction.

Redis does not support rollback, but it does check every command in a transaction for errors (it cannot catch the programmer’s logic errors). If any check fails, the transaction is not executed at all; only a transaction that passes these checks begins executing (though success of every command is not guaranteed). In that sense it is fair to say Redis transactions support atomicity.
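
The two error cases can be sketched with a toy transaction queue. This is a simulation of the behaviour described above, not real Redis code; the command set and store format are invented for illustration.

```python
# Enqueue-time checks (unknown command) abort the whole transaction at EXEC;
# runtime errors (wrong type) are reported but do not stop later commands.
class ToyTransaction:
    KNOWN = {"SET", "GET", "LPUSH"}

    def __init__(self, store):
        self.store = store
        self.queue = []
        self.dirty = False                 # set when an enqueue-time check fails

    def enqueue(self, cmd, *args):
        if cmd not in self.KNOWN:          # command checked *before* queuing
            self.dirty = True
            return "ERR unknown command"
        self.queue.append((cmd, args))
        return "QUEUED"

    def execute(self):
        if self.dirty:                     # "error before EXEC": reject everything
            return "EXECABORT Transaction discarded because of previous errors."
        results = []
        for cmd, args in self.queue:
            try:
                if cmd == "SET":
                    self.store[args[0]] = args[1]; results.append("OK")
                elif cmd == "GET":
                    results.append(self.store.get(args[0]))
                elif cmd == "LPUSH":
                    if not isinstance(self.store.get(args[0], []), list):
                        raise TypeError("WRONGTYPE")   # runtime error
                    self.store.setdefault(args[0], []).insert(0, args[1])
                    results.append("OK")
            except TypeError as e:         # "error after EXEC": keep going
                results.append(f"ERR {e}")
        return results

store = {}
t = ToyTransaction(store)
t.enqueue("SET", "k", "v")
t.enqueue("LPUSH", "k", "x")   # wrong type: fails only at runtime
t.enqueue("SET", "k2", "v2")   # still executes afterwards
print(t.execute())             # → ['OK', 'ERR WRONGTYPE', 'OK']
```

One malformed command poisons the whole queue before EXEC, while a type error at execution time only fails its own slot: the same asymmetry the two sections above describe.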

Consider how Redis differs from relational databases such as MySQL and Oracle when it comes to transactions. First of all, Redis is positioned as a NoSQL, non-relational database, while MySQL and Oracle are relational.

SQL executed in a relational database can be arbitrarily complex and is checked and analysed only when it actually runs (in some cases after precompilation). Without a transaction-queue concept, MySQL cannot know whether the next SQL statement will be valid, so it must support rollback. Redis, by contrast, stores commands in a transaction queue and format-checks them up front, so it knows in advance whether each command is well formed; if even one is wrong, the transaction is never executed.

The author of Redis takes the view that programming errors of this kind occur only in development and are practically impossible in production (such as performing LPUSH on a key of type String), so he sees no need to sacrifice the simplicity and efficiency of Redis’s design for a rollback mechanism.

So in the end, Redis transactions genuinely support atomicity, on one condition: don’t write code with logic errors!

Redis pipeline technology

Redis is a TCP service built on the client-server model and a request/response protocol. A typical request therefore follows these steps: the client sends a query to the server and listens on the socket, usually blocking, waiting for the server’s response; the server processes the command and returns the result to the client.

Redis pipelining lets the client keep sending requests before the server has responded, and finally read all of the server’s replies in one go. Put vividly: Redis normally works synchronously, one request waiting for one reply, whereas pipelining gives it a form of asynchronous access in which the client does not wait for each result but keeps sending requests and reads the accumulated replies at the end.
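
A back-of-the-envelope model shows why this matters: each round trip costs network latency, so N commands sent one at a time pay N round trips, while a pipeline pays roughly one. The latency numbers below are assumptions for illustration only.

```python
RTT_MS = 1.0          # assumed network round-trip time
SERVER_MS = 0.01      # assumed per-command processing time on the server

def time_without_pipeline(n_commands):
    # one full round trip per command
    return n_commands * (RTT_MS + SERVER_MS)

def time_with_pipeline(n_commands):
    # one round trip for the whole batch
    return RTT_MS + n_commands * SERVER_MS

n = 1000
print(round(time_without_pipeline(n), 2))  # → 1010.0 (ms)
print(round(time_with_pipeline(n), 2))     # → 11.0 (ms)
```

Under these assumed numbers, pipelining 1,000 commands is nearly two orders of magnitude faster, purely by collapsing the round trips.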

Persistence mechanisms (RDB and AOF)

Redis persistence

Redis also supports data persistence: backing up the data in memory to the hard disk so that it is not lost when the service exits. Since Redis is an in-memory database, we periodically store its data on disk in some form (as data or as commands); the next time Redis restarts, the persisted files are used to restore the dataset. Sometimes, for disaster recovery, we also copy the persisted files to a remote location.

It’s like seeing a nice picture in your Moments feed: you save it to your phone so you can find it and use it again later. Memory is like your brain, where things fade after a few days of wear; saving a copy makes it easy to find next time!

Redis persistence comes in two forms: RDB persistence, which saves the current dataset to disk, and AOF persistence, which saves every write command to disk (much like MySQL’s binlog). AOF is currently the dominant method because it is closer to real time, losing less data when the process exits unexpectedly, but RDB persistence still has its place.

RDB persistence

RDB persistence generates a snapshot of the current process’s data and saves it to disk (hence the name snapshot persistence). The saved file is a compressed binary with the .rdb suffix; when Redis restarts, it can read the snapshot back to recover the data. RDB persistence can be triggered manually or automatically.

Advantages:

1. Small size: for the same dataset, an RDB file is smaller than an AOF file, because RDB is a compact format.

2. Fast recovery: since an RDB file is a data snapshot, restoring it is a straight copy with no commands to re-execute.

3. High performance: the parent process only needs to fork a child to write the RDB and performs no other I/O itself, preserving server performance.

Disadvantages:

1. Data loss on failure: because each RDB dump is a full snapshot, backups are typically scheduled with a shell script every 30 minutes, every hour, or every day, and at best roughly every 5 minutes; so when the service dies, at least the last few minutes of data are lost.

2. Poorer durability: compared with AOF’s asynchronous strategy, RDB copies everything, so even with a forked child doing the backup, the disk cost is significant when the dataset is large; under heavy traffic the fork itself takes longer, straining the CPU, and durability is comparatively poor.

AOF persistence

AOF (Append Only File) persistence records every write command Redis executes to a separate log file (a bit like MySQL’s binlog). When Redis restarts, it re-executes the commands in the AOF file to recover the data.

It was introduced to make up for RDB’s shortcoming (the data lost between snapshots), so it takes the form of a log, recording every write operation and appending it to a file. Different fsync policies can be configured: no fsync at all, fsync once per second, or fsync after every write command.

The default AOF policy is to fsync once per second. With this configuration Redis still performs well and loses at most one second of data in an outage (the fsync runs in a background thread, so the main thread can keep grinding through command requests).
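
The append-and-replay idea behind AOF can be sketched in a few lines. The log format below is my own toy, not the actual Redis protocol; real Redis appends commands in its wire format and lets you tune the fsync frequency.

```python
import os
import tempfile

def append_command(log_path, cmd, key, value):
    # Every write command goes to the end of the log.
    with open(log_path, "a") as f:
        f.write(f"{cmd}\t{key}\t{value}\n")
        f.flush()
        os.fsync(f.fileno())   # the "always" fsync policy: flush on every write

def replay(log_path):
    # "Restart": rebuild the whole dataset by re-executing the log in order.
    state = {}
    with open(log_path) as f:
        for line in f:
            cmd, key, value = line.rstrip("\n").split("\t")
            if cmd == "SET":
                state[key] = value
            elif cmd == "DEL":
                state.pop(key, None)
    return state

log = os.path.join(tempfile.mkdtemp(), "appendonly.aof")
append_command(log, "SET", "name", "redis")
append_command(log, "SET", "lang", "c")
append_command(log, "DEL", "lang", "")
print(replay(log))  # → {'name': 'redis'}
```

Note how the log keeps growing even when the net effect shrinks; this is exactly why Redis performs AOF rewrites in the background.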

Advantages:

1. Data guarantee: We can set different fsync policies. The default is everysec

2. File rewriting: when the AOF file grows past a threshold, a background AOF rewrite is triggered automatically, without affecting the main process.

Disadvantages:

1. Lower performance: restoring data requires re-executing commands, so it performs worse than RDB.

2. Relatively large size: Although the AOF file is rewritten, it is still large;

3. Slower recovery;

Master/slave replication (read/write separation)

The persistence described above is about single-machine backup of Redis data (memory to disk), while master-slave replication is about hot backup of data across machines. Hot backup keeps the service running uninterrupted: data on one server is replicated to another in real time, keeping the two consistent. We call such online backups hot backups, as opposed to cold backups taken offline: periodically copying data to a backup server or storage system while the system is not serving.

Master-slave replication copies data from one Redis server to others; the former is called the master node, the latter the slave nodes. Replication is one-way and flows only from master to slave. By default every Redis server is a master; a master can have any number of slaves (including none), but a slave has exactly one master.

The role of master-slave replication:

  • Data redundancy: Master/slave replication implements hot backup of data, provides remote backup, data redundancy, and service redundancy.

  • Load balancing: on top of replication plus read/write separation, the master serves writes and the slaves serve reads (the application connects to the master to write Redis data and to a slave to read it), spreading the server load. Especially in read-heavy, write-light scenarios, sharing reads across several slaves greatly increases the concurrency the Redis deployment can sustain.

Principle of master-slave replication:

Master-slave replication goes through three stages: connection establishment, data synchronization, and command propagation.

(1) The connection establishment stage sets up the link between master and slave in preparation for data synchronization.

(2) Once connected, data synchronization begins; this initializes the slave’s data and is the core stage of replication. Depending on the current state of the master and slave, it takes the form of either full or partial replication. Full replication, as the name implies, copies all of the master’s data to the slave for backup, which is inefficient when the dataset is large. Partial replication, introduced in Redis 2.8, handles resynchronization after a network interruption; the nodes automatically decide which of the two forms is appropriate.

(3) After synchronization, master and slave enter the command propagation stage: the master sends each write command to the slave, which receives and executes it, keeping the two consistent. During this stage the nodes also maintain heartbeats, such as PING and REPLCONF ACK, which matter for timeout detection and the safety of replicated data.

Please go to the Redis tutorial master/slave replication for details!

The sentinel mechanism

So what exactly is a sentinel?

We said above that persistence solves single-machine Redis data storage, and that master-slave replication provides data redundancy and multi-machine hot backup. But replication leaves one problem: failure recovery is not automatic. The sentinel mechanism, built on top of Redis master-slave replication, exists mainly to automate recovery when the master node fails, further improving the system’s availability. It has drawbacks of its own, though: writes still cannot be load-balanced, and storage capacity is still limited to a single machine.

Redis Sentinel was introduced in Redis 2.8. Its core capability is automatic failover of the master node. Here is how the official Redis documentation describes its features:

  • Monitoring: the sentinels continuously check whether the master and slave nodes are functioning properly.
  • Automatic failover: when the master stops working properly, the sentinels start a failover, promoting one of the failed master’s slaves to be the new master and making the other slaves replicate it instead.
  • Configuration provider: during initialization, clients connect to a sentinel to obtain the address of the current Redis master.
  • Notification: the sentinels can report the result of a failover to clients.

Redis sentinel mechanism architecture

It consists of two parts, sentinel node and data node:

1. Sentinel nodes: the sentinel system consists of one or more sentinel nodes; these are special Redis nodes that store no data.

2. Data nodes: both the master and the slaves are data nodes. The master-slave pair in a sentinel system is no different from an ordinary one; failure detection and failover are driven and completed by the sentinels. A sentinel is still, at heart, a Redis node, just one that holds no data. Each sentinel node only needs to be configured with the master to monitor (it can monitor several masters) and will automatically discover the other sentinels and the slaves.

Principle of sentry mechanism

  1. Scheduled tasks: each sentinel node maintains three scheduled tasks:

  • send the INFO command to the master and slaves to obtain the latest master-slave topology;

  • exchange information with the other sentinel nodes via publish/subscribe;

  • send the PING command to the other nodes to check whether they have gone offline.

  2. Subjective offline: in the heartbeat scheduled task, if another node fails to reply within a certain time, the sentinel marks it subjectively offline; as the name suggests, a single sentinel “subjectively” judges the node to be down.

  3. Objective offline: after marking the master subjectively offline, the sentinel asks the other sentinels, via command, what they think of the master’s state. If the number of sentinels considering the master offline reaches a quorum, the master is marked objectively offline.

Note that objective offline is a concept that applies only to the master; if a slave or a sentinel fails, being marked subjectively offline by a sentinel is the end of it, with no objective offline or failover to follow.

  4. Leader sentinel election: once the master is judged objectively offline, the sentinel nodes negotiate to elect a leader sentinel, which then performs the failover.

All sentinels monitoring the master are candidates, and the election uses the Raft algorithm, whose basic idea here is first come, first served: in a given election round, when sentinel A asks B to make it leader, B agrees as long as it has not already agreed to another sentinel. The detailed process is omitted here; in practice the election is very fast, and whichever sentinel finished the objective-offline judgment first usually becomes the leader.

  5. Failover: the elected leader sentinel carries out the failover, which breaks down into three steps:

(1) Choose a new master from the slaves. The rule: first filter out unhealthy slaves; then pick the one with the highest priority (set by slave-priority); if priority cannot decide, pick the slave with the largest replication offset; and if that still ties, pick the slave with the smallest runid.

(2) Update the master-slave state: run slaveof no one on the chosen slave to make it the master, then use the slaveof command to make the other slaves replicate it.

(3) Demote the failed master to a slave of the new master, so that when it comes back online it rejoins as a slave.
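
The selection rules in step (1) can be sketched as a single sort. The field names below are invented for illustration; they are not Sentinel’s actual internal structures.

```python
# Drop unhealthy slaves, then prefer: highest slave-priority, largest
# replication offset, smallest runid -- the tie-break order described above.
def select_new_master(slaves):
    healthy = [s for s in slaves if s["healthy"]]
    if not healthy:
        return None
    return sorted(
        healthy,
        key=lambda s: (-s["priority"], -s["repl_offset"], s["runid"]),
    )[0]

slaves = [
    {"runid": "aaa", "healthy": False, "priority": 100, "repl_offset": 900},
    {"runid": "bbb", "healthy": True,  "priority": 100, "repl_offset": 500},
    {"runid": "ccc", "healthy": True,  "priority": 100, "repl_offset": 800},
]
print(select_new_master(slaves)["runid"])  # → ccc (largest offset among healthy)
```

The unhealthy slave with the biggest offset is skipped entirely; among the healthy ones with equal priority, the most up-to-date replica wins.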

Clustering

The sentinel mechanism above still has flaws: writes are not load-balanced, and storage capacity is limited to a single machine. Clusters were born to solve exactly these problems. Cluster is the distributed storage scheme introduced in Redis 3.0: a cluster is composed of multiple nodes, and the Redis data is distributed among them. Nodes divide into masters and slaves: only masters handle read/write requests and maintain cluster metadata, while slaves merely replicate their master’s data and state. Note the contrast with the sentinel setup, where the master handled the writes and the slaves served the reads.

The cluster’s main functions fall under two headings: data partitioning and high availability. Data partitioning addresses the single-machine limit on storage capacity mentioned above; together with high availability, it also solves write load balancing and brings automatic failure recovery.

Data partitioning

Partitioning schemes include sequential partitioning, hash partitioning, and others; hash partitioning has natural randomness, and the scheme the cluster uses is a form of hash partitioning.

There are many criteria for judging a partitioning scheme; two of the most important are:

  • Whether the data is evenly distributed
  • The influence of adding or deleting nodes on data distribution.

Due to the randomness of hashing, hash partitioning basically guarantees an even distribution of data, so when comparing hash partitioning schemes we focus on how adding or removing nodes affects the distribution.

Hash partitioning comes in several flavors: hash modulo, consistent hashing, and consistent hashing with virtual nodes. A brief introduction to each follows.

Hash modulo partitioning

The idea is simple: compute the hash of the key, then take it modulo the number of nodes to decide which node the data maps to. The scheme’s biggest problem is that adding or removing a node changes the node count, forcing every key in the system to be remapped and triggering large-scale data migration.
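
A quick experiment makes the problem visible. MD5 here merely stands in for an arbitrary hash function; the key names are invented.

```python
import hashlib

def node_for(key, node_count):
    # hash(key) % node_count: the modulo partitioning scheme described above
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % node_count

keys = [f"user:{i}" for i in range(10_000)]
# Grow the cluster from 4 nodes to 5 and count how many keys change node.
moved = sum(node_for(k, 4) != node_for(k, 5) for k in keys)
print(f"{moved / len(keys):.0%} of keys moved")  # roughly 80% are remapped
```

Adding a single node forces around four keys in five to migrate, which is exactly the large-scale data movement the paragraph warns about.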

Consistent hashing

Consistent hashing organizes the entire hash space into a virtual circle covering the range 0 to 2^32-1. For each piece of data, compute the hash of its key to locate its position on the ring, then walk clockwise from there; the first server encountered is the one the data maps to.

Compared with hash modulo partitioning, consistent hashing confines the effect of adding or removing nodes to the neighboring nodes. For example, inserting node5 between node1 and node2 migrates only part of node2’s data to node5; removing node2 migrates node2’s data only to node4, so only node4 is affected.

The main problem with consistent hashing is that when there are few nodes, a membership change can hit individual nodes hard and badly unbalance the data. Continuing the example: if node2 is removed, node4’s share jumps from about 1/4 of the data to about 1/2, leaving it far more loaded than the other nodes.
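
The ring described above can be sketched in a few lines, without virtual nodes. MD5 again stands in for the hash function, and the node names are illustrative.

```python
import bisect
import hashlib

def h32(s):
    # hash a string onto the 0 .. 2^32-1 circle
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

class Ring:
    def __init__(self, nodes):
        self.points = sorted((h32(n), n) for n in nodes)

    def node_for(self, key):
        pos = h32(key)
        # first node clockwise from the key's position
        idx = bisect.bisect_right(self.points, (pos, ""))
        if idx == len(self.points):
            idx = 0                      # wrap around the circle
        return self.points[idx][1]

before = Ring(["node1", "node2", "node3", "node4"])
after = Ring(["node1", "node2", "node3", "node4", "node5"])
keys = [f"k{i}" for i in range(10_000)]
moved = sum(before.node_for(k) != after.node_for(k) for k in keys)
# Only the keys that now land on node5 have moved; every other key
# keeps its old node, unlike the modulo scheme.
print(f"{moved / len(keys):.0%} of keys moved")
```

The migration is confined to the new node’s arc of the circle, but with one point per node that arc can be arbitrarily large or small, which is the imbalance problem that motivates virtual nodes next.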


The original paper Consistent Hashing and Random Trees is linked below:

Official link – PDF version

The related paper Web Caching with Consistent Hashing is linked below:

Official link – PDF version

Consistent hash with virtual nodes

The cluster uses consistent hashing with virtual nodes, which Redis calls slots. Redis defines 16,384 slots. A slot is a virtual layer between the data and the actual nodes: each actual node owns a certain number of slots, and each slot holds the data whose hash falls in a certain range. With slots, the mapping changes from data -> hash -> actual node into data -> hash -> slot -> actual node.

In a consistent hash partition that uses slots, slots are the basic unit of data management and migration. Slots decouple the relationship between data and actual nodes. Adding or deleting nodes has little impact on the system.

Here's a simple example: the key we access is run through the CRC16 algorithm, and the result is taken modulo 16384, so every key maps to a hash slot numbered between 0 and 16383. From that slot number we can find the node that owns the slot, and jump directly to that node to perform the operation.
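The slot computation above can be written out directly. Redis Cluster uses the CRC16-CCITT (XMODEM) variant, whose standard check value for the input "123456789" is 0x31C3; the hash-tag rule for `{...}` in key names is omitted in this sketch:

```python
def crc16(data: bytes) -> int:
    # CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def keyslot(key: str) -> int:
    # slot = CRC16(key) mod 16384, so every key lands in 0..16383
    return crc16(key.encode()) % 16384

print(keyslot("user:1000"))  # every client computes the same slot for the same key
```

Because the function is deterministic, any client can compute the slot locally and route the request to the right node without asking a central directory.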

Node communication

In a sentinel system, nodes are divided into data nodes and sentinel nodes: the former store data, while the latter implement additional control functions. In a cluster there is no such distinction: every node stores data and participates in maintaining the cluster state. To support this, each node in the cluster opens two ports: the normal client port and a cluster port. The cluster port is the normal port plus 10000 (10000 is a fixed offset and cannot be changed). It is used only for communication between nodes, for example during cluster setup, when adding or removing nodes, and during failover; clients must not connect to the cluster port. Messages exchanged between nodes fall into five types: meet, ping, pong, fail, and publish. The message types differ in communication protocol, sending frequency and timing, and how the receiving nodes are chosen. The exact meaning of each message is not covered here.

Nodes in a cluster need dedicated data structures to store the cluster state. The most important of these are clusterNode and clusterState: the former records the state of a single node, while the latter records the state of the cluster as a whole.

Fault tolerance and elections

While exchanging messages, Redis nodes determine whether their peers are reachable by pinging each other. If more than half of the nodes that ping a node get no response, the cluster considers that node down and fails over to one of its slave nodes. If a master and all of its slaves fail, the cluster enters the fail state; likewise, if more than half of the master nodes go down, the cluster also enters the fail state. This is how voting works in Redis.

All master nodes in the cluster take part in the vote: if more than half of them fail to hear from a given master within the cluster-node-timeout window, that master is considered down.

What exactly is Redis used for

  • Cache: caching is by now the killer feature of almost every large or medium website. Used well, a cache not only speeds up site access but also greatly reduces pressure on the database. But using a cache for performance brings new problems of its own. If many cached entries fail at once and a flood of requests hits the DB (a cache avalanche), or a hot key expires under heavy load (cache breakdown), how do you cope? That is what cache expiration design (periodic deletion plus lazy deletion) and the six memory eviction policies are for. How do you handle a large number of requests for keys that exist in neither the cache nor the DB (cache penetration)? That involves filtering requests, for example with an efficient Bloom filter (don't know what that is? Follow my Redis tutorial series!). And how do you keep the cache and the database consistent: do you guarantee strong consistency or eventual consistency?
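The Bloom-filter idea for blocking cache penetration can be sketched as follows. This toy version is an assumption-laden illustration, not a production filter: it derives its k hash functions by salting MD5 and keeps the whole bit array in one Python integer:

```python
import hashlib

class BloomFilter:
    # minimal sketch: m bits, k hash functions derived from salted MD5
    def __init__(self, m: int = 8192, k: int = 4):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, key: str):
        for i in range(self.k):
            digest = hashlib.md5(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, key: str):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key: str) -> bool:
        # False means definitely absent -> reject before touching the DB
        return all(self.bits >> p & 1 for p in self._positions(key))

bf = BloomFilter()
for uid in ("user:1", "user:2", "user:3"):  # preload every key known to the DB
    bf.add(uid)

print(bf.might_contain("user:2"))       # True: real key passes through
print(bf.might_contain("user:999999"))  # almost certainly False: request rejected
```

A `False` answer is guaranteed correct, so made-up keys never reach the database; `True` can rarely be a false positive, which only costs one extra DB lookup.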

  • Distributed problems: distributed transactions, distributed sessions, distributed locks, and so on. A quick introduction to each:

Distributed transaction

This refers to transactions that span multiple databases. The key requirement is that the write operations across the nodes either all execute or none do. However, while executing its local transaction, one machine cannot know the outcome of the local transactions on the other machines, so it cannot know whether to commit or roll back. The conventional solution is to introduce a coordinator that uniformly schedules the execution of all distributed nodes, and Redis can play this role.
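The coordinator pattern described above is essentially two-phase commit. A minimal sketch, where `Participant` and its voting flag are invented for illustration and real participants would be database connections:

```python
class Participant:
    # a node that can tentatively prepare a write, then commit or roll back
    def __init__(self, vote_yes: bool = True):
        self.vote_yes, self.state = vote_yes, "idle"

    def prepare(self) -> bool:
        self.state = "prepared"
        return self.vote_yes       # "yes, I can commit" / "no, abort"

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"

def two_phase_commit(participants) -> str:
    # phase 1: the coordinator asks every node to prepare and vote
    if all(p.prepare() for p in participants):
        for p in participants:
            p.commit()             # phase 2: unanimous yes -> commit everywhere
        return "commit"
    for p in participants:
        p.rollback()               # any no vote -> roll back everywhere
    return "rollback"

print(two_phase_commit([Participant(), Participant()]))                 # commit
print(two_phase_commit([Participant(), Participant(vote_yes=False)]))  # rollback
```

The coordinator is the single place that knows every vote, which is exactly the knowledge each individual machine lacks.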

Distributed session

This refers to tracking user state when, in a clustered environment, a user's requests are routed to different machines. When a user logs in to the website, the load balancer may send the first request to server A and the second to server B. How do we keep the user's state from being lost?
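One common fix is to move the session out of the web servers into a shared store such as Redis. A minimal sketch, using a plain dict as a stand-in for the shared Redis instance (the `login`/`current_user` helpers are illustrative, not a framework API; in production the write would be something like SETEX with a TTL):

```python
import json
import secrets

# one shared store that every web server talks to (a dict stands in for Redis)
shared_store = {}

def login(store, user: str) -> str:
    session_id = secrets.token_hex(16)
    store[f"session:{session_id}"] = json.dumps({"user": user})
    return session_id   # handed back to the browser as a cookie

def current_user(store, session_id: str):
    raw = store.get(f"session:{session_id}")
    return json.loads(raw)["user"] if raw else None

# server A handles the login, server B handles the next request;
# both see the same state because the session lives in the shared store
sid = login(shared_store, "alice")
print(current_user(shared_store, sid))  # alice, whichever server answers
```

Because no server keeps the session locally, it no longer matters which machine the load balancer picks.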

A distributed lock

If multiple threads on one machine contend for the same resource and repeated runs can produce inconsistent results, we have a thread-safety problem, which ordinary in-process locks solve. If the contenders are different Java processes on the same machine, we can use the operating system's file read/write locks. And what about different machines? That is what we usually solve with distributed locks.
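The usual Redis recipe for a distributed lock is SET key token NX PX ttl, plus a compare-and-delete on release so only the owner can unlock. A sketch under stated assumptions: `FakeRedis` is an in-memory stand-in so the example runs without a server, and in real Redis the compare-and-delete must be a Lua script to stay atomic:

```python
import time
import uuid

class FakeRedis:
    # tiny stand-in for a Redis client, just enough for the lock pattern
    def __init__(self):
        self.data = {}  # key -> (value, expire_at)

    def set_nx_px(self, key, value, px_ms):
        # SET key value NX PX px_ms: succeed only if absent or expired
        entry = self.data.get(key)
        if entry and entry[1] > time.monotonic():
            return False                  # lock already held by someone
        self.data[key] = (value, time.monotonic() + px_ms / 1000)
        return True

    def delete_if_value(self, key, value):
        # in real Redis this compare-and-delete must be one Lua script
        entry = self.data.get(key)
        if entry and entry[0] == value:
            del self.data[key]
            return True
        return False

def acquire(r, name, ttl_ms=10_000):
    token = uuid.uuid4().hex              # unique token: only the owner may unlock
    return token if r.set_nx_px(f"lock:{name}", token, ttl_ms) else None

def release(r, name, token):
    return r.delete_if_value(f"lock:{name}", token)

r = FakeRedis()
t1 = acquire(r, "order:42")
print(acquire(r, "order:42"))  # None: a second client cannot grab the held lock
release(r, "order:42", t1)
```

The TTL guards against a crashed owner holding the lock forever, and the token guards against one client deleting a lock that another client has since acquired.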

  • Message queue: a message queue is essential middleware for any large website, for example the broker-based Kafka, RabbitMQ and RocketMQ, or the brokerless ZeroMQ. It is mainly used for service decoupling, traffic peak shaving, and asynchronous processing where low latency is not critical. Redis provides publish/subscribe and blocking-queue functionality, and the blocking queue's timeout mechanism is enough to implement a simple message queue system. Even so, it is no match for dedicated message-oriented middleware.
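The list-based queue pattern can be sketched with an in-memory deque standing in for the Redis list; the helper names mirror the LPUSH/BRPOP commands, though a real BRPOP blocks until a message arrives while this stub simply returns None:

```python
from collections import deque

# a deque stands in for a Redis list; real code would call
# LPUSH queue msg (producer) and BRPOP queue timeout (consumer)
queue = deque()

def lpush(q, msg):
    # producer pushes onto the left end of the list
    q.appendleft(msg)

def brpop(q):
    # consumer pops from the right end; this stub returns None instead of blocking
    return q.pop() if q else None

lpush(queue, "order-created:42")
lpush(queue, "order-created:43")
print(brpop(queue))  # order-created:42 -- push-left / pop-right gives FIFO order
```

Pushing on one end and popping on the other is what turns a plain list into a first-in, first-out queue.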

  • Automatic expiration: the blocking queue above already relies on Redis's expiration feature, but it deserves its own mention. Redis can attach an expiration time to data, and this widely used feature means expired data is cleaned up without any attention from the user, with good performance. The most common uses are SMS verification codes and time-limited product listings, with no need to store timestamps in the database and compare them on every query.
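A toy illustration of the idea, assuming an in-memory `ExpiringStore` (invented for the demo) that mimics SET ... EX plus lazy deletion on read; real Redis also sweeps expired keys periodically in the background:

```python
import time

class ExpiringStore:
    # mimics SET key value EX seconds, with lazy deletion on read
    def __init__(self):
        self.data = {}  # key -> (value, expire_at)

    def set(self, key, value, ex: float):
        self.data[key] = (value, time.monotonic() + ex)

    def get(self, key):
        entry = self.data.get(key)
        if not entry:
            return None
        value, expire_at = entry
        if time.monotonic() >= expire_at:
            del self.data[key]  # lazy deletion: expired keys vanish on access
            return None
        return value

store = ExpiringStore()
store.set("sms:13800000000", "4721", ex=0.05)  # code valid for 50 ms in this demo
print(store.get("sms:13800000000"))  # 4721 while still valid
time.sleep(0.06)
print(store.get("sms:13800000000"))  # None after expiry
```

The caller never checks timestamps itself; the store answers "gone" once the TTL passes, which is exactly the convenience the SMS-code example relies on.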

Author's remarks

Solid posts with substance, lighter posts with heart. Search [program control] on WeChat and follow this interesting soul.