Preface

Let’s start with what Redis is. The official introduction says:

Redis is a BSD-licensed open source project. It is an in-memory data structure store that you can use as a database, cache, and message broker. It supports strings, lists, hashes, sets, sorted sets, bitmaps, HyperLogLogs, and geospatial indexes. It also has built-in replication, Lua scripting, LRU eviction, transactions, and publish/subscribe, and it provides high availability with automatic failover through Redis Sentinel and automatic sharding through Redis Cluster.

To sum up, Redis offers a wealth of features that can be dizzying at first sight. What are these features for? What problems do they solve? When should you use each one? What follows is a rough walk-through of how a Redis setup evolves, step by step, starting from zero.

Starting from 0

The initial requirement was simple. We had an API that provided a list of hot news stories, and consumers of the API complained that it took about 2 seconds to return results on each request.

We started looking at how to improve the performance that API consumers perceive, and soon arrived at the simplest, crudest solution: add HTTP cache control to the API response with the header Cache-Control: max-age=600, which allows the consumer to cache the response for ten minutes.
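As a rough illustration of that header, here is a minimal WSGI sketch. The app name hot_news_app and the HOT_NEWS payload are made up for this example; the article does not say which framework the real API used.

```python
import json

# Hypothetical payload; in the article this comes from a slow SQL query.
HOT_NEWS = [{"id": 1, "title": "Example headline"}]


def hot_news_app(environ, start_response):
    """Minimal WSGI app that lets clients cache the response for 10 minutes."""
    body = json.dumps(HOT_NEWS).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
        # 600 seconds = 10 minutes, matching the max-age described above.
        ("Cache-Control", "max-age=600"),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the standard library can serve it with wsgiref.simple_server.make_server("", 8000, hot_news_app).serve_forever().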

If the API consumer honors the cache-control information in the response, its perceived performance improves for those 10 minutes. But there are two disadvantages. First, while the cached copy is still valid, the consumer may get stale data. Second, if the API client ignores the cache and calls the API directly, the call still takes 2 seconds.

Local in-memory caching

To fix the calls that still took 2 seconds, we investigated and found that the main cause was the SQL query for hot news, which took nearly 2 seconds on its own. So we came up with another simple, crude solution: cache the SQL query result directly in the memory of the API server, with a cache lifetime of 1 minute. Requests within the next minute read from the cache instead of spending 2 seconds executing the SQL.

If the API receives 100 requests per second, that is 6,000 requests per minute. Only the requests that arrive during the first 2 seconds, while the cache is still being filled, wait 2 seconds; all requests in the following 58 seconds are answered without that wait.
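The idea of caching the query result in the API server's memory for one minute can be sketched as follows. This is only an illustration: the class name TTLCache and the loader callback are assumptions, not the code the API actually used, and the sketch does not coordinate concurrent requests that miss at the same time.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so that expiry can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]  # cache hit: skip the slow query
        value = loader()     # cache miss: run the slow query once
        self._entries[key] = (self._clock() + self._ttl, value)
        return value
```

A handler would then call something like cache.get_or_load("hot_news", run_slow_sql), where run_slow_sql stands for the 2-second query.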

The maintainers of other APIs thought this was a good idea and did the same, and soon we found that the API server was running out of memory…

Redis on the server

When the memory of the API server filled up, we had to come up with another solution. The immediate idea was to move all of these caches to a dedicated server with a very large amount of memory. That is how we arrived at Redis… How to configure and deploy Redis is not explained here; the official Redis documentation covers it in detail. We then used a separate server as the Redis server, and the memory pressure on the API server was resolved.

3.1 Persistence

A single Redis server is in a bad mood a couple of days a month, and when it is in a bad mood it goes on strike and every cache entry is lost, because Redis keeps its data in memory. We could bring the Redis server back online, but with the in-memory data gone, the resulting cache avalanche immediately put the API server and the database under pressure.

This is where Redis persistence comes in handy to mitigate cache avalanches. With persistence, Redis writes its in-memory data to disk and loads it again when it restarts, which minimizes the impact of losing the cache.
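To make the idea concrete, here is a toy sketch of "write memory to disk, reload on restart". It is not how Redis implements persistence (Redis has its own RDB snapshot and append-only file mechanisms); the SnapshotStore class and the JSON file format are assumptions chosen only to keep the example short.

```python
import json
import os
import tempfile
from typing import Dict, Optional


class SnapshotStore:
    """Toy in-memory key-value store that can snapshot itself to disk."""

    def __init__(self, path: str) -> None:
        self._path = path
        self.data: Dict[str, str] = {}
        # On "restart", reload whatever the last snapshot contained.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.data = json.load(f)

    def set(self, key: str, value: str) -> None:
        self.data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)

    def save(self) -> None:
        """Write a snapshot to a temp file, then atomically swap it in."""
        directory = os.path.dirname(os.path.abspath(self._path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
        os.replace(tmp_path, self._path)
```

Anything written after the last save() is lost on a crash, which is why persistence reduces the impact of losing the cache rather than eliminating it.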

3.2 Sentinel and Replication

The unannounced strikes of the Redis server were a nuisance. So what should we do? Answer: keep a backup server, so that when one goes down the other takes over. But how do we know that a Redis server is down, how do we switch over, and how do we ensure that the backup is a full copy of the original server?

That’s where Sentinel and Replication come in. Sentinel manages multiple Redis servers, providing monitoring, notification, and automatic failover; Replication lets one Redis server have multiple backup servers (replicas). Together, these two features make Redis highly available. In addition, Sentinel itself makes use of Redis’s publish/subscribe capability.

3.3 Cluster

A single server always has an upper limit on CPU and I/O resources. With primary/replica replication we can separate reads from writes and move part of the CPU and I/O pressure to the replicas. But what about memory? Primary/replica mode only keeps copies of the same data and cannot scale memory horizontally. The memory of a single machine can be enlarged, but only up to a limit.

So we needed a solution that would let us scale horizontally. The ultimate goal is for each server to be responsible for only part of the data, while all the servers together form a whole that, to outside consumers, looks like a single centralized server. (An earlier post in my REST blog series explained the difference between distributed and network-based systems: architectures for network-based applications.)

Before the official distributed solution for Redis came out, there were two others, twemproxy and Codis. Both rely on a proxy for distribution: Redis itself knows nothing about the distribution, which is left to twemproxy or Codis. The official Redis Cluster solution instead builds the distribution logic into every Redis server, so no other component is needed to meet the distributed requirements.

We don’t care here which of these solutions is better; we care about what the distribution logic actually has to deal with. In other words, whether twemproxy and Codis handle the distributed logic on their own or Redis Cluster integrates it into the Redis service, what problem are they solving?

As we said earlier, a distributed service looks like a centralized service to the outside world. That raises a problem: adding or removing servers in the distributed service should go unnoticed by the clients that consume it. This means clients must not reach through the distributed service and tie themselves to a single server, because once they do, you can no longer add or replace servers.

There are two ways to solve this problem:

The first way is the most direct: add a middle layer that isolates this dependency. This is what twemproxy does (though you will find that twemproxy itself then becomes a single point of failure). In this case, each Redis server is independent and unaware of the others.

The second way is to let the Redis servers know about each other and guide the client to the right place through a redirection mechanism. For example, the client connects to some Redis server and asks to perform an operation, and that server finds it cannot complete it. It then replies with the information of the server that can, so that the client sends the request there. This means every Redis server must keep a complete copy of the information about all servers in the cluster; otherwise, how would it know which server to send the client to?

After all this explanation, you may have noticed that both ways have one thing in common: information about all the servers in the distributed service and what each of them serves. Either way, this information has to exist. The difference is that the first way manages it separately and uses it to coordinate multiple independent Redis servers behind the proxy, while the second way lets every Redis server hold it and know about the others, achieving the same goal. The advantage of the second way is that no additional component is needed.

Redis Cluster is implemented around the concept of hash slots: 16,384 slots are allocated in advance. On the client side, the slot for a key is calculated as CRC16(key) % 16384. On the server side, each Redis server is responsible for a subset of the slots. When a server is added or removed, slots and their data are migrated between servers. Every server also holds the complete mapping from slots to servers, which is what allows it to redirect a client’s request.
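The slot calculation and the redirection can be sketched in a few lines. This is a simulation under stated assumptions, not Redis code: the Node class, the three node names, and the tuple replies are invented for illustration, and hash tags and slot migration are left out. The CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0).

```python
from typing import Dict, List, Optional, Tuple

SLOT_COUNT = 16384


def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16,384 hash slots (hash tags ignored)."""
    return crc16_xmodem(key.encode("utf-8")) % SLOT_COUNT


class Node:
    """Hypothetical cluster node holding data plus the full slot table."""

    def __init__(self, name: str, slot_owner: List[str]) -> None:
        self.name = name
        self.slot_owner = slot_owner  # slot number -> owning node name
        self.data: Dict[str, Optional[str]] = {}

    def execute(self, op: str, key: str, value: Optional[str] = None) -> Tuple:
        slot = key_slot(key)
        owner = self.slot_owner[slot]
        if owner != self.name:
            return ("MOVED", slot, owner)  # tell the client where to go
        if op == "SET":
            self.data[key] = value
            return ("OK",)
        return ("VALUE", self.data.get(key))


def build_cluster() -> Dict[str, Node]:
    # A owns slots 0-5460, B owns 5461-10922, C owns 10923-16383.
    slot_owner = ["A"] * 5461 + ["B"] * 5462 + ["C"] * 5461
    return {name: Node(name, slot_owner) for name in ("A", "B", "C")}


def send(nodes: Dict[str, Node], start: str, op: str, key: str,
         value: Optional[str] = None) -> Tuple:
    """Client side: try one node and follow a single MOVED redirect."""
    reply = nodes[start].execute(op, key, value)
    if reply[0] == "MOVED":
        reply = nodes[reply[2]].execute(op, key, value)
    return reply
```

For example, the key foo hashes to slot 12182, so send(build_cluster(), "A", "SET", "foo", "bar") is first refused by node A and then stored on node C.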

Redis on the client

The third section above mainly described how the Redis server side evolved, from a standalone service to a highly available, decentralized, distributed storage system. This section looks at the Redis services that clients can consume.

4.1 Data Types

Redis supports a wide range of data types, from the most basic string to more complex, commonly used data structures:

  1. String: the most basic data type, a binary-safe string of up to 512 MB.
  2. List: a list of strings kept in insertion order.
  3. Set: an unordered collection of strings with no duplicate elements.
  4. Sorted set: a set of strings ordered by a score attached to each member.
  5. Hash: a collection of field-value pairs.
  6. Bitmap: finer-grained operations that work on individual bits.
  7. HyperLogLog: a probabilistic data structure for estimating the number of distinct elements.

These data types support different scenarios, and their operations have different time complexities. They are in effect an implementation of Remote Data Access (RDA), which I described earlier in the “Understanding REST” blog series on architectural styles for network-based applications: by executing a standard set of commands on the server, the client gets back only the reduced result set it needs, which simplifies the client and improves network performance. For example, without a list data structure you could only store a list as a string. The client would have to fetch the complete list, modify it, and then write the whole thing back to Redis, which is very wasteful.

4.2 Transactions

Each of these data types has its own commands, and in many cases several commands need to be executed at once and succeed or fail together. Redis’s support for transactions stems from this need: the ability to execute multiple commands in sequence as one atomic unit, with no other client’s commands running in between.

4.3 Lua Scripts

Building on transactions, Lua is useful when we need to perform more complex operations (including some conditional logic) on the server in one step, such as reading a cached value while extending its expiration. Redis guarantees that Lua scripts execute atomically, so in some scenarios they can replace the transaction commands Redis provides. This corresponds to a concrete implementation of Remote Evaluation (REV), introduced in the same discussion of architectural styles for network-based applications.

4.4 Pipelining

Redis clients and servers talk over TCP in a request/response style: by default, the client sends one command on a connection and waits for its reply before sending the next, so every command pays a network round trip. Pipelining lets the client send multiple commands on a single connection without waiting for each reply and then read all the replies together, which saves much of that round-trip overhead. The difference between a pipeline and a transaction is that a pipeline is designed to save communication overhead and does not guarantee atomicity.
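What a pipeline puts on the wire can be sketched without a server. The helper names encode_command and build_pipeline are invented for this example; the byte layout is the Redis serialization protocol (RESP), where each command is an array of bulk strings.

```python
from typing import Iterable, Sequence


def encode_command(parts: Sequence[str]) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]
    for part in parts:
        raw = part.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(out)


def build_pipeline(commands: Iterable[Sequence[str]]) -> bytes:
    """Concatenate several commands so they can be sent in one write."""
    return b"".join(encode_command(cmd) for cmd in commands)


# Two commands, one payload: the client would send this in a single write
# and then read two replies, instead of waiting after each command.
payload = build_pipeline([["INCR", "hits"], ["EXPIRE", "hits", "60"]])
```

Nothing in the payload ties the two commands together, which is why a pipeline alone gives no atomicity: other clients' commands may still run between them.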

4.5 Distributed Locking

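The recommended approach is the one that the Redlock algorithm builds on: use a string value, set a specific key to a random value only if the key does not yet exist, and give it an expiration. To unlock, use a Lua script that reads the value and compares it before deleting the key. Redlock itself applies these steps across several independent Redis instances. The specific commands are as follows: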

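Acquire the lock (NX: set only if the key does not exist; PX 30000: expire after 30,000 ms):

    SET resource_name my_random_value NX PX 30000

Release the lock with a Lua script that deletes the key only if it still holds our random value:

    -- KEYS[1] is the lock key, ARGV[1] is the random value used when acquiring
    if redis.call("get",KEYS[1]) == ARGV[1] then
      return redis.call("del",KEYS[1])
    else
      return 0
    end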
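Why the random value matters is easiest to see in a small simulation. The FakeRedis class below is an in-memory stand-in written for this example, not a real client: set_nx_px mimics SET ... NX PX, and delete_if_equal mimics the compare-and-delete Lua script.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class FakeRedis:
    """In-memory stand-in for the two operations the lock needs."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}  # key -> (value, expires_at)

    def _live_value(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None or entry[1] <= self._clock():
            self._store.pop(key, None)  # drop expired entries lazily
            return None
        return entry[0]

    def set_nx_px(self, key: str, value: str, ttl_ms: int) -> bool:
        """Mimics SET key value NX PX ttl_ms."""
        if self._live_value(key) is not None:
            return False
        self._store[key] = (value, self._clock() + ttl_ms / 1000.0)
        return True

    def delete_if_equal(self, key: str, expected: str) -> int:
        """Mimics the Lua script: delete only if the value still matches."""
        if self._live_value(key) == expected:
            del self._store[key]
            return 1
        return 0


def acquire(redis: FakeRedis, resource: str, ttl_ms: int = 30000) -> Optional[str]:
    """Return a random token if the lock was taken, otherwise None."""
    token = secrets.token_hex(16)
    return token if redis.set_nx_px(resource, token, ttl_ms) else None


def release(redis: FakeRedis, resource: str, token: str) -> bool:
    """Release the lock only if this token still owns it."""
    return redis.delete_if_equal(resource, token) == 1
```

If client A's lock expires and client B then acquires it, A's late release(...) call returns False and leaves B's lock in place. With a plain delete, A would have removed a lock it no longer owned.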

Conclusion

This article explained the various features of Redis and why they exist at an abstract level, without going into their specific details. This lets us focus on the problem each feature solves, and thinking at this level helps us choose a more appropriate solution for a particular scenario instead of being limited by technical details.