⚠️ This article is a first-release contract article for the Juejin (Nuggets) community. Reprinting without authorization is prohibited.

Most databases respond slowly under high concurrency because they touch disk so often. To bridge this speed gap, most systems routinely add a caching layer to speed up reads. Redis has become the de facto standard for distributed caching thanks to its excellent performance and rich data structures.

However, if you think Redis is only about caching, you are underestimating it.

The rich data structures of Redis let it serve a wide range of business scenarios. Coupled with RDB persistence, it can even be used as a persistent, primary database. In that role, Redis could support the majority of Internet companies, especially social, gaming, and live-streaming companies.

1. Redis is competent for storage work

Redis provides several deployment modes (master-slave, Sentinel, and Cluster) to meet high-availability requirements. It also provides two persistence methods, AOF and RDB, of which RDB is the most commonly used.

With the BGSAVE command, the main process forks a child process that writes the dataset to disk. BGSAVE is equivalent to taking a snapshot. Because RDB alone has no WAL and no checkpoint mechanism, it cannot back up data in real time: if the machine suddenly loses power, the writes made since the last snapshot are lost.

Fortunately, Redis is an in-memory database, and master-slave synchronization is very fast. If your cluster is well maintained and memory is allocated properly, the Redis SLA will remain very high unless the whole data center loses power.

That doesn't sound infallible: there is still a chance of losing data, which is intolerable in ordinary CRUD business. So why does Redis meet the needs of most Internet companies? The answer lies in the nature of the business.

Before deciding to go all in on Redis, confirm whether your business has the following characteristics:

  1. Apart from the core business, do most features have low requirements for data reliability, so that losing one or two records is tolerable?
  2. Is the business consumer-facing, so that a class of data can be located quickly by user ID, the data set per user is generally small, and there is no need for large range queries?
  3. Can you tolerate the cost of keeping data in memory?
  4. Does the business require almost no transactional operations?

Fortunately, a great many businesses fit this profile. For example, common social, gaming, live-streaming, and operations businesses can rely on Redis completely.

2. Application scenarios of Redis

Redis has a loose, schemaless structure and rich data types, so it adapts well to ever-changing schema requirements. Next, I will introduce a large number of application scenarios for Redis beyond caching.

2.1 Basic User Data Stores

In traditional database design, user tables are hard to design and painful to change. With the Redis hash structure, you can design a loose data model. Attributes that are not yet fixed, such as those belonging to features still being validated, can be stored directly in a hash field as JSON. With a hash you can also use the HGET and HMGET commands to fetch only the fields you need, which is very convenient.

>HSET user:199929 sex m
>HSET user:199929 age 22
>HGETALL user:199929
1) "sex"
2) "m"
3) "age"
4) "22"

This kind of non-statistical, read-heavy and write-light scenario is well suited to key-value storage. The Redis hash structure provides a rich set of commands; for example, a numeric field can be incremented or decremented with HINCRBY.

2.2 Implementation of counters

HINCRBY was mentioned briefly above. For plain Redis keys there is also the INCRBY command, which increments or decrements a value.

For example: counting the likes on a post; storing the number of followers of a topic; storing the number of fans of a hashtag; storing the number of comments; the popularity of a post; the count on a red-dot notification badge; the numbers of likes, upvotes, favorites, and so on.

> INCRBY feed:e3kk38j4kl:like 1
> INCRBY feed:e3kk38j4kl:like 1
> GET feed:e3kk38j4kl:like
"2"

For businesses that are prone to sudden hot spots, such as Weibo, a traditional database certainly cannot hold up, and an in-memory database is needed. Because Redis is so fast, you no longer need the very slow COUNT operations of a traditional database. Every increment completes at the millisecond level, and the effect is visible in real time.

2.3 Leaderboards

Leaderboards increase the motivation of participants, so this feature is very common. It is essentially a top-N problem.

Redis has a data structure called ZSET (sorted set), an ordered collection implemented with a skip list, which makes problems like leaderboards easy to solve. Even when a ZSET holds tens or hundreds of millions of entries, it still sustains highly concurrent reads and writes with a very good average response time (under 5 ms).

ZADD adds a new record. We use the ranking metric as the record's score, then use the ZREVRANGE command to get real-time leaderboard data. ZREVRANK makes it easy to get a user's real-time rank.

>ZADD sorted:xjjdog:2021-07 55 dog0
>ZADD sorted:xjjdog:2021-07 89 dog1
>ZADD sorted:xjjdog:2021-07 32 dog2
>ZCARD sorted:xjjdog:2021-07
(integer) 3
>ZREVRANGE sorted:xjjdog:2021-07 0 10 WITHSCORES
1) "dog1"
2) "89"
3) "dog0"
4) "55"
5) "dog2"
6) "32"
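
To make the sorted-set semantics concrete, here is a minimal in-memory Java sketch of the same leaderboard. It is not Redis code and does not talk to a server: the class `Leaderboard` and its methods `zadd`, `top`, and `rank` are hypothetical names chosen to mirror ZADD, ZREVRANGE, and ZREVRANK. It uses the JDK's own skip list (`ConcurrentSkipListSet`) as a stand-in for the structure behind a ZSET.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListSet;

/** In-memory stand-in for a Redis sorted set; illustrative only, not a Redis client. */
final class Leaderboard {
    // Highest score first; ties are broken in reverse lexicographic order, as ZREVRANGE does.
    private static final Comparator<Map.Entry<String, Double>> BEST_FIRST = (a, b) -> {
        int byScore = Double.compare(b.getValue(), a.getValue());
        return byScore != 0 ? byScore : b.getKey().compareTo(a.getKey());
    };

    private final Map<String, Double> scores = new HashMap<>();
    private final ConcurrentSkipListSet<Map.Entry<String, Double>> ranking =
            new ConcurrentSkipListSet<>(BEST_FIRST);

    /** Like ZADD: insert the member, or replace its score if it already exists. */
    synchronized void zadd(String member, double score) {
        Double old = scores.put(member, score);
        if (old != null) {
            ranking.remove(entry(member, old));
        }
        ranking.add(entry(member, score));
    }

    /** Like ZREVRANGE key 0 n-1: the top n members, best first. */
    synchronized List<String> top(int n) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Double> e : ranking) {
            if (out.size() == n) {
                break;
            }
            out.add(e.getKey());
        }
        return out;
    }

    /** Like ZREVRANK: zero-based rank, or -1 if the member is absent. */
    synchronized int rank(String member) {
        Double score = scores.get(member);
        if (score == null) {
            return -1;
        }
        // Counting the head set is O(n) here; Redis keeps span counters to answer in O(log n).
        return ranking.headSet(entry(member, score)).size();
    }

    private static Map.Entry<String, Double> entry(String member, double score) {
        return new AbstractMap.SimpleImmutableEntry<>(member, score);
    }
}
```

Loading the three records from the example (dog0=55, dog1=89, dog2=32) and calling `top(10)` yields dog1, dog0, dog2, the same order as the ZREVRANGE reply.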

2.4 Friendship

A set is a collection without duplicates. You can store a user's following list, fan list, mutual-follow list, blacklist, like list, and so on, each in its own set or ZSET.

With a ZSET, use ZADD to add a user to the blacklist and ZRANK to check membership: ZRANK returns the member's rank, or nil when the member is not in the blacklist. With plain sets, the SINTER command returns the mutual friends of A and B (for ZSETs, ZINTERSTORE does the same job).

Beyond friend relationships, the set structure suits any business scenario with clear blacklist or whitelist semantics. There are many others, such as storing the address book a user uploads and computing friend relationships from it.

In practice, ZSETs are more likely to be used to store such relationships. A ZSET, like a set, does not allow duplicate members, but each member also carries a score. We can store a timestamp there to record when the relationship was established, which gives it a clearer business meaning.

2.5 Counting the Number of Active Users

There are many small, scattered requirements of this kind, such as daily active-user statistics, user check-ins, and user online status. Storing a separate boolean for each user would take up too much space. In this case, we can use the bitmap structure to save a lot of storage.

>SETBIT online:2021-07-23 3876520333 1
>SETBIT online:2021-07-24 3876520333 1
>GETBIT online:2021-07-23 3876520333
1
>BITOP AND active online:2021-07-23 online:2021-07-24
>GETBIT active 3876520333
1
>DEBUG OBJECT online:2021-07-23
Value at:0x7fdfde438bf0 refcount:1 encoding:raw serializedlength:5506446 lru:16410558 lru_seconds_idle:5
(0.96s)

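Note that if your IDs are large, you need to preprocess them first (for example, by mapping them to smaller consecutive numbers); otherwise they will take up a lot of memory, because SETBIT grows the string far enough to cover the highest offset. The single bit at offset 3876520333 above already forces a value of roughly 462 MiB.

A bitmap is a sequence of consecutive bits, each using 1 bit to answer a true/false question. On bitmaps you can run bitwise AND, OR, and XOR operations (BITOP).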
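
The memory cost of a large offset, and one possible way to preprocess IDs, can be sketched in plain Java. This is an in-memory illustration under stated assumptions, not Redis client code: `ActiveUserBitmap`, `bytesForOffset`, and `indexOf` are hypothetical names, `java.util.BitSet` stands in for a Redis bitmap, and how the ID-to-index mapping would be persisted in a real system is left open.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Illustrative bitmap arithmetic; the names are hypothetical, not part of any Redis client. */
final class ActiveUserBitmap {
    // Maps sparse user IDs to dense, consecutive bit positions (0, 1, 2, ...).
    private final Map<Long, Integer> denseIndex = new HashMap<>();

    /** Bytes a bitmap must span so that the bit at the given offset exists. */
    static long bytesForOffset(long offset) {
        return offset / 8 + 1;
    }

    /** Returns the dense bit position for a user ID, assigning the next free one if needed. */
    int indexOf(long userId) {
        Integer idx = denseIndex.get(userId);
        if (idx == null) {
            idx = denseIndex.size();
            denseIndex.put(userId, idx);
        }
        return idx;
    }

    /** Equivalent of BITOP AND on two days: users that are set in both bitmaps. */
    static BitSet and(BitSet a, BitSet b) {
        BitSet out = (BitSet) a.clone();
        out.and(b);
        return out;
    }
}
```

For the ID used above, `bytesForOffset(3876520333L)` is 484565042 bytes, while after mapping, the same user occupies bit 0 and a million active users fit in about 122 KiB.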

2.6 Distributed Locking

A Redis distributed lock is a lightweight solution. Although it is not as reliable as systems like ZooKeeper, it offers extremely high throughput.

The simplest lock can be taken with the Redis SET command and its NX and PX options. The following is a short code sample of a simple distributed lock.

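import java.util.Arrays;
import java.util.concurrent.TimeUnit;

public String lock(String key, int timeOutSecond) {
    for (; ; ) {
        // The stamp identifies this lock holder so that only it can unlock.
        String stamp = String.valueOf(System.nanoTime());
        // SET key stamp NX with an expiry: succeeds only if the key did not exist.
        Boolean acquired = redisTemplate.opsForValue()
                .setIfAbsent(key, stamp, timeOutSecond, TimeUnit.SECONDS);
        // The result is a nullable Boolean, so avoid auto-unboxing it.
        if (Boolean.TRUE.equals(acquired)) {
            return stamp;
        }
    }
}

public void unlock(String key, String stamp) {
    // Runs the Lua script below: delete the key only if it still holds our stamp.
    redisTemplate.execute(script, Arrays.asList(key), stamp);
}

The Lua script for the delete operation is: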

local stamp = ARGV[1]
local key = KEYS[1]
local current = redis.call("GET",key)
if stamp == current then
    redis.call("DEL",key)
    return "OK"
end

The most commonly used distributed lock solution is Redisson's RedLock. Redisson also distinguishes read locks from write locks and handles failure cases when multiple Redis instances are involved.

2.7 Distributed Rate Limiting

A simple counter-based rate limiter is very easy to implement in Redis with the INCR and EXPIRE commands.

 incr key
 expire key 1

This simple implementation is usually fine, but under heavy traffic there is a risk of bursts around window boundaries: a client can use up its quota at the end of one window and again at the start of the next. The root cause is that the time windows are fixed, with no smooth transition such as a sliding window provides.
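
The difference between fixed and sliding windows can be sketched in a few lines of Java. This is a single-process, in-memory illustration of the sliding-window idea, not a distributed limiter and not Redisson's algorithm; the class name `SlidingWindowLimiter` and its `tryAcquire` method are hypothetical. In Redis the same idea is commonly built on a sorted set of request timestamps, which is not shown here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** In-memory sliding-window limiter; a sketch of the idea, not a distributed implementation. */
final class SlidingWindowLimiter {
    private final int limit;
    private final long windowMillis;
    // Timestamps of accepted requests, oldest first.
    private final Deque<Long> accepted = new ArrayDeque<>();

    SlidingWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Returns true if a request arriving at nowMillis fits within the limit. */
    synchronized boolean tryAcquire(long nowMillis) {
        // Forget requests that have slid out of the window (now - window, now].
        while (!accepted.isEmpty() && accepted.peekFirst() <= nowMillis - windowMillis) {
            accepted.pollFirst();
        }
        if (accepted.size() >= limit) {
            return false;
        }
        accepted.addLast(nowMillis);
        return true;
    }
}
```

With a limit of 5 per second, five requests at t=900 ms followed by more at t=1100 ms are rejected here, whereas a fixed one-second counter that resets at t=1000 ms would let a second batch of five through within 200 ms of the first.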

RRateLimiter, also from Redisson, is a distributed rate-limiting utility similar to the RateLimiter in Guava, and it is very convenient to use. Here's a quick example:

 RRateLimiter limiter = redisson.getRateLimiter("xjjdogLimiter");
 // Only needs to be initialized once
 // 5 permits every 2 seconds
 limiter.trySetRate(RateType.OVERALL, 5, 2, RateIntervalUnit.SECONDS);
 
 // Blocks until permits become available
 limiter.acquire(3);

2.8 Message Queues

Redis can implement simple queues. On the producer side, use LPUSH to push onto a list. On the consumer side, fetch items with the RPOP command or its blocking variant BRPOP. This is suitable for small-scale flash-sale scenarios.
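
The producer and consumer ends of such a list queue can be pictured with an in-memory Java analogue. This is only a sketch of the semantics, not a Redis client: `ListQueueSketch` and its methods are hypothetical names that mirror LPUSH, RPOP, and BRPOP on top of a `LinkedBlockingDeque`.

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

/** In-memory analogue of a Redis list used as a queue; illustrative only. */
final class ListQueueSketch {
    private final LinkedBlockingDeque<String> list = new LinkedBlockingDeque<>();

    /** Like LPUSH: the producer adds at the head of the list. */
    void lpush(String value) {
        list.addFirst(value);
    }

    /** Like RPOP: the consumer takes from the tail, or gets null if the list is empty. */
    String rpop() {
        return list.pollLast();
    }

    /** Like BRPOP with a timeout: waits for an item, returning null if none arrives in time. */
    String brpop(long timeoutMillis) {
        try {
            return list.pollLast(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }
}
```

Because producers push at one end and consumers pop from the other, items come out in the order they went in (first in, first out).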

Redis also has a Pub/Sub model, but Pub/Sub is better suited to things like broadcast messages.

Redis 5.0 added the stream data type. It is similar to Kafka, with topics and consumer groups, multicast, and persistence, and it already meets most business needs.

2.9 LBS Applications

GEO features were introduced as early as Redis 3.2. After adding longitude and latitude data with the GEOADD command, you can compute distances between coordinates, test whether a point lies within an area, find nearby people, and so on.

The most powerful open-source solution for GEO functionality is PostGIS, which is based on PostgreSQL, but Redis is sufficient for GEO services of ordinary scale.
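
The distance calculation between two coordinates mentioned above can be sketched with the haversine formula, which treats the Earth as a sphere. This is a standalone Java illustration rather than Redis's actual code; `GeoDistance` and `haversineMeters` are hypothetical names, and the radius constant is, to the best of my knowledge, the value used in the Redis source. Parameters follow the longitude-first order that GEOADD uses.

```java
/** Great-circle distance on a sphere; illustrative, not Redis's actual implementation. */
final class GeoDistance {
    // Earth radius in meters (assumed to match the constant in the Redis GEO code).
    static final double EARTH_RADIUS_M = 6372797.560856;

    /** Haversine distance in meters between two (longitude, latitude) points in degrees. */
    static double haversineMeters(double lng1, double lat1, double lng2, double lat2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double halfDPhi = (phi2 - phi1) / 2;
        double halfDLambda = Math.toRadians(lng2 - lng1) / 2;
        double a = Math.sin(halfDPhi) * Math.sin(halfDPhi)
                + Math.cos(phi1) * Math.cos(phi2) * Math.sin(halfDLambda) * Math.sin(halfDLambda);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }
}
```

For Palermo (13.361389, 38.115556) and Catania (15.087269, 37.502669), the pair used in the Redis GEO documentation, this gives a distance of about 166 km.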

2.10 More Extended Application Scenarios

To see what else Redis can do, we have to mention the Java client library Redisson. Redisson contains a rich set of distributed data structures, all built on top of Redis.

Redisson provides Set, SetMultimap, ScoredSortedSet, SortedSet, Map, ConcurrentMap, List, ListMultimap, Queue, BlockingQueue, and many other data structures, making Redis-based programming more convenient. On GitHub you can see hundreds of such data structures: github.com/redisson/re…

In any given language, basic arrays, lists, sets, and similar APIs combine to cover most business logic. Redis is no exception: it has the same basic API capabilities, which can be combined into distributed, thread-safe, highly concurrent applications.

Because Redis is memory based and therefore very fast, we also use it as an intermediate data store. For example, sharing common configuration through Redis turns it into a configuration center, and storing JWT tokens in Redis lets you work around some limitations of JWT and log users out safely.

3. Challenges for One-stop Redis

Redis is rich in data structures and generally causes no trouble in terms of functionality. However, as request volume grows and SLA requirements rise, we will inevitably need to modify and customize Redis.

3.1 HA Challenges

Redis provides three cluster modes (master-slave, Sentinel, and Cluster), of which Cluster mode is the one most companies use today.

However, Redis Cluster mode has many pain points. Redis Cluster uses the concept of virtual slots and maps every key to an integer slot in the range 0 to 16383, in a decentralized architecture. But it is expensive to maintain, and by default slaves do not serve read operations.

Its main problem lies in the limitations on batch operations. Because keys are hashed across multiple machines, multi-key operations such as MGET, MSET, and SUNION are unfriendly, and performance problems often occur.
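
How a key ends up in one of the 16384 slots, and why multi-key commands are awkward, can be sketched as follows. The slot is the CRC16 (XMODEM variant) of the key modulo 16384, and if the key contains a non-empty `{...}` hash tag, only the tag is hashed. The Java below is a standalone illustration; `ClusterSlot`, `crc16`, and `slotFor` are hypothetical names, and the example keys are made up.

```java
import java.nio.charset.StandardCharsets;

/** Computes Redis Cluster key slots; a standalone sketch, not taken from any client library. */
final class ClusterSlot {
    /** CRC16, XMODEM variant: polynomial 0x1021, initial value 0, no reflection. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    /** Slot in 0..16383; only the hash tag is hashed when the key has a non-empty {tag}. */
    static int slotFor(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }
}
```

Keys such as `{user:1000}:following` and `{user:1000}:followers` share the tag `user:1000` and therefore land in the same slot, which is what allows a multi-key command to touch both. Keys without a common tag will generally hash to different slots, possibly on different machines.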

Master-slave mode is the simplest Redis mode, but it cannot fail over automatically. Usually, after a master-slave switchover the service code has to be changed, which is intolerable. Even with load-balancing components like HAProxy, the complexity is high.

Sentinel mode is of great value when there are many master-slave groups. One Sentinel cluster can monitor hundreds or thousands of them, but the Sentinel cluster itself is difficult to maintain. Fortunately, the Redis text protocol is very simple, and Netty even ships a Redis codec out of the box, so developing your own Sentinel-like system with enhanced features is feasible.

3.2 Separation of Hot and Cold Data

Redis keeps all data in memory for computation, whatever the data is. For businesses with time-series data, or with a distinction between hot and cold data, this poses a very large cost challenge. Why do most developers prefer to store data in MySQL rather than Redis? Besides transactional requirements, a large part of the reason is historical data.

Usually, this switching between hot and cold data is handled by middleware. As mentioned above, the Redis text protocol is very simple, so it is relatively easy to build such middleware, or a protocol-compatible storage system that imitates Redis.

For example, suppose we keep only users active within the last year in Redis. When a user who has been inactive for several years suddenly accesses the system, the middleware must fetch that user's data from a larger, slower store.

At this point Redis acts more like a hot store, closer to a traditional cache layer. This tends to happen once the business has scaled up. But notice that up to this point, the business-layer code has only ever called the Redis API. It uses the same commands and does not care whether the data actually lives in Redis or in SSDB.
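
The read path of such a middleware can be sketched in Java. Everything here is hypothetical: `TieredStore` and its methods are illustrative names, and two in-memory maps stand in for the hot tier (Redis) and the cold tier (a larger, slower store). A real middleware would speak the Redis protocol and decide on its own policy for when to demote data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical middleware read path: hot tier first, cold tier as the fallback. */
final class TieredStore {
    private final Map<String, String> hot = new HashMap<>();   // stands in for Redis
    private final Map<String, String> cold = new HashMap<>();  // stands in for a slower store

    /** Writes always go to the hot tier. */
    void put(String key, String value) {
        hot.put(key, value);
    }

    /** Moves a key out of the hot tier, e.g. after a long period of inactivity. */
    void demote(String key) {
        String value = hot.remove(key);
        if (value != null) {
            cold.put(key, value);
        }
    }

    /** Reads from the hot tier; on a miss, loads from the cold tier and promotes the key. */
    Optional<String> get(String key) {
        String value = hot.get(key);
        if (value == null) {
            value = cold.remove(key);
            if (value != null) {
                hot.put(key, value);
            }
        }
        return Optional.ofNullable(value);
    }

    /** True if the key currently lives in the hot tier. */
    boolean isHot(String key) {
        return hot.containsKey(key);
    }
}
```

The caller only ever sees `put` and `get`, which mirrors the point above: business code keeps using one API and does not need to know which tier served the data.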

3.3 Functional Requirements

Redis can also do a lot of tricks. Take full-text search: many people would prefer ES, but the Redis ecosystem provides a module, RediSearch, that can handle queries and filters.

But we often have more requirements, such as statistics, search, and operational performance analysis. Requirements of this kind belong to big data, and even a traditional database is not up to them. At this point, of course, we need to import the data from Redis into other platforms for computation.

If you choose Redis as your database, the DBA deals with RDB files rather than the binlog. There are a number of RDB parsing tools (such as redis-rdb-tools) that can periodically parse RDB files into records and import them into other platforms such as Hadoop.

At this point, RDB becomes the hub for all teams and the basic data exchange format. Once the data has been imported into other databases, you can work with it however you like; nothing is blocked just because the business system uses Redis.

4. Summary

Running most business systems on Redis is something many developers who have always built on MySQL cannot imagine. After the introduction above, I believe you have a general picture of the storage functions Redis can take on. Open your social app, your game app, or your video app, and see how many of their features these scenarios cover.

I want to emphasize that some data does not have to land in an RDBMS to be safe; for such data, an RDBMS is not a hard requirement.

If Redis is so good, why do we still need MySQL or TiDB? The key, again, is the nature of the business.

If every interaction with the data in a business system involves a very large result set along with complex statistics and filtering, an RDBMS is necessary. But if a system can quickly locate a class of data by some identifier, and that data stays bounded for the foreseeable future, it is a good fit for Redis storage.

An e-commerce system that uses Redis as its primary storage is asking for trouble, but a social system is much happier with it. Choosing the right tool for the right scenario is what we should do.

However, the key to success is whether a system can respond to change quickly and be developed quickly during the product-validation period. This is the biggest benefit of using Redis as a database. Don't be intimidated by the small probability of losing data. However rock-solid your system is, that counts for little compared with product success.
