Editor's note from Little Hub:

This article looks at three options in turn: MQ, traditional scheduled tasks, and Redis SortedSet queues. It analyzes the feasibility of each, and finally gives the logic and part of the code for the Redis implementation. Did you learn something?


  • Author: I’m Lin Lin
  • www.cnblogs.com/linlinismin…

I recently built my company's coupon center, a project in which Redis is a key technology.

First, some background on the coupon center. It is similar to the coupon center in the Jingdong (JD.com) app. Of course, the picture is taken from Jingdong, not from our company's app.

One of its features is called subscription push for coupons.

What is subscription push for coupons?

A user subscribes to a reminder for a coupon, and a reminder message is pushed to the user's app one minute before the coupon becomes available to claim.

The subscription feature was supposed to be built in the message center, but that team said they could not get to it for a while, so it fell to me, the coupon guy…! The solution is for the coupon system to call the message center's push interface at the designated push time to send the message out.

Let's look at the business scenario for this feature. The company currently has more than 60 million (6000W+) registered users, but let's not go into that… For example, if there is a no-threshold coupon that takes 20 yuan off immediately, a lot of people will try to grab it. We conservatively estimate 100,000+ (10W+), and a figure in the millions cannot be ruled out. We initially set the target at 200,000 (20W) users, so 200,000 push messages must go out within one minute! On top of that, one user can subscribe to multiple coupons. So this subscription feature has two salient difficulties:

  1. Push timeliness: if pushes are slow, users will complain that they missed the opening time because they were not notified in time.

  2. Large push volume: when a coupon is a hit, everyone wants to grab it!

And the push volume in turn affects push timeliness. This is really a headache!

Let’s take them one by one.

The timeliness problem: after a user subscribes to a reminder for a coupon in the coupon center, a subscription reminder record is generated in the backend. It records the point in time at which the push should be sent to that user. So the problem becomes how the system can quickly select, in real time, the records that are due to be pushed!
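As a rough illustration of such a reminder record, here is a minimal sketch. The class name `ReminderRecord`, its fields, and the helper methods are all hypothetical and not taken from the project; the only thing it assumes from the text is that the push time is one minute before the coupon can be claimed.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical reminder record; names and fields are assumptions for illustration.
final class ReminderRecord {
    final long userId;
    final long couponId;
    final Instant pushAt; // the point in time at which the reminder should be pushed

    ReminderRecord(long userId, long couponId, Instant pushAt) {
        this.userId = userId;
        this.couponId = couponId;
        this.pushAt = pushAt;
    }

    // The reminder goes out one minute before the coupon can be claimed.
    static ReminderRecord forCoupon(long userId, long couponId, Instant claimStart) {
        return new ReminderRecord(userId, couponId, claimStart.minus(Duration.ofMinutes(1)));
    }

    // A record is due once the current time has reached its push time.
    boolean isDue(Instant now) {
        return !now.isBefore(pushAt);
    }

    // Epoch seconds of the push time, usable as a sortable numeric value.
    long pushAtEpochSeconds() {
        return pushAt.getEpochSecond();
    }
}
```

Selecting "which records to push right now" then amounts to finding every record for which `isDue(now)` is true, and the rest of the article is about doing that quickly at scale.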

Solution 1:

Delayed delivery with MQ. MQ supports delayed delivery of messages, but the granularity of the delay is too coarse for delivery at a precise point in time. And if a user subscribes and then unsubscribes, deleting the MQ message that has already been sent is a lot of trouble. Users can also cancel and then subscribe again, which brings in the problem of de-duplication. So the MQ solution is rejected.

Solution 2:

Traditional scheduled tasks. This is relatively simple: a scheduled task loads the user subscription reminder records from the DB and selects the ones that are currently due for push. But as the saying goes, any design that ignores the actual business is just hooliganism. Let's analyze whether traditional scheduled tasks suit our business.

  • Multiple machines running at the same time: generally not supported; only a single instance can run at a time.

  • Storage data source: typically MySQL or another traditional database, with the records stored in a single table.

  • Frequency: seconds, minutes, hours, or days are supported, but in general it cannot be too frequent.

To sum up, the typical traditional scheduled task has the following disadvantages:

  1. Performance bottleneck. With only one machine doing the processing, it cannot cope with a large volume of data!

  2. Poor timeliness. The scheduled task cannot run too frequently, or it will put heavy pressure on the business database.

  3. Single point of failure. If the one machine running the task fails, the whole feature becomes unavailable. This is a terrible thing!

So traditional scheduled tasks are not a good fit for this business…

So is there nothing we can do? Not at all! We only need a simple transformation of the traditional scheduled task! We can turn it into a scheduled task cluster that runs on multiple machines at the same time, is accurate to the second, and has no single point of failure! For this we enlist the help of our powerful Redis.

Solution 3: Scheduled task cluster

First, let's define the three problems a scheduled task cluster must solve!

1. Timeliness must be high.

2. Throughput must be large.

3. The service must be stable, with no single point of failure.

The following is an architecture diagram of the entire scheduled task cluster.

The architecture is simple: we store users' subscription push records in SortedSet queues in a Redis cluster, using each user's reminder timestamp as the score. Then we set up a timer on each business server that fires at second-level frequency; I set mine to 1s. After load balancing, each server takes the user records that are due for push from one queue and pushes them. Let's evaluate this architecture.

1. Performance: leaving aside bandwidth and other factors, throughput is linearly related to the number of machines. The more machines, the greater the throughput; the fewer machines, the lower the throughput.

2. Timeliness: improved to the second level, which is acceptable.

3. Single point of failure? It doesn't exist! Unless the Redis cluster or all the servers are down…

Why Redis?

First, Redis can serve as a high-performance storage DB. Its performance is much better than MySQL's, it supports persistence, and it is stable.

Second, a Redis SortedSet queue naturally supports sorting and selecting by time when the timestamp is the score, which is exactly what we need to select the records to push.
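To make the idea of "timestamp as score" concrete, here is a minimal in-memory sketch of one such queue. It is not the project's code and it does not talk to Redis: the class `ReminderQueue` and its method names are hypothetical, and a `TreeSet` stands in for the SortedSet. The comments note which Redis commands (ZADD, ZRANGEBYSCORE with LIMIT, ZREM) each method is meant to mirror.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// In-memory stand-in for one Redis SortedSet queue (hypothetical, for illustration only).
final class ReminderQueue {
    // One entry: a member (for example a user id) and its score (push time in epoch seconds).
    private static final class Entry implements Comparable<Entry> {
        final long score;
        final String member;

        Entry(long score, String member) {
            this.score = score;
            this.member = member;
        }

        @Override
        public int compareTo(Entry o) {
            int c = Long.compare(score, o.score);
            return c != 0 ? c : member.compareTo(o.member);
        }
    }

    private final TreeSet<Entry> byScore = new TreeSet<>();
    private final Map<String, Long> scores = new HashMap<>();

    // Mirrors ZADD: insert the member, or move it if it is already present.
    synchronized void add(String member, long pushAtEpochSec) {
        Long old = scores.put(member, pushAtEpochSec);
        if (old != null) {
            byScore.remove(new Entry(old, member));
        }
        byScore.add(new Entry(pushAtEpochSec, member));
    }

    // Mirrors ZRANGEBYSCORE key -inf now LIMIT 0 batch followed by ZREM:
    // take up to `batch` members whose push time has arrived, earliest first.
    synchronized List<String> pollDue(long nowEpochSec, int batch) {
        List<String> due = new ArrayList<>();
        while (due.size() < batch && !byScore.isEmpty() && byScore.first().score <= nowEpochSec) {
            Entry e = byScore.pollFirst();
            scores.remove(e.member);
            due.add(e.member);
        }
        return due;
    }

    // Mirrors ZREM: used when a user unsubscribes.
    synchronized boolean remove(String member) {
        Long old = scores.remove(member);
        return old != null && byScore.remove(new Entry(old, member));
    }
}
```

In a real deployment each server would call something like `pollDue` from its 1s timer (for instance via `ScheduledExecutorService.scheduleAtFixedRate`). Note that against real Redis the read and the removal are two commands, so they would need to be made atomic in some way; how the project handles that is not stated in the article.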

OK ~ now that we have a plan, how do we implement it in a day? Yes, I designed the plan and finished the basic code in one day… because time was so tight.

First, we take user_id as the key and mod it by the number of queues to hash each record into one of the Redis SortedSets. Why? Because if a user subscribes to two coupons whose push times are very close, the two pushes can be merged into one ~, and this hash is also relatively uniform. Here's a screenshot of some of the code:
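The original screenshot is not reproduced here. As a stand-in, the routing step could look roughly like the sketch below. This is not the author's code: the class `QueueRouter` and the key prefix `coupon:remind:queue:` are assumptions made for illustration.

```java
// Hypothetical routing helper: maps a user to one of N SortedSet queues.
final class QueueRouter {
    private QueueRouter() {}

    // user_id mod queue count; floorMod keeps the result non-negative.
    static int queueIndex(long userId, int queueCount) {
        if (queueCount <= 0) {
            throw new IllegalArgumentException("queueCount must be positive");
        }
        return (int) Math.floorMod(userId, (long) queueCount);
    }

    // The key prefix is an assumed naming convention, not the project's real key.
    static String queueKey(long userId, int queueCount) {
        return "coupon:remind:queue:" + queueIndex(userId, queueCount);
    }
}
```

Because the index depends only on user_id, all of one user's reminders land in the same queue, which is what makes merging two nearby pushes for the same user possible.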

  

Then we decide on the number of queues. Normally we define as many queues as we have processing servers. Too few queues leads to contention for queues, and too many may mean records are not processed in time.

However, the best practice is to make the number of queues dynamically configurable, since the number of machines in the production cluster can change frequently. We add machines during big promotions, don't we, and as business volume grows the number of machines grows too, doesn't it ~. So I used Taobao's Diamond for dynamic configuration of the queue count.

 

How many records we fetch from a queue at a time is also dynamically configurable.

This lets us adjust the throughput of the entire cluster at any time according to actual production conditions ~. So our scheduled task cluster has one more feature: it supports dynamic adjustment ~.
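A minimal sketch of holding these two tunables in memory is shown below. It deliberately does not use Diamond's API; `ClusterSettings`, the config keys `queueCount` and `batchSize`, and the `onConfigChanged` callback are hypothetical, standing in for whatever listener the config center provides. The defaults of 10 and 2000 are borrowed from the throughput figures later in the article.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder for the two dynamically configurable values.
final class ClusterSettings {
    private final AtomicInteger queueCount = new AtomicInteger(10);
    private final AtomicInteger batchSize = new AtomicInteger(2000);

    // Assumed to be invoked by the config center's change listener.
    void onConfigChanged(String key, String value) {
        if (value == null) {
            return;
        }
        int parsed;
        try {
            parsed = Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return; // malformed input: keep the previous value
        }
        if (parsed <= 0) {
            return; // non-positive values make no sense here: keep the previous value
        }
        if ("queueCount".equals(key)) {
            queueCount.set(parsed);
        } else if ("batchSize".equals(key)) {
            batchSize.set(parsed);
        }
    }

    int queueCount() {
        return queueCount.get();
    }

    int batchSize() {
        return batchSize.get();
    }
}
```

Each timer tick would then read `queueCount()` and `batchSize()` afresh, so a change pushed from the config center takes effect on the next tick without a restart.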

The last key component is load balancing. This is very important! If it is not done properly, multiple machines can end up competing to process the same queue at the same time, hurting the efficiency of the entire cluster! With time very tight, I used a simple and practical algorithm: increment a key in Redis, then mod the result by the number of queues. This largely ensures that no two machines compete for the same queue.
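The increment-then-mod idea can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `QueuePicker` is a hypothetical name, and an `AtomicLong` stands in for the shared Redis counter that would be advanced with the INCR command.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical queue picker: a shared counter hands out queue indexes round-robin.
final class QueuePicker {
    // Stand-in for a Redis key incremented with INCR; shared by all servers in production.
    private final AtomicLong counter = new AtomicLong();

    // Each call takes the next ticket and maps it onto a queue index.
    int nextQueue(int queueCount) {
        if (queueCount <= 0) {
            throw new IllegalArgumentException("queueCount must be positive");
        }
        long ticket = counter.incrementAndGet();
        return (int) Math.floorMod(ticket, (long) queueCount);
    }
}
```

Since consecutive tickets map to consecutive queue indexes, servers whose timers fire at about the same moment get different queues, as long as no more servers poll at once than there are queues. That is why the guarantee is "largely" rather than absolute.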

Finally, let’s calculate the throughput of the entire cluster

10 (number of machines) * 2000 (records per pull) = 20,000 records per round. The messages are then handed to the message center via MQ, which is asynchronous, with other processing taking about 0.5s.

In fact, sending 200,000 (20W) pushes is a matter of a few seconds.
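The arithmetic above can be written out as a small sketch. The class `ThroughputEstimate` is hypothetical; the inputs (10 machines, 2000 records per pull, 20W = 200,000 pushes) come from the text, and the assumption that each machine completes one pull per 1s timer tick is a simplification.

```java
// Hypothetical helper for back-of-the-envelope throughput estimates.
final class ThroughputEstimate {
    private ThroughputEstimate() {}

    // Records the whole cluster can take in one round (one pull per machine).
    static long perRound(int machines, int batchSize) {
        return (long) machines * batchSize;
    }

    // Rounds needed to drain `total` records, rounded up.
    static long roundsNeeded(long total, int machines, int batchSize) {
        long per = perRound(machines, batchSize);
        if (per <= 0) {
            throw new IllegalArgumentException("machines and batchSize must be positive");
        }
        return (total + per - 1) / per;
    }
}
```

With the article's numbers, `perRound(10, 2000)` is 20,000 and `roundsNeeded(200_000, 10, 2000)` is 10, so at one round per second the backlog clears in roughly ten seconds, well inside the one-minute window.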

OK ~ at this point our scheduled task cluster is basically in place. If you ask me what could still be improved, it's this:

  • Add monitoring. How can a cluster go unmonitored? What if something goes wrong while there are still tasks to run ~

  • Add a visual interface.

  • Ideally, add intelligent scheduling with task priorities, so that high-priority tasks run first.

  • Resource scheduling: when there are not enough machines, make sure important tasks are executed first.

The project is now live in production and running smoothly ~.
