Revealing a learning site that a lot of people don’t know about.

Hi, I’m crooked.

This time I want to share with you a learning website that many people don’t know about.

Aliyun.

I suspect many people get a bad feeling when they see the word “Aliyun” and think to themselves: uh-oh, this feels like an advertisement. Time to run.

Indeed, at its core, this is a site that sells services. But hold on, friend: I’m not asking you to buy anything there. I really did learn a lot of useful things from it.

Don’t worry, these things are free.

Of course, if Aliyun sees this article and wants to pay me for it, I’d be happy too.

All right, no more talking. Let’s go.

Help center

If you think of Aliyun as a website that sells services, a place where you spend money, then you’re opening it the way the company expects.

But if you open it the way I do, it turns into a free learning site.

The way I opened it was through its help center:

help.aliyun.com/

Then, focus on the column in the navigation bar on the right.

For example, here in the database section, let me circle a few:

And here in the middleware section, I’ve circled a few as well:

Of course, there are a lot of other things to look at, and I’m not going to show them all, but if you’re interested, you can browse through them yourself.

What is on display inside the help center is what Aliyun can sell.

And if they want to sell a service, they have to have documentation.

The documentation is there to brag: how good my service is, how stable, how worry-free, what the application scenarios are, blah, blah, blah…

So browsing the documentation for the technology you care about is the right way to open this site.

Let me use Redis as an example.

Aliyun Redis documentation

I think the Redis documentation is the best of Aliyun’s technical docs, so let me show you.

Help.aliyun.com/product/263…

There is a lot worth paying attention to in this learning path.

For example, let’s look at the following sections:

Application scenarios

First, what are the application scenarios for Redis?

Isn’t this a question interviewers often ask: what do you use Redis for in your application?

What do you say?

It’s no exaggeration to say that 95% of people would answer: as a cache.

Yes, it’s basically used as a cache. But could you add this: besides caching, I know it can be used in other scenarios too, such as:

Help.aliyun.com/document_de…

Then just rattle off the examples listed in the documentation. With industries and concrete scenarios behind it, an answer like that is bound to go over better than a bare “cache”.

Disaster recovery plan

Then let’s look at the “Disaster recovery plan”:

Help.aliyun.com/document_de…

You may have heard of the concept of “disaster recovery”, but it mostly concerns operations staff, and we as developers don’t really know much about it.

But if you do know something about it, that’s a plus.

Aliyun describes three disaster recovery solutions.

In the single-availability-zone HA solution, which has the lowest disaster recovery level, the active and standby nodes are deployed on different machines in the same availability zone. When either node fails, the high availability (HA) system automatically performs a failover to avoid service interruption caused by a single point of failure.
In the same-city disaster recovery solution, which has a medium disaster recovery level, the active and standby nodes are deployed in two availability zones in the same region. If either availability zone loses communication because of power or network problems, the HA system performs a failover to keep the whole instance continuously available.
The cross-region disaster recovery solution has the highest disaster recovery level. A globally distributed instance is made up of multiple child instances, and all child instances keep their data synchronized in real time through synchronization channels. A channel manager monitors the health of the child instances and handles abnormal events such as primary/standby switchover. It suits scenarios such as remote disaster recovery, geo-distributed active-active deployments, nearest-site application access, and load sharing.

Even the single-availability-zone HA solution, the one with the lowest disaster recovery level, comes in several deployment architectures:

For example, here is the standard two-copy high availability architecture:

It adopts the master-replica architecture. When the HA module detects a fault on the Master node, it automatically switches the Replica to Master, and the original Master becomes a new Replica after the connection is restored.
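If it helps, here is a toy sketch in Python of that switchover. The Node and HAMonitor names are made up by me for illustration, the health flag is flipped by hand, and this is in no way how Aliyun’s HA module is actually implemented:

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str            # "master" or "replica"
    healthy: bool = True


class HAMonitor:
    """Hypothetical stand-in for the HA module; for illustration only."""

    def __init__(self, master: Node, replica: Node) -> None:
        self.master = master
        self.replica = replica

    def check(self) -> None:
        # Switch roles only when the master is down and the replica is up.
        if not self.master.healthy and self.replica.healthy:
            self.master.role, self.replica.role = "replica", "master"
            self.master, self.replica = self.replica, self.master

    def writable_node(self) -> Node:
        return self.master


node_a = Node("node-a", "master")
node_b = Node("node-b", "replica")
ha = HAMonitor(node_a, node_b)

node_a.healthy = False       # the master fails
ha.check()                   # the HA module notices and switches roles
print(ha.writable_node().name)

node_a.healthy = True        # the old master reconnects
ha.check()                   # nothing to do: it stays on as the new replica
print(node_a.role, node_b.role)
```

The point to notice is the last step: when the old master comes back, it does not take its role back, it simply stays on as the replica.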

This is the most basic and simple architecture diagram for high availability.

But the problem with this architecture is that there is only one cluster. What if the whole cluster fails?

There is no choice but to evolve the architecture.

So, there is also the cluster version of the two-copy high availability architecture:

In the cluster (two-copy) architecture, an instance stores its data in data shards. Each shard is a two-copy high availability setup, with the copies deployed on different machines. If a primary node fails, the system automatically fails over to the standby node to keep the service available.

Then there is the oft-mentioned read-write separation architecture:

I won’t go into the same-city and cross-region disaster recovery plans here; you can browse those yourself.

Command support

I also found something new in command support:

Some commands are only supported in the Enterprise Edition, which means Aliyun has done its own development on top of Redis and enhanced some commands.

Take these two commands for example:

What are they for?

Let me ask you a question: which command do you use to release a distributed lock built on Redis?

DEL command. Yes, that’s a good answer.

So what do you need to pay attention to when unlocking?

Do you need to check that the thread releasing the lock is the same one that acquired it?

If you’re not sure why, the website offers an example:

At time t1, App1 sets the distributed lock resource_1 with an expiration time of 3 seconds.
App1 waits for more than 3 seconds because the program is slow, and resource_1 expires and is released at time t2.
At time t3, App2 acquires the distributed lock.
App1 recovers from the wait and, at time t4, runs DEL resource_1, which releases the distributed lock held by App2.

Oh dear, you didn’t acquire that lock yourself, yet you released it?

So, as the sequence above shows, a lock must only be released by the client that set it.

That means the client needs to run the GET command to confirm that it set the lock itself, and then run the DEL command to release it.
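To see the t1 to t4 timeline play out, here is a toy replay in Python. FakeRedis is a hypothetical in-memory stand-in that I wrote with a hand-driven clock, not a real Redis client, and the token strings are invented. Its compare_and_delete is a single Python call, so it sidesteps the atomicity question entirely; it only shows the effect of checking the owner before deleting:

```python
class FakeRedis:
    """In-memory stand-in with a manual clock; an illustration, not a Redis client."""

    def __init__(self) -> None:
        self.now = 0.0
        self.data = {}  # key -> (value, expires_at)

    def _live(self, key):
        entry = self.data.get(key)
        if entry is not None and entry[1] <= self.now:
            del self.data[key]          # expire lazily on access
            entry = None
        return entry

    def set_nx_ex(self, key, value, ttl):
        """Mimics SET key value NX EX ttl."""
        if self._live(key) is not None:
            return False
        self.data[key] = (value, self.now + ttl)
        return True

    def get(self, key):
        entry = self._live(key)
        return None if entry is None else entry[0]

    def delete(self, key):
        """Mimics DEL: removes the key no matter who set it."""
        if self._live(key) is None:
            return 0
        del self.data[key]
        return 1

    def compare_and_delete(self, key, expected):
        """Delete only when the stored value is ours."""
        if self.get(key) != expected:
            return 0
        return self.delete(key)


def run_timeline(safe_unlock: bool):
    r = FakeRedis()
    r.set_nx_ex("resource_1", "app1-token", 3)   # t1: App1 locks for 3 seconds
    r.now = 4                                     # t2: App1 is stuck, the lock has expired
    r.set_nx_ex("resource_1", "app2-token", 3)   # t3: App2 acquires the lock
    if safe_unlock:                               # t4: App1 wakes up and unlocks
        r.compare_and_delete("resource_1", "app1-token")
    else:
        r.delete("resource_1")
    return r.get("resource_1")                    # who holds the lock now?


print(run_timeline(safe_unlock=False))  # None: App2's lock was wiped out
print(run_timeline(safe_unlock=True))   # app2-token: App2 still holds it
```

With a plain DEL, App2 loses a lock it never released. With the owner check, App1’s late unlock is simply a no-op.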

Get, then compare, then delete. Obviously, this is not an atomic operation. What do we do?

Yes, Lua scripts.

In Redis, you usually need a Lua script to make sure a lock is only released by whoever set it, for example:

if redis.call("get",KEYS[1]) == ARGV[1] then
    return redis.call("del",KEYS[1])
else
    return 0
end

Is that a bit of a hassle?

So Aliyun created its own CAD command.

The locking logic remains the same:

SET resource_1 random_value NX EX 5

Unlocking looks like this:

/* if (GET(resource_1) == my_random_value) DEL(resource_1) */
CAD resource_1 my_random_value

Isn’t that much simpler?

I haven’t looked at the underlying implementation yet, but my guess is that it wraps the same logic as the Lua script.

The CAS command is for renewing the lock’s lease; you can look at that one yourself.

Help.aliyun.com/document_de…

At the same time, the article also mentioned how to guarantee consistency:

If you don’t understand “if the lost data is associated with a distributed lock, it will cause a problem in the locking mechanism, which will cause a business exception”, then you won’t understand the “Redlock” solution either.

So isn’t this another place where you’ve found something new to learn?

On this point, I’ve actually written an article too: “Redis locks, from interview rapid-fire questions to a battle of the gods.” If you haven’t read it, you can check it out.

In addition, you can see several examples in the left navigation bar:

These were developed in-house by Aliyun, but some of the commands are actually open source, including the CAD and CAS commands mentioned earlier:

Github.com/alibaba/Tai…

Have you found another direction to study?

For example, there is a TairZset command.

As we know, Redis’s native Zset is mostly used for leaderboards, but it only supports sorting on a single dimension.

Aliyun created the TairZset command, which extends this to support sorting on multiple dimensions:

This command is also open source.
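To get a feel for what “multiple dimensions” buys you on a leaderboard, here is a small Python sketch. It does not use TairZset at all. The medal table is made-up data, and the rule that ties on the first dimension fall through to the next one is my assumption about how a multi-dimensional score is compared:

```python
# Hypothetical medal table: (gold, silver, bronze) per team.
medals = {
    "team_a": (3, 1, 2),
    "team_b": (3, 2, 0),
    "team_c": (2, 5, 5),
}


def rank_by_gold_only(scores):
    """Single dimension: teams with the same gold count are not separated."""
    return sorted(scores, key=lambda member: scores[member][0], reverse=True)


def rank_multi(scores):
    """All dimensions: a tie on gold is broken by silver, then by bronze."""
    return sorted(scores, key=lambda member: scores[member], reverse=True)


print(rank_by_gold_only(medals))
print(rank_multi(medals))
```

With a single score, team_a and team_b are tied on three golds and their order is arbitrary. With the full tuple, team_b comes first because of its extra silver.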

Performance troubleshooting and tuning

Help.aliyun.com/document_de…

This part is just amazing.

You can tell from the titles alone that it’s solid, practical stuff.

Let me give you an example:

Say Redis CPU usage is high: what solutions or troubleshooting ideas do you have?

It doesn’t matter if you don’t know. The documentation gives three.

The first is to find and disable high-consumption commands.

High-consumption commands: commands whose time complexity is O(N) or higher. Generally, the higher a command’s time complexity, the more resources it consumes and the higher the CPU usage.

Because Redis executes commands on a single thread, a high-consumption command makes other commands queue up behind it and slows down application responses.

In extreme cases, the instance can be blocked altogether, so that applications time out, or traffic skips the cache layer and hits the back-end database directly, causing an avalanche effect.

So how do we find high-cost commands?

This is one of the functions provided by their products:

It even slices the data up for you, so you can see everything at a glance.

So what if instead of using its visual pages, we use native features?

Then the answer is the slow log. The slowlog-log-slower-than and slowlog-max-len parameters control which commands get recorded as slow and how many entries the log keeps.
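To make those two settings concrete, here is a toy Python model of a slow log. The SlowLog class is hypothetical and written by me. It only mirrors the idea as I understand it: a threshold in microseconds, plus a bounded list that drops the oldest entries. The commands and timings are invented:

```python
from collections import deque


class SlowLog:
    """Toy model of a slow log; the class itself is hypothetical."""

    def __init__(self, slower_than_us: int, max_len: int) -> None:
        self.slower_than_us = slower_than_us    # plays the role of slowlog-log-slower-than
        self.entries = deque(maxlen=max_len)    # plays the role of slowlog-max-len

    def record(self, command: str, duration_us: int) -> None:
        # Only commands slower than the threshold are kept.
        if duration_us > self.slower_than_us:
            # appendleft on a full deque drops the oldest entry on the right.
            self.entries.appendleft((command, duration_us))

    def get(self, count: int):
        """Newest entries first."""
        return list(self.entries)[:count]


log = SlowLog(slower_than_us=10_000, max_len=2)
log.record("GET user:1", 120)                # fast, ignored
log.record("KEYS *", 45_000)                 # slow, recorded
log.record("HGETALL big_hash", 18_000)       # slow, recorded
log.record("SMEMBERS big_set", 30_000)       # slow, pushes out the oldest entry
print(log.get(10))
```

Only the two most recent slow commands survive here, which is why a log that is too short can hide the command you are actually hunting for.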

Besides, even if you said in an interview that you found the high-consumption commands through a visual page, I’d have no problem with that. What matters is your line of thinking.

Once you have the idea, finding the corresponding solution isn’t hard.

Beyond that, there are a few more approaches:

What if, after all this, CPU utilization is still high?

If you’ve evaluated the workload and it’s legitimate business traffic, then spend the money: add machines and upgrade the configuration.

Best practices

Help.aliyun.com/document_de…

Do, do, do read the best practices section for each technology.

In the case of leaderboards, for example, you are given an environment and code to run directly:

You just follow the procedure.

Another example is JedisPool resource pool optimization.

Instructions and recommended values are given for Redis configuration. Reasonable configuration can improve Redis service performance and reduce resource overhead.