Application scenarios and implementation principles of cluster flow control

Preface

Resources are limited, forecasts are imperfect, and accidents happen. Many major production incidents are caused by services being overwhelmed by sudden traffic, so traffic management and protection are particularly important. Nipping problems in the bud and ensuring high availability of services both deserve attention. In addition, we need governance capabilities on par with the best in the industry. This article introduces another member of the high-availability toolkit: cluster traffic limiting, also called cluster flow control.

I. Application scenarios of cluster flow control

Scenario 1: The total number of calls needs to be controlled

In some scenarios, you need to limit the total number of calls an application makes to a resource (interface). For example, the application relies on a service provided by a third party, the third party limits its traffic, and so the total call volume must be capped. The number of deployed nodes may grow or shrink, so a limit set on each individual node cannot enforce a fixed total.

Scenario 2: Traffic is unevenly distributed across nodes

The application is deployed on 10 nodes with a total traffic of 2000 QPS, or 200 QPS per node. This is the ideal situation; in practice, load balancing may skew the traffic so that some nodes receive a lot of traffic and others receive little. If traffic limiting is implemented only on each single node, some nodes of the application may start rejecting requests while other nodes are still lightly loaded.
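The effect of skew can be checked with a little arithmetic. The sketch below compares how many requests per second are admitted under a per-node limit of 200 and under a cluster-wide limit of 2000. The skewed per-node distribution and both function names are made up for illustration; only the totals (10 nodes, 2000 QPS) come from the scenario.

```python
def admitted_with_node_limit(traffic, per_node_limit):
    """Total QPS admitted when every node enforces its own limit."""
    return sum(min(qps, per_node_limit) for qps in traffic)


def admitted_with_cluster_limit(traffic, cluster_limit):
    """Total QPS admitted when one shared limit covers the whole cluster."""
    return min(sum(traffic), cluster_limit)


# Hypothetical skewed distribution: 10 nodes, 2000 QPS in total.
skewed = [350, 300, 250, 250, 200, 200, 150, 150, 100, 50]

print(admitted_with_node_limit(skewed, 200))      # 1650
print(admitted_with_cluster_limit(skewed, 2000))  # 2000
```

With per-node limits, 350 requests per second are rejected even though the cluster as a whole is within its budget; the shared limit admits all of them.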

Scenario 3: Nodes have different hardware configurations

The application has 10 nodes, some 2C4G (2 cores, 4 GB of memory) and some 8C16G. In such a mixed deployment, if the traffic limiting threshold is set per node, it has to be set to what the lowest-spec nodes can bear, and the higher-spec nodes waste resources.

Note: Combining cluster flow control with single-node traffic limiting is a good practice, because together they cope better with traffic protection in different scenarios.

II. Implementation principle of cluster flow control

To achieve cluster flow control, you need to count the total number of requests across the cluster. We select one machine inside the cluster as the Token Server, which collects the traffic statistics and issues tokens.

Request process

As shown in the figure below, service A invokes service B in the request link, and service B has cluster flow control enabled. One node of service B acts as the Token Server and the other nodes act as Token Clients. The process is as follows:

When a request reaches a node, the node sends a token request to the Token Server
The Token Server decides whether to issue a token based on whether the threshold has been reached
If the node obtains a token, it forwards the request downstream
If the request would exceed the traffic limiting threshold, the Token Server does not issue a token and the request is rejected
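The request process above can be sketched in a few lines. This is a minimal in-process illustration, not the actual implementation: the class names `TokenServer` and `TokenClient`, the method names, and the simple one-second counting window are all assumptions, and in a real deployment the call from client to server would be a network request.

```python
class TokenServer:
    """Counts tokens issued cluster-wide (hypothetical one-second window)."""

    def __init__(self, threshold_per_second):
        self.threshold = threshold_per_second
        self.window = None   # the second currently being counted
        self.issued = 0      # tokens issued in that second

    def request_token(self, now):
        window = int(now)
        if window != self.window:   # a new second starts: reset the counter
            self.window = window
            self.issued = 0
        if self.issued < self.threshold:
            self.issued += 1
            return True
        return False                # threshold reached: no token


class TokenClient:
    """A service node that asks the Token Server before handling a request."""

    def __init__(self, name, server):
        self.name = name
        self.server = server

    def handle(self, request, now):
        if self.server.request_token(now):
            return f"{self.name} forwarded {request}"
        return f"{self.name} rejected {request}"


server = TokenServer(threshold_per_second=3)
nodes = [TokenClient("node-1", server), TokenClient("node-2", server)]
for i in range(5):
    # The first three requests are forwarded, the last two are rejected.
    print(nodes[i % 2].handle(f"req-{i}", now=0.5))
```

Because both nodes share one server, the limit of 3 applies to their combined traffic, regardless of which node a request lands on.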

Implementation principle

The implementation of cluster flow control is still based on token buckets, as shown in the diagram below:

Working process

Each request takes a token from the token bucket; a request that obtains a token passes, otherwise it is rejected
If the threshold is set to 100 requests per second, the permitted request rate is r = 100/s
Tokens are therefore produced at an interval of 1/r, that is, one token every 1/100 s = 10 milliseconds
When the token bucket of capacity B is full, excess tokens are discarded
When the token bucket is empty, requests are rejected
The maximum burst traffic allowed is the token bucket capacity B
Each request that passes removes its token from the token bucket
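The working process above corresponds to a standard token bucket, which can be sketched as follows. This is a minimal single-process illustration with assumed names (`TokenBucket`, `try_acquire`); the clock is passed in explicitly so the behaviour is easy to follow, and tokens are refilled lazily from the elapsed time rather than by a timer.

```python
class TokenBucket:
    """Token bucket with refill rate r (tokens per second) and capacity B."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity   # assumption: the bucket starts full
        self.last = now          # time of the last refill

    def _refill(self, now):
        elapsed = now - self.last
        if elapsed > 0:
            # Add rate * elapsed tokens; anything above capacity is discarded.
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            self.last = now

    def try_acquire(self, now):
        self._refill(now)
        if self.tokens >= 1:
            self.tokens -= 1     # a passing request removes one token
            return True
        return False             # empty bucket: the request is rejected


bucket = TokenBucket(rate=100, capacity=5)
# A burst of 6 requests at t=0: the first 5 pass (capacity B), the 6th is rejected.
print([bucket.try_acquire(0.0) for _ in range(6)])
```

With `rate=100`, one token becomes available roughly every 10 milliseconds, and `capacity` bounds how large a burst can pass at once.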

Dynamic master election

As our services are released and scaled, the node originally selected as the Token Server may go offline. We use a distributed lock (a fair lock) to elect the master dynamically.

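Using a fair lock mainly avoids the herd effect (thundering herd) of an exclusive lock, where every waiting node is woken and competes for the lock each time it is released.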
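The idea of electing the Token Server through a fair lock can be illustrated with an in-memory stand-in. The article does not say which lock service is used, so the `FairLock` class below is purely hypothetical: waiters queue in arrival order, the head of the queue holds the lock and acts as the Token Server, and when the holder leaves only the next waiter is notified.

```python
from collections import deque


class FairLock:
    """In-memory stand-in for a distributed fair lock (FIFO waiters)."""

    def __init__(self):
        self._queue = deque()

    def enqueue(self, node):
        """A node joins the election; earlier arrivals are served first."""
        if node not in self._queue:
            self._queue.append(node)

    def holder(self):
        """The node at the head of the queue holds the lock (Token Server)."""
        return self._queue[0] if self._queue else None

    def leave(self, node):
        """Remove a node that went offline.

        Returns the single node to notify as the new holder, or None when
        the lock did not change hands (no herd of waiters is woken).
        """
        if node not in self._queue:
            return None
        was_holder = self._queue[0] == node
        self._queue.remove(node)
        return self.holder() if was_holder else None


lock = FairLock()
for name in ["node-1", "node-2", "node-3"]:
    lock.enqueue(name)

print(lock.holder())         # node-1
print(lock.leave("node-1"))  # node-2
```

Only one node is promoted per handover, which is the property the fair lock is chosen for here.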

Threshold Settings

There are two types of cluster thresholds: global threshold and single-node allocation (a threshold apportioned per node)

Global threshold: the total traffic threshold is fixed. For example, if the global threshold is set to 500, the total threshold remains 500 no matter how many nodes the application is deployed on
Single-node allocation: the total threshold changes dynamically as nodes are added or removed. For example, if the threshold for each node is set to 100, the total flow control threshold of a three-node cluster is 300. After two nodes are added, the total threshold is 500
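The two threshold types differ only in how the effective cluster-wide threshold is derived. A minimal sketch, assuming a hypothetical function `total_threshold` and the mode names `"global"` and `"per_node"`:

```python
def total_threshold(mode, value, node_count):
    """Effective cluster-wide threshold for the two threshold types."""
    if mode == "global":
        return value               # fixed, independent of the node count
    if mode == "per_node":
        return value * node_count  # scales as nodes are added or removed
    raise ValueError(f"unknown threshold mode: {mode}")


print(total_threshold("global", 500, 3))    # 500
print(total_threshold("per_node", 100, 3))  # 300
print(total_threshold("per_node", 100, 5))  # 500
```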