Load balancing algorithm

Describes the load balancing algorithm

Introduction to Load Balancing

Load balancing, also known as Load Balance in English, refers to a server set composed of multiple servers in a symmetrical manner. Each server has an equivalent status and can provide services independently without the assistance of other servers.

Through some load balancing technology, the external requests are evenly distributed to a server in a symmetric structure, and the server receiving the request responds to the customer’s request independently.

Load balancing, which evenly distributes customer requests to the server array, provides quick access to critical data and solves the problem of large numbers of concurrent access services. This clustering technology can achieve near-main-host performance with minimal investment.

Load balancing mode

Software load and hardware load

Software load balancing

Common load balancing software includes Nginx, LVS, and HAproxy

Information:

Advantages and disadvantages the three software load balancer: www.ha97.com/5646.html the three software load balancer contrast: www.21yunwei.com/archives/58…

Hardware load balancing

Common load balancing hardware components are: Array, F5

Load balancing algorithm

Random algorithm, weighted polling, consistent hash, minimum active number algorithm

Load balancing algorithm simulation

Data to support

(1) Random algorithm

1. Simple random algorithm

This simple random algorithm is used when the performance of the machines is similar to that of the machines on a daily basis. In fact, some machines in production may have a little more performance. It can handle more cases, so we can set a weight for each server.

2. Weighted stochastic algorithm — V1

The V1 version of the weighted stochastic algorithm has a simple idea. Each server is copied according to its corresponding weight. This implementation method consumes more memory when the sum of weights is particularly large, because IP addresses need to be copied. The larger the sum of weights, the more memory is needed for the IPS mentioned above.

3. Weighted stochastic algorithm — V2

Here’s another way to do this. Consider A set of servers=[A, B, C] with weights=[5,3,2] and weights that add up to 10. (1) now tile these weights on one-dimensional coordinates, so that interval [0,5] belongs to server A, interval [5,8] belongs to server B, and interval [8,10] belongs to server C. (2) next, generate a random number in the range of [0,10] through random numbers, and then calculate the interval on which the random number falls.

(2) Polling algorithm

1. Simple polling algorithm

This simple polling algorithm, it’s fair, each server takes its turn to serve, but some of the machines are better than others, so they work harder than others, just like the random algorithm, so we can set a weight for each server.

2. Weighted polling algorithm

Weighted polling algorithm: idea:

For example, servers=[A,B,C], weights=[2,5,1], the total weight is 8.

We can understand that there are 8 servers, 2 A’s, 5 B’s, and 1 C’s

To access in order, if there are 10 calls, the call order is AABBBBBCAA.

Steps:

Because the number of calls will become larger and larger, and the server is fixed, the number of calls needs to be “reduced” to take mod

The weight values are tiled on the one-dimensional coordinate values: [0,2] is A, [2,7] is B, and [7,8] is C

Next, we need to obtain the number of requests. We need to do the mod operation on the total weight and obtain the offset

This algorithm has one drawback: For example, if we have three servers=[A,B,C], for example, if we have three servers=[A,B,C], for example, if we have three servers=[A,B,C], for example, if we have three servers=[A,B,C], Corresponding weights=[2,5,1], the total weight is 7, then the result of the above algorithm is AAAAABC. What if the result is AABACAA, B and C are inserted into the middle of the five A’s on average, which is more balanced.

3. Smoothing weighted polling algorithm

Then the smooth weighted polling idea is introduced:

Each server corresponds to two weights, weight and currentWeight. Where weight is fixed, currentWeight is dynamically adjusted, starting at 0

When a new request comes in, the list of servers is traversed, its currentWeight plus its own weight, and when the traversal is complete, the maximum currentWeight is found.

And the maximum currentWeight minus the sum of the weights, and then return the corresponding server.

Hypothesis: Test data:

WEIGHT_LIST.put(“A”, 5); WEIGHT_LIST.put(“B”, 1); WEIGHT_LIST.put(“C”, 1);

The operation process is as follows:

The number of	CurrentWeight array (currentWeight+=weight)	Select result Max (currentWeight)	CurrentWeight (Max (currentWeight)-=sum(weight)
1	,1,1 [5]	A	[- 2,1,1]
2	,2,2 [3]	A	[4,2,2]
3	Filling [1]	B	[1, 4, 3]
4	[6, 3, 4]	A	[- 1, 3, 4]
5	(4, 2, 5)	C	[4,-2,-2]
6	[9,-1,-1]	A	(2, 1, 1]
7	[0, 7]	A	[0, 0]
8	,1,1 [5]	A	[- 2,1,1]

As shown in the table, the obtained server sequence is [A, A, B, A, C, A, A] after sliding processing, which is better distributed than the previous sequence [A, A, A, A, A, B, C]. In the initial case currentWeight=[0,0,0], after the seventh request is processed, currentWeight changes back to [0,0,0] again. You will be surprised to find that the current currentWeight array changes back to [5,1,1] on the 8th time.

Concrete code implementation is as follows:

(3) Consistent hash algorithm

Cluster server receives a request call, can according to the request of information, such as the client’s IP address, or request path and parameter information such as hash, you can get a hash value, characteristic is for the same IP address, or request the path and parameter hash value is the same, as long as you can add an algorithm, Being able to map this hash value to the address of a server IP will allow the same request to land on the same server.

Since the number of requests made by the client is infinite, the hash value is also infinite, so you cannot map all the hash values to the server IP, so you need to use a hash ring here. The diagram below:

If the ip4 server is down, this is what it looks like:

It will be found that the direct range of IP3 and IP1 is relatively large, and more requests will fall on IP1, which is unfair. To solve this problem, we need to add virtual nodes:

Ip2-1 and IP3-1 are virtual nodes, which cannot handle nodes, but are equivalent to corresponding IP2 and IP3 servers. In fact, this is just one way to deal with this imbalance. In fact, even if the hash ring itself is balanced, you can add more virtual nodes to make the ring more balanced. For example:

This color ring is also fair, and only IP1, IP2, IP3, ip4 are actual server IP, the rest are virtual IP. So how do we do that?

For our server IP address, we must know how many virtual nodes there are and how many virtual nodes we need to control by ourselves. The more virtual nodes there are, the more balanced the traffic will be. In addition, the hash algorithm is also critical, the more hash the algorithm is, the more balanced the traffic will be.

Such rings can be stored using TreeMap; When a request comes in, it retrieves the hash value of the request. Based on the hash value, it retrieves a subtree larger than the hash value from the TreeMap.

And then you just take the first element from the subtree that you get.

Concrete code implementation:

(4) Minimum active number algorithm

The main goal of the previous approaches is to get as many calls allocated to the server as possible, but is that really the case? The same number of calls, the server load is balanced? Of course not. There’s also the time per call, which is what the minimum active number algorithm does.

A smaller number of active invocations indicates that the service provider is more efficient and can process more requests per unit of time. At this point, the request should be assigned to the service first

Service provider. In the implementation, there is one active number for each service provider. The initial number of active service providers is 0. Each time a request is received, the number of active requests is increased by one, and when the request is completed, the number of active requests is decreased by one. After the service runs for a period of time, the service provider with good performance can process the request faster, so the active number decreases faster. At this time, such a service provider can get the new service request first, which is the basic idea of the minimum active number load balancing algorithm. Besides the minimum active number, the minimum active number algorithm also introduces the weight value in the implementation. So to be precise, the minimum active number algorithm is based on the weighted minimum active number algorithm. For example, in a service provider cluster, there are two excellent service providers. If they have the same number of active requests at a certain moment, requests will be allocated according to their weights. The greater the weights, the greater the probability of obtaining new requests. If the two service providers have the same weight, select one at random.

Implementation:

Because the number of active requests needs to be processed by the server logic, the start of the call is active +1, the end is active -1, so this part of the logic is not simulated, directly using a map to simulate.

Concrete code implementation:

// Minimum active algorithm
public class LeastActive {
    private static String getServer(a) {
        // Find the server with the smallest number of active servers
        Optional<Integer> minValue = ServerIps.ACTIVITY_LIST.values().stream().min(Comparator.naturalOrder());
        if (minValue.isPresent()) {
            List<String> minActivityIps = new ArrayList<>();
            ServerIps.ACTIVITY_LIST.forEach((ip, activity) -> {
                if(activity.equals(minValue.get())) { minActivityIps.add(ip); }});// If there are multiple IP addresses with the minimum number of active IP addresses, they are selected according to the weight
            if (minActivityIps.size() > 1) {
                // Filter out the corresponding IP address and weight
                Map<String, Integer> weightList = new LinkedHashMap<>();
                ServerIps.WEIGHT_LIST.forEach((ip, weight) -> {
                    if(minActivityIps.contains(ip)) { weightList.put(ip, ServerIps.WEIGHT_LIST.get(ip)); }});int totalWeight = 0;
                boolean sameWeight = true;
                Object[] weights = weightList.values().toArray();
                // Calculate the total weight and determine whether the ownership weight is the same
                for (int i = 0; i < weights.length; i++) {
                    Integer weight = (Integer) weights[i];
                    totalWeight += weight;
                    if (sameWeight && i > 0 && !weight.equals(weights[i - 1])) {
                        sameWeight = false; }}// Generate a random number within [0,totalWeight]
                java.util.Random random = new java.util.Random();
                int randomPos = random.nextInt(totalWeight);
                if(! sameWeight) {for (String ip : weightList.keySet()) {
                        Integer weight = weightList.get(ip);
                        if (randomPos < weight) {
                            returnip; } randomPos = randomPos - weight; }}// If all ownership weights are the same, use the random algorithm
                randomPos = random.nextInt(weightList.size());
                return (String) weightList.keySet().toArray()[randomPos];
            } else {
                return minActivityIps.get(0); }}else {
            java.util.Random random = new java.util.Random();
            int randomPos = random.nextInt(ServerIps.WEIGHT_LIST.size());
            return(String) ServerIps.WEIGHT_LIST.keySet().toArray()[randomPos]; }}public static void main(String[] args) {
        for (int i=0; i<10; i++){ System.out.println(getServer()); }}}Copy the code

Load balancing summary:

Reference: dubbo.apache.org/zh-cn/docs/…

Load balancing algorithm