In software architecture design, cluster load balancing is an essential technique for building a high-performance system. Load balancing essentially spreads user traffic so that no single machine is overwhelmed, which makes its importance self-evident in high-traffic Internet projects.

What is load balancing?

In early Internet applications, user traffic was small and business logic was simple, so a single server could often handle the load. Today, any reasonably successful system sees enormous traffic, and system functionality keeps growing more complex. No matter how well a single server is optimized, it cannot withstand that kind of user access pressure on its own, so multiple machines must be used to build a high-performance cluster.

So how do multiple servers balance traffic and form a high-performance cluster?

This is where the "load balancer" comes in.

A load balancer evenly distributes user traffic across multiple back-end servers according to a forwarding policy. Each back-end server can independently respond to and process requests, which spreads the load. Load balancing improves the system's service capability and enhances application availability.


How many load balancing schemes are there?

At present, three load balancing solutions are most common in the market:

DNS-based load balancing

Hardware-based load balancing

Software-based load balancing

Each of the three has its own advantages and disadvantages. DNS load balancing balances traffic by region, hardware load balancing is mainly used to carry large server clusters, and software load balancing mostly balances traffic at the machine level. In real scenarios, the three can be used together. Here's a closer look:

1. DNS-based load balancing

Load balancing based on DNS is the simplest solution: it only requires a simple configuration on the DNS server. The principle is that when a user accesses a domain name, the DNS server resolves the domain name to an IP address, and we can have it return different IP addresses for users in different geographical locations. For example, users in the south get the IP address of our business server in Guangzhou, while users in the north get the IP address of our business server in Beijing.

In this mode, user requests are distributed according to the "proximity principle", which not only reduces the load on any single cluster but also improves users' access speed.
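To make the decision concrete, here is a minimal, hypothetical sketch of the region-to-IP mapping a geo-aware DNS service performs. The region names and addresses below are invented for illustration; in reality this logic lives in the DNS provider's configuration, not in application code:

```java
import java.util.Map;

// Toy illustration of geo-aware DNS resolution (not a real DNS server).
// Region names and IP addresses are made-up examples.
public class GeoDnsExample {

    private static final Map<String, String> REGION_TO_IP = Map.of(
            "south", "203.0.113.10",  // e.g. the Guangzhou cluster
            "north", "203.0.113.20"   // e.g. the Beijing cluster
    );

    // Return the IP of the cluster closest to the client,
    // falling back to a default address for unknown regions.
    static String resolve(String clientRegion) {
        return REGION_TO_IP.getOrDefault(clientRegion, "203.0.113.10");
    }

    public static void main(String[] args) {
        System.out.println(resolve("south")); // 203.0.113.10
        System.out.println(resolve("north")); // 203.0.113.20
    }
}
```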

The DNS solution has the natural advantages of simple configuration, low implementation cost, and no extra development or maintenance work. But it has an obvious disadvantage: configuration changes do not take effect promptly. This stems from how DNS works: DNS generally has multiple levels of caching, so after the configuration is modified, clients keep resolving to the old IP addresses because of those caches, which hurts the load balancing effect.

In addition, DNS load balancing offers only region-based resolution or simple IP round-robin, with no advanced routing policies. That is the inherent limitation of the DNS approach.

2. Hardware-based load balancing

The best-known example is the F5 Networks BIG-IP, usually just called "F5". It is a network device, which you can loosely think of as something like a network switch, that absorbs pressure entirely in hardware. Its performance is excellent, handling on the order of millions of requests per second, but it is also very, very expensive, anywhere from tens of thousands to millions of yuan.

This kind of equipment generally sits at the traffic entrance of large Internet companies, and is also used by governments, state-owned enterprises, and other organizations that are not short of money. Small and medium-sized companies are usually unwilling to pay for it.

The main appeal of F5 hardware load balancing is that it saves worry and effort: buy one and the problem is solved. Its performance is strong enough that ordinary business loads are no problem, it supports many flexible load balancing algorithms and policies, and it includes security features such as a firewall. But the disadvantage is just as obvious and can be summed up in one word: expensive.

3. Software-based load balancing


Software load balancing means using software to distribute and balance traffic. It comes in layer 7 and layer 4 variants. The OSI network model has seven layers: schemes that distribute traffic at layer 4, the transport layer, are called layer 4 load balancing, such as LVS, while schemes that distribute traffic at layer 7, the application layer, are called layer 7 load balancing, such as Nginx. The two differ somewhat in performance and flexibility.

Layer 4 load balancing delivers higher performance, typically reaching several hundred thousand requests per second, while layer 7 load balancing typically handles tens of thousands of requests per second.

Software-based load balancing is also notable for being cheap. It can be deployed on ordinary servers with no extra purchases; all it takes is a bit of engineering effort to tune it, which is why this is the approach Internet companies use most.
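To make the layer 7 idea tangible, here is a toy sketch of a software load balancer in Java: an HTTP server that picks a back end round-robin and relays the request. The ports and back-end addresses are placeholder assumptions, and a real deployment would use mature software such as Nginx rather than anything hand-rolled:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Toy layer 7 (HTTP) load balancer: accepts a request, picks a back end
// round-robin, forwards the request, and relays the response body.
public class TinyL7Balancer {

    private static final List<String> BACKENDS =
            List.of("http://127.0.0.1:8081", "http://127.0.0.1:8082");
    private static final AtomicInteger COUNTER = new AtomicInteger();
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/", exchange -> {
            // Round-robin choice of back end for each incoming request.
            String backend = BACKENDS.get(
                    Math.floorMod(COUNTER.getAndIncrement(), BACKENDS.size()));
            try {
                HttpRequest req = HttpRequest.newBuilder(
                        URI.create(backend + exchange.getRequestURI())).GET().build();
                HttpResponse<byte[]> resp =
                        CLIENT.send(req, HttpResponse.BodyHandlers.ofByteArray());
                exchange.sendResponseHeaders(resp.statusCode(), resp.body().length);
                exchange.getResponseBody().write(resp.body());
            } catch (IOException | InterruptedException e) {
                exchange.sendResponseHeaders(502, -1); // back end unreachable
            } finally {
                exchange.close();
            }
        });
        server.start();
    }
}
```

Because it works at the application layer, such a balancer can inspect URLs, headers, and cookies when choosing a back end, which is exactly the flexibility that layer 4 schemes trade away for speed.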

What are the commonly used balancing algorithms?

Having covered the common load balancing solutions above, let's look at the balancing algorithms generally used when applying those solutions in practice:

Polling strategy

Load strategy

Response strategy

Hash strategy

Here are the characteristics of each of these balancing algorithms/strategies:

Polling strategy

The polling strategy is easy to understand: as user requests come in, the "load balancer" forwards them to the back-end servers in turn. This strategy is often used in DNS solutions. It pays no attention to back-end service state; whenever a request arrives, it is simply forwarded to the next back end in rotation.

In practice there are several polling variants, including sequential polling, random polling, and weighted polling. The first two are self-explanatory. Weighted polling means assigning a weight to each back-end server: high-performance servers get higher weights and low-performance servers get lower ones. Traffic is then distributed in proportion to those weights, with higher weights receiving more traffic, so the capacity of the back-end machines is put to full use.
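Here is a minimal sketch of sequential and weighted polling; the server names and weight values are made-up examples:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of two polling variants over a fixed back-end list.
public class PollingExample {

    record Server(String name, int weight) {}

    private static final List<Server> SERVERS = List.of(
            new Server("app-1", 5),  // high-spec machine: larger weight
            new Server("app-2", 3),
            new Server("app-3", 1)); // low-spec machine: smaller weight

    private static final AtomicInteger NEXT = new AtomicInteger();

    // Sequential polling: hand out servers strictly in turn.
    static Server roundRobin() {
        return SERVERS.get(Math.floorMod(NEXT.getAndIncrement(), SERVERS.size()));
    }

    // Weighted polling: treat each server as "weight" slots, so the
    // weight-5 server receives 5 out of every 9 requests.
    static Server weightedRoundRobin() {
        int total = SERVERS.stream().mapToInt(Server::weight).sum();
        int slot = Math.floorMod(NEXT.getAndIncrement(), total);
        for (Server s : SERVERS) {
            slot -= s.weight();
            if (slot < 0) return s;
        }
        throw new IllegalStateException("unreachable for positive weights");
    }

    public static void main(String[] args) {
        for (int i = 0; i < 9; i++) {
            System.out.println(weightedRoundRobin().name());
        }
    }
}
```

Random polling would simply replace the counter with a random index; production balancers such as Nginx use a smoother interleaving for weighted polling, but the traffic proportions come out the same.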

Load strategy

The load strategy means that when the "load balancer" forwards traffic to the back end, it first evaluates the load pressure on each back-end server: servers under high pressure are forwarded fewer requests, and servers under low pressure are forwarded more.

This approach takes full account of the back-end servers' running state when distributing traffic dynamically, which is more scientific than polling.

But it also brings disadvantages. Because the back-end servers' load pressure must be evaluated dynamically, the "load balancer" has to do a lot of extra work beyond forwarding requests, such as collecting connection counts, request counts, CPU load, and I/O load, and then computing and comparing these indicators to determine which back-end servers are under heavy load.

So while this method brings real benefits, it also increases the implementation difficulty and maintenance cost of the "load balancer".
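A minimal sketch of the idea, using only the active connection count as the pressure indicator (a real balancer would also weigh CPU and I/O metrics, as described above):

```java
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Least-load sketch: forward each request to the back end that currently
// holds the fewest active connections.
public class LeastLoadExample {

    static class Backend {
        final String name;
        final AtomicInteger activeConnections = new AtomicInteger();
        Backend(String name) { this.name = name; }
    }

    private static final List<Backend> BACKENDS =
            List.of(new Backend("app-1"), new Backend("app-2"));

    // Pick the back end under the least pressure right now.
    static Backend pick() {
        return BACKENDS.stream()
                .min(Comparator.comparingInt(b -> b.activeConnections.get()))
                .orElseThrow();
    }

    public static void main(String[] args) {
        Backend b = pick();
        b.activeConnections.incrementAndGet(); // connection opened
        System.out.println("forwarding to " + b.name);
        b.activeConnections.decrementAndGet(); // connection finished
    }
}
```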

Response strategy

The response strategy means that when a user request comes in, the "load balancer" preferentially forwards it to the back-end server that is responding fastest at that moment. In other words, regardless of how loaded or how well configured a back-end server is, if it is judged to be the one that can answer the user's request fastest right now, the request goes to it first, giving users the best experience.

How does the "load balancer" know which back-end service is responding fastest at the moment? It has to continuously measure each back-end server's response speed, say once a minute, and maintain a ranking of back-end server speeds; it then forwards requests based on that ranking.

The problem here is the cost of the statistics: collecting them continuously consumes some performance in itself, and it also increases the implementation difficulty and maintenance cost of the "load balancer".
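A minimal sketch of that bookkeeping, assuming each completed request reports its response time and the balancer smooths it with an exponential moving average (the 0.8/0.2 smoothing split is an arbitrary illustrative choice):

```java
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Fastest-response sketch: track a moving average of each back end's
// response time and forward to whichever is currently fastest.
public class FastestResponseExample {

    private static final List<String> BACKENDS = List.of("app-1", "app-2");
    private static final ConcurrentMap<String, Double> AVG_MILLIS =
            new ConcurrentHashMap<>();

    // Record a newly observed response time for one back end.
    static void report(String backend, double millis) {
        AVG_MILLIS.merge(backend, millis, (old, now) -> 0.8 * old + 0.2 * now);
    }

    // Pick the back end with the lowest average so far;
    // back ends with no measurements yet win by default.
    static String pick() {
        return BACKENDS.stream()
                .min(Comparator.comparingDouble(b -> AVG_MILLIS.getOrDefault(b, 0.0)))
                .orElseThrow();
    }

    public static void main(String[] args) {
        report("app-1", 120.0);
        report("app-2", 45.0);
        System.out.println(pick()); // app-2, the faster responder
    }
}
```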

Hash strategy

The hash strategy is also easy to understand: take some piece of information from the request, hash it, and take the result modulo the number of back-end servers. All requests that yield the same value are forwarded to the same back-end server.

This policy is commonly applied to the user's IP address or user ID: the "load balancer" guarantees that the same IP source or the same user always lands on the same back-end server. This is especially useful for caching, sessions, and similar features.
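A minimal sketch of IP-based hashing (the back-end names are illustrative):

```java
import java.util.List;

// Hash sketch: the same client IP always maps to the same back end,
// which keeps that user's cache entries and session on one machine.
public class HashExample {

    private static final List<String> BACKENDS = List.of("app-1", "app-2", "app-3");

    static String pick(String clientIp) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(clientIp.hashCode(), BACKENDS.size());
        return BACKENDS.get(index);
    }

    public static void main(String[] args) {
        System.out.println(pick("198.51.100.7")); // always the same back end
        System.out.println(pick("198.51.100.7")); // ...for the same IP
    }
}
```

One caveat: with plain modulo hashing, adding or removing a server remaps most clients to different back ends, which is why consistent hashing is often used in practice.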

These are the common technical solutions and strategies for implementing high-performance load balancing. You're welcome to share your own experience.
