1. Origin of load balancing

In the early days, the system architecture was basically in the following form:

The client sends requests to a single server; the server processes them, some of which may involve interacting with the database, and returns the results to the client once processing is finished.

This pattern suited early systems, which were relatively simple, handled few concurrent requests, and needed to be cheap to run. However, as the volume of information, traffic, and data keeps growing and system services become more complex, this architecture makes the server slower and slower to respond to client requests, and when the number of concurrent requests is very large the server may crash outright. This is clearly a server performance bottleneck, so how do we solve it?

Our first thought might be to upgrade the server configuration and improve the physical performance of the machine: a faster CPU, more memory, and so on. But we know that Moore’s Law is failing, and hardware improvements can no longer keep up with ever-growing demand. The most obvious example is Tmall’s Double 11: the instantaneous page views of a hot item are so enormous that an architecture like the one above cannot meet the demand even with a machine upgraded to the best physical configuration available. So what can be done?

The analysis above rules out simply raising the physical configuration of the server, which means the vertical approach is no longer feasible. What about increasing the number of servers horizontally? This is where the concept of a cluster comes in: since a single server cannot cope, we add more servers and distribute incoming requests among them. Instead of concentrating every request on one server, the load is spread across multiple servers, and this is what we call load balancing.

2. Nginx implements load balancing

The Nginx server sits as an intermediary between the client and the backend servers. Using the reverse proxy function described in the previous blog post, the client sends its request to Nginx, and Nginx forwards the request to the appropriate server according to the configured rules.

The main configuration directives are proxy_pass and upstream. Load balancing can be implemented either by specialized hardware devices or by software algorithms. Hardware load balancers achieve good balancing, high efficiency, and stable performance, but they are expensive; software load balancing depends mainly on the choice of balancing algorithm and the robustness of the program. Balancing algorithms fall into two main categories:

Static load balancing algorithms: these mainly include round-robin, weighted round-robin based on proportion, and weighted round-robin based on priority.

Dynamic load balancing algorithms: these mainly include the least-connections algorithm based on current task count, the fastest-response-first algorithm based on performance, prediction algorithms, and dynamic performance-allocation algorithms.

Static algorithms perform well enough in ordinary network environments, while dynamic algorithms are better suited to complex network environments.

Examples:

(1) Ordinary round-robin

Round-robin is Nginx’s default load balancing algorithm.

Suppose we have two identical Tomcat servers: Tomcat1 is reached at localhost:8080 and Tomcat2 at localhost:8081. We now want to enter just the address localhost and have requests alternate between the two Tomcat servers.

First, set the two Tomcat servers’ ports to 8080 and 8081, and modify each Tomcat’s home page so that the two servers can be told apart. The steps are as follows:

Change the port number in server.xml:
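For reference, the element to change is the HTTP Connector. A minimal sketch of the relevant part of server.xml, with everything else left at its defaults, might look like this (use port="8081" on the second instance):

    <!-- server.xml: HTTP connector; this instance listens on 8080, the other on 8081 -->
    <Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" />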

  

Then edit the home page, webapps/ROOT/index.jsp, so that each server shows a page that identifies it, for example:
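The page content itself does not matter, as long as the two servers can be told apart; a hypothetical index.jsp for the first instance could be as simple as:

    <%-- webapps/ROOT/index.jsp on Tomcat1; the Tomcat2 copy would say "Tomcat2, port 8081" --%>
    <html>
      <body>
        <h1>Tomcat1, port 8080</h1>
      </body>
    </html>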

After the modifications are complete, start the two Tomcat servers separately and visit each address and port:

Enter the address localhost:8081

Enter the address localhost:8080

Modify the nginx configuration file nginx.conf


    upstream OrdinaryPolling {
        server 127.0.0.1:8080;
        server 127.0.0.1:8081;
    }

    server {
        listen       80;
        server_name  localhost;

        location / {
            proxy_pass http://OrdinaryPolling;
            index  index.html index.htm index.jsp;
        }
    }


Start Nginx, then enter localhost in your browser and refresh a few times; the returned page alternates between the two Tomcat servers:

(2) Proportionally weighted round-robin

With the configuration above, the two Tomcat servers are simply accessed in turn. But suppose we have the following requirement:

Tomcat1 runs on a better-provisioned machine, so we want it to receive more requests, while Tomcat2 has a weaker configuration and should handle relatively few.

This is where the weighted round-robin mechanism comes in.

The nginx.conf configuration file is as follows:


    upstream OrdinaryPolling {
        server 127.0.0.1:8080 weight=5;
        server 127.0.0.1:8081 weight=2;
    }

    server {
        listen       80;
        server_name  localhost;

        location / {
            proxy_pass http://OrdinaryPolling;
            index  index.html index.htm index.jsp;
        }
    }


Compared with the unweighted round-robin configuration above, each server line in the upstream block now carries a weight parameter. It sets the weight used when distributing requests to that server and defaults to 1.

In other words, the unweighted round-robin of the first example is simply weighted round-robin in which every server has a weight of 1.

Let’s look at the corresponding results of the page:

Obviously, port 8080 appears more often; the more requests we make, the closer the observed distribution comes to the 5:2 ratio we configured.

(3) IP-hash-based load balancing

We know that when a server processes a request it may save related state, such as the session. But if a later request is not handled by that first server, because Nginx’s round-robin sends it to the second server instead, that server has no session information for the user.

A typical example: when a user enters the system for the first time, they must log in and be authenticated. Suppose the first request is routed to Tomcat1, which handles the login and stores the session there. If the next request is routed to Tomcat2, which holds no session information, it treats the user as not logged in and forces another login. With multiple servers, the user would have to log in again on the first visit to each one, which obviously hurts the user experience.

One of the problems that arises here is session sharing in a clustered environment. How do you solve this problem?

There are usually two methods:

1. The first method is to keep the login information in a shared middleware, such as Redis. On first login, the session information is saved in Redis; when a later request lands on the second server, that server first checks Redis for existing login information, and if it is found the user can continue with post-login operations without logging in again.

2. The second method is to route requests from the same client IP address to the same Tomcat server every time, based on that IP address, so the session-sharing problem never arises.

Nginx’s IP-hash mechanism implements the second approach described above. The configuration is as follows:


    upstream OrdinaryPolling {
        ip_hash;
        server 127.0.0.1:8080 weight=5;
        server 127.0.0.1:8081 weight=2;
    }

    server {
        listen       80;
        server_name  localhost;

        location / {
            proxy_pass http://OrdinaryPolling;
            index  index.html index.htm index.jsp;
        }
    }


Note: we added the ip_hash directive in the upstream block. This directive tells the Nginx server that requests from clients with the same IP address will be sent to the same Tomcat server for processing.

(4) Load allocation based on server response time

Here requests are allocated according to how long each server takes to process them: the server that responds fastest, i.e. has the shortest response time, is given priority for new requests.


    upstream OrdinaryPolling {
        server 127.0.0.1:8080 weight=5;
        server 127.0.0.1:8081 weight=2;
        fair;
    }

    server {
        listen       80;
        server_name  localhost;

        location / {
            proxy_pass http://OrdinaryPolling;
            index  index.html index.htm index.jsp;
        }
    }


This is achieved simply by adding the fair directive to the upstream block.
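Note that fair is not a core Nginx directive; it comes from the third-party nginx-upstream-fair module, which has to be compiled into Nginx before the configuration above will work. If only built-in directives are available, the least-connections algorithm mentioned earlier is a core alternative, enabled with the least_conn directive. A minimal sketch, reusing the same example ports:

    upstream OrdinaryPolling {
        least_conn;
        server 127.0.0.1:8080;
        server 127.0.0.1:8081;
    }

least_conn sends each new request to the server with the fewest active connections, so the distribution follows the servers’ actual load rather than a fixed rotation.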

(5) Load balancing for different services

Using separate location blocks, we can also route requests for different services, here distinguished by URL path, to different groups of backend servers, each load balanced independently.


    upstream wordbackend {
        server 127.0.0.1:8080;
        server 127.0.0.1:8081;
    }

    upstream pptbackend {
        server 127.0.0.1:8082;
        server 127.0.0.1:8083;
    }

    server {
        listen       80;
        server_name  localhost;

        location /word/ {
            proxy_pass http://wordbackend;
            index  index.html index.htm index.jsp;
        }

        location /ppt/ {
            proxy_pass http://pptbackend;
            index  index.html index.htm index.jsp;
        }
    }
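With this configuration, a request for http://localhost/word/ is balanced across the wordbackend group (ports 8080 and 8081), while a request for http://localhost/ppt/ is balanced across the pptbackend group (ports 8082 and 8083).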
