What is load balancing?
In the early days of a website, a single machine typically provided all services centrally. As business volume grew, however, both performance and stability became challenging, so we turned to scaling out: grouping several machines into a cluster that serves requests as a whole. Yet the website still exposes only one entry point, such as www.taobao.com. So when a user types www.taobao.com into a browser, how does that user's request get distributed to different machines in the cluster? That is exactly what load balancing does.
At present, most Internet systems use server cluster technology: the same service is deployed on multiple servers, which form a cluster and provide the service externally as a whole. Such clusters may be web application server clusters, database server clusters, distributed cache server clusters, and so on.
In practice, a load balancing server always sits in front of the web server cluster. Its task is to select the most suitable web server as the entry point for the traffic and forward the client's request to it for processing, achieving transparent forwarding from the client to the real server.
Cloud computing and distributed architectures, which have become very popular in recent years, essentially treat back-end servers as computing and storage resources and encapsulate them behind a management server as a single externally provided service. The client does not need to care which machine actually serves it: it is as if the client were dealing with one server of nearly unlimited capacity, while in essence it is the back-end cluster that provides the service.
A software load balancer solves two core problems: which server to select, and how to forward the request to it. The best-known example is LVS (Linux Virtual Server).
In the topology of a typical Internet application, the load balancer sits between the clients and the back-end server cluster.
Load balancing classification
As we now know, load balancing is a computer networking technique used to distribute load across multiple computers (a computer cluster), network connections, CPUs, or other resources, in order to optimize resource usage, maximize throughput, minimize response time, and avoid overload. Load balancing can be implemented in many ways; roughly, it divides into the following types, of which layer 4 and layer 7 load balancing are the most commonly used:
Layer 2 load balancing
The load balancing server exposes a VIP (virtual IP address) to the outside. The servers in the cluster share that IP address but have different MAC addresses. After receiving a request, the load balancing server rewrites the destination MAC address of the packet and forwards the request to the target machine, thereby balancing the load.
Layer 3 load balancing
Similar to layer 2 load balancing, the load balancing server still exposes a VIP, but the machines in the cluster use different IP addresses. After receiving a request, the load balancing server forwards it by IP to a real server chosen according to the load balancing algorithm in use.
Layer 4 load balancing
Layer 4 load balancing works at the transport layer of the OSI model, where the protocols are TCP and UDP. Their segments carry source and destination port numbers on top of the packet's source and destination IP addresses. After receiving a request from a client, the layer-4 load balancer forwards the traffic to an application server by modifying the destination IP address and port of the packet.
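To make this concrete, here is a minimal sketch in Python of the layer-4 idea: accept a TCP connection, pick a back-end server (the addresses here are hypothetical), and relay bytes in both directions without ever parsing the application payload. A real layer-4 balancer such as LVS does the equivalent rewriting in the kernel, but the routing decision uses the same information: IP and port only.

```python
import itertools
import socket
import threading

# Hypothetical back-end pool; a real deployment configures these.
BACKENDS = [("10.0.0.1", 8080), ("10.0.0.2", 8080)]
pool = itertools.cycle(BACKENDS)

def pipe(src, dst):
    """Copy bytes one way until the connection closes."""
    try:
        while (data := src.recv(4096)):
            dst.sendall(data)
    finally:
        dst.close()

def handle(client):
    backend = socket.create_connection(next(pool))  # choose a real server
    # Relay both directions; the payload itself is never inspected.
    threading.Thread(target=pipe, args=(client, backend), daemon=True).start()
    threading.Thread(target=pipe, args=(backend, client), daemon=True).start()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 8000))
listener.listen(128)
while True:
    conn, _ = listener.accept()
    handle(conn)
```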
Layer 7 load balancing
Layer-7 load balancers work at the application layer of the OSI model. The application layer has many protocols, such as HTTP, RADIUS, and DNS, and layer-7 load balancing can be based on any of them. These application-layer protocols carry a lot of meaningful content. For example, load balancing for the same web server can be done not only on IP and port but also on layer-7 information such as the URL, browser type, and language.
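As an illustration of routing by content, the sketch below (Python, with hypothetical back-end pools) parses only the HTTP request line and picks a pool by URL prefix; a real layer-7 balancer such as Nginx does this, plus header and cookie inspection, inside a full proxy.

```python
# Minimal sketch of a layer-7 routing decision: pick a back-end pool
# from the HTTP request line. All pool addresses are hypothetical.
STATIC_POOL = [("10.0.1.1", 8080), ("10.0.1.2", 8080)]
API_POOL = [("10.0.2.1", 8080)]
DEFAULT_POOL = [("10.0.3.1", 8080)]

def choose_pool(request: bytes):
    # The request line looks like: b"GET /api/users HTTP/1.1"
    method, path, _version = request.split(b"\r\n", 1)[0].split(b" ", 2)
    if path.startswith(b"/static/"):
        return STATIC_POOL   # images, CSS, JS -> cache-friendly servers
    if path.startswith(b"/api/"):
        return API_POOL      # dynamic requests -> application servers
    return DEFAULT_POOL

print(choose_pool(b"GET /api/users HTTP/1.1\r\nHost: example.com\r\n\r\n"))
```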
For typical applications, Nginx is enough, and Nginx can serve as a layer-7 load balancer. For some large websites, however, a combination of DNS, layer-4, and layer-7 load balancing is generally adopted.
Common load balancing tools
Hardware load balancers offer excellent performance and comprehensive features, but they are expensive, so they generally suit only well-funded companies for long-term use. Software load balancing is therefore widely used across the Internet industry; common software load balancers include Nginx, LVS, and HAProxy.
1. LVS
LVS (Linux Virtual Server) is a free software project initiated by Dr. Wensong Zhang. The goal of LVS is to build high-performance, highly available server clusters out of LVS load balancing technology and the Linux operating system, with good reliability, scalability, and manageability, achieving optimal service performance at low cost.
LVS is mainly used for layer 4 load balancing.
The LVS architecture
A server cluster system built with LVS consists of three parts: the front-end load balancing layer (Load Balancer), the middle server group layer (Server Array), and the bottom shared storage layer (Shared Storage). All of this is transparent to users, who simply see a high-performance service provided by one virtual server.
A detailed introduction to each LVS layer:
- Load Balancer layer: located at the front end of the whole cluster, this layer contains one or more load schedulers (Director Servers). The LVS modules are installed on the Director Server, whose main role resembles a router: it holds the routing tables that implement the LVS function and uses them to dispatch user requests to the Real Servers in the Server Array layer. The Director Server also runs ldirectord, a monitoring module that checks the health of each Real Server's services, removes a Real Server from the LVS routing table when it becomes unavailable, and re-adds it when it recovers.
- Server Array layer: consists of the machines that actually run the application services. Real Servers can be web servers, mail servers, FTP servers, DNS servers, video servers, and so on, connected by a high-speed LAN or distributed across a WAN. In real deployments, the Director Server can also double as a Real Server.
- Shared Storage layer: provides shared storage space and consistent content for all Real Servers. Physically it usually consists of disk array devices. To keep content consistent, data can be shared over NFS, but NFS performs poorly in busy systems; in that case a cluster file system can be used instead, such as Red Hat's GFS or Oracle's OCFS2.
As the overall LVS structure shows, the Director Server (load scheduler) is the core of LVS. Currently, the Director Server can only run Linux or FreeBSD; the Linux 2.6 kernel supports LVS without any extra setup, while FreeBSD is rarely used as a Director Server and does not perform particularly well. Real Servers, by contrast, can run on almost any platform: Linux, Windows, Solaris, AIX, and the BSD family are all well supported.
2. Nginx
Nginx (pronounced "engine x") is a web server that can reverse-proxy HTTP, HTTPS, SMTP, POP3, and IMAP, and can also act as a load balancer and an HTTP cache.
Nginx is mainly used for layer 7 load balancing.
Features:
- Modular design: good extensibility; functionality can be extended through modules.
- High reliability: the master and worker processes are separate; if a worker exits abnormally, the master starts a new worker immediately.
- Low memory consumption: 10,000 inactive keep-alive connections consume only about 2.5 MB of memory.
- Hot deployment: configuration can be updated, log files rotated, and the server binary upgraded without stopping the server.
- Strong concurrency: official figures cite support for 50,000 concurrent connections.
- Rich features: an excellent reverse proxy and flexible load balancing policies.
Basic working mode of Nginx
One master process spawns one or more worker processes. The master starts as root, because Nginx listens on port 80 and only a privileged user may bind ports below 1024. The master is responsible for starting the workers, loading the configuration file, and smooth (hot) upgrades of the system; all other work is delegated to the workers. A worker itself handles only the simplest parts of serving a web request; the rest of the work is done by the modules the worker calls.
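The master/worker split can be sketched in a few lines of Python; this is only an illustration of the pre-fork pattern (POSIX fork), not Nginx's actual implementation:

```python
import os
import sys

WORKER_COUNT = 4  # the worker_processes directive plays this role in Nginx

def worker_loop():
    # In Nginx, this is where a worker accepts connections and runs
    # each request through its modules.
    print(f"worker {os.getpid()} serving requests")
    sys.exit(0)  # a real worker would loop forever

children = []
for _ in range(WORKER_COUNT):
    pid = os.fork()          # POSIX only
    if pid == 0:             # child process: become a worker
        worker_loop()
    children.append(pid)     # the master only supervises

# If a worker dies, the master finds out and could fork a replacement.
for _ in children:
    pid, status = os.wait()
    print(f"master: worker {pid} exited with status {status}")
```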
Modules cooperate in a pipelined fashion: a user request is completed by multiple modules, each contributing its own function in turn. For example, one module only parses the request headers, a second only fetches the data, and a third only compresses it, together completing the whole job.
How is hot deployment achieved? As noted above, the master does no concrete work itself; it only dispatches to workers and reads the configuration file. So when a module is modified or the configuration changes, it is the master that reads the new configuration, and the workers' ongoing work is unaffected. The master does not immediately notify the workers of the change; instead, each worker keeps serving with the old configuration. Once a worker finishes its work, its child process is shut down and replaced with a new one that uses the new rules.
3. HAProxy
HAProxy is another widely used load balancing software, written in C, free, and open source. It provides high availability, load balancing, and proxying for TCP- and HTTP-based applications, supports virtual hosting, and is a fast and reliable solution, especially useful for heavily loaded web sites. Its operating mode makes it easy and safe to integrate into an existing architecture while keeping your web servers from being exposed to the network.
HAProxy is mainly used for layer 7 load balancing, although its TCP mode also covers layer 4.
Common load balancing algorithms
As mentioned in the introduction above, a load balancing server uses a load balancing algorithm to decide which real server to forward each request to. Load balancing algorithms fall into two types: static and dynamic.
- Static load balancing algorithms include round robin, ratio, and priority.
- Dynamic load balancing algorithms include least connections, fastest response, observed mode, predictive mode, dynamic ratio, dynamic server supplement, quality of service, type of service, and rule mode.
Round Robin: requests are distributed to each server in turn, cycling through the list. When a layer 2 through 7 failure occurs on one of the servers, BIG-IP removes it from the rotation and does not include it again until it recovers.
Advantages: simple, efficient implementation; easy to scale horizontally.
Disadvantages: which node a request reaches is unpredictable, so it is unsuitable for scenarios that write state (cache writes, database writes).
Application scenario: database or application service layers that only read data.
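A minimal round-robin selector looks like this (Python; the server addresses are hypothetical):

```python
import itertools

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical nodes
rr = itertools.cycle(servers)

def pick():
    return next(rr)  # each call yields the next server in the cycle

print([pick() for _ in range(6)])
# ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1', '10.0.0.2', '10.0.0.3']
```

A health checker would remove a failed node from `servers` and rebuild the cycle, mirroring what BIG-IP does when a server fails.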
Random mode: requests are distributed to the nodes at random; with enough requests, the distribution evens out.
Advantages: simple; easy to scale horizontally.
Disadvantages: same as round robin; cannot be used for write scenarios.
Application scenario: database load balancing, likewise for read-only scenarios.
Hash mode: compute the target node from the key, guaranteeing that the same key always lands on the same server.
Advantages: the same key is guaranteed to reside on the same node, so this works for caches that are both written and read.
Disadvantages: when a node fails, keys are rehashed across the remaining nodes, causing a sharp drop in hit ratio.
Solution: consistent hashing, or using Keepalived to keep every node highly available so that another node takes over when one fails.
Application scenario: caches, with both reads and writes.
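A sketch of hash-based selection (Python; the node names are hypothetical). A stable hash such as MD5 is used rather than Python's built-in hash(), which is randomized between processes:

```python
import hashlib

servers = ["cache-a", "cache-b", "cache-c"]  # hypothetical cache nodes

def pick(key: str) -> str:
    # The same key always maps to the same node, so a read after a
    # write hits the server that actually holds the data.
    digest = hashlib.md5(key.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]

print(pick("user:42"))  # deterministic: always the same node
```

Note the drawback described above: if `servers` shrinks from three nodes to two, the modulus changes and most keys remap to different nodes.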
Consistent hashing: when one server node fails, only the keys on that node are affected, preserving the highest possible hit ratio; the ketama scheme in TwemProxy is an example. A production implementation can also hash on a planned sub-key so that locally related keys can be placed on the same server.
Advantages: a node failure has only a bounded impact on the hit ratio.
Application scenario: caches.
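Below is a minimal consistent-hash ring with virtual nodes (Python); it illustrates the idea rather than reproducing TwemProxy's ketama code:

```python
import bisect
import hashlib

def h(value: str) -> int:
    # A stable 64-bit hash for both keys and ring positions.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each node gets many points on the ring so load spreads evenly.
        self.ring = sorted((h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self.points = [p for p, _ in self.ring]

    def pick(self, key: str) -> str:
        # Walk clockwise to the first point at or after the key's hash.
        i = bisect.bisect(self.points, h(key)) % len(self.points)
        return self.ring[i][1]

ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.pick("user:42"))
# Dropping one node only remaps the keys that lived on its points;
# keys on the surviving nodes keep their placement.
```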
Load by key range: partition the keys by range; for example, the first 100 million keys are stored on the first server and keys from 100 to 200 million on the second node.
Advantages: easy to scale horizontally; when storage runs short, add a server to hold the subsequent new data.
Disadvantages: uneven load; uneven database distribution.
(The data splits into hot and cold: recently registered users are generally more active, which keeps the later servers very busy while the early nodes sit idle.)
Application scenario: database shard load balancing.
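A key-range lookup is just a search over ordered boundaries; a minimal sketch (Python, hypothetical shard boundaries):

```python
# Each tuple is (first key in the range, node that stores it).
RANGES = [(0, "db-1"), (100_000_000, "db-2"), (200_000_000, "db-3")]

def pick(key: int) -> str:
    # Scan from the highest boundary down; the first start <= key wins.
    for start, node in reversed(RANGES):
        if key >= start:
            return node
    raise ValueError("key below the first range")

print(pick(42))           # db-1
print(pick(150_000_000))  # db-2
```

Scaling out is just appending a new boundary, which is why this scheme is easy to grow but prone to the hot/cold imbalance noted above.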
Load by key modulo the number of server nodes: take the key modulo the node count; for example, with four servers, a key with remainder 0 falls on the first node and a key with remainder 1 on the second.
Advantages: hot and cold data are spread evenly, so database node load is balanced.
Disadvantages: difficult to scale horizontally.
Application scenario: database shard load balancing.
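The modulo rule itself is a one-liner (Python, hypothetical shards); the closing comment shows why horizontal scaling is hard:

```python
servers = ["db-1", "db-2", "db-3", "db-4"]  # hypothetical shards

def pick(user_id: int) -> str:
    # user_id 0 -> db-1, 1 -> db-2, 2 -> db-3, 3 -> db-4, 4 -> db-1 ...
    return servers[user_id % len(servers)]

# Adding a fifth shard changes len(servers), so most existing
# user_ids suddenly map to different shards and data must be migrated.
```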
Pure dynamic node load balancing: decide how to schedule the next request based on CPU, I/O, and network processing capacity.
Advantages: makes full use of server resources and keeps the processing load balanced across nodes.
Disadvantages: complex to implement; rarely used in practice.
No active load balancing: use a message queue to move to an asynchronous model, which eliminates the load balancing problem altogether. Load balancing is a push model that keeps sending data at the nodes; instead, all user requests are sent into a message queue, and whichever downstream nodes are idle come and pull data to process. Once converted to this pull model, the problem of overloading downstream nodes disappears.
Advantages: the message queue's buffering protects the back end, so back-end servers are not overwhelmed when requests surge; horizontal scaling is easy, since a newly added node simply starts pulling from the queue. Disadvantages: not real-time.
Application scenario: scenarios that do not require a real-time response. For example, after placing an order on 12306, it immediately returns a message like "your order has entered the queue..." and notifies you asynchronously once processing completes.
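A minimal sketch of this pull model using Python's standard-library queue; the node names and timings are illustrative only:

```python
import queue
import threading
import time

requests = queue.Queue()  # the buffer that replaces the load balancer

def worker(name: str):
    while True:
        job = requests.get()   # an idle node pulls work when it is ready
        time.sleep(0.01)       # pretend to process the order
        print(f"{name} handled {job}")
        requests.task_done()

for name in ("node-1", "node-2"):
    threading.Thread(target=worker, args=(name,), daemon=True).start()

for order in range(5):
    requests.put(f"order-{order}")  # producers never choose a node

requests.join()  # block until every queued order is processed
```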
Ratio: assign a weight ratio to each server and distribute user requests to the servers according to this ratio. When a layer 2 through 7 failure occurs on one of the servers, BIG-IP removes it from the server queue and excludes it from subsequent request allocation until it recovers.
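A simple way to realize a ratio is weighted random selection (Python sketch, hypothetical weights); a deterministic weighted round robin would achieve the same long-run proportions:

```python
import random

# Hypothetical weights: "a" gets 5 of every 8 requests on average.
WEIGHTS = {"a": 5, "b": 2, "c": 1}

def pick() -> str:
    names = list(WEIGHTS)
    return random.choices(names, weights=[WEIGHTS[n] for n in names], k=1)[0]

print([pick() for _ in range(8)])  # roughly 5:2:1 over many calls
```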
Priority: group all the servers and assign a priority to each group. BIG-IP sends user requests to the server group with the highest priority (within a group, requests are distributed using the round robin or ratio algorithm); only when every server in the highest-priority group is down does BIG-IP send requests to the lower-priority group. This effectively provides users with a hot-standby arrangement.
Least Connections: pass new connections to the server currently handling the fewest connections. When a layer 2 through 7 failure occurs on one of the servers, BIG-IP removes it from the server queue and excludes it from subsequent request allocation until it recovers.
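The selection rule is simply a minimum over live connection counts (Python sketch; the counts here are hypothetical and would be maintained as connections open and close):

```python
# Live connection counts per server, updated on connect/disconnect.
active = {"srv-1": 12, "srv-2": 3, "srv-3": 7}

def pick() -> str:
    # New connections go to the node currently doing the least work.
    return min(active, key=active.get)

server = pick()      # "srv-2"
active[server] += 1  # account for the new connection
```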
Fastest mode: pass new connections to the servers that respond fastest. When a layer 2 through 7 failure occurs on one of the servers, BIG-IP removes it from the server queue and excludes it from subsequent request allocation until it recovers.
Observed mode: select a server for each new request based on the best balance of connection count and response time. When a layer 2 through 7 failure occurs on one of the servers, BIG-IP removes it from the server queue and excludes it from subsequent request allocation until it recovers.
Predictive mode: BIG-IP analyzes the collected server performance data and selects the server expected to perform best in the next time slice. (The detection is performed by BIG-IP.)
Dynamic Ratio-APM: BIG-IP collects performance parameters from applications and application servers and dynamically adjusts the traffic allocation.
Dynamic Server Act.: when the number of servers in the primary server group drops because of failures, backup servers are dynamically added into the primary group.
Quality of service (QoS) : Data flows are allocated according to different priorities.
Type of service (ToS): data flows are allocated by load balancing on different service types, as identified in the Type of Service field.
Rule mode: Users can set guidance rules for different data flows.