Common techniques for balancing network traffic, with their advantages and trade-offs.
Large multi-site Internet systems, including content delivery networks (CDNs) and cloud providers, use several methods to balance incoming traffic. This article looks at common traffic-balancing designs, the techniques they use, and their trade-offs.
In the early days of cloud computing, a provider could set up a single client web server, assign it an IP address, configure a DNS record to associate that IP address with a human-readable domain name, and advertise the IP address over the Border Gateway Protocol (BGP), the standard way of exchanging routing information between networks.
This isn't load balancing per se, but BGP can spread traffic across redundant network paths and route around unavailable networks, improving availability (and giving rise to asymmetric routing).
Simple DNS load balancing
As customer traffic grows, the boss wants the service to be highly available. You bring up a second web server with its own public IP address and update the DNS records to direct users to both servers (hoping they receive roughly equal traffic). This works fine until one of the servers fails. Assuming you detect the failure quickly, you can update the DNS configuration (manually or through software) to remove the records that resolve to the failed machine.
Unfortunately, because DNS records are cached, around half of all requests will keep failing until the cached records expire in clients and in the recursive nameservers they depend on. DNS records usually have a time to live (TTL) of several minutes or more, so this approach can seriously hurt your system's availability.
To make matters worse, some clients ignore the TTL entirely, so some requests will keep flowing to the failed machine. Setting a very short TTL isn't a great idea either: it means more load on your DNS service and higher latency, because clients must perform DNS lookups more often. And if your DNS service becomes unavailable for any reason, a short TTL makes access to your service degrade faster, because fewer clients will have your site's IP address cached.
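To see why caching matters, here is a minimal sketch, a toy model rather than a real resolver, of a client that caches one resolved address until its TTL expires. The addresses come from the documentation range and the TTL is a made-up value; if the cached server dies, the client keeps using it until the record goes stale:

```python
import random
import time

# Two A records for the same name (documentation addresses) and a
# hypothetical five-minute TTL.
RECORDS = ["192.0.2.10", "192.0.2.11"]
TTL = 300  # seconds

_cache = {"address": None, "expires": 0.0}

def resolve(now: float) -> str:
    """Reuse the cached answer until its TTL expires, as real stub
    resolvers and caching nameservers do."""
    if _cache["address"] is None or now >= _cache["expires"]:
        _cache["address"] = random.choice(RECORDS)  # fresh lookup
        _cache["expires"] = now + TTL
    return _cache["address"]

# If 192.0.2.10 dies at t=0 and is removed from DNS immediately,
# clients that cached it keep connecting to it until the TTL runs out.
start = time.time()
print(resolve(start))       # may return the dead server's address
print(resolve(start + 60))  # still the same cached answer
```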
Adding network load balancing
To solve these problems, you can add a pair of redundant Layer 4 (L4) network load balancers configured with the same virtual IP address (VIP). The balancers can be hardware appliances or software such as HAProxy. The DNS records for the domain point at the VIP and no longer perform load-balancing duty.
L4 load balancers spread network traffic evenly across the back-end servers, regardless of packet contents. They typically do this by hashing the five-tuple of each IP packet: source address, source port, destination address, destination port, and protocol (such as TCP or UDP). Hashing is fast and efficient (and preserves the essential properties of TCP), and it doesn't require the balancer to keep state for each connection. (For more detail, read Google's Maglev paper, which discusses the implementation of an L4 software load balancer.)
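As a rough illustration of the idea, not of any particular balancer, here is a minimal five-tuple hashing sketch; the backend addresses are hypothetical:

```python
import hashlib

BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical pool

def pick_backend(src_ip: str, src_port: int,
                 dst_ip: str, dst_port: int, proto: str) -> str:
    """Hash the connection five-tuple and map it to a backend. Every
    packet of a given connection carries the same five-tuple, so it is
    always forwarded to the same backend, with no per-connection state
    kept on the balancer."""
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return BACKENDS[int.from_bytes(digest[:8], "big") % len(BACKENDS)]

print(pick_backend("198.51.100.7", 54321, "203.0.113.1", 443, "TCP"))
```

Note that a naive modulo over the backend list remaps most connections whenever the pool changes; production balancers such as Maglev use consistent hashing to keep that disruption small.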
An L4 balancer can also health-check the back ends and send traffic only to machines that pass the checks. Unlike DNS-based balancing, traffic moves quickly to another machine when a back-end web server fails, although existing connections to the failed machine will be reset.
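A health check can be as simple as trying to open a TCP connection. A minimal sketch, again with hypothetical backend addresses:

```python
import socket

BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical pool

def is_healthy(host: str, port: int = 80, timeout: float = 1.0) -> bool:
    """TCP connect check: can we open a connection within the timeout?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def healthy_backends() -> list:
    """Only hash traffic across backends that pass the check."""
    return [b for b in BACKENDS if is_healthy(b)]
```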
When back-end servers have different capacities, L4 balancers can distribute traffic based on weights. This gives operators significant power and flexibility at relatively modest hardware cost.
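Weighted balancing can be as simple as a weighted random choice; a small sketch with made-up weights:

```python
import random

# Hypothetical backends with different capacities: the second machine
# should receive roughly three times the traffic of the first.
WEIGHTS = {"10.0.0.1": 1, "10.0.0.2": 3}

def pick_weighted() -> str:
    backends = list(WEIGHTS)
    return random.choices(backends, weights=[WEIGHTS[b] for b in backends])[0]
```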
Scaling to multiple sites
The system keeps growing. Your customers want the service to stay up even when a whole data center fails, so you build a second data center with its own set of back-end services and another L4 load balancer cluster, still using the same VIP. The DNS setup doesn't change.
The edge routers at both sites advertise their address space, including the VIP address. Requests sent to that VIP can reach either site, depending on how each network between the user and the system routes them and on how each network's routing policies are configured. This is known as anycast. Most of the time, it works fine. If one site has a problem, you stop advertising the VIP over BGP, and customer requests quickly shift to the other site.
This setup has some problems. Its worst failing is that you can't control which site a request goes to or limit the traffic sent to any one site. You also have no explicit way to route users to the nearest site (to reduce network latency), although the routing protocols and configurations involved should, in most cases, send requests to the nearest site anyway.
Controlling inbound requests in a multi-site system
To maintain stability, you need to control how much traffic each site receives. You can get that control by assigning a different VIP address to each site and balancing across them with simple or weighted round-robin DNS.
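This is the same weighted-choice idea applied at the DNS layer: the authoritative server answers more often with the VIPs of sites that should receive more traffic. A toy sketch, with hypothetical site VIPs and weights:

```python
import random

# Hypothetical site VIPs and weights: roughly 70% of fresh lookups get
# site A's VIP, 30% get site B's.
SITE_VIPS = {"192.0.2.1": 7, "198.51.100.1": 3}

def answer_query() -> str:
    """One answer a weighted round-robin authoritative server might give."""
    vips = list(SITE_VIPS)
    return random.choices(vips, weights=[SITE_VIPS[v] for v in vips])[0]
```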
Now there are two problems.
First, using DNS balancing means there will be cached records, which is bad if you need to redirect traffic quickly.
Second, every time a user does a fresh DNS lookup, they may be connected to an arbitrary site, which may not be the closest one. If your service runs in many widely distributed sites, users will see large variations in response time, depending on the network latency between them and the site serving them.
You can solve the first problem by having each site constantly advertise, and be able to serve, the VIP addresses of all the other sites (and therefore of any failed site). Networking tricks, such as having the backup sites advertise less-specific routes than each VIP's primary site, ensure that the primary site serves its VIP as long as it is available. This is done via BGP, so traffic should start to move within a minute or two of the BGP update.
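The less-specific-route trick relies on longest-prefix matching: routers prefer the most specific advertised prefix that covers an address. A small sketch using Python's ipaddress module, with hypothetical prefixes:

```python
import ipaddress

# Hypothetical advertisements: the VIP's primary site announces a
# specific /24, while the backup site announces a covering /22.
ROUTES = [
    (ipaddress.ip_network("203.0.113.0/24"), "primary site"),
    (ipaddress.ip_network("203.0.112.0/22"), "backup site"),
]

def route_for(vip: str) -> str:
    """Routers prefer the longest (most specific) matching prefix."""
    addr = ipaddress.ip_address(vip)
    matches = [(net, site) for net, site in ROUTES if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(route_for("203.0.113.5"))  # "primary site" while its /24 is advertised
# If the primary withdraws its /24 announcement, only the /22 matches,
# and traffic falls through to the backup site.
```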
The second problem, serving users from the nearest healthy site that has capacity, still has no elegant solution. Many large Internet services use DNS to return different results to users in different regions, with some success. But because the structure of network addresses is unrelated to geography, a block of addresses can change location (for example, when a company renumbers its network), and many users may share a single caching nameserver. This approach is therefore complex and error-prone.
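A sketch of the idea, with entirely made-up prefix-to-region mappings, showing both how such a geo-DNS table works and why it is fragile:

```python
import ipaddress

# A toy geo-DNS table with made-up mappings from client prefixes to the
# VIP of the site assumed to be nearest.
GEO_TABLE = [
    (ipaddress.ip_network("198.51.100.0/24"), "192.0.2.1"),  # "Europe" VIP
    (ipaddress.ip_network("203.0.113.0/24"), "192.0.2.2"),   # "US" VIP
]
DEFAULT_VIP = "192.0.2.1"

def geo_answer(resolver_ip: str) -> str:
    """Answer based on where the querying resolver appears to be.
    Fragile by design: prefixes move between regions, and one caching
    resolver can front users spread across a whole country."""
    addr = ipaddress.ip_address(resolver_ip)
    for net, vip in GEO_TABLE:
        if addr in net:
            return vip
    return DEFAULT_VIP
```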
Adding Layer 7 load balancing
After a while, your customers start asking for more advanced features.
Although L4 load balancers can efficiently distribute traffic among web servers, they operate only on source and destination addresses, protocol, and ports. They know nothing about the content of a request, so many advanced features can't be implemented at Layer 4. A Layer 7 (L7) load balancer understands the structure and content of requests and can do much more.
L7 load balancers can implement caching, rate limiting, fault injection, and cost-aware load balancing (some requests take much longer for the server to process).
They can also balance traffic based on a request's attributes (such as HTTP cookies), terminate SSL connections, and help defend against application-layer denial-of-service (DoS) attacks. The downside of L7 balancing at scale is cost: the balancer does more computation to process each request, and every active request consumes some of its resources. Running an L4 balancer cluster in front of one or more pools of L7 balancers helps with scaling.
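To illustrate what knowing the content of a request buys you, here is a toy sketch of two L7 features, content-based routing and per-client rate limiting; the pool names, addresses, and cookie name are all hypothetical:

```python
import hashlib
import time

# Hypothetical backend pools; an L7 balancer can look inside the
# request before deciding where to send it.
POOLS = {"api": ["10.0.1.1", "10.0.1.2"], "web": ["10.0.2.1", "10.0.2.2"]}

def choose_backend(path: str, cookies: dict) -> str:
    """Route on request content: the path selects a pool, and a session
    cookie (hypothetical name) pins a user to one backend."""
    pool = POOLS["api"] if path.startswith("/api/") else POOLS["web"]
    session = cookies.get("session")
    if session:
        digest = hashlib.sha256(session.encode()).digest()
        return pool[int.from_bytes(digest[:4], "big") % len(pool)]
    return pool[0]  # no cookie: any backend will do

_seen: dict = {}

def allow(client_ip: str, limit: int = 100, window: float = 60.0) -> bool:
    """A toy per-client rate limiter over a sliding window."""
    now = time.monotonic()
    recent = [t for t in _seen.get(client_ip, []) if now - t < window]
    if len(recent) >= limit:
        return False
    recent.append(now)
    _seen[client_ip] = recent
    return True
```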
Conclusion
Load balancing is a difficult, complex problem. In addition to the strategies described in this article, there are different load-balancing algorithms, high-availability techniques for the balancers themselves, client-side load-balancing techniques, and, more recently, service meshes.
Core load-balancing patterns have evolved alongside cloud computing, and they will continue to evolve as large web services work to make load-balancing technologies more controllable and flexible.
via: https://opensource.com/article/18/10/internet-scale-load-balancing
Author: Laura Nolan
This article was translated by LCTT and first published by Linux China.