Understanding HAProxy

Contents

OSI Model
High-concurrency load balancing with HAProxy

Details

OSI Model

The Open Systems Interconnection (OSI) model is a conceptual model developed by the International Organization for Standardization (ISO) to provide a standard framework for connecting different kinds of computers into worldwide networks. It divides network architecture into seven layers, each of which exposes a well-defined abstract interface to the layer above. Understanding the OSI model makes it easier to understand TCP/IP, the de facto industry standard for the Internet. The relationships among the OSI layers and the data flow during communication are shown in the figure:

Layer	Description
Physical layer	Encodes the information into electrical pulses or other signals for transmission over the physical medium
Data link layer	Transfers data across a physical network link. Different data link layers define different network and protocol characteristics, including physical addressing, network topology, error checking, frame sequencing, and flow control. Put simply, it specifies how the stream of 0s and 1s is grouped into frames, which determines the form of network packets
Network layer	Responsible for addressing and routing between source and destination. Put simply, it determines the location of a computer using IPv4/IPv6
Transport layer	Provides reliable end-to-end data flow services to the upper layers. Put simply, each application binds to a port number on the host, and this layer handles port-to-port communication
Session layer	Establishes, manages, and terminates communication sessions between presentation-layer entities
Presentation layer	Provides encoding and conversion functions for application-layer data so that information sent by the application layer of one system can be understood by the application layer of another. Put simply, it handles communication between different kinds of systems
Application layer	OSI application-layer protocols include File Transfer, Access and Management (FTAM), the Virtual Terminal Protocol (VTP), and the Common Management Information Protocol (CMIP)

Common application layer protocols:

Protocol	Port	Description
HTTP	80	Hypertext Transfer Protocol
HTTPS	443	HTTP over SSL/TLS
FTP	20/21/990	File Transfer Protocol (20 for data, 21 for control, 990 for FTP over SSL/TLS)
POP3	110	Post Office Protocol version 3
SMTP	25	Simple Mail Transfer Protocol
Telnet	23	Remote terminal protocol

The OSI seven-layer model is a theoretical reference. In real systems its layers are often merged, or their functions are spread across other layers. TCP/IP, for example, does not copy the OSI model, and there is no single agreed-upon TCP/IP layer model; TCP/IP is usually described with three to five layers. Each of those layers relates closely to the OSI layers, but the boundaries may overlap.

TCP/IP adopts the core idea of the layered model: encapsulation. Each layer provides services to the layer above it. The data structure handed down by the upper layer is treated as a black box and carried directly as this layer's data, so a layer never needs to know any details of the upper-layer protocol.

Flow of a UDP packet transmitted over Ethernet through the TCP/IP layer model:

Broadly speaking, a packet contains two basic elements:

Element	Function
Header	Contains metadata that describes this packet
Data	The contents (payload) of the packet
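
The header-plus-data structure and the encapsulation idea described above can be sketched in Python. This is a minimal illustration, not a working network stack: the function names, ports, and addresses are made up for the example, both checksums are left at zero, and the Ethernet frame omits padding and the trailing frame check sequence.

```python
import socket
import struct


def udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Transport layer: prepend an 8-byte UDP header to the application data."""
    length = 8 + len(payload)   # the UDP length field covers header + data
    checksum = 0                # 0 means "no checksum", which IPv4 permits
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def ipv4_packet(src_ip: str, dst_ip: str, data: bytes) -> bytes:
    """Internet layer: prepend a 20-byte IPv4 header; `data` is a black box here."""
    version_ihl = (4 << 4) | 5  # IPv4, header length of 5 32-bit words
    total_length = 20 + len(data)
    header = struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, total_length,
        0, 0,                   # identification, flags/fragment offset
        64, 17,                 # TTL, protocol number 17 = UDP
        0,                      # header checksum left at 0 in this sketch
        socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
    )
    return header + data


def ethernet_frame(dst_mac: bytes, src_mac: bytes, data: bytes) -> bytes:
    """Link level: prepend a 14-byte Ethernet II header (EtherType 0x0800 = IPv4)."""
    return struct.pack("!6s6sH", dst_mac, src_mac, 0x0800) + data


payload = b"hello"
datagram = udp_datagram(40000, 53, payload)
packet = ipv4_packet("10.0.0.1", "10.0.0.2", datagram)
frame = ethernet_frame(b"\xaa" * 6, b"\xbb" * 6, packet)

# Each layer only adds its own header in front of the data it was given.
print(len(payload), len(datagram), len(packet), len(frame))  # 5 13 33 47
```

Note that no function inspects the bytes it wraps; that is the "black box" property of encapsulation.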

The four-layer model is as follows:

Layer	Description
Network interface layer	Includes the protocols that carry IP data over existing network media and defines protocols such as the Address Resolution Protocol (ARP). It provides the interface between TCP/IP data structures and the physical hardware, and corresponds to the physical and data link layers of the OSI model. Put simply, it determines the form of network packets
Internet layer	Corresponds to the network layer of the OSI seven-layer reference model. It contains the Internet Protocol (IP) and the Routing Information Protocol (RIP) and is responsible for packaging, addressing, and routing data. It also contains the Internet Control Message Protocol (ICMP), which provides network diagnostic information. Put simply, this layer determines the location of a computer
Transport layer	Corresponds to the transport layer of the OSI seven-layer reference model and provides two end-to-end communication services. The Transmission Control Protocol (TCP) provides reliable data stream transport; the User Datagram Protocol (UDP) provides an unreliable datagram service. TCP opens a connection with a three-way handshake and closes it with a four-step teardown ("four waves"). UDP simply sends, without checking whether the other side receives
Application layer	Corresponds to the application, presentation, and session layers of the OSI seven-layer reference model

High-concurrency load balancing with HAProxy

HAProxy is free, open-source software written in C that provides high availability, load balancing, and proxying for TCP (Layer 4) and HTTP (Layer 7) applications. It supports virtual hosting and is a free, fast, and reliable solution, particularly suited to heavily loaded web sites that need session persistence or Layer 4 and Layer 7 processing. As professional load-balancing software, it has the following notable advantages:

High throughput: it handles a large number of requests per unit of time and a large volume of data
Supports 8 load-balancing algorithms as well as session persistence
Since version 1.3, supports connection rejection and transparent proxying, features that many other load balancers lack
Provides a powerful server status monitoring page
Offers strong ACL support and virtual hosts

Well-known sites like GitHub, Bitbucket, Stack Overflow, Reddit, Tumblr, Twitter and Tuenti, as well as Amazon Web Services, all use HAProxy.

The scheduling algorithm is selected with the `balance` directive, which can be set in the `defaults`, `listen`, and `backend` configuration sections.

Algorithms:

Algorithm	Description
roundrobin	Weighted round robin. This is the smoothest and fairest algorithm when the servers' processing times are evenly distributed. It is dynamic: server weights can be adjusted at run time. By design, however, it supports at most `4128` active servers per backend (`4095` in newer HAProxy releases); the limit applies to servers, not to connections per server
static-rr	Weighted round robin, similar to `roundrobin` but static: changing a server's weight at run time has no effect. In exchange, there is no design limit on the number of servers
leastconn	New connections go to the backend server with the fewest active connections. Recommended for long sessions such as LDAP and SQL; not well suited to application-layer protocols with short sessions such as HTTP. It is dynamic: weights can be adjusted at run time
first	The first server with an available connection slot receives the connection. Servers are tried in ascending order of server ID; once a server reaches its maximum number of connections, the next one is used. Without a `maxconn` parameter on each server, this algorithm is meaningless. The goal is to use as few servers as possible so that the others can stay on standby during quiet periods. Server weights are ignored
source	The source IP address of the request is hashed and divided by the total weight of the running servers to pick a server, so requests from the same client IP always reach the same server. When the total weight changes, for example because a server goes down or a new one is added, many clients may be sent to a different server than before. Commonly used to load balance TCP-based protocols where cookies cannot be inserted. Static by default; this can be changed with `hash-type`
uri	Either the left part of the URI (before the `?`) or the whole URI is hashed and divided by the total weight of the servers to pick a server, so requests for the same URI always reach the same server as long as the total weight does not change. Often used with proxy caches and anti-virus proxies to improve the cache hit ratio. Applies only to HTTP backends. Static by default; this can be changed with `hash-type`
url_param	The URL parameter named by `<argument>` is looked up in each HTTP GET request. If the parameter is found and is given a value after an equals sign `=`, the value is hashed and divided by the total weight of the servers to pick a server. This makes it possible to track a user ID in the request so that the same user ID always reaches the same server, as long as the total weight does not change. If the parameter is missing or has no valid value, round robin is used instead. Static by default; this can be changed with `hash-type`
hdr(<name>)	The HTTP header named by `<name>` is looked up in each HTTP request. If the header is absent or has no valid value, round robin is used instead. The optional `use_domain_only` parameter restricts the hash to the main domain part when a header such as Host is used (for example, for `www.baidu.com` only the string `baidu` is hashed), which reduces the work done by the hash algorithm. Static by default; this can be changed with `hash-type`
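
To make the selection rules in the table more concrete, here is a minimal Python sketch of three of them: `leastconn`, `first`, and a hash-based pick in the style of `source`. It is an illustration under simplifying assumptions, not HAProxy's implementation: the `Server` class and function names are invented for the example, `leastconn` is modelled as connections divided by weight, and CRC32 stands in for whichever hash function HAProxy is configured to use.

```python
from __future__ import annotations

import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    name: str
    weight: int = 1
    maxconn: int = 0   # 0 means "no limit" in this sketch
    conns: int = 0     # currently open connections


def pick_leastconn(servers: list[Server]) -> Server:
    """Choose the server with the fewest connections relative to its weight."""
    return min(servers, key=lambda s: s.conns / s.weight)


def pick_first(servers: list[Server]) -> Optional[Server]:
    """Choose the first server with a free slot; `servers` is in ascending ID order."""
    for s in servers:
        # With maxconn unset, the first server always wins, which is why
        # the algorithm is meaningless without maxconn.
        if s.maxconn == 0 or s.conns < s.maxconn:
            return s
    return None


def pick_by_hash(servers: list[Server], key: bytes) -> Server:
    """Hash the key and take it modulo the total weight to pick a server."""
    slots = [s for s in servers for _ in range(s.weight)]  # one slot per weight unit
    return slots[zlib.crc32(key) % len(slots)]


pool = [
    Server("web1", weight=1, maxconn=2, conns=2),
    Server("web2", weight=1, maxconn=2, conns=1),
    Server("web3", weight=2, maxconn=2, conns=1),
]
print(pick_leastconn(pool).name)  # web3
print(pick_first(pool).name)      # web2
print(pick_by_hash(pool, b"192.0.2.10").name)
```

The hash-based pick returns the same server for the same key only while the pool and its weights stay unchanged, which is the weakness that `hash-type` addresses.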

For each hash-based algorithm, you can specify a `hash-type`:

hash-type <method> <function> <modifier>

`<method>` determines how the hash computed by `<function>` is mapped to a server:


Method	Description
map-based	The hash table is a static array of all active servers. The distribution is very smooth and takes weights into account, but it is static: weight changes made while a server is up are ignored. In addition, because a server is selected by its position in the array, most mappings change whenever the number of servers changes. When a server goes up or down, or when a server is added to the pool, most connections are reassigned to different servers. This has a significant impact on caches, for example
consistent	Consistent hashing. The values `0` to `2^32-1` form a ring, and each backend server is placed on it as a large number of nodes spread across different positions. The hash of the key (for example, the URL) is taken modulo `2^32`, so it always lands on a point of the ring; moving clockwise from that point, the first server node encountered is the selected server. This hash is dynamic and allows weights to be changed while servers are running. Its advantage is that when a server goes up or down, only the mappings associated with that server move. When a server is added to the pool, only a small portion of the mappings is reallocated, which makes it ideal for caches. However, by its nature the distribution is never perfectly smooth, and you may sometimes need to adjust server weights or IDs to get a more even distribution. To get the same distribution across multiple load balancers, all servers must have exactly the same IDs. Note: if no hash function is specified, consistent hashing uses `sdbm` and `avalanche`
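
The difference between the two methods shows up when a server is added. The following Python sketch compares how many keys change server under a position-in-array mapping and under a hash ring. It is a simplified model, not HAProxy's code: the names `h32`, `map_based`, and `Ring` are invented for the example, SHA-256 truncated to 32 bits stands in for HAProxy's hash functions, weights are ignored, and 100 nodes per server is an arbitrary choice.

```python
from __future__ import annotations

import bisect
import hashlib


def h32(key: str) -> int:
    """Map a key to a point in [0, 2**32 - 1] (stand-in hash function)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")


def map_based(servers: list[str], key: str) -> str:
    """Select a server by its position in a static array."""
    return servers[h32(key) % len(servers)]


class Ring:
    """Hash ring where every server owns many nodes at different positions."""

    def __init__(self, servers: list[str], nodes_per_server: int = 100) -> None:
        self.points = sorted(
            (h32(f"{name}#{i}"), name)
            for name in servers
            for i in range(nodes_per_server)
        )
        self.hashes = [point for point, _ in self.points]

    def lookup(self, key: str) -> str:
        # First node at or after the key's position, wrapping around the ring.
        i = bisect.bisect_left(self.hashes, h32(key)) % len(self.points)
        return self.points[i][1]


keys = [f"/page/{i}" for i in range(10_000)]
old = ["s1", "s2", "s3", "s4"]
new = old + ["s5"]   # one server is added to the pool

moved_map = sum(map_based(old, k) != map_based(new, k) for k in keys)
ring_old, ring_new = Ring(old), Ring(new)
moved_ring = sum(ring_old.lookup(k) != ring_new.lookup(k) for k in keys)

print(f"map-based moved {moved_map / len(keys):.0%} of keys")
print(f"consistent moved {moved_ring / len(keys):.0%} of keys")
```

In the ring, every key that moves lands on the new server, and all other keys stay where they were; this is the property that keeps caches warm.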

`<function>` selects the hash function to use:

Option
sdbm
djb2
wt6
crc32
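
For reference, two of these functions are short enough to write out. The sketch below follows the widely published definitions of sdbm and djb2, truncated to 32 bits, and uses Python's `zlib.crc32` for CRC32; HAProxy's internal implementations may differ in detail, and `wt6` is omitted because its definition is not given here.

```python
import zlib

MASK32 = 0xFFFFFFFF  # keep results within 32 bits, like a C unsigned int


def sdbm(key: bytes) -> int:
    """Classic sdbm string hash: h = c + (h << 6) + (h << 16) - h."""
    h = 0
    for c in key:
        h = (c + (h << 6) + (h << 16) - h) & MASK32
    return h


def djb2(key: bytes) -> int:
    """Classic djb2 string hash: h = h * 33 + c, starting from 5381."""
    h = 5381
    for c in key:
        h = ((h << 5) + h + c) & MASK32
    return h


key = b"/index.html"
for name, fn in (("sdbm", sdbm), ("djb2", djb2), ("crc32", zlib.crc32)):
    print(f"{name:5s} {fn(key):#010x}")
```

All three return a 32-bit value that the chosen method then maps to a server.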

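`<modifier>` is an optional step applied to the result after the key has been hashed:

Option	Description
avalanche	Applies a full avalanche hash to the result of the hash function instead of using it in raw form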

HAProxy's rich configuration makes it easy to implement features such as session persistence, URL-based and header-based server binding, and separation of dynamic and static content.

Reference articles:
https://blog.csdn.net/erix1991/article/details/76090885
https://blog.csdn.net/varyall/article/details/80423037
https://blog.csdn.net/eddie_cm/article/details/79796883
