Abstract

In our article on HTTP packets, we explained how the HTTP protocol works. But what components does it take to build a real website, and what do the common terms mean?

  1. What is a forward proxy, and what is a reverse proxy?
  2. What is the difference between a proxy service and a load balancer?
  3. If you already have Nginx, why do you need LVS?
  4. What are the load balancing methods?

Server Evolution

In the previous article, we introduced the simplest client-server request-response pattern.

This is the simplest form of HTTP service, served by a layer of Web servers.

Now our services are more complex, with more users and more concurrency, so the demands on our servers have increased:

  • Service capacity: one server cannot handle so many HTTP requests. We need to add machines and scale out the service.
  • Security protection: our service is exposed to network attacks. We need to protect the servers and restrict access by IP address.
  • Website upgrades: once the website is live it has to provide uninterrupted 24/7 service, so releasing a new version must not affect availability.

The proxy service

To solve these problems, we introduce a middle layer: a proxy service inserted as an intermediate link between the client and the server. A proxy in the narrow sense does not produce content; it only forwards requests and responses between upstream and downstream.

Proxy services can be classified by whether they are anonymous:

  • Anonymous proxy: hides the machine it proxies for; the outside world sees only the proxy server.
  • Transparent proxy: hides nothing; the outside world knows about both the proxy and the real machine behind it.

They can also be classified by whether they sit closer to the client or to the server:

  • Forward proxy: sits on the client side and sends requests to the server on behalf of the client.
  • Reverse proxy: sits on the server side and, on behalf of the server, receives requests from clients and returns the responses.

HTTP protocol support for proxies

HTTP was originally designed only for the client-server model and did not take proxy services into account. By our usual architectural standards that is reasonable: the protocol layer should not care how it is used, and an intermediary proxy should be invisible to it. In practice, though, the server often needs the client's real IP address, which the proxy hides. The Squid caching proxy was the first to introduce the X-Forwarded-For header field, which carries the client's real IP address.

The format is as follows. Each hop, from the client through each proxy, is recorded in order:

X-Forwarded-For: client, proxy1, proxy2

This need was so common that the header became a de facto standard, widely used by proxy services, and it was later standardized in RFC 7239 (as the Forwarded header).
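To make the X-Forwarded-For chain concrete, here is a minimal Python sketch of how a proxy might extend the header and how a server might read it. The function names, the example addresses, and the idea of a `trusted_proxies` set are illustrative assumptions, not part of any particular proxy's implementation.

```python
from typing import List, Optional, Set


def parse_x_forwarded_for(header_value: str) -> List[str]:
    """Split an X-Forwarded-For value into its ordered list of addresses."""
    return [part.strip() for part in header_value.split(",") if part.strip()]


def append_hop(header_value: Optional[str], peer_ip: str) -> str:
    """Return the value a proxy forwards after adding the address of its peer."""
    hops = parse_x_forwarded_for(header_value) if header_value else []
    hops.append(peer_ip)
    return ", ".join(hops)


def client_ip(header_value: str, trusted_proxies: Set[str]) -> Optional[str]:
    """Walk the chain from right to left; return the first untrusted address."""
    for address in reversed(parse_x_forwarded_for(header_value)):
        if address not in trusted_proxies:
            return address
    return None


if __name__ == "__main__":
    value = append_hop(None, "203.0.113.7")   # first proxy records the client
    value = append_hop(value, "10.0.0.1")     # second proxy records the first
    print(value)
    print(client_ip(value, {"10.0.0.1", "10.0.0.2"}))
```

Reading the chain from the right and skipping only known proxies is a deliberate choice in this sketch: the leftmost entries are supplied by the client and can be forged.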

The PROXY protocol

HTTP itself says little about proxy services, which led to the PROXY protocol, designed by HAProxy author Willy Tarreau in 2010. It adds a small header at the start of the TCP connection to carry the client's information (protocol family, source IP address, destination IP address, source port, destination port). This is useful when the network is complex and the client IP address still has to be obtained, for example:

  • Multi-layer NAT networks
  • TCP (layer 4) proxies
  • An HTTPS reverse proxy in front of HTTP (in some cases X-Forwarded-For is not passed on every request because of keep-alive)
  • HTTPS-encrypted traffic, where the original packets cannot be modified

X-Forwarded-For is also costly, because every proxy layer has to parse the HTTP headers and then append its own address. The PROXY protocol therefore met a real need: although it was proposed by HAProxy, it is also supported by the major proxy servers, such as Nginx, Apache, and Squid. The format of the PROXY protocol header is:

PROXY TCP4/TCP6 requester-IP responder-IP requester-port responder-port\r\n

This allows the receiver to get the client IP by parsing just the first line, instead of processing HTTP packets.
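As an illustration of how little work the receiver has to do, the following Python sketch parses the text header line described above from the start of a byte stream. The `ProxyHeader` class and `parse_proxy_v1` function are hypothetical names chosen for this example; the sketch handles only the TCP4/TCP6 text form and ignores other variants of the protocol.

```python
import ipaddress
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ProxyHeader:
    family: str            # "TCP4" or "TCP6"
    source_ip: str         # the original client
    destination_ip: str
    source_port: int
    destination_port: int


def parse_proxy_v1(data: bytes) -> Tuple[ProxyHeader, bytes]:
    """Parse a text PROXY header line; return the header and the remaining payload."""
    line, separator, rest = data.partition(b"\r\n")
    if not separator:
        raise ValueError("header line is not terminated by CRLF")
    fields = line.decode("ascii").split(" ")
    if len(fields) != 6 or fields[0] != "PROXY" or fields[1] not in ("TCP4", "TCP6"):
        raise ValueError("not a PROXY TCP4/TCP6 header")
    _, family, src_ip, dst_ip, src_port, dst_port = fields
    # ip_address raises ValueError if an address is malformed.
    ipaddress.ip_address(src_ip)
    ipaddress.ip_address(dst_ip)
    return ProxyHeader(family, src_ip, dst_ip, int(src_port), int(dst_port)), rest


if __name__ == "__main__":
    stream = b"PROXY TCP4 203.0.113.7 10.0.0.5 56324 443\r\nGET / HTTP/1.1\r\n\r\n"
    header, payload = parse_proxy_v1(stream)
    print(header.source_ip, header.source_port)
    print(payload)
```

Everything after the first CRLF is passed through untouched, which is why this approach also works when the payload is encrypted.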

Load balancing

Load balancing is, at its core, distributing requests. In terms of the OSI seven-layer model, there are two types of load balancing:

  • Layer 4 load balancers work at the transport layer and forward requests based on IP addresses and ports. Because they do nothing else with the traffic, they are highly efficient.
  • Layer 7 load balancers work at the application layer and forward requests to specific hosts based on HTTP request headers and URL information. They are relatively less efficient.
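The difference between the two types comes down to what information the balancer looks at. The Python sketch below contrasts them under simplifying assumptions: the layer 4 function sees only the connection's address and port, while the layer 7 function inspects the request path. The function names, backend names, and the hashing scheme are made up for illustration and are not how any specific product schedules traffic.

```python
import hashlib
from typing import Dict, List


def pick_layer4(backends: List[str], client_ip: str, client_port: int) -> str:
    """Layer 4 style: choose a backend from the connection's address and port only."""
    key = f"{client_ip}:{client_port}".encode("ascii")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]


def pick_layer7(routes: Dict[str, str], default: str, path: str) -> str:
    """Layer 7 style: choose a backend pool by inspecting the HTTP request path."""
    # The longest matching prefix wins.
    for prefix in sorted(routes, key=len, reverse=True):
        if path.startswith(prefix):
            return routes[prefix]
    return default


if __name__ == "__main__":
    print(pick_layer4(["10.0.0.11:80", "10.0.0.12:80"], "203.0.113.7", 56324))
    routes = {"/api/": "api-pool", "/static/": "static-pool"}
    print(pick_layer7(routes, "web-pool", "/api/users"))
```

The layer 7 function needs a parsed HTTP request before it can decide anything, which is the extra work behind its lower efficiency.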

LVS is a layer 4 load balancer, and Nginx is typically used as a layer 7 load balancer.

So for small sites, Nginx is enough. When traffic grows to the point that the load balancer itself becomes the bottleneck, a layer of LVS can be introduced in front of it.

For the specific load balancing algorithms, refer to this article; they are not repeated here.
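Purely as a quick illustration, two of the most commonly described strategies, round robin and least connections, can be sketched in a few lines of Python. The class names and backend labels are hypothetical, and real balancers add weights, health checks, and concurrency control that this sketch leaves out.

```python
import itertools
from typing import Dict, List


class RoundRobin:
    """Hand out backends in a fixed rotation."""

    def __init__(self, backends: List[str]) -> None:
        self._cycle = itertools.cycle(backends)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnections:
    """Pick the backend that currently has the fewest active connections."""

    def __init__(self, backends: List[str]) -> None:
        self.active: Dict[str, int] = {backend: 0 for backend in backends}

    def acquire(self) -> str:
        # On a tie, the backend listed first is chosen.
        backend = min(self.active, key=lambda name: self.active[name])
        self.active[backend] += 1
        return backend

    def release(self, backend: str) -> None:
        self.active[backend] -= 1


if __name__ == "__main__":
    rr = RoundRobin(["a", "b", "c"])
    print([rr.pick() for _ in range(4)])
    lc = LeastConnections(["a", "b"])
    print(lc.acquire(), lc.acquire())
```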

Security protection

As mentioned earlier, security is also an important function of proxy services. To cope with external attacks, you need to introduce a Web Application Firewall (WAF). Working at OSI layer 7, it is mainly responsible for finer-grained inspection of HTTP packets through various filters, such as:

  • IP blacklists and whitelists
  • DDoS attacks
  • Injection attacks of all kinds
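A toy Python sketch shows how such filters chain together: a denied-network check, a naive injection pattern, and a sliding-window request limit per IP. The `SimpleFilter` class, its parameters, and the `SUSPICIOUS` pattern are assumptions made up for this example; a real WAF uses far more thorough rule sets, and a single regular expression like this one is easy to bypass.

```python
import ipaddress
import re
from collections import defaultdict, deque
from typing import Deque, Dict, List

# Deliberately naive pattern, for illustration only.
SUSPICIOUS = re.compile(r"(union\s+select|<script\b)", re.IGNORECASE)


class SimpleFilter:
    def __init__(self, denied_networks: List[str], max_requests: int,
                 window_seconds: float) -> None:
        self.denied = [ipaddress.ip_network(net) for net in denied_networks]
        self.max_requests = max_requests
        self.window = window_seconds
        self.history: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str, query: str, now: float) -> bool:
        """Return True if the request passes every filter."""
        address = ipaddress.ip_address(client_ip)
        if any(address in network for network in self.denied):
            return False                      # IP blacklist
        if SUSPICIOUS.search(query):
            return False                      # crude injection check
        recent = self.history[client_ip]
        while recent and now - recent[0] >= self.window:
            recent.popleft()                  # drop requests outside the window
        if len(recent) >= self.max_requests:
            return False                      # too many requests from this IP
        recent.append(now)
        return True


if __name__ == "__main__":
    waf = SimpleFilter(["198.51.100.0/24"], max_requests=2, window_seconds=1.0)
    print(waf.allow("198.51.100.9", "q=1", now=0.0))
    print(waf.allow("203.0.113.7", "q=1", now=0.0))
```

The caller passes `now` explicitly so the rate limit can be exercised without waiting on a real clock.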

When the security requirements of the service are not that high, or the return on investment does not justify more at the company's current stage, we usually just configure some rules at the Nginx level. When requirements grow, we introduce specialized modules, such as ModSecurity. When they grow further, we bring in WAF services provided by external cloud vendors.

Final architectural form

The evolution of the HTTP server architecture is similar to the evolution of a monolithic application's architecture. When the business is not complex, a single component (such as Nginx) is enough. When the volume of requests increases and the requirements grow, you need to introduce an intermediate layer to solve the problem. When the demands on one module grow, it needs to be decoupled into a separate module and handled on its own.

So overall, a medium-sized server architecture looks like the following figure.

References

juejin.cn/post/684490…

www.cnblogs.com/xybaby/p/78…

