1. Overview of Nginx

Nginx ("Engine X") is a high-performance HTTP and reverse proxy server, characterized by low memory usage and high concurrency. In fact, Nginx performs better than other web servers of its type, and it is used by major Chinese sites such as Baidu, JD.com, Sina, NetEase, Tencent, and Taobao.

(1) Nginx is a high-performance HTTP and reverse proxy server.

(2) It is written in C.

(3) It supports many operating systems: Windows, Linux, Mac OS X.

(4) High security: the outside world can only access the Nginx server, and Nginx forwards requests to the internal servers.

(5) Load balancing.

(6) Powerful rewrite (URL rewriting) capability.

2. Nginx as a Web server

Nginx can be used as a Web server for static pages and also supports dynamic CGI languages such as Perl, PHP, etc. However, Java is not supported: Java applications can only be served in conjunction with Tomcat. Nginx was developed specifically for performance optimization; performance is its most important consideration, and the implementation is very efficient. Nginx can withstand high loads, and reports indicate it can support up to 50,000 concurrent connections.
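As an illustration, a minimal static-site server block might look like the following sketch; the domain name and document root are placeholders, not values from this article:

```
# Minimal sketch: Nginx serving static pages
# (server_name and root are placeholder values)
server {
    listen       80;
    server_name  example.com;

    location / {
        root   /usr/share/nginx/html;  # directory holding the static files
        index  index.html;             # default page for directory requests
    }
}
```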

3. Forward proxy

Nginx can not only act as a reverse proxy for load balancing; it can also be used as a forward proxy for Internet access and other purposes.

Forward proxy: if you think of the Internet outside the LAN as a huge repository of resources, then clients inside the LAN that want to access the Internet must do so through a proxy server. This proxy service is called a forward proxy.

(1) You need to configure a proxy server on the client to access the specified website.
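A rough idea of what a plain-HTTP forward proxy could look like in Nginx is sketched below. Note that the listen port and DNS resolver address are assumptions, and that proxying HTTPS traffic (the CONNECT method) requires a third-party module, which this sketch does not cover:

```
# Minimal sketch of an HTTP forward proxy (plain HTTP only)
server {
    listen  8080;          # port the clients point their proxy setting at (assumed)

    resolver 8.8.8.8;      # needed so Nginx can resolve arbitrary hostnames (assumed resolver)

    location / {
        # forward the request to whatever host the client originally asked for
        proxy_pass $scheme://$http_host$request_uri;
    }
}
```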

4. Reverse proxy

A reverse proxy, by contrast, is invisible to the client, because the client needs no configuration at all. The client simply sends its request to the reverse proxy server; the reverse proxy server selects a target server, fetches the data, and returns it to the client. To the outside, the reverse proxy server and the target server appear as a single server: the proxy server's address is exposed, while the real server's IP address is hidden.

In actual project development, the reverse proxy is also the most commonly used. Here is a demonstration:

1. Download Tomcat on your computer and run it.

2. Modify the Nginx configuration file: under the default port 80 server configuration, add a location block (see the sketch after this list) and save.

3. Restart the Nginx service.

4. Access Nginx on port 80 from inside the VM.

5. Access Nginx on port 80 from the host machine (the machine where VMware is installed).
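For reference, the location configuration in step 2 might look roughly like this; the backend address 127.0.0.1:8080 is an assumption based on Tomcat's default port:

```
# Minimal sketch of the reverse proxy used in the demonstration
server {
    listen       80;
    server_name  localhost;

    location / {
        proxy_pass http://127.0.0.1:8080;  # forward requests to the local Tomcat (assumed address)
    }
}
```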

Test successful!

5. Load balancing

The client sends multiple requests to the server, which processes the requests, some of which may interact with the database, and then returns the results to the client.

This architectural pattern suits early systems with relatively few concurrent requests, and it is cheap. However, as the amount of information, the volume of traffic and data, and the complexity of the business logic keep growing, this architecture makes the server respond to client requests more and more slowly, and when concurrency is particularly high, the server can easily crash outright. This is clearly a server performance bottleneck, so what can be done to resolve the situation?

Our first thought may be to upgrade the server's configuration, for example using a faster CPU or adding more memory, improving the machine's physical performance to solve the problem. But we know that Moore's Law is increasingly breaking down, and hardware performance can no longer keep up with ever-growing demand. The clearest example is the instantaneous traffic for a hot product on Tmall's Singles' Day: it is so enormous that an architecture like the one above, even on machines with top physical configurations, cannot meet the demand. So what can be done?

The analysis above rules out increasing the server's physical configuration, which is to say that scaling vertically is not feasible. So what about increasing the number of servers horizontally? This is where the concept of a cluster comes in. Since a single server cannot solve the problem, we add more servers and distribute requests among them: instead of concentrating all requests on a single server, requests are spread across multiple servers, distributing the load. This is what we call load balancing.
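As a minimal sketch (the backend addresses are placeholders), load balancing in Nginx is configured with an upstream block; by default, requests are distributed round-robin across the listed servers:

```
# Minimal sketch: round-robin load balancing across two backends
upstream backend {
    server 192.168.1.10:8080;  # placeholder backend addresses
    server 192.168.1.11:8080;
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;  # requests are spread across the upstream group
    }
}
```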

Note: after load balancing is configured, a user's Tomcat session may be lost, because consecutive requests can land on different servers. There are two common solutions: 1. route the same user to the same server (for example with the ip_hash policy described below); 2. store sessions in a shared store, usually Redis.

5.1 ip_hash (hash the client's request IP and select the backend server using the hash value):

When a particular URL path on your server is accessed repeatedly by the same user, a round-robin load-balancing policy would route that user's visits to different servers, which is obviously inefficient (it creates multiple HTTP connections, and so on).

In an extreme case, the user needs to upload file shards to the server, which then merges them. If the user's requests reach different servers, the shards end up in different servers' directories and can never be merged. The ip_hash policy provided by Nginx fits such scenarios: it ensures that each user's requests reach the same server while still balancing the load across different users.
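A minimal sketch of an ip_hash configuration, assuming the same placeholder backends as above:

```
# Minimal sketch: ip_hash keeps each client IP on the same backend
upstream backend {
    ip_hash;                   # hash the client IP to pick a server
    server 192.168.1.10:8080;  # placeholder backend addresses
    server 192.168.1.11:8080;
}
```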

5.2 url_hash (hash the requested URL and select the backend server using the hash value):

Generally speaking, url_hash is used in conjunction with caching to improve the cache hit ratio.

Take an example I encountered: a server cluster A needs to provide file downloads externally. Because the volume of uploaded files is huge, they cannot be stored on the servers' disks, so third-party cloud storage is used to hold the files. After receiving a client request, cluster A needs to download the file from the cloud storage and return it. To save unnecessary network bandwidth and download time, a temporary cache (kept for one month) is created on cluster A.

Because it is a server cluster, multiple requests for the same resource may reach different servers, causing unnecessary repeated downloads, a low cache hit ratio, and wasted resources and time. In such scenarios, the url_hash policy is a good fit for improving the cache hit ratio: the same URL (that is, the same resource request) always reaches the same machine, and once the resource is cached, subsequent requests can be served from the cache, reducing bandwidth and download time.
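With open-source Nginx 1.7.2 or later, the built-in hash directive of the upstream module can implement the url_hash policy; a minimal sketch with placeholder backends:

```
# Minimal sketch: hash the request URI so the same URL always reaches
# the same backend, improving the cache hit ratio
upstream file_cache_cluster {
    hash $request_uri consistent;  # consistent hashing limits remapping when servers change
    server 192.168.1.10:8080;      # placeholder backend addresses
    server 192.168.1.11:8080;
}
```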

6. Dynamic/static separation

To speed up the website, dynamic pages and static pages can be served by different servers, which shortens response times and reduces the load on any single server.
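A minimal sketch of such a split, assuming static files live under /data/static (a placeholder path) and dynamic requests go to a Tomcat on its default port:

```
# Minimal sketch: static files served directly by Nginx,
# dynamic requests proxied to an application server
server {
    listen 80;

    # static resources: serve from local disk with a long cache lifetime
    location ~* \.(html|css|js|png|jpg|gif)$ {
        root    /data/static;   # placeholder path
        expires 30d;            # let browsers cache static files
    }

    # everything else is treated as dynamic and forwarded to Tomcat
    location / {
        proxy_pass http://127.0.0.1:8080;  # assumed Tomcat address
    }
}
```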

Conclusion:

Nginx is currently the mainstream HTTP reverse proxy server (its enterprise version also provides a TCP-layer reverse proxy plug-in), and it plays a pivotal role in building large distributed Web applications. In short, Nginx has two main functions: dynamic/static resource separation and load balancing.