Preface
At present I have taken over a service that uses Nginx for layer-7 load balancing and our company's in-house Elastic Load Balancer (ELB) at layer 4.
This article introduces LVS (Linux Virtual Server), a layer-4 load balancer used in that setup, together with a hands-on walkthrough simplified from the production work. Take it as a starting point for discussion.
Concept
Dr. Wensong Zhang founded the Linux Virtual Server free-software project in May 1998 to build server clusters on Linux. Since later releases of the 2.4 kernel, LVS has shipped in the kernel itself, so it no longer has to be patched in and recompiled.
I am a little embarrassed to admit that this is technology from 1998, twenty years ago, and I am only now digging into it…
Because a single server cannot meet the needs of most online services, and high-end machines are expensive and cost-inefficient, most online services run on multiple servers at once, commonly known as a cluster. LVS is open-source software developed specifically for Linux server clusters.
Why cluster servers?
The main purpose of clustering is load balancing: two or more servers or sites provide the same service, and some algorithm tries to distribute client requests evenly across the machines in the cluster, so that no single server fails under excessive load. Even if one machine does go down, the load balancer automatically routes around it, and users can still reach the service.
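To make "by some algorithm" concrete, here is a minimal Python sketch of round-robin, the simplest scheduling policy; the backend addresses and request loop are invented for illustration, and a real balancer such as LVS would also track backend health and skip dead machines.

```python
# Minimal round-robin sketch: requests rotate through the backend pool
# so no single server absorbs the full load. Addresses are hypothetical.
from itertools import cycle

BACKENDS = ["10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"]
rotation = cycle(BACKENDS)

def pick_backend() -> str:
    """Hand each new request to the next backend in rotation."""
    return next(rotation)

for request_id in range(6):
    print(f"request {request_id} -> {pick_backend()}")
```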
History of web server architecture
This is certainly not comprehensive, and it may contain mistakes; treat it as a conversation starter!
Single-machine deployment
This is the original form of a web service: the application and the database both deployed on one machine.
I remember building my personal website, and our school projects, exactly this way.
Application & database server separation
Later, as traffic and the variety of business grew, the single-machine setup became more and more just an option for self-testing during development. Once self-testing is complete and the domain name is registered and filed, the real production deployment usually puts the database and the application server on separate machines.
The advantage is that each machine is under less load, and the database is not directly compromised if the application server is attacked.
The disadvantage is that the performance of data transfer between the two machines cannot be guaranteed unless they sit under the same node in the equipment room or under the same switch.
Separation of dynamic and static resources
A website serves many resources, which come in two kinds: static and dynamic.
Static resources, such as HTML, JavaScript, CSS, and image files, can be presented to the user's page as-is;
Dynamic resources cannot be displayed directly; they must first be transformed in the backend, that is, rendered from dynamic into static content.
For Java web applications, the basic job of a JSP/Servlet container is exactly this conversion of dynamic resources into static ones, though of course it does more than that.
Static/dynamic separation means deploying static resources and the backend application separately, so that static content is served faster and the backend application receives fewer requests.
Benefits
For example, in the service I now maintain, after adopting static/dynamic separation the backend only needs to expose a RESTful API, and other modules or the front end simply call that API. Built this way, my service can be consumed by multiple external modules, or even multiple platforms, at the same time, which keeps the logic clear and easy to maintain.
Also, because the front end calls the backend's RESTful API, front-end and backend development and testing schedules no longer block each other; each side only needs to honor the interface contract.
The ultimate goal is to reduce pressure on the backend server and speed up access to static resources, since the backend application no longer has to render pages from templates. A toy sketch of the dispatch rule follows.
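As a toy illustration of that dispatch rule (the paths, port, and directory are made up, and in production the static branch would be handled by Nginx or a CDN rather than application code), the sketch below serves anything under /static/ as raw file bytes and treats /api/ paths as dynamic JSON endpoints:

```python
# Toy static/dynamic dispatcher, for illustration only.
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

STATIC_ROOT = "./static"  # hypothetical directory of HTML/CSS/JS/images

class Dispatcher(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path.startswith("/static/"):
            # Static resource: return the file as-is, no rendering step.
            path = os.path.join(STATIC_ROOT, self.path[len("/static/"):])
            if os.path.isfile(path):
                self.send_response(200)
                self.end_headers()
                with open(path, "rb") as f:
                    self.wfile.write(f.read())
            else:
                self.send_error(404)
        elif self.path.startswith("/api/"):
            # Dynamic resource: run backend logic and answer with JSON.
            body = json.dumps({"path": self.path, "ok": True}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), Dispatcher).serve_forever()
```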
Drawbacks
First you have to understand SEO. Take Baidu as the domestic example: Baidu's spider crawls pages by URL and analyzes their text content, filtering out JavaScript. After static/dynamic separation, the front end typically talks to the backend through asynchronous requests, which are implemented in JavaScript. Conventional crawlers generally cannot capture asynchronously loaded content, so the pages a crawler fetches are full of asynchronous operations it cannot execute, and their content never gets indexed. This hurts the site's SEO.
The only workaround I have heard of so far is front-end caching: cache the data that does not change often.
Load balancing + service scaling
Setting load balancing aside for a moment, let's talk about service scaling.
Vertical scaling of services
Vertical scaling means raising the service capacity of the backend itself. Can we optimize the business logic? Improve code quality? Use multiple threads (or processes)? Or start from the hardware of the single machine, adding memory and disk, buying a better CPU, all to raise one server's load capacity?
But the cost of doing so is obviously high, and the gains are a drop in the bucket; the effect is not very significant. Hence horizontal scaling.
Horizontal scaling of services
In horizontal scaling, the goal is not to increase the load capacity of a single server, but simply to spread the load by deploying more inexpensive servers.
Database performance bottlenecks
With this many servers combined with the LB, the load is spread out, but now database performance falls behind. The bottleneck is mostly I/O, and this is where NoSQL caching comes in.
NoSQL cache
The main ones are Memcached and Redis.
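To show how such a cache shields the database, here is a minimal cache-aside sketch using the redis-py client; the key scheme, the five-minute TTL, and the stand-in database function are my assumptions, not anything prescribed by the original setup.

```python
# Cache-aside sketch: read from Redis first, fall back to MySQL on a miss.
import json
import redis

cache = redis.Redis(host="localhost", port=6379)

def load_user_from_db(user_id: int) -> dict:
    """Stand-in for a real MySQL query."""
    return {"id": user_id, "name": "example"}

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:                   # hit: the database is untouched
        return json.loads(cached)
    user = load_user_from_db(user_id)        # miss: go to MySQL once
    cache.setex(key, 300, json.dumps(user))  # keep it warm for 5 minutes
    return user
```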
Of course, the database side can also scale horizontally, and tables and databases can be split horizontally and vertically.
Horizontal and vertical splitting of databases and tables
Vertical split: as a web service grows, the database has to support more and more subsystems, and the tables need dividing along business lines. Tables closely tied to the same business area go into the same database, so the service's tables end up spread across different databases. The guiding principle is not to break third normal form.
Vertical splitting has limits, though. If the per-table data volume after the split is small and grows slowly, things can generally stay as they are; otherwise, split horizontally.
Horizontal split: dividing the data of one table across different tables or databases according to agreed rules.
In plain English, a horizontal split spreads a table's rows across different tables, and a vertical split spreads its columns across different tables.
The catch is that queries spanning shards then need filtering and merging of partial results; see the sketch below.
Anyway, it’s complicated…
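Still, the routing idea itself is simple. Below is a minimal sketch of horizontal-split routing plus the scatter-gather step, with made-up shard names and user_id assumed as the shard key:

```python
# Hypothetical shard layout; user_id is the shard key.
SHARDS = ["orders_db_0", "orders_db_1", "orders_db_2", "orders_db_3"]

def shard_for(user_id: int) -> str:
    """All rows for one user land on the same shard, so single-user
    queries touch exactly one database."""
    return SHARDS[user_id % len(SHARDS)]

def query_all_shards(run_on_shard) -> list:
    """Scatter-gather: a query that lacks the shard key must hit every
    shard, then filter and merge the partial results in the application."""
    results = []
    for shard in SHARDS:
        results.extend(run_on_shard(shard))
    return results
```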
Introducing SOA Architecture
SOA, Service-Oriented Architecture, is a component model. Its formal definition:
It links an application's different functional units (called services) through well-defined interfaces and contracts between those services.
Interfaces are defined in a neutral way, independent of the hardware platform, operating system, and programming language used to implement a service, so that services built on a variety of systems can interact in a uniform and universal way.
In essence, SOA is the idea of fully decoupling business from technology, so that the two can be combined freely.
Arguably, SOA represents the peak of current thinking in software design.
Introducing CDN mirroring
Reference: Summary of my project experience — CDN Image: 1 (Preliminary study)
Big data platform
Firewall: anti-DDoS
And of course there is a lot more:
Container deployment
…
Multi-datacenter disaster recovery (DR)
…
Virtualization and cloud computing
Everything moves into the cloud: cloud storage, cloud hosting, and so on.
Layer 4 load balancing
The reference architecture (shown as a figure in the original post) uses Nginx for layer-7 load balancing, LVS for layer-4 load balancing, Memcached as the cache service, Redis as the queue service, NFS as the file server, and a MySQL master/slave cluster as the database.
A user request first passes through the firewall, then reaches the LVS load-balancing master, which distributes it to an Nginx load balancer. Nginx splits the traffic by content type: static content is read directly from the static web node cluster and returned to the user, while dynamic requests are passed on to the Tomcat cluster, the dynamic web nodes.
When the Tomcat cluster receives a request, it runs a series of business checks, for example whether the result is cached. On a cache miss it queries the MySQL master/slave cluster directly. Writes that must be persisted but that Memcached cannot handle are sent to the Redis queue server, as sketched below.
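As a hedged sketch of that Redis-as-queue step (the queue name and payload shape are my assumptions, not the original author's code), the web tier pushes pending writes onto a Redis list and a separate worker drains them into MySQL:

```python
# Deferred-persistence queue: lpush at the head, brpop from the tail (FIFO).
import json
import redis

r = redis.Redis(host="localhost", port=6379)
QUEUE = "persist:pending"

def enqueue_write(record: dict) -> None:
    """Web tier: hand the record to the queue and return immediately."""
    r.lpush(QUEUE, json.dumps(record))

def worker_loop(save_to_mysql) -> None:
    """Worker tier: block until a record arrives, then persist it."""
    while True:
        _key, payload = r.brpop(QUEUE)
        save_to_mysql(json.loads(payload))
```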
Since both Nginx and Tomcat need shared access to files, an NFS file server can be used here.
In addition, massive data processing, such as log analysis, is currently done with Hadoop.
…
Of all these pieces, this article focuses on layer-4 load balancing with LVS.
The history of load balancing
Hardware era
A hardware load-balancing solution installs dedicated load-balancing devices between the servers and the external network, so a purpose-built appliance can perform this specific task independently of any operating system. Diverse balancing algorithms and intelligent traffic management raise overall service performance. But hardware is comparatively expensive.
Software era
Because of the high cost of hardware, it is not the first choice for Internet companies, so software load balancing emerged, and it is now the most widely used approach. For one thing, it works remarkably well; for another, it needs no great outlay and can come close to zero cost!
Among them, the common software implementations are:
- LVS
- Nginx
- HAProxy
LVS works mainly at layer 4; Nginx works mainly at layer 7; HAProxy is a proxy that provides high availability and load balancing for TCP- and HTTP-based applications, and also supports virtual hosting.
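To make the layer-4 versus layer-7 distinction concrete, here is a toy layer-4 forwarder in Python: it relays raw TCP bytes without ever parsing HTTP, which is exactly why layer-4 balancing is protocol-agnostic and cheap. This is only an illustration; LVS does this work inside the kernel, not in user space, and the backend address here is made up.

```python
# Toy layer-4 (TCP) forwarder: bytes in, bytes out, no HTTP parsing.
import socket
import threading

BACKEND = ("10.0.0.1", 80)  # hypothetical real server behind the balancer

def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy raw bytes one way until the connection closes."""
    try:
        while data := src.recv(4096):
            dst.sendall(data)
    except OSError:
        pass
    finally:
        dst.close()

def handle(client: socket.socket) -> None:
    backend = socket.create_connection(BACKEND)
    # Two threads shuttle bytes in both directions.
    threading.Thread(target=pipe, args=(client, backend), daemon=True).start()
    threading.Thread(target=pipe, args=(backend, client), daemon=True).start()

listener = socket.socket()
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 8080))
listener.listen()
while True:
    conn, _addr = listener.accept()
    handle(conn)
```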
Original author: Dashuai
Original address:
My project experience summary – load balancing understanding and practice: 1 – Dashuai’s blog – Blog Garden
Copyright: this article is an original post by the blogger; if you reproduce it, please credit the author and source. Thank you!