Author: huashiou

Original: https://segmentfault.com/a/1190000018626163

preface

Double 11 is coming. Taking the background architecture design of Taobao.com as an example, this paper introduces the evolution process of the server architecture from one hundred concurrency to ten million concurrency, and lists relevant technologies encountered in each evolution stage, so that we can have an overall understanding of the evolution of the architecture.

The article concludes with a summary of some architectural design principles.

The basic concept

Before introducing architecture, in order to avoid confusion among some of the readers, I will introduce some of the most basic concepts in architecture design.

1) What is distributed?

If multiple modules in the system are deployed on different servers, it can be called a distributed system. For example, Tomcat and database are deployed on different servers, or two Tomcats with the same function are deployed on different servers.

2) What is high availability?

If some nodes in the system fail and other nodes continue to provide services, the system is considered highly available.

3) What is a cluster?

Software in a particular domain is deployed on multiple servers and provides a class of services as a whole, called a cluster.

For example, the Master and Slave of Zookeeper are deployed on multiple servers to provide centralized configuration services.

In a common cluster, clients can connect to any node to obtain services. When a node in the cluster goes offline, other nodes can automatically take over its services, indicating that the cluster has high availability.

4) What is load balancing?

When a request is sent to the system, the request is evenly distributed to multiple nodes in some way so that each node can handle the request load evenly. The system is considered to be load balanced.

5) What are forward and reverse proxies?

When the system needs to access the external network, it forwards the request through a proxy server. In the view of the external network, the access is initiated by the proxy server. In this case, the proxy server implements forward proxy.

When an external request enters the system, the proxy server forwards the request to a server in the system. For the external request, only the proxy server interacts with it. At this time, the proxy server implements reverse proxy.

In simple terms, forward proxy refers to the process that the proxy server accesses the external network instead of the internal system, and reverse proxy refers to the process that the external request to access the system is forwarded to the internal server through the proxy server.

Evolution of architecture

Age of Innocence: Standalone architecture





Take Taobao as an example: At the beginning of the website, the number of applications and users were small. Tomcat and database could be deployed on the same server.

When the browser makes a request to Taobao, the DNS server converts the domain name to the actual IP address 10.102.4.1, and the browser accesses Tomcat.

Architecture bottleneck: As the number of users increases, Tomcat and database compete for resources, and the stand-alone performance is insufficient to support services.

First Evolution: Tomcat and database are deployed separately





Tomcat and database monopolize server resources respectively, significantly improving their respective performance.

Architectural bottlenecks: As the number of users grows, concurrent reading and writing to the database becomes a bottleneck.

Second evolution: Local cache and distributed cache are introduced



Add local cache on the same Tomcat server or in the same JVM, and add distributed cache externally to cache popular product information or HTML pages of popular products. Caching can intercept most requests before they are read or written to the database, greatly reducing the database pressure.

The techniques involved include memcached as a local cache, Redis as a distributed cache, cache consistency, cache penetration/breakdown, cache avalanche, hot data set failures, etc.

Architecture bottleneck: The cache can resist most of the access requests. As the number of users increases, the concurrency pressure mainly falls on the stand-alone Tomcat, and the response gradually slows down.

Third evolution: Reverse proxy is introduced to implement load balancing



Tomcat is deployed on multiple servers and requests are evenly distributed to each Tomcat using reverse proxy software (Nginx).

The assumption here is that Tomcat supports up to 100 concurrent requests and Nginx supports up to 50,000 concurrent requests, so in theory Nginx can handle 50,000 concurrent requests by distributing requests to 500 Tomcats.

The technologies involved include: Nginx and HAProxy, both of which are reverse proxy software working in the seventh layer of the network. They mainly support HTTP protocol, and also involve session sharing and file uploading and downloading.

Architectural bottlenecks: Reverse proxies greatly increase the amount of concurrency that the application server can support, but the increase in concurrency also means that more requests penetrate the database, and stand-alone databases eventually become bottlenecks.

Fourth evolution: Read and write separation of databases



A database can be divided into read libraries and write libraries. Multiple read libraries are synchronized to the read library through the synchronization mechanism. If the latest written data needs to be queried, you can write a copy to the cache to obtain the latest data from the cache.

The technologies involved include: Mycat, which is database middleware, which can be used to organize the separate read and write of the database and the sub-database sub-table, and the client can access the lower database through it. It also involves data synchronization and data consistency.

Architecture bottleneck: As the number of services increases, there is a large gap between the visits of different services. Different services directly compete for databases and affect each other’s performance.

Fifth evolution: The database is separated by service





Data of different services is stored in different databases to reduce resource competition among services. For services with heavy traffic, more servers can be deployed to support them.

As a result, cross-business tables cannot be directly associated with analysis, which needs to be solved by other means. However, this is not the focus of this paper. Interested parties can search for their own solutions.

Architecture bottlenecks: As the number of users increases, stand-alone write libraries will gradually reach performance bottlenecks.

Sixth evolution: Split large tables into small tables





For example, comment data can be hashed according to product IDS and routed to corresponding tables for storage.

For payment records, tables can be created on an hourly basis, and each hourly table continues to be split into smaller tables, using user ids or record numbers to route the data.

As long as the amount of real-time table data is small enough and requests are distributed evenly across small tables on multiple servers, the database can improve performance through horizontal scaling. Mycat, mentioned earlier, also supports access control when large tables are split into smaller tables.

This approach significantly increases the difficulty of database operation and maintenance, and has higher requirements for DBAs. When a database is designed to this structure, it can already be called a distributed database

But this is just a logical database whole, and different parts of the database are implemented by different components individually

Such as sub-database sub-table management and request distribution, achieved by Mycat, SQL parsing by the stand-alone database, read and write separation may be achieved by gateway and message queue, query results summary may be achieved by the database interface layer and so on

This architecture is actually an implementation of the MPP (massively parallel processing) architecture.

At present, there are a lot of MPP databases in both open source and commercial, among which Greenplum, TiDB, Postgresql XC and HAWQ are popular, and commercial ones, such as NTNU GBase, Ruifan Snowball DB, Huawei LibrA and so on

Different MPP databases have different emphases. For example, TiDB is more focused on distributed OLTP scenarios, while Greenplum is more focused on distributed OLAP scenarios

These MPP databases generally provide SQL standard support capabilities such as Postgresql, Oracle, and MySQL, which can parse a query into a distributed execution plan and distribute it to each machine for parallel execution. Finally, the database itself summarizes the data and returns it

Also provides such capabilities as authority management, sub-database sub-table, transaction, data copy, and most can support more than 100 nodes of the cluster, greatly reducing the cost of database operation and maintenance, and enable the database can also achieve horizontal expansion.

Architectural bottlenecks: Both the database and Tomcat can scale horizontally, and the concurrency that can be supported increases dramatically. As users grow, stand-alone Nginx will eventually become a bottleneck.

Evolution 7: Use LVS or F5 to load balance multiple Nginx



Because the bottleneck is in Nginx, multiple Nginx load balancing cannot be achieved through two layers of Nginx.

Figure of the LVS and F5 is working in the fourth layer network load balancing solutions, including the LVS is a software that runs on the operating system kernel mode, to TCP forwarding requests or higher level of network protocols, so support agreement is richer, and the performance is much higher than Nginx, can assume that single LVS can support hundreds of thousands of concurrent forward requests;

The F5 is a load balancing hardware that is similar to the capabilities provided by LVS, with higher performance than LVS, but at an expensive price.

LVS is standalone software. If the LVS server is down, the entire backend system cannot be accessed. Therefore, a standby node is required.

Keepalived software can be used to simulate a virtual IP address and then bind the virtual IP address to multiple LVS servers. When a browser accesses the virtual IP address, the router will redirect it to the real LVS server

When the main LVS server is down, Keepalived automatically updates the routing table in the router and redirects the virtual IP address to another normal LVS server, thus making the LVS server highly available.

It is important to note here that the drawing from the Nginx layer to the Tomcat layer does not mean that all Nginx forwards requests to all Tomcat

In practice, there may be several Nginx connected to one part of Tomcat, with keepalived high availability between them, and other Nginx connected to another Tomcat, so that the number of Tomcat connections can be multiplied.

Architecture bottleneck: Since LVS is also a standalone server, when the number of concurrent users increases to hundreds of thousands, THE LVS server will eventually reach the bottleneck. At this time, the number of users reaches tens of millions or even hundreds of millions. The users are distributed in different regions and the distance from the server room is different, resulting in significantly different access delays.

Eighth Evolution: Load balancing between equipment rooms is implemented using DNS polling



On the DNS server, you can configure a domain name to correspond to multiple IP addresses. Each IP address corresponds to a virtual IP address in a different equipment room.

When users access Taobao, the DNS server uses polling or other policies to select an IP address for users to access. This mode implements load balancing between equipment rooms

At this point, the system can achieve the level of machine room expansion, tens of millions to hundreds of millions of levels of concurrency can be solved by increasing the machine room, the system entrance request concurrency is no longer a problem.

Architecture bottleneck: With the development of data richness and business, the demand for retrieval and analysis is becoming more and more abundant. Relying on database alone cannot solve such rich demand.

Evolution 9: NoSQL database and search engine technologies are introduced

When the amount of data in the database reaches a certain size, the database is not suitable for complex queries, and can only meet the scenarios of common queries.

In statistical report scenarios, results may not be generated when there is a large amount of data, and other queries may be slowed down when complex queries are run

For scenarios such as full-text retrieval, variable data structures, databases are inherently unsuitable. Therefore, appropriate solutions need to be introduced for specific scenarios.

For massive file storage, use HDFS. For key value data, use HBase or Redis

For full-text search scenarios, you can use search engines such as ElasticSearch. For multidimensional analysis scenarios, you can use solutions such as Kylin and Druid.

Of course, the introduction of more components will also increase the complexity of the system, the data stored by different components need to be synchronized, the problem of consistency needs to be considered, and more means of operation and maintenance need to manage these components.

Architecture bottleneck: The introduction of more components to solve the rich requirements, business dimensions can be greatly expanded, resulting in an application containing too much business code, business upgrade iterations become difficult.

Tenth Evolution: A large application is split into a small application



Application code is divided by service block, so that the responsibilities of each application are clearer and each application can be upgraded independently. In this case, some common configurations may be involved between applications. You can use the Zookeeper distributed configuration center to solve this problem.

Architecture bottleneck: Multiple copies of the same code exist when modules are shared between different applications. As a result, all application codes must be upgraded when common functions are upgraded.

Evolution 11: Reuse functions are abstracted into microservices

If functions such as user management, order, payment, and authentication exist in multiple applications, you can extract the codes of these functions to form a separate service to manage them

Such services are known as microservices, where common services are accessed between applications and services through HTTP, TCP, or RPC requests, and each individual service can be managed by a separate team.

In addition, Dubbo, SpringCloud and other frameworks can realize service governance, traffic limiting, circuit breaker, downgrade and other functions to improve the stability and availability of services.

Architecture bottlenecks: Different services have different interface access modes, and application code needs to adapt to multiple access modes to use the services. In addition, applications can access services and services may also access each other, so the call chain becomes very complex and the logic becomes confused.

Evolution 12: The ESB, an enterprise service bus, is introduced to mask service interface access differences

The ESB implements unified access protocol transformation. Applications access back-end services through ESB, and services invoke each other through ESB to reduce the coupling degree of the system.

The so-called SOA (Service Oriented) architecture, in which a single application is split into multiple applications, common services are isolated to manage, and the enterprise message bus is used to decouple services from each other, is easily confused with microservices architecture because of its similar presentation.

Personally, microservice architecture refers to the idea of extracting public services from the system for independent operation and maintenance management, while SOA architecture refers to an architectural idea of splitting services and unifying service interface access. SOA architecture contains the idea of microservices.

Architecture bottleneck: With the continuous development of services, the number of applications and services will continue to increase, and the deployment of applications and services will become complex. Multiple services deployed on the same server must solve the problem of conflicts in the operating environment

In addition, for scenarios requiring dynamic expansion and shrinkage, such as large-scale expansion, the performance of services needs to be horizontally expanded, so it is necessary to prepare the operating environment and deploy services on the newly added services, which makes operation and maintenance very difficult.

Evolution 13: Container technology is introduced to achieve operational environment isolation and dynamic service management

At present, the most popular containerization technology is Docker, and the most popular container management service is Kubernetes(K8S). Applications/services can be packaged as Docker images, which can be dynamically distributed and deployed through K8S.

Docker image can be understood as a minimum operating system that can run your application/service, which puts the application/service running code, and the running environment is set up according to the actual needs.

After the whole “operating system” is packaged as an image, it can be distributed to the machines where relevant services need to be deployed. Directly starting the Docker image can get the service up, making the deployment, operation and maintenance of the service simple.

Before scaling up, servers can be partitioned on existing machine clusters to start Docker images and enhance service performance

After a big push, the image can be turned off without affecting other services on the machine. (Prior to Section 18, the system configuration of the service running on the new machine would have to be modified to fit the service, which would cause the environment required by other services on the machine to be broken.)

Architecture bottleneck: After the use of containerization technology, the problem of service dynamic expansion and shrinkage can be solved, but the company still needs to manage the machine itself. In non-large-scale promotion, a large number of idle machine resources are still needed to deal with the large-scale promotion, and the machine cost and operation and maintenance cost are very high, and the resource utilization rate is low.

Evolution: A cloud platform is used to host the system

The system can be deployed on the public cloud to solve the problem of dynamic hardware resources by utilizing the massive machine resources of the public cloud

During the promotion period, the company temporarily applies for more resources in the cloud platform, combines Docker and K8S to rapidly deploy services, and releases resources after the promotion, truly achieving on-demand payment, greatly improving resource utilization rate and greatly reducing operation and maintenance costs.

The so-called cloud platform is to abstract massive machine resources into a whole resource through unified resource management

On the cloud platform, hardware resources (such as CPU, memory, network, etc.) can be applied dynamically on demand, and a general operating system, common technical components (such as Hadoop technology stack, MPP database, etc.) can be provided for users to use, and even developed applications

Users can solve their needs (such as audio and video transcoding services, mail services, personal blogs, etc.) without caring what technology is used inside the application.

The following concepts are involved in cloud platforms:

  1. IaaS: Infrastructure as a service. Corresponding to the above mentioned machine resources unified into a resource whole, can dynamically apply for hardware resources level;
  2. PaaS: Platform as a service. Provide common technical components to facilitate the development and maintenance of the system as mentioned above;
  3. SaaS: Software as a service. To provide developed applications or services as described above, pay for functionality or performance requirements.
So far: there are solutions to all of the problems mentioned above, from high concurrent access issues to service architecture and system implementation.

At the same time, it should be realized that the above introduction has intentionally omitted practical issues such as cross-machine room data synchronization, distributed transaction implementation, and so on, which will be discussed separately at a later time.

Summary of architectural design experience

1) Does the adjustment of the architecture have to follow the above evolution path?

No, the architecture evolution sequence described above is just a single improvement for one aspect

In the actual scenario, there may be several problems to be solved at the same time, or other aspects may reach the bottleneck first. At this time, it should be solved according to the actual problem.

For example, in the scenario where the amount of concurrency in the government class may be small but the business may be very rich, high concurrency is not the key problem to be solved. In this case, the solution to enrich the demand may be the priority.

2) How far should the architecture be designed for the system to be implemented?

For single-implementation systems with well-defined performance metrics, it is sufficient that the architecture is designed to support the performance metrics of the system, but with interfaces to extend the architecture in case it is needed.

Evolving systems, such as e-commerce platforms, should be designed to meet the requirements of the next phase of user volume and performance indicators, and the architecture should be upgraded iteratively based on the growth of the business to support higher concurrency and richer business.

3) What is the difference between server architecture and big data architecture?

In fact, the so-called “big data” is a general term for mass data collection, cleaning, transformation, data storage, data analysis, data services and other scenarios. Each scenario contains a variety of optional technologies

For example, data collection includes Flume, Sqoop, and Kettle, data storage includes DISTRIBUTED file systems HDFS and FastDFS, NoSQL databases HBase and MongoDB, and data analysis includes Spark technology stack and machine learning algorithm.

In general, big data architecture is an architecture that integrates various big data components based on business requirements and generally provides distributed storage, distributed computing, multidimensional analysis, data warehouse, machine learning algorithm and other capabilities.

On the other hand, server-side architecture refers more to the architecture at the application organization level, and the underlying capabilities are often provided by big data architecture.

4) Are there any architectural design principles?

  • N+1 design: every component in the system should be free of single points of failure;
  • Rollback design: To ensure that the system is forward compatible, there should be a way to roll back the version when the system is upgraded;
  • Disable design: a configuration should be provided to control whether specific functions are available and to quickly bring the function offline in the event of a system failure;
  • Monitoring design: in the design stage to consider the means of monitoring;
  • Multi-active data center design: If the system needs high availability, it should be considered to implement multi-active data center in more than one place. The system can still be used when at least one machine room is powered off.
  • Adopt mature technologies: New or open source technologies often have many hidden bugs, and failure without commercial support can be a disaster.
  • Resource isolation design: to avoid a single business to occupy all resources;
  • Architecture should be able to expand horizontally: only when the system can expand horizontally can bottleneck problems be effectively avoided;
  • Non-core purchase: If non-core functions need to occupy a large amount of R&D resources to solve, consider purchasing mature products;
  • Use of commercial hardware: commercial hardware can effectively reduce the probability of hardware failure;
  • Rapid iteration: the system should develop small functional modules quickly and go online for verification as soon as possible, so as to find problems early and greatly reduce the risk of system delivery;
  • Stateless design: The service interface should be stateless, so that access to the current interface does not depend on the state of the last access to the interface.

The last

Welcome to pay attention to my public hao [programmer chasing wind], the article will be updated in it, sorting out the data will be placed inside.