The evolution of the Internet architecture

The four stages of the Internet

  1. Commercialization of traditional advertising in the Web 1.0 era

  2. Digitalization of the content industry in the Web 2.0 era

  3. Digitalization of consumer services in the Internet + mobile Internet era

  4. Datafication of all industries in the era of the Internet of Everything, cloud computing, and big data

1. Stage 1

Single-application (monolithic) architecture

All in One: all modules are bundled together without any layering.

All applications and software are deployed on a single machine, and all code is written as one piece, hence the name "All in One".

Features:

  1. Poor code maintainability

  2. Poor fault tolerance (errors are hard to recover from, exceptions cannot be caught and handled, and a single error can easily bring the whole machine down)

Layered development

Solution:

  1. Layered development (improve project maintainability)
  2. MVC Design Pattern (Three-tier Architecture for Web Applications)

Features:

  1. MVC Layered development

  2. Separate the database

Problems arise:

As the number of users increases, the application can no longer meet the demand.

Solution:

Clustering

2. Late stage 1

This leads to new problems:

1. High availability

The term "high availability" describes a system specifically designed to reduce downtime and keep its services continuously accessible. (Always available.)

2. High concurrency

High concurrency is one of the most important considerations when designing a distributed Internet system architecture. It usually means designing the system so that it can handle many requests in parallel.

Some common indicators related to high concurrency include Response Time, Throughput, Query Per Second (QPS), and number of concurrent users.

Response time: Time for the system to respond to a request. For example, if it takes 200ms for the system to process an HTTP request, this 200ms is the system response time.

Throughput: The number of requests processed per unit of time.

QPS: the number of queries responded to per second. In the Internet domain, the distinction between this metric and throughput is not very sharp.

Number of concurrent users: the number of users using system functions at the same time. For example, in an instant messaging system, the number of simultaneously online users represents, to some extent, the number of concurrent users of the system.
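These metrics are related: by Little's Law, the number of requests in flight is roughly QPS times average response time. A minimal sketch of this back-of-envelope math (the numbers are illustrative, not from the source):

```java
// Capacity math relating QPS, response time, and concurrency.
public class CapacityMath {
    // Average number of concurrent requests in flight (Little's Law).
    static double concurrency(double qps, double responseTimeSeconds) {
        return qps * responseTimeSeconds;
    }

    // Maximum QPS a fixed worker pool can sustain at a given latency.
    static double maxQps(int workers, double responseTimeSeconds) {
        return workers / responseTimeSeconds;
    }

    public static void main(String[] args) {
        // 1000 QPS at 200 ms latency keeps about 200 requests in flight.
        System.out.println(concurrency(1000, 0.2)); // 200.0
        // 200 worker threads at 200 ms each sustain at most about 1000 QPS.
        System.out.println(maxQps(200, 0.2));       // 1000.0
    }
}
```

This is why the same system can be described either by its QPS or by its concurrent-user count; the two are linked through response time.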

Improving system concurrency

There are two methods to improve system concurrency: Scale Up and Scale Out.

1. Vertical scaling

Vertical scaling (scale up): improve the processing capacity of a single machine. There are two ways to do it:

(1) Enhance the hardware of the single machine: increase the number of CPU cores (e.g. 32 cores), upgrade the network card (e.g. 10 GbE), upgrade to better disks (e.g. SSDs), expand disk capacity (e.g. 2 TB), and expand memory (e.g. 128 GB);

(2) Improve the performance of the single-machine architecture: use a cache to reduce the number of I/O operations, use asynchronous processing to increase single-service throughput, and use lock-free data structures to reduce response time;

In the early stage of rapid Internet business growth, if budget is not a problem, "enhancing single-machine hardware" is strongly recommended for improving concurrency, because at this stage the company's strategy is usually to grow the business as fast as possible, and upgrading hardware is usually the fastest way.

Summary: both hardware and architecture improvements share a fatal flaw: the performance of a single machine always has a limit. The ultimate answer to high concurrency in distributed Internet architecture is therefore horizontal scaling.

2. Horizontal scaling

Horizontal scaling (scale out): increase system performance linearly by adding more servers. Horizontal scalability is a requirement on the system architecture design; the difficulty lies in designing for it at every layer of the architecture.

3. High performance

High performance refers to high processing speed, low memory usage, and low CPU usage.

Cluster deployment

Cluster: The same service is deployed on multiple servers.

Features:

  1. The project is deployed on multiple servers (cluster)

Advantages:

  1. Support for high concurrency

  2. High availability support

Question:

  1. How to share sessions

    Redis cluster solution

  2. How are user requests forwarded

    Nginx handles request distribution and load balancing

Note: many established companies still use this architecture
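The request forwarding that Nginx performs can be sketched as a simple round-robin dispatcher over the Tomcat cluster; the server names here are hypothetical:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal round-robin dispatcher, sketching what Nginx's default
// upstream balancing does. Server names are made up for illustration.
public class RoundRobin {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobin(List<String> servers) { this.servers = servers; }

    // Pick the next backend in rotation, thread-safely.
    String pick() {
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }

    public static void main(String[] args) {
        RoundRobin lb = new RoundRobin(List.of("tomcat-1", "tomcat-2", "tomcat-3"));
        for (int k = 0; k < 4; k++) {
            System.out.println(lb.pick()); // cycles 1, 2, 3, 1, ...
        }
    }
}
```

Because requests rotate across servers, a user's next request may land on a different Tomcat, which is exactly why session sharing (e.g. via Redis) becomes necessary.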

Resolving database stress

The Nginx + Tomcat cluster effectively relieves pressure on the business layer, but pressure on the database now increases

1. Read/write separation

Solution:

Read/write separation with master/slave replication

Data is synchronized from the master database to the slaves; writes go to the master, and reads are load-balanced across the slaves.

MySQL itself provides master-slave replication.
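A minimal sketch of read/write separation at the application layer, assuming SELECT statements are routed to a slave and everything else to the master (the routing rule and data-source names are simplified assumptions, not a full implementation):

```java
// Sketch of read/write separation at the DAO layer: reads can be
// served by a replica, while writes must hit the master so that
// master-slave replication stays the single source of truth.
public class ReadWriteRouter {
    // Decide which data source a SQL statement should run against.
    static String route(String sql) {
        String s = sql.trim().toLowerCase();
        return s.startsWith("select") ? "slave" : "master";
    }

    public static void main(String[] args) {
        System.out.println(route("SELECT * FROM user WHERE id = 1")); // slave
        System.out.println(route("UPDATE user SET age = 20"));        // master
    }
}
```

Real setups usually hide this routing inside a data-source proxy so business code stays unchanged.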

Question:

  1. The database's own support for fuzzy (full-text) queries is not very good. Even with read/write separation it is hard to handle search traffic, so a search engine is used to relieve database access pressure

2. Introduce a search engine

Popular search engine technologies: Solr, ElasticSearch, Whoosh

Introducing a caching mechanism to reduce database access pressure

As the number of visits keeps increasing, the pressure on the database grows greater and greater (even with master-slave replication). Hot data (information users access frequently) should not be queried from the database on every request (it appears in many common query functions).

Some data is also better suited to memory than to the database (mobile login verification codes, rate-limiting IPs that access the server too frequently, ...). Try using Redis for these cases.
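The usual pattern for hot data is cache-aside: check the cache first, fall back to the database on a miss, then populate the cache. A minimal sketch, where a HashMap stands in for Redis and a loader function stands in for the real database query:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Cache-aside sketch for hot data. A HashMap stands in for Redis;
// the loader function stands in for a real database query.
public class CacheAside {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> db;
    int dbHits = 0; // how many times we actually queried the "database"

    CacheAside(Function<String, String> db) { this.db = db; }

    String get(String key) {
        String v = cache.get(key);
        if (v == null) {       // cache miss: go to the database
            dbHits++;
            v = db.apply(key);
            cache.put(key, v); // populate the cache for later reads
        }
        return v;
    }

    public static void main(String[] args) {
        CacheAside c = new CacheAside(k -> "row-for-" + k);
        c.get("user:1");
        c.get("user:1"); // second read is served from the cache
        System.out.println(c.dbHits); // 1
    }
}
```

With Redis, the cache entries would also carry an expiration time so stale hot data eventually refreshes.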

3. Split the database

Horizontal/vertical splitting of the database.

After all, there are limits to vertical scaling of a single database.

A single table tops out at roughly 10 million to 100 million rows (single-table capacity is limited after all).

Vertical table split: split by column.

id, name, age, bire.., tel…, remark….

Separating hot data from cold data is a vertical-split decision.

Horizontal table split: split by row.

By time, region, etc. (split based on business logic).

Splitting databases and tables:

Use third-party database middleware: MyCAT, Sharding-JDBC, DRDS (Alibaba)
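A horizontal split is typically routed by a hash rule, which is roughly what middleware such as MyCAT or Sharding-JDBC does for hash sharding; the table names and modulo rule here are illustrative assumptions:

```java
// Horizontal-split sketch: route a row to one of N physical tables
// by taking id mod N. Table names are made up for illustration.
public class Sharding {
    static String tableFor(long id, int shards) {
        // floorMod keeps the result non-negative even for negative ids.
        return "order_" + Math.floorMod(id, (long) shards);
    }

    public static void main(String[] args) {
        System.out.println(tableFor(10007, 4)); // order_3
        System.out.println(tableFor(10004, 4)); // order_0
    }
}
```

Routing by business logic (time or region, as above) works the same way, just with a different key-to-shard function.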

Current status:

High availability and high concurrency are ensured through design.

(Keep expanding server capacity to support high concurrency and high availability)

Question:

  1. Server cost, maintenance cost, labor cost
  2. Poor maintainability
  3. Poor scalability (almost no component reuse)
  4. Collaborative development is inconvenient (everyone changes the same business code, which easily causes errors and conflicts)
  5. The monolithic codebase keeps growing with the business, producing ever-larger artifacts at deployment time.

3. Stage 2

Vertical application architecture

As traffic grows, the speedup gained by tuning the single monolithic application becomes smaller and smaller, so the application is split into several unrelated applications to improve efficiency. At this stage, a Web framework (MVC) for accelerating front-end page development is key.

Horizontal split:

Split the large single application into several small applications

Splitting apart:

exam-parent

  1. exam-common
  2. exam-pojo (JavaBeans)
  3. exam-mapper (database operations)
  4. exam-service (business logic)

Using parent-project aggregation, each layer is split out to improve reuse, and modules can be pulled in as dependencies wherever an application needs them. (Note: Maven passes dependencies transitively between projects; versions can be managed in the parent project to improve project standardization.)

Solve a problem:

  1. Module reuse
  2. Reduce the size of each deployment artifact

No more large numbers of idle servers (if one layer receives too much traffic, simply deploy more instances for that business)

(Alibaba Cloud, Baidu Cloud, Tencent Cloud, Sina Cloud, JINGdong Cloud……)


Before the cloud:

Companies had to buy their own servers and hire operations staff to maintain them.

Industry: large numbers of Linux operations engineers

Enterprise: server-hosting companies

Vertical split:

Divide the large single application into functional modules.

Solve a problem:

  1. Maintainability (when a requirement changes, only the corresponding module needs to change)
  2. Function expansion (just add new modules)
  3. Collaborative development (different teams are responsible for different business modules)
  4. Performance scaling (flexible deployment; deploy more instances of heavily visited services)

Question:

  1. (Users demand more and more from the front-end pages, and modifications become more and more frequent.) Pages change a lot, and every application is complete from front to back, so if the customer wants a page modified, the entire application must be redeployed
  2. As services keep increasing, there are more and more application modules.

4. Stage 3

Distributed architecture

As vertical applications multiply, interaction between applications becomes inevitable. Core business is extracted as independent services, gradually forming a stable service center that lets front-end applications respond more quickly to changing market demands. At this stage, a distributed service framework (RPC) for improving service reuse and integration is key.

Distributed: A service is divided into multiple sub-services and deployed on different servers

In view of the above situation

Solve a problem:

  1. (Users demand more and more from the front-end pages, and modifications become more and more frequent.) Pages change a lot, and every application is complete from front to back, so if the customer wants a page modified, the entire application must be redeployed

    Front-end/back-end separation [horizontal split]

  2. As services keep increasing, there are more and more application modules. Analysis:

Previously everything ran on the same server (modules could call each other directly through in-process dependencies)

As the figure above shows, different applications are now deployed on different servers, so calls between services are inter-process calls

Solution:

RPC / HTTP(RESTful)

Remote Procedure Call (RPC): a protocol for requesting a service from a program on a remote computer over the network, without needing to understand the underlying network technology.
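The RPC idea can be sketched with a dynamic proxy: the caller invokes an ordinary interface method, and the proxy turns it into a "remote" request. Here the network transport is faked with a local lookup; real frameworks such as Dubbo serialize the call and send it over the network. All names are hypothetical:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Minimal illustration of the RPC idea using a JDK dynamic proxy.
public class RpcDemo {
    interface UserService { String findName(int id); }

    // Stands in for the server side that actually implements the service.
    static String handleRemoteCall(String method, Object[] args) {
        if (method.equals("findName")) return "user-" + args[0];
        throw new IllegalArgumentException(method);
    }

    @SuppressWarnings("unchecked")
    static <T> T proxy(Class<T> iface) {
        // In a real framework this handler would serialize the call
        // and send it over the network instead of dispatching locally.
        InvocationHandler h = (p, m, args) ->
                handleRemoteCall(m.getName(), args);
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[]{iface}, h);
    }

    public static void main(String[] args) {
        UserService svc = proxy(UserService.class);
        // Looks like a local call; behind the proxy it is a remote one.
        System.out.println(svc.findName(42)); // user-42
    }
}
```

This is why RPC code reads like ordinary method calls: the caller programs against an interface and never touches the network directly.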

Architectural changes bring new technologies and new problems

Distributed transactions, distributed locks, distributed sessions, distributed log management


Question:

  1. The invocation between services can become very confusing
  2. More and more services, capacity evaluation, waste of small services and other problems gradually appear

5. Stage 4

Elastic computing architecture

As services multiply, problems such as capacity assessment and resources wasted on small services gradually appear. A scheduling center is then added to manage cluster capacity in real time based on access pressure and improve cluster utilization. At this stage, a resource scheduling and governance center (SOA) is key to improving machine utilization.

SOA: service-oriented architecture

Function: solves the problem of chaotic interactions among many services

Service governance middleware (Dubbo / Spring Cloud)

A resource scheduling and governance center manages cluster capacity in real time based on access pressure, improving cluster and machine utilization.

Microservices architecture = 80% of SOA's service-architecture thinking + 100% componentized-architecture thinking + 80% domain-modeling thinking

6. Stage 5

Microservices Architecture

Microservices: a single application is broken down into discrete, atomic services; each one is called a microservice

Question:

  1. Building a single application requires SSM, web.xml, all the corresponding JARs, and the corresponding configuration files

    Splitting into multiple microservice applications means creating many projects (services)

    Spring Boot emerged to simplify initial project setup and development configuration

Conclusion:

Advantages:

  1. Each microservice is small enough to focus on a specific business function or business requirement.
  2. Microservices can be developed independently by small teams of two to five developers.
  3. Microservices are loosely coupled, functional services that are independent of each other during development or deployment.
  4. Microservices can be developed in different languages.
  5. Microservices allow an easy and flexible way to integrate automatic deployment through continuous integration tools such as Jenkins, Bamboo.
  6. A new member of a team can go into production faster.
  7. Microservices can be easily understood, modified, and maintained by a single developer, so small teams can focus on their own work and deliver value without constant cross-team coordination.
  8. Microservices allow you to leverage the latest technology.
  9. Microservices contain only business-logic code, not mixed with HTML, CSS, or other interface components.
  10. Microservices can be extended on demand instantly.
  11. Microservices can be deployed on servers with low- and mid-range configurations.
  12. Easy integration with third parties.
  13. Each microservice has its own storage capacity and can have its own database. You can also have a unified database.

Disadvantages:

  1. Many services mean high service management (governance) costs

  2. Deployment becomes harder (mitigated by Docker images/containers and K8s)

  3. Technical difficulty increases (distributed transactions, distributed locks, distributed sessions, distributed log management)

  4. Demands on the team's technical ability increase (Dubbo / Spring Cloud)

Today, microservices are a mature, widely used architecture, built on traditional and well-developed practice.