Preface

As the volume of information grows rapidly, hardware development has gradually become unable to keep up with the processing-capacity demands that application systems place on it. How, then, do we meet the performance requirements of our systems?

There is only one way: transform the system architecture to improve its capacity for expansion, combining many pieces of hardware with modest processing power into a system with high overall processing power. In other words, we must design for scalability.

Scalable design is a complex piece of systems engineering: it touches many aspects of a system, involves complicated technology, and may introduce problems of its own. But no matter how we design, and no matter what problems we run into, there are certain principles we must always uphold.

What is scalability

Before discussing scalability, many readers may ask: I often hear how scalable the design of this or that website system is and how excellent its architecture, but what exactly is expansion (Scale)? What does Scalable mean? And what is Scalability?

From the database perspective, to Scale is to give our database stronger service capability and stronger processing power. Scalable means that the database system can provide increased processing power through upgrades, whether by enlarging the capacity of a single server or by adding more servers. In theory, any database system is Scalable; the difference lies in how that is achieved.

Scalability, finally, describes how easily a database system's processing power can be improved in this way. In theory any system's processing power can be raised, but the cost of the upgrade, in money and manpower, varies from system to system, which is to say that the Scalability of different database applications varies greatly.

The difference between database applications here does not refer to the Scalability of the database software itself (although different database software does differ), but to the different architectural designs built on the same database software.

First of all, we need to be clear that the scalability of a database application system is mainly reflected in two directions: horizontal expansion and vertical expansion, commonly referred to as Scale Out and Scale Up.

Scale Out means increasing overall processing power by adding processing nodes, or more plainly, by adding machines.

Scale Up means increasing the processing capacity of the current nodes to improve overall processing power. Concretely, it means upgrading the configuration of existing servers, such as adding memory, adding CPUs, or improving the storage system's hardware, or else switching to a server with more processing power and a higher-end storage system.

By comparing the two Scale approaches, it is easy to see the strengths and weaknesses of each.

  • Scale Out advantages:
  1. Low cost: it is easy to build a computing cluster with very strong processing power out of cheap PC Servers;
  2. Bottlenecks are less likely, because processing power can easily be increased by adding hosts;
  3. The failure of a single node has little impact on the system as a whole.

  • Scale Out disadvantages:
  1. More processing nodes, most of them server hosts, increase the overall complexity of the system architecture and of the applications, which must be designed to higher standards than with Scale Up and usually need the cooperation of cluster management software.
  2. Cluster maintenance is more difficult and more costly.

  • Scale Up advantages:
  1. Fewer processing nodes, so maintenance is relatively simple;
  2. All data is centralized, so the application architecture is simple and relatively easy to develop.

  • Scale Up disadvantages:
  1. High-end equipment is expensive and faces little competition, so it is easy to become dependent on a single vendor;
  2. Limited by the pace of hardware development, the processing power of a single host is always finite, and it is easy to run into a performance bottleneck that ultimately cannot be solved;
  3. Equipment and data are centralized, so the impact of a failure is greater.

In the short term, Scale Up has the greater advantages: it keeps operation and maintenance costs down, simplifies the system architecture and application development, and makes fewer technical demands.

In the long run, however, Scale Out has the greater advantages, and it is the inevitable direction once a system reaches a certain scale. Whatever we do, the processing power of a single machine is always bounded by hardware technology, hardware advances only so fast, and very often that pace cannot keep up with the pace of business growth. Moreover, the more powerful a piece of high-end equipment is, the worse its price/performance ratio tends to be. Building a distributed cluster with strong processing power out of many cheap PC Servers will therefore always be a goal that companies pursue to cut costs and raise overall processing power. Reaching that goal may raise all sorts of technical problems, but it is always worth studying and practicing.

In the following content we will focus on the analysis and design of Scale Out. To Scale Out well, a distributed system design is required. For databases there are only two directions for achieving a good Scale Out: one is to expand by continuously replicating data, producing many identical data sources; the other is to expand by splitting one centralized data source into many smaller data sources.
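The second direction can be made concrete with a small routing sketch. The sketch below is a hypothetical illustration only: it takes a sharding key (here a user_id, an assumed choice) modulo the number of shards to pick one of several MySQL instances, so that all of one user's data lands on the same server. The shard list and host names are made up for the example.

```python
# Hypothetical sketch: routing data to shards by a sharding key.
# The shard list and the choice of user_id as the key are illustrative assumptions.

SHARDS = [
    {"host": "mysql-shard-0.internal", "port": 3306},
    {"host": "mysql-shard-1.internal", "port": 3306},
    {"host": "mysql-shard-2.internal", "port": 3306},
]

def shard_for(user_id: int) -> dict:
    """Pick the MySQL instance that owns all rows for this user."""
    return SHARDS[user_id % len(SHARDS)]

# Usage: every query for user 1042 goes to the same shard,
# so related rows stay together on one server.
print(shard_for(1042))
```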

Now let’s look at some principles that must be followed to design a scalable database application architecture.

Transaction relevance minimization principle

When building a distributed database cluster, many people worry about transactions. After all, transactions are one of the core features of a database.

In a traditional centralized database architecture, transactions are easy to handle: the database's own mature transaction mechanism takes care of them. Once the database moves to a distributed architecture, however, many transactions that used to run inside a single database may need to span multiple database hosts, which may require introducing distributed transactions.

As you are no doubt aware, though, distributed transactions are themselves a very complex mechanism. Whether in the large commercial database systems or in the various open source databases, most vendors do implement the feature, but always with limitations of one kind or another, and there are also bugs that can prevent certain transactions from being properly guaranteed or from completing smoothly.

At this point we may need to look for alternative ways to solve the problem. After all, transactions cannot simply be ignored; however we go about it, they still have to be supported.

Currently, there are three main solutions:

First, when designing for Scale Out, design the sharding rules sensibly so that, as far as possible, the data required by a transaction lives on the same MySQL Server, avoiding distributed transactions.

If we can design the data sharding rules so that all transactions complete on a single MySQL Server, our business requirements can be met easily, the application needs only minimal adjustments to follow the architectural change, and overall cost drops significantly. After all, a database architecture transformation is never just the DBA's concern; it needs a great deal of cooperation and support from the surrounding teams. Even when designing a brand-new system, we must weigh the total investment across every environment and every piece of work, not just the cost of the database itself but the corresponding development costs as well. If there is a “conflict of interest” between the parts, we have to make a trade-off in light of future expansion and overall cost and find the balance point that is best for the current stage.

However cleverly we design the sharding rules, though, it is difficult to keep the data of every transaction on the same MySQL Server. So although this solution costs the least, most of the time it can only take care of the most core requirements, and it is not a perfect solution.
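To show what keeping a transaction on one server looks like, here is a minimal, hypothetical sketch in which orders and order items are both sharded by user_id, so a purchase commits as one ordinary local transaction on a single MySQL Server. The table names, credentials, and the pymysql driver are illustrative choices, not part of the original text.

```python
# Hypothetical sketch: because orders and order_items are both sharded by
# user_id, a purchase touches only one shard and commits as a plain local
# transaction. Table names, credentials and the pymysql driver are
# illustrative choices, not requirements.
import pymysql

def place_order(shard_host: str, order_id: int, user_id: int,
                items: list[tuple[int, int]]) -> None:
    """Insert an order and its items on the single shard that owns this user."""
    conn = pymysql.connect(host=shard_host, user="app",
                           password="secret", database="shop")
    try:
        with conn.cursor() as cur:
            cur.execute("INSERT INTO orders (order_id, user_id) VALUES (%s, %s)",
                        (order_id, user_id))
            for product_id, qty in items:
                cur.execute("INSERT INTO order_items (order_id, product_id, qty) "
                            "VALUES (%s, %s, %s)", (order_id, product_id, qty))
        conn.commit()   # ordinary single-server commit; no distributed transaction
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()

# Usage: the shard is chosen by the user_id-based routing shown earlier,
# so both tables' rows for this order land on the same MySQL Server.
# place_order("mysql-shard-1.internal", order_id=7, user_id=1042,
#             items=[(501, 2), (502, 1)])
```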

Second, split large transactions into many small transactions. The database guarantees the integrity of each small transaction, and the application controls the overall integrity across those small transactions.

Compared with the previous approach, this one requires more changes to the application and places more demanding requirements on it. The application not only has to break large transactions apart, it must also ensure the integrity of each small transaction. In other words, the application needs a degree of transactional capability of its own, which undoubtedly raises its technical difficulty.

But the scheme has its own advantages. First, the data sharding rules become simpler and rarely run into restrictions, and simpler rules mean lower maintenance costs. Second, freed from heavy constraints imposed by the sharding rules, the database becomes more scalable; when a performance bottleneck appears, the existing databases can quickly be split further. Finally, the database moves further away from the actual business logic, which helps later architectural expansion.
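What application-controlled integrity might look like is sketched below: each step is a small local transaction on its own server, and if a later step fails the application runs a compensating step for the one that already committed. This saga-style step/compensation structure is an assumed illustration, not a pattern prescribed by the text.

```python
# Hypothetical sketch of application-managed integrity across small transactions:
# each step commits locally on its own server; if a later step fails, the
# already-committed step is undone by a compensating action (a saga-style idea).

def transfer(debit_step, credit_step, undo_debit):
    """Each argument is a callable that runs one small local transaction."""
    debit_step()                 # small transaction on server A, commits on its own
    try:
        credit_step()            # small transaction on server B
    except Exception:
        undo_debit()             # compensate the step that already committed
        raise

# Usage (stubs standing in for real per-shard transactions):
log = []
transfer(lambda: log.append("debit A by 100"),
         lambda: log.append("credit B by 100"),
         lambda: log.append("refund A by 100"))
print(log)
```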

Third, combine the above two solutions to integrate their advantages and avoid their disadvantages.

The first two solutions each have advantages and disadvantages, and in many ways they are opposites. We can draw on the strengths of both, adjust their design principles, and strike a balance in the overall architecture. For example, we can ensure that the data needed by a few core transactions lives on the same MySQL Server, while other, less critical transactions are split into small transactions whose overall integrity is ensured together with the application. And for some transactions that are not particularly important, we can analyze them further to see whether they can be avoided altogether.

By balancing the design principles in this way, we avoid forcing the application to manage too many small transactions to preserve overall integrity, and we also avoid sharding rules so complex that they cause maintenance difficulties and scalability constraints later on.

Of course, not every scenario needs to combine the two solutions. For applications that are not particularly strict about transactions, or whose transactions are inherently simple, a slightly refined set of sharding rules may be enough to meet the requirements. We can then use the first solution alone and spare the application from having to maintain the overall integrity of a set of small transactions, which greatly reduces its complexity.

Conversely, for applications whose transaction relationships are very complex and whose data is highly interrelated, there is no point in straining the design to keep all transaction data together: however hard we try, it will be difficult to meet the requirement, and we usually end up covering only part of it. In such cases we might as well keep the database side as clean as possible and let the application make some sacrifices.

Many large Internet applications today use one of these solutions. eBay, as far as is known, largely follows the third, combined solution, with the second scheme as the main approach and the first as a supplement. Besides the demands of its application scenarios, its strong technical capability is what ensures that sufficiently capable applications can be built on such an architecture. Another example is a large BBS application system in China (whose real name cannot be disclosed): its transaction correlations are not especially complex and the data correlation between its functional modules is not particularly high, and it adopts the first solution entirely, avoiding transactions that span multiple MySQL Servers by designing its data sharding rules sensibly.

Finally, we need to understand that when it comes to transactions, more is not better; fewer is better, and smaller is better. Whichever solution we use, we should design the application to rely on transactions as little as possible, or not at all. This is relative, of course, and surely only part of the data can achieve it. Yet it may well be that once a certain portion of the data is freed from transactions, the overall complexity of the system drops by a whole level, and both the application and the database system save a great deal of cost.

Data consistency principle

No matter how we Scale Up or Scale Out, and no matter how we design our architecture, guaranteeing the eventual consistency of data is an absolute principle that cannot be violated. I am sure readers are well aware of how important this principle is.

Like transaction integrity, guaranteeing data consistency can run into problems once we design the system to Scale Out; with Scale Up you will probably not meet this kind of trouble. Many people regard data consistency as, to some extent, part of transaction integrity, but to highlight its importance and its own characteristics I discuss it separately here.

So how do we Scale Out and ensure data consistency at the same time? This is often as much of a headache as guaranteeing transaction integrity, and it concerns many architects. Out of much collective practice came the BASE model: Basically Available, Soft state, Eventually consistent. These terms may sound complicated and abstruse, but they can be understood simply as a principle of non-real-time consistency.

That is, through appropriate techniques the application lets the data sit in a non-real-time, out-of-sync state for a short period, as long as the system as a whole still serves its users, and then uses follow-up processing to guarantee that the data eventually reaches a consistent state. The theoretical model sounds simple enough, but the practical implementation brings many difficulties.

The first question is: does all of the data need to be non-real-time consistent? I am sure most readers would answer no. But if not all data can tolerate delayed consistency, how do we decide which data must be consistent in real time and which only needs to become consistent eventually? Essentially this is a division by business priority across modules: higher-priority data belongs in the camp that must be consistent in real time, while lower-priority data can be allowed to be inconsistent for a short period and made consistent afterwards. It is a delicate decision, not one to make on the fly but only after very detailed analysis and careful evaluation. Not all data can be allowed to appear inconsistent in the system even briefly, and not all data can be brought back to consistency by later processing; at the very least, these two kinds of data must stay consistent in real time. Telling them apart requires detailed analysis of the business scenarios and requirements, followed by a thorough evaluation, before a conclusion can be drawn.

Second, how do we bring the data that is inconsistent to eventual consistency? Such data must be clearly separated, by business module, from the data that requires real-time consistency. Then, using appropriate asynchronous mechanisms and the corresponding background processes, we work through the data and logs in the system to process the currently inconsistent data until it all reaches a fully consistent state. Different modules can use different background processes, which avoids mixing up data and allows concurrent processing for better efficiency. Information such as user notifications does not need strict real-time consistency: it is enough to record the messages that need handling and let a background process work through them in order, so that foreground services are not blocked.
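As a minimal, hypothetical sketch of that background-process idea, the worker below drains a queue of pending notification messages in order, outside the foreground request path. The in-memory queue is a stand-in for a pending-messages table or log; in a real deployment the worker would poll MySQL or a message queue, which is an assumption of this example.

```python
# Hypothetical sketch: a background worker that brings notification data to
# eventual consistency. The in-memory queue stands in for a pending-messages
# table or log; a real deployment would poll MySQL or a message queue instead.
import queue
import threading

pending = queue.Queue()      # the foreground only enqueues and returns at once

def notify_worker():
    """Drain pending notifications in order, outside the request path."""
    while True:
        user_id, message = pending.get()
        # A real worker would write the notification to the user's shard and
        # mark the source record as processed, making the data consistent.
        print(f"delivered to {user_id}: {message}")
        pending.task_done()

threading.Thread(target=notify_worker, daemon=True).start()

# Foreground request path: just record the message; no waiting on delivery.
pending.put((1042, "your order has shipped"))
pending.join()               # only so this demo finishes after delivery
```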

Finally, avoid having foreground, online operations interact with both real-time consistent data and eventually consistent data at once. Because the two kinds of data can be in different states, mixing them in one interaction can easily lead to disorder, so non-real-time consistent data and real-time consistent data should be effectively isolated within the application. In some special scenarios it may even be necessary to record them on different MySQL Servers for physical isolation.

High availability and data security principles

Beyond the two principles above, I also want to stress high availability and data security. After a Scale Out design, the overall scalability of the system is indeed greatly improved, and overall performance naturally improves with it. But the system as a whole becomes harder to maintain than before: with a more complex architecture, both the application and the database environment grow larger and more complicated, and the most direct consequence is that maintenance and monitoring become more difficult.

If the result of this redesign were a system that crashed frequently and suffered frequent downtime, it would clearly be unacceptable. We must therefore use every technical means available to ensure that the availability of the system does not drop, and if possible improves overall.

This naturally brings us to another principle of the Scale Out design process, the principle of high availability: however we adjust the system's architecture, its overall availability must not decrease.

Discussing availability naturally leads to another, closely related principle: data security. To be highly available, the data in the database must be safe. Safety here does not mean protection against malicious attack or theft, but against loss through failure. In other words, we must ensure that data is not lost when software or hardware fails; once the data is lost, everything is lost. Data is the most central resource of a database application system, so the principle that it must never be lost is likewise beyond question.

The best way to uphold the high availability and data security principles is redundancy: remove every single point of failure from the hardware and software, and keep more than one copy of all data. Only then can these principles be properly guaranteed. Technically, this can be achieved with MySQL Replication, MySQL Cluster, and similar technologies.
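To make the redundancy idea slightly more concrete, here is a hypothetical sketch of an application-side routing layer over a replicated setup: writes go to the primary, reads prefer healthy replicas, and the primary serves as a last resort when no replica is reachable. The host names, health-check callable, and fallback policy are illustrative assumptions, not a description of how MySQL Replication itself works.

```python
# Hypothetical sketch: application-side routing over a replicated setup.
# Writes go to the primary; reads prefer replicas and fall back when one is down.
import random

PRIMARY = "mysql-primary.internal"
REPLICAS = ["mysql-replica-1.internal", "mysql-replica-2.internal"]

def host_for(query: str, is_up) -> str:
    """Pick a host for this query; is_up(host) is a health-check callable."""
    if not query.lstrip().upper().startswith("SELECT"):
        return PRIMARY                     # all writes hit the primary copy
    for host in random.sample(REPLICAS, len(REPLICAS)):
        if is_up(host):
            return host                    # spread reads across healthy replicas
    return PRIMARY                         # last resort: read from the primary

# Usage with a stub health check that pretends replica-1 is down:
print(host_for("SELECT * FROM orders", lambda h: h != "mysql-replica-1.internal"))
print(host_for("UPDATE orders SET status = 'paid'", lambda h: True))
```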

Conclusion

No matter how we design the architecture and however our scalability needs change, the principles discussed in this chapter remain important: minimizing transaction relevance, ensuring data consistency, ensuring availability, and ensuring data security are all things we should keep in mind and attend to throughout the design.

One reason MySQL is so popular in the Internet industry, besides being open source and easy to use, is its great advantage in scalability. Its different storage engines have features suited to different application scenarios, and capabilities such as Replication and Cluster are very effective aids to scaling.