This article is part of Java Server, which collects my entire series of Java articles for study or interview preparation.
(1) What is database and table sharding
As the name suggests, sharding divides one table into several tables, which may live in the same database (table sharding) or in different databases (database sharding). The main problem it solves is the database performance bottleneck: even with indexes, query efficiency is bound to decline once the amount of data exceeds a certain size. Splitting databases and tables spreads a large data set across multiple tables or databases.
However, sharding is not a free win. If you do not need it, do not split blindly, and do not over-split.
(2) Ways to split databases and tables
2.1 Horizontal Split
Horizontal splitting divides a table by rows: every resulting table has the same columns but holds only part of the data. For example, if a table holds 40 million rows, each query costs more I/O and efficiency drops even with indexes in place. In that case, we can split the 40-million-row table into four tables of 10 million rows each.
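One simple way to decide which of the four tables a row goes to is to take its ID modulo the number of tables. The following is only a minimal sketch of that idea. The class name `HorizontalShardRouter`, the logical table name `t_order`, and the `_0` to `_3` suffix convention are assumptions made for illustration, not something a particular framework prescribes.

```java
/**
 * Minimal sketch of modulo-based routing for a horizontally split table.
 * All names here are hypothetical and chosen only for illustration.
 */
class HorizontalShardRouter {
    private final String logicalTable;
    private final int shardCount;

    HorizontalShardRouter(String logicalTable, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.logicalTable = logicalTable;
        this.shardCount = shardCount;
    }

    /** Maps a row ID to the physical table that should store that row. */
    String tableFor(long id) {
        // floorMod keeps the index in [0, shardCount) even for negative IDs.
        int index = (int) Math.floorMod(id, (long) shardCount);
        return logicalTable + "_" + index;
    }
}
```

With `new HorizontalShardRouter("t_order", 4)`, the row with ID 5 is routed to `t_order_1`. Modulo routing spreads rows evenly when IDs are evenly distributed, but changing the number of tables later means moving existing rows, so the table count is worth choosing carefully up front.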
2.2 Vertical Split
Vertical splitting is usually driven by business needs. It divides a table into multiple tables by column, so that each table stores only some of the fields.
For example, in an e-commerce system, both the basic information and the detailed description of a product are originally stored in one product table. In practice, the detailed description is needed only when a user opens the product page, so we can move it into a separate table.
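The product example can be sketched in Java as two row types that share the same product ID. This is an illustrative sketch only: the class and field names (`ProductBasic`, `ProductDetail`, `priceInCents`, and so on) are assumptions, and two in-memory maps stand in for the two physical tables.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a vertically split product table; all names are hypothetical. */
class VerticalSplitSketch {
    /** Frequently read columns, needed on every product list page. */
    static final class ProductBasic {
        final long productId;
        final String name;
        final long priceInCents;

        ProductBasic(long productId, String name, long priceInCents) {
            this.productId = productId;
            this.name = name;
            this.priceInCents = priceInCents;
        }
    }

    /** Large, rarely read column, needed only on the product detail page. */
    static final class ProductDetail {
        final long productId;
        final String description;

        ProductDetail(long productId, String description) {
            this.productId = productId;
            this.description = description;
        }
    }

    // In-memory stand-ins for the two tables, both keyed by productId.
    private final Map<Long, ProductBasic> basicTable = new HashMap<>();
    private final Map<Long, ProductDetail> detailTable = new HashMap<>();

    /** Writes one product as two rows, one per table. */
    void save(long productId, String name, long priceInCents, String description) {
        basicTable.put(productId, new ProductBasic(productId, name, priceInCents));
        detailTable.put(productId, new ProductDetail(productId, description));
    }

    /** The list page touches only the small table. */
    ProductBasic loadForListPage(long productId) {
        return basicTable.get(productId);
    }

    /** The detail page additionally reads the description table. */
    String loadDescription(long productId) {
        ProductDetail detail = detailTable.get(productId);
        return detail == null ? null : detail.description;
    }
}
```

The shared `productId` is what ties the two rows back together, which is also why reading a complete product now takes two lookups instead of one.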
(3) Problems with database and table sharding
Sharding is not recommended unless it is necessary, because it introduces problems of its own, such as:
1. Primary key problem in horizontal splitting:
In a horizontal split, data is spread across multiple databases or tables, so every query or insert must be able to locate exactly which database or table a row belongs to from its primary key. This also means the primary key has to be unique across all the split tables, not just within one of them.
2. Join problem in vertical splitting:
In a vertical split, the columns of one table are spread across multiple tables, so reading the complete data inevitably requires multi-table joins.
3. Paging query problem:
Paging gets cumbersome once the data is split across multiple tables and the pages must be sorted by some criterion. For example, to page through all the data in chronological order, we have to fetch data from every table, sort it once more in memory, and only then return it to the caller.
4. Deduplication and aggregate function problem:
In a single table, DISTINCT removes duplicate rows, but when the data is stored in multiple tables such functions become hard to apply. The same is true of GROUP BY, SUM, and similar operations.
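The paging and deduplication problems above both come down to merging per-table results in application code. Below is a minimal in-memory sketch of that merge step. The class `ShardMergeSketch` and its methods are hypothetical helpers written for this article, and each inner list stands in for the rows already fetched from one table.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Sketch of merging results from several split tables; names are hypothetical. */
class ShardMergeSketch {
    /**
     * Returns one page of the globally sorted result.
     * For the page to be correct, every table must contribute its first
     * (offset + limit) rows in sort order, not only its own local page.
     */
    static <T> List<T> mergePage(List<List<T>> perShardRows,
                                 Comparator<? super T> order,
                                 int offset, int limit) {
        if (offset < 0 || limit < 0) {
            throw new IllegalArgumentException("offset and limit must be non-negative");
        }
        List<T> merged = new ArrayList<>();
        for (List<T> rows : perShardRows) {
            merged.addAll(rows);
        }
        merged.sort(order); // the extra in-memory sort
        int from = Math.min(offset, merged.size());
        int to = Math.min(from + limit, merged.size());
        return new ArrayList<>(merged.subList(from, to));
    }

    /** DISTINCT across tables: per-table results can still overlap, so deduplicate again. */
    static <T> Set<T> mergeDistinct(List<List<T>> perShardValues) {
        Set<T> result = new LinkedHashSet<>();
        for (List<T> values : perShardValues) {
            result.addAll(values);
        }
        return result;
    }

    /** SUM across tables: add up the partial sums computed by each table. */
    static long mergeSum(List<Long> perShardSums) {
        long total = 0L;
        for (long partial : perShardSums) {
            total += partial;
        }
        return total;
    }
}
```

Note the cost hidden in `mergePage`: the deeper the page, the more rows each table has to return before the merge, which is why deep paging over split tables tends to be slow.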
Beyond these, sharding can cause many other problems.
(4) Sharding components
Sharding brings this many problems, yet sometimes we still have to do it. What can we do then? Several components make splitting databases and tables more convenient and efficient.
The mainstream options at present include ShardingSphere, Mycat, and TDDL. ShardingSphere is a top-level Apache project and is, in my view, the one most worth using.
(5) Summary
This article explained some basic concepts and problems of database and table sharding. Whether you split databases or tables, a real project calls for solid technical preparation and a sensible splitting scheme. Next, I will publish several articles on using ShardingSphere. I am Java fish boy, and I will see you next time!