The following article is from Ctrip Technology, the author ruihua
Author’s brief introduction
Rui Hua, senior backend development engineer of Ctrip, focuses on system architecture, database and table, micro services, high availability, etc.
One, foreword
With the rapid development of international train ticket business and the rapid growth of orders, the bottleneck of single database has gradually emerged, and the conventional database optimization has been unable to achieve the desired effect. At the same time, the original design of the underlying database, there are some problems left over from history, such as some useless fields, table through self-increasing primary key association and each application directly connected to the database and other problems.
Therefore, after discussion, we decided to divide the order database into different databases and different tables, and reconstruct the order table at the same time, so as to fundamentally solve these problems.
Second, problems and challenges
After the goal is determined, the practice is not easy, there are a lot of problems and challenges. Here are some typical problems, which can be roughly divided into two categories: general problems of database and table, and specific business association problems. General problem of database and table
- How do you slice it, vertically or horizontally? How do I pick the key for the shard?
- How to route to corresponding libraries and tables by key value?
- What middleware is used, proxy or middleware?
- Cross-library operations, such as cross-library transactions and cross-library associations?
- Data capacity expansion problem. How to perform subsequent capacity expansion?
Specific service association problems
- How to solve the problem of directly connected data of each application?
- How to make a smooth transition?
- How can historical data be migrated properly?
3. Program selection
3.1 How to slice
Generally, there are four types of segmentation: vertical segmentation, vertical segmentation, horizontal segmentation and horizontal segmentation. How to choose the segmentation is generally decided according to one’s own business needs. Our goal is to fundamentally solve the large amount of data, single machine performance problems and other problems, vertical mode can not meet the demand, so we choose the horizontal branch library + horizontal branch table segmentation.
3.2 Fragment key selection
Generally, fields are selected as sharding keys according to their actual business. At the same time, hot issues and distribution of data can be considered. For example, an order system cannot be sharded based on country fields, otherwise there may be many order records in some countries and almost no order records in some countries, resulting in uneven data distribution. Relatively correct way, such as order class system, can select the order ID; Member system, you can select member ID.
3.3 Routing
Once the shard key is selected, the next question to explore is how to route to specific databases and specific tables. Take the fragment key route to a specific database as an example. The common routes are as follows:
Mapping the routing
To map routes, add a library, create a route mapping table, and store the mapping relationship between fragment key values and corresponding libraries. For example, the key value 1001 is mapped to the database DB01, as shown below:
The mapping mode has the advantage that the mapping mode can be adjusted arbitrarily and the capacity expansion is simple. However, there is a serious deficiency that the data volume of the mapping table in the mapping library is extremely large. Our original goal was to implement the function of library and table, but now, the map library map table is equivalent to returning to the state before library and table. Therefore, we do not take this approach in practice.
Packet routing
Packet routing refers to grouping the key values of fragments into groups. Each group corresponds to a specific database. For example, if the key value is 1000 to 2000, the database will be stored in db01, as shown in the following figure:
Grouping method has the advantages of simple expansion and implementation, but it also has a serious shortcoming of data distribution hotspot. For example, if the sharding key value is 2001 at a certain time, all data traffic will be sent to a certain library (DB02) in the future. This problem is also more serious in the Internet environment. For example, in some promotional activities, the order volume will have an obvious surge. At this time, each database cannot achieve the effect of allocating traffic, only one library is receiving traffic, and it will return to the state before the database and table. So we didn’t do it that way.
Hash routing
Hash routing, that is, hash the shard key, and then, based on the hash result, map it to a specific database. For example, if the key value is 1000 and the result of hashing it is 01, it will be stored in db01, as shown in the following figure:
Hashing has the advantage of uniform distribution and no hot spot problems, but on the contrary, data expansion is more troublesome. Because the hash function needs to be adjusted during capacity expansion, a data migration problem arises. In the Internet environment, it is often difficult to stop service during the migration process, so it is necessary to carry out the transition in the way of multi-library double-write, which is quite troublesome. Therefore, this approach is not adopted in practice.
Packet hash routing
Grouping hash routes, that is, the key values of fragments are grouped first and then hashed. As shown below:
In practice, we combined the previous several ways, draw lessons from their advantages and disadvantages, and adopted this way. Because of the grouping mode, it is very convenient to expand capacity, which solves the problem of data expansion. Hashing method can solve the problem of relatively uniform distribution and no single point database hot spot.
3.4 Technical Middleware
Sub-database sub-table middleware selection, in the industry’s scheme is more, the company has its own implementation. According to the different implementation modes, it can be divided into proxy and non-proxy modes. The following lists some common middleware in the industry, as shown in the following table (as of 2021-04-08) :
Why did we end up in the ShardingSphere? Mainly from the following factors:
Technology environment
- Our team is based on Java system and has some preference for Java middleware
- More lightweight components, components that can be studied in depth
- Some customization may be required
A professional degree
- It depends on which team the middleware is maintained by, whether it is built by a master, whether it is an industry benchmark
- Update iteration frequency, preferably with relatively frequent updates and active maintenance
- In terms of popularity, middleware with wide popularity and active community is preferred
- Performance issues. Performance meets our requirements
Use cost
- Cost of learning, cost of entry, and cost of customization
- Weak immersion, less business immersion
- The migration cost under the existing technology stack, our current technology stack is under the SSM system
Operational costs
- High availability and stability
- Reduce hardware resources, do not want to introduce a separate proxy middleware, and consider the operation and maintenance costs
- Rich buried points, perfect monitoring
Iv. Business practice
In business practice, we have experienced the design of new database and new table, sub-database and sub-table self-built agent, service closing, upstream order application migration, historical data migration and other processes.
4.1 New table model
In order to establish the relationship between sub-database and sub-table, and a more reasonable and effective structure, we applied for several new order sub-database and sub-table libraries, and designed a set of new table structure. Table names end with year, normalized table fields, appropriately added and deleted some fields, and used unique service keys instead of self-added primary keys for association. The table structure is shown as follows:
4.2 Service closure
Dal-sharding, a service agent that built a sub-database sub-table database. Each service that needs to operate the order library needs to operate the database through the proxy service to achieve a closing effect of the service. At the same time, the complexity of sub-database sub-table is shielded, and the basic method of adding, deleting, changing and checking database is standardized.
4.3 Smooth Transition
In the process of application migration, we added some synchronization logic to ensure smooth application transition, which has no impact on the application before and after application migration. The unmigrated application can read the order data written by the migrated application. The migrated application can read the order data written by the unmigrated application. At the same time, the unified implementation of this logic, reduce the cost of each application migration.Both old and new libraries readAs the name implies, when reading, both libraries may have to read, that is, read the new library first, if the record can be read, directly return; Otherwise, the old library record is read again and the result is returned. The basic process of double-reading is as follows:This ensures low cost of reading data during application migration. Upstream applications do not need to care about whether data comes from the new library or the old library, but only need to care about data reading. This reduces the logic of switching between the new library and the database and table, greatly reducing the workload of migration. In practice, we realize the double-read logic through the section, and put the double-read logic into the section to reduce the intrusion of the new library reading logic, which is convenient to remove and adjust the double-read logic later. In addition, some new configurations are added. For example, you can control which tables need to be read and which tables do not need to be read.New and old libraries double writeThe new and old library double write, that is, after the success of writing to the new library, write to the old library asynchronously. Double-write allows the order data to exist in both old and new libraries, allowing applications that have not yet been migrated to operate the database through proxy services to function properly. The basic process of double-writing is as follows:
In fact, there are more schemes for double write, such as database based log, by listening to parse database log synchronization; You can also double write through section; You can also perform synchronization through scheduled tasks. In addition, combined with our own order business, we can double write through order events (such as successful order creation, successful ticket issue, successful ticket refund, etc.) and synchronize data to the old database. At present, after consideration, we do not use database log to achieve, because this is equivalent to the logic sink to the database level, from the implementation is not flexible, at the same time, may also involve some permissions, scheduling and other issues. In practice, we double write in three other ways, in complementary form. Asynchronous section double write, to ensure maximum timeliness; Order events ensure the consistency of core nodes; Scheduled tasks ensure final consistency. Like double-reading, we also support configuration control over which tables need to be double-written, which tables do not need to be double-written, and so on.Migration of transitionWith the previous double-read/double-write as the basis, the migration is relatively easy to implement. We adopt a one-by-one migration approach, for example, by service, by channel and by supply. The migration work is disintegrated to reduce the impact area and pursue robustness. Generally, there are three steps: 1) In the first stage, the new database should be migrated among the newly connected suppliers, because the newly online suppliers have the least amount of orders, and the previous business will not be affected even if there are problems. 2) There are some orders for online businesses with a small amount of migration, but they are stable and cannot be affected by switching to a new library. Therefore, this kind of business is carried out in the second stage. 3) The last step is to gradually migrate large businesses to the new database. Such businesses can only be migrated after the previous guarantee, so as to ensure the normal progress of orders.
4.4 Data Migration
Data migration, that is, data migration from the old database to the new database, is a necessary process of switching between the old database and the new database. The general idea of migration is to migrate each table one at a time. We did not do this for business, but migrated from the order dimension. For example, if the Order library has the Order table, OrderStation table, and OrderFare table, instead of migrating each table separately, we synchronize the information for each Order based on the Order number. The general process is as follows: 1) Start a scheduled task, query the order list, and obtain the basic order information such as the order number. 2) According to the order number, query other information of the order to obtain a complete order information. 3) Check whether the order has been synchronized. If the synchronization has been completed before, it will be skipped directly; otherwise, the next order number will continue to be executed. 4) Map the complete order information of the old library into the corresponding model of the new library. 5) Synchronously write the new order information into each table of the new library. 6) Continue to execute the next order number until all order numbers are fully synchronized.
4.5 Finished Effect
After a new reconstruction of the order library, it has been running stably on the line and achieved the desired effect.
- Service closure, the branch library and table logic, closed into a service;
- Unified interface management, unified encryption of sensitive fields;
- Flexible function, provide rich functions, support customization;
- Library and table routing transparent, and based on mainstream technology, easy to use;
- Complete monitoring, support to table dimension monitoring;
Five, FAQ Summary
5.1 Typical problems of database and table
Question 1: How to conduct cross-library operations, associated queries, and cross-library transactions?
Answer: For cross-library operations, in the order main flow application, we currently prohibit such operations as cross-library query and cross-library transaction. For cross-library transactions, there will be no cross-library transactions because the routes are routed to the same database by order number and year of creation. The same for cross-library associated query, also does not exist, are often based on the order to query. At the same time, it can also be properly redundant, for example, store a station name field while storing the station code.
Question 2: How do I make paging queries?
Answer: At present, in the order main process application of paging query, we directly use Sharding-JDBC to provide the most primitive way of paging, directly according to the normal paging SQL, to query paging. Reason: The main process order service, such as the ticket issuing system, is often to query the order of the first few pages, directly query, there will not be a deep page turning. Of course, for higher requirements of paging query, you can go to the implementation of secondary query, to achieve more efficient paging query.
Question 3: How do I support very complex statistical queries? Answer: specially added a wide table, to meet the needs of those very complex queries, the commonly used query information, all fall into this table, and then can quickly get the results of these complex queries.
5.2 API method Problems
Q: How to meet different search conditions after the service is closed? Answer: Our API method is relatively fixed. The general query class has only two methods, query by order number and query by Condition. For all kinds of different query conditions, it can be realized by adding the field attribute of Condition instead of adding various query methods.
5.3 Uniformity problem
Question: Data in different groups is not evenly distributed, and there are hot spots? Answer: Yes, for example, after 5 years of operation, we have expanded to 3 groups with 3 libraries in each group, so the third group should be read and write most. However, such issues of uneven distribution and hot spots are acceptable, which are equivalent to the previous two groups and can be used as historical archive groups. Currently, the third group is mainly used. As the business develops, you can make the allocation. For example, if the business develops rapidly, the relatively reasonable allocation is usually not three libraries per group, but more libraries in the group. At the same time, because there are multiple libraries in each group, the hot issues of a library are essentially different from those of the previous one, and there is no need to worry about the single database bottleneck problem, which can be extended by adding libraries.
5.4 Intra-Group Routing Faults
Question: For order number only queries, does the routing process within a group read all tables within the group? Answer: According to the current design, yes. At present, the order number is grouped by year, and other information is not stored in the order number, which is generated in a unified way by Ctrip. That is, if we query according to the order number, we do not know which table it exists in, we need to query all tables in the group. For this kind of problem, it is generally recommended to increase the factor appropriately and store the year of creation in the order number so that you know which table is corresponding; It can also be extended appropriately every year, such as dividing tables every five years. After such adjustment, there should be relatively few tables in a group, which can greatly speed up query performance.
5.5 Asynchronous Double-write Problem
Question: Why did the double write process adopt a combination of multiple ways?
Answer: First of all, section mode can maximize the timeliness of order synchronization. However, in practice, we found that asynchronous slice double write, there will be multithreading concurrency problems. Because in the old library, the association relation of the table depends on the increment ID of the database, and depends on the insert order of the table, there will be association failure. Therefore, it is not enough to rely solely on section synchronization. A more robust approach, namely scheduled tasks (order events are unreliable message events, that is, there may be loss), is needed to ensure the consistency of the database.
About the author
We are ctrip train ticket R&D team, responsible for train ticket business development and innovation. Train ticket research and development is constantly exploring and innovating in the direction of multiple transportation routes and algorithms, multiple transportation vehicles one-stop reservation and high concurrency, continuously optimizing user experience and improving efficiency, and is committed to buying global train tickets for people all over the world.