Distributed primary key issues
In traditional database development, automatic generation of primary keys is a basic requirement, and each database offers support for it, such as MySQL's auto-increment columns and Oracle's sequences. After data sharding, however, generating globally unique primary keys across different data nodes becomes a tricky problem: the auto-increment columns of the different actual tables behind one logical table are unaware of each other, so they produce duplicate primary keys. Collisions can be avoided by constraining the initial value and step of each auto-increment column, but that introduces extra operation and maintenance rules, leaving the solution fragile and hard to scale.
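As a quick illustration of the offset-plus-step idea mentioned above (a sketch, not part of the original setup), here is how two nodes issuing ids with a shared step and distinct offsets never collide; in MySQL the analogous variables are `auto_increment_offset` and `auto_increment_increment`:

```python
# Sketch: two database nodes avoid auto-increment collisions by sharing a
# step equal to the node count and using distinct starting offsets.
def node_ids(offset, step, count):
    """Return the first `count` ids a node's auto-increment column would issue."""
    return [offset + i * step for i in range(count)]

ids_node_a = node_ids(1, 2, 5)  # node A issues 1, 3, 5, 7, 9
ids_node_b = node_ids(2, 2, 5)  # node B issues 2, 4, 6, 8, 10
assert not set(ids_node_a) & set(ids_node_b)  # the id ranges are disjoint
```

The downside is exactly what the paragraph above describes: adding a third node means reconfiguring the step on every existing node, which is why this approach scales poorly.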
There are a number of third-party solutions to this problem, such as UUID, which relies on an algorithm to generate non-repeating keys, or introducing a dedicated primary key generation service. To make things convenient and cover different scenarios, Apache ShardingSphere not only provides built-in distributed key generators such as UUID and SNOWFLAKE, but also exposes a key generator interface so that users can implement their own generators.
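For intuition about the SNOWFLAKE generator mentioned above, here is a sketch of the classic snowflake bit layout (41-bit millisecond timestamp, 10-bit worker id, 12-bit per-millisecond sequence). The custom epoch below is an assumption for illustration, not ShardingSphere's actual constant, and a real generator must also handle clock rollback:

```python
# Classic snowflake layout: 1 unused sign bit | 41-bit timestamp (ms since a
# custom epoch) | 10-bit worker id | 12-bit sequence within one millisecond.
EPOCH = 1609459200000  # assumed epoch (2021-01-01 UTC), purely illustrative

def snowflake(timestamp_ms, worker_id, sequence):
    assert 0 <= worker_id < 1024 and 0 <= sequence < 4096
    return ((timestamp_ms - EPOCH) << 22) | (worker_id << 12) | sequence

def decompose(snowflake_id):
    """Split an id back into its three fields."""
    return {
        "timestamp_ms": (snowflake_id >> 22) + EPOCH,
        "worker_id": (snowflake_id >> 12) & 0x3FF,
        "sequence": snowflake_id & 0xFFF,
    }

sid = snowflake(EPOCH + 1000, worker_id=123, sequence=7)
assert decompose(sid) == {"timestamp_ms": EPOCH + 1000, "worker_id": 123, "sequence": 7}
```

Because the timestamp occupies the high bits, snowflake ids are roughly time-ordered, which is friendlier to InnoDB primary key inserts than random UUIDs.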
Refer to the previous article: do you want to use auto-increment ids?
Shard key
The database column used for sharding is the key to splitting a database (or table) horizontally. For example, if the order table is sharded by taking the last digit of the order primary key modulo some number, then the order primary key is the sharding column. If the SQL carries no sharding column, full routing is performed, which performs poorly. Besides single-column sharding, Apache ShardingSphere also supports sharding by multiple columns.
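A minimal sketch of that routing decision (illustrative names, not ShardingSphere code): when the sharding value is present in the SQL, the router hits exactly one real table; when it is absent, the query must fan out to all of them:

```python
# Illustrative router: four real tables behind one logical table t_order.
ACTUAL_TABLES = ["t_order_0", "t_order_1", "t_order_2", "t_order_3"]

def route(order_id=None):
    """Return the real tables a query must touch."""
    if order_id is None:
        # No sharding value in the SQL: full routing, every table is queried.
        return ACTUAL_TABLES
    # Sharding value present: modulo picks a single real table.
    return [ACTUAL_TABLES[order_id % len(ACTUAL_TABLES)]]

assert route(6) == ["t_order_2"]      # 6 % 4 == 2, one table hit
assert route() == ACTUAL_TABLES       # the slow, broadcast path
```

This is why queries should carry the sharding column whenever possible.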
Routing
Broadcast routing
For SQL that does not carry sharding keys, broadcast routes are used.
SELECT * FROM t_order WHERE good_prority IN (1, 10);
Such SQL is broadcast to every real table in the database:
SELECT * FROM t_order_0 WHERE good_prority IN (1, 10);
SELECT * FROM t_order_1 WHERE good_prority IN (1, 10);
SELECT * FROM t_order_2 WHERE good_prority IN (1, 10);
SELECT * FROM t_order_3 WHERE good_prority IN (1, 10);
Unicast routing
Unicast routing is used to obtain real table metadata; it only needs to read from any one real table in any database.
Practice
Since our database is expected to evolve from a single database toward sharded tables and then sharded databases, we follow that trend: first practice table sharding within a single database, then move on to database-plus-table sharding.
To declare the version up front: the practice below uses ShardingSphere 5.0-beta.
Single-database table sharding – new business
Let’s assume that in an ideal world, the business is split before it even starts.
Since ShardingSphere does not automatically create tables according to the sharding rules, we first create the tables manually. Only the table names differ; the structure is identical. The following table creation statements are used for testing:
CREATE TABLE `t_order_sharding_0` (
`order_id` bigint(20) NOT NULL,
`user_id` bigint(20) DEFAULT NULL,
`create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT 'Creation time',
`update_time` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update Time',
PRIMARY KEY (`order_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
CREATE TABLE `t_order_sharding_1` (
`order_id` bigint(20) NOT NULL,
`user_id` bigint(20) DEFAULT NULL,
`create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT 'Creation time',
`update_time` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update Time',
PRIMARY KEY (`order_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
To keep the illustration simple, we create just two tables, t_order_sharding_0 and t_order_sharding_1.
Here is the table sharding section of my YAML configuration file:
# Configure the database table
sharding:
  sharding-algorithms:
    table_inline:
      props:
        algorithm-expression: t_order_sharding_${order_id % 2}
      type: INLINE
  tables:
    t_order_sharding:
      # Configure t_order table rules
      actual-data-nodes: write-ds.t_order_sharding_$->{0..1}
      table-strategy: # table sharding strategy
        standard:
          sharding-algorithm-name: table_inline
          sharding-column: order_id
Note that this configuration will not run, for two reasons:

1. The sharding algorithm is named table_inline, but the underscore is not resolved correctly here; it must be table-inline.
2. The inline expression t_order_sharding_${order_id % 2} must be written as t_order_sharding_$->{order_id % 2}, because the ${...} form collides with Spring's property placeholder syntax.
The final working configuration is:
# Configure the database table
sharding:
  sharding-algorithms:
    table-inline:
      props:
        algorithm-expression: t_order_sharding_$->{order_id % 2}
      type: INLINE
  tables:
    t_order_sharding:
      # Configure t_order table rules
      actual-data-nodes: write-ds.t_order_sharding_$->{0..1}
      table-strategy: # table sharding strategy
        standard:
          sharding-algorithm-name: table-inline
          sharding-column: order_id
As the configuration shows, my policy is simple: since there are only two tables, take the value of the order_id column modulo 2. The result can only be 0 or 1, which happen to be the suffixes of my two tables.
If you want to split into more tables, or use other custom table sharding strategies and algorithms, refer to the official documentation.
Next we write the application-side program and use interface requests to insert some data, to see whether rows land in different tables according to our rule. The user_id and order_id values were generated with a Snowflake utility class in the application itself.
Of course, ShardingSphere can also generate the keys for us through configuration:
# Configure the database table
sharding:
  key-generators:
    snowflake:
      type: SNOWFLAKE
      props:
        worker:
          id: 123
  sharding-algorithms:
    table-inline:
      props:
        algorithm-expression: t_order_sharding_$->{order_id % 2}
      type: INLINE
  tables:
    t_order_sharding:
      # Configure t_order table rules
      actual-data-nodes: write-ds.t_order_sharding_$->{0..1}
      table-strategy: # table sharding strategy
        standard:
          sharding-algorithm-name: table-inline
          sharding-column: order_id
      # Primary key generation strategy
      key-generate-strategy:
        key-generator-name: snowflake
        column: order_id
Pay attention to the key-generators configuration.
We inserted data through the business interface, then queried four rows and found that they had indeed been inserted into different tables. Importantly, the business program needed no changes at all; the logic is exactly the same as when using a single table, for example with SpringBoot + MybatisPlus for fast business implementation. Once configured, ShardingSphere automatically does the relevant work for you.
{
    "code": 100000,
    "msg": "",
    "data": [
        {
            "orderId": 654340378203258881,
            "userId": 12,
            "createTime": "2021-10-11 15:15:03",
            "updateTime": "2021-10-11 15:15:03"
        },
        {
            "orderId": 1447456383522967551,
            "userId": 12,
            "createTime": "2021-10-11 14:59:14",
            "updateTime": "2021-10-11 14:59:14"
        },
        {
            "orderId": 1447457650144055296,
            "userId": 12,
            "createTime": "2021-10-11 15:02:50",
            "updateTime": "2021-10-11 15:02:50"
        },
        {
            "orderId": 1447457651482038272,
            "userId": 12,
            "createTime": "2021-10-11 15:02:52",
            "updateTime": "2021-10-11 15:02:52"
        }
    ]
}
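Using the order ids actually returned in that response, we can verify by hand which real table each row lands in under the order_id % 2 rule:

```python
# Order ids taken from the JSON response above.
order_ids = [
    654340378203258881,
    1447456383522967551,
    1447457650144055296,
    1447457651482038272,
]
# Apply the same inline expression the config uses: t_order_sharding_{order_id % 2}
tables = [f"t_order_sharding_{oid % 2}" for oid in order_ids]
# The first two ids are odd and the last two even, so rows spread over both tables.
assert tables == [
    "t_order_sharding_1",
    "t_order_sharding_1",
    "t_order_sharding_0",
    "t_order_sharding_0",
]
```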
With the table sharding configured, there is no need to modify business code; you write SQL the same way as before, which is close to perfect. But you do need to watch your SQL: which constructs are supported must be checked against the official documentation. Ordinary SQL is generally fine, while some special cases are not. Those special cases are often a single giant SQL written out of laziness instead of a well-designed program, which is not the most appropriate approach anyway. For example, a complex SQL can be replaced by several simple SQL statements whose results are combined in memory by the program.
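A toy sketch of that last point (all data and names invented for illustration): issue a simple per-table query against each real table, then merge and aggregate the result sets in application memory instead of relying on one complex SQL:

```python
# Pretend these are the result sets of two simple SELECTs, one per real table.
rows_t0 = [{"order_id": 2, "user_id": 12}, {"order_id": 4, "user_id": 13}]
rows_t1 = [{"order_id": 1, "user_id": 12}, {"order_id": 3, "user_id": 12}]

# Merge in memory, re-sorting by order_id as a single-table ORDER BY would.
merged = sorted(rows_t0 + rows_t1, key=lambda r: r["order_id"])

# Aggregate in memory, equivalent to GROUP BY user_id COUNT(*).
orders_per_user = {}
for row in merged:
    orders_per_user[row["user_id"]] = orders_per_user.get(row["user_id"], 0) + 1

assert [r["order_id"] for r in merged] == [1, 2, 3, 4]
assert orders_per_user == {12: 3, 13: 1}
```

This trades a little application code for SQL that every sharding middleware can route trivially.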
Table sharding can also be combined with read/write splitting. The following is a complete configuration:
spring:
  profiles:
    include: common-local
  shardingsphere:
    props:
      sql:
        # Set whether SQL is displayed
        show: true
      check:
        table:
          metadata:
            enabled: false
    datasource:
      names: write-ds,read-ds-0
      write-ds:
        jdbcUrl: jdbc:mysql://mysql.local.test.myapp.com:23306/test?allowPublicKeyRetrieval=true&useSSL=false&allowMultiQueries=true&serverTimezone=Asia/Shanghai&useSSL=false&autoReconnect=true&useUnicode=true&characterEncoding=utf-8&zeroDateTimeBehavior=convertToNull
        type: com.zaxxer.hikari.HikariDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        username: root
        password: nicai
        connectionTimeoutMilliseconds: 3000
        idleTimeoutMilliseconds: 60000
        maxLifetimeMilliseconds: 1800000
        maxPoolSize: 50
        minPoolSize: 1
        maintenanceIntervalMilliseconds: 30000
      read-ds-0:
        jdbcUrl: jdbc:mysql://mysql.local.test.read1.myapp.com:23306/test?allowPublicKeyRetrieval=true&useSSL=false&allowMultiQueries=true&serverTimezone=Asia/Shanghai&useSSL=false&autoReconnect=true&useUnicode=true&characterEncoding=utf-8&zeroDateTimeBehavior=convertToNull
        type: com.zaxxer.hikari.HikariDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        username: root
        password: nicai
        connectionTimeoutMilliseconds: 3000
        idleTimeoutMilliseconds: 60000
        maxLifetimeMilliseconds: 1800000
        maxPoolSize: 50
        minPoolSize: 1
        maintenanceIntervalMilliseconds: 30000
    rules:
      readwrite-splitting:
        data-sources:
          glapp:
            write-data-source-name: write-ds
            read-data-source-names:
              - read-ds-0
            load-balancer-name: roundRobin # Load balancing algorithm name
        load-balancers:
          roundRobin:
            type: ROUND_ROBIN
      # Configure the database table
      sharding:
        key-generators:
          snowflake:
            type: SNOWFLAKE
            props:
              worker:
                id: 123
        sharding-algorithms:
          table-inline:
            props:
              algorithm-expression: t_order_sharding_$->{order_id % 2}
            type: INLINE
        tables:
          t_order_sharding:
            # Configure t_order table rules
            actual-data-nodes: write-ds.t_order_sharding_$->{0..1}
            table-strategy: # table sharding strategy
              standard:
                sharding-algorithm-name: table-inline
                sharding-column: order_id
            key-generate-strategy:
              key-generator-name: snowflake
              column: order_id
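As a side note on the ROUND_ROBIN load balancer configured above: reads simply rotate over the configured read data sources, while writes always go to the write data source. A minimal sketch (read-ds-1 is hypothetical; the sample config has only read-ds-0):

```python
import itertools

# Hypothetical pool of read replicas; the sample config defines only read-ds-0.
READ_SOURCES = ["read-ds-0", "read-ds-1"]
_cycle = itertools.cycle(READ_SOURCES)

def route_read():
    """Round-robin: each read query goes to the next replica in turn."""
    return next(_cycle)

def route_write():
    """All writes go to the single write data source."""
    return "write-ds"

picks = [route_read() for _ in range(4)]
assert picks == ["read-ds-0", "read-ds-1", "read-ds-0", "read-ds-1"]
assert route_write() == "write-ds"
```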
Single-database table sharding – existing business
To be covered in the next article.
Reference
- shardingsphere.apache.org/document/5…