Data fragmentation of ShardingSphere | 🏆 nuggets technical essay | double festival special articles

I. Core concepts

Traditional focuses data stored in a single node of the shortage of the solution: 1. Performance: B + tree index, USES mostly due to a relational database under the condition of data amount exceeds the threshold, the depth of the index will also make the disk access IO number increases, resulting in query performance decline, high concurrent requests at the same time also makes the centralized database system is the biggest bottleneck 2. Availability: 3. Operation and maintenance: When the data of a single database instance reaches the threshold, the time cost of data backup and recovery will become increasingly uncontrollable with the size of the data

Data sharding means that data stored in a single database is distributed to multiple databases or tables according to a certain dimension to improve performance bottlenecks and availability. Data fragmentation is divided into vertical fragmentation and horizontal fragmentation

Vertical sharding: Vertical sharding is called vertical sharding, also called vertical sharding. The core concept is dedicated to the database. Vertical splitting can alleviate problems caused by data volume and access, but it cannot cure them

Horizontal sharding: It is also called horizontal sharding. Instead of dividing data according to business logic, data is divided into multiple libraries or tables by a certain field (some fields) according to certain rules. Each shard contains only a part of the data.

1.1 Core concepts of data fragmentation

1.1.1 SQL

Logical table
Truth table
Data nodes
The binding table

If the binding table is not set between the primary table and the sub-table, the query process will be presented as a Cartesian product association. Otherwise, when the sub-table and the primary table are routed, all routing calculations will only use the policy of the primary table

1.1.2 shard

Shard key

Database fields for sharding

Subdivision algorithm

1. PreciseShardingAlgorithm: it is used to process the sharding scenario with In when a single key is used as the sharding key and is used In conjunction with StandardShardingStrategy

2. Composite lamination: corresponding ComplexKeysShardingAlgorithm, used for processing using multiple key as a shard key shard scenario, cooperate ComplexShardingStrategy use

3. Range sharding: corresponds to RangeShardingAlgorithm, which is used to process the sharding scenario in which BETWEEN AND, >, <, <=, AND >= with a single key as the sharding key. It is used in conjunction with StandardShardingStrategy

Hint shard: Corresponds to HintShardingAlgorithm, which is used to process scenarios using Hint line shard. It is configured together with HintShardingStrategy

Shard strategy

Contains the sharding key and the sharding algorithm

1. StandardShardingStrategy: corresponds to StandardShardingStrategy. Supports sharding operations for =, >, <, >=, <=, IN AND BETWEEN AND IN SQL statements

2. Compound sharding policy

3. The row expression sharding policy supports only a single sharding key.

4.Hint sharding strategy

5. No sharding policy

1.2 Data fragmentation kernel analysis

Two, code practice

2.1 Preparations

2.1.1 Database Construction Table construction

ds0

CREATE DATABASE `ds0`;

USE `ds0`;

CREATE TABLE `user0` (
  `user_id` BIGINT(20) NOT NULL,
  `user_name` VARCHAR(200) DEFAULT NULL,
  PRIMARY KEY (`user_id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;


CREATE TABLE `user1` (
  `user_id` BIGINT(20) NOT NULL,
  `user_name` VARCHAR(200) DEFAULT NULL,
  PRIMARY KEY (`user_id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;

Copy the code

ds1



CREATE DATABASE `ds1`;

USE `ds1`;

CREATE TABLE `user0` (
  `user_id` BIGINT(20) NOT NULL,
  `user_name` VARCHAR(200) DEFAULT NULL,
  PRIMARY KEY (`user_id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;


CREATE TABLE `user1` (
  `user_id` BIGINT(20) NOT NULL,
  `user_name` VARCHAR(200) DEFAULT NULL,
  PRIMARY KEY (`user_id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;

Copy the code

2.1.2 the introduction of pom

<dependency> <groupId>org.apache.shardingsphere</groupId> <artifactId>sharding-jdbc-spring-boot-starter</artifactId> The < version > 4.1.0 < / version > < / dependency >Copy the code

2.2 Code Practices

2.2.1 Configuration Guide (This section uses properties as an example)

2.2.1.1 Configure data source information by database and table

spring.shardingsphere.datasource.names= # Data source name. Multiple data sources are separated by commas

spring.shardingsphere.datasource.<data-source-name>.type= # Database connection pool class name
spring.shardingsphere.datasource.<data-source-name>.driver-class-name= # Database driver class name
spring.shardingsphere.datasource.<data-source-name>.url= # Database URL connection
spring.shardingsphere.datasource.<data-source-name>.username= # Database user name
spring.shardingsphere.datasource.<data-source-name>.password= # database password
spring.shardingsphere.datasource.<data-source-name>.xxx= # Other properties of the database connection pool
Copy the code

2.2.1.2 Configuring the fragmentation policy and algorithm

Configure data nodes for specific tables

Spring. Shardingsphere. Sharding. Tables. < logic - table - name >. The actual data - nodes = # is composed of data source name + name of the table, with the decimal separator. Multiple tables are separated by commas and inline expressions are supported. The default means that data nodes are generated using known data sources and logical table names for broadcast tables (that is, the same table is needed in each library for associated queries, mostly dictionary tables) or only libraries but not tables, and all libraries have exactly the same table structureCopy the code

Configure a sharding policy and algorithm for a table and a database. You can configure only one of the four policies for a database and a table

Single shard key

spring.shardingsphere.sharding.tables.<logic-table-name>.database-strategy.standard.sharding-column= # Shard column name
spring.shardingsphere.sharding.tables.<logic-table-name>.database-strategy.standard.precise-algorithm-class-name= # Precise sharding algorithm class name, used for = and IN. This class implements the Preciseshard Algorithm interface and provides a parameterless constructor
spring.shardingsphere.sharding.tables.<logic-table-name>.database-strategy.standard.range-algorithm-class-name= # Name of the range sharding algorithm class, used BETWEEN, optional. This class implements the RangeShardingAlgorithm interface and provides a parameterless constructor
Copy the code

Composite sharding of multiple sharding keys

spring.shardingsphere.sharding.tables.<logic-table-name>.database-strategy.complex.sharding-columns= # Shard column name, multiple columns separated by commas
spring.shardingsphere.sharding.tables.<logic-table-name>.database-strategy.complex.algorithm-class-name= # Name of the compound sharding algorithm class. The class needs to implement ComplexKeysShardingAlgorithm interfaces and provides a parameterless constructor
Copy the code

Line expression fragmentation

spring.shardingsphere.sharding.tables.<logic-table-name>.database-strategy.inline.sharding-column= # Shard column name
spring.shardingsphere.sharding.tables.<logic-table-name>.database-strategy.inline.algorithm-expression= # Sharding algorithm line expression, should conform to groovy syntax
Copy the code

Hint Sharding Strategy

spring.shardingsphere.sharding.tables.<logic-table-name>.database-strategy.hint.algorithm-class-name= Hint Indicates the class name of the sharding algorithm. This class implements the HintShardingAlgorithm interface and provides a parameterless constructor
Copy the code

Configure the default data source and the default database and table partitioning policy for the table routes for which the fragmentation rule is not configured

spring.shardingsphere.sharding.default-data-source-name= # Tables with no sharding rule configured will be located using the default data source
spring.shardingsphere.sharding.default-database-strategy.xxx= # Default database sharding policy, same as database sharding policy
spring.shardingsphere.sharding.default-table-strategy.xxx= # Default table sharding policy, same as table sharding policy
Copy the code

2.2.1.3 Configure the binding table and broadcast table

Binding-tables and broadcast-tables are sets

spring.shardingsphere.sharding.binding-tables[0]= # Binding table rule list
spring.shardingsphere.sharding.binding-tables[1]= # Binding table rule list
spring.shardingsphere.sharding.binding-tables[x]= # Binding table rule list

spring.shardingsphere.sharding.broadcast-tables[0]= # Broadcast table rule list
spring.shardingsphere.sharding.broadcast-tables[1]= # Broadcast table rule list
spring.shardingsphere.sharding.broadcast-tables[x]= # Broadcast table rule list
Copy the code

2.2.1.4 Other Configurations

Configure a distributed auto-increment primary key policy

Spring. Shardingsphere. Sharding. Tables. < logic - table - name >. The key to the generator. The column = # on the column name, Said by default. Do not use on the primary key generator spring shardingsphere. Sharding. Name < logic - table - name >. The key - generator. Type = # on the column value generator type, use the default default said on the column value generator. You can use a user-defined column value generator or select the built-in type: SNOWFLAKE/UUID spring. Shardingsphere. Sharding. Name < logic - table - name >. The key - generator. Props. < property name - > = # attribute configuration, pay attention to: Using SNOWFLAKE algorithm, need to configure the worker. The id with Max. Tolerate.. The time difference. Milliseconds properties. If the value generated by this algorithm is used as the fragment value, you are advised to configure the max.vibration.offset attributeCopy the code

The configuration enables the SQL parsing process to be displayed

Spring. Shardingsphere. Props. SQL. Show whether = # open SQL, according to the default value: falseCopy the code

2.2.2 Actual Configuration of Libraries and Tables (Yaml Configuration)

Here, the sharding strategy of the database and the table is both using row expression, and the sharding algorithm is to carry out modulo routing for 2 according to user_id

spring:
  shardingsphere:
    # Configure data source names. Multiple data sources are separated by commas
    datasource:
      names: ds0,ds1
    Configure the basic information of each data source
      ds0:
        # type: database connection pool name
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.jdbc.Driver
        url: jdbc:mysql://localhost:3306/ds0? serverTimezone=UTC
        username: root
        password: 1234
      ds1:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.jdbc.Driver
        url: jdbc:mysql://localhost:3306/ds1? serverTimezone=UTC
        username: root
        password: 1234
    sharding:
      tables:
        user:
          actual-data-nodes: ds$->{0.. 1}.user$->{0.. 1}
          # Configure the database split policy
          database-strategy:
            inline:
              sharding-column: user_id # Configure the sharding key name
              algorithm-expression: ds$->{user_id % 2} # Configure the sharding algorithm line expression
          Configure the sub-table policy
          table-strategy:
             inline:
              sharding-column: user_id # Configure the sharding key name
              algorithm-expression: user$->{user_id % 2} # Configure the sharding algorithm line expression
    props:
      sql:
        show: true
Copy the code

2.2.2 Actual configuration of Libraries and Tables (Java Code configuration)

In real business, many business tables need to be divided into tables. Of course, it is impossible to configure every table through YAML, so it may be necessary to configure library and table through Java code. The following code will function the same as the above YAML configuration.

pom

<dependency> <groupId>org.apache.shardingsphere</groupId> <artifactId>sharding-jdbc-core</artifactId> The < version > 4.0.1 < / version > < / dependency >Copy the code

Configuration code

@Configuration
public class ShardingDataSourceConfiguration {
    @Bean
    public DataSource getShardingDataSource(a) throws SQLException {
        ShardingRuleConfiguration shardingRuleConfig = new ShardingRuleConfiguration();
        // Obtain the sharding rule configuration for the user table
        TableRuleConfiguration userInfoTableRuleConfiguration = getUserInfoTableRuleConfiguration();

        shardingRuleConfig.getTableRuleConfigs().add(userInfoTableRuleConfiguration);
        return ShardingDataSourceFactory.createDataSource(createDataSourceMap(), shardingRuleConfig, new Properties());
    }

    /** * Configure real data source *@returnData source map */
    private Map<String, DataSource> createDataSourceMap(a) {
        Map<String, DataSource> dataSourceMap = new HashMap<>();
        DruidDataSource druidDataSource1 = new DruidDataSource();
        druidDataSource1.setUrl("jdbc:mysql://localhost:3306/ds0? serverTimezone=UTC");
        druidDataSource1.setUsername("root");
        druidDataSource1.setDriverClassName("com.mysql.jdbc.Driver");
        druidDataSource1.setPassword("1234");

        DruidDataSource druidDataSource2 = new DruidDataSource();
        druidDataSource2.setUrl("jdbc:mysql://localhost:3306/ds1? serverTimezone=UTC");
        druidDataSource2.setUsername("root");
        druidDataSource2.setDriverClassName("com.mysql.jdbc.Driver");
        druidDataSource2.setPassword("1234");

        dataSourceMap.put("ds0",druidDataSource1);
        dataSourceMap.put("ds1",druidDataSource2);
        return dataSourceMap;
    }

    /** * Configure the sharding rule for the user table **@returnSer table sharding rule configuration object */
    private TableRuleConfiguration getUserInfoTableRuleConfiguration(a) {

        // Configure the data node for the user table
        TableRuleConfiguration ruleConfiguration = new TableRuleConfiguration("user"."ds${0.. 1}.user${0.. 1}");
        // Set the fragment key
        String shardingKey = "user_id";
        // Configure the database fragmentation policy and fragmentation algorithm for the User table
        ruleConfiguration.setDatabaseShardingStrategyConfig(new InlineShardingStrategyConfiguration(shardingKey, "ds${user_id % 2}"));
        // Configure the fragmentation policy and algorithm for the user table
        ruleConfiguration.setTableShardingStrategyConfig(new InlineShardingStrategyConfiguration(shardingKey, "user${user_id % 2}"));

        returnruleConfiguration; }}Copy the code

Complete actual code on Github:Github.com/xiaomaomiao…

2.3 Record of stepping pits

1. An error is reported when ShardingShpere and Druid data source are started

Failed to configure a DataSource: ‘url’ attribute is not specified and no embedded datasource could be configured.

www.freesion.com/article/265…

🏆 nuggets technical essay | double festival special articles