9 distributed primary key ID generation schemes for database and table sharding: a fairly complete list


Introducing any technology carries risk, and sharding databases and tables is no exception. Unless the data volume of your databases and tables keeps growing to the point where the existing high-availability architecture can no longer support it, I don't recommend sharding. Once data is sharded, you will find yourself stepping into one pit after another, and the distributed primary key ID is the first pit you will encounter.

A logical table t_order is split into multiple real tables t_order_n, which are then distributed across different shard databases db_0, db_1, and so on. The auto-increment keys of the real tables are not aware of each other, so duplicate primary keys will be generated. In this case the database's own auto-increment primary key cannot satisfy the requirement that primary keys be globally unique across the sharded tables.

 db_0--
    |-- t_order_0
    |-- t_order_1
    |-- t_order_2
 db_1--
    |-- t_order_0
    |-- t_order_1
    |-- t_order_2

Although we can avoid duplicate IDs by strictly constraining the initial value and step of the auto-increment primary key in each sharded table, this sharply increases operation and maintenance costs and scales poorly. Once the number of sharded tables grows, the data in the existing tables has to change significantly, so this approach is not advisable.

 step = number of tables
 db_0--
    |-- t_order_0  ID: 0, 6, 12, 18...
    |-- t_order_1  ID: 1, 7, 13, 19...
    |-- t_order_2  ID: 2, 8, 14, 20...
 db_1--
    |-- t_order_0  ID: 3, 9, 15, 21...
    |-- t_order_1  ID: 4, 10, 16, 22...
    |-- t_order_2  ID: 5, 11, 17, 23...
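To make the scalability problem concrete, here is a minimal sketch that emulates the offset-and-step layout above. It is not Sharding-JDBC code: the class and method names are hypothetical, and it only assumes that table k of n starts at k and increments by n.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of offset-and-step auto-increment IDs. */
final class StepIdDemo {

    /** Returns the first `count` IDs of the table at `offset` when there are `tableCount` tables in total. */
    static List<Long> ids(int offset, int tableCount, int count) {
        List<Long> result = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            result.add((long) offset + (long) i * tableCount);
        }
        return result;
    }

    public static void main(String[] args) {
        // Six tables: the first table receives 0, 6, 12, 18.
        System.out.println(ids(0, 6, 4));
        // After expanding to eight tables, a new table at offset 6 would start at 6,
        // an ID that the first table has already issued under the old step.
        System.out.println(ids(6, 8, 4));
    }
}
```

The second call shows why expansion is painful: every existing table needs a new step, and the IDs already issued under the old step overlap with the new ranges.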

There are a number of third-party solutions that solve this problem well, such as generating non-repeating keys with specific algorithms (UUID, SNOWFLAKE, segment mode), or directly using primary key generation services such as Leaf and Tinyid.

Sharding-JDBC has two built-in distributed primary key generation schemes, UUID and SNOWFLAKE. It also abstracts a distributed primary key generator interface so that developers can implement custom generators. Later we will integrate the Tinyid primary key generation service into a custom generator.

To automatically generate a primary key ID for a field in Sharding-JDBC, configure the following in the application.properties file:

# primary key field
spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
# primary key ID generation scheme
spring.shardingsphere.sharding.tables.t_order.key-generator.type=UUID
# worker machine ID
spring.shardingsphere.sharding.tables.t_order.key-generator.props.worker.id=123

key-generator.column is the primary key field, key-generator.type is the primary key ID generation scheme (built-in or custom), and key-generator.props.worker.id is the machine ID. The machine ID participates in the bit operations when the generation scheme is SNOWFLAKE.

There are two things to note when using Sharding-JDBC distributed primary keys:

If the primary key field of the entity object in an insert operation has already been assigned a value, the configured primary key generation scheme is ignored, and the SQL is executed with the assigned value.
Do not set an auto-increment attribute on the primary key field; otherwise the primary key ID is generated with the default SNOWFLAKE scheme. For example, if the MyBatis-Plus @TableId annotation marks the order_id field as an auto-increment primary key, the ID is always generated by the snowflake algorithm, no matter which scheme is configured.

Below we look at the source code to see how Sharding-JDBC's built-in primary key generation schemes, UUID and SNOWFLAKE, are implemented.

UUID

Open the source of UUIDShardingKeyGenerator, the implementation class for the UUID primary key type, and you will find that its generation rule is a single line of code, UUID.randomUUID(). Well, that left me speechless.

Although a UUID is globally unique, it is not recommended as a primary key. In real business the primary key, whether user_id or order_id, is usually an integer, whereas UUID generates a 32-character string.

Storing and querying it costs MySQL considerable performance, and MySQL officially recommends keeping the primary key as short as possible. In addition, because UUIDs are unordered, using one as the database primary key causes data positions to change frequently, which seriously affects performance.

public final class UUIDShardingKeyGenerator implements ShardingKeyGenerator {
    private Properties properties = new Properties();

    public UUIDShardingKeyGenerator() {
    }

    public String getType() {
        return "UUID";
    }

    public synchronized Comparable<?> generateKey() {
        return UUID.randomUUID().toString().replaceAll("-", "");
    }

    public Properties getProperties() {
        return this.properties;
    }

    public void setProperties(Properties properties) {
        this.properties = properties;
    }
}
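The two drawbacks mentioned above (length and lack of order) are easy to observe with the JDK alone. The following is a small sketch, not part of Sharding-JDBC; the class name UuidKeyDemo is made up for illustration.

```java
import java.util.UUID;

/** Hypothetical demo of what a dash-free random UUID key looks like. */
final class UuidKeyDemo {

    /** Same transformation as the generator above: a random UUID with the dashes removed. */
    static String uuidKey() {
        return UUID.randomUUID().toString().replace("-", "");
    }

    public static void main(String[] args) {
        String first = uuidKey();
        String second = uuidKey();
        // 36 characters with dashes, 32 hexadecimal characters without them.
        System.out.println(first + " length=" + first.length());
        // Consecutive keys have no ordering relationship, so the comparison result is random.
        System.out.println("first < second ? " + (first.compareTo(second) < 0));
    }
}
```

Because the comparison result is effectively random, new rows land at arbitrary positions in an index ordered by primary key, which is the frequent data movement described above.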

SNOWFLAKE

SNOWFLAKE is the default primary key generation scheme; it generates a 64-bit long value.

The primary key generated by the snowflake algorithm in Sharding-JDBC consists of four parts: a 1-bit sign bit, a 41-bit timestamp, a 10-bit worker process ID and a 12-bit sequence number.

Sign bit (1bit)

In Java, the highest bit of a long is the sign bit: 0 for positive values and 1 for negative values.

Timestamp bit (41bit)

A 41-bit timestamp can hold 2 to the 41st power milliseconds, and a year has 1000L * 60 * 60 * 24 * 365 milliseconds, so the timestamp lasts about 69 years, which is enough for my lifetime.

Math.pow(2, 41) / (365 * 24 * 60 * 60 * 1000L) ≈ 69 years
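The same estimate can be checked with integer arithmetic. This is only a sanity check of the figure above; the class and method names are hypothetical.

```java
/** Hypothetical helper that checks how many years a millisecond timestamp field can cover. */
final class TimestampRangeDemo {

    /** Milliseconds in a 365-day year. */
    static final long MILLIS_PER_YEAR = 1000L * 60 * 60 * 24 * 365;

    /** Whole years representable by a millisecond counter of the given width in bits. */
    static long years(int bits) {
        return (1L << bits) / MILLIS_PER_YEAR;
    }

    public static void main(String[] args) {
        System.out.println(years(41)); // prints 69
    }
}
```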

Worker process bit (10bit)

Identifies a unique worker process. The default value is 0, and it can be set through the key-generator.props.worker.id property.

spring.shardingsphere.sharding.tables.t_order.key-generator.props.worker.id=0000

Serial number bit (12bit)

Distinguishes the IDs generated within the same millisecond. With 12 bits, one worker can generate up to 4096 IDs per millisecond.
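The four parts can be packed into and read back out of a long with shifts and masks. The sketch below assumes the 41/10/12 layout described above, which puts the worker ID at bit 12 and the timestamp at bit 22. The class and method names are hypothetical, and the timestamp argument is the number of milliseconds since whatever epoch the generator uses.

```java
/** Hypothetical helper that packs and unpacks the snowflake bit layout described above. */
final class SnowflakeLayoutDemo {

    static final int SEQUENCE_BITS = 12;
    static final int WORKER_ID_BITS = 10;
    static final int WORKER_ID_SHIFT = SEQUENCE_BITS;                  // 12
    static final int TIMESTAMP_SHIFT = SEQUENCE_BITS + WORKER_ID_BITS; // 22
    static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;       // 4095
    static final long WORKER_ID_MASK = (1L << WORKER_ID_BITS) - 1;     // 1023

    /** Combines the three variable parts; the sign bit stays 0 while the timestamp fits in 41 bits. */
    static long compose(long millisSinceEpoch, long workerId, long sequence) {
        return (millisSinceEpoch << TIMESTAMP_SHIFT) | (workerId << WORKER_ID_SHIFT) | sequence;
    }

    static long timestampOf(long id) {
        return id >>> TIMESTAMP_SHIFT;
    }

    static long workerIdOf(long id) {
        return (id >>> WORKER_ID_SHIFT) & WORKER_ID_MASK;
    }

    static long sequenceOf(long id) {
        return id & SEQUENCE_MASK;
    }

    public static void main(String[] args) {
        long id = compose(1000L, 123L, 7L);
        System.out.println(id);
        System.out.println(timestampOf(id) + " " + workerIdOf(id) + " " + sequenceOf(id)); // 1000 123 7
    }
}
```

Because the timestamp occupies the highest bits, IDs generated later compare greater than earlier ones, which is what makes snowflake IDs trend upward.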

Clock rollback

Now that we know how the snowflake primary key ID is composed, it is easy to see that the algorithm depends heavily on server time, and anything that depends on server time runs into a tricky problem: clock rollback.

Why does the clock go back?

Network Time Protocol (NTP) is a protocol used to synchronize and calibrate the time of computers on a network.

That is why we don't have to set the time on our phones manually, yet everyone's phone still shows the same time.

A hardware clock can become inaccurate (fast or slow) for various reasons. The NTP service is then needed to calibrate the time, and during calibration the server clock may jump forward or be set back.

How does the snowflake algorithm handle clock rollback?

A server clock rollback can produce duplicate IDs. The SNOWFLAKE scheme in Sharding-JDBC improves on the original snowflake algorithm by adding a maximum tolerated number of milliseconds for clock rollback.

If the clock is rolled back by more than the maximum tolerated number of milliseconds, the program reports an error directly. If the rollback is within the tolerated range, the default distributed primary key generator waits until the clock catches up with the time at which the last primary key was generated, and then resumes work.

The maximum tolerated clock rollback in milliseconds defaults to 0 and can be set through the max.tolerate.time.difference.milliseconds property.

# maximum tolerated clock rollback in milliseconds (read from the same props as worker.id)
spring.shardingsphere.sharding.tables.t_order.key-generator.props.max.tolerate.time.difference.milliseconds=5

Now let's look at the source of its implementation class, SnowflakeShardingKeyGenerator. The core flow is roughly as follows:

The time at which the last primary key was generated, lastMilliseconds, is compared with the current time, currentMilliseconds. If lastMilliseconds > currentMilliseconds, the clock has been rolled back.

It then checks whether the difference between the two (timeDifferenceMilliseconds) is within the configured maximum tolerance, max.tolerate.time.difference.milliseconds. If it is, the thread sleeps for the difference with Thread.sleep(timeDifferenceMilliseconds); otherwise an exception is thrown directly.

 
/**
 * @author xiaofu
 */
public final class SnowflakeShardingKeyGenerator implements ShardingKeyGenerator {

    // Excerpt: constants (EPOCH, SEQUENCE_MASK, shift widths), timeService, the
    // sequence fields and vibrateSequenceOffset() are omitted here.

    @Getter
    @Setter
    private Properties properties = new Properties();

    public String getType() {
        return "SNOWFLAKE";
    }

    public synchronized Comparable<?> generateKey() {
        // Current system time in milliseconds
        long currentMilliseconds = timeService.getCurrentMillis();
        // If the clock was rolled back within the tolerated range, wait out the
        // difference and then read the system time again
        if (waitTolerateTimeDifferenceIfNeed(currentMilliseconds)) {
            currentMilliseconds = timeService.getCurrentMillis();
        }
        // Same millisecond as the previous key
        if (lastMilliseconds == currentMilliseconds) {
            // & is bitwise AND. When sequence is 4095, (4095 + 1) & SEQUENCE_MASK is 0,
            // which means all 4096 values (0 to 4095) of this millisecond are used up,
            // so wait for the next millisecond. For any other value the result is non-zero.
            if (0L == (sequence = (sequence + 1) & SEQUENCE_MASK)) {
                currentMilliseconds = waitUntilNextTime(currentMilliseconds);
            }
        } else {
            // A new millisecond has started: reset the sequence to sequenceOffset
            vibrateSequenceOffset();
            sequence = sequenceOffset;
        }
        lastMilliseconds = currentMilliseconds;

        // Combine the three parts with bitwise OR (| yields 0 only if both bits are 0):
        //   timestamp bits | worker ID bits | sequence bits
        return ((currentMilliseconds - EPOCH) << TIMESTAMP_LEFT_SHIFT_BITS) | (getWorkerId() << WORKER_ID_LEFT_SHIFT_BITS) | sequence;
    }

    // Decide whether to wait out a tolerated time difference
    @SneakyThrows
    private boolean waitTolerateTimeDifferenceIfNeed(final long currentMilliseconds) {
        // The last generation time is not later than the current system time: normal, no need to wait
        if (lastMilliseconds <= currentMilliseconds) {
            return false;
        }
        // Clock rollback: the last generation time is later than the current system time.
        // Compute the difference between the two.
        long timeDifferenceMilliseconds = lastMilliseconds - currentMilliseconds;
        // The difference must be smaller than the maximum tolerated difference, otherwise an exception is thrown
        Preconditions.checkState(timeDifferenceMilliseconds < getMaxTolerateTimeDifferenceMilliseconds(),
                "Clock is moving backwards, last time is %d milliseconds, current time is %d milliseconds", lastMilliseconds, currentMilliseconds);
        // Sleep for the time difference
        Thread.sleep(timeDifferenceMilliseconds);
        return true;
    }

    // The configured machine ID
    private long getWorkerId() {
        long result = Long.valueOf(properties.getProperty("worker.id", String.valueOf(WORKER_ID)));
        Preconditions.checkArgument(result >= 0L && result < WORKER_ID_MAX_VALUE);
        return result;
    }

    private int getMaxTolerateTimeDifferenceMilliseconds() {
        return Integer.valueOf(properties.getProperty("max.tolerate.time.difference.milliseconds", String.valueOf(MAX_TOLERATE_TIME_DIFFERENCE_MILLISECONDS)));
    }

    // Spin until the clock moves past lastTime
    private long waitUntilNextTime(final long lastTime) {
        long result = timeService.getCurrentMillis();
        while (result <= lastTime) {
            result = timeService.getCurrentMillis();
        }
        return result;
    }
}
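Since the excerpt above depends on fields that are not shown, here is a self-contained sketch of the same flow that can be run without Sharding-JDBC. It is a simplified illustration, not the library's implementation: the class name, the constructor parameters and the injectable clock are assumptions made for testing, and the sequence simply resets to 0 instead of alternating offsets.

```java
import java.util.function.LongSupplier;

/** Simplified, hypothetical snowflake generator with a tolerance for clock rollback. */
final class TolerantSnowflakeDemo {

    private static final long SEQUENCE_MASK = (1L << 12) - 1;
    private static final int WORKER_ID_SHIFT = 12;
    private static final int TIMESTAMP_SHIFT = 22;

    private final LongSupplier clock;        // injectable time source in milliseconds
    private final long epoch;                // custom epoch subtracted from the clock value
    private final long workerId;             // 0 to 1023
    private final long maxToleranceMillis;   // rollback below this value is waited out
    private long lastMillis;
    private long sequence;

    TolerantSnowflakeDemo(LongSupplier clock, long epoch, long workerId, long maxToleranceMillis) {
        if (workerId < 0 || workerId >= 1024) {
            throw new IllegalArgumentException("workerId must be in [0, 1024)");
        }
        this.clock = clock;
        this.epoch = epoch;
        this.workerId = workerId;
        this.maxToleranceMillis = maxToleranceMillis;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastMillis) {
            // Clock rollback: fail if the difference is not below the tolerance, otherwise wait it out.
            long difference = lastMillis - now;
            if (difference >= maxToleranceMillis) {
                throw new IllegalStateException("Clock moved backwards by " + difference + " ms");
            }
            try {
                Thread.sleep(difference);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for the clock", e);
            }
            now = clock.getAsLong();
        }
        if (now == lastMillis) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // All 4096 values of this millisecond are used: spin until the next millisecond.
                while (now <= lastMillis) {
                    now = clock.getAsLong();
                }
            }
        } else {
            sequence = 0;
        }
        lastMillis = now;
        return ((now - epoch) << TIMESTAMP_SHIFT) | (workerId << WORKER_ID_SHIFT) | sequence;
    }

    public static void main(String[] args) {
        long[] now = {1_000L}; // fake clock value in milliseconds
        TolerantSnowflakeDemo generator = new TolerantSnowflakeDemo(() -> now[0], 0L, 1L, 0L);
        System.out.println(generator.nextId());
        System.out.println(generator.nextId()); // same millisecond: only the sequence changes
        now[0] = 990L;                          // simulate a 10 ms rollback with zero tolerance
        try {
            generator.nextId();
        } catch (IllegalStateException expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```

With the default tolerance of 0, any rollback is rejected, which matches the strict comparison in the source above. Raising the tolerance trades a short pause for availability.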

Looking at the generated data, order_id is an 18-digit long integer. If that feels too long, don't worry, a solution is given later!

Custom

Sharding-JDBC uses Java's Service Provider Interface (SPI) mechanism to make the primary key generation rule extensible. SPI is a service discovery mechanism that scans the files under META-INF/services on the classpath and automatically loads the classes defined in them.

Implementing a custom primary key generator is actually quite simple and takes two steps.

The first step is to implement the ShardingKeyGenerator interface and override its methods: getType() returns the type name of the custom primary key generation scheme, and generateKey() contains the actual primary key generation rule.

The following code uses AtomicInteger to simulate an ordered, auto-increment ID generator.
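The discovery step can be sketched with the JDK's java.util.ServiceLoader. The example below does not use Sharding-JDBC classes: KeyGenerator is a hypothetical stand-in for the real interface, and the lookup by getType() is an assumption about how a type name such as UUID or SNOWFLAKE could be resolved to an implementation.

```java
import java.util.Optional;
import java.util.ServiceLoader;

/** Hypothetical demo of SPI-style lookup of a key generator by its type name. */
final class SpiLookupDemo {

    /** Stand-in for the generator contract; not the real Sharding-JDBC interface. */
    interface KeyGenerator {
        String getType();

        Comparable<?> generateKey();
    }

    /**
     * Loads every implementation listed in the META-INF/services file named after the
     * interface's binary name and returns the first one whose type matches.
     */
    static Optional<KeyGenerator> findByType(String type) {
        for (KeyGenerator candidate : ServiceLoader.load(KeyGenerator.class)) {
            if (candidate.getType().equals(type)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // No provider file is registered for this demo interface, so nothing is found.
        System.out.println(findByType("XXX").isPresent());
    }
}
```

The lookup only succeeds once a provider file lists an implementation class, which is why registering the custom generator under META-INF/services is a required step.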

import java.util.Properties;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.shardingsphere.spi.keygen.ShardingKeyGenerator;
import org.springframework.stereotype.Component;

/**
 * @Author: xiaofu
 * @Description: Custom primary key generator
 */
@Component
public class MyShardingKeyGenerator implements ShardingKeyGenerator {

    // In-memory counter: only simulates ordered IDs inside a single JVM
    private final AtomicInteger count = new AtomicInteger();

    /** Custom generation scheme type */
    @Override
    public String getType() {
        return "XXX";
    }

    /** Core method: generate the primary key ID */
    @Override
    public Comparable<?> generateKey() {
        return count.incrementAndGet();
    }

    @Override
    public Properties getProperties() {
        return null;
    }

    @Override
    public void setProperties(Properties properties) {
    }
}

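Second, since the functionality is extended through the SPI mechanism, the custom primary key generator class must be registered in a file under META-INF/services. As Java SPI requires, the file is named after the fully qualified name of the ShardingKeyGenerator interface and contains the implementation class name.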

com.xiaofu.sharding.key.MyShardingKeyGenerator

After that, let's test it: configure the custom primary key generation type XXX and insert some data to see the effect.

spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
spring.shardingsphere.sharding.tables.t_order.key-generator.type=XXX

The SQL parsing logs on the console show that the order_id field is inserted in ordered, increasing sequence, so the configuration works.

Extending to all nine

Since custom schemes are possible, there are many ways to implement a distributed primary key. Thinking back to my earlier article “9 Distributed ID generation schemes”, I found that they are all compatible. Here I pick Didi's Tinyid to practice with. Because it is a standalone distributed ID generation service, the environment has to be set up first.

The Tinyid service provides two access methods, HTTP and tinyid-client. The tinyid-client method is used below.

Setting up the Tinyid service

First pull the source code from https://github.com/didi/tinyid.git.

Since it generates distributed IDs in segment mode, it depends on a database. Create the corresponding tables tiny_id_info and tiny_id_token and insert the default data.


CREATE TABLE `tiny_id_info` (
	`id` BIGINT (20) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'Auto-increment primary key',
	`biz_type` VARCHAR (63) NOT NULL DEFAULT '' COMMENT 'Business type, unique',
	`begin_id` BIGINT (20) NOT NULL DEFAULT '0' COMMENT 'Start ID. Only the initial value is recorded. begin_id and max_id should be the same when initializing',
	`max_id` BIGINT (20) NOT NULL DEFAULT '0' COMMENT 'Current maximum ID',
	`step` INT (11) DEFAULT '0' COMMENT 'Step size',
	`delta` INT (11) NOT NULL DEFAULT '1' COMMENT 'Increment of each ID',
	`remainder` INT (11) NOT NULL DEFAULT '0' COMMENT 'Remainder',
	`create_time` TIMESTAMP NOT NULL DEFAULT '2010-01-01 00:00:00' COMMENT 'Creation time',
	`update_time` TIMESTAMP NOT NULL DEFAULT '2010-01-01 00:00:00' COMMENT 'Update time',
	`version` BIGINT (20) NOT NULL DEFAULT '0' COMMENT 'Version number',
	PRIMARY KEY (`id`),
	UNIQUE KEY `uniq_biz_type` (`biz_type`)
) ENGINE = INNODB AUTO_INCREMENT = 1 DEFAULT CHARSET = utf8 COMMENT 'ID information table';

CREATE TABLE `tiny_id_token` (
	`id` INT (11) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'Auto-increment ID',
	`token` VARCHAR (255) NOT NULL DEFAULT '' COMMENT 'Token',
	`biz_type` VARCHAR (63) NOT NULL DEFAULT '' COMMENT 'Business type identifier accessible to this token',
	`remark` VARCHAR (255) NOT NULL DEFAULT '' COMMENT 'Remark',
	`create_time` TIMESTAMP NOT NULL DEFAULT '2010-01-01 00:00:00' COMMENT 'Creation time',
	`update_time` TIMESTAMP NOT NULL DEFAULT '2010-01-01 00:00:00' COMMENT 'Update time',
	PRIMARY KEY (`id`)
) ENGINE = INNODB AUTO_INCREMENT = 1 DEFAULT CHARSET = utf8 COMMENT 'Token information table';

INSERT INTO `tiny_id_token` (`id`, `token`, `biz_type`, `remark`, `create_time`, `update_time`) VALUES ('1', '0f673adf80504e2eaa552f5d791b644c', 'order', '1', '2017-12-14 16:36:46', '2017-12-14 16:36:48');

INSERT INTO `tiny_id_info` (`id`, `biz_type`, `begin_id`, `max_id`, `step`, `delta`, `remainder`, `create_time`, `update_time`, `version`) VALUES ('1', 'order', '1', '1', '100000', '1', '0', '2018-07-21 23:52:58', '2018-07-22 23:19:27', '1');

Then configure the data source for the tables above in the Tinyid service.

datasource.tinyid.primary.url=jdbc:mysql://47.936..e:3306/ds0?autoReconnect=true&useUnicode=true&characterEncoding=UTF-8
datasource.tinyid.primary.username=root
datasource.tinyid.primary.password=root

Run Maven install, then right-click TinyIdServerApplication to start the service. The Tinyid distributed ID generation service is now set up.
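To show what "segment mode" means in practice, here is a minimal in-memory sketch. It is not Tinyid's code: SegmentStore is a hypothetical stand-in for one tiny_id_info row (only max_id and step are modeled), and treating a segment as the inclusive range from max_id + 1 to max_id + step is a choice made for this sketch.

```java
/** Hypothetical in-memory illustration of segment-mode ID allocation. */
final class SegmentIdDemo {

    /** Stand-in for one tiny_id_info row; a real service would update the table in a transaction. */
    static final class SegmentStore {
        private long maxId;
        private final int step;

        SegmentStore(long maxId, int step) {
            this.maxId = maxId;
            this.step = step;
        }

        /** Reserves the next segment and returns {firstId, lastId}, both inclusive. */
        synchronized long[] nextSegment() {
            long first = maxId + 1;
            maxId += step;
            return new long[] {first, maxId};
        }
    }

    private final SegmentStore store;
    private long next;
    private long last = -1; // forces a segment to be loaded on the first call

    SegmentIdDemo(SegmentStore store) {
        this.store = store;
    }

    /** Hands out IDs from the local segment and fetches a new one only when it is exhausted. */
    synchronized long nextId() {
        if (next > last) {
            long[] segment = store.nextSegment();
            next = segment[0];
            last = segment[1];
        }
        return next++;
    }

    public static void main(String[] args) {
        SegmentStore store = new SegmentStore(1, 3);
        SegmentIdDemo nodeA = new SegmentIdDemo(store);
        SegmentIdDemo nodeB = new SegmentIdDemo(store);
        System.out.println(nodeA.nextId()); // 2, first ID of segment 2..4
        System.out.println(nodeB.nextId()); // 5, first ID of segment 5..7
        System.out.println(nodeA.nextId()); // 3
    }
}
```

Each node touches the shared store only once per segment, so a larger step means fewer database round trips, at the cost of larger gaps when a node restarts with an unused segment.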

Custom Tinyid primary key type

Add the tinyid.server and tinyid.token properties to the Tinyid client configuration file (tinyid_client.properties). The token is the value pre-inserted by the SQL above.

# Tinyid distributed ID service address
tinyid.server=127.0.0.1:9999
# business token
tinyid.token=0f673adf80504e2eaa552f5d791b644c

Getting an ID in code is even simpler and takes just one line. The business type order is the data pre-inserted by the SQL above.

Long id = TinyId.nextId("order");

Now we write TinyIdShardingKeyGenerator, the implementation class for the custom Tinyid primary key generation type.

import java.util.Properties;

import com.xiaoju.uemc.tinyid.client.utils.TinyId;
import org.apache.shardingsphere.spi.keygen.ShardingKeyGenerator;
import org.springframework.stereotype.Component;

/**
 * @Author: xiaofu
 * @Description: Custom primary key generator
 */
@Component
public class TinyIdShardingKeyGenerator implements ShardingKeyGenerator {

    /** Custom generation scheme type */
    @Override
    public String getType() {
        return "tinyid";
    }

    /** Core method: generate the primary key ID */
    @Override
    public Comparable<?> generateKey() {
        // "order" is the business type inserted into tiny_id_info
        Long id = TinyId.nextId("order");
        return id;
    }

    @Override
    public Properties getProperties() {
        return null;
    }

    @Override
    public void setProperties(Properties properties) {
    }
}

Then enable the Tinyid primary key generation type in the configuration file. With that the configuration is complete, so let's test it.

# primary key field
spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
# primary key ID generation scheme
spring.shardingsphere.sharding.tables.t_order.key-generator.type=tinyid

Test the Tinyid primary key

Insert some order records into the database to test the Tinyid primary key.

Conclusion

The remaining eight generation methods can be integrated as needed by referring to “9 Distributed ID generation schemes”. They are relatively simple overall, so they are not implemented one by one here.

GitHub address: github.com/chengxy-nds…
