👨💻 blog home page: author home page 👍 If you think the article is good, you can click to like it and follow us
This is the second day of my participation in the First Challenge 2022
Description of production policies for common primary key IDS in database and table
Introducing any technology will have a certain risk, database and table will not be an exception. X_order_n = x_ORDER_N; x_ORDER_N = x_ORDER_N; x_ORDER_N = x_ORDER_N; x_ORDER_N = X_ORDER; There is no way to satisfy the globally unique primary key requirement of the sub-table.
Although this can be solved by using the auto-increment primary key initial value and step size of the shard table, it is not desirable because this will lead to higher operation and maintenance costs and poor scalability.
Common ID solutions in the industry
UUID: Very high performance, no network consumption. Disadvantages are unordered strings, which do not have the tendency to increase, and UUID is too long to store, wasting storage space, so many scenarios are not used
Redsi number generator: using Redis INCR and INCRBY to achieve atomic operation, thread safety, performance is stronger than Mysql. The disadvantage is that it needs to occupy network resources and increase the system complexity.
Snowflake algorithm: It is twitter open source distributed ID generation algorithm, code implementation is simple, and does not occupy broadband, data migration will not be affected, generated IDS contain time stamps, so generated ids increase according to the time, deploy multiple servers, need to ensure that the system time is the same. The disadvantage is that it relies on the system clock.
UUID
RandomUUID () : randomUUID() : randomUUID() : randomUUID()) Although UUID can be globally unique, it is not recommended to be used as a primary key, because we know that in real business, primary keys are generally integers and UUID is generated as a 32-bit string. It will cost MYSQL a lot of performance, and MYSQL also recommends that the primary key be as short as possible. The disorder of the UUID primary key will also lead to frequent changes in data location, which seriously affects MYSQL performance.
public final class UUIDShardingKeyGenerator implements ShardingKeyGenerator {
private Properties properties = new Properties();
public UUIDShardingKeyGenerator() {
}
public String getType() {
return "UUID";
}
public synchronized Comparable<?> generateKey() {
return UUID.randomUUID().toString().replaceAll("-", "");
}
public Properties getProperties() {
return this.properties;
}
public void setProperties(Properties properties) {
this.properties = properties;
}
}
Copy the code
Snowflake algorithm
Snowflake algorithm is the default primary key generation scheme, to generate a 64-bit long integer data. The primary key generated by the snowflake algorithm in sharding-jdbc is mainly composed of four parts: 1bit symbol bit, 41bit timestamp bit, 10bit machine id, and 12bit serial number.
Sign bit (1bit)
In Java, the highest bit of Long is the sign bit, that is, the positive number is 0, the negative number is 1, generally positive id generation, so the default is 0.
Timestamp bit (41bit)
The total number of milliseconds in a year is 1000L * 60 * 60 * 24 * 365. The total number of milliseconds in a year is 1000L * 60 * 60 * 24 * 365. The total number of milliseconds in a year is 1000L * 60 * 60 * 24 * 365.
Working machine ID (10bit)
Represents a unique work process, its default value is 0, can be the key – generator. Props. Worker. Id to be set
Serial number bit (12bit)
You can allow 2^12=4096 ids to be generated in the same millisecond, theoretically generating 4 million ids in a second.
The clock back
After understanding the composition of the primary key of the Snowflake algorithm, it can be found that it is a heavily dependent server time algorithm. At this time, relying on server time will encounter a tricky problem is – clock callback.
Why does the clock go back
As far as we know, there is a network time protocol on the Internet called NTP, which is used to synchronize or calibrate the time of the network computer. That’s why cell phones don’t have to manually check the time these days.
If the hardware time is fast or slow due to some reasons, the NTP service needs to be used to calibrate the time. During the calibration, the server clock may jump or dial back.
How does snowflake algorithm solve the clock callback problem
As mentioned above, the server clock rollback problem can cause duplicate ids to be generated, so the SNOWFLAKE solution improves the SNOWFLAKE algorithm by adding a maximum number of milliseconds for clock rollback to be tolerated.
If the clock callback time exceeds the maximum allowable milliseconds threshold, an error is reported. If it is within tolerable limits, a distributed primary key generator is used by default, waiting for the clock to synchronize to the last primary key generation time before continuing to work.
Maximum tolerance of the clock back to dial the number of milliseconds the default is 0, Max. Tolerate.. The time difference. The milliseconds to set.
public final class SnowflakeShardingKeyGenerator implements ShardingKeyGenerator{ @Getter @Setter private Properties properties = new Properties(); public String getType() { return "SNOWFLAKE"; } public synchronized Comparable<? > generateKey () {/ * * * the current system time milliseconds * / long currentMilliseconds = timeService. GetCurrentMillis (); /** * Determine whether the time difference needs to be tolerated. If so, wait for the time difference to pass. And then gets the current system time * / if (waitTolerateTimeDifferenceIfNeed (currentMilliseconds)) {currentMilliseconds = timeService.getCurrentMillis(); } /** * if (lastMilliseconds == currentMilliseconds) {/** * &bits and operators: If the corresponding bits are both 1, the result is 1; otherwise, the result is 0 * When the sequence is 4095, the new sequence after 4095+1 performs bit-sum operation with the mask and the result is 0 * When the sequence is other values, the bit-sum operation will not be 0 * that is, the maximum value 4096 has been used in this sequence. If (0L == (sequence = (sequence + 1) &sequence_mask)) {currentMilliseconds = waitUntilNextTime(currentMilliseconds); }} else {/** * the last millisecond has passed, reset the sequence value to 1 */ vibrateSequenceOffset(); sequence = sequenceOffset; } lastMilliseconds = currentMilliseconds; /** * XX...... XX000000 00000000 00000000 time XX XX XX * * XXXXXX XXXX0000 00000000 machine ID XXXX XXXXXXXX serial number XX * | a or operation of three parts: If the corresponding bits are 0, then the result is 0, Otherwise to 1 * / return ((currentMilliseconds - EPOCH) < < TIMESTAMP_LEFT_SHIFT_BITS) | (getWorkerId () < < WORKER_ID_LEFT_SHIFT_BITS) | sequence; } / * * * determine whether need to wait for tolerance time * / @ SneakyThrows private Boolean waitTolerateTimeDifferenceIfNeed (final long currentMilliseconds) If (lastMilliseconds <= currentMilliseconds) {return false; if (lastMilliseconds <= currentMilliseconds) {return false; } /** * === => clock rollback (sequence generation time is longer than the current system time), Need to wait for time * / / * * * to get ID last milliseconds when minus the current system time in milliseconds time * / long timeDifferenceMilliseconds = lastMilliseconds - currentMilliseconds; /** * The time difference is less than the maximum tolerance time difference, That is currently within the time of the clock back to dial * / Preconditions. The checkState (timeDifferenceMilliseconds < getMaxTolerateTimeDifferenceMilliseconds (), "Clock is moving backwards, last time is %d milliseconds, current time is %d milliseconds", lastMilliseconds, currentMilliseconds); / Thread to sleep time * * * * / Thread. Sleep (timeDifferenceMilliseconds); return true; } private Long getWorkerId() {long result = long.valueof (properties.getProperty("worker.id", String.valueOf(WORKER_ID))); Preconditions.checkArgument(result >= 0L && result < WORKER_ID_MAX_VALUE); return result; } private int getMaxTolerateTimeDifferenceMilliseconds() { return Integer.valueOf(properties.getProperty("max.tolerate.time.difference.milliseconds", String.valueOf(MAX_TOLERATE_TIME_DIFFERENCE_MILLISECONDS))); } private long waitUntilNextTime(final long lastTime) { long result = timeService.getCurrentMillis(); while (result <= lastTime) { result = timeService.getCurrentMillis(); } return result; }}Copy the code
You can see that the last primary key generated milliseconds is compared to the currentMilliseconds. If the generated time is larger than the currentMilliseconds, it indicates a clock callback. Then it will judge the difference between the two times and see if it is set within the maximum tolerance time range. If the time of the sleep difference is greater than the difference, it will directly report the abnormality.
Baidu UidGenerator
UidGenerator algorithm is an improved version of the snowflake algorithm. The UidGenerator consists of: sign(1bit)+ Delta seconds(28bits)+ Worker node ID (22bits)+ Sequence (13bits).
The UidGenerator guarantees that a given machine is unique to a concurrent sequence at the same time and generates a 64bits unique ID.
Unlike the Snowflake algorithm, UidGenerator can customize the timestamp, the number of bits in each part of the working machine ID and serial number for different scenarios.
1. Sign: indicates a fixed symbol identifier. The generated UID is a positive number.
2. Delta seconds: indicates the current time
3. Worker ID: machine ID. The built-in implementation is allocated by the database at startup, and the default allocation strategy is: discard after use.
4. Sequence: indicates the sequence of concurrent requests per second. The 13bits support 8192 concurrent requests per second.
UidGenerator two modes
DefaultUidGenerator: Implemented by DefaultUidGenerator. The clock callback problem is relatively simple, and the number of bits occupied by the field is adjusted according to service conditions.
CachedUidGenerator: CachedUidGenerator is a refinement of DefaultUidGenerator that takes advantage of RingBuffer, which is essentially an array in which each item is called slot. The CachedUidGenerator is designed to have two ringBuffers, one for unique ids and one for flags.
The CachedUidGenerator circumvent clock issues and enhance uniqueness in the following centralized ways
1. Autoincrement column: The workerID is initialized at each restart and is the database’s autoincrement ID. This is perfect for each instance to obtain a different WorkerID and does not cause conflicts
2. RingBuffer: it is not necessary to calculate the distributed ID each time the ID is obtained, but to use the RingBuffer data structure to generate several distributed ids in advance to save
3. Increasing time: Snowflake algorithms like currentTimeMillis take the time and compare, which is very server dependent, whereas CachedUidGenerator takes the time type AtomicLong and takes the next time by incrementAndGet. This eliminates server time dependency, and the clock rollback problem can be solved.
The CachedUidGenerator uses caching to pre-generate a list of unique ids. This takes time away from unique ids, but the downside is that it takes memory to cache a portion of the ID. If there is not a lot of traffic, The timestamp of a pre-generated UID may be very old.
🤞 author Xiao SAN is just graduated from the full stack engineer, wrote the technical article is basically in the process of learning notes sorted out, you can give little brother a little praise if you like after reading. 🎁 fan benefits: learning materials, resume templates are all click to receive