Abstract: Original source wechat public account “craftsman little pig technology world” welcome to reprint, keep the abstract, thank you!

This paper is mainly based on the real experience of pit treading

  • 1. Exception Overview
  • 2. Cause analysis
    • 2.1 How Snowflake works
    • 2.2 Fault Locating
    • 2.3 Excluding Clock Callback
    • 2.4 research workerid
    • 2.5 suspects
  • 3. Solutions
    • 3.1 HostNameKeyGenerator
    • 3.2 IPSectionKeyGenerator
    • 3.3 Use the globally unique ID service developed by our team
    • 3.4 Other Ideas
  • 4. Thank you
  • 666. The eggs

🙂🙂🙂 Follow ** wechat public account: [Artisan piglet technology world] ** Have welfare:

  1. Any questions you may have about the source code will be answered carefully. Even do not know how to read the source can also ask oh.
  2. New source code parsing articles are notified in real time. It will be updated every two weeks or so.

1. Exception Overview

On the afternoon of January 26, 2018, an exception occurred in the database insertion operation of the student feedback service of the credit team of the business side, and the abnormal information showed that the primary key of the database was repeated:

After careful analysis of the user’s duplicate primary key ID, machine list, and snowflake algorithm, machine 55 was dropped and the exception was resolved.

The anomaly seems ordinary, but a careful analysis of the consequences may be more serious. (1) A wide range of influence, a long time. At present, a large number of businesses have adopted the primary key generation strategy of The Snowflake algorithm. If business and operation and maintenance students do not know the snowflake algorithm, it will take a lot of time to analyze and troubleshoot this problem, resulting in certain business losses. (2) There are potential risks. In addition to such Workid problems, Snowflake algorithm also relies heavily on the machine clock. If the clock is dialed back on the machine, the number will be repeated or the service will be unavailable.

2. Cause analysis

Why duplicate database primary keys? To answer this question, we need to explain how the snowflake algorithm, one of the main algorithms for primary key generation strategies, works.

2.1 How Snowflake works

For distributed ID generation, the Flake algorithm represented by Twitter Snowflake is an algorithm that divides the namespace into parallel generation. The generated data is 64-bit long data, and the value should be stored in the database with numeric type fields greater than or equal to 64-bit. For example, BIGINT should be used in MySQL.

On June 1, 2010 (less than four months after the Flickr post), Ryan King wrote on Twitter’s Blog:

  • The Ticket Servers scheme lacks a guarantee of order
  • UUID was considered, but 128-bit was too long
  • E has also considered using what ZooKeeper providesUnique NamingDescription Unique Naming provided by Seuence Nodes, but the performance is insufficient. (Sequence Nodes is designed to solve the problem of distributed locks, but not solve the problem of ID generation, which requires high performance. It is a Hack behavior to directly apply it.)

In this case, Twitter gives a 64-bit long Snowflake, which has the following structure:

  • E1-bit reserved
  • E41-bit timestamp
  • E10-bit machine id
  • E12-bit sequence

Less than four years later, on May 31, 2014, Twitter updated Snowflake’s README, stating two easily overlooked truths:

“We have retired the initial release of Snowflake …”

“… heavily relies on existing infrastructure at Twitter to run. “

As you can see, the minimum granularity supported by this solution is “millisecond * thread”, and the capacity of a single thread (Snowflake’s equivalent is Worker) is 12 bits per second, which is close to 4096.

Snowflake is not only a solution, but also a time-dependent self-increasing sequence of ids based on Long length. As a result, many companies have adapted it to their own scenarios. Other algorithms in the Snowflake family include Instagram Snowflake, Simpleflake, Boundary Flake and more.

At present, the industry uses Dangdang Liangge sharding-JDBC, and generally adopts its built-in Snowflake algorithm. As for the secondary transformation, I will cite an example proposed by 58 Shen Jian in the “Path of Architects” series.

2.2 Fault Locating

After receiving the feedback from the business side, I immediately asked the business side three questions in succession:

  • Is there anything special about your service?
  • Did it happen during the reboot?
  • Can you check the machine that corresponds to the duplicate record and see if the duplicate only occurs in this IP segment?

The business side also gave feedback at the first time

  • Just plain old microservices, plain old machines
  • It didn’t happen during reboot
  • Sure enough, there was a problem on both machines!

Ok, then I located the problem of workID and immediately advised the business side to take down one of them. Why workID instead of clock callback etc., let me explain.

2.3 Excluding Clock Callback

We all know the disadvantages of the Snowflake algorithm:

  • It depends on the machine clock. If the machine clock is rolled back, duplicate ids will be generated
  • On single machine is increasing, but due to the design to the distributed environment, the clock on each machine can’t completely in sync, can appear sometimes not increasing global situation (this drawback can think doesn’t matter, generally distributed ID requirements increasing trend, only will not strict demands increasing to 90% of the demand are increasing trend)

We are using sharding-JDBC 1.4.2 (sJ before 1.5 is not mature, and 2.x is highly recommended). Can directly use the com. Dangdang. Ddframe. RDB. Sharding. Id. The generator. Self. IPIdGenerator primary key generation, The serial number is generated using com.dangdang.ddframe.rdb.sharding.id.generator.self.Com monSelfIdGenerator, special recommend readers here read comments in front of these two classes, written very clearly.

/** * Obtain the working process Id based on the IP address of the online machine. If the last 10 bits of the IP address of the online machine are not the same, use this method *. For example, if the IP address of the online machine is 192.168.1.108, the binary IP address is 11000000 10101000 00000001 01101100 * * @author DonneyYoung */ public class IPIdGenerator implements IdGenerator {Copy the code

The above comment is very useful and is posted here and covered in more detail later. IPIdGenerator is used for serial number generation. Dangdang’s implementation algorithm is as follows:

/** * self-generated Id generator. ** <p> * the length is 64bit (from highest to lowest) * </p> ** <pre> * 1bit symbol bit * 41bits indicates the time offset from 00:00 on November 1, 2016 to now * 10bits Worker process Id * 12bits Self-increment in the same millisecond * </pre> * * <p> * Worker process Id get priority: System variables {@ code SJDBC. Self. Id. The generator. The worker. The id} {@ code SJDBC_SELF_ID_GENERATOR_WORKER_ID} * is greater than the environment variables, also can call @ {@ code CommonSelfIdGenerator. SetWorkerId} < / p > set * * * @ author gaohongtao * / @ Getter @ Slf4j public class CommonSelfIdGenerator implements IdGenerator { public static final long SJDBC_EPOCH; Private static final Long SEQUENCE_BITS = 1; private static final Long SEQUENCE_BITS = 1; Private static final Long WORKER_ID_BITS = 10L; // Process ID Bit private static final Long SEQUENCE_MASK = (1 << SEQUENCE_BITS) -1; Private static final Long WORKER_ID_LEFT_SHIFT_BITS = SEQUENCE_BITS; private static final Long WORKER_ID_LEFT_SHIFT_BITS = SEQUENCE_BITS; Private static final Long TIMESTAMP_LEFT_SHIFT_BITS = WORKER_ID_LEFT_SHIFT_BITS + WORKER_ID_BITS; Private static final Long WORKER_ID_MAX_VALUE = 1L << WORKER_ID_BITS; / / work process ID maximum @ Setter private static AbstractClock clock. = AbstractClock systemClock (); @Getter private static long workerId; Static {Calendar = calendar.getInstance (); calendar.set(2016, Calendar.NOVEMBER, 1); calendar.set(Calendar.HOUR_OF_DAY, 0); calendar.set(Calendar.MINUTE, 0); calendar.set(Calendar.SECOND, 0); calendar.set(Calendar.MILLISECOND, 0); SJDBC_EPOCH = calendar.getTimeInMillis(); initWorkerId(); } private long sequence; Private long lastTime; // Generate the numbered timestamp in milliseconds static voidinitWorkerId() {
        String workerId = System.getProperty("sjdbc.self.id.generator.worker.id");
        if(! Strings.isNullOrEmpty(workerId)) {setWorkerId(Long.valueOf(workerId));
            return;
        }
        workerId = System.getenv("SJDBC_SELF_ID_GENERATOR_WORKER_ID");
        if (Strings.isNullOrEmpty(workerId)) {
            return;
        }
        setWorkerId(Long.valueOf(workerId)); ** @param workerId Id of a workerprocess */ public static voidsetWorkerId(final Long workerId) { Preconditions.checkArgument(workerId >= 0L && workerId < WORKER_ID_MAX_VALUE); CommonSelfIdGenerator.workerId = workerId; } /** * generates Id. ** @returnReturn @{@link Long} type Id */ @override public synchronized NumbergenerateId() {// Ensure that the current time is greater than the last time. Id long time = clock.millis(); Preconditions.checkState(lastTime <= time,"Clock is moving backwards, last time is %d milliseconds, current time is %d milliseconds", lastTime, time); // Get the serial numberif (lastTime == time) {
            if (0L == (sequence = ++sequence & SEQUENCE_MASK)) {
                time = waitUntilNextTime(time); }}else{ sequence = 0; } // Set the last timestamp lastTime = time;if (log.isDebugEnabled()) {
            log.debug("{} - {} - {}", new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").format(new Date(lastTime)), workerId, sequence); } // Generate the numberreturn((time - SJDBC_EPOCH) << TIMESTAMP_LEFT_SHIFT_BITS) | (workerId << WORKER_ID_LEFT_SHIFT_BITS) | sequence; } // Keep getting time until it is greater than the last time private longwaitUntilNextTime(final long lastTime) {
        long time = clock.millis();
        while (time <= lastTime) {
            time = clock.millis();
        }
        returntime; }}Copy the code

Clock is moving backwards balabalabala IllegalStateException It also does waitUntilNextTime. If the standalone is a clock rollback, the application is restarted or the time is rolled back. However, the online server is running fine and no one has touched it, so it is not a standalone clock rollback problem. To consider whether the cluster callback, in the same way, if the probability of different workerid duplicate primary key basic impossible, and I also carefully compare the two machines this problem time basic consistent, so although the machine of our company is a gradual time management (it is said that some companies operations made a synchronization script directly, if the clock is not synchronized script, It is asymptotically synchronized and can produce duplicate ids), and should not be the cause of a large number of such problems (users have reported to me dozens of duplicate primary key ids). Debug mode allows you to see your workerId, sequence, and so on, but when running online, which system will give you the primary key ID? Online chicken ribs, chicken ribs chicken ribs, tasteless food, abandon the flavor. For how to get a workId, see the next section.

2.4 the WorkId

As shown in the figure below, this is the entire snowflake algorithm we discussed:

  • First bit reserved
  • The timestamp, 41 bits, the number of milliseconds from zero on November 1, 2016 to the present, will last for 2156 years and will run out in more than 100 years
  • Machine ID. 10 bits. This machine ID must be unique for each business
  • The serial number is 12 bits, and each machine can generate up to 4096 ids per millisecond, until the next millisecond

We detail the implementation of the snowflake algorithm in Section 2.3:

 public synchronized Number generateId() {
        long time = clock.millis();
        Preconditions.checkState(lastTime <= time, "Clock is moving backwards, last time is %d milliseconds, current time is %d milliseconds", lastTime, time);
        if (lastTime == time) {
            if (0L == (sequence = ++sequence & SEQUENCE_MASK)) {
                time = waitUntilNextTime(time); }}else {
            sequence = 0;
        }
        lastTime = time;
        if (log.isDebugEnabled()) {
            log.debug("{} - {} - {}", new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").format(new Date(lastTime)), workerId, sequence);
        }
        return ((time - SJDBC_EPOCH) << TIMESTAMP_LEFT_SHIFT_BITS) | (workerId << WORKER_ID_LEFT_SHIFT_BITS) | sequence;
    }
Copy the code
  • TIMESTAMP_LEFT_SHIFT_BITS Indicates that the timestamp is shifted 22 bits to the left
  • WORKER_ID_LEFT_SHIFT_BITS Work machine ID moved 12 bits to the left
  • Last 12-digit serial number

Take a closer look at the implementation of workID based on IPIdGenerator

CommonSelfIdGenerator.setWorkerId((long) (((ipAddressByteArray[ipAddressByteArray.length - 2] & 0B11) << Byte.SIZE) + (ipAddressByteArray[ipAddressByteArray.length - 1] & 0xFF)));
Copy the code

You can clearly see that this is the last two segments of the IP, 10 bits, exactly as I emphasized in the comments

Obtain the working process Id based on the IP address of the online machine. If the last 10 bits of the IP address of the online machine are not the same, use this method. For example, if the IP address of the online machine is 192.168.1.108, the binary IP address is 11000000 10101000 00000001 01101100, and the last 10 bits are cut off 01 01101100. Convert to decimal 364. Set workerId to 364.

Then we converted our troubled machines XXX.XXX.209.55, XXX.XXX.161.55(for safety data desensitization) into workID and found that they were all the same. Of course, you can also run out the workID yourself using the following code.

@Test
    public void generateId3() throws UnknownHostException, NoSuchFieldException, IllegalAccessException {
        String ipv4s =
//                "161.55 \ n XXX. XXX."
                "209.55 \ n XXX. XXX."
//                "209.126 \ n XXX. XXX." +
//                "XXX. XXX. 209.127 \ t \ n" +
//                "208.227 \ n XXX. XXX." +
//                "148.134 \ n XXX. XXX." +
//                "XXX. XXX. 148.135 \ t \ n" +
//                "XXX. XXX. 148.132 \ t \ n" +
//                        "XXX. XXX. 148.133";
        ;
        for (String ipv4 : ipv4s.split("\n")) {
            ipv4 = ipv4.replaceAll("\t"."").trim();

            byte[] ipv4Byte = new byte[4];
            String[] ipv4StingArray = ipv4.split("\ \.");
            for (int i = 0; i < 4; i++) {
                ipv4Byte[i] = (byte) Integer.valueOf(ipv4StingArray[i]).intValue();
            }
            address = InetAddress.getByAddress("dangdang-db-sharding-dev-233", ipv4Byte);
            PowerMockito.mockStatic(InetAddress.class);
            PowerMockito.when(InetAddress.getLocalHost()).thenReturn(address);

            IPKeyGenerator.initWorkerId();
//            IPSectionKeyGenerator.initWorkerId();
            Field workerIdField = DefaultKeyGenerator.class.getDeclaredField("workerId");
            workerIdField.setAccessible(true);
            System.out.println(ipv4 + "\t"+ workerIdField.getLong(DefaultKeyGenerator.class)); }}Copy the code

Alternatively, we can invert the workerID based on all the duplicate ids provided by the user

psvm+sout+(ID/4096%1024)
Copy the code

/4096 = 2 ^ 12; %1024 = 2 ^ 10; /4096 = 2 ^ 12; %1024 = 2 ^ 10 At this point, we can conclude that workerID duplication causes primary key duplication online

2.5 suspects

How could the 41bit timestamp and 12bit serial number collide so badly? There were nearly 100 collisions in two days. If this collision is the same as the hashMap collision, then it must be high concurrency and low probability. Why is it so frequent? So, I asked the business side another question:

  • How concurrent are you?

The feedback I received from the business side was “not much concurrency.”

This sequence is used to generate different ids in the same millisecond. If the number of ids generated in this millisecond exceeds 4096(2 ^ 12), the generator will wait until the next millisecond. From the component of THE Id, different processes must have different ids. The same process is guaranteed not to repeat through the time bit first, and if the time is the same, it is guaranteed by the sequence bit. At the same time, because the time bit is monotonically increasing, and if each server performs time synchronization roughly, the generated Id can be considered as overall order in distributed environment.

through

psvm+sout+(ID%4096)
Copy the code

Verify that the serial number of all duplicate primary key ids is 0, which completely confirms my guess.

So, since the serial numbers are incremented from 0, the conclusion is that as long as the workerID is the same and requests are made on both machines at the same time, there will be duplication, or as long as the IP ends on the line are the same, there will be duplication!!

3. Solutions

Operation and maintenance solved this problem by rolling out 55 duplicate old machines. We learned that the IP of the company’s machines were all serial numbers without such a parallel form, but this time it was caused by the coexistence of old and new machines. There are other solutions, of course.

3.1 HostNameKeyGenerator

Gets the worker process number based on the numeric number at the end of the machine name. If there is a unified specification for naming online machines, it is recommended to use this method. For example, if the HostName of the machine is dangdang-db-sharding-dev-01(company name – department name – service name – environment name – id), the last number of the HostName 01 will be truncated as the workId.

3.2 IPSectionKeyGenerator

Improved VERSION of IP generation policy.

After viewing the rules generated by the IPKeyGenerator worker process number, I find that the last 10 digits (especially IPV6) of the server IP address are relatively constrained. The optimization idea is as follows: Because the maximum number of working process is 2^10, we can generate the project process number as long as it is less than 1024. 1. For IPV4:…. The MAXIMUM IP address is 255.255.255.255. And (255+255+255+255) < 1024. . Therefore, IP segment values can be added to generate a unique workerId, which is not limited by IP bits.

  1. In view of the IPV6:… IP FFFF: largest FFFF: FFFF: FFFF: FFFF: FFFF: FFFF: FFFF… To ensure that the project process number generated by the sum is less than 1024, the idea is to add the last six bits of each Bit. To some extent, this can also satisfy the workerId non-duplication problem. To use this method of generating worker process numbers from IP, it is necessary to ensure that IP segments do not add up repeatedly

For IPV6:2^ 6 = 64. 64 x 8 = 512 < 1024.

// IPSectionKeyGenerator.java
static void initWorkerId(a) {
   InetAddress address;
   try {
       address = InetAddress.getLocalHost();
   } catch (final UnknownHostException e) {
       throw new IllegalStateException("Cannot get LocalHost InetAddress, please check your network!");
   }
   byte[] ipAddressByteArray = address.getAddress();
   long workerId = 0L;
   // IPV4
   if (ipAddressByteArray.length == 4) {
       for (byte byteNum : ipAddressByteArray) {
           workerId += byteNum & 0xFF;
       }
   // IPV6
   } else if (ipAddressByteArray.length == 16) {
       for (byte byteNum : ipAddressByteArray) {
           workerId += byteNum & 0B111111; }}else {
       throw new IllegalStateException("Bad LocalHost InetAddress, please check your network!");
   }
   DefaultKeyGenerator.setWorkerId(workerId);
}
Copy the code

For example, 254.255 and 255.254 will repeat 172.16.x.x, even though this segment is also 2 ^ 16, but still can be used, etCD or ZK based on the number is reasonable. Or as long as the company machine number is correct, HostNameKeyGenerator is very reliable.

3.3 Use the globally unique ID service developed by our team

Place to do

  • The high availability and short ID service allows multiple independent environments to be deployed. Each environment has a different ID. The client can perform failover to any environment
  • High performance, millions of ids per second, and no blocking
  • Based on the approximate order of time, the acquired ids will basically get bigger and bigger. There is no guarantee of strict order. For example, the ID obtained an hour ago should be smaller than the ID obtained an hour later
  • Consistency, absolutely guaranteed not to get duplicate ids (server guarantee)

Where you can’t

  • Strict global order cannot be guaranteed
  • There is no guarantee of time order
  • There is no guarantee that every ID is not wasted

3.4 Other Ideas

  • Meituan’s open source Leaf has managed to avoid a repeat of zooKeeper’s snowflake algorithm, which has already been used in leap seconds in 2017.
  • Another way to avoid callback is to use one or two machines with the NTP clock turned off as backup
  • Baidu’s SnakeFlow uses pre-allocation of ids to avoid this situation. Although it can’t be completely avoided, pre-allocation is actually a reasonable way
  • There are other solutions, such as adding a version number after the repeated time, and making a special school-time server, because the cluster gets the machine time and there is a time window problem, causing the generated Id number to be repeated
  • The best solution is to use the database generated ones (distributed ids generated by database number segments) and use SnakeFlow if it has little impact on the business. Since InnoDB tables can be written in the same order as the leaf nodes of the B+ tree index, the access efficiency is highest. Therefore, when considering primary keys, we usually need to increment, or trend increment.
  • Host-based, as mentioned above, is also a good solution

4. Thank you

This is the second article of the public number “the technical world of craftsman pig pig”, thanks to “taro road source code” taro in technology, typesetting and other aspects to give strong guidance!

666. The eggs

The first article of the public number, if there is a place to write incorrect, but also hope to point out, thank you.

For more information on distributed ids, read the following article:

  • Meituan — Leaf — Meituan Dianping Distributed ID Generation System
  • 58 — Distributed ID Generation in Detail
  • Impression source code — “Talk about ID”
  • Dr. Yanjiong Wang — “3 Ideas for generating globally Unique ids, from a senior architect’s summary”