This article has participated in the good article call order activity, click to see: back end, big front end double track submission, 20,000 yuan prize pool for you to challenge!

Distributed unique ID

  • When using RocketMQ, you need to use distributed unique ids
  • Messages can be duplicated, so to be idempotent on the consumer side, the producer must have a uniqueness in order to achieve business idempotentID,The following conditions must be met:
    • The same business scenario must be globally unique
    • The ID must be generated at the sender of the message to send to MQ
    • The consumer uses the ID to determine whether it is repeated to ensure idempotentiality
  • Where the ID is generated and the consumer is judged to be idempotent is independent of the ID. The characteristics of this ID need to be guaranteed:
    • Locally or even globally unique
    • Increasing trend

Snowflake algorithm

  • Snowflake is Twitter’s open source distributed ID generation algorithm, and the result is aLongType ID, the core idea is:
    • Use the 1 bit as the sign bit, which is determined to be 0 to indicate positive
    • Use 41 bits as the number of milliseconds
    • Use 10 bits for the machine ID: the high 5 bits are the data center ID and the low 5 bits are the machine ID
    • Using 12 bits as the sequence number for milliseconds means that each node can generate 4096(2^12^) ids per second

The algorithm is implemented by binary operation, which can be generated at most in a single machine per second1000 * 2 ^ (12),409.6Wan an ID

SnowflakeIdWorker

  • The Snowflake algorithm is implemented in Java.
/** * Twitter_Snowflake

* SnowFlake :

* 0-0000000000 0000000000 0000000000 0000000000 0-00000 -















Instead of storing the current time slice, the 41-bit time slice stores the difference between the current time slice and the start time slice *. In this case, the start time slice is usually the start time of our ID generator. Specified by our program (see the startTime attribute of the IdWorker class below). T = (1L << 41)/(1000L * 60 * 60 * 24 * 365) = 69

* 10-bit data machine bit, can be deployed on 1024 nodes, Supports 4096 datacenterId numbers per millisecond (same machine, same time) generated by each node

* adds up to exactly 64 bits and is a Long type.

* The advantages of SnowFlake are that it is an overall time-increment-ordered system with no ID collisions (differentiated by data center ID and machine ID) in the entire distributed system, and it is highly efficient. In tests, SnowFlake can generate around 260,000 ids per second. * /
public class SnowflakeIdWorker { // ==============================Fields=========================================== /** Start time (2015-01-01) */ private final long twepoch = 1420041600000L; /** The number of digits in the machine ID */ private final long workerIdBits = 5L; /** The number of bits of the data id */ private final long datacenterIdBits = 5L; /** The maximum machine ID supported is 31 (this shift algorithm can quickly calculate the maximum decimal number that can be represented by several binary digits) */ private final long maxWorkerId = -1L ^ (-1L << workerIdBits); /** The maximum data id supported is 31 */ private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits); /** The number of bits in the id */ private final long sequenceBits = 12L; /** Move the machine ID to the left by 12 bits */ private final long workerIdShift = sequenceBits; /** Move the data id 17 bits (12+5) to the left */ private final long datacenterIdShift = sequenceBits + workerIdBits; /** Shift the time slice to the left by 22 bits (5+5+12) */ private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits; /** Generate the mask of the sequence, which is 4095 (0b111111111111= 0xFFf =4095) */ private final long sequenceMask = -1L ^ (-1L << sequenceBits); /** ID of machine (0~31) */ private long workerId; /** Data center ID(0 to 31) */ private long datacenterId; /** Sequence in milliseconds (0~4095) */ private long sequence = 0L; /** The last time the ID was generated */ private long lastTimestamp = -1L; //==============================Constructors===================================== /** * constructor *@paramWorkerId Indicates the job ID (0 to 31)@paramDatacenterId indicates the datacenterId (0 to 31) */ public SnowflakeIdWorker(long workerId, long datacenterId) { if (workerId > maxWorkerId || workerId < 0) { throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId)); } if (datacenterId > maxDatacenterId || datacenterId < 0) { throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId)); } this.workerId = workerId; this.datacenterId = datacenterId; } // ==============================Methods========================================== /** * get the next ID (this method is thread safe) *@return SnowflakeId */ public synchronized long nextId(a) { long timestamp = timeGen(); // If the current time is less than the timestamp generated by the last ID, the system clock has rolled back this time should throw an exception if (timestamp < lastTimestamp) { throw new RuntimeException( String.format("Clock moved backwards. Refusing to generate id for %d milliseconds", lastTimestamp - timestamp)); } // If they are generated at the same time, the sequence in milliseconds is performed if (lastTimestamp == timestamp) { sequence = (sequence + 1) & sequenceMask; // Sequence overflow in milliseconds if (sequence == 0) { // Block until the next millisecond to get the new timestamptimestamp = tilNextMillis(lastTimestamp); }}// The timestamp changes and the sequence resets within milliseconds else { sequence = 0L; } // The last time the ID was generated lastTimestamp = timestamp; // Shift and put together by the or operation to form a 64-bit ID return ((timestamp - twepoch) << timestampLeftShift) // | (datacenterId << datacenterIdShift) // | (workerId << workerIdShift) // | sequence; } /** * block until the next millisecond until a new timestamp is obtained *@paramLastTimestamp Timestamp of the last ID generation *@returnCurrent timestamp */ protected long tilNextMillis(long lastTimestamp) { long timestamp = timeGen(); while (timestamp <= lastTimestamp) { timestamp = timeGen(); } return timestamp; } /** * Returns the current time in milliseconds@returnCurrent time (ms) */ protected long timeGen(a) { return System.currentTimeMillis(); } //==============================Test============================================= / * * * / test public static void main(String[] args) { SnowflakeIdWorker idWorker = new SnowflakeIdWorker(0.0); for (int i = 0; i < 1000; i++) { longid = idWorker.nextId(); System.out.println(Long.toBinaryString(id)); System.out.println(id); }}}Copy the code
  • Advantages:
    • Rapid formation
    • Simple implementation, no redundant dependencies
    • Can adjust each bit segment according to the actual situation, convenient and flexible
  • Disadvantages:
    • It can only trend up
    • Machine time dependent. A callback may result in duplicate ids being generated

SnowFlake algorithm time callback question:

  • Time callback Cause:
    • Due to business needs, the machine needs to synchronize the time server
  • Time callback problem solution:
    • When the callback time is less than 15ms, you can wait until the callback time catches up
    • When the callback time is longer than 15ms, you can solve the callback problem by replacing the workId to generate an Id that has not been generated before
  • Steps:
    • First, adjust the number of bits in the workId to 15

    • thenSnowflakeIdWorkerRealize adjustment bit segment
      • Use the 1 bit as the sign bit, that is, the generated distributed I only d is positive
      • Use the 38-bit timestamp to represent the increment of the current time relative to the initial time in milliseconds
      • Use 15 bits as machine ID and support up to 32,800 nodes
      • Using 10 bits as sequence numbers for milliseconds, it is theoretically possible to generate 2^10^ sequence numbers
    • Because of the stateless relationship of the service, normallyworkIdWill not be configured in a specific configuration file, you can choose centralized hereRedisAs central storage:
      • The redundant 30,000 workids after the workId bit adjustment are placed in a Queue based on Redis for centralized management of workids
      • Each time the node starts, it checks to see if there is a workId on the local node. If there is a workId, it is used as the workId. If not, the workId is removed from the queue and used as a workId
      • When it finds that the callback is too much, it queues another workId when the new workId is used, and stores the workId from the previous callback situation in the queue. Because the queue is removed from the beginning and inserted from the end each time, this avoids the possibility that the workId just used by machine A will be obtained by machine B again
      • If you use Redis, you will encounter a new small problem: how to guarantee the consistency of Redis? What if Redis dies? How to synchronize?
  • From the base component perspective, when the SnowflakeIdWorker algorithm encounters a callback problem, it only needs to throw an exception, which keeps the implementation of the algorithm simple
  • You can also refer to the UID-Generator method: take a batch of workids at a time, and then batch them after the centralization, which can solve the performance problem of each node accessing the centralized machines.