This article has participated in the good article call order activity, click to see: back end, big front end double track submission, 20,000 yuan prize pool for you to challenge!

Distributed unique ID

When using RocketMQ, you need to use distributed unique ids
Messages can be duplicated, so to be idempotent on the consumer side, the producer must have a uniqueness in order to achieve business idempotentID,The following conditions must be met:
- The same business scenario must be globally unique
- The ID must be generated at the sender of the message to send to MQ
- The consumer uses the ID to determine whether it is repeated to ensure idempotentiality
Where the ID is generated and the consumer is judged to be idempotent is independent of the ID. The characteristics of this ID need to be guaranteed:
- Locally or even globally unique
- Increasing trend

Snowflake algorithm

Snowflake is Twitter’s open source distributed ID generation algorithm, and the result is aLongType ID, the core idea is:
- Use the 1 bit as the sign bit, which is determined to be 0 to indicate positive
- Use 41 bits as the number of milliseconds
- Use 10 bits for the machine ID: the high 5 bits are the data center ID and the low 5 bits are the machine ID
- Using 12 bits as the sequence number for milliseconds means that each node can generate 4096(2^12^) ids per second

The algorithm is implemented by binary operation, which can be generated at most in a single machine per second1000 * 2 ^ (12), 即409.6Wan an ID

SnowflakeIdWorker

The Snowflake algorithm is implemented in Java.

/** * Twitter_Snowflake

 * SnowFlake :

 * 0-0000000000 0000000000 0000000000 0000000000 0-00000 - 















 Instead of storing the current time slice, the 41-bit time slice stores the difference between the current time slice and the start time slice *. In this case, the start time slice is usually the start time of our ID generator. Specified by our program (see the startTime attribute of the IdWorker class below). T = (1L << 41)/(1000L * 60 * 60 * 24 * 365) = 69

 * 10-bit data machine bit, can be deployed on 1024 nodes, Supports 4096 datacenterId numbers per millisecond (same machine, same time) generated by each node 

 * adds up to exactly 64 bits and is a Long type. 

 * The advantages of SnowFlake are that it is an overall time-increment-ordered system with no ID collisions (differentiated by data center ID and machine ID) in the entire distributed system, and it is highly efficient. In tests, SnowFlake can generate around 260,000 ids per second. * /
public class SnowflakeIdWorker {

    // ==============================Fields===========================================
    /** Start time (2015-01-01) */
    private final long twepoch = 1420041600000L;

    /** The number of digits in the machine ID */
    private final long workerIdBits = 5L;

    /** The number of bits of the data id */
    private final long datacenterIdBits = 5L;

    /** The maximum machine ID supported is 31 (this shift algorithm can quickly calculate the maximum decimal number that can be represented by several binary digits) */
    private final long maxWorkerId = -1L ^ (-1L << workerIdBits);

    /** The maximum data id supported is 31 */
    private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);

    /** The number of bits in the id */
    private final long sequenceBits = 12L;

    /** Move the machine ID to the left by 12 bits */
    private final long workerIdShift = sequenceBits;

    /** Move the data id 17 bits (12+5) to the left */
    private final long datacenterIdShift = sequenceBits + workerIdBits;

    /** Shift the time slice to the left by 22 bits (5+5+12) */
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;

    /** Generate the mask of the sequence, which is 4095 (0b111111111111= 0xFFf =4095) */
    private final long sequenceMask = -1L ^ (-1L << sequenceBits);

    /** ID of machine (0~31) */
    private long workerId;

    /** Data center ID(0 to 31) */
    private long datacenterId;

    /** Sequence in milliseconds (0~4095) */
    private long sequence = 0L;

    /** The last time the ID was generated */
    private long lastTimestamp = -1L;

    //==============================Constructors=====================================
    /** * constructor *@paramWorkerId Indicates the job ID (0 to 31)@paramDatacenterId indicates the datacenterId (0 to 31) */
    public SnowflakeIdWorker(long workerId, long datacenterId) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }

    // ==============================Methods==========================================
    /** * get the next ID (this method is thread safe) *@return SnowflakeId
     */
    public synchronized long nextId(a) {
        long timestamp = timeGen();

        // If the current time is less than the timestamp generated by the last ID, the system clock has rolled back this time should throw an exception
        if (timestamp < lastTimestamp) {
            throw new RuntimeException(
                    String.format("Clock moved backwards. Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
        }

        // If they are generated at the same time, the sequence in milliseconds is performed
        if (lastTimestamp == timestamp) {
            sequence = (sequence + 1) & sequenceMask;
            // Sequence overflow in milliseconds
            if (sequence == 0) {
                // Block until the next millisecond to get the new timestamptimestamp = tilNextMillis(lastTimestamp); }}// The timestamp changes and the sequence resets within milliseconds
        else {
            sequence = 0L;
        }

        // The last time the ID was generated
        lastTimestamp = timestamp;

        // Shift and put together by the or operation to form a 64-bit ID
        return ((timestamp - twepoch) << timestampLeftShift) //
                | (datacenterId << datacenterIdShift) //
                | (workerId << workerIdShift) //
                | sequence;
    }

    /** * block until the next millisecond until a new timestamp is obtained *@paramLastTimestamp Timestamp of the last ID generation *@returnCurrent timestamp */
    protected long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    /** * Returns the current time in milliseconds@returnCurrent time (ms) */
    protected long timeGen(a) {
        return System.currentTimeMillis();
    }

    //==============================Test=============================================
    / * * * / test
    public static void main(String[] args) {
        SnowflakeIdWorker idWorker = new SnowflakeIdWorker(0.0);
        for (int i = 0; i < 1000; i++) {
            longid = idWorker.nextId(); System.out.println(Long.toBinaryString(id)); System.out.println(id); }}}Copy the code

Advantages:
- Rapid formation
- Simple implementation, no redundant dependencies
- Can adjust each bit segment according to the actual situation, convenient and flexible
Disadvantages:
- It can only trend up
- Machine time dependent. A callback may result in duplicate ids being generated

SnowFlake algorithm time callback question:

Time callback Cause:

Due to business needs, the machine needs to synchronize the time server

Time callback problem solution:

When the callback time is less than 15ms, you can wait until the callback time catches up

When the callback time is longer than 15ms, you can solve the callback problem by replacing the workId to generate an Id that has not been generated before

Steps:

First, adjust the number of bits in the workId to 15

thenSnowflakeIdWorkerRealize adjustment bit segment

Use the 1 bit as the sign bit, that is, the generated distributed I only d is positive

Use the 38-bit timestamp to represent the increment of the current time relative to the initial time in milliseconds

Use 15 bits as machine ID and support up to 32,800 nodes

Using 10 bits as sequence numbers for milliseconds, it is theoretically possible to generate 2^10^ sequence numbers

Because of the stateless relationship of the service, normallyworkIdWill not be configured in a specific configuration file, you can choose centralized hereRedisAs central storage:

The redundant 30,000 workids after the workId bit adjustment are placed in a Queue based on Redis for centralized management of workids

Each time the node starts, it checks to see if there is a workId on the local node. If there is a workId, it is used as the workId. If not, the workId is removed from the queue and used as a workId

When it finds that the callback is too much, it queues another workId when the new workId is used, and stores the workId from the previous callback situation in the queue. Because the queue is removed from the beginning and inserted from the end each time, this avoids the possibility that the workId just used by machine A will be obtained by machine B again

If you use Redis, you will encounter a new small problem: how to guarantee the consistency of Redis? What if Redis dies? How to synchronize?

From the base component perspective, when the SnowflakeIdWorker algorithm encounters a callback problem, it only needs to throw an exception, which keeps the implementation of the algorithm simple
You can also refer to the UID-Generator method: take a batch of workids at a time, and then batch them after the centralization, which can solve the performance problem of each node accessing the centralized machines.

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

Distributed ID generation solution selection! “Snowflake” is an algorithm that uses Snowflake

Distributed unique ID

Snowflake algorithm

SnowflakeIdWorker

Distributed ID generation solution selection! “Snowflake” is an algorithm that uses Snowflake

Distributed unique ID

Snowflake algorithm

SnowflakeIdWorker

Related Posts

What are the advantages and disadvantages of recommendation algorithms commonly used in short video system development

Recursion. – I don’t understand, but I’m impressed?

Win10 Permissions management and Multi-user Remote Login (multi-scheme)