Foreword
When distributed ID generation schemes come up, you are certainly familiar with them and can probably name the strengths of several approaches right away. Indeed, as an important identifier of system data, the ID's importance is self-evident, and the solutions have gone through several generations of iteration and optimization. Allow me to classify the common schemes for automatic distributed ID generation from two angles: how they are implemented, and how they are used in practice.
Implementation approaches
- Fully data-source-dependent
ID generation and ordering are controlled entirely by the data source, such as a database's auto-increment ID or sequence, or the sequence numbers produced by Redis's INCR/INCRBY atomic operations.
- Partially data-source-dependent
Some of the generation factors must be controlled by a data source (or by configuration), as with the Snowflake algorithm's worker identifiers.
- Data-source-independent
The ID is computed entirely from local machine information, independent of any configuration or data records; common examples are UUID and GUID.
Practices
The practical schemes below apply to all three implementation approaches above and can be combined with them to improve system throughput, but the limitations of the underlying implementation remain.
- Real-time generation
As the name implies, an ID is generated on the fly each time one is requested. It is simple and fast and the IDs are contiguous, but throughput is not necessarily the highest.
- Pre-generation
A batch of IDs is generated ahead of time and stored in a data pool. They can be produced by simple auto-increment, or generated in batches with a configured step size. The pre-generated IDs need to be kept in a storage container (JVM memory, Redis, or a database table all work). This greatly improves throughput, but it requires temporary storage, and the cached IDs may be lost after a power failure, leaving gaps in the sequence. A minimal sketch of such a pool follows.
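As a concrete illustration of pre-generation, here is a minimal sketch of an in-memory ID pool (the class name, batch size, and the LongSupplier backing source are all hypothetical; a real pool would also need persistence and failure handling):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.LongSupplier;

/** Hypothetical in-memory ID pool: pre-generates a batch of IDs and refills when running low. */
public class PreGeneratedIdPool {

    private final BlockingQueue<Long> pool;
    private final LongSupplier source;   // e.g. a database sequence or a Redis INCR call
    private final int batchSize;

    public PreGeneratedIdPool(LongSupplier source, int batchSize) {
        this.source = source;
        this.batchSize = batchSize;
        this.pool = new ArrayBlockingQueue<>(batchSize * 2);
        refill();
    }

    public long nextId() throws InterruptedException {
        Long id = pool.poll();
        if (id == null) {                          // pool exhausted: fetch another batch synchronously
            refill();
            id = pool.take();
        } else if (pool.size() < batchSize / 2) {
            refill();                              // refill early so callers rarely block
        }
        return id;
    }

    private synchronized void refill() {
        for (int i = 0; i < batchSize && pool.remainingCapacity() > 0; i++) {
            pool.offer(source.getAsLong());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Backing the pool with a trivial counter just for demonstration
        java.util.concurrent.atomic.AtomicLong counter = new java.util.concurrent.atomic.AtomicLong();
        PreGeneratedIdPool idPool = new PreGeneratedIdPool(counter::incrementAndGet, 100);
        System.out.println(idPool.nextId());
        System.out.println(idPool.nextId());
    }
}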
Scheme overview
The following is a brief introduction to the currently popular distributed ID schemes.
1. Database auto-increment ID
This is the most common ID generation method. It is fully dependent on the data source, and all IDs are stored in the database. It was widely used in the single-application era: when creating tables, the database's AUTO_INCREMENT column is used as the primary key, or sequences are used to satisfy auto-increment ID needs in other scenarios.
- Advantages: very simple; sequentially increasing; convenient for paging and sorting.
- Disadvantages: after splitting databases and tables, the auto-increment IDs of the same logical table easily collide and cannot be used directly (a step size can be configured, but the limitation is obvious); overall performance and throughput are low. If a single database is used to guarantee uniqueness for a distributed application, even with a pre-generation scheme, transaction locking makes it prone to a single-point bottleneck under high concurrency.
- Application scenarios: table IDs within a single database instance (including primary/secondary replication), and serial numbers counted per day. Not suitable for sharded scenarios or system-wide unique IDs. A minimal JDBC sketch follows.
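A minimal JDBC sketch of relying on AUTO_INCREMENT and reading the generated key back (the connection URL, credentials, and the t_order table are placeholders for illustration only):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class AutoIncrementIdDemo {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/demo", "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO t_order (amount) VALUES (?)",
                     Statement.RETURN_GENERATED_KEYS)) {
            ps.setLong(1, 100L);
            ps.executeUpdate();
            try (ResultSet rs = ps.getGeneratedKeys()) {
                if (rs.next()) {
                    // The AUTO_INCREMENT primary key assigned by the database
                    System.out.println("generated id = " + rs.getLong(1));
                }
            }
        }
    }
}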
2. Redis-generated ID
Redis's INCR/INCRBY auto-increment atomic commands guarantee that the generated IDs are unique and ordered; conceptually this is the same as the database approach.
- Advantages: overall throughput is higher than that of a database.
- Disadvantages: it can be difficult to recover the latest ID value after the Redis instance or cluster goes down.
- Application scenarios: suitable for counting scenarios, such as user visits or order serial numbers (date + sequence number). A minimal Jedis sketch follows.
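A minimal sketch with the Jedis client, assuming a local Redis instance; the key names and the date + sequence format are illustrative only:

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

import redis.clients.jedis.Jedis;

public class RedisIdDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // A globally increasing counter
            long id = jedis.incr("demo:global:id");

            // A per-day order serial number, e.g. 20191023 + 000001
            String day = LocalDate.now().format(DateTimeFormatter.BASIC_ISO_DATE);
            long seq = jedis.incr("demo:order:" + day);
            String orderNo = day + String.format("%06d", seq);

            System.out.println(id + " / " + orderNo);
        }
    }
}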
3. UUID / GUID
UUID: computed according to the OSF standard, using the Ethernet card address, a nanosecond-resolution timestamp, a chip ID code, and other possible numbers. It is a combination of the following parts: the current date and time (the first part of the UUID is time-related; if you generate another UUID a few seconds later, the first part differs while the rest is the same), a clock sequence, and a globally unique IEEE machine identifier (taken from the network card if one exists, derived by other means otherwise).
GUID: Microsoft's implementation of the UUID standard. There are other implementations of UUID besides GUID.
Both are truly globally unique IDs that do not rely on any data source.
- Advantages: independent of any data source; computed locally with no network calls; extremely fast; globally unique.
- Disadvantages: not sequential, and relatively long (128 bits); used as a database primary key, index efficiency drops and more space is consumed.
- Application scenarios: anywhere without strict requirements on storage space, such as link tracing and log storage. A minimal sketch follows.
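A minimal sketch with the JDK's built-in generator; note that java.util.UUID.randomUUID() produces a version-4 (random) UUID rather than the time-based layout described above:

import java.util.UUID;

public class UuidDemo {
    public static void main(String[] args) {
        String id = UUID.randomUUID().toString();
        System.out.println(id);                    // e.g. "3f2504e0-4f89-41d3-9a0c-0305e82c3301"
        System.out.println(id.replace("-", ""));   // 32 hex characters, a common compact form for keys
    }
}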
4. Snowflake-generated ID
Semi-dependent on a data source. A 64-bit long is filled according to fixed rules: timestamp (milliseconds) + cluster ID + machine ID + sequence number. The number of bits occupied by each part can be allocated as needed. In real applications the cluster ID and machine ID depend on external configuration or database records.
- Advantages: High performance, low latency, decentralized, ordered by time
- Disadvantages: requires machine clocks to be synchronized (to at least second precision)
- Application scenario: Data primary key in distributed application environment
Does the Snowflake ID algorithm sound particularly suitable for distributed architectures? The rest of this article focuses on its principle and on best practices for using it.
Implementation principles of the Snowflake algorithm
The Snowflake algorithm originated at Twitter, was implemented in Scala, and exposed its service as RPC calls using the Thrift framework. The project was born out of Twitter's database migration from MySQL to Cassandra: Cassandra has no built-in ID generation mechanism, which gave rise to the project. The source code is still on GitHub for anyone interested.
The Snowflake algorithm targets ordered, unique IDs with high performance and low latency (each machine generates at least 10K IDs per second with a response time under 2 ms) and must work in a distributed environment (multiple clusters, across machine rooms). The ID produced by Snowflake is therefore divided into segments:
- The time difference from a chosen epoch date, in milliseconds: 41 bits, enough for about 69 years
- Cluster ID + machine ID: 10 bits, supporting up to 1024 machines
- Sequence number: 12 bits, so each machine can generate up to 4096 IDs per millisecond
The 64-bit layout breaks down as follows:
- 1 bit: the sign bit, fixed at 0 so that all IDs are positive integers
- 41 bits: the time difference from a chosen epoch, good for about 69 years. The Long timestamps we normally use count from 1970-01-01 00:00:00; here the epoch can be any chosen date, such as 2019-10-23 00:00:00
- 10 bits: the machine ID, which can be split for multi-site deployments or multiple clusters; the IDs of each machine room, cluster, and instance must be planned in advance
- 12 bits: the sequence number; up to 4096 IDs can be generated while all the preceding parts stay the same (i.e. within one millisecond on one machine)
The bit allocation above is only the official suggestion; it can be adjusted to actual needs. For example, if there are at most a few dozen application machines but concurrency is high, the machine part can be reduced from 10 bits to 8 bits and the sequence part increased from 12 bits to 14 bits, and so on.
The meaning of each part can also be replaced freely. Take the middle machine-ID segment: in a cloud-computing or containerized environment where machines are scaled out and in at any time, planning instance IDs offline and configuring them is unrealistic, so that segment can instead be filled with an auto-increment ID obtained at every restart, as explained below.
GitHub also hosts a minimal Java implementation of Snowflake; the source is shown below (Snowflake, Java edition):
/**
 * Twitter's Snowflake algorithm -- Java implementation
 * @author beyond
 * @date 2016/11/26
 */
public class SnowFlake {

    /** Epoch (start timestamp) */
    private final static long START_STMP = 1480166465631L;

    /** Bits occupied by each part */
    private final static long SEQUENCE_BIT = 12;   // sequence number
    private final static long MACHINE_BIT = 5;     // machine identifier
    private final static long DATACENTER_BIT = 5;  // data center identifier

    /** Maximum value of each part */
    private final static long MAX_DATACENTER_NUM = -1L ^ (-1L << DATACENTER_BIT);
    private final static long MAX_MACHINE_NUM = -1L ^ (-1L << MACHINE_BIT);
    private final static long MAX_SEQUENCE = -1L ^ (-1L << SEQUENCE_BIT);

    /** Left shift of each part */
    private final static long MACHINE_LEFT = SEQUENCE_BIT;
    private final static long DATACENTER_LEFT = SEQUENCE_BIT + MACHINE_BIT;
    private final static long TIMESTMP_LEFT = DATACENTER_LEFT + DATACENTER_BIT;

    private long datacenterId;    // data center
    private long machineId;       // machine identifier
    private long sequence = 0L;   // sequence number
    private long lastStmp = -1L;  // last timestamp

    public SnowFlake(long datacenterId, long machineId) {
        if (datacenterId > MAX_DATACENTER_NUM || datacenterId < 0) {
            throw new IllegalArgumentException("datacenterId can't be greater than MAX_DATACENTER_NUM or less than 0");
        }
        if (machineId > MAX_MACHINE_NUM || machineId < 0) {
            throw new IllegalArgumentException("machineId can't be greater than MAX_MACHINE_NUM or less than 0");
        }
        this.datacenterId = datacenterId;
        this.machineId = machineId;
    }

    /** Generate the next ID */
    public synchronized long nextId() {
        long currStmp = getNewstmp();
        if (currStmp < lastStmp) {
            throw new RuntimeException("Clock moved backwards. Refusing to generate id");
        }

        if (currStmp == lastStmp) {
            // Same millisecond: increment the sequence number
            sequence = (sequence + 1) & MAX_SEQUENCE;
            // Sequence exhausted within this millisecond: wait for the next millisecond
            if (sequence == 0L) {
                currStmp = getNextMill();
            }
        } else {
            // Different millisecond: reset the sequence number to 0
            sequence = 0L;
        }

        lastStmp = currStmp;

        return (currStmp - START_STMP) << TIMESTMP_LEFT // timestamp part
                | datacenterId << DATACENTER_LEFT       // data center part
                | machineId << MACHINE_LEFT             // machine identifier part
                | sequence;                             // sequence number part
    }

    private long getNextMill() {
        long mill = getNewstmp();
        while (mill <= lastStmp) {
            mill = getNewstmp();
        }
        return mill;
    }

    private long getNewstmp() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        SnowFlake snowFlake = new SnowFlake(2, 3);
        for (int i = 0; i < (1 << 12); i++) {
            System.out.println(snowFlake.nextId());
        }
    }
}
The implementation is basically a series of shift operations: each segment's value is moved to its corresponding position. For example, the machine identifier here consists of the data center plus machine ID, so the machine ID is shifted left by 12 bits into its position, the data center ID is shifted left by 17 bits, and the timestamp value is shifted left by 22 bits. Each part occupies its own bit range without interfering with the others, and together they make up the complete ID value.
This is the basic implementation of Snowflake. If the bit arithmetic looks unfamiliar: in two's complement, -1 is all 1 bits, << is a left shift, -1 << 5 is -32, and -1 ^ (-1 << 5) is 31, which is how the maximum value of each segment is computed. A small demo follows.
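A tiny demo of these bit tricks, using the 5/5/12 allocation from the code above (the sample values are arbitrary):

public class BitTrickDemo {
    public static void main(String[] args) {
        // -1L is all 1 bits; XOR-ing with (-1L << n) keeps the lowest n bits, i.e. 2^n - 1
        System.out.println(-1L << 5);             // -32
        System.out.println(-1L ^ (-1L << 5));     // 31   -> max data center / machine id for 5 bits
        System.out.println(-1L ^ (-1L << 12));    // 4095 -> max sequence for 12 bits

        // Composing an ID: each part is shifted into its own bit range and OR-ed together
        long timestampPart  = 123456789L << 22;   // time delta, shifted past 5 + 5 + 12 bits
        long datacenterPart = 2L << 17;           // data center id, shifted past 5 + 12 bits
        long machinePart    = 3L << 12;           // machine id, shifted past 12 bits
        long sequencePart   = 7L;                 // sequence number
        System.out.println(timestampPart | datacenterPart | machinePart | sequencePart);
    }
}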
Once the basic implementation of Snowflake is understood, it can be used by planning machine identifiers in advance. In today's distributed production environments, however, with cloud computing and containerization the number of instances changes at any time, and server clock rollback also has to be handled. Fixing the planned IDs in configuration and then using Snowflake as-is is not feasible: instances are started, stopped, added, and removed automatically, so Snowflake needs some modification to work in production.
Baidu UID-Generator project
The UidGenerator project is built on the Snowflake principle; it only changes how the machine-ID part is defined (using the number of times an instance has started) and makes the 64-bit allocation configurable. The default allocation provided by the project is as follows:
Snowflake algorithm description: a given machine & the same moment & a given concurrent sequence is unique; from this a 64-bit unique ID (long) is generated.
- sign (1 bit): fixed at 0, i.e. the generated UID is always a positive number.
- delta seconds (28 bits): the current time in seconds, relative to a configurable epoch; supports at most about 8.7 years.
- worker id (22 bits): the machine ID, supporting up to about 4.2 million machine starts. The built-in implementation allocates it from a database at startup; by default each allocated ID is discarded after use, and a reuse strategy can be plugged in later.
- sequence (13 bits): the per-second concurrency sequence; 13 bits supports 8192 IDs per second per worker. A quick check of these capacities is sketched below.
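A back-of-the-envelope check of the capacities implied by this default allocation (my own illustrative arithmetic, not project code):

public class UidBitsDemo {
    public static void main(String[] args) {
        int deltaSecondsBits = 28, workerIdBits = 22, sequenceBits = 13;

        long maxSeconds = (1L << deltaSecondsBits) - 1;        // 268,435,455 seconds of headroom
        System.out.println(maxSeconds / (3600.0 * 24 * 365));  // about 8.5 years assuming 365-day years

        System.out.println(1L << workerIdBits);                // 4,194,304 possible machine starts
        System.out.println(1L << sequenceBits);                // 8,192 IDs per second per worker
    }
}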
There are two concrete generator implementations: one generates IDs in real time, the other pre-generates them.
- DefaultUidGenerator
- At startup it inserts the current instance's IP, port, and other information into the WORKER_NODE table in the database, then uses that row's auto-increment ID as the machine-ID part.
- It provides a method for obtaining an ID that checks for clock rollback; if the clock has moved backwards, an exception is thrown. The current version does not handle clock drift.
The core code is as follows:
/**
* Get UID
*
* @return UID
* @throws UidGenerateException in the case: Clock moved backwards; Exceeds the max timestamp
*/
protected synchronized long nextId() {
long currentSecond = getCurrentSecond();
// Clock moved backwards, refuse to generate uid
if (currentSecond < lastSecond) {
long refusedSeconds = lastSecond - currentSecond;
throw new UidGenerateException("Clock moved backwards. Refusing for %d seconds", refusedSeconds);
}
// At the same second, increase sequence
if (currentSecond == lastSecond) {
sequence = (sequence + 1) & bitsAllocator.getMaxSequence();
// Exceed the max sequence, we wait the next second to generate uid
if (sequence == 0) {
currentSecond = getNextSecond(lastSecond);
}
// At the different second, sequence restart from zero
} else {
sequence = 0L;
}
lastSecond = currentSecond;
// Allocate bits for UID
return bitsAllocator.allocate(currentSecond - epochSeconds, workerId, sequence);
}
- CachedUidGenerator
The machine ID is obtained the same way as in DefaultUidGenerator; the difference is that a batch of IDs is generated in advance and placed in a RingBuffer array for clients to take. When the available data drops below a threshold, batch generation is triggered again. This trades space for time and improves overall ID throughput.
- Compared with DefaultUidGenerator, there is additional logic during initialization to fill the RingBuffer.
Core code:
/** Initialize RingBuffer & RingBufferPaddingExecutor */
private void initRingBuffer() {
    // initialize RingBuffer
    int bufferSize = ((int) bitsAllocator.getMaxSequence() + 1) << boostPower;
    this.ringBuffer = new RingBuffer(bufferSize, paddingFactor);
    LOGGER.info("Initialized ring buffer size:{}, paddingFactor:{}", bufferSize, paddingFactor);

    // initialize RingBufferPaddingExecutor
    boolean usingSchedule = (scheduleInterval != null);
    this.bufferPaddingExecutor = new BufferPaddingExecutor(ringBuffer, this::nextIdsForOneSecond, usingSchedule);
    if (usingSchedule) {
        bufferPaddingExecutor.setScheduleInterval(scheduleInterval);
    }
    LOGGER.info("Initialized BufferPaddingExecutor. Using schdule:{}, interval:{}", usingSchedule, scheduleInterval);

    // set rejected put/take handle policy
    this.ringBuffer.setBufferPaddingExecutor(bufferPaddingExecutor);
    if (rejectedPutBufferHandler != null) {
        this.ringBuffer.setRejectedPutHandler(rejectedPutBufferHandler);
    }
    if (rejectedTakeBufferHandler != null) {
        this.ringBuffer.setRejectedTakeHandler(rejectedTakeBufferHandler);
    }

    // fill in all slots of the RingBuffer
    bufferPaddingExecutor.paddingBuffer();

    // start buffer padding threads
    bufferPaddingExecutor.start();
}
public synchronized boolean put(long uid) {
    long currentTail = tail.get();
    long currentCursor = cursor.get();

    // tail catches the cursor, means that you can't put any cause of RingBuffer is full
    long distance = currentTail - (currentCursor == START_POINT ? 0 : currentCursor);
    if (distance == bufferSize - 1) {
        rejectedPutHandler.rejectPutBuffer(this, uid);
        return false;
    }

    // 1. pre-check whether the flag is CAN_PUT_FLAG
    int nextTailIndex = calSlotIndex(currentTail + 1);
    if (flags[nextTailIndex].get() != CAN_PUT_FLAG) {
        rejectedPutHandler.rejectPutBuffer(this, uid);
        return false;
    }

    // 2. put UID in the next slot
    // 3. update next slot' flag to CAN_TAKE_FLAG
    // 4. publish tail with sequence increase by one
    slots[nextTailIndex] = uid;
    flags[nextTailIndex].set(CAN_TAKE_FLAG);
    tail.incrementAndGet();

    // The atomicity of operations above, guarantees by 'synchronized'. In another word,
    // the take operation can't consume the UID we just put, until the tail is published(tail.incrementAndGet())
    return true;
}
- Since the RingBuffer cache exists, IDs are taken directly from the RingBuffer, and the check performed when taking an ID can itself trigger another round of batch generation. The obvious difference from DefaultUidGenerator is that the ID value obtained from CachedUidGenerator is not computed from the timestamp at the moment of the call: its timestamp segment comes from when the batch was pre-generated.
Core code:
public long take() {
    // spin get next available cursor
    long currentCursor = cursor.get();
    long nextCursor = cursor.updateAndGet(old -> old == tail.get() ? old : old + 1);

    // check for safety consideration, it never occurs
    Assert.isTrue(nextCursor >= currentCursor, "Curosr can't move back");

    // trigger padding in an async-mode if reach the threshold
    long currentTail = tail.get();
    if (currentTail - nextCursor < paddingThreshold) {
        LOGGER.info("Reach the padding threshold:{}. tail:{}, cursor:{}, rest:{}", paddingThreshold, currentTail,
                nextCursor, currentTail - nextCursor);
        bufferPaddingExecutor.asyncPadding();
    }

    // cursor catch the tail, means that there is no more available UID to take
    if (nextCursor == currentCursor) {
        rejectedTakeHandler.rejectTakeBuffer(this);
    }

    // 1. check next slot flag is CAN_TAKE_FLAG
    int nextCursorIndex = calSlotIndex(nextCursor);
    Assert.isTrue(flags[nextCursorIndex].get() == CAN_TAKE_FLAG, "Curosr not in can take status");

    // 2. get UID from next slot
    // 3. set next slot flag as CAN_PUT_FLAG.
    long uid = slots[nextCursorIndex];
    flags[nextCursorIndex].set(CAN_PUT_FLAG);

    // Note that: Step 2,3 can not swap. If we set flag before get value of slot, the producer may overwrite the
    // slot with a new UID, and this may cause the consumer take the UID twice after walk a round the ring
    return uid;
}
In addition, the RingBuffer data is stored in plain arrays. Considering the CPU cache structure, if the tail and cursor variables were plain AtomicLong instances they might land in the same cache line, and multiple threads reading and writing them would trigger RFO (Request For Ownership) traffic on that cache line, hurting performance. To prevent this false-sharing problem, 6 extra long fields are padded in: together with AtomicLong's own long value and the 8-byte Java object header, they fill out a 64-byte cache line so that tail and cursor never share one.
import java.util.concurrent.atomic.AtomicLong;

public class PaddedAtomicLong extends AtomicLong {
    private static final long serialVersionUID = -3415778863941386253L;

    /** Padded 6 long (48 bytes) */
    public volatile long p1, p2, p3, p4, p5, p6 = 7L;

    /**
     * Constructors from {@link AtomicLong}
     */
    public PaddedAtomicLong() {
        super();
    }

    public PaddedAtomicLong(long initialValue) {
        super(initialValue);
    }

    /**
     * To prevent GC optimizations for cleaning unused padded references
     */
    public long sumPaddingToPreventOptimization() {
        return p1 + p2 + p3 + p4 + p5 + p6;
    }
}
That covers the main points of Baidu's UID-Generator project. We can see that the Snowflake algorithm needs some changes when it lands in production, mainly around obtaining the machine ID: in a distributed cluster environment with automatically scaling instances and Docker containers, statically configuring per-instance IDs is not feasible, so the machine ID is instead derived from the number of startups.
Meituan ECP-UID project
In the uidGenerator part, Meituan's project source directly integrates Baidu's source code, only changing a few lambda expressions into anonymous inner classes (plain Java syntax), for example:
// com.myzmds.ecp.core.uid.baidu.impl.CachedUidGenerator#initRingBuffer()

// Baidu's original source
this.bufferPaddingExecutor = new BufferPaddingExecutor(ringBuffer, this::nextIdsForOneSecond, usingSchedule);

// Meituan's source
this.bufferPaddingExecutor = new BufferPaddingExecutor(ringBuffer, new BufferedUidProvider() {
    @Override
    public List<Long> provide(long momentInSecond) {
        return nextIdsForOneSecond(momentInSecond);
    }
}, usingSchedule);
In addition, Zookeeper and Redis components are introduced to enrich how the machine ID is generated and obtained: instance numbers can be stored and reused, instead of relying on a database's ever-growing auto-increment value. A hedged sketch of the Redis-based idea is shown below.
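This is not the actual ecp-uid source; it is just a minimal sketch of handing out a reusable worker ID from Redis at startup (the key name, the modulo-based reuse, and the Jedis usage are my own assumptions):

import redis.clients.jedis.Jedis;

public class RedisWorkerIdAssigner {

    /** Must match the number of bits reserved for the worker id */
    private static final long MAX_WORKER_ID = 1L << 22;

    /**
     * Allocates a worker id when an instance starts. A plain INCR keeps the numbers unique
     * across instances; taking the value modulo MAX_WORKER_ID lets the id space wrap around
     * and be reused instead of growing without bound.
     */
    public long assignWorkerId(String redisHost, int redisPort) {
        try (Jedis jedis = new Jedis(redisHost, redisPort)) {
            long counter = jedis.incr("uid:worker:counter");
            return counter % MAX_WORKER_ID;
        }
    }

    public static void main(String[] args) {
        long workerId = new RedisWorkerIdAssigner().assignWorkerId("localhost", 6379);
        System.out.println("assigned workerId = " + workerId);
    }
}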
Conclusion
This article briefly introduced the principle of the Snowflake algorithm and the adaptations it needs when put into production, drawing on excellent open-source code and a few simple examples. Meituan's ecp-uid project not only integrates Baidu's existing UidGenerator algorithm and the original Snowflake algorithm, it also includes the excellent Leaf segment algorithm, which is not covered in detail here for reasons of space. Corrections to any inaccuracies or omissions in this article are welcome. Thank you.
I focus on Java high concurrency and distributed architecture. For more technical articles and shared experience, follow the public account: Java architecture community.