Abstract

This article introduces the principles behind Meituan's distributed ID framework Leaf, and describes a set of feature enhancements, bug fixes, and optimizations made to the original Leaf project. The improved project is available at:

Improved Leaf project: github.com/NotFound9/L…

Leaf principle analysis

Snowflake ID generation mode


1-bit sign + 41-bit timestamp + 10-bit workId + 12-bit sequence number

That adds up to 64 binary bits, exactly the width of a long in Java.
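For reference, here is a small sketch of how those bit widths translate into shifts and value ranges; the variable names follow the code snippets later in this article, and the snippet is illustrative rather than Leaf's exact fields:

long workerIdBits = 10L;
long sequenceBits = 12L;

long maxWorkerId = ~(-1L << workerIdBits);              // 1023: the workId range is 0-1023
long sequenceMask = ~(-1L << sequenceBits);             // 4095: the sequence range is 0-4095

long workerIdShift = sequenceBits;                      // 12: the workId sits above the sequence bits
long timestampLeftShift = sequenceBits + workerIdBits;  // 22: the timestamp sits above the workId bits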

Meituan’s Leaf framework makes some tweaks to the Snowflake algorithm. The bit allocation looks like this:

A time difference of at most 41 bits + 10-bit workId + 12-bit sequence number

Although Meituan's Leaf introduction article says:

The Leaf-snowflake scheme follows the Snowflake bit design entirely, namely assembling the ID number as "1+41+10+12".

In practice, if the timestamp is too large, the time difference can occupy 42 binary bits; when the first bit of the time difference is 1, the ID generated by the following code may be negative after conversion to decimal:

//timestampLeftShift is 22, workerIdShift is 12
long id = ((timestamp - twepoch) << timestampLeftShift) | (workerId << workerIdShift) | sequence;

What is the time difference?

The Unix timestamp counts from 00:00:00 on January 1, 1970, so the timestamps we normally obtain measure the time elapsed since that starting point. If we can guarantee that every timestamp we obtain is later than some fixed point in time, we can move the starting point to that point and store only the difference. In the Leaf project, if no start time is configured, the default starting point is 09:42:54 on November 4, 2010. This extends the maximum time the Leaf framework can support, which is about 69 years after the starting point.
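As a quick sanity check on that 69-year figure, here is a back-of-the-envelope calculation assuming the 41-bit time difference is measured in milliseconds:

long maxDelta = ~(-1L << 41);                            // 2^41 - 1 milliseconds
double years = maxDelta / (1000.0 * 60 * 60 * 24 * 365);
System.out.println(years);                               // prints about 69.7, i.e. roughly 69 years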

How are workIds assigned?

Leaf uses Zookeeper as the registry. Every time a machine starts, it reads the list of child nodes under a specific Zookeeper path (/forever/); each child node stores an IP:Port and its corresponding workId. The list is traversed, and if a node already holds a workId for the current IP:Port, that workId is used. If no such node exists, a permanent sequential node is created and its sequence number is used as the workId. The workId information is also written to the local cache file workerID.properties so that it can be read at startup if the connection to Zookeeper fails. Because only 10 bits are allocated to the workId, its value ranges from 0 to 1023.
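The following is a minimal sketch of that lookup-or-create flow using Apache Curator. The path layout, node naming, and helper structure are simplifying assumptions for illustration, not Leaf's exact code:

// Sketch only: look up or create a workId for "ip:port" with Apache Curator
// (uses org.apache.curator.framework.* and org.apache.zookeeper.CreateMode; assumes the parent path already exists)
public int assignWorkerId(String ip, int port, String zkAddress) throws Exception {
    CuratorFramework client = CuratorFrameworkFactory.newClient(zkAddress, new RetryUntilElapsed(1000, 4));
    client.start();
    String listenAddress = ip + ":" + port;
    String forever = "/snowflake/leaf/forever";              // assumed path layout
    for (String node : client.getChildren().forPath(forever)) {
        String[] parts = node.split("-");                    // assumed node name: "ip:port-0000000001"
        if (parts[0].equals(listenAddress)) {
            return Integer.parseInt(parts[1]);               // reuse the workId already registered for this address
        }
    }
    // No node for this ip:port yet: create a permanent sequential node,
    // and use its sequence number as the workId
    String created = client.create()
            .creatingParentsIfNeeded()
            .withMode(CreateMode.PERSISTENT_SEQUENTIAL)
            .forPath(forever + "/" + listenAddress + "-", new byte[0]);
    // Persist the workId to workerID.properties here so startup still works when Zookeeper is unreachable
    return Integer.parseInt(created.substring(created.lastIndexOf('-') + 1));
}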

How is the sequence number generated?

The sequence number occupies 12 binary bits and ranges from 0 to 4095; it guarantees that IDs generated by the same Leaf service within the same millisecond are unique. The sequence number is generated as follows: 1. If the current timestamp is in the same millisecond as the timestamp of the previous ID, use sequence+1; if sequence+1 exceeds 4095, wait until the next millisecond to generate the ID. 2. If the current timestamp is not in the same millisecond as the previous ID, choose a random number within 100 as the sequence number.

if (lastTimestamp == timestamp) {
    sequence = (sequence + 1) & sequenceMask;
    if (sequence == 0) {
        // If seq is 0, randomize seq in the next millisecond
        sequence = RANDOM.nextInt(100);
        timestamp = tilNextMillis(lastTimestamp);
    }
} else {
    // If a new millisecond starts
    sequence = RANDOM.nextInt(100);
}
lastTimestamp = timestamp;

Segment ID generation mode


The general process is that each Leaf service keeps two Segment instances in memory, and each Segment stores a segment (range) of IDs.

One Segment is the one currently used to allocate IDs: its value attribute holds the ID currently allocated from the segment, and its max attribute holds the maximum ID that this segment can allocate.

The other Segment is held in reserve; when the current segment is used up, the buffer switches over and the other segment is used.

When 10% of the IDs in the current segment have been consumed, the other segment fetches a new range of IDs from the DB and is initialized for later use.

Segment {
    private AtomicLong value = new AtomicLong(0);   // the ID currently allocated from this segment
    private volatile long max;                      // the maximum ID this segment can allocate
    private volatile int step;
}

SegmentBuffer {
    private String key;
    private Segment[] segments;                     // the two segments
    private volatile int currentPos;                // index of the segment currently in use
    private volatile boolean nextReady;             // whether the next segment can be switched to
    private volatile boolean initOk;                // whether the buffer has been initialized
    private final AtomicBoolean threadRunning;      // whether the loading thread is running
    private final ReadWriteLock lock;
    private volatile int step;
    private volatile int minStep;
    private volatile long updateTimestamp;
}
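The following is a simplified sketch of the double-buffer behaviour described above. The getter, setter, and helper names (getIdle(), switchPos(), loadNextSegmentFromDbAsync(), and so on) are assumptions for illustration; the real Leaf code also involves read/write locks and waiting that are omitted here:

// Sketch only: allocate one ID from a SegmentBuffer (locking and waiting omitted)
long getId(SegmentBuffer buffer) {
    Segment segment = buffer.getCurrent();
    // When more than 10% of the current segment has been consumed and the next segment is not ready,
    // trigger an asynchronous load of the next segment from the DB
    if (!buffer.isNextReady()
            && segment.getIdle() < 0.9 * segment.getStep()
            && buffer.getThreadRunning().compareAndSet(false, true)) {
        loadNextSegmentFromDbAsync(buffer);                  // assumed helper
    }
    long value = segment.getValue().getAndIncrement();
    if (value < segment.getMax()) {
        return value;
    }
    // The current segment is exhausted: switch to the other segment if it is ready
    if (buffer.isNextReady()) {
        buffer.switchPos();                                  // currentPos = nextPos()
        buffer.setNextReady(false);
        return buffer.getCurrent().getValue().getAndIncrement();
    }
    throw new IllegalStateException("both segments exhausted, waiting for the DB load to finish");
}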

Leaf project improvements

The current problems with Leaf are as follows:

Related to Snowflake ID generation:

1. The registry supports only Zookeeper

For small companies or project teams that do not already use Zookeeper, maintaining a Zookeeper cluster just to deploy the Leaf service is too costly. There was therefore an issue in the original project asking how to support non-Zookeeper registries. Since MySQL is more commonly available in ordinary projects, support was added for using MySQL as the registry and for using local configuration as the registry.

2. Potential clock rollback problems

The workerID in the local cache file workerID.properties is used when the connection to Zookeeper fails at startup, but the maximum timestamp previously reported for that workerID is not verified, which may lead to duplicate IDs.

3. If the time difference is too large, the ID becomes negative

Because there is no check on the time difference, when the time difference is too large and its binary representation exceeds 41 bits, ID generation overflows, the sign bit becomes 1, and the ID turns negative.

Related to Segment ID generation:

There were not too many problems, but the code was optimized for performance based on some issues.

Specific improvements are as follows:

Improvements related to Snowflake ID generation:

1. For issue #84 of the original Leaf project, added zk_recycle mode (the registry is ZK and workIds are recyclable)

2. For issue #100 of the original Leaf project, added mysql mode (the registry is MySQL)

3. For issue #100 of the original Leaf project, added local mode (the registry is the project's local configuration)

4. For issue #84 of the original Leaf project, fixed a clock rollback problem at startup

5. For issue #106 of the original Leaf project, fixed the problem where a time difference that is too large exceeds 41 bits and causes the generated ID to overflow

Improvements related to Segment ID generation:

1. For issue #68 of the original Leaf project, optimized the SegmentIDGenImpl.updateCacheFromDb() method.

2. For issue #88 of the original Leaf project, replaced the modulo operation with the bitwise & operation

Improvements related to Snowflake ID generation

The original registry mode of the Leaf project (which we will temporarily call zk_normal mode) uses Zookeeper as the registry. Every time a machine starts, it reads the list of child nodes under a specific Zookeeper path; if a node already holds a workId for the current IP:Port, that workId is used. If no such node exists, a permanent sequential node is created and its sequence number is used as the workId, and the workId information is written to the local cache file workerID.properties for reading when the connection to Zookeeper fails at startup.

1. For issue #84 of the original Leaf project, added zk_recycle mode (the registry is ZK and workIds are recyclable)

Question details:

Issue #84: Does workId support recycling?

Does workId support recycling in SnowflakeService mode? In a distributed environment, each redeployment may change the IP address, and the 1024 machine IDs will be used up quickly if they are not recycled. Why not use ephemeral nodes to store workIds, so that services coming online can be detected dynamically and workIds can be managed and recycled?

Solution:

The zk_recycle mode was developed. In the original technical solution for generating distributed IDs with Snowflake, Zookeeper is used as the registry to allocate a fixed workId to each service according to its IP:Port; the workId range is 0 to 1023 and workIds cannot be recycled. That is why issue #84 ("does workId support recycling?") was raised against the original Leaf project: when the IP and Port of the service running Leaf are not fixed and workIds are never reclaimed, the generated distributed IDs will repeat once the workId exceeds its maximum value. So the workId recycling mode zk_recycle was added.

How do I use the zk_recycle mode?

Add the following configuration to leaf/leaf-server/src/main/resources/leaf.properties:

# Enable the Snowflake service
leaf.snowflake.enable=true
# The port of the Leaf service, used to generate the workId
leaf.snowflake.port=
# Set the snowflake mode to zk_recycle: the registry is Zookeeper and the workerId is reusable
leaf.snowflake.mode=zk_recycle
# Zookeeper address
leaf.snowflake.zk.address=localhost:2181

Start LeafServerApplication and call /api/snowflake/get/test to obtain a distributed ID generated in this mode.

curl domain/api/snowflake/get/test
1256557484213448722

Implementation principle of the zk_recycle mode

After adding the above configuration to leaf.properties, the following branch is taken at startup:

if (mode.equals(SnowflakeMode.ZK_RECYCLE)) {
    // The registry is ZK, and the workId assigned to ip:port is recyclable
    String zkAddress = properties.getProperty(Constants.LEAF_SNOWFLAKE_ZK_ADDRESS);
    RecyclableZookeeperHolder holder = new RecyclableZookeeperHolder(Utils.getIp(), port, zkAddress);
    idGen = new SnowflakeIDGenImpl(holder);
    if (idGen.init()) {
        logger.info("Snowflake Service Init Successfully in mode " + mode);
    } else {
        throw new InitException("Snowflake Service Init Fail");
    }
}

In this mode the holder used by SnowflakeIDGenImpl is a RecyclableZookeeperHolder instance, and workIds are recyclable. RecyclableZookeeperHolder works as follows:

1. All workIds are first generated in the unused-workId pool (the Zookeeper path is /snowflake/leaf.name/recycle/notuse/).

2. Each time a server starts, it takes a workId from the unused-workId pool and moves it into the in-use workId pool (the Zookeeper path is /snowflake/leaf.name/recycle/inuse/), uses that workId to generate IDs, and periodically reports its timestamp to update the Zookeeper node information.

3. The in-use workId pool is checked periodically. If a workId's timestamp has not been updated for longer than the maximum allowed time, the workId is removed from the in-use pool and put back into the unused pool, so that the workId can be recycled (see the sketch after this list).

4. If the server that is using a workId finds that it has failed to report its timestamp for longer than the maximum allowed time, it stops its ID generation service to prevent the workId from being reused by another server, which would lead to duplicate IDs.
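A minimal sketch of the reclaim check in step 3, assuming Curator is used and that each node stores the last reported timestamp as its data; the actual RecyclableZookeeperHolder differs in its details:

// Sketch only: move workIds whose heartbeat is older than maxStaleMillis from "inuse" back to "notuse"
// (uses org.apache.curator.framework.CuratorFramework; paths and node data format are assumptions)
void recycleStaleWorkIds(CuratorFramework client, long maxStaleMillis) throws Exception {
    String inusePath = "/snowflake/leaf/recycle/inuse";
    String notusePath = "/snowflake/leaf/recycle/notuse";
    for (String workId : client.getChildren().forPath(inusePath)) {
        byte[] data = client.getData().forPath(inusePath + "/" + workId);
        long lastReported = Long.parseLong(new String(data));                 // assumed: the node stores the last reported timestamp
        if (System.currentTimeMillis() - lastReported > maxStaleMillis) {
            client.delete().forPath(inusePath + "/" + workId);                // remove from the in-use pool
            client.create().forPath(notusePath + "/" + workId, new byte[0]);  // put back into the unused pool for reuse
        }
    }
}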

2. For issue #100 of the original Leaf project, added mysql mode (the registry is MySQL)

Question details:

Issue #100: How do I use a non-ZK registry?

Solution:

The mysql mode was developed: the registry is MySQL, and each IP:Port is assigned a fixed workId.

How do I use the mysql mode?

You first need to create the leaf_workerid_alloc table in your database (the SQL script is included in the project), and then add the following configuration to leaf/leaf-server/src/main/resources/leaf.properties:

# Enable the Snowflake service
leaf.snowflake.enable=true
# The port of the Leaf service, used to generate the workId
leaf.snowflake.port=
# Set the snowflake mode to mysql: the registry is MySQL and the workerId is fixed
leaf.snowflake.mode=mysql
# MySQL database connection settings
leaf.jdbc.url=
leaf.jdbc.username=
leaf.jdbc.password=

Start LeafServerApplication and call /api/snowflake/get/test to obtain a distributed ID generated in this mode.

curl domain/api/snowflake/get/test
1256557484213448722

Implementation principle

With the above configuration, the holder used by SnowflakeIDGenImpl is a SnowflakeMySQLHolder instance. The implementation principle is similar to the default mode of the original Leaf project, which uses Zookeeper as the registry and assigns a fixed workId to each IP:Port; the difference is that registration, obtaining the workId, and updating the timestamp interact with MySQL instead of Zookeeper.

if (mode.equals(SnowflakeMode.MYSQL)) {
    // The registry is MySQL
    DruidDataSource dataSource = new DruidDataSource();
    dataSource.setUrl(properties.getProperty(Constants.LEAF_JDBC_URL));
    dataSource.setUsername(properties.getProperty(Constants.LEAF_JDBC_USERNAME));
    dataSource.setPassword(properties.getProperty(Constants.LEAF_JDBC_PASSWORD));
    dataSource.init();
    // Config Dao
    WorkerIdAllocDao dao = new WorkerIdAllocDaoImpl(dataSource);
    SnowflakeMySQLHolder holder = new SnowflakeMySQLHolder(Utils.getIp(), port, dao);
    idGen = new SnowflakeIDGenImpl(holder);
    if (idGen.init()) {
        logger.info("Snowflake Service Init Successfully in mode " + mode);
    } else {
        throw new InitException("Snowflake Service Init Fail");
    }
}

3. For issue #100 of the original Leaf project, added local mode (the registry is the project's local configuration)

Question details:

Issue #100: How do I use a non-ZK registry?

Solution:

The local mode was developed for cases where the IP and Port of the Leaf service are essentially fixed. In the configuration file leaf.properties of the Leaf project, you explicitly configure which workId each IP:Port corresponds to, and add a new entry whenever a new IP:Port is added. At startup, the project reads the configuration from leaf.properties, writes it to the local cache file workerID.json, and reads workerID.json directly on the next startup. The maximum timestamp is also synchronized to the workerID.json cache file on the machine each time.
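A minimal sketch of what this mode does at startup, assuming the workIdMap is the JSON map shown in the configuration example below and using Jackson for parsing; the real implementation may differ:

// Sketch only: pick the workId configured for this instance and cache it locally
// (uses Jackson's ObjectMapper/TypeReference and java.nio.file.Files/Paths)
int resolveLocalWorkId(String workIdMapJson, String ip, int port) throws Exception {
    ObjectMapper mapper = new ObjectMapper();
    Map<String, Integer> workIdMap =
            mapper.readValue(workIdMapJson, new TypeReference<Map<String, Integer>>() {});
    Integer workId = workIdMap.get(ip + ":" + port);
    if (workId == null) {
        throw new IllegalStateException("no workId configured for " + ip + ":" + port);
    }
    // Cache the resolved configuration (and, later, the max timestamp) so the next startup can read workerID.json directly
    Files.write(Paths.get("workerID.json"), mapper.writeValueAsBytes(workIdMap));
    return workId;
}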

How do I use the local mode?

Add the following configuration to leaf/leaf-server/src/main/resources/leaf.properties:

# Enable the Snowflake service
leaf.snowflake.enable=true
# The port of the Leaf service, used to generate the workId
leaf.snowflake.port=
# The registry is the local configuration
leaf.snowflake.mode=local
leaf.snowflake.local.workIdMap=
# The workIdMap format is {"Leaf service IP:Port": fixed workId}, for example: {"10.1.46.33:8080": 1, "10.1.46.33:8081": 2}

Start LeafServerApplication and call /api/snowflake/get/test to obtain a distributed ID generated in this mode.

curl domain/api/snowflake/get/test
1256557484213448722

4. For issue #84 of the original Leaf project, fixed a clock rollback problem at startup

Question details:

Issue #84: In the default mode (which we temporarily call zk_normal mode) the registry is Zookeeper and the workId cannot be reused; the workflow of this mode has been described above. If the connection to Zookeeper fails when the Leaf service starts, the workId is read from the local cache file workerID.properties and used. However, because only the workId is stored in workerID.properties and the maximum timestamp reported last time is not stored, no timestamp check can be performed. So if the machine's current time has been changed to an earlier time, the generated IDs may be duplicated.

Solution:

The maximum reported timestamp (maxTimestamp) is now also written to the workerID.properties cache file, and a timestamp check is performed when the cached workId is used: if the current system timestamp is smaller than the timestamp in the cache, an exception is thrown; otherwise the cached workerID is used.

// Failed to connect, use the workerID in the local workerID.properties and verify the timestamp
try {
    Properties properties = new Properties();
    properties.load(new FileInputStream(new File(PROP_PATH.replace("{port}", port + ""))));
    Long maxTimestamp = Long.valueOf(properties.getProperty("maxTimestamp"));
    if (maxTimestamp != null && System.currentTimeMillis() < maxTimestamp) {
        throw new CheckLastTimeException("init timestamp check error,forever node timestamp gt this node time");
    }
    workerID = Integer.valueOf(properties.getProperty("workerID"));
    LOGGER.warn("START FAILED ,use local node file properties workerID-{}", workerID);
} catch (Exception e1) {
    LOGGER.error("Read file error ", e1);
    return false;
}

// The scheduled task executes the updateNewData() method every 3s and calls updateLocalWorkerID() to update the cache file workerID.properties
void updateNewData(CuratorFramework curator, String path) {
    try {
        if (System.currentTimeMillis() < lastUpdateTime) {
            return;
        }
        curator.setData().forPath(path, buildData().getBytes());
        updateLocalWorkerID(workerID);
        lastUpdateTime = System.currentTimeMillis();
    } catch (Exception e) {
        LOGGER.info("update init data error path is {} error is {}", path, e);
    }
}

5. For issue #106 of the original Leaf project, fixed the problem where a time difference that is too large exceeds 41 bits and causes the generated ID to overflow

Question details:

The Leaf framework uses Snowflake's bit allocation of a time difference of at most 41 bits + 10-bit workId + 12-bit sequence number, and Snowflake requires the first (sign) bit to be 0, otherwise the generated ID is negative after conversion to decimal. However, the original Leaf project performs no check on the time difference. If the timestamp is too large, or the custom twepoch is set too small, the computed time difference becomes too large; once it exceeds 41 bits in binary and its leading bit is 1, the generated long ID will be negative. For example, when timestamp = twepoch + 2199023255552L, the time difference timestamp - twepoch equals 2199023255552, which in binary is a 1 followed by 41 zeros. The sign bit of the generated ID is therefore 1, and the ID comes out as the negative number -9223372036854775793.

 long id = ((timestamp - twepoch) << timestampLeftShift) | (workerId << workerIdShift) | sequence;

Solution:

// First, compute maxTimeStamp during initialization
this.maxTimeStamp = ~(-1L << timeStampBits) + twepoch;
// Then verify the timestamp when generating an ID
if (timestamp > maxTimeStamp) {
    throw new OverMaxTimeStampException("current timestamp is over maxTimeStamp, the generate id will be negative");
}
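Here ~(-1L << timeStampBits) is the largest value that fits in timeStampBits bits (2^41 - 1 when timeStampBits is 41), so maxTimeStamp is the last timestamp whose time difference still fits in 41 bits; any timestamp beyond it would push the time difference into the sign bit.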

Improvements related to Segment distributed ID generation

1. For issue #68 of the original Leaf project, optimized the SegmentIDGenImpl.updateCacheFromDb() method

In response to the optimization suggestion in issue #68, the process of synchronizing the SegmentBuffer cache with the data in the DB was further optimized, mainly in the SegmentIDGenImpl.updateCacheFromDb() method.

The original workflow was:

1. Traverse cacheTags and remove each element from insertTagsSet (a copy of dbTags), so that insertTagsSet ends up containing only the tags newly added to the DB.
2. Traverse insertTagsSet and add these new tags to the cache.
3. Traverse dbTags and remove each element from removeTagsSet (a copy of cacheTags), so that removeTagsSet ends up containing only the tags that have expired in the cache.
4. Traverse removeTagsSet and remove these expired tags from the cache.

In other words, two HashSets are used to hold the tags newly added to the DB and the tags that have expired in the cache, and every tag currently in use has to be deleted from a set twice (once from each copy).

The original scheme code is as follows:

    List<String> dbTags = dao.getAllTags();
    if (dbTags == null || dbTags.isEmpty()) {
        return;
    }
    List<String> cacheTags = new ArrayList<String>(cache.keySet());
    Set<String> insertTagsSet = new HashSet<>(dbTags);
    Set<String> removeTagsSet = new HashSet<>(cacheTags);
    // Add tags newly added to the db to the cache
    for (int i = 0; i < cacheTags.size(); i++) {
        String tmp = cacheTags.get(i);
        if (insertTagsSet.contains(tmp)) {
            insertTagsSet.remove(tmp);
        }
    }
    for (String tag : insertTagsSet) {
        SegmentBuffer buffer = new SegmentBuffer();
        buffer.setKey(tag);
        Segment segment = buffer.getCurrent();
        segment.setValue(new AtomicLong(0));
        segment.setMax(0);
        segment.setStep(0);
        cache.put(tag, buffer);
        logger.info("Add tag {} from db to IdCache, SegmentBuffer {}", tag, buffer);
    }
    // Invalid tags in the cache are removed from the cache
    for (int i = 0; i < dbTags.size(); i++) {
        String tmp = dbTags.get(i);
        if (removeTagsSet.contains(tmp)) {
            removeTagsSet.remove(tmp);
        }
    }
    for (String tag : removeTagsSet) {
        cache.remove(tag);
        logger.info("Remove tag {} from IdCache", tag);
    }

In fact, these intermediate steps are unnecessary. The new workflow is: traverse dbTags and check whether each key already exists in the cache; if not, add it to the cache. Then traverse cacheTags and check whether each key exists in the dbTags set; if not, remove it from the cache.

The new scheme code:

    List<String> dbTags = dao.getAllTags();
    if (dbTags == null || dbTags.isEmpty()) {
        return;
    }
    // Add new tags to the cache: traverse dbTags and check whether each exists in the cache; if not, add it
    for (String dbTag : dbTags) {
        if (cache.containsKey(dbTag) == false) {
            SegmentBuffer buffer = new SegmentBuffer();
            buffer.setKey(dbTag);
            Segment segment = buffer.getCurrent();
            segment.setValue(new AtomicLong(0));
            segment.setMax(0);
            segment.setStep(0);
            cache.put(dbTag, buffer);
            logger.info("Add tag {} from db to IdCache, SegmentBuffer {}", dbTag, buffer);
        }
    }
    List<String> cacheTags = new ArrayList<String>(cache.keySet());
    Set<String> dbTagSet = new HashSet<>(dbTags);
    // Remove invalid tags from the cache: traverse cacheTags and check whether each exists in dbTagSet; if not, the tag has expired
    for (String cacheTag : cacheTags) {
        if (dbTagSet.contains(cacheTag) == false) {
            cache.remove(cacheTag);
            logger.info("Remove tag {} from IdCache", cacheTag);
        }
    }

Comparison of the two schemes:

  • Compared with the original scheme, which requires two HashSets, this scheme needs only one HashSet, so the space complexity is lower.
  • The total number of traversals is smaller than before, so the time complexity is lower: new and expired tags are handled directly as they are found, there is no separate follow-up traversal, and there is no longer any need to delete the cache/dbTags intersection from copies of the sets.
  • The original scheme used 4 for loops and about 35 lines of code; the current scheme uses 2 for loops and about 25 lines of code, which is more concise.

2. For issue #88 of the original Leaf project, replaced the modulo operation with the bitwise & operation

This update is in response to issue #88: the bitwise & operation is used instead of the modulo % operation for more efficient execution. The original code:

public int nextPos() {
        return (currentPos + 1) % 2;
}

The new code:

public int nextPos() {
        return (currentPos + 1) & 1;
}
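Since currentPos only ever holds 0 or 1, (currentPos + 1) & 1 yields exactly the same result as (currentPos + 1) % 2, and the bitwise AND avoids the more expensive division/modulo instruction.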

Reference article

tech.meituan.com/2017/04/21/…
