I. Background

The dual-buffer distributed ID generator covers most scenarios, but the pattern has three disadvantages:

  • Strong dependency on the business database
  • A restart wastes sequence segments
  • It cannot scale out or switch over under instantaneous burst traffic

The first two problems are not serious, but an ID generator that cannot cope with sudden traffic has a fatal flaw that is hard to accept. We therefore need to design an ID generator that can handle burst traffic.

II. Distributed ID generator based on centralization

Modern application architecture promotes distributed multi-machine deployment, and by default the nodes in a cluster cannot communicate with one another. The currently popular Snowflake-algorithm ID generator is a standalone component, so under sufficient concurrency different nodes will generate duplicate IDs at the same moment. Let us first propose two concepts and consider one question:

  • Centralization and distribution (also known as decentralization)
  • Why distributed

Centralization can be crudely understood as single-point deployment: all services and functions share the same configuration items and second-party and third-party capabilities. Distributed deployment means splitting a large node into multiple independent units according to the business model; its adoption and promotion is essentially about separating business modules and spreading traffic across business units. In fact, distribution cannot exist independently of centralization. For example, RPC services must register with a centralized node for routing and scheduling.

Back to our topic: the traditional Snowflake algorithm cannot by itself resolve ID conflicts in a distributed multi-machine deployment, because no node in the cluster can determine its own unique identity within the cluster. If, say, the data-center and machine bits computed by the standalone algorithm are identical on two nodes, then two requests arriving at those two nodes at the same instant will produce conflicting IDs. How do we solve this? The answer is to introduce centralization: when a cluster node starts, it communicates with the centralized node, obtains its unique identity, and fills it into the machine bits of the Snowflake algorithm.

The upgraded Snowflake ID generator supports the following capabilities and design ideas:

  1. Supports two kinds of centralized nodes, DB and Redis (ZooKeeper can be considered later).
  2. The centralized node guarantees that each machine in the distributed cluster has a unique identity, which in turn guarantees that the Snowflake machine ID is globally unique.
  3. The open-closed principle is followed: the framework's implementation is closed to modification, while the centralized mode is open to extension.
  4. On application startup, the centralized node determines each machine's unique identity (Redis uses a Lua script to INCR a key with a timeout; DB uses an auto-increment primary key).

  5. When the application shuts down, a hook method reclaims the centralized node's resources (the Redis key carries a timeout and needs no handling; for DB, the record inserted at startup is deleted by its primary-key ID).

Two details deserve explanation:

  1. The machine ID of the original Snowflake algorithm occupies the range 0~31. To ensure that the machine ID generated from the central node stays in this range (whether Redis or DB, the counter will exceed 31 after enough startups), the ID each machine obtains from the central node is taken modulo the machine-ID bound, so it always falls within 0~31 (see the sketch after this list).
  2. Why reclaim the central node's resources? Redis uses only a single key, so this is not much of an issue, but in DB mode every machine startup generates a sequence record; with enough startups over a long project lifetime, the sequence table accumulates a large number of records. They do no harm but occupy database space, and they are disposable, used only at machine startup. You can use Runtime.addShutdownHook or Spring's DisposableBean interface, which only take effect when the application is shut down or restarted normally. If the process exits abnormally (kill -9, a JVM crash, etc.), the hook method is not executed and the central node's resource is not reclaimed, but this is a minor problem: abnormal exits are rare, and the DB sequence table merely keeps a few leftover records.
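To make the modulo mapping concrete, here is a minimal sketch; the class and constant names are illustrative only and do not come from the original code:

public class WorkerIdMapping {

    // 5 machine bits; this bound matches workerIdCount in the code below
    private static final long MAX_WORKER_ID = (1L << 5) - 1; // 31

    // Maps an ever-growing central-node sequence into the machine-bit range.
    // Note: once startups exceed the bound, two live machines could map to the
    // same worker ID; the design accepts this residual risk.
    static long toWorkerId(long centralSeq) {
        return centralSeq % MAX_WORKER_ID;
    }

    public static void main(String[] args) {
        System.out.println(toWorkerId(7L));  // 7
        System.out.println(toWorkerId(40L)); // 9, i.e. 40 % 31
    }
}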

III. Implementation of the core design

The core design idea is to let users enable the generator through an annotation, choose the centralized mode via the annotation's configuration (with the corresponding bean injected accordingly), and leave an extension point for users to implement the data interaction with the centralized node themselves. This is the open-closed principle of software design, applied extensively in Spring-related components.

Sequence diagram of enabling the generator: (figure omitted)

Sequence diagram of generator initialization and destruction: (figure omitted)

1. EnableIdWorker annotation

@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Import(IdWorkerConfigurationSelector.class)
public @interface EnableIdWorker {

    CenterModel mode() default CenterModel.DB;

}
Marks the generator as enabled. The mode property selects the centralized node mode; DB and Redis are supported, with DB as the default.
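The CenterModel enum is not shown in the original; a minimal sketch consistent with the two supported modes might look like this:

/** Centralized-node mode selectable via @EnableIdWorker (assumed definition). */
public enum CenterModel {
    /** Machine IDs allocated through a database auto-increment primary key. */
    DB,
    /** Machine IDs allocated through a Redis INCR with a timeout. */
    REDIS
}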

2. IdWorkerConfigurationSelector selector

@Slf4j
public class IdWorkerConfigurationSelector implements ImportSelector {

    public static final String DEFAULT_ADVICE_MODE_ATTRIBUTE_NAME = "mode";

    @Override
    public String[] selectImports(AnnotationMetadata importingClassMetadata) {
        Class<?> annType = EnableIdWorker.class;
        Assert.state(annType != null, "Unresolvable type argument for IdWorkerConfigurationSelector");
        AnnotationAttributes attributes = AnnotationAttributes.fromMap(
                importingClassMetadata.getAnnotationAttributes(annType.getName(), false));
        if (attributes == null) {
            throw new IllegalArgumentException(String.format(
                    "@%s is not present on importing class '%s' as expected",
                    annType.getSimpleName(), importingClassMetadata.getClassName()));
        }
        CenterModel centerModel = attributes.getEnum(getModeAttributeName());
        switch (centerModel) {
            case DB:
                log.info("IdWorkerConfigurationSelector.selectImports use DBIdWorker........");
                return new String[] {DBIdWorkerConfig.class.getName()};
            case REDIS:
                log.info("IdWorkerConfigurationSelector.selectImports use RedisIdWorker........");
                return new String[] {RedisIdWorkerConfig.class.getName()};
            default:
                return null;
        }
    }

    protected String getModeAttributeName() {
        return DEFAULT_ADVICE_MODE_ATTRIBUTE_NAME;
    }
}
The mode attribute of the @EnableIdWorker annotation determines which centralized configuration class is imported.

3. Centralized configuration classes

@Role(BeanDefinition.ROLE_INFRASTRUCTURE)
@Slf4j
public class DBIdWorkerConfig {

    @Bean("dbIdWorker")
    public DBIdWorker dbIdWorker() {
        DBIdWorker idWorker = new DBIdWorker();
        log.info("DBIdWorkerConfig.dbIdWorker init success....");
        return idWorker;
    }
}
If DB centralization is selected in step 2, the DBIdWorker generator is registered as a bean; the corresponding Redis configuration class works the same way.
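By analogy, the Redis configuration class (not shown in the original) presumably mirrors the DB one:

@Role(BeanDefinition.ROLE_INFRASTRUCTURE)
@Slf4j
public class RedisIdWorkerConfig {

    @Bean("redisIdWorker")
    public RedisIdWorker redisIdWorker() {
        RedisIdWorker idWorker = new RedisIdWorker();
        log.info("RedisIdWorkerConfig.redisIdWorker init success....");
        return idWorker;
    }
}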

4. Snowflake algorithm abstraction

@Slf4j
public abstract class AbstractIdWorker implements InitializingBean, DisposableBean {

    /** Start timestamp (2020-01-01); subtracted from the current timestamp to compute the offset. */
    private final long startTime = 1577808000000L;

    /** Number of bits for the machine ID. */
    private final long workerIdBits = 5L;

    /** Machine-ID bound derived from the bit width (31). */
    protected final long workerIdCount = (1 << workerIdBits) - 1;

    /** ID of the working machine (0~31). */
    private long workerId;

    /**
     * Obtain the machine ID from the centralized node.
     * @return the machine ID
     */
    protected abstract long getWorkerId();

    /**
     * Get the next ID (this method is thread-safe).
     * @return a Snowflake ID
     */
    public synchronized long nextId() {
        // ...
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        this.workerId = this.getWorkerId();
        log.info("AbstractIdWorker.afterPropertiesSet init workerId success; workerId={}", this.workerId);
    }

    @Override
    public void destroy() throws Exception {
        this.doDestroyWorkerId();
    }

    /** Reclaim the centralized resource when the application shuts down. */
    protected abstract void doDestroyWorkerId();
}
Following the "borrow a hen to lay eggs" idea, we build on the mature Snowflake algorithm: the InitializingBean interface is used to interact with the central node at application startup and compute the machine's unique identity, and the DisposableBean interface reclaims the central node's resource when the application is shut down or restarted.
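The body of nextId() is elided above because it is the unmodified, mature Snowflake algorithm. For completeness, a standard sketch of what the method typically looks like inside AbstractIdWorker is shown below; the sequence-related field names are assumptions, not taken from the original:

    // Standard Snowflake layout: 41 bits timestamp offset, 5 bits machine ID,
    // 12 bits per-millisecond sequence (field names assumed).
    private final long sequenceBits = 12L;
    private final long sequenceMask = (1L << sequenceBits) - 1; // 4095
    private final long workerIdShift = sequenceBits;            // 12
    private final long timestampShift = sequenceBits + 5L;      // 17
    private long sequence = 0L;
    private long lastTimestamp = -1L;

    public synchronized long nextId() {
        long timestamp = System.currentTimeMillis();
        if (timestamp < lastTimestamp) {
            throw new IllegalStateException("Clock moved backwards, refusing to generate id");
        }
        if (timestamp == lastTimestamp) {
            sequence = (sequence + 1) & sequenceMask;
            if (sequence == 0) {
                // Sequence exhausted within this millisecond; spin until the next one
                while ((timestamp = System.currentTimeMillis()) <= lastTimestamp) { }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = timestamp;
        return ((timestamp - startTime) << timestampShift)
                | (workerId << workerIdShift)
                | sequence;
    }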

5. Implementation

public class DBIdWorker extends AbstractIdWorker {

    private long seq;

    @Autowired
    @Qualifier("dbSequenceManager")
    private ISequenceManager iSequenceManager;

    @Override
    protected long getWorkerId() {
        this.seq = this.iSequenceManager.borrowSeq(null);
        return (this.seq % this.workerIdCount);
    }

    @Override
    protected void doDestroyWorkerId() {
        this.iSequenceManager.returnSeq(this.seq);
    }
}
When the application starts, a sequence record is inserted and the DB's auto-increment primary key is returned; taking that primary key modulo the machine-bit bound yields the machine's unique identity in the cluster. The primary key value returned by the DB is stored, and the sequence record is deleted when the application shuts down. The code above belongs to the framework layer and does not need to be modified by users; an extension point is left for the user's own implementation. ISequenceManager is the abstract definition of a sequence manager, and its implementation differs per centralized mode: in DB mode the implementation class must be registered in the Spring container under the name dbSequenceManager, while in Redis mode it is named redisSequenceManager.
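The ISequenceManager interface itself is not shown in the original; judging from the two implementations in the next section, its shape is presumably along these lines:

/**
 * Abstract definition of a sequence manager (assumed shape).
 * K is the type of the borrow/return key: Long for DB (the primary key),
 * String for Redis (the counter key).
 */
public interface ISequenceManager<K> {

    /** Borrow a sequence value from the centralized node at startup. */
    long borrowSeq(K key);

    /** Return/reclaim the sequence resource at shutdown. */
    void returnSeq(K key);
}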

IV. Usage

  1. Introduce the generator dependency

Add the generator's Maven dependency to your pom.xml (the coordinates are omitted here).

  2. Add the annotation to the startup class

    @EnableIdWorker(mode = xxx)

  3. Implement the sequence manager

For example, DB mode:
@Component("dbSequenceManager")
@Slf4j
public class DbSequenceManager implements ISequenceManager<Long> {
    @Autowired
    private IdWorkerSeqMapper idWorkerSeqMapper;
    @Override
    public long borrowSeq(Long seq) {
        String hostIp = NetUtil.getHostIp();
        IdWorkerSeqDO seqDO = new IdWorkerSeqDO();
        seqDO.setHostId(hostIp);
        seqDO.setCreateTime(new Date());
        this.idWorkerSeqMapper.insert(seqDO);
        log.info("DbSequenceManager.borrowSeq success,seqId={}",seqDO.getId());
        return seqDO.getId();
    }
    @Override
    public void returnSeq(Long seq) {
        log.info("DbSequenceManager.returnSeq,seqId={}",seq);
        this.idWorkerSeqMapper.deleteByPrimaryKey(seq); }}Copy the code
Redis mode:
@Component("redisSequenceManager")
@Slf4j
public class RedisSequenceManager extends AbstractRedisSequenceManager {
    protected RedisSerializer keySerializer = new StringRedisSerializer();
    protected RedisSerializer valueSerializer = new Jackson2JsonRedisSerializer(Object.class);
    protected static final long expireTimes = 120 * 60L;
    @Autowired
    @Qualifier("redisMaster")
    protected StringRedisTemplate stringRedisTemplate;
    @Override
    public long borrowSeq(String seqKey) {
        DefaultRedisScript<Long> redisScript = new DefaultRedisScript<>(LUA_BORROW_SCRIPT, Long.class);
        List<String> keys = new ArrayList<>(2);
        keys.add(seqKey);
        Long seq = this.stringRedisTemplate.execute(redisScript
                , this.valueSerializer
                , this.keySerializer
                , keys
                , expireTimes);
        log.info("RedisSequenceManager.borrowSeq success,seqId={}", seq);
        return seq;
    }
    @Override
    public void returnSeq(String seqKey) {
        log.info("RedisSequenceManager.returnSeq,seqKey={}", seqKey); }}Copy the code
  4. Use the ID generator

Inject the corresponding generator into the business class according to the selected centralized mode:

@Autowired
private DBIdWorker dbIdWorker;

or

@Autowired
private RedisIdWorker redisIdWorker;
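A business method can then simply ask the injected generator for the next ID, for example:

    // Generate a globally unique ID for a new business record
    long orderId = this.dbIdWorker.nextId();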

V. Conclusion

1. Advantages

It solves the problem that the dual-buffer mode cannot cope with instantaneous burst traffic.

2. Shortcomings

The generated ID sequence has poor business readability: it is not as intuitive as a format like yyyyMMdd + ID + userId, which also carries the "genes" needed for database and table sharding.

Dependency on centralized nodes

In fact, we can upgrade it further. To make a distributed ID generator globally unique, the essence is to make the machine ID unique; the other bit segments can be customized as needed.