Introduction to Butterfly

Butterfly is an ultra-high-performance ID generator framework. By introducing several new schemes, it addresses the known problems of the Snowflake algorithm while also delivering higher performance. The theoretical QPS of the stand-alone version is 51.2 w/s (w = 10,000, so about 512,000 IDs per second), and the new scheme can reach 1200 w/s (about 12 million per second) or higher on some machines. The name comes from the saying that no two butterfly wings in the world are identical, which reflects the uniqueness the algorithm is meant to guarantee. Snowflake is a distributed ID generation algorithm proposed by Twitter, but it has three problems, the first two of which are widely recognized in the industry:

  • Clock rollback problem
  • Machine ID allocation and reclamation
  • Upper limit of machine ID

The industry has its own solutions to the first two problems, but they are imperfect or incomplete. Here we start from new ideas and, by modifying the Snowflake algorithm and combining it with related techniques, aim to solve all three problems. We consider this scheme a more complete way to implement the Snowflake algorithm. For details, please refer to the scheme introduction.

The new scheme

For each of the three problems above, we outline our scheme here.

  1. Clock rollback problem

Here we adopt a new scheme. The general idea: the starting timestamp is a "historical time" rather than the current clock reading. Each request only increments the sequence value; when the sequence value reaches its maximum, the "historical time" increases by 1 and the sequence starts over.

  2. Machine ID allocation and reclamation

There are two schemes for machine ID allocation and reclamation: ZooKeeper and DB. For allocation, the ZooKeeper scheme uses hashing, with expansion as machines are added, while the DB scheme uses a lookup mechanism. For reclamation, ZooKeeper uses a persistent node that stores the next expiration time, and the client periodically refreshes that time (a heartbeat). The DB scheme adds an expiration-time field and checks it during lookup.

  3. Upper limit of machine ID

The server uses the modified Snowflake together with the ZooKeeper ID allocation scheme, and advances the timestamp by 1 for each request. The client uses a double buffer with asynchronous fetching to improve performance.
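The "historical time" idea for the clock rollback problem can be sketched as follows. This is a minimal illustration, not Butterfly's actual implementation: the class name, the 9-bit sequence, the 13-bit workerId, and the field order (time, then sequence, then workerId) are all assumptions chosen for the example. The point it shows is that the generator never reads the system clock after it starts, so a clock rollback has nothing to affect within one running instance. Persisting the logical time across restarts is out of scope here.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch only: bit widths and field order are assumptions, not Butterfly's actual layout. */
final class LogicalTimeIdSketch {
    static final int SEQUENCE_BITS = 9;   // assumed: 2^9 = 512 IDs per time unit
    static final int WORKER_BITS = 13;    // assumed width of the workerId field

    private final long workerId;
    // Holds (logicalTime << SEQUENCE_BITS) | sequence in a single counter, so an
    // increment that overflows the sequence carries into the logical time.
    private final AtomicLong timeAndSequence;

    LogicalTimeIdSketch(long workerId, long startTime) {
        if (workerId < 0 || workerId >= (1L << WORKER_BITS)) {
            throw new IllegalArgumentException("workerId out of range: " + workerId);
        }
        this.workerId = workerId;
        this.timeAndSequence = new AtomicLong(startTime << SEQUENCE_BITS);
    }

    /** Each request only bumps the counter; the system clock is never consulted. */
    long nextId() {
        return (timeAndSequence.getAndIncrement() << WORKER_BITS) | workerId;
    }

    static long timeOf(long id)     { return id >>> (SEQUENCE_BITS + WORKER_BITS); }
    static long sequenceOf(long id) { return (id >>> WORKER_BITS) & ((1L << SEQUENCE_BITS) - 1); }
    static long workerOf(long id)   { return id & ((1L << WORKER_BITS) - 1); }

    public static void main(String[] args) {
        LogicalTimeIdSketch generator = new LogicalTimeIdSketch(5, 100);
        for (int i = 0; i < 514; i++) {
            long id = generator.nextId();
            if (i < 2 || i >= 511) {
                // Print the first IDs and the ones around the sequence overflow.
                System.out.printf("time=%d sequence=%d worker=%d%n",
                        timeOf(id), sequenceOf(id), workerOf(id));
            }
        }
    }
}
```

Packing the logical time and the sequence into one atomic counter means the "sequence is full, so time increases by 1" step happens as an ordinary arithmetic carry, with no locking.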

Framework characteristics

  • Globally unique: the most basic requirement; currently, IDs are unique per business.
  • Ultra-high performance: pure in-memory operation. Time reservation is used to solve the clock rollback problem, which also allows a higher QPS: theoretically 51.2 w/s or more (the upper limit varies by machine; the author's own laptop reaches 1200 w/s).
  • Trend-increasing: IDs increase overall, which helps performance when they are stored in an index structure such as MySQL's B+ tree.
  • Information security: the auto-incrementing part sits in the high bits and IDs are not fully consecutive, which hinders malicious external crawling of data.
  • Ease of use: integration is very simple.
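The 51.2 w/s figure is consistent with a Snowflake-style layout that has a 9-bit sequence and a millisecond time unit. Both values are assumptions here, since the text does not state the bit layout; the helper below is a hypothetical illustration of the arithmetic only.

```java
/** Sketch only: the 9-bit sequence and millisecond time unit are assumptions. */
final class ThroughputSketch {
    /** IDs available per second if every time unit can carry a full sequence range. */
    static long idsPerSecond(int sequenceBits, long timeUnitsPerSecond) {
        return (1L << sequenceBits) * timeUnitsPerSecond;
    }

    public static void main(String[] args) {
        long perSecond = idsPerSecond(9, 1000);
        System.out.println(perSecond + " IDs/s, i.e. " + perSecond / 10_000.0 + " w/s");
    }
}
```

Because the timestamp is reserved rather than read from the clock, a generator can in principle run ahead of wall-clock time, which would explain how measured rates can exceed this theoretical figure.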

Quick start

There are three ways to allocate the machine id:

  • (Single-server version) ZooKeeper assigns a workerId
  • (Single-server version) DB allocates the workerId
  • (Distributed version) The distribute module allocates the workerId

Butterfly is now published to the Maven Central repository.

ZooKeeper allocation of workerId

<dependency>
  <groupId>com.github.simonalong</groupId>
  <artifactId>butterfly-zookeeper-allocator</artifactId>
  <!-- replace with a specific version number -->
  <version>${last.version.release}</version>
</dependency>

Usage example

@Test
public void test() {
    ZkButterflyConfig config = new ZkButterflyConfig();
    config.setHost("localhost:2181");

    ButterflyIdGenerator generator = ButterflyIdGenerator.getInstance(config);
    // Set the start time; if not set, it defaults to February 22, 2020
    generator.setStartTime(2020, 5, 1, 0, 0, 0);

    // Add business namespaces; each is registered if it does not already exist
    generator.addNamespaces("test1", "test2");
    Long uuid = generator.getUUid("test1");
    System.out.println(uuid);
}

DB allocation of workerId

<dependency>
  <groupId>com.github.simonalong</groupId>
  <artifactId>butterfly-allocator-db</artifactId>
  <!-- replace with a specific version number -->
  <version>${last.version.release}</version>
</dependency>

Usage example

Create the table in the corresponding shared database

CREATE TABLE `butterfly_uuid_generator` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key id',
  `namespace` varchar(128) DEFAULT ' ' COMMENT 'Namespace',
  `work_id` int(16) COMMENT 'job id',
  `last_expire_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Next expiry time',
  `uid` varchar(128) DEFAULT '0' COMMENT 'Unique ID of this startup',
  `ip` varchar(20) NOT NULL DEFAULT '0' COMMENT 'ip',
  `process_id` varchar(128) NOT NULL DEFAULT '0' COMMENT 'process id',
  PRIMARY KEY (`id`),
  UNIQUE KEY `idx_name_work` (`namespace`,`work_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='ID generator table';
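The role of the `last_expire_time` column can be illustrated with an in-memory sketch of the lookup-and-heartbeat idea described earlier. It is a simplified model under stated assumptions, not the library's code: the class, method names, and lease length are hypothetical, a map stands in for the table rows of one namespace, and a real implementation would need the database to make the claim atomic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

/** Sketch only: an in-memory stand-in for the work_id / last_expire_time columns. */
final class WorkerLeaseSketch {
    private final Map<Integer, Long> expireAtByWorkId = new HashMap<>();
    private final int maxWorkers;
    private final long leaseMillis;

    WorkerLeaseSketch(int maxWorkers, long leaseMillis) {
        this.maxWorkers = maxWorkers;
        this.leaseMillis = leaseMillis;
    }

    /** Claims the lowest work_id that was never used or whose lease has expired. */
    synchronized OptionalInt allocate(long nowMillis) {
        for (int workId = 0; workId < maxWorkers; workId++) {
            Long expireAt = expireAtByWorkId.get(workId);
            if (expireAt == null || expireAt <= nowMillis) {
                expireAtByWorkId.put(workId, nowMillis + leaseMillis);
                return OptionalInt.of(workId);
            }
        }
        return OptionalInt.empty(); // every work_id is held by a live owner
    }

    /** Extends the lease; returns false if it is unknown or already expired. */
    synchronized boolean heartbeat(int workId, long nowMillis) {
        Long expireAt = expireAtByWorkId.get(workId);
        if (expireAt == null || expireAt <= nowMillis) {
            return false;
        }
        expireAtByWorkId.put(workId, nowMillis + leaseMillis);
        return true;
    }

    public static void main(String[] args) {
        WorkerLeaseSketch leases = new WorkerLeaseSketch(2, 1000);
        System.out.println(leases.allocate(0));      // first free work_id
        System.out.println(leases.allocate(0));      // next free work_id
        System.out.println(leases.allocate(500));    // none free yet
        System.out.println(leases.heartbeat(0, 900)); // owner of 0 is still alive
        System.out.println(leases.allocate(1000));   // work_id 1 expired and is reused
    }
}
```

A machine that stops sending heartbeats simply lets its lease lapse, so its work_id becomes available to the next lookup without any explicit release step.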

Write the DB test

@Test
public void test() {
    DbButterflyConfig config = new DbButterflyConfig();
    config.setUrl("jdbc:mysql://127.0.0.1:3306/neo?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowPublicKeyRetrieval=true");
    config.setUserName("neo_test");
    config.setPassword("neo@Test123");

    ButterflyIdGenerator generator = ButterflyIdGenerator.getInstance(config);
    // Set the start time; if not set, it defaults to February 22, 2020
    generator.setStartTime(2020, 5, 1, 0, 0, 0);

    // Add business namespaces; each is registered if it does not already exist
    generator.addNamespaces("test1", "test2");
    Long uuid = generator.getUUid("test1");
    System.out.println(uuid);
}

Distributed mode

Distributed mode has a single client, but two server implementations are provided (the server code is simple, so you can also write your own):

  • Dubbo mode: the server obtains the workerId and time fields via butterfly-allocator-zk and sends them to the client
  • RESTful mode: the server obtains the workerId and time fields via butterfly-allocator-db and sends them to the client

<dependency>
  <groupId>com.github.simonalong</groupId>
  <artifactId>butterfly-allocator-distribute</artifactId>
  <!-- replace with a specific version number -->
  <version>${last.version.release}</version>
</dependency>
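The client-side "double buffer with asynchronous fetching" mentioned in the scheme overview can be sketched roughly as below. This is an illustration under assumptions, not the distribute module's code: `Segment`, its fields, the bit widths, and the half-consumed prefetch threshold are hypothetical, and a `Supplier` stands in for the Dubbo or RESTful call that returns the workerId and time fields.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Sketch only: names, bit widths, and the prefetch threshold are assumptions. */
final class DoubleBufferSketch {
    /** What the server hands out: a reserved time value plus the workerId. */
    static final class Segment {
        final long time;
        final long workerId;
        Segment(long time, long workerId) { this.time = time; this.workerId = workerId; }
    }

    static final int SEQUENCE_BITS = 9;   // assumed
    static final int WORKER_BITS = 13;    // assumed
    static final int SEGMENT_SIZE = 1 << SEQUENCE_BITS;

    private final Supplier<Segment> server;  // stands in for the remote call
    private Segment current;                 // buffer being consumed
    private CompletableFuture<Segment> next; // buffer being fetched in the background
    private int used;

    DoubleBufferSketch(Supplier<Segment> server) {
        this.server = server;
        this.current = server.get();
    }

    synchronized long nextId() {
        if (used == SEGMENT_SIZE) {
            // Current buffer exhausted: switch to the prefetched one, waiting only
            // if the background fetch has not finished yet.
            if (next == null) {
                next = CompletableFuture.supplyAsync(server);
            }
            current = next.join();
            next = null;
            used = 0;
        }
        if (next == null && used >= SEGMENT_SIZE / 2) {
            // Half consumed: start fetching the next buffer asynchronously.
            next = CompletableFuture.supplyAsync(server);
        }
        long sequence = used++;
        return (((current.time << SEQUENCE_BITS) | sequence) << WORKER_BITS) | current.workerId;
    }

    public static void main(String[] args) {
        AtomicLong serverTime = new AtomicLong(100); // fake server: time grows by 1 per request
        DoubleBufferSketch client =
                new DoubleBufferSketch(() -> new Segment(serverTime.getAndIncrement(), 7));
        for (int i = 0; i < 3; i++) {
            System.out.println(client.nextId());
        }
    }
}
```

Prefetching at the halfway point keeps the remote round trip off the request path in the common case, so most calls to `nextId()` touch only local memory.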

Using Dubbo mode

Start the butterfly-server module on the server first, then use the following on the client:

@Test
public void test() {
    DistributeDubboButterflyConfig config = new DistributeDubboButterflyConfig();
    config.setZkHoseAndPort("localhost:2181");

    ButterflyIdGenerator generator = ButterflyIdGenerator.getInstance(config);
    // Set the start time; if not set, it defaults to February 22, 2020
    generator.setStartTime(2020, 5, 1, 0, 0, 0);

    // Add business namespaces; each is registered if it does not already exist
    generator.addNamespaces("test1", "test2");
    Long uuid = generator.getUUid("test1");
    System.out.println(uuid);
}

Using RESTful mode

Start the butterfly-server module on the server first, then use the following on the client:

@Test
public void test() {
    DistributeRestfulButterflyConfig config = new DistributeRestfulButterflyConfig();
    config.setHostAndPort("localhost:8800");

    ButterflyIdGenerator generator = ButterflyIdGenerator.getInstance(config);
    // Set the start time; if not set, it defaults to February 22, 2020
    generator.setStartTime(2020, 5, 1, 0, 0, 0);

    // Add business namespaces; each is registered if it does not already exist
    generator.addNamespaces("test1", "test2");
    Long uuid = generator.getUUid("test1");
    System.out.println(uuid);
}

More information

For a detailed description, see the Butterfly documentation.