This article is from meituan technical team “Zhao Dong” share, the original title “Leaf — Meituan review distributed ID generation system”, the instant message website has correction, revision and re-type, thank the original author of the share.


1, the introduction



Wechat Technology Sharing: Practice of generating serial number of wechat’s Massive IM Chat Messages (Algorithm Principle)
Rongyun technology sharing: decrypting the chat message ID generation strategy of Rongyun IM products






This article shares two ID generation algorithms being used in The Meituan system:





  • 1) Leaf-segment scheme: generate globally unique and globally ordered IDS;
  • 2) Leaf-Snowflake solution: it can generate globally unique and locally ordered ids.









Friendly tips:






Disclaimer:









2. About the author



According to the east:
Meituan large distributed link tracking system Mtrace


3. Related articles


  • Rongyun Technology Sharing: Decoding the Chat Message ID Generation Strategy of Rongyun IM Products
  • Wechat Technology Sharing: Practice of Generating Serial number of wechat’s Massive IM Chat Messages (Part of Algorithm Principle)
  • Wechat Technology Sharing: Practice of Generating serial Numbers of wechat’s Massive IM Chat Messages (Disaster Recovery Solution)


4. Overview of the text









In summary, what are the requirements of ID numbers for business systems?





  • 1) Global uniqueness: no duplicate ID number can appear. Since it is a unique identifier, this is the most basic requirement;
  • 2) Increasing trend: The MySQL InnoDB engine uses clustered indexes. Since most RDBMS use b-tree data structure to store index data, we should try to use ordered primary keys to ensure write performance.
  • 3) Monotonic increment: ensure that the next ID must be greater than the previous ID, such as transaction version number, incremental messages in IM chat, sorting and other special requirements;
  • 4) Information security: if the ID is continuous, it is very easy to snatch malicious users, directly download the specified URL in order; If it is the order number, it is even more dangerous, the competitor can directly know our daily order. Therefore, in some application scenarios, irregular and irregular ids are required.















Therefore, the next distributed ID generation system should do the following:





  • 1) The average delay and TP999 delay should be as low as possible;
  • 2) Availability 5 nines;
  • 3) High QPS.


5. Why didn’t Meituan use the UUID?



A Universally Unique IDentifier (UUID) URN Namespace












Advantages:









Disadvantages:


  • 1) Not easy to store: UUID is too long, 16 bytes 128 bits, usually expressed as a string of 36 lengths, which is not applicable to many scenarios;
  • 2) Information insecurity: The algorithm for generating UUids based on MAC addresses may cause MAC addresses to leak, a vulnerability that was used to find the creator of Melissa’s virus.



When ID is used as the primary key, there are some problems in certain environments. For example, when DB is used as the primary key, UUID is not suitable:





  • [4] The UUID length of 36 characters does not meet the requirements:

    All indexes other than the clustered index are known as secondary indexes. In InnoDB, each record in a secondary index contains the primary key columns for the row, as well as the columns specified for the secondary index. InnoDB uses this primary key value to search for the row in the clustered index.*** If the primary key is long, the secondary indexes use more space, so it is advantageous to have a short primary key***.

  • (2) Bad for MySQL index: If it is used as the primary key of database, the disorder of UUID in InnoDB engine may cause frequent changes of data location, which seriously affects performance.





6. Why doesn’t Meituan just use Snowflake?


6.1 Principles of the SnowFlake algorithm



SnowFlake












SnowFlake ID composition:










Sample SnowFlake ID:










To give you an example, like the following 64-bit long:





  • 1) The first part is a bit: 0, which is meaningless;
  • 2) The second part, 41 bits: represents the timestamp;
  • 3) In the third part, there are 5 bits: the machine room ID is 10001;
  • 4) In the fourth part, there are 5 bits: 1 1001;
  • 5) The fifth part is 12 bits: the serial number, which is the serial number of the ID generated at the same time in a certain machine room in this millisecond, 0000 00000000.



1) 1 bit:












(2) 41 bit:












(3) 10 bit:












(4) 12 bit:

















  • 1) The SnowFlake algorithm system must know the machine room and machine where it is located, for example, machine ID = 17, machine ID = 12;
  • 2) When the SnowFlake algorithm receives the request, it first generates a 64-bit long ID using binary arithmetic. The first of the 64 bits is meaningless.
  • 3) After 41 bits, you can use the current timestamp (in milliseconds), then 5 bits to set the machine room ID, and 5 bits to set the machine ID;
  • 4) Finally, determine the number of requests in this millisecond on the machine in the current machine room, and add a sequence number to the request to generate the ID as the last 12 bits.



The result is a 64-bit ID that looks something like this:




























Easy to understand: How to design a database architecture that can support millions of concurrent transactions?





6.2 Advantages and disadvantages of SnowFlake algorithm for Meituan









Excellent advantages:


  • 1) The number of milliseconds is high, the increment sequence is low, and the whole ID is trending upward;
  • 2) Independent of third-party systems such as databases, deployed in the way of services, with higher stability and high performance of ID generation;
  • 3) It can allocate bits according to its own service characteristics, which is very flexible.



Has disadvantages:









► Application Examples — Mongdb’s objectID:



ObjectID


7. The self-increasing ID of the database is not suitable for Meituan





begin;
REPLACE INTO Tickets64 (stub) VALUES ('a');
SELECT 1536667;
commit;Copy the code
















Advantages:





  • 1) Very simple, using the functions of the existing database system, low cost, DBA professional maintenance;
  • 2) Monotonic increment of ID number can realize some services with special requirements on ID.



Disadvantages:





  • 1) Strong dependence on DB. When DB is abnormal, the whole system is unavailable, which is a fatal problem. Configuring master/slave replication can increase availability as much as possible, but data consistency cannot be guaranteed in special cases. Inconsistency in primary/secondary switchover may lead to repeated number issuance.
  • 2) THE ID sending performance bottleneck is limited to the read and write performance of a single MySQL.



To solve MySQL performance problems, you can use the following solutions:
Ticket Servers: Distributed Unique Primary Keys on the Cheap


TicketServer1:
auto-increment-increment = 2
auto-increment-offset = 1
 
TicketServer2:
auto-increment-increment = 2
auto-increment-offset = 2Copy the code



Suppose we want to deploy N machines and set the step size to N, starting with 0,1,2… N-1 Then the entire architecture looks like the following:










This must-read architecture seems to meet performance requirements, but has the following drawbacks:





  • 1) It is difficult to extend the system horizontally: for example, after defining the step size and the number of machines, what should we do if we want to add machines? Suppose there is only one machine with the issuing number of 1,2,3,4,5 (step size is 1). At this time, one machine needs to be expanded. You can do this by setting the initial value of the second machine much higher than that of the first machine, such as 14 (assuming that the first machine cannot deliver 14 during the expansion period), and setting the step size to 2, so that the machine will deliver even numbers after 14. Then remove the first one, leave the ID value odd, such as 7, and change the step size of the first one to 2. To make it conform to the number segment criteria that we defined, in this case, is to make the first one only produce odd numbers. Does the expansion plan look complicated? That seems fine. Now imagine if we had 100 machines online, what would we do if we wanted to expand? It’s a nightmare. So the system horizontal expansion scheme is complicated and difficult to implement;
  • 2) ID has no monotonically increasing feature: only trend increasing, which is not important for general business needs and can be tolerated;
  • 3) The database is still under a lot of pressure: every time you get an ID, you have to read and write the database, so you can only rely on the heap machine to improve performance.





8, Meituan leaf-segment scheme: can generate globally unique, globally ordered ID


8.1 Basic Principles









Meituan’s leaf-segment makes the following changes to the database auto-increment ID scheme:





  • 1) The original scheme has to read and write the database every time to obtain the ID, resulting in great pressure on the database. Use proxy server to obtain the value of one segment at a time. After use, go to the database to obtain a new number segment, can greatly reduce the pressure of the database;
  • 2) The biz_TAG field is used to distinguish the issuing requirements of different services. The ID of each BIZ -tag is isolated from each other and does not affect each other. If you need to expand the database to meet performance requirements, you do not need to perform the complex capacity expansion operations described above. You only need to divide the biz_TAG database into different tables.



Database table design is as follows:


+-------------+--------------+------+-----+-------------------+-----------------------------+ | Field | Type | Null | Key | Default | Extra | +-------------+--------------+------+-----+-------------------+-----------------------------+ | biz_tag | varchar(128) | NO | PRI | | | | max_id | bigint(20) | NO | | 1 | | | step | int(11) | NO | | NULL | | | desc |  varchar(256) | YES | | NULL | | | update_time | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP | +-------------+--------------+------+-----+-------------------+-----------------------------+Copy the code



Description of important fields:


Biz_tag: identifies services. Max_id: indicates the maximum number segment allocated to biz_tag. Step: indicates the number segment length allocated each time.









The frequency of reading and writing to the database decreases from 1 to 1/step, as shown below:
















The max_id of biz_tag will be changed from 3000 to 4000. The SQL statement for the update number segment is as follows:


Begin
UPDATE table SET max_id=max_id+step WHERE biz_tag=xxx
SELECT tag, max_id, step FROM table WHERE biz_tag=xxx
CommitCopy the code









Advantages:





  • 1) Leaf service can be easily linear expansion, and its performance can fully support most business scenarios;
  • 2) THE ID number is a trend increasing 8-byte 64-bit number, which meets the primary key requirements of the database.
  • 3) High disaster recovery: Leaf service has internal number segment cache. Even if DB goes down, Leaf can still provide external services in a short time.
  • 4) You can customize the size of max_ID, which is very convenient for services to migrate from the original ID mode.



Disadvantages:





  • 1) THE ID number is not random enough to reveal the information of the number, which is not secure;
  • 2) TP999 data fluctuates greatly. When the number segment is used up, it will still hang on the I/O update database, and TG999 data will have occasional spikes;
  • 3) DB breakdown may cause the whole system to be unavailable.


8.2 Dual-buffer optimization



As for the second disadvantage mentioned above, some optimization has been made in leaf-segment scheme, which can be simply said as follows:


The timing of retrieving the number segment from Leaf is when the number segment is exhausted, which means that the ID delivery time of the critical point of the number segment depends on the time of retrieving the number segment from DB next time, and the incoming request during this period will also cause the thread to block because the DB number segment is not retrieved. If the network and DB performance is stable, this situation has little impact on the system, but if the network jitter occurs when the DB is taken, or the DB slow query will lead to the response time of the whole system is slow.









Detailed implementation is shown in the figure below:
















The main features are as follows:





  • 1) Each biz-Tag has consumption speed monitoring. It is usually recommended to set the segment length to 600 times of QPS issued during service peak hours (10 minutes), so that Leaf can continue to issue numbers for 10-20 minutes without being affected even if DB goes down;
  • 2) Every time a request comes, the state of the next number segment will be judged and this number segment will be updated. Therefore, occasional network jitter will not affect the update of the next number segment.


8.3 HA Dr



For the third point “DB availability” problem of the above shortcomings of leaf-segment algorithm:
Semi-synchronous mode
DBProxy






MySQL Group Replication



























9. Meituan’s leaf-Snowflake solution: generates globally unique, locally ordered ids



As stated in the preceding section:





9.1 Basic Principles






















Leaf-snowflake starts with the following steps:





  • 1) Start the Leaf-Snowflake service, connect to Zookeeper, and check whether you are registered under the parent node of Leaf_forever (whether there are children nodes in this order);
  • 2) If you have registered your workerID (int type ID generated by zK sequence node), start the service;
  • 3) If not registered, create a persistent sequence node under the parent node, and then retrieve the sequence number as its own workerID number to start the service.









9.2 Weakly Dependent on ZooKeeper








9.3 Solving the Clock Problem
















See the whole startup flow chart above. When the service is started, it first checks whether it has written the ZooKeeper leaf_forever node:





  • 1) If yes, compare the system time with the time recorded by the leaf_forever/${self} node. If the time is less than the time recorded by the leaf_forever/${self} node, the machine time will be considered as a long callback, and the service will fail to start and alarm;
  • Leaf_forever /${self} and write its own system time. Then comprehensively compare the system time of other Leaf nodes to determine whether its own system time is accurate. The specific method is to obtain the service IP address: Port of all temporary nodes (all leaf-Snowflake nodes in operation) under Leaf_TEMPORARY, then obtain the system time of all nodes through RPC request, and calculate sum(time)/nodeSize.
  • 3) If abs(system time -sum(time)/nodeSize) < the threshold, the system time is considered accurate and services are started normally. At the same time, write the temporary node leaf_TEMPORARY /${self} to maintain the lease.
  • 4) Otherwise, it will be considered that the system time of the machine will deviate greatly, and the startup will fail and the alarm will be reported;
  • 5) Write leaf_forever/${self} every three seconds to report its own system time.









The implementation code is as follows:


If (timestamp < lastTimestamp) {long offset = lastTimestamp - timestamp; If (offset <= 5) {try {// If the offset is less than 5ms, wait twice the time wait(offset << 1); //wait timestamp = timeGen(); If (timestamp < lastTimestamp) {// If (timestamp < lastTimestamp) {throwClockBackwardsEx(timestamp); } } catch (InterruptedException e) { throw e; } } else { //throw throwClockBackwardsEx(timestamp); }} // Assign an IDCopy the code









PS:
Github.com/weizhenyi/l…


10. Summary of this paper











Appendix: More articles on hot technologies for IM development



Just one entry for beginners: Developing mobile IM from scratch



Mobile IM Developers must read (1) : Easy to understand the “weak” and “slow” mobile web



Mobile IM Developers must read (ii) : Summary of the most comprehensive mobile weak Network optimization methods ever



This paper discusses the message reliability and delivery mechanism of mobile IM from the perspective of client



Summary of optimization means of modern mobile terminal network short connection: request speed, weak network adaptation, security guarantee



Tencent technology sharing: the evolution of bandwidth compression technology for social network pictures



Session and Token in HTTP short connections



IM Development Basics tutorial: Understand the principles of the front HTTP SSO SSO interface



How to ensure the efficiency and real time of pushing large-scale group messages in MOBILE IM?



The technical issues that mobile IM development needs to face



Is it better to design your own protocol using byte streams or character streams to develop IM?



Does anyone know the mainstream implementation of voice message chat?



Realization of IM message delivery guarantee mechanism (I) : to ensure the reliable delivery of online real-time messages



Realization of IM message delivery guarantee mechanism (II) : Ensure the reliable delivery of offline messages



How to ensure the “timing” and “consistency” of IM real-time messages?



A low cost method to ensure IM message timing



Should I use “push” or “pull” to synchronize online status in IM chat and group chat?



IM group chat messages are so complex, how to ensure that not lost and not heavy?



Talk about optimizing login requests in mobile IM development



How to save traffic by pulling data during IM login on the mobile terminal?



A brief discussion on the principle of multi-point login and message roaming of mobile TERMINAL IM



How to design a “retry failure” mechanism for a completely self-developed IM?



Easy to understand: Load balancing solution sharing at the mobile IM access layer based on a cluster



Technical Experiment and Analysis of wechat’s Influence on Network



Principle, Technology and Application of Instant Messaging System (Technical Paper)



The status of open source IM project “Mogujie TeamTalk” : An open source show with no end



QQ Music team share: Android image compression technology in detail (part 1)



QQ Music team share: Android image compression technology details (part 2)



Tencent original share (a) : how to greatly improve the mobile QQ picture transmission speed and success rate under the mobile network



How to greatly reduce APP traffic Consumption in mobile Network (Part 1)



(3) How to greatly reduce APP traffic Consumption in mobile Network (Part 2)



As promised: Mars, a cross-platform component library for wechat’s mobile IM network layer, has been officially open source



How does Social network-based Yelp achieve lossless compression of massive user images?



Tencent technology Sharing: How Tencent dramatically reduced bandwidth and Network traffic (photo compression)



How Tencent dramatically reduced bandwidth and Network traffic



Character encoding thing: quick understanding of ASCII, Unicode, GBK, and UTF-8



Fully master the features, performance and tuning of mainstream image formats on mobile terminals



Bullet SMS behind the bright: netease yunxin chief architect to share 100 million IM platform technology practice



IM development basics tutorial (five) : easy to understand, correctly understand and use MQ message queues



Wechat Technology Sharing: Practice of generating serial number of wechat’s Massive IM Chat Messages (Algorithm Principle)



Is it so hard to develop IM yourself? Hand-in-hand teach you from a simple Andriod version OF IM (source code)



Rongyun technology sharing: decrypting the chat message ID generation strategy of Rongyun IM products



More of the same…