Some thoughts on global ID generators in distributed systems

Original link: He Xiaodong blog

This article grew out of a discussion between Panda and Boss Li about ID generators in the Kangshen Communication group. Thank you both for the chance to learn something new.

Why do we need an ID generator

In a distributed system, large volumes of data, messages, and HTTP requests need to be uniquely identified. For example, calls between services, call-chain (link) analysis, and log tracing all need a unique identifier for each request, so a globally unique ID is required.

What kind of ID generator do you need

Persistence

Persistence is a must. For IDs to stay globally unique over the long term, an ID that has been issued must never be issued again, which also requires strong consistency in the stored state. The state can be kept in Redis or etcd, for example.

High availability

When the primary server goes down, a standby server can be elected automatically to take over. During the switchover, the IDs issued by the generator may be discontinuous, but the service keeps working normally.

Other features

Depending on the specific service, authentication and permission control are optional. Access can also be restricted at the request layer by source IP address, allowing only a fixed set of IP addresses.

Several commonly used ID generator schemes

UUID

UUID is short for Universally Unique Identifier, a machine-generated identifier that is unique within a certain scope (from a specific namespace up to global). A UUID is a 16-byte (128-bit) number, usually written as a 36-character string such as f89f2504-e034-11d3-9e82-c3301a0c0305.

UUIDs are generated by machine according to a defined algorithm. To guarantee uniqueness, the specification defines the input elements (NIC MAC address, timestamp, namespace, random or pseudo-random numbers, and clock sequence) and the algorithms that build UUIDs from them. This complexity ensures uniqueness, but it also means that UUIDs can in practice only be generated by a computer.
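As an illustration, here is a minimal Go sketch that builds a UUID from random numbers only (the "version 4" variant), using just the standard library. The function name NewUUIDv4 is made up for this example, and the MAC-address and timestamp based variants mentioned above are not covered.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// NewUUIDv4 (hypothetical helper) returns a random, version 4 UUID in the
// usual 36-character form: 32 hex digits plus 4 hyphens.
func NewUUIDv4() (string, error) {
	var b [16]byte // 16 bytes = 128 bits
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version field to 4 (random)
	b[8] = (b[8] & 0x3f) | 0x80 // set the variant bits to 10
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	id, err := NewUUIDv4()
	if err != nil {
		panic(err)
	}
	fmt.Println(id, len(id))
}
```

Because every call draws fresh random bytes, two consecutive IDs have no ordering relationship, which is the "out of order" property discussed below.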

Advantages and disadvantages: UUIDs are generated locally, with high performance and low latency. On the other hand, they are long, which makes them poorly suited as an index field, and they are unordered, so it is difficult to analyze trends from them.

Snowflake-like algorithms

Snowflake is a distributed ID generation algorithm open-sourced by Twitter. Its core idea is to pack an ID into a single long integer: 41 bits for the millisecond timestamp, 10 bits for the machine number, and 12 bits for the sequence number within a millisecond. In theory, a single machine can generate up to 1000 * 2^12 IDs per second, that is, about 4 million (4,096,000).

Pros and cons: The ID as a whole increases over time, which is very good for viewing trends. Generation does not depend on any third-party system, so reliability is high, and the bit layout can be adjusted to fit the business. The downside is a heavy reliance on the machine clock.
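The layout described above can be sketched in Go roughly as follows. This is a minimal illustration rather than Twitter's implementation: the custom epoch, the type and function names, and the choice to return an error when the clock moves backwards are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const (
	machineBits  = 10
	sequenceBits = 12
	maxMachineID = 1<<machineBits - 1  // 1023
	maxSequence  = 1<<sequenceBits - 1 // 4095
	// customEpochMs is an assumed epoch (2020-01-01 UTC), chosen for illustration.
	customEpochMs = 1577836800000
)

// Generator is a hypothetical Snowflake-style ID generator.
type Generator struct {
	mu        sync.Mutex
	machineID int64
	lastMs    int64
	sequence  int64
}

func NewGenerator(machineID int64) (*Generator, error) {
	if machineID < 0 || machineID > maxMachineID {
		return nil, fmt.Errorf("machine ID %d out of range [0, %d]", machineID, maxMachineID)
	}
	return &Generator{machineID: machineID}, nil
}

func nowMs() int64 { return time.Now().UnixNano() / int64(time.Millisecond) }

// Next returns the next ID, or an error if the clock has moved backwards.
func (g *Generator) Next() (int64, error) {
	g.mu.Lock()
	defer g.mu.Unlock()

	now := nowMs()
	if now < g.lastMs {
		return 0, fmt.Errorf("clock moved backwards by %d ms", g.lastMs-now)
	}
	if now == g.lastMs {
		g.sequence = (g.sequence + 1) & maxSequence
		if g.sequence == 0 { // 4096 IDs already issued in this millisecond
			for now <= g.lastMs { // spin until the next millisecond
				now = nowMs()
			}
		}
	} else {
		g.sequence = 0
	}
	g.lastMs = now
	// timestamp | machine number | sequence number
	return (now-customEpochMs)<<(machineBits+sequenceBits) |
		g.machineID<<sequenceBits | g.sequence, nil
}

func main() {
	g, err := NewGenerator(1)
	if err != nil {
		panic(err)
	}
	for i := 0; i < 3; i++ {
		id, err := g.Next()
		if err != nil {
			panic(err)
		}
		fmt.Println(id)
	}
}
```

The error returned on a backwards clock is where the clock dependence shows up in code: a real deployment would need its own policy for that case, such as waiting or failing over.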

MySQL-based ID generator

Use MySQL's auto_increment_increment and auto_increment_offset settings to control the step size and starting value of the auto-increment ID, so that each insert yields a new, larger ID.

begin;
REPLACE INTO Tickets64 (stub) VALUES ('a');
SELECT LAST_INSERT_ID();
commit;

To ensure high availability, run several MySQL machines and give each one a different auto-increment starting value and the same step size. For example:

# TicketServer1:
auto-increment-increment = 2
auto-increment-offset = 1

# TicketServer2:
auto-increment-increment = 2
auto-increment-offset = 2
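To see why these settings keep the two servers from colliding, here is a small in-memory Go model of the sequences they produce. It does not talk to MySQL; the TicketServer type and its methods are hypothetical and only mimic how the step and offset interact.

```go
package main

import "fmt"

// TicketServer (hypothetical) models one MySQL instance whose auto-increment
// column starts at `offset` and grows by `increment` on every insert.
// It is not safe for concurrent use; it only illustrates the arithmetic.
type TicketServer struct {
	increment int64
	next      int64
}

func NewTicketServer(increment, offset int64) *TicketServer {
	return &TicketServer{increment: increment, next: offset}
}

// NextID returns the ID this server would hand out next.
func (s *TicketServer) NextID() int64 {
	id := s.next
	s.next += s.increment
	return id
}

func main() {
	s1 := NewTicketServer(2, 1) // TicketServer1: increment 2, offset 1
	s2 := NewTicketServer(2, 2) // TicketServer2: increment 2, offset 2
	for i := 0; i < 3; i++ {
		fmt.Println(s1.NextID(), s2.NextID())
	}
	// Prints:
	// 1 2
	// 3 4
	// 5 6
}
```

With a step of 2, the first server only ever issues odd IDs and the second only even ones, so either server can fail without the other producing a duplicate.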

Pros and cons: Simple, reliable, low-cost, and maintainable by a professional DBA. IDs increase monotonically, which suits businesses that need ordered IDs. The disadvantages are a strong dependence on the database, the difficulty of changing the starting value and step size later, and the need to keep the database stably available. Every ID request also costs an extra database read and write, which puts considerable pressure on the database and consumes a lot of resources.

Those are the three most commonly used schemes for generating globally unique IDs. Snowflake-like schemes are the most widely used of the three. They are highly flexible: if the request rate is not very high, for example, you can lower the time precision. The generated IDs trend upward over time, can even serve as an index field, and take up little space, which also makes later data analysis convenient.

Using unique IDs in a production environment

Snowflake-like schemes generate IDs on request and cannot generate them in advance. With the MySQL-based generator or with UUIDs, by contrast, a large batch can be generated ahead of time, sized to the volume the business needs, and placed in memory, a message queue, or a Redis list. When the business requests a unique ID, one is simply popped off. In addition, an asynchronous scheduled task counts the unique IDs remaining in the queue at fixed intervals; if too few remain, it promptly generates another large batch and adds it to the queue.

Go snowflake demo:

