Background

In the test environment, a colleague hit a primary key conflict when inserting a row whose ID came from a Sequence. With a Sequence, each node caches its own ID segment in memory and the segments should never overlap, so this problem should not be possible. If it is, it overturns what we thought we knew.

Initial thoughts

  • Did someone insert a row manually and set its ID by hand along the way?
  • Did someone manually change the Sequence value?
  • Why does the database also contain rows with the same ID, but in different tables? Is something wrong with multithreading?

Preliminary investigation

  • We confirmed that no ID was inserted manually; all of them were obtained by the program.
  • Nobody has the time or energy to set the Sequence value by hand, and nobody would bother.
  • Rows with the same ID in different tables are clearly the result of nodes from two different projects.

Conclusion: the Sequence value ranges obtained by the two machines conflicted.

That is exactly what the symptoms show. Does it really overturn what we thought we knew? The problem is serious, so we had to find the root cause.

Detailed investigation

At this point we noticed that the code had changed the innerStep of the TDDL Sequence from 1000 to 5000. The step was enlarged because the data migration handles a large volume of data, and a bigger step reduces the number of database operations needed to extend the ID range. (This shows the developer was being careful; performance gets the same attention elsewhere in the design.)

Even if I change my internal step size so that it differs from everyone else's, that should not cause Sequence conflicts; the Sequence should guarantee uniqueness by itself. Perhaps you had the same thought?

Half suspecting a bug in Sequence, and needing to solve the problem either way, we started digging through the source code. That is the proper way to solve a problem like this.

The version in use is tddl-sequence-3.2.jar, with GroupSequence.

Finding the root cause

The first step is to trace the nextValue() method; the core code is shown below.

newValue = oldValue + outStep; // New value = old value in the database + external step size

int affectedRows = stmt.executeUpdate(); // Write the new value back to the database

return new SequenceRange(newValue + 1, newValue + innerStep); // This node's range is [newValue + 1, newValue + innerStep]

As far as I can tell, there is a pitfall here. If the internal step sizes of two projects are inconsistent, their ranges will overlap. This is indeed the cause of the problem, but it does not seem logical. Why would the designers do it this way? At this point, the only option was to work through tddl-sequence properly.

Below are the answers to the parts of the source that I did not quite understand at first.
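To make the overlap concrete, here is a minimal sketch of the three quoted lines. It is not the TDDL source: the class and method names are hypothetical, the database row is modelled as a plain `long` field, and dscount is assumed to be 1 so that outStep equals innerStep.

```java
// Hypothetical model of the quoted range logic; not the TDDL source.
class StepOverlapDemo {
    // Stand-in for the single sequence row stored in the database.
    static long dbValue;

    // Mirrors the quoted lines: advance the stored value by outStep,
    // then hand out [newValue + 1, newValue + innerStep].
    static long[] nextRange(long innerStep, long outStep) {
        long newValue = dbValue + outStep;
        dbValue = newValue; // stands in for stmt.executeUpdate()
        return new long[] {newValue + 1, newValue + innerStep};
    }

    // Two closed ranges overlap when each one starts before the other ends.
    static boolean overlaps(long[] a, long[] b) {
        return a[0] <= b[1] && b[0] <= a[1];
    }

    // Fetch one range per node (dscount assumed to be 1) and compare them.
    static boolean conflict(long firstStep, long secondStep) {
        dbValue = 0;
        long[] first = nextRange(firstStep, firstStep);
        long[] second = nextRange(secondStep, secondStep);
        return overlaps(first, second);
    }

    public static void main(String[] args) {
        System.out.println("steps 1000 and 1000 overlap: " + conflict(1000, 1000));
        System.out.println("steps 5000 and 1000 overlap: " + conflict(5000, 1000));
    }
}
```

With equal steps the second range starts right after the first one ends. With steps 5000 and 1000, the second node only moves the stored value forward by 1000, so its range lands inside the 5000 IDs already handed to the first node.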

Relationship between the internal step size and the external step size

outStep = innerStep * dscount; // External step size = internal step size * number of data sources holding the sequence

This is a convention in tddl-sequence. outStep is the step, or unit, by which the sequence value stored in the database advances.

Usually dscount is 1, meaning the sequence lives only in the 00 database.
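The convention is easier to see with more than one data source. The sketch below is an illustration, not the TDDL source. It assumes, based on the check() method in the source (value % outStep == index * innerStep), that data source index starts at index * innerStep. The class name, method names and starting values are hypothetical.

```java
// Hypothetical illustration of outStep = innerStep * dscount; not the TDDL source.
class OutStepDemo {
    static long outStep(long innerStep, int dsCount) {
        return innerStep * dsCount;
    }

    // Range handed out from a data source whose stored value is oldValue,
    // following the quoted lines: [newValue + 1, newValue + innerStep].
    static long[] nextRange(long oldValue, long innerStep, int dsCount) {
        long newValue = oldValue + outStep(innerStep, dsCount);
        return new long[] {newValue + 1, newValue + innerStep};
    }

    public static void main(String[] args) {
        long innerStep = 1000;
        int dsCount = 2;
        for (int index = 0; index < dsCount; index++) {
            // Assumed starting value for this data source: index * innerStep.
            long stored = index * innerStep;
            for (int round = 0; round < 2; round++) {
                long[] r = nextRange(stored, innerStep, dsCount);
                System.out.println("ds" + index + " -> [" + r[0] + ", " + r[1] + "]");
                stored += outStep(innerStep, dsCount); // value written back
            }
        }
    }
}
```

Under these assumptions, each data source advances by outStep but only hands out innerStep IDs per fetch, so the ranges from the two data sources interleave without overlapping. That only holds while every node uses the same innerStep.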

How is the step size adjusted?

private boolean check(int index, long value) {
	// If the two sides are not equal, outStep has been changed.
	// With a single data source (dscount = 1, index = 0), value should be an integer multiple of outStep.
	return (value % outStep) == (index * innerStep);
}

adjust = true; // When true, the sequence value is corrected automatically after a step size change

// How the value is corrected: newValue - newValue % outStep rounds the value down
// to the nearest multiple of outStep; outStep and index * innerStep are then added.
newValue = (newValue - newValue % outStep) + outStep + index * innerStep;
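The arithmetic of check and the adjustment formula can be tried in isolation. The sketch below copies only the two expressions from the source; the class name, the extra parameters and the sample values are hypothetical and chosen purely for illustration.

```java
// Hypothetical re-implementation of the quoted check/adjust arithmetic; not the TDDL source.
class AdjustDemo {
    // True when the stored value sits on the slot reserved for data source `index`.
    static boolean check(int index, long value, long innerStep, long outStep) {
        return (value % outStep) == (index * innerStep);
    }

    // Rounds the value down to a multiple of outStep, then moves to the next
    // multiple plus this data source's offset (index * innerStep).
    static long adjust(int index, long value, long innerStep, long outStep) {
        return (value - value % outStep) + outStep + index * innerStep;
    }

    public static void main(String[] args) {
        long innerStep = 5000;
        long outStep = 5000; // dscount assumed to be 1
        long stored = 7000;  // sample value left behind by a smaller step size
        System.out.println(check(0, stored, innerStep, outStep));   // false
        long adjusted = adjust(0, stored, innerStep, outStep);
        System.out.println(adjusted);                               // 10000
        System.out.println(check(0, adjusted, innerStep, outStep)); // true
    }
}
```

The adjusted value always moves forward to a later multiple of outStep, so the IDs between the old value and the adjusted value are skipped rather than reused.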

Revisiting the problem

Looking back, a concrete example makes this clear, borrowing a diagram drawn by an expert in our group.

To be clear, there are two different applications, one with a step size of 5000 and one with a step size of 1000. The range of the node with the small step size is covered by the range of the node with the large step size.

Suppose the value in the database is 1000.

ProjectA (outStep = 5000) fetches from the sequence first and gets roughly the range [6000, 11000].

ProjectB (outStep = 1000) fetches next and gets roughly the range [7000, 8000].

If the node with the large step size inserts data first and uses an ID that the node with the small step size has not used yet, the node with the small step size will report a primary key conflict when it later inserts data with that ID.
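The scenario can be replayed with a small simulation. This is a sketch under stated assumptions, not the TDDL source: the table is modelled as a `Set` of IDs, dscount is assumed to be 1, no step size adjustment is applied, and all names are hypothetical. Following the quoted range code exactly, the bounds come out as [6001, 11000] and [7001, 8000], which the example above rounds.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical replay of the example; the "table" is just a set of primary keys.
class PrimaryKeyConflictDemo {
    static long dbValue; // sequence value stored in the database

    // Same arithmetic as the quoted range code, with dscount assumed to be 1.
    static long[] nextRange(long step) {
        dbValue += step;
        return new long[] {dbValue + 1, dbValue + step};
    }

    // Returns the first ID that ProjectB fails to insert, or -1 if there is none.
    static long firstConflict() {
        dbValue = 1000;
        long[] rangeA = nextRange(5000); // ProjectA fetches first
        long[] rangeB = nextRange(1000); // ProjectB fetches second
        Set<Long> primaryKeys = new HashSet<>();
        // ProjectA inserts a row for every ID in its cached range.
        for (long id = rangeA[0]; id <= rangeA[1]; id++) {
            primaryKeys.add(id);
        }
        // ProjectB then inserts from its own range; add() returns false on a duplicate.
        for (long id = rangeB[0]; id <= rangeB[1]; id++) {
            if (!primaryKeys.add(id)) {
                return id;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("first conflicting ID: " + firstConflict());
    }
}
```

In this model ProjectB's very first insert collides, because its whole range lies inside the range ProjectA already holds.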

A remaining question

Why, when the database value is 1000 and the step size is 5000, is the fetched range [6000, 11000]? That wastes 5000 IDs.

This is caused by the step size adjustment, because the sequence requires the database value to be an integer multiple of outStep.
