First, the problem comes to light

Early one morning, I received a flood of interface error alerts. Although multiple interfaces were involved, the error messages were surprisingly consistent. The relevant background (simulated here) is as follows:

1. Error message:

Error attempting to get column 'id' from result set. Cause:    
com.mysql.jdbc.exceptions.jdbc4.MySQLDataException: '2.156220742E9' in column '1' is outside valid range for the datatype INTEGER.

2. Code information:

A. Table creation SQL

CREATE TABLE `t_test`
(
    `id`               int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
    `biz_field`        varchar(255)     NOT NULL DEFAULT '' COMMENT 'business field',
    `biz_field_unique` varchar(255)     NOT NULL DEFAULT '' COMMENT 'business field (with unique index)',
    PRIMARY KEY (`id`),
    UNIQUE KEY `udx_t_test_biz_field_unique` (`biz_field_unique`)
) ENGINE = InnoDB
  AUTO_INCREMENT = 1
  DEFAULT CHARSET = utf8mb4 COMMENT = 'Test table'
  • The primary key id is int(11) unsigned
  • The field biz_field_unique has a unique index

B. Mapper in MyBatis

<select id="getAllTest" resultType="TestPo">
    SELECT * FROM t_test
    WHERE biz_field = #{bizValue}
</select>
  • The query uses SELECT *
<insert id="insertTest" parameterType="TestPo">
    -- #{id} is never assigned in code, so the auto-increment id is always used
    INSERT INTO t_test(id, biz_field, biz_field_unique)
    VALUES (#{id}, #{bizField}, #{bizFieldUnique})
    ON DUPLICATE KEY
    UPDATE
        biz_field = #{bizField}
</insert>
  • Inserts use INSERT ... ON DUPLICATE KEY UPDATE

C. Mapped Po

//kotlin
class TestPo(
    /**
     * primary key id
     */
    var id: Int = 0,
    /**
     * business field
     */
    var bizField: String = "",
    /**
     * business field (with unique index)
     */
    var bizFieldUnique: String = ""
)
  • The mapped field id is declared as Kotlin Int (corresponding to Integer in Java)

Second, a brief analysis

The error message reads:

Original: '2.156220742E9' in column '1' is outside valid range for the datatype INTEGER.
Plain English: the value of the first column, '2.156220742E9', is outside the range of the INTEGER type.

That is, the table's id has grown to '2.156220742E9' = 2156220742, and when MyBatis reads the SELECT result it can no longer map this value onto the Po's Int field:

  • In MySQL, id is an unsigned int, with range [0, 4294967295].
  • In Java, id is an Integer, with range [-2147483648, 2147483647].
  • 2147483647 (max Java Integer) < 2156220742 (the failing id) < 4294967295 (max MySQL int unsigned)

In other words, as long as id values stay within the upper limit of a Java int there is no error; once a value crosses that limit, this error appears.

Third, temporary handling

The analysis above pinned down the direct cause of the error, which makes a temporary fix straightforward.

1. The business does not use this field

A. Remove the field from the Po

If the business does not use the field, simply remove it from the Po. Even if the Mapper SQL still returns the column, nothing tries to map it onto the Po, so there is no type conversion and therefore no error.

//kotlin
class TestPo(
    /**
     * primary key id
     */
    //var id: Int = 0,
    ...
)

B. Remove the field from the SQL

Alternatively, remove the field from the SQL in the MyBatis Mapper:

<select id="getAllTest" resultType="TestPo">
    -- Change select * to a specific field, excluding the id of the field in question
    SELECT biz_field,biz_field_unique 
    FROM t_test
    WHERE biz_field = #{bizValue}
</select>

If the original SQL used SELECT * over a wide table, writing out the columns one by one can be tedious.
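If hand-typing the column list is the pain point, the column names can be generated from information_schema instead (a convenience sketch, not part of the original fix):

-- generate the column list of t_test, ready to paste into the SELECT
SELECT GROUP_CONCAT(column_name)
FROM information_schema.columns
WHERE table_schema = DATABASE()
  AND table_name = 't_test';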

2. The business uses this field

A. Change the type of the field in the Po

If the business does use this field, change its type in the Po to Long.

//kotlin
class TestPo(
    /**
     * primary key id
     */
    var id: Long = 0,
    ...
)

If this id is also stored in other tables, check those too: any column whose type has a smaller range than int unsigned needs to be changed as well.
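For example, if some other table kept this id in an int column, it would need widening too (table and column names here are hypothetical):

-- hypothetical table t_test_ref with a column test_id referencing t_test.id
ALTER TABLE t_test_ref
    MODIFY `test_id` bigint unsigned NOT NULL DEFAULT 0 COMMENT 't_test primary key';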

Fourth, finding the real culprit

After the temporary fix things returned to normal for the moment, but the root cause still had to be found.

1. Are there really 2.1 billion rows? Of course not

When the error occurred, the table's auto-increment id had climbed to about 2.1 billion. But were there really 2.1 billion rows in the table? (If there were, operations would probably have invited you in for tea long ago.) A select count confirmed only 50,000 rows. So the question becomes: with only 50,000 rows, how did the auto-increment id grow to 2.1 billion?

sql-> select count(1) from t_test;
res-> 50000
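For reference, the table's current auto-increment high-water mark can be checked directly; one way (a sketch) is:

-- the counter shows up in the Auto_increment column of the output
SHOW TABLE STATUS LIKE 't_test';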

2. Did someone manually insert rows with huge ids? No

As we know, auto-increment values are never backfilled: if I manually insert a row with id = 100,000,000, the next auto-increment id will be 100,000,001. So if someone had manually inserted a row with an id around 2.1 billion, an auto-increment counter of 2.1 billion with only 50,000 rows would be possible. (Of course, modifying online data directly is not allowed.)
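A quick sketch of that no-backfill behavior (values are illustrative):

-- the counter jumps past a manually specified id and never comes back down
INSERT INTO t_test(id, biz_field, biz_field_unique) VALUES (100000000, 'v', 'u_manual');
INSERT INTO t_test(biz_field, biz_field_unique) VALUES ('v', 'u_next');  -- gets id 100000001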

To verify, we exported all the ids and examined their distribution.
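One way to make the gaps visible (a sketch, assuming MySQL 8.0+ for window functions):

-- distance between each id and the previous one
SELECT id, id - LAG(id) OVER (ORDER BY id) AS gap
FROM t_test
ORDER BY id;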

Result: the id values were spread evenly across the whole range, rather than being concentrated with one large jump, so the conjecture does not hold. Where did the ids in between go?

Thinking: since the ids are clearly discontinuous, something must be consuming id values without actually using them, and MySQL never reclaims those values.

Q: How do we know the ids are discontinuous with large gaps? Doesn't the graph look continuous? A: The x-axis of the graph is the ordinal position of each id and the y-axis is the id value; 50,000 uniformly distributed values spanning 2.1 billion cannot possibly be contiguous. (Looking at the raw id values makes this even more intuitive.)

3. What can make MySQL auto-increment ids discontinuous?

A. Someone deleted rows with intermediate id values

No physical deletes are performed on this table; ruled out.

B. The code specifies discontinuous ids for insertion

The code has no such logic; ruled out.

C. Transactions containing inserts were rolled back

The code has no such logic; ruled out.
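For completeness, the reason a rollback is on the suspect list at all is that InnoDB does not roll back consumed auto-increment values; a quick sketch:

START TRANSACTION;
INSERT INTO t_test(biz_field, biz_field_unique) VALUES ('v', 'u_tmp');
ROLLBACK;  -- the row disappears, but the auto-increment value it used is not reclaimed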

D. Unique key conflicts

The table structure shows that biz_field_unique carries a unique index, and conflicts on it can consume auto-increment id values.

Looking at the code, there is only one SQL statement that inserts into this table, and it clearly never specifies the id value, which is always generated by MySQL. So some operation must be interfering with how id values are handed out!

4. INSERT ... ON DUPLICATE KEY UPDATE? Yes

The only statement that inserts into this table is an INSERT ... ON DUPLICATE KEY UPDATE, and by default each execution reserves an auto-increment value even when it resolves to an update. Tracing the call paths showed that the interface containing this SQL is hit very frequently and most calls update existing rows, so unique index conflicts happen constantly, each one burning an auto-increment id. This also matches the observed pattern of evenly spread ids with large gaps.
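A minimal sketch of this behavior, assuming an empty t_test and the default innodb_autoinc_lock_mode:

INSERT INTO t_test(biz_field, biz_field_unique) VALUES ('v1', 'u1')
ON DUPLICATE KEY UPDATE biz_field = 'v1';  -- no conflict: inserts a row with id = 1
INSERT INTO t_test(biz_field, biz_field_unique) VALUES ('v2', 'u1')
ON DUPLICATE KEY UPDATE biz_field = 'v2';  -- conflict on 'u1': the row is updated, but auto-increment value 2 is consumed
INSERT INTO t_test(biz_field, biz_field_unique) VALUES ('v3', 'u2')
ON DUPLICATE KEY UPDATE biz_field = 'v3';  -- no conflict: inserts a row with id = 3, leaving a gap at 2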

Fifth, follow-up handling

1. Why INSERT ... ON DUPLICATE KEY UPDATE?

A. What INSERT ... ON DUPLICATE KEY UPDATE does

As the name suggests, when a unique key conflict occurs the statement updates the specified fields, and when there is no conflict it inserts a new record. Note that it bundles two steps into one statement: detecting whether there is a unique key conflict, then inserting or updating the data accordingly.

B. Why not SELECT first, then UPDATE or INSERT?

To see why INSERT ... ON DUPLICATE KEY UPDATE is preferable to SELECT-then-UPDATE-or-INSERT, let's walk through the alternatives one version at a time.

I. Simple version

// pseudocode
var data = "select * from t_test where biz_field_unique = uniqueValue;"
if (data == null) {
    "insert into t_test(biz_field, biz_field_unique) values ('value', 'uniqueValue');"
} else {
    "update t_test set biz_field = value where biz_field_unique = uniqueValue;"
}

Experienced readers will spot the concurrency problem at a glance:

If request 2's SELECT executes before request 1's INSERT commits, both requests see no existing row and both take the insert branch, and the later INSERT fails with a duplicate key error.
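Written out as a timeline (session labels are illustrative):

-- request 1: select * from t_test where biz_field_unique = 'u1';  -- no row found
-- request 2: select * from t_test where biz_field_unique = 'u1';  -- no row found either
-- request 1: insert into t_test(biz_field, biz_field_unique) values ('v1', 'u1');  -- succeeds
-- request 2: insert into t_test(biz_field, biz_field_unique) values ('v2', 'u1');
--            fails: Duplicate entry 'u1' for key 'udx_t_test_biz_field_unique'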

II. Transaction version

What if we wrap the SQL in a transaction? Would that be enough?

// pseudocode
transaction.open  // open the transaction
var data = "select * from t_test where biz_field_unique = uniqueValue;"
if (data == null) {
    "insert into t_test(biz_field, biz_field_unique) values ('value', 'uniqueValue');"
} else {
    "update t_test set biz_field = value where biz_field_unique = uniqueValue;"
}
transaction.commit  // commit the transaction

It would not: a plain SELECT takes no lock, so request 2's SELECT can still execute before request 1's INSERT commits, and the same duplicate key error occurs.

III. Mutex version

In that case, what if we take a mutex manually?

// pseudocode
transaction.open  // open the transaction
// the select statement acquires a lock via FOR UPDATE
var data = "select * from t_test where biz_field_unique = uniqueValue for update;"
if (data == null) {
    "insert into t_test(biz_field, biz_field_unique) values ('value', 'uniqueValue');"
} else {
    "update t_test set biz_field = value where biz_field_unique = uniqueValue;"
}
transaction.commit  // commit the transaction

This version works, but execution becomes effectively serial, so performance suffers, and the lock contention happens inside MySQL, putting pressure on the DB.

IV. Distributed lock version

Since we would rather not put that pressure on the DB, the problem can also be attacked from the service side:

// pseudocode
distributeLock.lock  // acquire the distributed lock
var data = "select * from t_test where biz_field_unique = uniqueValue;"
if (data == null) {
    "insert into t_test(biz_field, biz_field_unique) values ('value', 'uniqueValue');"
} else {
    "update t_test set biz_field = value where biz_field_unique = uniqueValue;"
}
distributeLock.unlock  // release the distributed lock

With a distributed lock the contention moves out of the DB, but this block of code still executes serially, so performance remains poor.

V. Business-aware version

The previous versions all aim for a general-purpose solution; can we do better by factoring in how the business actually uses the table? First, it is clear that rows are never deleted from this table: once a record is inserted, every later operation on it is an update. From the data in the table we can estimate how often each branch executes:

// pseudocode
// select: executed ~2.1 billion times (100%)
var data = "select * from t_test where biz_field_unique = uniqueValue;"
if (data == null) {
    // insert: executed ~50k times (0.0024%)
    "insert into t_test(biz_field, biz_field_unique) values ('value', 'uniqueValue');"
} else {
    // update: executed ~2.1 billion - 50k times (99.9976%)
    "update t_test set biz_field = value where biz_field_unique = uniqueValue;"
}

The numbers show that inserts are a tiny fraction of the traffic, so serializing the whole flow for their sake is not worthwhile. Instead, we optimize only the insert branch: replacing the plain INSERT with INSERT ... ON DUPLICATE KEY UPDATE strikes a balance between id consumption and performance.

// pseudocode
// select
var data = "select * from t_test where biz_field_unique = uniqueValue;"
if (data == null) {
    // insert into ... on duplicate key update
    "insert into t_test(biz_field, biz_field_unique) values ('value', 'uniqueValue') on duplicate key update biz_field = value;"
} else {
    // update
    "update t_test set biz_field = value where biz_field_unique = uniqueValue;"
}
  • With INSERT ... ON DUPLICATE KEY UPDATE, an id value is still wasted whenever a conflict occurs, but in this flow conflicts are rare, so the growth rate is acceptable

Closing thoughts

When tackling a problem, it generally pays to understand the business scenario first. Many problems have general-purpose solutions, but those are not always optimal; the business context has to be weighed in. For the problem in this article, a distributed lock would work, but because our business never physically deletes data, we can afford to discard some id values in exchange for correct data while keeping good performance.