This article is compiled by Mr. Lai Zheng in the efficient operation and maintenance community 728 database salon sharing, Mr. Lai’s topic is MySQL 8.0 InnoDB new features.

I’m sorry. I just met some old friends. New ones are welcome. I’m very glad to participate in this activity. The topic of my speech today is what is good about InnoDB in MySQL8.0.

I would like to ask if you have used MySQL8.0. Show of hands. OK, not many.

Our Release date in China should be June. Some of the features, I think you are more interested in.

After all, 8.0 is a very large version of the change, we directly jumped from 5.7 to 8.0, the change of version rules, we will have some different paths in the product development process.

My speech today is mainly divided into several parts, the first is the improvement of functions, mainly talking about InnoDB in MySQL8.0, I only talk about InnoDB, because I am InnoDB group, what functions InnoDB provides in MySQL8.0.

Part two, performance improvements, what InnoDB is doing in 8.0.

And some of the things that I’m going to talk about today, the features and performance improvements that I’m going to talk about today, are just features that are already in our 8.0 code. In the future, there may be some new features in 8.0, which we will introduce in the future outlook.

The first big change in InnoDB, in 8.0, is the reconstruction of the data dictionary. You should know what a data dictionary is, that is, table structure, your user definition, everything related to DDL goes into the data dictionary.

Why are we doing this refactoring? Because before 8.0, 5.7, 5.6 and earlier, it has a big problem. MySQL has two sets of data dictionaries, one set of data dictionaries in MySQL and one set of data dictionaries in InnoDB.

MySQL layer, where does it put these data dictionaries? You put it in files,.frm stuff.

Where InnoDB is, InnoDB is in the head of its own table. What’s the problem with that?

First, it does not support atomic operations, i.e. the DDL cannot do atomic operations, whereas InnoDB tables can. There is no support for atomic operations and hence there is nothing. The two sets of dictionaries do not match.

The same table may have one table structure in MySQL and another in InnoDB, so the second problem is that the data dictionary is inconsistent.

Third, we need to be very careful when dealing with DDL concurrency, because two sets of data dictionaries have different concurrency mechanisms.

InnoDB uses dict_sys::mutex, but MySQL uses MDL, which is more likely to cause deadlocks because there are two different locking mechanisms mixed together.

Finally, a possible problem is that the table cannot be recovered from a crash, because the table structure is already broken.

This is something we’ve been wanting to do for a long time, and it took us until 8.0 to get it done. After we made these changes in 8.0, we got some benefits.

First of all, there is only one data dictionary. Second, atomic manipulation is supported, and there will be no inconsistency. Third, we support transactional DDL, which is the ACID feature where you make a DDL that supports all transactions.

Fourth, data dictionaries are only stored in InnoDB. As I mentioned earlier, we use the same locking mechanism to manage concurrency control for this data dictionary, namely MDL manages it. The.frm file, we have basically removed it, there may be residual.FRm files in 8.0, but we have basically eliminated it.

So this is one of the big, basic things we did in 8.0.

But you may not have too much feeling, because you usually do not encounter this problem, but once encountered, it may be a big problem, because your table is damaged, table structure is damaged.

So that you don’t have a problem like this in the future, we made this very fundamental change.

This is a diagram, in 8.0 we directly write, directly access the InnoDB data dictionary, like the original 5.7 access to FRM files, also access the InnoDB system table, this table is more intuitive to tell you how the original data dictionary and the current data dictionary.

Just now, I finished the first major improvement. It is worth mentioning that this major improvement was finally completed by the Chinese team. We finished this major function with about four Chinese people.

The function of adding fields quickly is also made by our Chinese friends, but it is not official. The initial patch was provided by Tencent team, and we improved it and merged it into our version. This function is very practical and may be used frequently.

It’s tempting to add a field to the application table now, but where was it? Adding a field will wait, maybe a long time, and it will need to create a new table and copy the original data.

Now we don’t have to, we don’t have to rebuild the table, we don’t have to copy the data.

Hair is a plus field instruction, instant return, in order to be able to do this, we changed the page structure, change the page structure of the data, can tell us, your current field records, or didn’t add field records, we aimed at two versions have different way to handle it, this is the function to achieve the purpose of, this is very practical, You will use this feature a lot in 8.0.

The third function, official security, is encryption, which is a continuation of the encryption function in 5.7. In version 5.7, we provide table data encryption, which is known to be in version 5.7.

In the next implementation, we’ll encrypt redo-log and undo log. If I don’t encrypt redo-log and undo log, I’m going to insert something that’s equal to a triple, and when it’s entered into these two logs, it’s going to have a triple, right?

So we’re going to encrypt these two so it’s more secure, and it doesn’t matter if you get your hands on the data, because it’s encrypted.

I have some shared tablespaces. You should know that you can create multiple tables in a single tablespace. Before our encryption, only support encryption a table a IBD file case, now for this can also be encrypted.

The next feature, Memcached and the enhancements to autoincrement columns, is Memcached, which is a caching system. InnoDB’s Memcached plugin, which you may not be familiar with.

This plugin provides direct access to InnoDB without the need to go through the upper level parsing of things, and direct API access to InnoDB.

In 8.0, we made enhancements to the Memcached plugin that now support multiple Get and range queries.

It used to be a single point search, but now you type in a range, and I’m going to find all the data greater than 5, less than 100, and I’m going to find all the values, and I can do that now.

The second function, the improvement of the increment column, originally comes from the current value of the increment column. How do we remember it? Actually, we don’t remember, we don’t know what the maximum value of its increment column is now.

How do we know? Getting the maximum value of the current increment column every time the table is restarted or opened can cause some problems, sometimes the increment column is repeated or something else is seen.

We will now put it in the data dictionary and not find the current value of the autoincrement column in this way.

There are also information schema changes, such as adding new information to access the current schema state, which will not be covered.

I’ve just talked about some of the major feature improvements in 8.0, and I’m looking ahead to more major features.

Next, for the second part, I’ll talk about the performance improvements InnoDB has made in 8.0.

The first significant performance improvement is the redo log parallel writing. Writing redo logs is a bottleneck in InnoDB. Because everyone does a redo log when they’re doing DML.

Write a redo log. In high concurrency cases, everyone has to write a redo log and there’s a conflict, and you don’t write as smoothly, do you?

Where would they be stuck? They would get stuck writing the log buffer, which has a large mutex protecting the log buffer. We removed this in 8.0.

Now, to write a redo log, we use an algorithm that doesn’t require a lock, which preallocates a chunk of the log buffer to a log buffer, which is allocated by LSN.

So instead of stealing the log buffer, it preallocates a portion of the log buffer that you write to, which greatly improves the redo log bottleneck.

For an idea of how much improvement, you can go to the website of our testing colleague’s department. His dedicated website breaks down the test results, and at least 50% performance improvement in high concurrency cases.

Especially as you keep increasing the number of threads, you used to have a peak where the performance curve would go down, and now it’s going to go up, especially with 128 concurrent threads, and you can look at that curve.

The second part of the performance improvement is three. The first is a cost-based optimization that provides statistics, such as pages in memory in the index, to the optimizer to see which index I am currently using is appropriate.

Percona also provides a buffer pool that is removed and divided into several buffer pools. Ali’s optimized redo log patch.

Purge has also been used to purge tables by ID, which allows us to group them by table ID to reduce conflicts when there are multiple threads.

When deleting data, the secondary index is not deleted immediately, but marked as deleted. The advantage of this is that I can delete data more quickly.

But when does that data get deleted? The Purge thread does it by looking for all the secondary index records that have been marked and which can be deleted, and then it deletes them all.

Before we didn’t have this feature, everyone would be mixed together, so it would be slow. Now we are going to group by this method, so it will be faster and have less impact on the foreground thread.

Deadlock detection can be adjusted dynamically. I expect that few people adjust deadlock detection, which may not have a significant impact on performance, but only in extreme cases.

In addition to these two big chunks, the first feature we are likely to add in 8.0 is better management of tablespaces. Our tablespaces will be more and more like Oracle’s.

So, for example, versioning a table space, so I can put tables of different formats, pages of different formats in the same table space, and that might provide some functionality in the future.

There is also a table space that will describe itself, so when you take the table space to another place, you need to take your dictionary definitions out, otherwise you don’t know what the structure of the table looks like.

We’ll probably use SDI in the future, put your current table structure information into this file, and then use this JSON file to copy it to the destination, and then use this JSON file to see what the structure of the table actually looks like.

In addition, there is better storage support for documents and JSON. This is in response to MangoDB, which has an advantage over MySQL in handling large text data or large blobs. This support was enhanced in 8.0.

For example, more flexible handling of BLOB objects has been implemented in version 8.0, which eliminates the need to rewrite a BLOB object by changing a field. If I change only one part of it, I might change only a very small part of the data.

There will also be performance optimizations in the future, for things life cycle optimizations, unnecessary removal, some improvements to MVCC, some improvements to deadlock detection. Also file system changes, do conflict protection.

Here is the connection to download, and most importantly, you want to know the latest performance test report of 8.0, you go to my colleague’s website, you can see the latest TPS, QPS of 8.0.

With good tools, optimize your database for your business! In particular, how to improve the execution efficiency of large databases to meet business needs!

Look at GOPS 2018 Shanghai station, wonderful issues, full of dry goods! Mr. Liu Chen from China and South Korea will meet you at GOPS 2018 Shanghai Station data Special session.

Disclosure: Teacher Liu is a very funny person in private!

GOPS 2018 Shanghai Station video introduction ⬇️

Read the article for more details