MySQL > create index

Copy the Table manner

InnoDB initially supported the creation of indexes through temporary table copy. Create a temporary table with a new index, copy all the data of the original table to the temporary table, and Rename the index creation operation.

In this way, the index is created while the original table is readable. But it uses up twice as much storage.

Inplace way

Native MySQL 5.5 and the way to create indexes provided in the Innodb_plugin. Run directly on the original table without copying temporary tables. This is an improvement over the Copy Table approach.

Inplace creates an index. During the creation process, the original table is also readable, but not writable.

The Online mode

Online add index Create statement: create index….. online

MySQL 5.6.7 provides the method of creating an index. InnoDB Online Add Index, first of all, Inplace to create Index, no need to use temporary table. The original table record can be modified as the clustered index is traversed, the record is collected, and inserted into the new index. The modified records are saved in the Row Log. After the clustered index has been traversed and inserted into the new index, the Row Log is replayed to bring the new index into a consistent state with the clustered index record.

Compared with Copy Table mode, Online Add Index adopts Inplace mode, which does not need Copy Table and reduces space overhead. Meanwhile, Online Add Index locks the table only when the last Block of the Row Log is replayed, reducing the time it takes to lock the table.

Compared with Inplace, Online Add Index absorbs the advantages of Inplace, but reduces the time of locking the table. The lack of version information on the new index makes it impossible to provide snapshot reads for older transactions.

2. Database index principle

A database index is a data structure that is maintained through additional write operations and storage space to improve the speed of data reading. Mainstream RDBMS take balanced Tree (B- Tree, B+Tree) as the default database index structure of database table. InnoDB and MyISAM engine use B+Tree as index implementation, but different from MyISAM, InnoDB primary key index is clustered index, while secondary index is non-clustered index. All indexes of MyISAM are non-clustered indexes, and indexes meet the left-most prefix matching rule. There are also databases that use hash buckets as data structures for indexes.

B+ trees are different from B- trees

  • Nodes in the B+ tree do not store data, and all data is stored on leaf nodes. As a result, the query time complexity is fixed at log N. However, non-leaf nodes of b-tree store data, and the query time complexity is not fixed, which is related to the position of key in the tree. The best value is O(1).
  • When the B+ leaf nodes are connected in pairs, the interval access can be greatly increased and the range query can be used, etc., while the key and data of each node in the B-tree cannot be searched in the interval.
  • B+ trees are better suited for external storage. Since the internal nodes have no data field, each node can index a larger and more accurate range.

A primary key is assigned to a database table. With a MySQL database, if a primary key is not actively set, the first unique index column without NULL is selected as the primary key column and used as a clustered index. If there is no such index, a clustered index is generated using the row number as the primary key. This row number is 6bytes, incremented. You can use select _rowid from table.

Clustered index

If you give a table a primary key, the table’s on-disk storage structure changes from a neatly arranged structure to a tree structure, or “balanced tree” structure. In other words, the entire table becomes an index, or clustered index. So a table can only have one primary key, one clustered index. The primary key converts the data format of the “table” into the format of the “index (balanced tree)”. Clustered index leaf nodes hold data.

Index can increase the speed of database query data, but decrease the speed of data writing, because the structure of balanced tree must always be maintained in a balanced state, adding, deleting and modifying data will change the index data content in each node of the balanced tree, damage the tree structure, therefore, every time the data changes, The DBMS must recomb the structure of the tree (index) to make sure it is correct, which incurs a significant performance cost, which is why indexes can have side effects on operations other than queries.

Nonclustered index

Non-clustered index is the general index that we usually use, and clustered index is the same as balanced tree as the data structure of the index. The value of each node in the index tree structure comes from the index field in the table. If you add indexes to multiple fields in the table, there will be multiple independent index structures, and each index (non-clustered index) is not associated with each other. Each time a new index is created for a field, the data in the field is copied and used to generate the index. Therefore, adding indexes to a table increases the size of the table and occupies disk storage space.

The difference between a non-clustered index and a clustered index is that a clustered index can be used to find the data to be searched, while a non-clustered index can be used to find the corresponding primary key value of a record, and then use the primary key value to find the required data through a clustered index.

Cover index

Unlike clustered and non-clustered indexes, a covered index can find the desired data without using a primary key. When a field is indexed, the contents of the field are synchronized to the index. If you specify two fields for an index, the contents of both fields are synchronized to the index. By overwriting index lookup directly, you can omit the next two steps of not using overwriting index lookup, which greatly improves query performance.