Contents

This article covers indexes in the following parts:

  • I. Index overview (what an index is; advantages and disadvantages of indexes)
  • II. Creating an index
  • III. Basic principles of indexes (a key interview topic)
  • IV. Index data structures (B-tree, hash)
  • V. Principles for creating indexes (the most important part and a must-ask interview topic, so bookmark it!)
  • VI. How to delete data at the scale of millions of rows or more

I. Overview of indexes

ⅰ. What is an index?

An index is a special file (for InnoDB tables, indexes are part of the tablespace) that contains pointers to all the records in the table. Put more simply, an index is like a table of contents. Imagine using the Xinhua Dictionary after someone has torn out its index: to find the idioms that begin with a given character, you would have to page through from the first page to the thousandth. Exhausting! With the index back in place, you can locate them quickly.

ⅱ. Advantages and disadvantages of indexes:

An index can greatly speed up data retrieval, which is the main reason for creating one. Indexes also give the query optimizer more options while a query runs, which improves overall system performance. However, indexes have disadvantages too: they carry extra maintenance cost. Because an index is stored separately from the row data, every insert, update, and delete also triggers extra operations on the index, which consume additional I/O and reduce the efficiency of writes.

II. Basic use of indexes

Creating an index (three ways):

The first way:

ALTER TABLE table_name ADD INDEX index_name (column_list);
ALTER TABLE table_name ADD UNIQUE (column_list);
ALTER TABLE table_name ADD PRIMARY KEY (column_list);

ALTER TABLE can create a normal index, a UNIQUE index, or a PRIMARY KEY index.

Here table_name is the name of the table to add the index to, and column_list names the columns to index. If there are multiple columns, they are separated by commas.

The index name index_name is chosen by you; if it is omitted, MySQL assigns a name based on the first indexed column. In addition, ALTER TABLE allows multiple changes to a table in a single statement, so several indexes can be created at the same time.

Third method: use the CREATE INDEX command

CREATE INDEX can add a normal or UNIQUE index to a table. (It cannot, however, create a PRIMARY KEY index.)
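As a rough illustration of the CREATE INDEX form, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. That substitution is an assumption made only to keep the example self-contained, and the table `users` and both index names are hypothetical. The two CREATE INDEX statements are accepted by both engines; the ALTER TABLE ... ADD INDEX form is MySQL syntax that SQLite does not support, so it is not exercised here.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

# A normal index and a UNIQUE index; table and index names are hypothetical.
conn.execute("CREATE INDEX idx_users_name ON users (name)")
conn.execute("CREATE UNIQUE INDEX uq_users_email ON users (email)")

conn.execute("INSERT INTO users (name, email) VALUES ('ann', 'ann@example.com')")


def insert_duplicate_email():
    """Try to insert a second row with an existing email; True if it was rejected."""
    try:
        conn.execute("INSERT INTO users (name, email) VALUES ('bob', 'ann@example.com')")
        return False
    except sqlite3.IntegrityError:
        return True


# List the indexes that now exist on the table.
index_names = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'users'"
    )
)
```

The UNIQUE index is what makes the duplicate insert fail, while the normal index only affects how lookups on `name` are executed.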

III. Basic principles of indexes

Indexes are used to quickly find records that have specific values. If there is no index, the query will generally traverse the entire table.

The principle of indexing is simple: it turns a search over unordered data into a search over ordered data.

  1. Sort the contents of the indexed columns.
  2. Build an inverted list from the sorted result.
  3. Attach the chain of data addresses to each entry of the inverted list.
  4. At query time, first find the entry in the inverted list, then follow its data address chain to fetch the actual rows.
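The four steps above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `rows` list is hypothetical, a row's position in the list plays the role of its data address, and a sorted list searched with `bisect` stands in for the real on-disk structure.

```python
from bisect import bisect_left

# Hypothetical table: the position in the list plays the role of the data address.
rows = [
    {"name": "li", "city": "Beijing"},
    {"name": "wang", "city": "Shanghai"},
    {"name": "zhao", "city": "Beijing"},
    {"name": "chen", "city": "Guangzhou"},
]


def build_index(rows, column):
    """Steps 1-3: sort the column values and attach an address chain to each value."""
    inverted = {}
    for address, row in enumerate(rows):
        inverted.setdefault(row[column], []).append(address)
    keys = sorted(inverted)                    # ordered, distinct column values
    chains = [inverted[key] for key in keys]   # address chain for each value
    return keys, chains


def lookup(index, rows, value):
    """Step 4: binary-search the ordered keys, then follow the address chain."""
    keys, chains = index
    pos = bisect_left(keys, value)
    if pos == len(keys) or keys[pos] != value:
        return []
    return [rows[address] for address in chains[pos]]


city_index = build_index(rows, "city")
beijing = lookup(city_index, rows, "Beijing")
```

Because the keys are kept in order, the lookup needs only a binary search rather than a scan of every row.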

IV. Index data structures

  • B-tree
  • Hash

ⅰ. B-tree index

MySQL reads and writes data through storage engines, and the vast majority of users (roughly 90%) use InnoDB. In terms of implementation, InnoDB currently has only two index types: BTREE and HASH. The B-tree index is the most frequently used index type in MySQL, and almost all storage engines support it. When people say "index" without qualification, they usually mean a B-tree index. In InnoDB it is actually implemented as a B+ tree; MySQL reports the index type as BTREE when listing a table's indexes, hence the shorter name.

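Query paths:

Primary key index area (PI): each entry is associated with the address where the row data is stored, so a query by primary key reaches the row directly.

Ordinary index area (SI): each entry is associated with the row's ID, which is then looked up in the primary key index above to reach the address. This is why querying by primary key is the fastest.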
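The difference between the two query paths can be illustrated with a small Python sketch. Everything here is hypothetical (the addresses, the `name` column, and the use of plain dictionaries in place of tree structures); it only shows that a lookup through an ordinary index needs one more step than a lookup by primary key.

```python
# Data area: address -> row (the addresses are hypothetical).
data_area = {
    0x10: {"id": 1, "name": "ann"},
    0x20: {"id": 2, "name": "bob"},
    0x30: {"id": 3, "name": "cat"},
}

# Primary key index: primary key -> data address.
primary_index = {1: 0x10, 2: 0x20, 3: 0x30}

# Ordinary index on name: name -> primary key (not the data address).
name_index = {"ann": 1, "bob": 2, "cat": 3}


def find_by_id(row_id):
    """One index lookup: primary key -> address -> row."""
    lookups = 1
    return data_area[primary_index[row_id]], lookups


def find_by_name(name):
    """Two index lookups: name -> primary key, then primary key -> address -> row."""
    row_id = name_index[name]
    row, lookups = find_by_id(row_id)
    return row, lookups + 1
```

Both functions return the same row for the same record, but `find_by_name` reports two index lookups where `find_by_id` reports one.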

B+ tree properties:

  1. A node with n subtrees contains n keys. These keys do not store data; they only index it.
  2. The leaf nodes together contain all the keys, along with pointers to the records that hold those keys, and the leaves are linked to one another in ascending key order.
  3. All non-leaf nodes can be regarded as the index part; each contains only the largest (or smallest) key of each of its subtrees.
  4. In a B+ tree, data objects are inserted and deleted only at the leaf nodes.
  5. A B+ tree has two head pointers: one to the root node of the tree and one to the leaf node with the smallest key.
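To make these properties more concrete, here is a deliberately simplified Python sketch. It assumes a static, two-level tree built once from sorted data (no insertion, deletion, or node splitting), and the class and method names are hypothetical. It shows a root that holds only the largest key of each subtree, leaves that hold the record pointers, and a leaf chain that makes range scans cheap.

```python
from bisect import bisect_left


class Leaf:
    """Leaf node: holds keys, record pointers, and a link to the next leaf."""

    def __init__(self, keys, records):
        self.keys = keys
        self.records = records
        self.next = None


class BPlusTree:
    """Static two-level B+ tree built once from sorted data (a simplification)."""

    def __init__(self, items, leaf_size=3):
        items = sorted(items)
        self.leaves = []
        for i in range(0, len(items), leaf_size):
            chunk = items[i:i + leaf_size]
            self.leaves.append(Leaf([k for k, _ in chunk], [r for _, r in chunk]))
        for left, right in zip(self.leaves, self.leaves[1:]):
            left.next = right  # leaves are linked in ascending key order
        # The root stores only the largest key of each subtree, no record data.
        self.root_keys = [leaf.keys[-1] for leaf in self.leaves]
        # Second head pointer: the leaf holding the smallest key.
        self.first_leaf = self.leaves[0] if self.leaves else None

    def _find_leaf(self, key):
        """Descend from the root to the leaf whose key range could contain key."""
        pos = bisect_left(self.root_keys, key)
        return self.leaves[pos] if pos < len(self.leaves) else None

    def search(self, key):
        """Point lookup: return the record for key, or None if it is absent."""
        leaf = self._find_leaf(key)
        if leaf is None:
            return None
        pos = bisect_left(leaf.keys, key)
        if pos < len(leaf.keys) and leaf.keys[pos] == key:
            return leaf.records[pos]
        return None

    def range_scan(self, low, high):
        """Return records with low <= key <= high by walking the leaf chain."""
        result = []
        leaf = self._find_leaf(low)
        while leaf is not None:
            for key, record in zip(leaf.keys, leaf.records):
                if key > high:
                    return result
                if key >= low:
                    result.append(record)
            leaf = leaf.next
        return result


tree = BPlusTree([(k, f"row{k}") for k in [5, 1, 9, 3, 7, 2, 8]])
```

With this data the leaves hold the keys [1, 2, 3], [5, 7, 8] and [9], so a range scan from 3 to 8 starts in the first leaf and follows the links without returning to the root.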

ⅱ. Hash index

In short, a hash index is implemented much like the simple hash table from a data structures course. When we use a hash index in MySQL, a hash algorithm (common ones include the direct addressing, mid-square, folding, division-remainder, and random number methods) converts the column value into a fixed-length hash value, and the row pointer for that data is stored at the corresponding position in the hash table. If a hash collision occurs (two different keys produce the same hash value), the entries are stored in a linked list under the corresponding hash key. Of course, this is only a rough model.
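The rough model described above can be written down as a short Python sketch. The hash function here (a character sum reduced with the division-remainder method) is an assumption chosen so that a collision is easy to produce; it is not the function MySQL uses, and the class name and row pointers are hypothetical.

```python
class HashIndex:
    """Toy hash index: slot = hash(key) % size, with collisions chained in a list."""

    def __init__(self, size=8):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _slot(self, key):
        # Division-remainder method applied to a simple character sum (assumption).
        return sum(ord(ch) for ch in str(key)) % self.size

    def insert(self, key, row_pointer):
        """Store the row pointer in the chain of the slot the key hashes to."""
        self.buckets[self._slot(key)].append((key, row_pointer))

    def lookup(self, key):
        """Walk the chain and compare full keys, since different keys may collide."""
        return [ptr for k, ptr in self.buckets[self._slot(key)] if k == key]


index = HashIndex(size=8)
index.insert("ab", 101)
index.insert("ba", 102)  # same character sum as "ab", so the two keys collide
index.insert("cd", 103)
```

Because slots are not ordered by key, a structure like this suits equality lookups rather than range scans.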

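V. Principles for creating indexes

Indexes are useful, but they should not be added without restraint. It is best to follow these principles:

  1. Leftmost-prefix matching: MySQL keeps matching index columns from left to right until it meets a range condition (>, <, between, like), and then stops. For example, given a = 1 and b = 2 and c > 3 and d = 4, an index on (a,b,c,d) cannot be used for d, whereas an index on (a,b,d,c) can be used for all four columns; the order of a, b, and d within the index can be changed freely.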
  2. Create indexes for fields that are used as query criteria more frequently
  3. Frequently updated fields are not suitable for creating indexes
  4. A column with low selectivity is not a good index column. Gender, for example, has at most three values (male, female, unknown), which is too few to narrow the data down effectively.
  5. Extend existing indexes where possible instead of creating new ones. For example, if a table already has an index on a and you now want an index on (a,b), you only need to modify the original index.
  6. Columns that define foreign keys should always be indexed.
  7. Do not index columns that are rarely used in queries or that contain many duplicate values.
  8. Do not index columns of data types defined as text, image, and bit.
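The leftmost-prefix rule from the first principle can be modelled with a small Python function. This is a simplified sketch of the rule as stated in the list, not of MySQL's actual optimizer; the function name and the way predicates are represented are assumptions.

```python
def usable_index_columns(index_columns, predicates):
    """Return the index columns that can be matched under the leftmost-prefix rule.

    predicates maps a column name to its operator, e.g. {"a": "=", "c": ">"}.
    """
    range_ops = {">", "<", ">=", "<=", "between", "like"}
    used = []
    for column in index_columns:
        op = predicates.get(column)
        if op is None:
            break  # gap in the prefix: matching stops here
        used.append(column)
        if op in range_ops:
            break  # range condition: columns to its right are not matched
    return used


# The example query: a = 1 and b = 2 and c > 3 and d = 4
where = {"a": "=", "b": "=", "c": ">", "d": "="}
```

Calling it with the column order (a,b,c,d) stops after c, while the order (a,b,d,c) lets all four columns take part, which is why the range column is best placed last.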

VI. How to delete data at the scale of millions of rows or more

About indexes: an index carries extra maintenance cost. Because the index is stored separately from the row data, every insert, update, and delete also triggers extra operations on the index, and these consume additional I/O and reduce the efficiency of writes. According to the official MySQL manual, the time needed to delete rows is proportional to the number of indexes on the table.

  1. So when we want to delete millions of rows, we can drop the indexes first.
  2. Then delete the useless data (this step took less than two minutes).
  3. Re-create the indexes after the deletion. With less data left, this is also quick: about 10 minutes.
  4. This is certainly much faster than deleting directly with the indexes in place. Worse still, if a direct delete is interrupted partway, the whole delete is rolled back.
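The drop, delete, re-create sequence can be sketched as follows. As an assumption made to keep the example self-contained, it again uses Python's built-in sqlite3 module as a stand-in for MySQL, with a hypothetical `logs` table and a much smaller row count; it shows only the order of operations and makes no claim about timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, level TEXT, msg TEXT)")
conn.execute("CREATE INDEX idx_logs_level ON logs (level)")
conn.executemany(
    "INSERT INTO logs (level, msg) VALUES (?, ?)",
    [("debug" if i % 10 else "error", f"message {i}") for i in range(10000)],
)
conn.commit()

# 1. Drop the index first (in MySQL: DROP INDEX idx_logs_level ON logs).
conn.execute("DROP INDEX idx_logs_level")
# 2. Delete the useless rows while no secondary index has to be maintained.
conn.execute("DELETE FROM logs WHERE level = 'debug'")
# 3. Re-create the index on the much smaller table.
conn.execute("CREATE INDEX idx_logs_level ON logs (level)")
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
index_exists = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'index' AND name = 'idx_logs_level'"
).fetchone()[0] == 1
```

After the three steps only the rows worth keeping remain, and the index exists again on the smaller table.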
