Count (*), count(1), and count(col) may confuse you.

The count function

COUNT(expression) : Returns the total number of queried records. The expression parameter is a field or an asterisk.

test

MySQL version: 5.7.29

Create a user table and insert a million rows of data with 500,000 rows null for the GENDER field

CREATE TABLE `users` (
  `Id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'id',
  `name` varchar(32) DEFAULT NULL COMMENT 'name',
  `gender` varchar(20) DEFAULT NULL COMMENT 'gender',
  `create_date` datetime DEFAULT NULL COMMENT 'Creation time'.PRIMARY KEY (`Id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC COMMENT='User table';
Copy the code

count(*)

Before MySQL 5.7.18, InnoDB processed statements by scanning clustered indexes. SELECT COUNT(*) Since MySQL 5.7.18, InnoDB handles SELECT COUNT(*) statements by iterating over the smallest available secondary index, unless the index or optimizer prompts the optimizer to use a different index. If the secondary index does not exist, the clustered index is scanned. If there are multiple secondary indexes, select the smallest secondary index first to reduce the cost. If there are no secondary indexes, use clustered indexes. Here’s a test to verify these ideas.

  • First, query the execution plan with only Id as the primary key index,

As you can see, type is index, key is PRIMARY, and key_len=8.

  • Next, index the name field and view it again using the execution plan

You can see that the same index is used, but the index is the name field index, key_len=99.

  • Then look at the execution plan again by adding an index to the create_date field with the name field index intact

Create_date index key_len=6

Regardless of which index is used, the total number of rows queried is one million, whether they contain NULL values or not.

count(1)

Count (1), like count(*), returns a million pieces of data, whether or not they contain NULL values.

count(col)

Count (col) Count the value of a column, divided into three cases:

The count (id) : id of statistics

The same is true for count(*), which returns one million entries.

Count (index col) : statistics on indexed fields

Use count(name) to query information. The execution plan is as follows:You can see that the index field is used for the statistics, and the index is hit. Set the name field in a column to NULL and run the count query to return 999999The NULL value of this column is set to an empty string, and the count query returns 1000000So, to sum up, the simple count of rows using index fields can hit the index, and only count the number of rows that are not NULL.

Count (normal col) : statistics on fields without indexes

Counting non-indexed fields does not use the index and only counts rows that are not NULL.

Count (1) and count(*)

I don’t know where I saw it or heard it before, but count(1) is more efficient than count(*), which is a wrong perception, because it says on the website, InnoDB handles SELECT COUNT( *) and SELECT COUNT(1) operations in the same way. There is no performance difference. InnoDB handles SELECT COUNT(*) and SELECT COUNT(1) operations in the same way, with no performance difference.

For MyISAM tables, if you retrieve from one table, no other columns are retrieved and there is no clause, COUNT(*) is optimized to return very quickly. This optimization only applies to MyISAM tables because the exact number of rows is stored for this storage engine and can be accessed very quickly. COUNT(1) The same optimization is performed only if the first column is defined NOT NULL. These optimizations are built without where and group by.

Ali development specification also mentionedSo if you can use count(*) in development, use count(*).

conclusion

Count (*), count(1), count(id): return the total number of records in the query, regardless of whether the field contains a null value, and count() and count(1) efficiency is the same, no difference. It can be inferred from the above execution plan that the efficiency of count(id) and count() should be the same or close to that of count() and count(1). If you are interested, you can test it. The number of rows that are not NULL is counted for both non-primary key indexes and non-indexed fields.

General ability, limited level, if any mistakes, please point out. Give it a thumbs up if it helps you