38 | all say InnoDB good, that also don’t use the Memory engine?
Order by NULL is used in both group by statements
- Why is the value 0 in the last line of a temporary memory table?
- When you use a temporary disk table, the value 0 is in the first row, right?
The data organization structure of a memory table
For the sake of analysis, I will simplify this problem by assuming the following two tables t1 and T2, where TABLE T1 uses the Memory engine and table T2 uses the InnoDB engine. I then perform select * fromt1 and select * fromt2, respectively.
create table t1(id int primary key, c int) engine=Memory; create table t2(id int primary key, c int) engine=innodb; Insert into t1 values (1, 1), (2, 2), (3, 3), (4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9), (0, 0); Insert into t2 values (1, 1), (2, 2), (3, 3), (4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9), (0, 0);Copy the code
As you can see, memory table T1 returns 0 on the last row, while InnoDB table T2 returns 0 on the first row.
The reason for this difference starts with the way the primary key indexes of the two engines are organized.
Table T2 uses the InnoDB engine. Its primary key index id is organized in a way that you are already familiar with: InnoDB table data is placed in the primary key index tree, the primary key index is B+ tree. Therefore, the data organization of Table T2 is shown in the figure below:
The values on the primary key index are stored in order. When you perform select *, you scan the leaves from left to right, so 0 appears in the first row.
Unlike InnoDB, the Memory engine keeps data and indexes separate. Let’s look at the data in table T1.
As you can see, the data portion of the memory table is stored separately as an array, while the primary key ID index stores the location of each data. The primary key ID is a hash index, and you can see that the keys on the index are not in order.
In memory table T1, when I perform select *, I do a full table scan, which is a sequential scan of the array. Therefore, 0 is the last data to be read and put into the result set.
InnoDB and The Memory engine organize data differently:
- The InnoDB engine puts data on the primary key index, and the other indexes hold the primary key ID. This approach is called an index Organized Table.
- The Memory engine uses a data organization form that stores the data separately and stores the data location on the index, which is called heap Organization Table (HeapOrganizied Table).
From this we can see some typical differences between the two engines:
- InnoDB table data is always stored in order, while memory table data is stored in the order of writing.
- When data files have empty Spaces, InnoDB tables can only write new values at fixed positions to ensure data orderliness, while memory tables can insert new values when they find empty Spaces.
- When data location changes, InnoDB tables only need to change the primary key index, while memory tables need to change all indexes.
- InnoDB tables require one index lookup for primary index queries and two index lookups for normal index queries. Memory tables have no such distinction, and all indexes have the same “status.”
- InnoDB supports variable-length data types, different records may have different lengths; The memory table does not support Blob and Text fields, and even if vARCHar (N) is defined, it is actually stored as char(N), a fixed-length string, so each row of the memory table has the same length of data.
Due to these properties of the memory table, after each row is deleted, the empty space can be reused by the data to be inserted later. For example, if you want to execute in table T1:
delete from t1 where id=5; Insert into t1 values (10, 10); select * from t1;Copy the code
You’ll see that in the result, the line id=10 appears after id=4, which is where id=5 was. It should be noted that the primary key index for table T1 is a hash index, so if a range query is performed, for example
select * from t1 where id<5;
Select * from primary key; select * from primary key; So what if you want a memory table to support a range scan?
Hash index and B-tree index
In fact, the memory table also supports b-tree index. Select * from b-tree where id = 1;
alter table t1 add index a_btree_index using btree (id);
The new b-tree index is familiar to you. It is similar to InnoDB’s B + Tree index. For comparison, you can look at the output of these two statements:
SQL > select * from T1 where id<5; Use force index to force the primary key id; the id=0 line is at the end of the result set.
In fact, one of the reasons we often think Memory tables have the advantage of being fast is that the Memory engine supports hash indexing. Of course, the more important reason is that all the data in the memory table is kept in memory, and memory is always faster to read and write than disk.
It is not recommended that you use memory tables in production. There are two main reasons for this:
- Lock granularity problem;
- Data persistence issues.
Memory table lock
Let’s start with the lock granularity of memory tables. Memory tables do not support row locks, only table locks are supported. Therefore, any update to a table blocks all other reads and writes on that table. Note that the table locks are different from the MDL locks we introduced earlier, but they are table-level locks. Next, I use the following scenario, with you to simulate the table level lock memory table.
In this execution sequence, session A’s UPDATE statement is executed for 50 seconds, during which session B’s query enters A lockwait state. The showprocessList result of session C is as follows:
Table locks do not support concurrent access as well as row locks. Therefore, the lock granularity of the memory table determines that its performance is not good when handling concurrent transactions.
Data persistence Issues
Next, let’s look at data persistence.
Memory tables have the advantage of having data in memory, but they are also a disadvantage. When the database restarts, all memory tables will be cleared.
You might say that if the database restarts unexpectedly and the memory table is cleared, there will be no problem. However, in highly available architectures, this feature of memory tables can be considered a bug. Why do you say that?
Let’s first look at the problem of using memory tables in m-S architecture.
Let’s look at the following sequence:
- Normal business access to the master library;
- The standby database is restarted after hardware upgrade, and the memory table T1 is cleared.
- When the standby database restarts and the client sends an UPDATE statement to modify rows in table T1, the standby database application thread will report an error “no rows to update”.
This will cause the master/slave synchronization to stop. Of course, if a master/slave switchover occurs, the client will see that the data in table T1 is “lost”.
In the proxy architecture, the default switchover logic is maintained by the database system itself. To the client, the network is disconnected and the memory table data is lost after the network is reconnected.
This is fine, you might say, since there is a master/slave switchover, the connection is disconnected, and the business side can sense the exception. However, this feature of memory tables makes usage even more “weird”. MySQL knows that the memory table data will be lost after the restart. MySQL implements a DELETE FROM T1 line in the binlog after the database is restarted.
If you use the double M structure shown here:
When the standby database is restarted, the delete statement in the standby database binlog is passed to the primary database and the contents of the primary database memory table are deleted. When you use it, you will notice that the main library memory table data is suddenly empty.
Based on the above analysis, you can see that memory tables are not suitable for use as regular tables in a production environment. Some students will say, but memory tables are fast. In fact, you can analyze this question as follows:
- InnoDB supports row locking and concurrency is better than memory tables.
- The amount of data that can fit into a memory table is small. If you are considering read performance, a table with high read QPS and small data volume will be cached in the InnoDB Buffer Pool even if InnoDB is used. Therefore, reading performance of tables with InnoDB is not bad.
Therefore, I recommend you to use InnoDB tables instead of normal memory tables. But there was one exception. This scenario is the user temporary table we talked about in Articles 35 and 36. If the amount of data can be controlled without consuming too much memory, you can consider using an in-memory table.
Temporary memory tables can just about ignore two memory table deficiencies for three main reasons:
- Temporary tables are not accessed by other threads, and there is no concurrency problem;
- The temporary table also needs to be deleted after the restart. The problem of clearing data does not exist.
- Temporary tables in the standby repository also do not affect user threads in the primary repository.
- Memory tables support hash indexes, a feature that can be used to speed up complex queries.