1. Foreword

While doing my high-performance Java persistence training, I realized that it was necessary to explain how relational databases work, otherwise it would be difficult to grasp many concepts related to transactions, such as atomicity, durability, and checkpoints.

In this article, I’ll provide a high-level explanation of the inner workings of relational databases, while also hinting at some database-specific implementation details.

2. A picture is worth a thousand words

3. Data pages

Disk access is slow. Memory, on the other hand, is several orders of magnitude faster than solid-state drives. For this reason, database vendors try to delay disk access as much as possible. Whether we’re talking about tables or indexes, the data is divided into pages of a certain size (for example, 8 KB).

When data (tables or indexes) needs to be read, the relational database maps the disk-based pages into an in-memory buffer. When data needs to be modified, the database changes the in-memory pages. To synchronize the in-memory pages with the disk, they must be flushed (for example, via fsync).

The buffer pool that caches disk-based pages is limited in size, so it usually holds only a working set of the data. The buffer pool can hold the entire data set only if all of it fits in memory. If the data on disk is larger than the buffer pool, then whenever new pages need to be cached, the buffer pool has to evict old pages to make room for them.
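As a rough illustration of the caching and eviction behavior described above, here is a minimal sketch of a buffer pool built on an LRU map. The `BufferPool` class, the 8 KB page size, and the `readPageFromDisk` stub are hypothetical simplifications for this article, not how any real database implements its buffer pool:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a buffer pool holding a bounded number of pages,
// evicting the least-recently-used page when a new page needs room.
public class BufferPool {
    private final int capacity;
    private final Map<Integer, byte[]> pages; // pageId -> page image

    public BufferPool(int capacity) {
        this.capacity = capacity;
        // accessOrder = true makes iteration order follow recency of access,
        // so the eldest entry is always the least-recently-used page.
        this.pages = new LinkedHashMap<>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                return size() > BufferPool.this.capacity; // evict the LRU page
            }
        };
    }

    // Return the cached page, loading it from disk on a miss.
    public byte[] getPage(int pageId) {
        return pages.computeIfAbsent(pageId, id -> readPageFromDisk(id));
    }

    public boolean isCached(int pageId) {
        return pages.containsKey(pageId);
    }

    private byte[] readPageFromDisk(int pageId) {
        return new byte[8 * 1024]; // stand-in for a real 8 KB disk read
    }
}
```

With a capacity of two pages, reading pages 1 and 2, touching page 1 again, and then reading page 3 evicts page 2, because page 1 was used more recently.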

4. The undo log

Because the in-memory changes can be accessed by multiple concurrent transactions, a concurrency control mechanism (such as 2PL or MVCC) must be used to ensure data integrity. Therefore, once a transaction modifies a table row, the uncommitted change is applied to the in-memory structure, while the previous data is stored temporarily in an append-only structure called the undo log.

While this structure is called the undo log in Oracle and MySQL, in SQL Server the transaction log plays this role. PostgreSQL has no undo log; instead, a multi-version table structure serves the same purpose, because tables can store multiple versions of the same row. All of these structures are used to provide rollback capability, which is mandatory for atomicity.

If the currently running transaction is rolled back, the undo log is used to rebuild the in-memory pages as they were when the transaction started.
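The before-image mechanism can be sketched in a few lines. The `UndoLogDemo` class below is a hypothetical toy that stores rows as strings; a real undo log records page-level before-images, but the rollback idea (replay the saved old values in reverse order) is the same:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: an append-only undo log that records the previous
// value of every row a transaction modifies, so the change can be undone.
public class UndoLogDemo {
    private final Map<String, String> rows = new HashMap<>();   // rowId -> value
    private final Deque<String[]> undoLog = new ArrayDeque<>(); // {rowId, oldValue}

    public void update(String rowId, String newValue) {
        // Append the before-image first, then apply the uncommitted change.
        undoLog.push(new String[] { rowId, rows.get(rowId) });
        rows.put(rowId, newValue);
    }

    public void rollback() {
        // Undo the changes in reverse order to restore the starting state.
        while (!undoLog.isEmpty()) {
            String[] entry = undoLog.pop();
            if (entry[1] == null) rows.remove(entry[0]); // row did not exist before
            else rows.put(entry[0], entry[1]);
        }
    }

    public void commit() {
        undoLog.clear(); // committed changes no longer need their before-images
    }

    public String read(String rowId) {
        return rows.get(rowId);
    }
}
```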

5. The redo log

Once a transaction commits, the in-memory changes must become permanent. However, this does not mean that every transaction commit triggers an fsync; that would be very bad for application performance. From the ACID transaction properties, we know that a committed transaction must be durable, meaning committed changes must survive even if the database engine loses power.

So how can a relational database provide durability without issuing an fsync every time a transaction commits?

This is where the redo log comes in. The redo log is another append-only, disk-based structure that stores every change made by a given transaction. When a transaction commits, every data page change is therefore also written to the redo log. Writing to the redo log is faster than flushing the modified data pages, because sequential disk access is much faster than random access, and this allows transactions to commit quickly.
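Here is a minimal sketch of that commit path: redo records are appended sequentially to one log file, and commit durability comes from a single fsync of that file rather than from flushing every modified data page. The `RedoLogWriter` class and its record format are hypothetical simplifications:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: committing a transaction by appending its page
// changes to a redo log file, then fsync-ing only that sequential file.
public class RedoLogWriter {
    private final FileChannel log;

    public RedoLogWriter(Path logFile) throws IOException {
        this.log = FileChannel.open(logFile,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // One redo record per page change: "txId,pageId,newContent".
    public void append(long txId, int pageId, String change) throws IOException {
        String record = txId + "," + pageId + "," + change + "\n";
        log.write(ByteBuffer.wrap(record.getBytes(StandardCharsets.UTF_8)));
    }

    // Commit = make the sequentially written records durable with one fsync.
    public void commit() throws IOException {
        log.force(true); // once this returns, the commit survives a crash
    }
}
```

The dirty data pages themselves can stay in memory and be flushed later, because the redo log alone is enough to reconstruct them after a crash.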

Although this structure is called the redo log in Oracle and MySQL, in SQL Server the transaction log plays this role as well. PostgreSQL calls it the write-ahead log (WAL).

But when are the in-memory changes flushed to disk?

Relational database systems use checkpoints to synchronize the dirty in-memory pages with their disk-based counterparts. To avoid congesting I/O traffic, the synchronization is usually done gradually, in batches, over a longer period of time.

But what happens if the relational database crashes before all the dirty in-memory pages are flushed to disk?

In the event of a crash, at startup the database uses the redo log to rebuild the disk-based data pages that had not been synchronized since the last successful checkpoint.
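The recovery step can be sketched as follows: a checkpoint remembers up to which log position the on-disk pages are in sync, and after a crash only the redo records past that position are replayed. The `RecoveryDemo` class and its use of string page images are hypothetical simplifications of this idea:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: crash recovery that replays the redo records
// written after the last completed checkpoint to rebuild data pages
// that never made it to disk.
public class RecoveryDemo {
    record RedoRecord(long lsn, int pageId, String pageImage) {}

    private final List<RedoRecord> redoLog = new ArrayList<>();
    private long checkpointLsn = 0; // changes up to this LSN are on disk

    public void logChange(long lsn, int pageId, String pageImage) {
        redoLog.add(new RedoRecord(lsn, pageId, pageImage));
    }

    // A checkpoint records how far the on-disk pages are in sync with the log.
    public void checkpoint(long lsn) {
        checkpointLsn = lsn;
    }

    // After a crash, replay only the records past the last checkpoint.
    public Map<Integer, String> recover(Map<Integer, String> diskPages) {
        for (RedoRecord r : redoLog) {
            if (r.lsn() > checkpointLsn) {
                diskPages.put(r.pageId(), r.pageImage());
            }
        }
        return diskPages;
    }
}
```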

6. Conclusion

These design choices exist to overcome the high latency of disk-based storage while still providing durable storage guarantees. The undo log provides atomicity (rollback capability), while the redo log ensures the durability of the disk-based pages (tables and indexes).

7. Translator's note

Hello everyone, I am Brother Spring (author of the open-source mica Java microservice components). Translation is not easy, so please help share this with more people. Thank you!