Author: Doug, an embedded developer with 10+ years of experience, focusing on C/C++, embedded systems, and Linux.


On x86 systems, physical memory is paged in 4KB units for more flexible use.

A contiguous virtual memory space is then mapped onto scattered physical pages through an intermediate mapping table.

Each entry in the mapping table points to the beginning address of a physical page.

However, such a mapping table has an obvious disadvantage: the mapping table itself also needs to be stored in physical memory.

On 32-bit systems, it uses up to 4MB of physical memory space (4 bytes per entry for a total of 4G/4K entries).
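The 4MB figure can be checked with a few lines of C. This is only a sketch of the arithmetic; the names `flat_table_bytes`, `PAGE_SIZE` and `ENTRY_SIZE` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE  4096u  /* 4 KB per physical page */
#define ENTRY_SIZE 4u     /* 4 bytes per mapping table entry */

/* Bytes needed by a single-level mapping table that covers space_bytes. */
uint64_t flat_table_bytes(uint64_t space_bytes)
{
    assert(space_bytes % PAGE_SIZE == 0);        /* whole pages only */
    uint64_t entries = space_bytes / PAGE_SIZE;  /* one entry per page */
    return entries * ENTRY_SIZE;
}
```

For the full 4GB space, `flat_table_bytes` works out to 1024 * 1024 entries of 4 bytes each, that is, 4MB.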

To solve this problem, x86 processors use a two-level transformation: page directory and page table.

In this article, we start from the most basic calculations and work through this key memory management mechanism step by step, so that the more advanced topics will be easier to understand later.

The page table splitting process

In a 32-bit system, the highest representable physical address is 0xFFFF_FFFF, so the maximum physical memory space is 4GB.

Although the actual installed physical memory may be much smaller, the memory management mechanism needs to be designed with the maximum addressable range in mind.

With 4KB per physical page, the 4GB space can be divided into 1024 * 1024 physical pages:

In the previous article, using a single mapping table to point to these physical pages resulted in the mapping table itself taking up too much physical memory space.

The segments defined in a user program may actually use very little space, far less than 4 GB.

But the system would still need to allocate up to 4MB of physical memory to hold the mapping table, which is wasteful.

To solve this problem, we can split the single mapping table into 1024 smaller mapping tables:

  1. Each mapping table has only 1024 entries, and each entry still points to the start address of a physical page.

  2. A total of 1024 such mapping tables are used.

In this way, 1024 (the number of entries in each table) x 1024 (the number of tables) entries still cover 4GB of physical memory space.

Each of these tables is called a page table, so there are 1024 page tables.

There are 1024 entries in a page table, and each page entry takes up 4 bytes, so a page table takes up 4KB of physical memory space, which is exactly the size of a physical page.

If a page table takes up 4KB by itself, then 1024 page tables take up 4MB of physical memory, which is still a lot.

In total, yes. But an application cannot use all 4GB of space; it may need only a few dozen page tables.

For example, if the code segment, data segment, and stack segment of a user program take up 10 MB in total, then three page tables are enough. Together with the page directory, that is 16 KB of space in total.

Calculation process:

Each page entry points to a 4KB physical page, so 1024 page entries in a page table can cover a total of 4MB physical memory.

So a 10MB program, rounded up to a multiple of 4MB (12 MB), needs three page tables.

Remember the words above: a page table can cover 4MB of physical memory space (1024 x 4 KB).
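The rounding step can be written out as a small C sketch. The function names are hypothetical, and the sketch assumes the program occupies one contiguous range that starts on a 4MB boundary (otherwise it could straddle one more page table).

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE         4096u
#define ENTRIES_PER_TABLE 1024u
#define TABLE_COVERAGE    (PAGE_SIZE * ENTRIES_PER_TABLE)  /* 4 MB per page table */

/* Number of page tables needed to map program_bytes, rounded up. */
uint32_t page_tables_needed(uint32_t program_bytes)
{
    uint64_t tables = ((uint64_t)program_bytes + TABLE_COVERAGE - 1) / TABLE_COVERAGE;
    assert(tables <= ENTRIES_PER_TABLE);  /* a directory holds at most 1024 tables */
    return (uint32_t)tables;
}

/* Memory used by those page tables plus one page directory, in bytes. */
uint32_t paging_overhead_bytes(uint32_t program_bytes)
{
    return (page_tables_needed(program_bytes) + 1) * PAGE_SIZE;
}
```

For the 10 MB program above, `page_tables_needed` gives 3 and `paging_overhead_bytes` gives 16 KB (three page tables plus the page directory).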

The format of each entry in the page table is as follows:

Note the following attributes:

P (Present): the presence bit. 1 – the physical page exists; 0 – the physical page does not exist;

RW (Read/Write): the read/write bit. 1 – the physical page is readable and writable; 0 – the physical page is read-only;

D (Dirty): the dirty bit. Indicates whether the data in this physical page has been written;
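In the classic 32-bit x86 entry layout, these three attributes sit at bit 0 (P), bit 1 (RW) and bit 6 (D). A minimal sketch of testing them in C follows; the macro and function names are my own, not taken from any particular kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_P  (1u << 0)  /* Present */
#define PTE_RW (1u << 1)  /* Read/Write */
#define PTE_D  (1u << 6)  /* Dirty */

bool pte_present(uint32_t pte)  { return (pte & PTE_P) != 0; }
bool pte_writable(uint32_t pte) { return (pte & PTE_RW) != 0; }
bool pte_dirty(uint32_t pte)    { return (pte & PTE_D) != 0; }

/* Models what happens to the entry when a present page is written to. */
uint32_t pte_mark_dirty(uint32_t pte)
{
    assert(pte_present(pte));  /* only a present page can be written */
    return pte | PTE_D;
}
```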

Page directory structure

Now each physical page is pointed to by an entry in a page table. So how are the addresses of the 1024 page tables managed?

The answer is: the page directory!

As the name suggests: in a page directory, each entry points to the beginning address (physical address) of a page table.

When the operating system loads a user program, it not only needs to allocate physical memory to store the contents of the program;

it also needs to allocate physical memory to hold the program’s page directory and page tables.

Do the math again:

As mentioned earlier, each page table covers 4MB of memory space, so there are 1024 entries in the page directory pointing to the physical addresses of 1024 page tables.

The page directory therefore covers 1024 * 4MB, or 4GB, which is the maximum range of a 32-bit address.

The format of each entry in the page directory is as follows:

The attribute fields are similar to those in a page table entry, except that they describe a page table rather than a physical page.

One more thing: each user program has its own page directory and page table! More on that below.

Several related registers

Now the physical addresses of all the page tables are pointed to by page directory entries. So how does the processor know the physical address of the page directory?

The answer is the CR3 register, also known as the PDBR (Page Directory Base Register).

This register holds the page directory address of the task currently being performed.

Each task (program) has its own page directory and page tables, and the address of the page directory is recorded in the task's TSS segment.

When the operating system schedules a task, the processor finds the TSS segment of the new task and loads that task's page directory start address into the CR3 register.

When the instructions for a new task start to be executed, the processor is operating on linear addresses as it retrieves instructions and manipulates data.

The page processing unit will start with the page directory table stored in the CR3 register and eventually convert this linear address into a physical address.

Of course, the processor also has a fast lookup cache, the TLB (Translation Lookaside Buffer), to speed up the conversion from linear addresses to physical addresses.

The CR3 register format is as follows:
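As a rough illustration for classic 32-bit paging: the upper 20 bits of CR3 hold the 4KB-aligned physical address of the page directory, and the low 12 bits are flags or reserved. The sketch below works on plain integer values, not on the real register (reading CR3 needs privileged code), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CR3_BASE_MASK 0xFFFFF000u  /* bits 31..12: page directory base */

/* Physical address of the page directory stored in a CR3 value. */
uint32_t cr3_directory_base(uint32_t cr3)
{
    return cr3 & CR3_BASE_MASK;
}

/* CR3 value for a page directory located at pd_phys (all flags cleared). */
uint32_t cr3_for_directory(uint32_t pd_phys)
{
    assert((pd_phys & ~CR3_BASE_MASK) == 0);  /* must be 4KB aligned */
    return pd_phys;
}

/* Physical address of the directory entry the paging unit reads first. */
uint32_t pde_address(uint32_t cr3, uint32_t linear)
{
    return cr3_directory_base(cr3) + (linear >> 22) * 4u;  /* 4 bytes per entry */
}
```

For example, if a page directory happened to live at physical address 0x0010_0000 (a made-up value), the entry for linear address 0x4000_0000 would be read from 0x0010_0400, because index 256 times 4 bytes is 0x400.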

By the way, here are some of the other control registers, taken from the official website:

Among them, the highest bit PG of register CR0 is the switch to open the page processing unit.

That is to say:

When the system is powered on, addressing always uses the [segment:offset] mode.

Once the startup code has the page directory and page tables ready, it can set CR0.PG = 1.

At this point, the page processing unit in the processor comes into play: for any linear address, it goes through the page processing unit to get a physical address.
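In terms of bits, PG is bit 31 of CR0 and PE (protected mode enable) is bit 0. The sketch below only manipulates a copy of the register value; real startup code would read and write CR0 with privileged `mov` instructions, and the helper names here are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR0_PE (1u << 0)   /* protected mode enable */
#define CR0_PG (1u << 31)  /* paging enable */

bool cr0_paging_enabled(uint32_t cr0)
{
    return (cr0 & CR0_PG) != 0;
}

/* Returns the CR0 value with paging switched on. */
uint32_t cr0_enable_paging(uint32_t cr0)
{
    assert((cr0 & CR0_PE) != 0);  /* paging requires protected mode */
    return cr0 | CR0_PG;
}
```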

Loading user program: allocation and populating of page directories and page tables

In the previous article, we introduced the whole process of a user program being loaded by the operating system, briefly as follows:

  1. Read the program header information, parse out the total length of the program, and allocate an adequate contiguous space from the task’s own virtual memory;

  2. Allocate a free physical page to be used as the program’s page directory. The address of the page directory is recorded in the TSS segment created later.

  3. Using a linear address in virtual memory, allocate a physical page (4 KB) and register it in the page directory and page table.

  4. Read eight sectors of data (512 bytes each) from the hard disk and store them in the physical page just allocated;

  5. Check whether the program content has been read: Yes – go to step 6; No – Return to step 3;

  6. Create the necessary kernel data structures for user applications, such as TSS, TCB/PCB, etc.

  7. Create an LDT for the user program and create each segment descriptor in it;

  8. Copy the entries in the high-end address section of the operating system’s page directory to the user program’s page directory table.

In this way, the high-address entries in every user program's page directory point to the same page tables, which is how the “operating system space” is shared.
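Step 8 can be sketched as a plain array copy. The article does not say where the operating system space begins, so the split index below is purely an assumption: it supposes the kernel occupies the top 1 GB (linear addresses from 0xC000_0000, directory entries 768 to 1023). The names are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PD_ENTRIES       1024u
#define KERNEL_FIRST_PDE 768u  /* ASSUMPTION: kernel space starts at 3 GB */

/* Copy the kernel half of the directory into a user program's directory. */
void share_kernel_space(uint32_t *user_pd, const uint32_t *kernel_pd)
{
    assert(user_pd != NULL && kernel_pd != NULL);
    memcpy(&user_pd[KERNEL_FIRST_PDE], &kernel_pd[KERNEL_FIRST_PDE],
           (PD_ENTRIES - KERNEL_FIRST_PDE) * sizeof(uint32_t));
}
```

Only the directory entries are copied. The page tables they point to are not duplicated, so every program sees the same kernel mappings.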

To focus on step 3, assume that the user program file is 20 MB long on the hard disk and the actual physical memory installed on the computer is 1 GB.

In the page directory, each entry covers 4 MB of space, so 20 MB of data needs 5 entries.

In the initial state, all entries in the page directory are empty: their P bit is 0, indicating that the page table does not exist.

The operating system first allocates a 20 MB block of virtual memory starting at the linear address 1 GB (0x4000_0000).

That is, the application file is read into memory starting at address 0x4000_0000 and growing towards higher addresses.

Note: In the “flat” segmentation model, linear addresses are equal to virtual addresses.

0x4000_0000 = 0100_0000_0000_0000___0000_0000_0000_0000

The first 10 bits represent the index of the linear address in the page directory, the middle 10 bits represent the index in the page table, and the last 12 bits represent the offset address in the physical page.

Therefore, the first 10 bits are 0100_0000_00, which means this linear address belongs to entry 256 (index 256) of the page directory.
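The 10/10/12 split is just shifts and masks. Here is a small sketch in C, with function names chosen for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Top 10 bits: index into the page directory. */
uint32_t pd_index(uint32_t linear)    { return linear >> 22; }

/* Middle 10 bits: index into the page table. */
uint32_t pt_index(uint32_t linear)    { return (linear >> 12) & 0x3FFu; }

/* Low 12 bits: offset inside the physical page. */
uint32_t page_offset(uint32_t linear) { return linear & 0xFFFu; }

/* Inverse operation: rebuild a linear address from its three parts. */
uint32_t linear_from_parts(uint32_t di, uint32_t ti, uint32_t off)
{
    assert(di < 1024u && ti < 1024u && off < 4096u);
    return (di << 22) | (ti << 12) | off;
}
```

With linear address 0x4000_0000, `pd_index` yields 256 while `pt_index` and `page_offset` are both 0.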

The operating system finds that this entry is empty and does not point to any page table.

It then finds a free physical page in physical memory and uses it as the page table that entry 256 of the page directory points to.

Note: This physical page is used as a page table, not as a store for user program files.

Suppose a free physical page is found at an address of 128 MB (0x0800_0000) on physical memory to use as the page table.

Clear all 1024 entries in the page table, then register the page table's physical address 0x0800_0000 in entry 256 of the page directory as 0x08000 (shown in yellow).

Why not 0x0800_0000?

Because the address of a physical page must be 4KB aligned (the last 12 bits are all zeros), only the top 20 bits of the page table address need to be recorded in the page directory entry.
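Put differently, the upper 20 bits of an entry carry the page address and the lower 12 bits are left free for attribute bits. A minimal sketch follows, with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_MASK  0xFFFFF000u  /* upper 20 bits: page address */
#define FLAGS_MASK 0x00000FFFu  /* lower 12 bits: attribute bits */

/* The 20-bit value recorded for a 4KB-aligned physical address. */
uint32_t frame_number(uint32_t phys)
{
    assert((phys & FLAGS_MASK) == 0);  /* must be 4KB aligned */
    return phys >> 12;
}

/* Build a directory or table entry from an aligned address and attributes. */
uint32_t make_entry(uint32_t phys, uint32_t flags)
{
    assert((phys & FLAGS_MASK) == 0);
    return phys | (flags & FLAGS_MASK);
}

/* Recover the full physical address by restoring the 12 zero bits. */
uint32_t entry_address(uint32_t entry)
{
    return entry & ADDR_MASK;
}
```

For the page table at 0x0800_0000, `frame_number` gives 0x08000, which is the value recorded in the page directory entry.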

Now that you have the page table, it’s time to allocate a physical page to store the contents of the program.

Suppose another free physical page is found at address 0x0800_1000, right after the physical page that was used as the page table.

At this point, the address of the physical page where the program content is stored needs to be recorded in an entry in the page table.

So which entry of the page table should it be registered in?

The middle 10 bits of the linear address are needed to determine:

0x4000_0000 = 0100_0000_0000_0000___0000_0000_0000_0000

The middle 10 bits are all 0, so the index value is 0. That is, entry 0 of the page table holds the address of this physical page, as shown below:

The address of a physical page must be 4KB aligned (the last 12 bits are all zeros), so only the top 20 bits of the physical page address need to be recorded.

With the physical pages allocated to store the contents of the program file, it’s time to read the contents of the program file from the hard disk.

The size of a physical page is 4 KB and the size of a sector on the hard disk is 512 B, so a physical page can be filled by reading data from eight sectors in a row on the hard disk.
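The same arithmetic in C, as a sketch only. The names are invented, and `start_sector` stands for wherever the program file happens to begin on disk, which the article does not specify.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE        4096u
#define SECTOR_SIZE      512u
#define SECTORS_PER_PAGE (PAGE_SIZE / SECTOR_SIZE)  /* 8 */

/* Number of physical pages needed to hold file_bytes, rounded up. */
uint32_t pages_for_file(uint32_t file_bytes)
{
    return (uint32_t)(((uint64_t)file_bytes + PAGE_SIZE - 1) / PAGE_SIZE);
}

/* First disk sector to read for the page_index-th page of the file. */
uint32_t first_sector_of_page(uint32_t start_sector, uint32_t page_index)
{
    assert(PAGE_SIZE % SECTOR_SIZE == 0);  /* a page is a whole number of sectors */
    return start_sector + page_index * SECTORS_PER_PAGE;
}
```

For the 20 MB file in this example, `pages_for_file` gives 5120 pages, each filled by 8 consecutive sectors.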

We’ve just assumed that the user program file is 20 MB long on hard disk.

After one physical page has been filled, a simple calculation shows that the user program has not been fully read yet, so the process above is repeated:

  1. Add 4KB to the linear address: 0x4000_1000 = 0100_0000_0000_0000___0001_0000_0000_0000;

  2. The first 10 bits still select entry 256 of the page directory, and the page table this entry points to already exists, so no physical page needs to be allocated as a page table.

  3. Allocate a free physical page for program content, assuming one is found at 0x0100_4000, and register this address in the page table;

At this point, the middle 10 bits of the linear address have an index value of 1, so the address is registered in entry 1 of the page table.

  4. Read eight sectors of data from the hard disk and write them to the physical page;

Each entry in the page directory covers 4 MB (that is, the total space of the physical pages pointed to by the 1024 entries of one page table).

So once 4 MB of program content has been read, all entries in that page table are filled.

At this point, the linear address space occupied by the program content is 0x4000_0000 ~ 0x403F_FFFF.

Reading then continues at the next linear address, 0x4040_0000:

  1. Determine the page directory entry:

0x4040_0000 = 0100_0000_0100_0000___0000_0000_0000_0000, so the index value of the first 10 bits is 257;

  2. Entry 257 is found to be empty, so allocate a free physical page to be used as the page table.

  3. Allocate a physical page to store the contents of the program file and record the address of the physical page in the page table;

The middle 10 bits of the linear address 0x4040_0000 have an index value of 0, so the address is registered in entry 0 (the first entry) of the page table.

The remaining steps repeat in exactly the same way, so I won't go through them again.
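The whole loop can be simulated in ordinary user-space C. This is a hypothetical model, not real loader code: page tables live on the host heap, "physical" pages are handed out by a simple bump allocator (`next_free`), disk reads are left out, and all names are my own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ENTRIES   1024u
#define PAGE_SIZE 4096u
#define FLAG_P    0x1u  /* Present */
#define FLAG_RW   0x2u  /* Read/Write */

typedef struct {
    uint32_t  pde[ENTRIES];     /* directory entries as the CPU would see them */
    uint32_t *tables[ENTRIES];  /* host pointers to the simulated page tables */
    uint32_t  next_free;        /* next free "physical" page (bump allocator) */
} address_space;

static uint32_t alloc_phys_page(address_space *as)
{
    uint32_t page = as->next_free;
    as->next_free += PAGE_SIZE;
    return page;
}

/* Map one page at linear, creating its page table on demand. */
int map_page(address_space *as, uint32_t linear)
{
    assert((linear & 0xFFFu) == 0);  /* page-aligned addresses only */
    uint32_t di = linear >> 22;
    uint32_t ti = (linear >> 12) & 0x3FFu;

    if (!(as->pde[di] & FLAG_P)) {   /* directory entry is empty */
        uint32_t *table = calloc(ENTRIES, sizeof *table);  /* all entries cleared */
        if (table == NULL)
            return -1;
        as->tables[di] = table;
        as->pde[di] = alloc_phys_page(as) | FLAG_P | FLAG_RW;
    }
    as->tables[di][ti] = alloc_phys_page(as) | FLAG_P | FLAG_RW;
    return 0;
}

/* Map bytes of program content starting at linear address start. */
int map_range(address_space *as, uint32_t start, uint32_t bytes)
{
    for (uint32_t off = 0; off < bytes; off += PAGE_SIZE)
        if (map_page(as, start + off) != 0)
            return -1;
    return 0;
}

/* Release the simulated page tables. */
void free_tables(address_space *as)
{
    for (uint32_t i = 0; i < ENTRIES; i++) {
        free(as->tables[i]);
        as->tables[i] = NULL;
        as->pde[i] = 0;
    }
}
```

Starting from a zeroed `address_space` with `next_free` set to 0x0800_0000 and mapping 20 MB at 0x4000_0000, the model fills directory entries 256 to 260 and leaves every other entry empty. Because the bump allocator hands out pages in order, the first page table lands at 0x0800_0000 and the first content page at 0x0800_1000, as in the walkthrough; later addresses differ from the article's, which are picked arbitrarily.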

The final layout of the page directory and page tables should look something like this:

Linear address to physical address search, calculation examples

If you understood the previous section, you can skip this one: the lookup is simply the reverse process, and it is easier.

Continuing with our assumptions:

  1. The user program is 20 MB long and occupies the linear address range 0x4000_0000 ~ 0x413F_FFFF in virtual memory;

  2. The code segment is 8 MB long and starts at 0x40C0_0000 in virtual memory;

As shown below:

Now, the contents of the user program are all read into memory, and the page directories and page tables are all arranged.

In the page directory table, there are five entries that represent exactly this 20MB address space.

The page tables that map the 8 MB code segment are registered in entries 259 and 260 of the page directory (the green entries on the right of the figure above).

Goal: While executing code, the processor encounters a linear address, 0x4100_8800, which the page processing unit needs to translate to a physical address.

0x4100_8800 = 0100_0001_0000_0000___1000_1000_0000_0000

First, based on the first 10 bits of the linear address (0100_0001_00), we get an index value of 260 in the page directory.

The page table address recorded in this entry is 0x08040. Because the bottom 12 bits of a page table address are always 0, the page table address is 0x0804_0000.

The starting address of the page directory itself is obtained from the CR3 register.

Then, based on the middle 10 bits of the linear address (00_0000___1000), we get an index value of 8 in the page table.

The physical page address recorded in this entry is 0x02004. Appending 12 zero bits gives the physical page starting address 0x0200_4000.

Finally, based on the last 12 bits of the linear address (1000_0000_0000), its offset on the physical page is 2048.

That is, the address 2048 bytes past the start of the physical page (0x0200_4000) is the physical address (0x0200_4800) corresponding to the linear address (0x4100_8800).
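The lookup can be replayed in C against a tiny simulated physical memory. This is a sketch of the two-level walk, not how any real kernel stores its tables: `phys_mem` is a made-up sparse list of 32-bit words, and the page directory address used in the note below is an assumption, since the article does not give one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One 32-bit word of simulated physical memory. */
typedef struct { uint32_t addr; uint32_t value; } phys_word;
typedef struct { const phys_word *words; size_t count; } phys_mem;

/* Read a word; unpopulated memory reads as 0 (an entry with P = 0). */
static uint32_t read_phys(const phys_mem *m, uint32_t addr)
{
    for (size_t i = 0; i < m->count; i++)
        if (m->words[i].addr == addr)
            return m->words[i].value;
    return 0;
}

/* Two-level walk: CR3 -> page directory -> page table -> physical address.
   Returns false if either entry has P = 0. */
bool translate(const phys_mem *m, uint32_t cr3, uint32_t linear, uint32_t *phys)
{
    assert(m != NULL && phys != NULL);

    uint32_t pd_base = cr3 & 0xFFFFF000u;
    uint32_t pde = read_phys(m, pd_base + (linear >> 22) * 4u);
    if (!(pde & 0x1u))
        return false;                      /* page table not present */

    uint32_t pt_base = pde & 0xFFFFF000u;  /* restore the low 12 zero bits */
    uint32_t pte = read_phys(m, pt_base + ((linear >> 12) & 0x3FFu) * 4u);
    if (!(pte & 0x1u))
        return false;                      /* physical page not present */

    *phys = (pte & 0xFFFFF000u) | (linear & 0xFFFu);
    return true;
}
```

Suppose the page directory sits at 0x0010_0000 (hypothetical). Entry 260 is then the word at 0x0010_0410 and holds 0x0804_0003 (address 0x08040 plus the P and RW bits). Entry 8 of that page table is the word at 0x0804_0020 and holds 0x0200_4003. With those two words populated, `translate` turns 0x4100_8800 into 0x0200_4800.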

And you’re done!





That’s pretty much the end of the discussion about converting virtual addresses to physical addresses, page directories, and page table searches.

I hope you got your fill.

Next week, I’ll write another article about the “meta-manipulation” of the page directory and the page tables themselves, and then the series will be almost over.

