1 Virtual Memory

1.1 Concept

When a process runs, only part of it is loaded into memory; the rest temporarily remains on disk. When an instruction to be executed or data to be accessed is not in memory, the operating system automatically transfers it from disk to memory. The virtual address space is the virtual memory allocated to a process; a virtual address is a location in that space where instructions or data reside.

1.2 Hierarchical structure of memory

1.3 Virtual storage and Storage System

  1. Memory and disk are combined organically to present a large-capacity “memory”, that is, virtual memory
  2. Virtual memory is an abstraction of memory, built on the storage hierarchy, with the operating system coordinating the use of each level
  3. Virtual memory provides a much larger address space than physical memory; its size is limited by the addressing mechanism of the computer system and by the available disk space

1.4 Address Protection

There are three main points:

  1. Ensure that each process has a separate address space
  2. Ensure that the process accesses only valid address ranges (to prevent out-of-bounds access)
  3. Ensure that the process’s operations are legal (to prevent illegal operations, such as writing to read-only memory)

1.5 Virtual Paging

Virtual paging storage management system: virtual storage technology + paging storage management. The basic idea of virtual paging is:

Before the process starts running, not all of its pages are loaded, only one page or even none. Pages are then loaded dynamically as the process needs them. When memory is full and a new page must be brought in, a page already in memory is replaced according to some algorithm, making room for the new page.

There are two specific ways:

  1. Demand Paging: the OS loads a page into memory only when it is actually referenced
  2. Prepaging: pages that are likely to be needed are loaded into memory in advance

Virtual paging is a resource-conversion technique: it trades CPU time and disk space for expensive memory.

2 Design of page tables and page table entries in virtual paging

2.1 Concept

A page table is composed of page table entries, whose layout is defined by the hardware. The following table shows the composition of a page table entry.

Table 2-1 Composition of a page table entry

| Attribute | Description |
| --- | --- |
| Page frame number | The memory block number, also known as the physical page number |
| Valid bit | Also known as the present/interrupt bit; indicates whether the page is in memory or on disk. A value of 0 usually means the content has not been read into memory |
| Reference bit | Indicates whether the page has been accessed |
| Modified bit | Indicates whether the page has been modified while in memory |
| Protection bits | Page read/write permissions |

2.2 About page tables

2.2.1 Page table size

What is the page table size for a 32-bit virtual address space? If the page size is 4K and each page table entry is 4 bytes, the address space of a process has 2^32/4K = 2^32/2^12 = 2^20 pages. Its page table therefore occupies 2^20 × 4 B = 4 MB, that is, 4M/4K = 1024 pages. What about a 64-bit virtual address space? With a 4K page size and 8-byte entries, the page table would reach 2^64/2^12 × 8 B = 2^55 bytes, about 32000 TB. Imagine how big that is. It is therefore necessary to store the pages of the page table discontinuously in memory. The basic idea is to create an address index table for the page-table pages, namely the Page Directory.
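The arithmetic above can be checked mechanically. The sketch below (plain Python; the function name is mine) reproduces both figures:

```python
# Sketch: page-table size arithmetic for a flat (single-level) page table.
def page_table_stats(addr_bits, page_size, entry_size):
    """Return (number of pages in the address space, page-table size in bytes)."""
    num_pages = 2**addr_bits // page_size
    return num_pages, num_pages * entry_size

# 32-bit: 4 KiB pages, 4-byte entries -> 2^20 entries, 4 MiB table
pages32, size32 = page_table_stats(32, 4096, 4)
print(pages32)              # 1048576 (= 2^20) pages
print(size32 // 4096)       # 1024: the table itself spans 1024 pages

# 64-bit: 4 KiB pages, 8-byte entries -> 2^55 bytes
pages64, size64 = page_table_stats(64, 4096, 8)
print(size64 // 2**40)      # 32768 TiB, the "32000 TB" quoted above
```

This is exactly why a flat table is hopeless for 64-bit spaces and a page directory (multi-level table) is needed.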

2.2.2 Multi-level page table structure and address mapping

2.2.2.1 Secondary page table structure and address mapping

2.2.2.2 Multi-level page table structure (Core i7 as an example)

2.2.3 Page Directory Entries and Page Table Entries (i386 as an example)

2.3 Inverted page table

2.3.1 Basic ideas

Based on the above analysis, address translation starts from the virtual address space: virtual address, search the page table, obtain the page frame number, form the physical address. However, one page table per process consumes a lot of resources, so the inverted page table was introduced. The basic idea is as follows:

Starting from the physical address space, the system builds a single page table. Each entry records a process i and the mapping between one of its virtual page numbers and the corresponding page frame number.

2.3.2 Design

  1. Hash the page number portion of the virtual address (to solve the slow lookup caused by having only one table);
  2. The hash value points into the inverted page table;
  3. The size of the inverted page table is proportional to physical memory, regardless of the number of processes.
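The three points above can be sketched as a toy hashed inverted page table. This is illustrative only (the structure names and tiny frame count are mine, not a real OS layout):

```python
# Toy hashed inverted page table: one entry per physical frame, plus hash
# buckets mapping hash(pid, vpn) to candidate frames (chains resolve collisions).
NUM_FRAMES = 8

inverted = [None] * NUM_FRAMES   # frame -> (pid, virtual page number) or None
buckets = {}                     # hash value -> list of frame numbers

def insert(pid, vpn, frame):
    inverted[frame] = (pid, vpn)
    buckets.setdefault(hash((pid, vpn)) % NUM_FRAMES, []).append(frame)

def translate(pid, vpn):
    """Return the frame holding (pid, vpn), or None, meaning a page fault."""
    for frame in buckets.get(hash((pid, vpn)) % NUM_FRAMES, []):
        if inverted[frame] == (pid, vpn):   # confirm: hash may collide
            return frame
    return None

insert(pid=1, vpn=0x42, frame=3)
print(translate(1, 0x42))   # 3
print(translate(2, 0x42))   # None: same vpn but a different process
```

Note how the table size depends only on NUM_FRAMES, matching point 3.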

3 Address mapping process and TLB introduction

3.1 Address Mapping Process

3.2 Introduction of fast table (TLB)

3.2.1 Problems of address mapping

In a two-level or multi-level page table, address mapping requires at least two memory accesses. Since CPU instruction processing is far faster than memory access, the CPU's speed cannot be fully exploited. To speed up address mapping and improve system performance, the fast table (TLB) is introduced, based on the principle of locality of reference in programs.

3.2.2 Concept of fast table

TLB (Translation Look-aside Buffer) is introduced inside the CPU to match the CPU's processing rate with memory access speed. The TLB is a kind of associative memory: besides the ordinary addressing mechanism, it has comparison logic that matches all entries in parallel against a tag within one storage cycle. Hence it is also called Associative Memory; it searches by content in parallel and holds a subset of the page table entries of the running process.

3.2.3 Address Mapping Process after Fast Table Is Introduced



As shown in the figure above, mapping a virtual address to a physical address is mainly about the page number, because the in-page offset of the physical address is the same as that of the virtual address. The page number is first looked up in the TLB. If it hits, the page frame number is obtained directly. Otherwise the page frame number is queried in the page table; here the valid bit matters: if the page is not in memory, the operating system brings the corresponding page from disk into memory.
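The lookup order can be sketched in a few lines. This is a simulation under assumed names (tlb and page_table as plain dicts, a caller-supplied load_from_disk), not a hardware description:

```python
# Simulated translation: TLB first, then page table, then a "disk load" on a
# page fault. The offset passes through unchanged.
PAGE_SIZE = 4096

tlb = {}                                        # vpn -> frame (small, fast)
page_table = {0: (True, 7), 1: (False, None)}   # vpn -> (valid bit, frame)

def translate(vaddr, load_from_disk):
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn in tlb:                    # TLB hit: no page-table memory access
        return tlb[vpn] * PAGE_SIZE + offset
    valid, frame = page_table.get(vpn, (False, None))
    if not valid:                     # page fault: OS loads the page
        frame = load_from_disk(vpn)
        page_table[vpn] = (True, frame)
    tlb[vpn] = frame                  # refill the TLB for next time
    return frame * PAGE_SIZE + offset

print(translate(0x123, lambda vpn: 9))           # 28963 = 7*4096 + 0x123
print(translate(PAGE_SIZE + 5, lambda vpn: 9))   # 36869: faulted into frame 9
```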

4 Page Fault Exception

4.1 Concept and causes

Also known as a page error or page failure, it is a hardware exception raised during address mapping. Causes:

  1. The accessed virtual page has not been loaded into physical memory (a missing page)
  2. Page access violation (read/write, user/kernel)
  3. Invalid access address

4.2 Page Fault Handling

During address mapping, the hardware checks the page table and finds that the page to be accessed is not in the memory.

The operating system then executes the page-fault handler: it gets the disk address and starts the disk to load the page into memory.

  1. If there is a free page frame in memory, allocate it, load the page, and update the valid bit and page frame number of the corresponding page table entry;
  2. If there is no free page frame, replace a page in memory. If the content of the victim page frame has been modified, write it back to disk first.

5 Software-related policies

5.1 Resident set

The resident set is the number of page frames allocated to each process. Allocation may follow a fixed or a variable policy.

Table 5-1 Resident set allocation policies

| Allocation policy | Characteristics |
| --- | --- |
| Fixed allocation | Determined when the process is created |
| Variable allocation | The page-fault rate is used to evaluate the process's locality behavior and decide whether to increase or decrease its page frames; adjusting the count incurs system overhead |

5.2 Replacement Problem

When there are no free page frames in memory and some pages must be swapped out, this is called replacement. Replacing a page frame involves two core issues: the replacement scope and the replacement strategy.

5.2.1 Replacement range

Table 5-2 Replacement scope

| Replacement scope | Characteristics |
| --- | --- |
| Local replacement | Victims are selected only from the resident set of the faulting process |
| Global replacement | All unlocked page frames in memory are candidates for replacement |

  1. When a new process is loaded into memory, it is allocated a certain number of page frames, which are then filled;
  2. When a page fault occurs, a page is selected from the resident set of the faulting process for replacement;
  3. The process's page frame allocation is continuously re-evaluated, and frames are added or removed to improve overall performance.

5.2.2 Replacement Strategy

Once the replacement scope is determined, we must further decide which page frame to replace; this is the replacement strategy. The core goal of any strategy is to evict the page least likely to be accessed in the near future, so most strategies predict future behavior from past behavior. But note: the more elaborate and complex the strategy, the greater the hardware and software overhead of implementing it. One constraint on replacement: locked page frames cannot be replaced.

5.2.3 Page Frame Lock

Because of the overhead of virtual memory, which may involve the disk, program running time is unpredictable. To make running time predictable, a lock bit is added to each page frame so the operating system will not swap out pages the process is using. For example, the core code of the operating system, key data structures, and I/O buffers, especially memory pages undergoing I/O, need the lock bit set.

5.2.4 Clearing Policies

1) Paging daemon

The paging daemon reclaims page frames from processes' resident sets. In fact, the optimal state of a virtual paging system is to have plenty of free page frames available when a page fault occurs. It is therefore better to keep a certain reserve of free page frames than to use all of memory and then have to search for and replace a frame on demand. The clearing strategy is implemented as follows: 1) a paging daemon sleeps most of the time and wakes up periodically to check the memory state; 2) if there are too few free page frames, the daemon selects pages to evict using a predetermined page replacement algorithm; 3) pages that were modified after being loaded into memory are written back to disk. The paging daemon thus ensures that all free page frames are “clean”.

2) Page buffering technique

When a process needs a page frame that has been evicted, and that frame has not yet been overwritten by new content, the frame can be removed from the free set to restore the page. This buffering mechanism is implemented as follows: 1) evicted pages are not discarded but placed on one of two lists: the free-page list if unmodified, otherwise the modified-page list; 2) pages on the modified list can later be written back to disk in batches; 3) an evicted page thus remains in memory and, once accessed again, can be quickly returned to the process's resident set.

6 Page Replacement Algorithms

Also known as page elimination algorithm.

6.1 Optimal page replacement algorithm OPT

6.1.1 Design idea

Replace the page that will never be needed again, or that will not be used for the longest time in the future.

6.1.2 Significance

However, this algorithm cannot be implemented in practice, because a running process does not know which pages it will use in the future. The algorithm is best used as a yardstick for measuring the performance of other algorithms.

6.2 FIFO algorithm

6.2.1 Design idea

Replace the page that has stayed in memory the longest. Implemented with a linked list of pages: the page at the head of the list is replaced each time.

6.2.2 Belady phenomenon

The FIFO page replacement algorithm exhibits the Belady anomaly: when the number of page frames assigned to a process increases, the number of page faults may also increase.
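Both FIFO itself and the anomaly are easy to reproduce. With the classic reference string 1 2 3 4 1 2 5 1 2 3 4 5, FIFO incurs 9 faults with 3 frames but 10 with 4 (a minimal sketch; the function name is mine):

```python
# FIFO page replacement: resident pages sit in a queue; evict the head
# (the page resident longest) when a fault occurs and memory is full.
from collections import deque

def fifo_faults(refs, num_frames):
    frames, faults = deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == num_frames:
                frames.popleft()        # oldest resident page leaves first
            frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9 faults
print(fifo_faults(refs, 4))   # 10 faults: more frames, more faults (Belady)
```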

6.3 Second Chance Algorithm (SCR)

6.3.1 Design idea

Second Chance improves the FIFO algorithm: select the page frame at the head of the linked list as in first-in first-out, then check its reference bit R. If R is 0 (the page has not been accessed recently), replace the page; if R is 1, give the page a second chance by moving it to the tail of the list and clearing its reference bit.

6.3.2 Implementation Mechanism — Clock algorithm

The SCR idea is based on a linked list: whenever the reference bit R is 1, the page frame must be removed from the head and appended to the tail, and this unlinking and relinking has a cost. The clock algorithm therefore uses a circular list instead, selecting the page frame to replace simply by advancing a pointer.
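The pointer-advancing behavior can be sketched as follows (a simulation; the list-of-[page, R] representation and names are mine):

```python
# Clock algorithm: frames form a circular list; the hand advances, clearing
# R bits (the "second chance"), until it finds a frame with R == 0 to evict.
def clock_replace(frames, hand):
    """frames: list of [page, R] entries. Returns (victim index, new hand)."""
    while True:
        page, r = frames[hand]
        if r == 0:
            return hand, (hand + 1) % len(frames)   # replace this frame
        frames[hand][1] = 0                          # second chance: clear R
        hand = (hand + 1) % len(frames)

frames = [["A", 1], ["B", 0], ["C", 1]]
victim, hand = clock_replace(frames, 0)
print(frames[victim][0])   # B is evicted; A had its R bit cleared in passing
```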

6.4 Not Recently Used Algorithm (NRU)

6.4.1 Design idea

Not Recently Used replaces a page that has not been used in the recent period. Each page table entry carries two bits: the reference bit R and the modified bit M. When a process starts, R and M are set to 0, and R is cleared (reset) periodically.

Table 6-1 Possible values of R and M in the NRU algorithm

| No. | R | M | Description |
| --- | --- | --- | --- |
| 1 | 0 | 0 | Not accessed, not modified |
| 2 | 0 | 1 | Not accessed, modified |
| 3 | 1 | 0 | Accessed, not modified |
| 4 | 1 | 1 | Accessed, modified |

As the table shows, the NRU algorithm randomly selects a page for replacement from the lowest-numbered non-empty class: if class 1 is empty, it looks in class 2, and so on.

6.4.2 Implementation Mechanism – Clock algorithm

  1. Starting from the current pointer position, scan the page frame buffer and select the first frame encountered with (R=0, M=0) for replacement;
  2. If step 1 fails, rescan and select the first frame with (R=0, M=1), skipping frames with R=1 and clearing the reference bit R of each skipped frame during the scan;
  3. If step 2 also fails, the pointer has returned to its starting position and every frame in the set now has R=0. Repeat step 1, and step 2 if necessary; a replaceable frame will certainly be found.
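The scan above can be sketched as follows. This is a simplified simulation (it scans from the start of the list rather than from a saved pointer position; names are mine):

```python
# NRU victim selection: prefer (R=0, M=0); failing that, take the first
# (R=0, M=1) while clearing R bits; repeat until a victim turns up.
def nru_select(frames):
    """frames: list of [page, R, M]. Returns the index of the victim frame."""
    while True:
        for i, (_, r, m) in enumerate(frames):   # pass 1: R=0 and M=0
            if r == 0 and m == 0:
                return i
        for i, frame in enumerate(frames):       # pass 2: R=0 (M=1)
            if frame[1] == 0:
                return i
            frame[1] = 0                         # clear R on skipped frames
        # all R bits are now 0; the while loop retries pass 1

frames = [["A", 1, 1], ["B", 1, 0], ["C", 0, 1]]
print(nru_select(frames))   # 2: C is class (R=0, M=1); A and B get R cleared
```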

6.5 Least Recently Used Algorithm (LRU)

6.5.1 Design idea

Least Recently Used replaces the page whose last access is furthest in the past, that is, the page that has gone unused for the longest time.

6.5.2 Implementation Mechanism

The basic implementation uses timestamps or maintains a stack of accessed pages. The advantage of this algorithm is that its performance is closest to OPT; the disadvantage is its high overhead.
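The timestamp variant can be sketched directly (a simulation; the function name is mine):

```python
# LRU via timestamps: remember when each resident page was last touched;
# on a fault with full memory, evict the page with the oldest timestamp.
def lru_faults(refs, num_frames):
    last_used = {}                      # page -> time of last access
    faults = 0
    for t, page in enumerate(refs):
        if page not in last_used:
            faults += 1
            if len(last_used) == num_frames:
                victim = min(last_used, key=last_used.get)
                del last_used[victim]   # least recently used goes
        last_used[page] = t
    return faults

print(lru_faults([1, 2, 3, 1, 4, 1], 3))   # 4 faults: 4 evicts page 2, not 1
```

Real hardware cannot afford a per-access timestamp update in software, which is why the approximations that follow (NFU, aging) exist.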

6.6 Not Frequently Used Algorithm (NFU)

6.6.1 Design idea

Not Frequently Used selects the least frequently accessed page for replacement.

6.6.2 Implementation Mechanism

The NFU algorithm is essentially a software approximation of LRU. Its basic mechanism: 1) maintain a software counter per page, initialized to 0; 2) at every clock interrupt, add each page's R bit to its counter; 3) on a page fault, select the page with the smallest counter value for replacement.
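The counter mechanism can be sketched in a few lines (a simulation; the clock interrupt is modeled as a function call receiving the R bits, and names are mine):

```python
# NFU counters: each simulated clock tick adds every page's R bit to its
# counter; on a fault, the page with the smallest counter is the victim.
counters = {"A": 0, "B": 0, "C": 0}

def clock_tick(r_bits):
    for page, r in r_bits.items():
        counters[page] += r             # add R; hardware then clears R

clock_tick({"A": 1, "B": 1, "C": 0})
clock_tick({"A": 1, "B": 0, "C": 0})
victim = min(counters, key=counters.get)
print(victim)   # C: accessed least often, so it is replaced first
```

The weakness visible here is that counters only grow, so a page heavily used long ago never looks cold again; the aging algorithm below fixes that.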

6.7 Aging Algorithm

6.7.1 Design idea

The aging algorithm is an improvement on NFU and a simulation of LRU. The basic idea: before adding R, shift the counter one bit to the right, then add the R bit at the counter's leftmost end.

6.7.2 Implementation Mechanism

As shown in the figure below, at each clock interrupt the counter is updated according to the reference bit R: if R is 1, shift the counter right one bit and then set the most significant bit to 1; if R is 0, just shift right one bit. When a page fault occurs, the page frame with the smallest counter value is selected for replacement.
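With 8-bit counters (a common textbook width; the counter size and names here are my choice), the update looks like this:

```python
# Aging: shift the counter right each clock tick, then place the R bit in
# the most significant position, so recent accesses weigh the most.
BITS = 8

def age(counter, r):
    counter >>= 1
    if r:
        counter |= 1 << (BITS - 1)      # R enters at the leftmost bit
    return counter

c = 0
for r in [1, 0, 1]:                     # R bits seen at three clock ticks
    c = age(c, r)
print(format(c, "08b"))   # 10100000: the most recent access dominates
```

Because each tick halves the weight of older history, a once-hot page cools down quickly, unlike plain NFU.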

6.8 Working set Algorithm

6.8.1 Factors influencing the number of page faults

  1. The page replacement algorithm
  2. The page size
  3. The size of the program
  4. The number of page frames assigned to the process

Thrashing: in virtual memory, pages are scheduled between memory and disk so frequently that more time is spent paging than actually running the process, causing system efficiency to drop drastically. This is thrashing.

6.8.2 Page size

Factors to consider: 1) internal fragmentation; 2) page table length; 3) physical characteristics of the backing store. Large pages move between memory and disk less frequently but easily produce internal fragmentation; small pages are just the opposite. In general, the optimal page size is p = sqrt(2 × s × e), where s is the average process size and e is the length of a page table entry. In practice, however, common page sizes such as 4K and 4M are used in mainstream operating systems.
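The formula p = sqrt(2se) can be checked with the classic textbook numbers (average process size s = 1 MiB, entry size e = 8 bytes; these example values are an assumption, not from this document):

```python
# Optimal page size p = sqrt(2 * s * e), balancing internal fragmentation
# (~p/2 per process) against page-table size (~s*e/p per process).
import math

def optimal_page_size(s, e):
    return math.sqrt(2 * s * e)

print(optimal_page_size(1 << 20, 8))   # 4096.0: the familiar 4 KiB page
```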

6.8.3 Working Set algorithm

6.8.3.1 Basic Ideas

According to the principle of locality, a process normally concentrates its accesses on certain pages during any period of time; these are called its active pages. If the number of physical page frames allocated to a process is too small, its active pages cannot all be kept in memory, and the process will fault frequently while running. If the process is given about as many page frames as it has active pages, the number of page faults can be greatly reduced.

6.8.3.2 Basic Concepts

Working set: the set of pages a process is currently using. The working set W(t, △) is the set of pages accessed by the process in the past △ virtual time units. Its content depends on three factors: 1) the process's page access sequence; 2) the time t; 3) the working-set window length △ (the longer the window, the larger the working set).

6.8.3.3 Implementation mechanism of working set algorithm

In the working set algorithm, the basic idea is to find a page that is not in the working set and replace it, so each page table entry has a field recording the time the page was last accessed. The implementation scans all page table entries and for each one: 1) if the page's R bit is 1, set its last-access time to the current time and clear R; 2) if R is 0, check whether the page was last accessed before (current time − △): if so, replace the page; if not, remember the minimum last-access time among the pages scanned so far, then scan the next entry and repeat steps 1 and 2.
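A minimal sketch of that scan, assuming entries are (R, last-use-time) pairs and a fixed window DELTA (both representations are mine):

```python
# Working-set scan: pages referenced this tick are refreshed; an unreferenced
# page older than the window is evicted; otherwise fall back to the oldest.
DELTA = 100

def ws_scan(entries, now):
    """entries: dict page -> [R, last_use]. Returns a victim page or None."""
    oldest = None
    for page, entry in entries.items():
        if entry[0] == 1:                 # referenced: inside the working set
            entry[1] = now                # record time of use
            entry[0] = 0                  # clear R for the next period
        elif now - entry[1] > DELTA:      # unreferenced and outside the window
            return page
        elif oldest is None or entry[1] < entries[oldest][1]:
            oldest = page                 # fallback: oldest unreferenced page
    return oldest

entries = {"A": [1, 50], "B": [0, 10], "C": [0, 160]}
print(ws_scan(entries, now=200))   # B: unreferenced and older than the window
```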

6.9 Summary of page replacement algorithm

See the table below.

Table 6-2 Summary of page replacement algorithms

| Algorithm | Evaluation |
| --- | --- |
| OPT | Not implementable, but serves as a benchmark for other algorithms |
| NRU | A rough approximation of LRU |
| FIFO | May evict important pages; exhibits the Belady anomaly |
| Second Chance | A big improvement over FIFO |
| Clock | A practical implementation of Second Chance |
| LRU | Excellent, but difficult to implement exactly |
| NFU | A fairly rough approximation of LRU |
| Aging | An efficient algorithm that closely approximates LRU |
| Working set | Expensive to implement |

7 Summary

Having covered the key content above, two further concepts deserve explanation.

7.1 Memory-mapped Files

A process maps a file into part of its virtual address space through a system call (mmap); accessing the file then looks like accessing a large array in memory rather than issuing explicit reads and writes. In most implementations, the file's pages are not all read at once; each page is read in on first access.
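Python exposes this directly through its mmap module; the sketch below (file path and contents are made up for the demo) shows the "file as array" view, including a write that lands back in the file:

```python
# Memory-mapped file demo: the file's bytes appear as an in-memory sequence;
# pages are faulted in on access rather than read up front by the program.
import mmap, os, tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(path, "wb") as f:
    f.write(b"hello, mapped world")

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:   # map the whole file
        print(bytes(m[0:5]))              # read it like an array slice
        m[0:5] = b"HELLO"                 # write through the mapping

with open(path, "rb") as f:
    print(f.read()[:5])                   # the change reached the file
```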

7.2 Copy-on-write technology

For example, processes T1 and T2 share three physical pages, P1, P2, and P3. Because the pages are shared, each is marked copy-on-write (a read-only flag combined with a copy-on-write flag). When T2 tries to modify data on page P2, the write conflicts with the read-only flag, a page fault occurs, and control passes to the operating system. The operating system sees that the page is a copy-on-write page, creates a copy P2-copy in physical memory, and directs the write to that copy. The copied page P2-copy is private to the writing process and invisible to the other processes that still share the original page.