One, foreword

The previous section covered process management on Linux:

(1) Linux process management for Linux performance tuning

This section covers the memory architecture of Linux.

Two, overview

While a process runs, the Linux kernel allocates memory to it as needed, and the process uses that memory as its workspace. It is like having your own desk on which to spread out documents and memos while you work. The difference is that the kernel allocates space far more dynamically. A system often runs thousands of processes while memory is limited, so Linux has to manage memory efficiently. This section covers the Linux memory architecture, the address layout, and how Linux manages memory space efficiently.

In short, if memory is limited, say 8 GB, and you open a large number of applications, the computer is bound to stutter, slow down, or even become unresponsive.

Three, memory system

  • 3.1 Physical and Virtual Memory

In practice we often have to choose between a 32-bit and a 64-bit operating system (although 32-bit is now largely obsolete). For users, the biggest difference is whether more than 4 GB of virtual memory can be addressed. From a performance standpoint, it is worth understanding how Linux maps physical memory into virtual memory on 32-bit and on 64-bit systems. As shown in the figure below, the two mappings differ clearly. We will not explore the details of this mapping here; a working knowledge of the Linux memory architecture is all that performance tuning requires.

On 32-bit machines, the Linux kernel can directly map only the first gigabyte of physical memory (896 MB once the reserved range is taken into account). This directly mapped memory is called ZONE_NORMAL. Memory above it belongs to ZONE_HIGHMEM and must be mapped into the lower 1 GB before the kernel can use it. The mapping is completely transparent to applications, but allocating memory pages in ZONE_HIGHMEM incurs a small performance penalty.

On 64-bit systems, by contrast, ZONE_NORMAL extends all the way up to 64 GB, or to 128 GB on IA-64. The overhead of mapping memory pages from ZONE_HIGHMEM into ZONE_NORMAL therefore does not exist on 64-bit systems.

The following figure shows the virtual addressing layout for 32-bit and 64-bit architecture Linux systems:

On a 32-bit architecture, a single process can address at most 4 GB, a limit imposed by 32-bit virtual addressing. In a standard 32-bit environment, the virtual address space is split into 3 GB of user space and 1 GB of kernel space, although some 4 GB/4 GB address layouts also exist.

On a 64-bit architecture no such limit applies in practice, so each process can use a very large address space.
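The 32-bit figures above come down to simple arithmetic. The sketch below works them out, assuming the common x86 3 GB/1 GB split with kernel space starting at 0xC0000000 and a 128 MB reserved range at the top of kernel space. Both values are typical defaults rather than something every kernel build uses, and the variable names are ours.

```python
GIB = 1 << 30
MIB = 1 << 20

# Total virtual address space reachable with a 32-bit address.
ADDRESS_SPACE_32 = 1 << 32

# Assumption: the classic x86 3 GB/1 GB split, where kernel space
# starts at 0xC0000000.
KERNEL_START = 0xC0000000

# Assumption: 128 MB of kernel space is reserved and therefore not
# available for the direct mapping of physical memory.
RESERVED = 128 * MIB

user_space = KERNEL_START
kernel_space = ADDRESS_SPACE_32 - KERNEL_START
directly_mapped = kernel_space - RESERVED

print(f"total address space: {ADDRESS_SPACE_32 // GIB} GB")
print(f"user space:          {user_space // GIB} GB")
print(f"kernel space:        {kernel_space // GIB} GB")
print(f"directly mapped:     {directly_mapped // MIB} MB")
```

Under these assumptions the last line gives the 896 MB quoted for ZONE_NORMAL.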

  • 3.2 Virtual Memory Manager

Because the operating system maps all memory into virtual memory, the physical memory architecture is usually invisible to users and applications. To tune Linux memory, we must first understand how Linux handles virtual memory. As described in 3.1, an application does not allocate physical memory directly. Instead, it requests a memory map of a specific size from the Linux kernel and receives a map in virtual memory. As shown in the figure below, virtual memory is not necessarily backed by physical memory. If an application allocates a large amount of virtual memory, part of it may be mapped to swap space on disk.
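One way to observe this is to compare a process's virtual size with its resident size. The following is a minimal sketch, assuming a Linux system where /proc/self/status exposes the VmSize and VmRSS fields in kB; the helper name parse_status_kib is ours.

```python
import os


def parse_status_kib(text):
    """Collect the 'Name: <number> kB' fields of a /proc/<pid>/status dump."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        # Keep only the fields reported as '<number> kB'.
        if len(parts) == 2 and parts[0].isdigit() and parts[1] == "kB":
            fields[key.strip()] = int(parts[0])
    return fields


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    with open("/proc/self/status") as f:
        status = parse_status_kib(f.read())
    # VmSize is the virtual address space in use; VmRSS is the part of
    # it that currently resides in physical memory.
    print("VmSize:", status.get("VmSize"), "kB")
    print("VmRSS: ", status.get("VmRSS"), "kB")
```

VmSize is usually noticeably larger than VmRSS, because not every mapped virtual page has a physical page behind it.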

As the figure also shows, applications usually do not write directly to the disk subsystem. They write to the cache or buffers first, and the pdflush kernel threads later flush that data to disk, either when they have time to do so or when a file's size exceeds the buffer cache (since kernel 2.6.32, per-device flusher threads do this job in place of pdflush). See the section on flushing dirty buffers.

The way the Linux kernel manages the disk cache is closely tied to the way it writes data to the file system. Some other operating systems set aside only a fixed portion of memory as disk cache, whereas Linux treats memory as a far more flexible resource.

By default, the virtual memory manager uses all otherwise free memory as disk cache, so it is common to see a Linux system with several gigabytes of memory report only 20 MB free.

Linux also makes efficient use of swap space. The fact that the system has started to use swap does not necessarily indicate a memory bottleneck; it can simply mean that Linux is putting its resources to efficient use. See the item on page frame reclaiming below.
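To check this on a running system, you can compare the free memory with the memory held as buffers and cache. A minimal sketch, assuming the MemFree, Buffers, and Cached fields of /proc/meminfo (reported in kB); the function names are illustrative.

```python
import os


def parse_meminfo(text):
    """Map each /proc/meminfo field name to its numeric value."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def cache_summary(info):
    """Return (free_kib, buffers_plus_cache_kib) from parsed meminfo."""
    return info["MemFree"], info["Buffers"] + info["Cached"]


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        meminfo = parse_meminfo(f.read())
    free_kib, cache_kib = cache_summary(meminfo)
    print(f"free: {free_kib // 1024} MB, buffers + cache: {cache_kib // 1024} MB")
```

A small "free" figure next to a large "buffers + cache" figure is the normal picture described above, since cached pages can be reclaimed when processes need memory.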

Here are a few key concepts:

  1. Page Frame Allocation

A page is a group of contiguous linear addresses in physical memory (a page frame) or in virtual memory. The Linux kernel manages memory in units of pages, typically 4 KB in size. When a process requests a certain number of pages and enough free pages are available, the kernel allocates them to the process immediately. Otherwise, the kernel has to take pages from other processes or from the page cache. The kernel knows how many pages are available and where they are.
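Because allocation happens in whole pages, a request is effectively rounded up to a multiple of the page size. Here is a minimal sketch of that rounding, using Python's mmap.PAGESIZE for the page size of the machine the script runs on; the helper pages_needed is hypothetical and only illustrates the arithmetic.

```python
import mmap


def pages_needed(num_bytes, page_size=mmap.PAGESIZE):
    """Number of whole pages required to hold num_bytes."""
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")
    # Ceiling division: a partially used page still costs a full page.
    return (num_bytes + page_size - 1) // page_size


print("page size:", mmap.PAGESIZE, "bytes")
print("pages for 10000 bytes:", pages_needed(10000))
```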

  2. Buddy System

The Linux kernel maintains its free pages using a mechanism called the buddy system. The buddy system keeps track of free pages, tries to satisfy page allocation requests from them, and tries to keep free memory contiguous. If small pages are scattered without regard to fragmentation, memory becomes fragmented and it gets difficult to allocate a large contiguous run of pages, which leads to inefficient memory use and reduced performance. The following figure illustrates how the buddy system allocates memory pages:
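The splitting and merging behaviour can also be sketched with a toy allocator. This is a simplified illustration of the general buddy technique, not the kernel's implementation: the class and method names are ours, it tracks page frame numbers only, and a block of order k means 2**k contiguous pages.

```python
class BuddyAllocator:
    """Toy buddy allocator over 2**max_order page frames (illustrative)."""

    def __init__(self, max_order):
        self.max_order = max_order
        # free_lists[k] holds the first frame number of each free block
        # of 2**k pages. Initially all memory is one maximal block.
        self.free_lists = [set() for _ in range(max_order + 1)]
        self.free_lists[max_order].add(0)

    def alloc(self, order):
        """Allocate 2**order pages; return the first frame or None."""
        for k in range(order, self.max_order + 1):
            if self.free_lists[k]:
                start = min(self.free_lists[k])
                self.free_lists[k].remove(start)
                # Split the block in half until it has the requested
                # size, putting each unused upper half back as free.
                while k > order:
                    k -= 1
                    self.free_lists[k].add(start + (1 << k))
                return start
        return None  # no free block is large enough

    def free(self, start, order):
        """Free a block and merge it with its buddy while possible."""
        while order < self.max_order:
            buddy = start ^ (1 << order)  # buddy differs in one bit
            if buddy not in self.free_lists[order]:
                break
            self.free_lists[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free_lists[order].add(start)


allocator = BuddyAllocator(max_order=4)  # 16 page frames
first = allocator.alloc(0)               # one page, splits the big block
second = allocator.alloc(0)              # its buddy is handed out next
allocator.free(second, 0)
allocator.free(first, 0)                 # buddies merge back step by step
print([sorted(blocks) for blocks in allocator.free_lists])
```

After both pages are freed, the buddies merge all the way back into a single block of 16 pages, which is how the scheme keeps free memory contiguous.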

If an attempt to allocate memory pages fails, page reclaiming is started. You can view the state of the buddy system in the /proc/buddyinfo file.
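Each line of /proc/buddyinfo lists, for one zone, how many free blocks exist at each order, starting with order 0 (single pages). A minimal sketch of reading it, assuming the usual "Node N, zone NAME counts..." line layout; the function names are ours.

```python
import os


def parse_buddyinfo(text):
    """Return a list of (node, zone, counts), counts[k] = free order-k blocks."""
    zones = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        counts = [int(value) for value in parts[4:]]
        zones.append((node, parts[3], counts))
    return zones


def free_pages(counts):
    """Total free pages: an order-k block contributes 2**k pages."""
    return sum(count << order for order, count in enumerate(counts))


if __name__ == "__main__" and os.path.exists("/proc/buddyinfo"):
    with open("/proc/buddyinfo") as f:
        for node, zone, counts in parse_buddyinfo(f.read()):
            print(f"node {node} zone {zone}: {free_pages(counts)} free pages")
```

Many free blocks at low orders and few at high orders suggest fragmentation: there is free memory, but little of it is contiguous.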

  3. Page Frame Reclaiming

If no pages are available when a process requests a given number of pages, the kernel tries to free some for the new request by releasing pages that were used earlier, are no longer in use, but are still marked as active according to certain principles. This process is called page reclaiming. The kswapd kernel thread and the try_to_free_page() kernel function are responsible for it.

kswapd usually sleeps in the TASK_INTERRUPTIBLE state and is woken up by the buddy system when the number of free pages in a zone falls below a threshold. It looks for pages to reclaim among the active pages based on the Least Recently Used (LRU) principle: the least recently used pages are released first. It tracks candidate pages using an active list and an inactive list. kswapd scans the active list to check how recently each page was used, and moves pages that have not been used recently to the inactive list. The vmstat -a command shows how much memory is considered active and inactive.

kswapd also follows another principle. Pages serve two main purposes: the page cache and the process address space. The page cache consists of pages mapped to files on disk. Pages in the process address space are used for the heap and the stack; they are also called anonymous memory because they are not mapped to any file and have no name. When reclaiming memory, kswapd prefers to shrink the page cache rather than page out pages owned by processes.

The proportion in which page cache and process address space are reclaimed depends on the workload and can affect performance. You can influence this behavior through the /proc/sys/vm/swappiness file.
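The same active and inactive totals can be read programmatically. A minimal sketch, assuming the Active and Inactive fields of /proc/meminfo (in kB); the function name is ours.

```python
import os
import re


def active_inactive_kib(meminfo_text):
    """Return (active_kib, inactive_kib) from /proc/meminfo text."""
    values = {}
    for name in ("Active", "Inactive"):
        # Anchoring on 'Name:' skips the 'Active(anon):' style breakdowns.
        match = re.search(rf"^{name}:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
        if match is None:
            raise ValueError(f"{name} field not found")
        values[name] = int(match.group(1))
    return values["Active"], values["Inactive"]


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        active, inactive = active_inactive_kib(f.read())
    print(f"active: {active} kB, inactive: {inactive} kB")
```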
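The current setting can be read like any other file. A minimal sketch, assuming the standard location /proc/sys/vm/swappiness; the helper read_swappiness and its path parameter are ours.

```python
import os

SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current vm.swappiness value as an integer."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__" and os.path.exists(SWAPPINESS_PATH):
    print("vm.swappiness =", read_swappiness())
```

A higher value makes the kernel more willing to swap out anonymous memory, while a lower value makes it favour shrinking the page cache. Changing the value requires root privileges, for example by writing a new number to the same file.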

  4. Swap partition

When page reclaiming occurs, candidate pages on the inactive list that belong to a process address space may be paged out. Having swap in use is in itself normal. On other operating systems, swap is little more than a guarantee that the system can allocate more memory than is physically present, but Linux uses swap space more efficiently. Virtual memory consists of physical memory plus the disk subsystem or swap partition. If the Linux virtual memory manager notices that a page has been allocated but has not been used for a long time, it moves that page to swap space.

Page out and swap out: it is easy to confuse "page out" with "swap out". "Page out" moves some pages (part of an address space) to swap space, whereas "swap out" moves an entire address space to swap space.
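To see how much swap is actually in use, you can read /proc/swaps. A minimal sketch, assuming its usual five-column layout (Filename, Type, Size, Used, Priority, with sizes in kB); the function name is ours.

```python
import os


def parse_swaps(text):
    """Return a list of (name, size_kib, used_kib) from /proc/swaps text."""
    areas = []
    for line in text.splitlines()[1:]:  # the first line is the header
        fields = line.split()
        if len(fields) >= 5:
            areas.append((fields[0], int(fields[2]), int(fields[3])))
    return areas


if __name__ == "__main__" and os.path.exists("/proc/swaps"):
    with open("/proc/swaps") as f:
        for name, size_kib, used_kib in parse_swaps(f.read()):
            print(f"{name}: {used_kib} of {size_kib} kB used")
```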

Four, a more intuitive explanation

Virtual memory consists of physical memory plus the disk subsystem or swap partition. Physical memory, disk, and swap are abstracted into one large memory, called virtual memory. The addresses a process allocates in virtual memory are mapped onto these physical devices, so a given part may reside in physical memory, in swap, or on disk.

From this, we can draw the following conclusions:

  1. Each process has its own 4 GB of memory (on a 32-bit system), but this is only virtual memory. Every access to a memory address requires translating that virtual address into a physical memory address.

  2. When a new process is created, it gets its own address space, and the program's data, code, and so on are loaded from disk into that address space as they are needed. Where this data lives is recorded in a linked list reachable from the process's task_struct in the process control table. The list records which addresses hold no data, which are readable, and which are writable.

  3. All processes share the same physical memory, and each process maps into physical memory only the part of its virtual address space that it currently needs (see 3.1).

  4. To know which virtual addresses are in physical memory, which are not, and where in physical memory they are, a page table is required.

  5. Each page table entry has two parts. The first records whether the page is present in physical memory; the second records the address of the physical page, if it is present.

  6. When a process accesses a virtual address, the page table is consulted. If the corresponding data is not in physical memory, a page fault occurs and the kernel loads the data into physical memory. If memory is full, an existing page is chosen and overwritten; if that page has been modified, it must first be written back to disk.
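The steps in this list can be sketched as a toy model. The code below is only an illustration under simplifying assumptions: a single flat page table, 4 KB pages, and least-recently-used eviction standing in for the kernel's real reclaim logic. The class ToyMMU and its counters are hypothetical names.

```python
from collections import OrderedDict

PAGE_SIZE = 4096


class ToyMMU:
    """Toy address translation with page faults and eviction."""

    def __init__(self, num_frames):
        # Page table: virtual page number -> [frame number, dirty flag].
        # Only pages present in physical memory have an entry, and the
        # OrderedDict keeps them in least-recently-used order.
        self.page_table = OrderedDict()
        self.free_frames = list(range(num_frames))
        self.page_faults = 0
        self.writebacks = 0

    def access(self, vaddr, write=False):
        """Translate vaddr to a physical address, faulting if needed."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.page_table:
            self.page_table.move_to_end(vpn)  # mark as recently used
        else:
            self.page_faults += 1             # page not present
            if self.free_frames:
                frame = self.free_frames.pop(0)
            else:
                # Memory is full: evict the least recently used page.
                _victim, (frame, dirty) = self.page_table.popitem(last=False)
                if dirty:
                    self.writebacks += 1      # modified page goes to disk
            self.page_table[vpn] = [frame, False]
        entry = self.page_table[vpn]
        if write:
            entry[1] = True
        return entry[0] * PAGE_SIZE + offset


mmu = ToyMMU(num_frames=2)
mmu.access(0, write=True)       # fault, page 0 loaded and modified
mmu.access(PAGE_SIZE + 5)       # fault, page 1 loaded
mmu.access(0)                   # hit
mmu.access(2 * PAGE_SIZE)       # fault, evicts clean page 1
mmu.access(3 * PAGE_SIZE)       # fault, evicts dirty page 0
print("page faults:", mmu.page_faults, "writebacks:", mmu.writebacks)
```

Only the eviction of the modified page counts as a write-back, which mirrors the last point in the list.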

Five, what's next

With the Linux memory architecture covered, the next section will look at the Linux file system.