mmap
mmap maps files or devices into memory, allowing applications to read and write files as if they were memory.
#include <sys/mman.h>
void *mmap(void *addr, size_t length, int prot, int flags,
           int fd, off_t offset);
int munmap(void *addr, size_t length);
For details about the parameters, see Linux Programmer’s Manual: mmap(2).
Once a file is mapped with mmap, reading and writing it is no different from reading and writing memory, and no system call is required, provided the data being accessed is already in the page cache.
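As a minimal sketch of what "reading a file like memory" looks like in practice, the function below counts newline bytes through a read-only mapping. The name count_newlines is a hypothetical helper invented for this illustration, and error handling is kept deliberately simple.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count '\n' bytes in a file by reading it through a read-only mapping.
 * Returns the count, or -1 on error. count_newlines is a hypothetical helper. */
long count_newlines(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {              /* mmap rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                          /* the mapping stays valid after close */
    if (data == MAP_FAILED)
        return -1;

    long count = 0;
    for (size_t i = 0; i < len; i++)    /* plain loads, no read() calls */
        if (data[i] == '\n')
            count++;

    munmap((void *)data, len);
    return count;
}
```

The only system calls are the ones that set up and tear down the mapping; the loop itself touches the file contents with ordinary memory loads.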
If the accessed data is not in the page cache, a page fault (more precisely, a major page fault) occurs, which triggers file I/O, blocks the thread, and causes a context switch.
(Image by Scylla)
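One way to observe these faults from user space is to compare the process's fault counters before and after touching a mapping. The sketch below assumes a Linux system where getrusage fills in ru_minflt and ru_majflt; struct fault_counts and touch_and_count_faults are hypothetical names made up for this example.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <unistd.h>

/* Fault counts observed while touching a mapping (hypothetical helper type). */
struct fault_counts {
    long minor;   /* resolved without disk I/O, e.g. page already in page cache */
    long major;   /* required reading the page from the backing file */
};

/* Map `path`, read one byte from every page, and report how many page
 * faults the process took while doing so. Returns 0 on success, -1 on error. */
int touch_and_count_faults(const char *path, struct fault_counts *out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    size_t len = (size_t)st.st_size;
    const volatile unsigned char *p =
        mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    struct rusage before, after;
    getrusage(RUSAGE_SELF, &before);

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned long sink = 0;
    for (size_t off = 0; off < len; off += page)
        sink += p[off];                 /* first touch of a page may fault */
    (void)sink;

    getrusage(RUSAGE_SELF, &after);
    munmap((void *)p, len);

    out->minor = after.ru_minflt - before.ru_minflt;
    out->major = after.ru_majflt - before.ru_majflt;
    return 0;
}
```

For a file that was just written or recently read, the major count will usually be zero because its pages are still in the page cache; a cold file is where major faults, and the blocking they imply, tend to show up.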
Advantages and disadvantages
The benefits of using mmap to access files are obvious:
- Reading and writing files does not require read/write system calls.
- Memory copies between user space and kernel space are reduced.
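To illustrate both points for the write path, here is a small sketch that updates a file in place through a shared mapping. overwrite_prefix is a hypothetical helper, and the choice of MS_SYNC is just one reasonable option for this example, not the only way to flush.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Overwrite the first `n` bytes of an existing file through a shared mapping.
 * Returns 0 on success, -1 on error. overwrite_prefix is a hypothetical helper. */
int overwrite_prefix(const char *path, const void *src, size_t n)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || n == 0 || (off_t)n > st.st_size) {
        close(fd);
        return -1;
    }

    /* MAP_SHARED makes stores visible in the file; MAP_PRIVATE would not. */
    char *dst = mmap(NULL, n, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (dst == MAP_FAILED)
        return -1;

    memcpy(dst, src, n);                /* an ordinary store, no write() call */

    /* Ask the kernel to write the dirty pages back before we return. */
    int rc = msync(dst, n, MS_SYNC);
    munmap(dst, n);
    return rc;
}
```

The memcpy writes straight into page cache pages, so there is no intermediate kernel buffer to copy through as there would be with write().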
mmap also has some disadvantages:
- A mapping has a fixed length, so only fixed-length files are well supported (mremap might be used to support variable-length files, but I’ve rarely seen anyone do this).
- Mapping a large number of files, or very large files, can make the page tables expensive. For an introduction to page tables, refer to my previous article on Linux Memory Management.
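The fixed-length limitation can be worked around by growing the file and mapping it again. The sketch below uses the portable ftruncate, mmap, munmap sequence; on Linux, mremap could replace the mmap/munmap pair. struct file_map and file_map_grow are hypothetical names for this illustration, and the mapping is assumed to start at offset 0 of the file.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* A file-backed mapping together with its length (hypothetical helper type). */
struct file_map {
    int    fd;
    char  *addr;
    size_t len;
};

/* Grow the file to `new_len` bytes and replace the mapping with a larger one.
 * If mapping fails the old mapping is left in place. Returns 0 or -1. */
int file_map_grow(struct file_map *m, size_t new_len)
{
    if (new_len <= m->len)
        return -1;

    /* Extend the file first: touching mapped pages past EOF raises SIGBUS. */
    if (ftruncate(m->fd, (off_t)new_len) < 0)
        return -1;

    char *p = mmap(NULL, new_len, PROT_READ | PROT_WRITE, MAP_SHARED, m->fd, 0);
    if (p == MAP_FAILED)
        return -1;

    munmap(m->addr, m->len);    /* old pointers into the mapping are now invalid */
    m->addr = p;
    m->len  = new_len;
    return 0;
}
```

Because the base address can change, callers have to store offsets rather than raw pointers into the mapping, which is part of why this pattern is awkward for variable-length files.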
Other
The kernel maintains a task_struct structure for each process. The mm_struct in task_struct describes the process's virtual memory. The mmap field in mm_struct is a vm_area_struct pointer. The vm_area_struct objects in the kernel are organized into a linked list plus a red-black tree. In theory, a single call to mmap by a process produces one vm_area_struct object (ignoring the fact that the kernel automatically merges adjacent, compatible memory regions).
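The kernel structures themselves are not visible from user space, but on Linux each memory region of the current process appears as one line in /proc/self/maps. As a rough sketch, the code below maps one page of a file and checks that the returned address falls inside a listed region. Both address_is_mapped and mapping_shows_up are hypothetical helpers written for this example.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Return 1 if `addr` falls inside a region listed in /proc/self/maps,
 * 0 if it does not, and -1 if the file cannot be read (Linux only). */
int address_is_mapped(const void *addr)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    uintptr_t target = (uintptr_t)addr;
    char line[8192];
    int found = 0;

    while (fgets(line, sizeof line, f)) {
        unsigned long start, end;
        /* Each line begins with "start-end" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &start, &end) == 2 &&
            target >= start && target < end) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}

/* Map one page of `path` and report whether the new region is listed.
 * Returns 1 or 0 as above, or -1 if the file cannot be opened or mapped. */
int mapping_shows_up(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, page, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    int listed = address_is_mapped(p);
    munmap(p, page);
    return listed;
}
```

The check uses a range test rather than an exact start address, so it still works if the kernel has merged the new region with a neighbouring one.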
In fact, Linux uses mmap to load binaries into memory. Because a binary is divided into segments such as code and data, this is not a simple mapping of the entire file: each segment of the file must be mapped to a different location in memory. mmap maps regions of virtual address space to file contents. For details on the process memory layout, see my previous article on Linux Memory Management.
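The segments that get mapped are described by the PT_LOAD entries in the ELF program header table. As a hedged sketch, the function below maps an executable and counts those entries. It assumes a 64-bit ELF file in the host's byte order, and count_load_segments is a hypothetical helper, not part of any library.

```c
#include <elf.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count the PT_LOAD program headers of a 64-bit ELF file.
 * Returns the count, or -1 if the file is not a readable ELF64 image.
 * Assumes the file uses the host's byte order. */
int count_load_segments(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || (size_t)st.st_size < sizeof(Elf64_Ehdr)) {
        close(fd);
        return -1;
    }

    size_t len = (size_t)st.st_size;
    const unsigned char *base = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (base == MAP_FAILED)
        return -1;

    int count = -1;
    Elf64_Ehdr eh;
    memcpy(&eh, base, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_ident[EI_CLASS] == ELFCLASS64 &&
        eh.e_phentsize >= sizeof(Elf64_Phdr) &&
        eh.e_phoff <= len &&
        (size_t)eh.e_phnum * eh.e_phentsize <= len - eh.e_phoff) {
        count = 0;
        for (size_t i = 0; i < eh.e_phnum; i++) {
            Elf64_Phdr ph;
            memcpy(&ph, base + eh.e_phoff + i * eh.e_phentsize, sizeof ph);
            if (ph.p_type == PT_LOAD)   /* a segment the loader maps */
                count++;
        }
    }

    munmap((void *)base, len);
    return count;
}
```

Calling it on /proc/self/exe on a 64-bit Linux system reports how many separately mapped segments the running program's own binary declares.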
You can run cat /proc/
Resources
- Different I/O Access Methods for Linux, What We Chose for Scylla, and Why
- Linux Programmer’s Manual: mmap(2)