Pinned memory, or page-locked memory:
The host operating system uses virtual memory. When physical memory runs short, the OS swaps pages out to disk (the swap space on the host) and loads them back from disk when they are needed again. This lets programs use more memory than is physically installed.
Page-locked memory allows the DMA controller on the GPU to access host memory directly, without CPU involvement. Video memory on the GPU is always page-locked, because GPU memory does not support swapping to disk. Allocating page-locked host memory means locking the pages at allocation time so that the OS never swaps them out to disk.
CUDA page-locked memory can be allocated with cuMemAllocHost() in the CUDA driver API or cudaMallocHost() in the CUDA runtime API. Alternatively, you can take memory already allocated with malloc() on the host and register it as page-locked memory with cudaHostRegister().
The benefits of using page-locked memory are as follows:
- Data transfers between device memory and page-locked host memory can run concurrently with kernel execution.
- Page-locked host memory can be mapped into the device's address space, reducing explicit data transfers between the device and the host.
- On systems with a front-side bus, transfers between page-locked host memory and device memory are faster, and bandwidth is higher still when the host memory is also allocated as write-combining.
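The first benefit can be sketched in PyTorch. This is a minimal illustration, not a benchmark: the helper name copy_to_device and the tensor size are made up for the example, and the code assumes that on a machine without a CUDA device it should simply skip pinning.

```python
import torch


def copy_to_device(n_elements=1 << 20):
    """Copy a host tensor to the GPU, pinning it first when CUDA is present."""
    host = torch.ones(n_elements)  # ordinary pageable host memory

    if not torch.cuda.is_available():
        # Assumption: with no CUDA device there is nothing to pin for,
        # so the pageable tensor is returned unchanged.
        return host, False

    pinned = host.pin_memory()  # copy into page-locked host memory
    # non_blocking=True queues the host-to-device copy asynchronously;
    # it can only overlap with other work when the source is pinned.
    on_gpu = pinned.to("cuda", non_blocking=True)
    torch.cuda.synchronize()  # wait for the queued copy to finish
    return on_gpu, pinned.is_pinned()


result, was_pinned = copy_to_device()
print(result.device, was_pinned)
```

Tensor.pin_memory() returns a new tensor in page-locked memory rather than pinning the original in place, so the extra host-side copy only pays off when the pinned buffer is reused or the transfer actually overlaps with GPU work.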
pin_memory refers to page-locked memory. When you create a DataLoader with pin_memory=True, the tensors it produces are placed in page-locked host memory from the start. This makes transferring those tensors from host memory to the GPU's video memory faster.
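A minimal sketch of that setup might look like the following. The toy dataset, batch size, and variable names are assumptions made for illustration; pinning is enabled only when a CUDA device is detected, since it brings no benefit on a CPU-only machine.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset (hypothetical): 64 samples with 8 features each, plus labels.
features = torch.randn(64, 8)
labels = torch.randint(0, 2, (64,))
dataset = TensorDataset(features, labels)

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# Ask the loader for page-locked batches only when a GPU is present.
loader = DataLoader(dataset, batch_size=16, pin_memory=use_cuda)

batch_shapes = []
for x, y in loader:
    # With pin_memory=True the batches already sit in page-locked memory,
    # so the copy to the GPU can be issued with non_blocking=True.
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    batch_shapes.append(tuple(x.shape))

print(batch_shapes)
```

Combining pin_memory=True with non_blocking=True is what lets the copy of one batch overlap with computation on another; with pageable memory the same call falls back to an ordinary synchronous copy.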
Host memory comes in two kinds: page-locked and pageable (not page-locked). Content stored in page-locked memory is never swapped out to disk under any circumstances, whereas data stored in pageable memory may be swapped out to disk when host memory runs short.
The video memory on the graphics card, by contrast, is all page-locked!
When the computer has plenty of memory, you can set pin_memory=True. Set pin_memory=False if the system becomes sluggish or starts swapping heavily. Because the benefit of pin_memory depends on the hardware, and the PyTorch developers cannot assume that every deep learning practitioner has a high-end machine, pin_memory defaults to False.