Preface
In this article we will learn about zero-copy technology in Linux IO. The reference links at the end of this article are very good and worth a look.
Traditional IO process
Consider a scenario where we read a file from disk and send the data to another machine over the network. From the user's point of view, this looks like a simple two-step operation:
File.read(fileDesc, buf, len);
Socket.send(socket, buf, len);
However, if we look at what the kernel actually does for this transfer, we can see that even with hardware support for DMA, the approach is inefficient. First, the kernel uses DMA to load the data from disk into its own kernel buffer, unless the data is already cached there from a previous access to the same file. This transfer needs little CPU work beyond buffer management and setting up and handling the DMA. The kernel then copies the data into the requesting application's address space, at the buffer address specified in the read() system call; this copy is done by the CPU. When the application calls write(), the operating system copies the data again, from the user-space buffer into the kernel buffer associated with the network stack, which is also a CPU copy. Finally, the data is handed to the network interface card, and while that DMA transfer is in progress the application can go off and do other work. Once the write() system call has returned, the contents of the user application buffer can be safely discarded or modified, because the operating system keeps its own copy in the kernel buffer until the data has been successfully delivered to the hardware.
So we can see that this process involves four context switches and four data copies:

- DMA copy from the disk into the kernel read buffer
- CPU copy from the kernel read buffer into the user buffer (read())
- CPU copy from the user buffer into the kernel socket buffer (write())
- DMA copy from the kernel socket buffer to the network interface card
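As a rough illustration, here is a minimal C sketch of this traditional path. The function name send_file_traditional is made up for this example, sockfd is assumed to be an already-connected socket, and error handling is kept to a bare minimum:

```c
#include <fcntl.h>
#include <unistd.h>

/* Traditional read()/write() path: the data crosses the user/kernel
 * boundary twice (kernel buffer -> user buffer, user buffer -> socket buffer). */
ssize_t send_file_traditional(int sockfd, const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    char buf[64 * 1024];                              /* user-space staging buffer */
    ssize_t n, total = 0;

    while ((n = read(fd, buf, sizeof(buf))) > 0) {    /* CPU copy: kernel -> user */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(sockfd, buf + off, n - off);  /* CPU copy: user -> socket buffer */
            if (w < 0) {
                close(fd);
                return -1;
            }
            off += w;
        }
        total += n;
    }

    close(fd);
    return n < 0 ? -1 : total;
}
```

Every chunk of the file is staged in buf, so the CPU is involved in two of the four copies for each iteration of the loop.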
Using mmap()
In Linux, one way to reduce the number of copies is to call mmap() instead of read(), for example:
tmp_buf = mmap(file, len);
write(socket, tmp_buf, len);
After the application calls mmap(), the data is first DMA-copied into a buffer in the operating system kernel. The application then shares this buffer with the kernel, so no data needs to be copied between kernel space and the application's address space. When the application calls write(), the kernel copies the data from the original kernel buffer into the socket-related kernel buffer. Finally, the data is copied from the kernel socket buffer to the protocol engine, which is the third data copy.
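A minimal sketch of the mmap()/write() variant might look like the following. The name send_file_mmap is illustrative and sockfd is again assumed to be a connected socket:

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* mmap()/write(): the file pages are shared between the page cache and the
 * process, so the kernel -> user copy that read() would do disappears. */
ssize_t send_file_mmap(int sockfd, const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    /* Map the whole file; MAP_SHARED means the mapping is backed directly
     * by the page cache rather than by private copies. */
    char *buf = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) {
        close(fd);
        return -1;
    }

    off_t off = 0;
    while (off < st.st_size) {
        ssize_t w = write(sockfd, buf + off, st.st_size - off);  /* CPU copy: page cache -> socket buffer */
        if (w < 0)
            break;
        off += w;
    }

    munmap(buf, st.st_size);
    close(fd);
    return off;
}
```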
Although mmap() can eliminate one data copy, the implementation is more complex and calling mmap() has its own overhead, so in some situations it is not worthwhile:

- When accessing small files, calling read() or write() directly is more efficient.
- When a single process accesses a file sequentially, mmap() offers little performance gain. For example, when a file is read sequentially with read(), the file system's read-ahead caches the file contents in the file system buffer, so read() will largely hit the cache.
So, when is it more efficient to use mmap() to access files?
- When performing random access to a file, read() or write() will have a low cache hit ratio; in this case mmap() is usually more efficient (see the sketch after this list).
- When multiple processes access the same file simultaneously (whether sequentially or randomly), mmap() lets the file contents in the OS buffer be shared among those processes, which, from the operating system's point of view, can save a lot of memory.
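As a rough illustration of the random-access case, the sketch below maps a file once with MAP_SHARED and then touches bytes at arbitrary offsets; each access is an ordinary memory load served from the shared page cache rather than an lseek()/read() round trip into the kernel:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc < 2)
        return 1;

    int fd = open(argv[1], O_RDONLY);
    struct stat st;
    if (fd < 0 || fstat(fd, &st) < 0)
        return 1;

    /* MAP_SHARED: the mapping is backed by the page cache, so other
     * processes mapping the same file see and share the same pages. */
    unsigned char *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return 1;

    /* Random access is just pointer arithmetic; no per-access system call. */
    for (off_t off = 0; off < st.st_size; off += 4096)
        printf("byte at %lld: 0x%02x\n", (long long)off, map[off]);

    munmap(map, st.st_size);
    close(fd);
    return 0;
}
```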
sendfile()
To simplify the user interface while keeping the mmap()/write() technique's reduction in CPU copies, Linux introduced the sendfile() system call in kernel version 2.1.
sendfile(sockfd, fd, NULL, len);
sendfile() not only reduces data copying, it also reduces context switching. First, the sendfile() system call uses the DMA engine to copy the data from the file into the operating system kernel buffer; the CPU then copies that data into the socket-related kernel buffer. Next, the DMA engine copies the data from the kernel socket buffer to the protocol engine.
As you can see, compared with sending a file via read() and write(), sendfile() eliminates one data copy and two context switches.
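A minimal sketch of the sendfile() variant, again assuming a connected socket sockfd and an illustrative function name, might look like this:

```c
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

/* sendfile(): the data moves disk -> kernel buffer -> socket buffer -> NIC
 * entirely inside the kernel; user space never touches it. */
ssize_t send_file_zero_copy(int sockfd, const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    off_t offset = 0;
    while (offset < st.st_size) {
        /* sendfile() may transfer less than requested, so loop until done;
         * the kernel advances 'offset' for us on each call. */
        ssize_t sent = sendfile(sockfd, fd, &offset, st.st_size - offset);
        if (sent <= 0)
            break;
    }

    close(fd);
    return offset;
}
```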
sendfile with DMA Gather Copy
To avoid the remaining copy inside the kernel, the network interface must support gather operations, meaning the data to be transmitted can be scattered across different memory locations rather than having to sit in one contiguous buffer. With gather support, the data read from the file no longer needs to be copied into the socket buffer; only a buffer descriptor is appended to the socket buffer for the network protocol stack, the card builds the packet-related structures from that descriptor, and its DMA engine gathers all the scattered pieces into a single network packet, reading headers and data from multiple locations in one operation. The socket buffer in Linux 2.4 was modified to satisfy this requirement, and this is what is usually meant by the zero-copy technique on Linux. The approach not only reduces the overhead of multiple context switches, it also removes the data copies done by the CPU, and nothing changes in the user application's code.

The flow is as follows. First, the sendfile() system call uses the DMA engine to copy the contents of the file into the kernel buffer. Then a buffer descriptor containing the file location and length information is appended to the socket buffer. Instead of copying the data from the kernel buffer into the socket buffer, the DMA engine copies the data directly from the kernel buffer to the protocol engine, thus eliminating the last CPU copy.
References
ZeroCopy: Techniques, Benefits and Pitfalls
Efficient data transfer through zero copy