The original address: www.programmersought.com/article/723…

Zero copy is a familiar term that features in many high-performance networking frameworks such as Netty, Kafka, and RocketMQ. So what exactly is zero copy?

Zero copy

Wikipedia defines zero copy as follows:

“Zero-copy” describes computer operations in which the CPU does not perform the task of copying data from one memory area to another. This is frequently used to save CPU cycles and memory bandwidth when transmitting a file over a network.


Zero copy in Linux

Terminology

  • Kernel space

    Computer memory is divided into user space and kernel space. Kernel space runs the OS kernel code with the highest privileges and has access to all memory, machine instructions, and hardware resources.

  • User space

    All memory outside the kernel, where normal user processes run. User-space processes cannot access kernel space directly; they can reach only a small part of the kernel through the interfaces it exposes as system calls. When a user process makes a system call, it raises a software interrupt to the kernel, which dispatches the appropriate interrupt handler to handle the request.

  • DMA

    Direct memory access (DMA) addresses the speed mismatch between the CPU and devices such as hard disks by allowing certain hardware subsystems to access main memory independently of the CPU.

    Without DMA, the CPU is tied up for the entire duration of an IO operation and cannot do any other work, so the computer appears to hang. With DMA, the IO process looks like this: the CPU initiates the DMA transfer and is then free to perform other operations while the transfer is in progress; once the transfer is complete, the DMA controller (DMAC) sends an interrupt signal to the CPU, which can then process the transferred data.

Traditional network transmission

A common scenario for network IO is to read a file from a hard disk and send it to the network through a network card. Here is some simple pseudocode:

// read data from hard disk
File.read(fileDesc, buf, len);
// Send data to the network
Socket.write(socket, buf, len);

At the code level, this is a very simple operation, but at the system level, let’s see what’s going on behind the scenes:

  1. The user process issues the read() system call (syscall) to request data from the hard disk. This causes a context switch.
  2. DMA reads the file from the hard disk. This is the first data copy: hard disk -> DMA buffer (a buffer in kernel space).
  3. The CPU copies the data to user space and read() returns. This causes a context switch and the second data copy: DMA buffer -> user space.
  4. The user process issues the write() system call to send the data. This causes a context switch and the third data copy, again performed by the CPU: user space -> socket buffer.
  5. DMA copies the data to the network card for transmission. This is the fourth data copy: socket buffer -> network card.
  6. write() returns, causing another context switch.

The data flow is as follows: hard disk -> DMA buffer -> user space -> socket buffer -> network card.

As you can see, there are four context switches and four data copies involved. For simple network file sending, there is a lot of unnecessary overhead.
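The traditional path can be sketched in Java as a plain read/write loop through a user-space byte array. This is only an illustration: the class name TraditionalCopyDemo, the 8 KiB buffer size, and the use of a ByteArrayOutputStream in place of a real socket stream are all assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; the name and the 8 KiB buffer size are illustrative.
class TraditionalCopyDemo {

    // Copies a file to the given stream through a user-space byte array
    // and returns the number of bytes copied.
    static long copy(Path file, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            // Each read() copies data from the kernel into buf (steps 1-3).
            while ((n = in.read(buf)) != -1) {
                // With a socket stream, each write() copies buf back
                // into the kernel (steps 4-6).
                out.write(buf, 0, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("traditional", ".txt");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "hello zero copy".getBytes(StandardCharsets.UTF_8));
        // Assumption: a ByteArrayOutputStream stands in for a socket's output stream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println("bytes sent: " + copy(tmp, sink));
    }
}
```

Every pass through the loop crosses the user/kernel boundary twice, which is the overhead the following sections try to remove.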

sendfile transmission

In the scenario above, the two CPU copies (from the DMA buffer to user space, and from user space to the socket buffer) are completely unnecessary. Zero copy was born to eliminate them, and for this purpose the Linux kernel provides the sendfile system call.

If sendfile() is used to perform the above request, the system flow can be simplified as follows: hard disk -> DMA buffer -> socket buffer -> network card.

The sendfile() system call copies the data from the DMA buffer to the socket buffer inside the kernel, without copying it to user space. As a result, the number of context switches is reduced to 2 and the number of data copies to 3.

This raises a question: why is there still a copy inside the kernel (one that requires the CPU)?

This is because early network cards required the data to be sent to be contiguous in physical memory, hence the socket buffer. But if the network card itself supports scatter-gather, that is, it can gather and send data from non-contiguous memory addresses, the process can be optimized further.

sendfile transmission with scatter-gather network card support

Linux has optimized this since kernel version 2.4. If the network card supports gather operations, sendfile() can skip copying the data into the socket buffer. Instead, only descriptors holding the location and length of the data are passed to the socket buffer, and the network card gathers the data directly from the DMA buffer.

With the support of the network card, the number of context switches remains at two and the number of data copies is reduced to two. Both remaining copies are performed by DMA and are required, which means that CPU copying of data within memory has been completely avoided.

In the case of sending files from the hard disk to the network, the sendfile() system call is zero copy in the true sense if the network card supports gather operations.

Memory mapping (mmap)

In the case of “sending files over the network”, using the sendfile() system call can greatly improve performance (according to tests, throughput can be up to three times that of the traditional method). One drawback is that sendfile() only supports a straight “read -> send” operation, so it is generally used to serve static network resources and cannot handle cases that need additional processing of the data. Memory mapping provides a solution to this.

mmap is a method of memory-mapping files. It maps a file into the address space of a process, establishing a correspondence between the file's disk addresses and a range of virtual addresses in the process. The user process can then read and write this block of memory using pointers, and changes made to it in kernel space are directly reflected in user space.

In short, mmap enables data sharing between user space and kernel space. If mmap() is used for the above scenario, the user process maps the file and then calls write() on the mapped region, so the copy from the DMA buffer to user space is no longer needed.

The data flow is as follows: hard disk -> DMA buffer (shared with user space) -> socket buffer -> network card.

Compared with the traditional method, mmap saves one data copy, so it can also be broadly regarded as zero copy. It also allows the user process to operate on the data, which is an advantage over sendfile.
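As a rough Java illustration of the "map, process, then send" idea, the sketch below maps a file, does a small piece of user-space work on the mapped bytes (a byte sum, chosen arbitrarily), and then writes the mapped region to a channel. The class name MmapSendDemo and the method sumAndSend are hypothetical, the file is assumed to be non-empty and smaller than 2 GiB, and a channel over a ByteArrayOutputStream stands in for a socket channel.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; assumes a non-empty file smaller than 2 GiB.
class MmapSendDemo {

    // Maps the file, sums its bytes in user space, then writes the mapped
    // region to the target channel. Returns the byte sum.
    static long sumAndSend(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            // Absolute get(i) reads the mapping without moving the buffer position.
            for (int i = 0; i < mapped.limit(); i++) {
                sum += mapped.get(i) & 0xFF;
            }
            // write() may consume only part of the buffer, so loop until it is drained.
            while (mapped.hasRemaining()) {
                target.write(mapped);
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mmap-send", ".txt");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "hello zero copy".getBytes(StandardCharsets.UTF_8));
        // Assumption: this channel stands in for a SocketChannel.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (WritableByteChannel target = Channels.newChannel(sink)) {
            System.out.println("byte sum: " + sumAndSend(tmp, target));
        }
    }
}
```

The byte sum is only a placeholder for whatever processing the application needs; the point is that the data can be inspected in user space without first being read into a separate buffer.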

Zero copy in JDK NIO

Starting with version 1.4, the JDK introduced NIO, which provides proper support for zero copy. Since the JVM runs on top of the operating system, this functionality is just a wrapper around the underlying system APIs; if the operating system itself does not support zero copy (mmap/sendfile), there is nothing the JVM can do. The JDK's zero-copy wrappers are mainly found in the FileChannel class.

map()

The map() method looks like this:

public abstract class FileChannel
    extends AbstractInterruptibleChannel
    implements SeekableByteChannel, GatheringByteChannel, ScatteringByteChannel {

    public abstract MappedByteBuffer map(MapMode mode, long position, long size) throws IOException;
}

The map() method is described as follows:

Maps a region of this channel's file directly into memory. For most operating systems, mapping a file into memory is more expensive than reading or writing a few tens of kilobytes of data through the usual read/write methods. From a performance standpoint, it is usually only worthwhile to map relatively large files into memory.

The map() method returns a MappedByteBuffer, of which DirectByteBuffer is a subclass. It refers to a block of memory outside the JVM heap, so its contents are not managed by the garbage collector in the way heap objects are. The mapping remains valid until the buffer object itself is garbage-collected, because the JDK provides no public method for unmapping it explicitly.
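A minimal sketch of using map() in read-write mode follows. It upper-cases ASCII letters in a file by writing through the mapping rather than through read()/write() calls. The class name MappedEditDemo and the method upperCaseInPlace are made up for illustration, and the file is assumed to be non-empty and smaller than 2 GiB.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; assumes a non-empty file smaller than 2 GiB.
class MappedEditDemo {

    // Upper-cases ASCII letters of the file in place via a read-write mapping.
    static void upperCaseInPlace(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_WRITE, 0, ch.size());
            for (int i = 0; i < mapped.limit(); i++) {
                byte b = mapped.get(i);
                if (b >= 'a' && b <= 'z') {
                    // Writing to the buffer modifies the mapped file content.
                    mapped.put(i, (byte) (b - 32));
                }
            }
            // Ask the OS to flush the modified pages to the storage device.
            mapped.force();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mapped-edit", ".txt");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "hello, mmap".getBytes(StandardCharsets.UTF_8));
        upperCaseInPlace(tmp);
        System.out.println(new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8));
    }
}
```

For a file this small the mapping costs more than it saves, in line with the note above that mapping is usually only worthwhile for relatively large files.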

transferTo()

The transferTo() method is as follows:

public abstract class FileChannel
    extends AbstractInterruptibleChannel
    implements SeekableByteChannel, GatheringByteChannel, ScatteringByteChannel {

    public abstract long transferTo(long position, long count, WritableByteChannel target) throws IOException;
}

The transferTo() method is described as follows:

Transfers bytes from this channel's file to the given writable byte channel. This method is potentially much more efficient than a simple loop that reads from this channel and writes to the target channel.

Many operating systems can transfer bytes directly from the file system cache to the target channel without actually copying them.

Note that sendfile() was designed for sending data to a socket (before Linux 2.6.33 its output descriptor had to be a socket), so the performance benefit of this zero-copy technique applies mainly to sending data over the network.

What does that mean? If you simply use transferTo() to write data from one file on your hard disk to another, you may see little or no performance gain.
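A minimal sketch of calling transferTo() is shown below. The class name TransferToDemo and the method send are hypothetical, and the target is assumed to be a blocking channel. Because transferTo() may move fewer bytes than requested, the call is placed in a loop. In the demo a channel over a ByteArrayOutputStream stands in for a SocketChannel, so the JDK falls back to ordinary copying there; the kernel-level path described above can only apply when the target is something like a socket channel.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; assumes the target channel is in blocking mode.
class TransferToDemo {

    // Sends the whole file to the target channel and returns the byte count.
    static long send(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel src = FileChannel.open(file, StandardOpenOption.READ)) {
            long position = 0;
            long size = src.size();
            // transferTo() may transfer fewer bytes than requested, so keep
            // calling it until the whole file has been handed over.
            while (position < size) {
                position += src.transferTo(position, size - position, target);
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("transfer", ".txt");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "hello zero copy".getBytes(StandardCharsets.UTF_8));
        // Assumption: this channel stands in for a SocketChannel.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (WritableByteChannel target = Channels.newChannel(sink)) {
            System.out.println("bytes sent: " + send(tmp, target));
        }
    }
}
```

Unlike the mapped-buffer approach, the application never sees the file contents here, which is why this style suits static resources that need no processing.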

Zero copy in Netty

// todo