Zero Copy

Let’s take a look at its definition:

“Zero-copy” describes computer operations in which the CPU does not perform the task of copying data from one memory area to another. This is frequently used to save CPU cycles and memory bandwidth when transmitting a file over a network.

Zero-copy means that an operation does not need to copy buffer data from one memory area to another. Eliminating that copy saves CPU cycles and memory bandwidth.

Zero-copy at the operating system level

Zero-copy at the OS level usually refers to avoiding copying data back and forth between user-space and kernel-space.

  • For example, the mmap system call provided by Linux maps a file into a region of a process’s user-space address range, backed by the same memory pages the kernel uses for that file. Once the mapping succeeds, modifications to this region of memory are directly visible to the kernel.

  • Changes the kernel makes to this region are likewise directly visible in user space. Because of this mapping relationship, we do not need to copy data between user space and kernel space, which improves the efficiency of data transfer.
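
The mapping idea can be tried from Java through FileChannel.map, which on Linux is typically backed by mmap. The following is a minimal sketch, not a benchmark; the class name MmapDemo, the method name and the temporary file are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: writes through a memory mapping, then reads the file normally.
final class MmapDemo {
    static String writeThroughMapping(String text) {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        try {
            Path file = Files.createTempFile("mmap-demo", ".bin"); // scratch file for the demo
            file.toFile().deleteOnExit();
            Files.write(file, new byte[payload.length]);           // pre-size the file with zeros
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer map =
                        ch.map(FileChannel.MapMode.READ_WRITE, 0, payload.length);
                map.put(payload); // a plain memory write into the mapped region
                map.force();      // ask the OS to flush the modified pages to the file
            }
            // Read the file back through the ordinary file API.
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling MmapDemo.writeThroughMapping("hello") returns the same text read back from the file, even though the program never issued an explicit write on the channel.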

Netty’s zero-copy is not quite the same as OS zero-copy. Apart from the FileRegion case described later, Netty’s zero-copy happens entirely in user mode (in Java); it is more a way of optimizing data manipulation by avoiding redundant copies.

Zero-copy in Netty

  • Netty provides the CompositeByteBuf class, which combines multiple ByteBufs into one logical ByteBuf, avoiding copying between the individual ByteBufs.

  • With the wrap operation, we can wrap a byte[] array, a ByteBuf, a ByteBuffer, and so on into a Netty ByteBuf object, thus avoiding copy operations.

  • ByteBuf supports slice, so a ByteBuf can be split into multiple ByteBufs that share the same storage area, avoiding memory copying.

  • FileChannel.transferTo, which FileRegion wraps, can transfer a file directly to the target Channel, avoiding the memory copies caused by the traditional read/write loop.

Zero copy with CompositeByteBuf

Suppose we have a protocol message that consists of a header and a body, which are stored in two ByteBufs:

ByteBuf header = ...
ByteBuf body = ...

When processing the message, we usually want to combine header and body into a single ByteBuf, so the usual practice is:

ByteBuf allBuf = Unpooled.buffer(header.readableBytes() + body.readableBytes());
allBuf.writeBytes(header);
allBuf.writeBytes(body);

As you can see, we have copied the header and body into the new allBuf, which creates two additional data copies. Is there a more efficient and elegant way to achieve the same goal? Take a look at how CompositeByteBuf fulfills this requirement.

ByteBuf header = ...
ByteBuf body = ...
CompositeByteBuf compositeByteBuf = Unpooled.compositeBuffer();
compositeByteBuf.addComponents(true, header, body);

In the code above, we create a CompositeByteBuf object and then call

public CompositeByteBuf addComponents(boolean increaseWriterIndex, ByteBuf... buffers) {... }

to merge header and body into one logical ByteBuf.

Note that although the CompositeByteBuf appears to be a single buffer made up of the two ByteBufs, inside the CompositeByteBuf the two ByteBufs still exist separately; the CompositeByteBuf only integrates them logically.

One thing to note about the CompositeByteBuf code above is that we call addComponents(boolean increaseWriterIndex, ByteBuf... buffers) to add the two ByteBufs with the first parameter set to true, which means the writerIndex of the CompositeByteBuf is automatically advanced when a new ByteBuf is added. If increaseWriterIndex were false, the writerIndex would stay at 0 and the added data could not be read from the CompositeByteBuf.
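
To make "logically integrated" concrete, here is a small sketch of the idea in plain Java. It is not Netty’s implementation: CompositeView is a hypothetical, read-only class invented for illustration, and it works on byte arrays rather than ByteBufs. It keeps references to its components and resolves each logical index to the component that holds it, so nothing is copied.

```java
// Hypothetical read-only view over several arrays; an illustration, not Netty's CompositeByteBuf.
final class CompositeView {
    private final byte[][] parts;
    private final int length;

    CompositeView(byte[]... parts) {
        this.parts = parts; // keep references only, copy nothing
        int total = 0;
        for (byte[] part : parts) {
            total += part.length;
        }
        this.length = total;
    }

    int length() {
        return length;
    }

    // Resolves a logical index to the component array that actually stores the byte.
    byte get(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        int offset = index;
        for (byte[] part : parts) {
            if (offset < part.length) {
                return part[offset];
            }
            offset -= part.length;
        }
        throw new AssertionError("unreachable");
    }
}
```

Because the view only stores references, a later change to one of the component arrays is visible through the view, which is the same sharing behavior the text describes for the two ByteBufs.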

Besides using the CompositeByteBuf class directly as above, we can also use the Unpooled.wrappedBuffer method, which encapsulates the CompositeByteBuf operations internally and is therefore more convenient to use:

ByteBuf header = ...
ByteBuf body = ...
ByteBuf allByteBuf = Unpooled.wrappedBuffer(header, body);

Zero copy through the wrap operation

Suppose we have a byte array that we want to convert to a ByteBuf object for subsequent operations. The traditional way is to copy the byte array into a ByteBuf:

byte[] bytes = ...
ByteBuf byteBuf = Unpooled.buffer();
byteBuf.writeBytes(bytes);

Obviously, this approach involves an extra copy. Instead, we can use the wrapping methods of Unpooled to wrap the byte array in a new ByteBuf instance without copying it. The above code can be changed to:

byte[] bytes = ...
ByteBuf byteBuf = Unpooled.wrappedBuffer(bytes);

The Unpooled.wrappedBuffer method wraps bytes into an UnpooledHeapByteBuf object without performing any copy. The resulting ByteBuf object shares its storage with the bytes array, so changes to bytes are reflected in the ByteBuf object.
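
The sharing described above can also be observed without Netty on the classpath. The sketch below uses the JDK’s ByteBuffer.wrap as a stand-in for Unpooled.wrappedBuffer; the two are different APIs, and the class name WrapDemo is an assumption made for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical demo: a wrapped buffer is backed by the original array, not by a copy.
final class WrapDemo {
    // Returns what the wrapper sees at index 0 after the array is changed.
    static byte readAfterArrayChange() {
        byte[] bytes = {1, 2, 3};
        ByteBuffer wrapped = ByteBuffer.wrap(bytes); // no copy: the buffer is backed by bytes
        bytes[0] = 42;                               // modify the array after wrapping
        return wrapped.get(0);                       // absolute read through the wrapper
    }
}
```

The method returns the value written to the array after wrapping, which shows that the wrapper and the array share one storage area. The flip side is that the array must not be reused for other data while the wrapper is still in use.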

Zero copy through the slice operation

The slice operation is the opposite of the wrap operation: Unpooled.wrappedBuffer merges multiple ByteBufs into one, while slice splits one ByteBuf into multiple ByteBuf objects that share the same storage area. ByteBuf provides two slice operations:

public ByteBuf slice();
public ByteBuf slice(int index, int length);

The slice method without arguments is equivalent to buf.slice(buf.readerIndex(), buf.readableBytes()); that is, it returns a slice of the readable part of buf. The slice(int index, int length) method is more flexible: different arguments yield slices of different regions of buf.

ByteBuf byteBuf = ...
ByteBuf header = byteBuf.slice(0, 5);
ByteBuf body = byteBuf.slice(5, 10);

The slice method produces header and body without copying; the header and body objects internally share different parts of the storage of byteBuf.
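
As a rough JDK analogue of slice(int index, int length), the sketch below builds a view over part of a java.nio.ByteBuffer. The class SliceDemo and the helper sliceOf are hypothetical names; Netty’s ByteBuf.slice has its own index semantics, so treat this only as an illustration of shared storage.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: creates views over sub-ranges of a buffer without copying.
final class SliceDemo {
    // Returns a view of buf[index, index + length) that shares storage with buf.
    static ByteBuffer sliceOf(ByteBuffer buf, int index, int length) {
        ByteBuffer dup = buf.duplicate(); // same content, independent position and limit
        dup.clear();                      // position = 0, limit = capacity
        dup.limit(index + length);
        dup.position(index);
        return dup.slice();               // view starting at the current position
    }
}
```

With a 15-byte buffer, sliceOf(buf, 0, 5) and sliceOf(buf, 5, 10) play the roles of header and body. Writing through the body view, for example body.put(0, (byte) 7), changes byte 5 of the original buffer, because both refer to the same memory.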

Zero copy with FileRegion

Netty uses FileRegion to implement zero-copy file transfer; underneath, FileRegion relies on the zero-copy capability of Java NIO’s FileChannel.transferTo.

Let’s start with the basics of Java IO. Suppose we want to implement a file copy function, then using the traditional way, we have the following implementation:

public static void copyFile(String srcFile, String destFile) throws Exception {
    byte[] temp = new byte[1024];
    // try-with-resources closes both streams even if a read or write fails
    try (FileInputStream in = new FileInputStream(srcFile);
         FileOutputStream out = new FileOutputStream(destFile)) {
        int length;
        while ((length = in.read(temp)) != -1) {
            out.write(temp, 0, length);
        }
    }
}

The above is a typical implementation of reading and writing binary data: it repeatedly reads data from the source file into the temp array and then writes temp to the destination file. This copying has little impact on small files, but for large files the frequent memory copies consume a lot of system resources. Let’s look at how Java NIO’s FileChannel achieves zero copy:

public static void copyFileWithFileChannel(String srcFileName, String destFileName) throws Exception {
    // try-with-resources closes the files (and their channels) when done
    try (RandomAccessFile srcFile = new RandomAccessFile(srcFileName, "r");
         RandomAccessFile destFile = new RandomAccessFile(destFileName, "rw")) {
        FileChannel srcFileChannel = srcFile.getChannel();
        FileChannel destFileChannel = destFile.getChannel();
        long position = 0;
        long count = srcFileChannel.size();
        // Note: transferTo may transfer fewer than count bytes in a single call
        srcFileChannel.transferTo(position, count, destFileChannel);
    }
}
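
One detail worth sketching: transferTo returns the number of bytes it actually transferred, which can be less than the number requested, so robust code usually calls it in a loop. The version below is a minimal sketch under that assumption; the class TransferLoop and the roundTrip helper (which exists only to exercise the copy with temporary files) are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: file copy with FileChannel.transferTo in a loop.
final class TransferLoop {
    // Copies src to dest and returns the number of bytes transferred.
    static long copy(Path src, Path dest) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dest, StandardOpenOption.WRITE,
                     StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                long moved = in.transferTo(position, size - position, out);
                if (moved <= 0) {
                    break; // nothing more could be transferred, e.g. the source shrank
                }
                position += moved;
            }
            return position;
        }
    }

    // Demo helper: writes data to a temp file, copies it, and returns the copy's content.
    static byte[] roundTrip(byte[] data) {
        try {
            Path src = Files.createTempFile("transfer-src", ".bin");
            Path dest = Files.createTempFile("transfer-dest", ".bin");
            src.toFile().deleteOnExit();
            dest.toFile().deleteOnExit();
            Files.write(src, data);
            copy(src, dest);
            return Files.readAllBytes(dest);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The loop advances position by the returned count rather than assuming the whole file went through in one call, which matters most for very large files and for non-blocking target channels.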

As you can see, with FileChannel the contents of the source file are transferred directly to the destination file without an additional temporary buffer, avoiding unnecessary memory copies. Let’s take a look at how FileRegion is used in Netty to transfer a file with zero copy:

@Override
public void channelRead0(ChannelHandlerContext ctx, String msg) throws Exception {
    RandomAccessFile raf = null;
    long length = -1;
    try {
        // 1. Open a file with RandomAccessFile.
        raf = new RandomAccessFile(msg, "r");
        length = raf.length();
    } catch (Exception e) {
        ctx.writeAndFlush("ERR: " + e.getClass().getSimpleName() + ":" + e.getMessage() + '\n');
        return;
    } finally {
        if (length < 0 && raf != null) {
            raf.close();
        }
    }
    ctx.write("OK: " + raf.length() + '\n');
    if (ctx.pipeline().get(SslHandler.class) == null) {
        // SSL not enabled - can use zero-copy file transfer.
        // 2. Call raf.getChannel() to get a FileChannel.
        // 3. Encapsulate FileChannel as a DefaultFileRegion
        ctx.write(new DefaultFileRegion(raf.getChannel(), 0, length));
    } else {
        // SSL enabled - cannot use zero-copy file transfer.
        ctx.write(new ChunkedFile(raf));
    }
    ctx.writeAndFlush("\n");
}

As you can see, the first step is to open the file with RandomAccessFile. We then obtain its FileChannel and wrap it in a DefaultFileRegion:

new DefaultFileRegion(raf.getChannel(), 0, length)

Java zero copy

Zero copy means that data is copied zero times between user mode and kernel mode.

Traditional data copying (file to file, client to server, and so on) involves four switches between user mode and kernel mode and four copies. Two of the four copies, those between user space and kernel space, require the CPU; the other two, between kernel space and the I/O device, are done by DMA without the CPU. Zero copy avoids the copies between user space and kernel space and eliminates two of the four mode switches.

  • Java’s zero copy is mostly used in web applications. The Java libraries support zero copy on Linux and UNIX systems; the key APIs are the transferTo() and transferFrom() methods of java.nio.channels.FileChannel.

  • transferTo() transfers bytes directly from the channel on which it is called to another writable byte channel, and transferFrom() transfers bytes from a readable byte channel into the file. Neither passes the data through the application, which makes data transfer more efficient.

Zero copy technology in Web environments

Many Web applications provide a lot of static content to the user, which means that a lot of data is read from the hard disk and transferred to the user through the socket unchanged. This operation may not seem to consume much CPU, but it is actually inefficient.

Traditional copy technique

The kernel reads the data from disk and passes it up to the user-level application, which then passes the same content back down to the socket at the kernel level. The application is really just an inefficient intermediary that moves data from the disk file to the socket.

Zero copy technology

Every time data crosses the user-kernel boundary it is copied, which consumes CPU cycles and memory bandwidth. A technique called zero copy gets rid of these unwanted copies.

  • With zero copy, the application asks the kernel to transfer the disk data directly to the socket rather than through the application. Zero copy improves application performance and reduces context switching between kernel mode and user mode.
  • Using a kernel buffer as an intermediary (rather than transferring data directly into the user buffer) may seem inefficient because it adds a copy. However, kernel buffers were introduced to improve performance.

The downside of the kernel buffer

On the read side, the kernel buffer acts as a read-ahead cache: when the amount of data requested is smaller than the kernel buffer, performance improves significantly. On the write side, the kernel buffer allows write requests to complete asynchronously.

Unfortunately, this method itself becomes a performance bottleneck when the requested data size is much larger than the kernel buffer size. This is because data needs to be copied many times between disk, kernel, and user buffers.

Zero Copy improves performance by eliminating these redundant data copies.

The traditional approach and the context switch involved

To transfer a file to another program over the network, the traditional approach requires four context switches between user mode and kernel mode inside the OS, and the data is copied four times. The detailed steps are as follows:

  • The read() call causes a context switch from user mode to kernel mode. Internally, sys_read() is called to read data from the file. The first copy is performed by the direct memory access (DMA) engine, which reads the file content from the disk and stores it in a kernel buffer.

  • The requested data is then copied from the kernel buffer into the user buffer, and read() returns. The return of the call triggers the second context switch, from kernel mode to user mode. At this point, the data is stored in the user buffer.

  • The send() socket call causes the third context switch, this time from user mode to kernel mode. A third copy occurs: the data is put into a buffer in the kernel address space again. This kernel buffer is a different buffer from the one in the first step.

  • Finally, the send() system call returns, causing the fourth context switch. The DMA engine copies the data from the kernel buffer to the protocol engine, which is the fourth copy. This fourth copy happens independently and asynchronously.
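
The four steps above correspond to an ordinary read/write loop. The sketch below shows such a loop in Java with comments mapping each call to the steps; TraditionalSend is a hypothetical class, and the in-memory sink used by the roundTrip helper is only a stand-in for a real socket so that the example runs on its own.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: the traditional copy path through a user-space buffer.
final class TraditionalSend {
    // Sends a file through a user-space buffer; returns the number of bytes written.
    static long send(Path file, WritableByteChannel target) throws IOException {
        ByteBuffer userBuffer = ByteBuffer.allocate(4096); // the user buffer of step 2
        long total = 0;
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
            while (in.read(userBuffer) != -1) {       // steps 1-2: disk -> kernel -> user buffer
                userBuffer.flip();                    // switch the buffer from filling to draining
                while (userBuffer.hasRemaining()) {   // steps 3-4: user buffer -> kernel -> device
                    total += target.write(userBuffer);
                }
                userBuffer.clear();                   // make the buffer ready for the next read
            }
        }
        return total;
    }

    // Demo helper: pushes data through send() into an in-memory sink (not a real socket).
    static byte[] roundTrip(byte[] data) {
        try {
            Path file = Files.createTempFile("traditional-send", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, data);
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            send(file, Channels.newChannel(sink));
            return sink.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Every byte of the file passes through userBuffer even though the application never inspects it, which is exactly the redundant work zero copy is meant to remove.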

Zero copy mode and the context switches involved

In Linux kernels 2.4 and later (for example, on CentOS 6 or later), the socket buffer descriptor was modified so that network adapters supporting gather operations can be used to further reduce the copying done by the kernel. This approach not only reduces the number of context switches but also eliminates the data copies that involve the CPU. Usage at the user level does not change, but the inner workings do:

The transferTo() method causes the DMA engine to copy the file contents into a kernel buffer. No data is copied into the socket buffer; instead, only descriptors containing the location and length of the data are appended to the socket buffer. The DMA engine then transfers the data directly from the kernel buffer to the protocol engine, eliminating the last remaining CPU copy.

Java NIO zero copy example

A FileChannel in NIO has transferTo and transferFrom methods that can copy data from a FileChannel directly to another Channel, or copy data from another Channel directly to a FileChannel. This interface is often used for efficient network/file data transfer and large file copy.

When the operating system supports it, data transferred this way does not have to be copied from kernel space to user space and then from user space back into the kernel for the target channel, and the two context switches between user mode and kernel mode are also avoided. In other words, “zero copy” is used, so performance is generally higher than with the methods provided by traditional Java IO.

Transfer a file from client to server over the network:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

/** disk-nic zero copy */
class ZerocopyServer {
    ServerSocketChannel listener = null;

    protected void mySetup() {
        InetSocketAddress listenAddr = new InetSocketAddress(9026);
        try {
            listener = ServerSocketChannel.open();
            ServerSocket ss = listener.socket();
            ss.setReuseAddress(true);
            ss.bind(listenAddr);
            System.out.println("Listening port: " + listenAddr.toString());
        } catch (IOException e) {
            System.out.println("Port binding failed: " + listenAddr.toString()
                    + ". Port may already be in use, cause of error: " + e.getMessage());
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        ZerocopyServer dns = new ZerocopyServer();
        dns.mySetup();
        dns.readData();
    }

    private void readData() {
        ByteBuffer dst = ByteBuffer.allocate(4096);
        try {
            while (true) {
                SocketChannel conn = listener.accept();
                System.out.println("Connection created: " + conn);
                conn.configureBlocking(true);
                int nread = 0;
                while (nread != -1) {
                    try {
                        nread = conn.read(dst);
                    } catch (IOException e) {
                        e.printStackTrace();
                        nread = -1;
                    }
                    // Reset the position so the buffer can be reused; received data is discarded
                    dst.rewind();
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

import java.io.FileInputStream;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;

class ZerocopyClient {
    public static void main(String[] args) throws IOException {
        ZerocopyClient sfc = new ZerocopyClient();
        sfc.testSendfile();
    }

    public void testSendfile() throws IOException {
        String host = "localhost";
        int port = 9026;
        SocketAddress sad = new InetSocketAddress(host, port);
        SocketChannel sc = SocketChannel.open();
        sc.connect(sad);
        sc.configureBlocking(true);
        String fname = "src/main/java/zerocopy/test.data";
        FileChannel fc = new FileInputStream(fname).getChannel();
        long start = System.nanoTime();
        // transferTo returns the number of bytes actually sent by this call
        long nsent = fc.transferTo(0, fc.size(), sc);
        System.out.println("Total bytes sent: " + nsent
                + ", time consumed (ns): " + (System.nanoTime() - start));
        try {
            sc.close();
            fc.close();
        } catch (IOException e) {
            System.out.println(e);
        }
    }
}

Zero copy of file to file

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;

/** disk-disk zero copy */
class ZerocopyFile {
    @SuppressWarnings("resource")
    public static void transferToDemo(String from, String to) throws IOException {
        // The source is only read, so open it read-only
        FileChannel fromChannel = new RandomAccessFile(from, "r").getChannel();
        FileChannel toChannel = new RandomAccessFile(to, "rw").getChannel();
        long position = 0;
        long count = fromChannel.size();
        fromChannel.transferTo(position, count, toChannel);
        fromChannel.close();
        toChannel.close();
    }

    @SuppressWarnings("resource")
    public static void transferFromDemo(String from, String to) throws IOException {
        FileChannel fromChannel = new FileInputStream(from).getChannel();
        FileChannel toChannel = new FileOutputStream(to).getChannel();
        // For transferFrom, position is the offset in the target file
        long position = 0;
        long count = fromChannel.size();
        toChannel.transferFrom(fromChannel, position, count);
        fromChannel.close();
        toChannel.close();
    }

    public static void main(String[] args) throws IOException {
        String from = "src/main/java/zerocopy/1.data";
        String to = "src/main/java/zerocopy/2.data";
        // transferToDemo(from, to);
        transferFromDemo(from, to);
    }
}