Preface
Our web application serves mostly static content: it reads data from disk and writes it to a socket unmodified, with pseudo-code like this:
read(file, tmp_buf, len);
write(socket, tmp_buf, len);
Although this looks simple, it is not very efficient: across these two calls the data is copied at least four times, and about the same number of user/kernel context switches take place. So what are user mode and kernel mode? A program running at privilege level 3 is said to be in user mode; this is the lowest privilege level, and it is where ordinary user processes run. Conversely, a program running at privilege level 0 is said to be in kernel mode.
For applications to access the resources managed by the kernel, the kernel must provide a common access interface, known as system calls. I/O operations such as open, read, and write all have to go through system calls to interact with the kernel. System calls are expensive, however, because each one switches from user mode to kernel mode and back. Frequent switching consumes a lot of CPU time and hurts data-transfer performance, so the number of system calls should be kept to a minimum. Besides system calls, there are two other ways to switch from user mode to kernel mode: exceptions and peripheral interrupts.
And what is privilege here?
Starting with the 80286 processor, Intel introduced protected mode, in which privilege levels are a central concept. The core code of the operating system runs at the highest privilege level (level 0), while user programs run at the lowest privilege level (level 3).
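As a rough illustration of why the number of system calls matters, the following sketch counts how many read calls it takes to consume a 1 MiB temporary file with two different buffer sizes. The class name ReadCallCounter, the method countReads, and the file size are made up for this illustration, and the sketch assumes that each FileInputStream.read call corresponds to roughly one read system call.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadCallCounter {

    // Reads the whole file with a buffer of the given size and returns
    // how many read() calls returned data (hypothetical helper).
    static int countReads(Path file, int bufferSize) throws IOException {
        byte[] buf = new byte[bufferSize];
        int calls = 0;
        try (InputStream in = new FileInputStream(file.toFile())) {
            while (in.read(buf) > 0) {
                calls++;
            }
        }
        return calls;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("syscall-demo", ".bin");
        try {
            Files.write(file, new byte[1 << 20]); // 1 MiB of zeros
            System.out.println("1 KiB buffer:  " + countReads(file, 1024) + " reads");
            System.out.println("64 KiB buffer: " + countReads(file, 64 * 1024) + " reads");
        } finally {
            Files.delete(file);
        }
    }
}
```

A larger buffer means fewer trips into the kernel for the same amount of data, which is the effect described above.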
The read/write pseudo-code above can be divided into the following steps:
- The read call makes a system call that switches the context from user mode to kernel mode. DMA then reads the file contents from disk and stores them in a kernel address space buffer (first copy).
- The data is copied from the kernel buffer to the user buffer (second copy), and the read system call returns, switching the context from kernel mode back to user mode.
- The write system call switches the context from user mode to kernel mode again and performs a third copy, putting the data back into a kernel address space buffer, this time a different buffer that is associated with the socket.
- The write system call returns, switching back to user mode. The fourth copy happens independently and asynchronously, when DMA passes the data from the socket buffer to the network card.
As you can see, the kernel first reads the data from disk and pushes it across the kernel boundary to the application, which then pushes it back across the boundary to be written to the socket. The application is only acting as an intermediary here, moving data from a disk file to a socket, so copying the data between kernel context and application context is redundant. Is there a way to move the data entirely within the kernel, without passing through the application?
The answer is zero copy, a technique that lets the kernel copy data directly from disk files to sockets without going through the application. This not only greatly improves application performance but also reduces context switching between kernel and user mode. Put simply, it avoids having the CPU copy data from one block of memory to another, eliminating unnecessary copies. This is what “zero copy” means.
Zero copy technology in Java
transferTo
In Java, zero copy is available on Linux through the transferTo() method of java.nio.channels.FileChannel. transferTo() transfers bytes directly from the channel it is invoked on to another writable byte channel, so the data does not have to flow through the application. Internally, it relies on the underlying operating system’s zero-copy support; on Linux the call is passed to the sendfile() system call. This reduces the number of context switches from four to two and the number of data copies from four to three.
The steps are as follows:
- The sendfile system call causes the DMA engine to copy the file contents into a kernel buffer, and the kernel then copies the data into the kernel buffer associated with the socket.
- DMA copies the data from the socket buffer to the network card buffer.
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class Test {
    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        transferTo("/home/HouXinLin/apps/gradle/gradle-6.1.1-all.zip", "/home/HouXinLin/temp.zip");
        System.out.println(System.currentTimeMillis() - start);
    }

    // Copy through a 2 KB user-space buffer using buffered streams.
    private static void stream(String src, String dest) {
        try (BufferedInputStream in = new BufferedInputStream(new FileInputStream(src));
             BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(dest))) {
            byte[] temp = new byte[2048];
            int size;
            while ((size = in.read(temp)) > 0) {
                out.write(temp, 0, size);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Copy with Files.copy; try-with-resources closes the output stream.
    private static void copy(String src, String dest) {
        try (OutputStream out = new FileOutputStream(dest)) {
            Files.copy(Paths.get(src), out);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Copy with FileChannel.transferTo.
    private static void transferTo(String src, String dest) {
        try (FileChannel readChannel = FileChannel.open(Paths.get(src), StandardOpenOption.READ);
             FileChannel writeChannel = FileChannel.open(Paths.get(dest),
                     StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            long size = readChannel.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            while (position < size) {
                position += readChannel.transferTo(position, size - position, writeChannel);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
In my tests with a 138.5 MB file, the other two methods took hundreds of milliseconds on average, while transferTo took a little over 50 milliseconds.
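The test above copies from one file to another, whereas the scenario in the preface sends a file to a socket. The following is a minimal sketch of that case, assuming a blocking SocketChannel and a file that is not modified while it is being sent. The class name SendFileDemo and the helpers sendFile and roundTrip are hypothetical; roundTrip only exists to make the sketch self-contained by reading the bytes back on a loopback connection. Whether the JVM actually uses sendfile underneath depends on the platform and JDK version.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.Random;

public class SendFileDemo {

    // Sends the whole file over the socket. transferTo may move fewer
    // bytes than requested, so loop until everything has been sent.
    static long sendFile(Path file, SocketChannel socket) throws IOException {
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = in.size();
            long sent = 0;
            while (sent < size) {
                sent += in.transferTo(sent, size - sent, socket);
            }
            return sent;
        }
    }

    // Writes the payload to a temp file, sends it over a loopback
    // connection with sendFile, and returns what the peer received.
    static byte[] roundTrip(byte[] payload) throws Exception {
        Path file = Files.createTempFile("sendfile-demo", ".bin");
        Files.write(file, payload);
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // any free port
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            Thread reader = new Thread(() -> {
                try (SocketChannel peer = server.accept()) {
                    ByteBuffer buf = ByteBuffer.allocate(8192);
                    while (peer.read(buf) != -1) {
                        buf.flip();
                        received.write(buf.array(), 0, buf.limit());
                        buf.clear();
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            reader.start();
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                sendFile(file, client);
            } // closing the client signals end-of-stream to the reader
            reader.join();
            return received.toByteArray();
        } finally {
            Files.delete(file);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = new byte[1 << 20];
        new Random(42).nextBytes(payload);
        byte[] echoed = roundTrip(payload);
        System.out.println("sent " + payload.length + " bytes, received intact: "
                + Arrays.equals(payload, echoed));
    }
}
```

The application code never touches the file contents on the sending side; it only passes the two channels to transferTo.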
map
mmap is a memory-mapped file mechanism provided by Linux. The mmap system call has the DMA engine copy the file contents into a kernel buffer and then shares that buffer with the user process, so no copying between kernel and user memory is needed. Here mmap replaces the read operation:
tmp_buf = mmap(file, len);
write(socket, tmp_buf, len);
This halves the amount of data the kernel has to copy, which pays off when large amounts of data are transferred. However, the approach has a hidden pitfall: if another process truncates the file while it is memory-mapped, the write system call will access an invalid address and be interrupted by the bus error signal SIGBUS. By default SIGBUS kills the process, so the server may be terminated.
The FileChannel.map method maps a region of a file, size bytes long and starting at position, into memory and returns a MappedByteBuffer, which extends ByteBuffer. On Linux, map is implemented with mmap, so once the file contents have been read from disk into the kernel buffer, user space and kernel space share that buffer.
The map method takes three parameters: mode, position, and size.
- mode: the mapping mode. The options are READ_ONLY, READ_WRITE, and PRIVATE.
- position: the byte offset in the file at which the mapping starts.
- size: the number of bytes to map, counting from position.
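// Add this method to the Test class; it additionally needs: import java.nio.MappedByteBuffer;
private static void map(String src, String dest) {
    try (FileChannel readChannel = FileChannel.open(Paths.get(src), StandardOpenOption.READ);
         FileChannel writeChannel = FileChannel.open(Paths.get(dest),
                 StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
        // Map the whole source file read-only (one mapping holds at most Integer.MAX_VALUE bytes).
        MappedByteBuffer map = readChannel.map(FileChannel.MapMode.READ_ONLY, 0, readChannel.size());
        // write may consume only part of the buffer, so loop until it is drained.
        while (map.hasRemaining()) {
            writeChannel.write(map);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}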
This approach is also very fast; in my tests it took about the same time as the transferTo method.
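To illustrate the position and size parameters of map described above, here is a small sketch that maps only part of a file and reads it back. The class name MapRegionDemo, the helper readRegion, and the sample file contents are invented for this illustration.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MapRegionDemo {

    // Maps `size` bytes of the file starting at byte offset `position`
    // and returns them as ASCII text (hypothetical helper).
    static String readRegion(Path file, long position, long size) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer region = channel.map(FileChannel.MapMode.READ_ONLY, position, size);
            byte[] bytes = new byte[region.remaining()];
            region.get(bytes);
            return new String(bytes, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("map-demo", ".txt");
        file.toFile().deleteOnExit();
        Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII));
        // Map 5 bytes starting at offset 2.
        System.out.println(readRegion(file, 2, 5)); // prints 23456
    }
}
```

The returned buffer covers only the requested region, so its index 0 corresponds to byte offset position in the file.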
The benefits of zero copy
- Reduce or eliminate unnecessary CPU copies, freeing up the CPU to perform other tasks
- Reduce memory bandwidth usage
- Zero-copy techniques in general also reduce context switching between user space and operating system kernel space
Viewing system calls
When a program accesses a hardware device, for example to read a file from disk or receive network data, it must switch from user mode to kernel mode and access the device through a system call. strace can trace the system calls made by a process, including their arguments, return values, and execution time.
We package the above program as a JAR and execute the following command:
strace java -jar Demo.jar
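As you can see from the output, the mmap function is called internally. If the calls made by the copy methods themselves do not show up, note that strace only follows threads and child processes created by the traced program when it is run with its -f option.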