User space and kernel space

Virtual memory is divided by the operating system into two parts: kernel space, where kernel code runs, and user space, where user program code runs.

A process is in kernel mode when it is running in kernel space and in user mode when it is running in user space.

Kernel space can execute any instruction and access all system resources. User space can only perform restricted operations and cannot access system resources directly; it must go through system interfaces (also known as system calls) to send requests to the kernel. A process switches from user space to kernel space through these system calls.

In short, the memory space used by the operating system itself is kernel space, while the memory space used by the applications we install or develop ourselves is user space.

STR = "my string" // userspace x = x + 2 file.write(STR) // Switch to kernel space y = x + 4 // Switch back to userspaceCopy the code

In the code above, the first and second lines are simple assignments, performed in user space. The third line writes to a file, which requires switching to kernel space, because a user program cannot write a file directly and has to go through the kernel. The fourth line is an assignment again, which switches back to user space.

Run the top command to see how CPU time is allocated between user space and kernel space.

In the third line of top's output, 1.4 us is the percentage of CPU time spent in user space, 2.1 sy is the percentage spent in kernel (system) space, and 96.5 id is the percentage spent idle; the higher the idle value, the less busy the CPU is.
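For reference, the CPU summary line printed by top looks roughly like the following; the three figures above correspond to the us, sy and id fields, and the remaining fields are shown as zero here purely for illustration (exact field names vary between versions):

%Cpu(s):  1.4 us,  2.1 sy,  0.0 ni, 96.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st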

PIO and DMA

As mentioned above, a user application has to switch to kernel space before it can operate on the disk. So how does kernel space interact with the disk?

Long ago, data transfer between disk and memory was controlled by the CPU: when a disk file was read into memory, the data was staged and forwarded by the CPU itself. This is called PIO (Programmed I/O). Obviously this approach is very inefficient, because reading a file consumes a large amount of CPU time and the system can become almost unresponsive while files are being accessed.

Later, DMA (Direct Memory Access) replaced PIO. With DMA, disk access and data exchange with memory (kernel space) no longer have to go through the CPU: the CPU only needs to issue an instruction to the DMA controller and let the controller handle the data transfer. The DMA controller moves the data over the system bus and notifies the CPU when the transfer is complete. This greatly reduces CPU usage and saves system resources, while DMA's transfer speed is not significantly different from PIO's, because it mainly depends on the speed of the slower device.

Needless to say, computers that still use PIO mode are rare nowadays.

Cached IO vs. direct IO

Cached IO: data is copied from disk to kernel space by DMA, and from kernel space to user space by the CPU.

Cached IO is also known as standard IO, and it is the default IO mode of most file systems. In the Linux cached IO mechanism, data is first copied from disk into a kernel-space buffer, and then from the kernel-space buffer into the application's address space.

Read operation: the operating system checks whether the kernel cache already holds the data the user requested; if so, it returns the data straight from the cache, otherwise it reads the data from disk and caches it in kernel space.

Write operation: data is copied from user space into the kernel-space buffer; at that point, as far as the user program is concerned, the write is complete. The operating system decides when the data actually reaches the disk, unless sync is explicitly invoked.
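A minimal Java sketch of this behaviour (the file name demo.txt is just a placeholder): write() returns once the data is in the kernel buffer, and force() plays the role of the explicit sync mentioned above.

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class CachedWrite {
    public static void main(String[] args) throws IOException {
        try (FileChannel channel = FileChannel.open(Paths.get("demo.txt"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            // write() returns once the data has been copied into the kernel buffer;
            // the operating system decides later when to flush it to disk
            channel.write(ByteBuffer.wrap("hello".getBytes()));
            // force(true) is the explicit sync: it blocks until the data
            // (and metadata) have actually been written to the device
            channel.force(true);
        }
    }
}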

Advantages of cached IO: to some extent it separates kernel space from user space and protects the operating system, and it can reduce the number of actual disk reads and thereby improve performance.

Disadvantages of cached IO: in the cached IO mechanism, DMA always places the data in the kernel-space buffer first rather than transferring it directly between user space and the disk, so the data has to be copied several times on its way (kernel space to user space), and these extra copies cost a good deal of CPU time and memory.

Direct IO: the kernel cache was introduced to improve disk access performance. When a process reads a disk file whose contents are already in the kernel buffer, no actual disk access is needed; when a process writes, the data is in fact written to the kernel buffer first, and the real write to disk is deferred according to certain policies.

However, some complex applications, such as database servers (e.g. MySQL), want to bypass the kernel buffer in order to squeeze out as much performance as possible. They implement their own IO management in user space, including caching and delayed write-back, so that they can support application-specific optimizations; for example, a database can raise the hit ratio of its query cache (a cache it implements itself) in a better-informed way. On the other hand, bypassing the kernel buffer also reduces kernel-space overhead, since the kernel buffer itself consumes kernel-space memory.

Advantages of direct IO: the application accesses disk data directly without going through the kernel buffer, saving one copy from the kernel buffer to the user program; the application also decides for itself when data is written to disk, which helps minimize data loss.

Disadvantages of direct IO: if the data being accessed is not in the application's own cache, it has to be loaded from disk every time, so direct IO generally requires the application to implement its own cache.
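As a rough sketch only: on JDK 10 and newer, direct IO can be requested through the com.sun.nio.file.ExtendedOpenOption.DIRECT open option. The file name data.bin is a placeholder, and the exact alignment rules depend on the platform and file system.

import com.sun.nio.file.ExtendedOpenOption; // JDK 10+
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class DirectRead {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get("data.bin");
        // Direct IO requires the buffer address, file position and transfer size
        // to be aligned to the file system block size
        int blockSize = (int) Files.getFileStore(path).getBlockSize();
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.READ, ExtendedOpenOption.DIRECT)) {
            ByteBuffer buffer = ByteBuffer.allocateDirect(blockSize * 2)
                                          .alignedSlice(blockSize);
            int n = channel.read(buffer, 0);
            System.out.println("read " + n + " bytes without going through the page cache");
        }
    }
}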

Network IO

Take reading a file from disk and sending it out over the network as an example. The data travels as follows:

1 The operating system copies the data from disk to the kernel buffer via DMA.

2 The CPU copies the data from the kernel buffer to the application buffer (user space).

3 The CPU writes the data from the application buffer to the kernel's socket buffer.

4 The operating system copies the socket buffer to the NIC buffer via DMA.

As this process shows, the data makes a pointless round trip from kernel space to user space and back, wasting two copies, both of which are CPU copies and consume CPU resources.
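For this particular file-to-socket case, Java's FileChannel.transferTo() lets the kernel move the data from the file to the socket without the round trip through user space (on Linux it typically maps to sendfile). A minimal sketch, with big.dat and the target address as placeholders:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ZeroCopySend {
    public static void main(String[] args) throws IOException {
        try (FileChannel file = FileChannel.open(Paths.get("big.dat"), StandardOpenOption.READ);
             SocketChannel socket = SocketChannel.open(new InetSocketAddress("127.0.0.1", 8080))) {
            long position = 0;
            long remaining = file.size();
            while (remaining > 0) {
                // transferTo() hands the copy to the kernel, skipping the detour
                // through the user-space application buffer
                long sent = file.transferTo(position, remaining, socket);
                position += sent;
                remaining -= sent;
            }
        }
    }
}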

The seven-layer network model


Here we focus on layers 3, 4 and 7. Layer 3, the network layer, determines the IP addresses of the sender and receiver. Layer 4, the transport layer, determines the port numbers that a connection uses to dispatch data; Java sockets are one way of establishing such a connection. Finally, layer 7, the application layer, receives the packets, reads them according to agreed rules, that is, a protocol, responds with data and decides whether to close the connection.

Here is a code example to verify the above statement.

package com.datang.pet.control.test;

import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class Main {
    public static void main(String[] args) throws Exception {
        ServerSocket serverSocket = new ServerSocket(8080);
        System.out.println("started successfully");
        while (true) {
            Socket socket = serverSocket.accept();
            InputStream inputStream = socket.getInputStream();
            String httpRequest = "";
            byte[] httpRequestBytes = new byte[1024];
            int length = 0;
            if ((length = inputStream.read(httpRequestBytes)) > 0) {
                httpRequest = new String(httpRequestBytes, 0, length);
            }
            System.out.println("message body begin (" + httpRequest + ") message body end");
            OutputStream outputStream = socket.getOutputStream();
            StringBuffer httpResponse = new StringBuffer();
            httpResponse.append("HTTP/1.1 200 OK\r\n")
                    .append("Content-Type: text/html\r\n")
                    .append("\r\n")
                    .append("<html><body>")
                    .append("ddddddddddddddddd")
                    .append("</body></html>");
            outputStream.write(httpResponse.toString().getBytes());
            socket.close();
        }
    }
}


The Java code above creates a ServerSocket listening on port 8080, prints the data it receives, returns an HTTP response, and closes the client socket connection. Testing by sending a request from a browser, you can see that the browser actually issues two requests: the second is the browser asking for the icon (favicon), and the first is the sss=111 request we sent ourselves; the server receives the request body in HTTP format. Finally, socket.close() closes the client connection; if it is not closed, the browser keeps spinning, indicating that the session has not finished.


Non-blocking IO


public static void telnet() {
    ServerSocket serverSocket = null;
    try {
        serverSocket = new ServerSocket(8080);
        System.out.println("started successfully");
    } catch (Exception e) {
        System.out.println("failed to start");
        return;
    }
    while (true) {
        try {
            Socket socket = serverSocket.accept();
            System.out.println("before getting the input stream");
            InputStream inputStream = socket.getInputStream();
            System.out.println("after getting the input stream");
            String httpRequest = "";
            byte[] httpRequestBytes = new byte[1024];
            int length = 0;
            if ((length = inputStream.read(httpRequestBytes)) > 0) {
                httpRequest = new String(httpRequestBytes, 0, length);
            }
            System.out.println("message body begin (" + httpRequest + ") message body end");
            OutputStream outputStream = socket.getOutputStream();
            outputStream.write("ok".getBytes());
            socket.close();
        } catch (Exception e) {
            System.out.println("client exception");
        }
    }
}


Synchronous and asynchronous IO

Both terms describe how data is exchanged between user space and kernel space.

Synchronous: user space needs data and must wait for kernel space to produce the result before it can do anything else.

Asynchronous: user space needs data but can carry on with other work without waiting for kernel space to produce the result; when the data is ready, kernel space asynchronously notifies user space and delivers the data to it.
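A minimal Java sketch of the asynchronous style using AsynchronousFileChannel (the file name demo.txt is a placeholder; under the hood the JDK may use worker threads rather than true kernel asynchronous IO, but the programming model is the one described above):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;

public class AsyncRead {
    public static void main(String[] args) throws IOException, InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        AsynchronousFileChannel channel =
                AsynchronousFileChannel.open(Paths.get("demo.txt"), StandardOpenOption.READ);
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        // The read call returns immediately; the handler is invoked when the data is ready
        channel.read(buffer, 0, buffer, new CompletionHandler<Integer, ByteBuffer>() {
            @Override
            public void completed(Integer bytesRead, ByteBuffer buf) {
                System.out.println("read " + bytesRead + " bytes asynchronously");
                done.countDown();
            }
            @Override
            public void failed(Throwable exc, ByteBuffer buf) {
                exc.printStackTrace();
                done.countDown();
            }
        });
        System.out.println("read submitted, doing other work..."); // not blocked waiting for the data
        done.await(); // only so the demo does not exit before the callback fires
        channel.close();
    }
}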

Blocking and non-blocking IO

Both terms describe the way user space issues IO operations to kernel space.

Blocking: when user space sends an IO operation to kernel space via a system call, the call blocks until the operation completes.

Non-blocking: when user space sends an IO operation to kernel space via a system call, the call does not block and returns immediately, though it may return with no data.

Synchronous is similar to blocking and asynchronous is similar to non-blocking, but there is also synchronous non-blocking IO: user space needs data, issues the system call in non-blocking mode so that it returns immediately, and then keeps polling kernel space to check whether the data is ready.
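A minimal sketch of that synchronous non-blocking pattern with Java NIO channels (port 8080 and the 100 ms polling interval are arbitrary choices):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class NonBlockingPoll {
    public static void main(String[] args) throws IOException, InterruptedException {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);            // accept() no longer blocks
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        while (true) {
            SocketChannel client = server.accept(); // returns null if no connection is pending
            if (client == null) {
                Thread.sleep(100);                  // nothing ready yet, poll again later
                continue;
            }
            client.configureBlocking(false);        // read() no longer blocks
            buffer.clear();
            int n = client.read(buffer);            // may return 0: "no data yet"
            System.out.println("read " + n + " bytes on first poll");
            client.close();
        }
    }
}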