Other related topics

  • Java InputStream and OutputStream: reading a file and sending it over a socket
  • How many times does Java NIO copy data when reading a file and sending it over a socket?
  • JAVA IO topic 3: Java memory mapping and application scenarios
  • JAVA IO topic 4: Java sequential IO principles and corresponding application scenarios

Basic concepts of sequential and random I/O

If the starting sector address of an I/O request is contiguous with, or only a short distance from, the ending sector address of the previous I/O request, the request is regarded as sequential I/O. If the distance is large, it counts as random I/O.
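The rule above can be sketched as a small classifier. This is only an illustration: the threshold `MAX_GAP_SECTORS` is a hypothetical value, since the text only says "not very far apart", and real block layers use their own heuristics.

```java
public class IoPatternClassifier {
    // Hypothetical threshold: how many sectors may separate two requests
    // while the second one still counts as sequential.
    public static final long MAX_GAP_SECTORS = 8;

    /**
     * Returns true if a request starting at startSector begins right after,
     * or close to, the sector where the previous request ended.
     */
    public static boolean isSequential(long previousEndSector, long startSector) {
        long gap = Math.abs(startSector - (previousEndSector + 1));
        return gap <= MAX_GAP_SECTORS;
    }

    public static void main(String[] args) {
        System.out.println(isSequential(99, 100));    // directly consecutive
        System.out.println(isSequential(99, 104));    // small gap
        System.out.println(isSequential(99, 50_000)); // far away
    }
}
```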

Hardware understanding 1. Mechanical hard disks

The performance of a mechanical hard disk is affected by three factors: seek time, rotation delay and data transfer time.

  1. Seek time

Refers to the time required to move the read/write head to the correct track. The shorter the seek time, the faster the I/O operation. Currently, the average seek time of a disk ranges from 3 to 15ms.

  2. Rotational delay

Refers to the time required for the platter to rotate until the sector holding the requested data is under the read/write head. The rotational delay depends on the disk's rotation speed and is usually expressed as half of the time it takes the disk to rotate once. For example, the average rotational delay of a 7200rpm disk is about 4.17ms (60 x 1000 / 7200 / 2), and that of a 15000rpm disk is 2ms.

  3. Data transfer time

Refers to the time required to transfer the requested data. It depends on the data transfer rate and equals the data size divided by the data transfer rate. IDE/ATA interfaces can reach 133MB/s and SATA II interfaces can reach 300MB/s, so the data transfer time is usually much smaller than the first two components and can be ignored in simple calculations.

Sequential writes perform well on a mechanical hard disk mainly because they avoid most seeks: moving the head to the correct track takes time, and consecutive sectors need little or no head movement.
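The three factors can be combined into a rough estimate of the cost of one random I/O. The sketch below is only back-of-the-envelope arithmetic: the 9ms seek time is an assumed midpoint of the 3 to 15ms range quoted above, the 4KB request size is an assumption, and 1MB is taken as 10^6 bytes.

```java
public class DiskAccessTime {
    /** Average rotational delay in ms: half of one full rotation. */
    public static double avgRotationalDelayMs(int rpm) {
        return 60_000.0 / rpm / 2;
    }

    /** Transfer time in ms for the given size at the given rate (1 MB = 10^6 bytes). */
    public static double transferTimeMs(long bytes, double mbPerSecond) {
        return bytes / (mbPerSecond * 1_000_000) * 1000;
    }

    /** Approximate time of one random I/O: seek + rotational delay + transfer. */
    public static double randomIoTimeMs(double seekMs, int rpm, long bytes, double mbPerSecond) {
        return seekMs + avgRotationalDelayMs(rpm) + transferTimeMs(bytes, mbPerSecond);
    }

    public static void main(String[] args) {
        System.out.printf("7200rpm rotational delay: %.2f ms%n", avgRotationalDelayMs(7200));
        System.out.printf("15000rpm rotational delay: %.2f ms%n", avgRotationalDelayMs(15000));
        // Assumed values: 9 ms seek, 4 KB request, SATA II at 300 MB/s.
        System.out.printf("4KB transfer at 300MB/s: %.4f ms%n", transferTimeMs(4096, 300));
        System.out.printf("One random 4KB I/O: %.2f ms%n", randomIoTimeMs(9, 7200, 4096, 300));
    }
}
```

With these assumed numbers the transfer time is a small fraction of a millisecond, which is why seek time and rotational delay dominate and why avoiding them matters so much.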

Hardware Understanding 2. Solid-state drives

A solid-state drive is a randomly addressable chip with no seek time or rotational delay. However, an SSD does not update data in place: each update writes the data to a new location and then marks the old location as invalid. Over time this produces a lot of fragmentation, so SSDs implement a garbage collector similar to GC in the JVM. The more random writes there are, the more garbage is produced and the more expensive reclamation becomes, so sequential writing is still relatively efficient.
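The out-of-place update described above can be illustrated with a toy model. This is a deliberately simplified sketch, not how any real flash translation layer works: it ignores erase blocks and the garbage collector itself, and only counts how many stale copies are left behind. The class and method names are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class FlashWriteSketch {
    // Maps each logical page to the physical page holding its latest copy.
    private final Map<Integer, Integer> logicalToPhysical = new HashMap<>();
    private int nextFreePhysicalPage = 0;
    private int stalePages = 0;

    /** Writes a logical page out of place: always to a fresh physical page. */
    public void write(int logicalPage) {
        Integer oldPhysical = logicalToPhysical.put(logicalPage, nextFreePhysicalPage++);
        if (oldPhysical != null) {
            // The previous copy is now garbage that must be reclaimed later.
            stalePages++;
        }
    }

    public int stalePages() {
        return stalePages;
    }

    public static void main(String[] args) {
        FlashWriteSketch flash = new FlashWriteSketch();
        for (int page = 0; page < 4; page++) {
            flash.write(page); // appending new pages leaves no stale copies
        }
        System.out.println("Stale pages after appends: " + flash.stalePages());
        flash.write(2);
        flash.write(0);
        flash.write(2); // scattered overwrites each leave a stale copy behind
        System.out.println("Stale pages after overwrites: " + flash.stalePages());
    }
}
```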

Sequential IO at the operating system level

The operating system optimizes disk reads by caching disk data (Page Cache and Buffer Cache in Linux). The disk cache also prefetches data. The prefetch process works as follows:

For the first read request on a file, the system reads the requested page plus the few pages that follow it. This is called synchronous prefetch. For the second read request, if the requested page is not in the cache, the file is not being accessed sequentially, and the system continues to use synchronous prefetch. If the page is in the cache, the prefetch has hit, and the OS doubles the size of the prefetch window. From this point on the prefetch is asynchronous.

Therefore, asynchronous prefetch can be used to improve the I/O response speed in sequential read mode.
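The prefetch behaviour described above can be simulated with a small model. This is a sketch under stated assumptions, not the Linux readahead implementation: the initial window of 4 pages, the cap of 32 pages, and the reset of the window on a miss are all hypothetical choices made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class ReadaheadSketch {
    public static final int INITIAL_WINDOW = 4; // hypothetical initial prefetch size, in pages
    public static final int MAX_WINDOW = 32;    // hypothetical upper bound

    private final Set<Long> cachedPages = new HashSet<>();
    private int window = INITIAL_WINDOW;

    /** Simulates reading one page and returns true if it was already cached. */
    public boolean read(long page) {
        boolean hit = cachedPages.contains(page);
        if (hit) {
            // Prefetch hit: access looks sequential, so double the window.
            window = Math.min(window * 2, MAX_WINDOW);
        } else {
            // Miss: load the requested page and fall back to the initial window.
            window = INITIAL_WINDOW;
            cachedPages.add(page);
        }
        // Prefetch the pages that follow the requested one.
        for (long p = page + 1; p <= page + window; p++) {
            cachedPages.add(p);
        }
        return hit;
    }

    public int window() {
        return window;
    }

    public static void main(String[] args) {
        ReadaheadSketch cache = new ReadaheadSketch();
        long[] pages = {0, 1, 2, 1000};
        for (long page : pages) {
            boolean hit = cache.read(page);
            System.out.println("page " + page + " hit=" + hit + " window=" + cache.window());
        }
    }
}
```

In this model a run of sequential reads keeps hitting the cache and the window keeps growing, while a jump to a distant page misses and starts over.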

Random and sequential IO in Java

  • Random IO

Common Java APIs such as channel.read/write and OutputStream/InputStream perform random IO. They work by first allocating memory, writing the data into that memory, and then handing the memory to the operating system to read or write. The caller does not explicitly specify the position to read or write, so the write location is left to the operating system.

  • Sequential IO

Sequential I/O can be implemented in Java using MappedByteBuffer. The difference from random I/O is that a contiguous region of file space is allocated in advance. After each write into this region, the last offset is recorded, and the next write continues from that position. MappedByteBuffer lets the caller control the write position, which is what makes sequential IO possible.

Sample code for sequential writing:

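import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

public static int write(String path) throws IOException {
    // Data to write
    List<String> lines = Arrays.asList("First line \n", "Second line \n", "Third line \n");
    // Open the file channel with read and write access (the file must already exist)
    try (FileChannel targetFileChannel = FileChannel.open(Paths.get(path),
            StandardOpenOption.WRITE, StandardOpenOption.READ)) {
        // Map the first 64 bytes of the file into memory
        MappedByteBuffer map = targetFileChannel.map(FileChannel.MapMode.READ_WRITE, 0, 64);
        for (String line : lines) {
            byte[] bytes = line.getBytes(StandardCharsets.UTF_8);
            // put(bytes) writes at the buffer's current position and advances it;
            // call map.position(offset) first to write at a specific offset
            map.put(bytes);
        }
        // Return the offset where the next write would start
        return map.position();
    }
}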

Therefore, at the code level, calling methods such as outputStream.write without specifying a position leaves the write location to the operating system, which can be understood as random writing. Opening a memory mapping with MappedByteBuffer and appending to it in order is what is called sequential writing. Of course, if you allocate multiple MappedByteBuffers and write to them at random positions, the writes are random again. My understanding, therefore, is that sequential writing is not something the operating system or an existing function does for us directly. The key is how we implement it.
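The idea of recording the last offset and resuming from it can be sketched as a small append-only log. This is a minimal illustration, not how Kafka, RocketMQ, or QMQ implement their storage: the class name `MappedAppendLog`, the fixed 64-byte capacity, and the way the offset is passed back in are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedAppendLog {
    private final MappedByteBuffer buffer;

    /** Maps a fixed-size region of the file and resumes writing at startOffset. */
    public MappedAppendLog(Path file, int capacity, int startOffset) throws IOException {
        this.buffer = map(file, capacity);
        this.buffer.position(startOffset);
    }

    private static MappedByteBuffer map(Path file, int capacity) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping remains valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_WRITE, 0, capacity);
        }
    }

    /** Appends at the current offset and returns the offset for the next write. */
    public int append(String text) {
        buffer.put(text.getBytes(StandardCharsets.UTF_8));
        return buffer.position();
    }

    /** Forces the mapped changes out to the storage device. */
    public void flush() {
        buffer.force();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("append-log", ".dat");
        MappedAppendLog log = new MappedAppendLog(file, 64, 0);
        int offset = log.append("First line\n");
        log.flush();
        // Later (for example after a restart), resume from the recorded offset.
        MappedAppendLog resumed = new MappedAppendLog(file, 64, offset);
        int next = resumed.append("Second line\n");
        resumed.flush();
        System.out.println("offset after first write: " + offset);
        System.out.println("offset after second write: " + next);
    }
}
```

In a real system the recorded offset would have to be persisted somewhere (a checkpoint file, for instance) so that it survives a restart. Here it is simply held in a local variable.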

Application scenarios of sequential I/O

Message queues such as Kafka, RocketMQ, and QMQ use sequential I/O to improve throughput. For a brief look at how QMQ applies sequential I/O (I have not looked at RocketMQ), see: Java memory mapping and application scenarios

Reference article

Disk I/O stuff. – Meituan tech team