In Java, copying a file usually brings the IO classes InputStream and OutputStream to mind. However, there are several other ways to copy files besides IO: NIO, Apache's utility classes, and the JDK's own file copy methods.

IO copy

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class IOFileCopy {

    private static final int BUFFER_SIZE = 1024;

    public static void copyFile(String source, String target) {
        long start = System.currentTimeMillis();
        try (InputStream in = new FileInputStream(new File(source));
             OutputStream out = new FileOutputStream(new File(target))) {
            byte[] buffer = new byte[BUFFER_SIZE];
            int len;
            // read() returns -1 once the end of the stream is reached
            while ((len = in.read(buffer)) != -1) {
                out.write(buffer, 0, len);
            }

            System.out.println(String.format("IO file copy cost %d ms", System.currentTimeMillis() - start));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Reading a file with traditional IO can be divided into the following steps:

  • The kernel reads data from disk into the kernel buffer. The disk controller performs this transfer via DMA, so the CPU is not involved

  • The data is copied from the kernel buffer into the user-space buffer of the process

  • The user process reads the data from the user-space buffer
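The user-space side of these steps can be observed with a small sketch. The class `CountingStreamCopy` and its `Result` type are hypothetical names introduced here for illustration, not part of the article's code. The sketch counts how many `read()` calls are needed to drain a stream for a given buffer size; with a `FileInputStream`, each such call is a point where data moves from the kernel buffer into the user-space array.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only.
final class CountingStreamCopy {

    /** Holds the number of successful read() calls and the bytes copied. */
    static final class Result {
        final int readCalls;
        final long bytesCopied;

        Result(int readCalls, long bytesCopied) {
            this.readCalls = readCalls;
            this.bytesCopied = bytesCopied;
        }
    }

    /** Copies in to out through a user-space buffer of the given size. */
    static Result copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize]; // the user-space buffer
        int readCalls = 0;
        long bytesCopied = 0;
        int len;
        // Each successful read() fills (part of) the user-space buffer;
        // -1 signals the end of the stream.
        while ((len = in.read(buffer)) != -1) {
            readCalls++;
            out.write(buffer, 0, len);
            bytesCopied += len;
        }
        return new Result(readCalls, bytesCopied);
    }
}
```

For example, copying a 4096-byte `ByteArrayInputStream` with a 1024-byte buffer takes four successful reads, while a 4096-byte buffer needs only one.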

NIO copy

NIO can copy files in two ways: one through channels, the other through memory-mapped files

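import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

public class NIOFileCopy {
    public static void copyFile(String source, String target) {
        long start = System.currentTimeMillis();
        try (FileChannel input = new FileInputStream(new File(source)).getChannel();
             FileChannel output = new FileOutputStream(new File(target)).getChannel()) {
            long position = 0, size = input.size();
            // transferFrom may move fewer bytes than requested, so loop until done
            while (position < size) {
                long n = output.transferFrom(input, position, size - position);
                if (n <= 0) break; // source ended early
                position += n;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println(String.format("NIO file copy cost %d ms", System.currentTimeMillis() - start));
    }
}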

File memory mapping:

Memory mapping maps a kernel-space address and a user-space virtual address to the same physical address, so DMA hardware can fill a buffer that is visible to both the kernel and the user-space process. The user process reads the file contents directly from memory. The application only deals with memory and does not need to copy buffers back and forth, which greatly improves IO copy efficiency. The memory used to load memory-mapped files lies outside the Java heap

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class NIOFileCopy2 {

    public static void copyFile(String source, String target) {
        long start = System.currentTimeMillis();
        try (FileInputStream fis = new FileInputStream(new File(source));
             FileOutputStream fos = new FileOutputStream(new File(target))) {
            FileChannel sourceChannel = fis.getChannel();
            FileChannel targetChannel = fos.getChannel();
            // A single mapping is limited to Integer.MAX_VALUE bytes
            MappedByteBuffer mappedByteBuffer = sourceChannel.map(FileChannel.MapMode.READ_ONLY, 0, sourceChannel.size());
            // write() may not drain the buffer in one call
            while (mappedByteBuffer.hasRemaining()) {
                targetChannel.write(mappedByteBuffer);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

        System.out.println(String.format("NIO memory-mapped file copy cost %d ms", System.currentTimeMillis() - start));
        // Removes the copy once it has been timed; delete these two lines to keep the output
        File targetFile = new File(target);
        targetFile.delete();
    }
}
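The point that mapped memory lives outside the Java heap can be checked with a minimal sketch. `MappedReadSketch` and its two methods are hypothetical names used only for this illustration. It maps a file read-only, decodes it as UTF-8 text, and reports whether the buffer is a direct buffer with no backing heap array.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only.
final class MappedReadSketch {

    /** Maps the whole file read-only and decodes its bytes as UTF-8 text. */
    static String readAll(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // The contents are read straight from the mapped region,
            // without an intermediate byte[] filled by read() calls.
            return StandardCharsets.UTF_8.decode(mapped).toString();
        }
    }

    /** Returns true if the mapping is a direct buffer with no heap array behind it. */
    static boolean isOffHeap(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return mapped.isDirect() && !mapped.hasArray();
        }
    }
}
```

Because the mapping is not heap memory, it does not count against the heap size limit, but it still consumes address space and physical memory of the process.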

NIO memory-mapped file copying can be broken down into the following steps

Using the Files#copy method

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class FilesCopy {

    public static void copyFile(String source, String target) {
        long start = System.currentTimeMillis();
        try {
            File sourceFile = new File(source);
            File targetFile = new File(target);
            // Without copy options, Files.copy fails if the target already exists
            Files.copy(sourceFile.toPath(), targetFile.toPath());
        } catch (IOException e) {
            e.printStackTrace();
        }

        System.out.println(String.format("Files#copy file copy cost %d ms", System.currentTimeMillis() - start));
    }
}
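One detail worth knowing when re-running a copy like the one above: without options, `Files.copy` throws `FileAlreadyExistsException` if the target exists. The sketch below shows both behaviours. The class `FilesCopyOptionsSketch` and its method names are hypothetical and exist only for this illustration.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only.
final class FilesCopyOptionsSketch {

    /** Copies source to target, overwriting the target if it is already present. */
    static void copyReplacing(Path source, Path target) throws IOException {
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Returns true if a plain Files.copy refuses to overwrite an existing target. */
    static boolean plainCopyRejectsExisting(Path source, Path target) throws IOException {
        try {
            Files.copy(source, target);
            return false; // the target did not exist, so the copy succeeded
        } catch (FileAlreadyExistsException e) {
            return true;
        }
    }
}
```

For repeated benchmark runs against the same output path, `REPLACE_EXISTING` avoids having to delete the previous copy first.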

Using the FileUtils#copyFile method

The commons-io dependency must be added before FileUtils can be used

  • Dependency

    <dependency>
        <groupId>commons-io</groupId>
        <artifactId>commons-io</artifactId>
        <version>2.4</version>
    </dependency>
  • Wrapper class for FileUtils#copyFile: FileUtilsCopy.java

    import java.io.File;
    import java.io.IOException;
    import org.apache.commons.io.FileUtils;

    public class FileUtilsCopy {

        public static void copyFile(String source, String target) {
            long start = System.currentTimeMillis();
            try {
                FileUtils.copyFile(new File(source), new File(target));
            } catch (IOException e) {
                e.printStackTrace();
            }

            System.out.println(String.format("FileUtils file copy cost %d ms", System.currentTimeMillis() - start));
        }
    }

Performance comparison

With so many implementations available, it is worth finding out which one performs best

Test environment:

  • Windows 10
  • 6-core CPU
  • JDK 1.8

Test code: PerformTest.java

public class PerformTest {

    private static final String source1 = "input/test1.txt";
    private static final String source2 = "input/test2.txt";
    private static final String source3 = "input/test3.txt";
    private static final String source4 = "input/test4.txt";
    private static final String target1 = "output/test1.txt";
    private static final String target2 = "output/test2.txt";
    private static final String target3 = "output/test3.txt";
    private static final String target4 = "output/test4.txt";

    public static void main(String[] args) {
        IOFileCopy.copyFile(source1, target1);
        NIOFileCopy.copyFile(source2, target2);
        FilesCopy.copyFile(source3, target3);
        FileUtilsCopy.copyFile(source4, target4);
    }
}

The test was run five times in total, with file sizes of 9 KB, 23 KB, 239 KB, 1.77 MB, and 12.7 MB respectively
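To reproduce a run, input files of roughly these sizes are needed. The sketch below is one way to generate them. `TestFileGenerator`, its `SIZES` array, and the filler content are assumptions made for illustration; the article does not say how its test files were produced, and the byte counts are only approximations of the sizes quoted above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for illustration only.
final class TestFileGenerator {

    /** Approximate byte counts for 9 KB, 23 KB, 239 KB, 1.77 MB and 12.7 MB. */
    static final long[] SIZES = {
        9L * 1024,
        23L * 1024,
        239L * 1024,
        (long) (1.77 * 1024 * 1024),
        (long) (12.7 * 1024 * 1024)
    };

    /** Writes exactly sizeInBytes filler bytes to file, replacing any previous content. */
    static void generate(Path file, long sizeInBytes) throws IOException {
        byte[] chunk = new byte[8192];
        Arrays.fill(chunk, (byte) 'a');
        try (OutputStream out = Files.newOutputStream(file)) {
            long remaining = sizeInBytes;
            while (remaining > 0) {
                int n = (int) Math.min(chunk.length, remaining);
                out.write(chunk, 0, n);
                remaining -= n;
            }
        }
    }
}
```

Calling `TestFileGenerator.generate(Paths.get("input/test1.txt"), TestFileGenerator.SIZES[0])` would produce a 9 KB input, assuming the `input` directory already exists.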

Note: all units are milliseconds

From the execution results:

  • Files#copy > FileUtils#copyFile => IO > NIO > NIO > Files#copy > FileUtils#copyFile

  • If the file is small => NIO > IO > NIO > Files#copy > FileUtils#copyFile

  • IO > Files#copy > FileUtils#copyFile => NIO > IO > Files#copy > FileUtils#copyFile

  • Changing the size of the IO buffer affects copy efficiency, but performance is not better when the buffer is slightly larger than the file being copied

When files are small, IO is more efficient than NIO. NIO's underlying implementation is more complex, so its advantages are not apparent. NIO memory mapping also takes time to initialize, so it has no advantage over IO copying when files are small

If efficiency is the goal, NIO memory mapping is a good choice for file copying, but when copying large files this way, pay close attention to system memory usage. Memory mapping is recommended for large files; the Javadoc of FileChannel#map puts it this way:

For most operating systems, mapping a file into memory is more
expensive than reading or writing a few tens of kilobytes of data via
the usual {@link #read read} and {@link #write write} methods.  From the
standpoint of performance it is generally only worth mapping relatively
large files into memory

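In other words, on most operating systems, mapping a file into memory costs more than reading or writing a few tens of kilobytes through ordinary IO, so mapping only pays off for relatively large files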

At the same time, the test results show that the file copy methods provided by the utility class and the JDK are not very efficient. If efficiency is not a priority, they are still worth using. After all, the less code there is to write, the better

This is the last article before the New Year. I am afraid there will be too many greetings on New Year's Eve for you to notice mine, so here is an early wish for wealth beyond counting in the Year of the Rat

Finally: test code