You may not have mastered Java IO yet

Hello everyone, I’m Xiao Cai, a developer in the Internet industry who would rather not stay a “cai” (rookie). I can be gentle or tough: gentle with readers who leave a like, tough on those who read for free! Remember to like, favorite, and share!

This article focuses on the I/O system in Java.

Refer to it whenever you need it.

If it helps, don’t forget to leave a like.

The WeChat official account “Xiao Cai Liang” is now open. If you haven’t followed it yet, remember to do so!

Preface:

Creating a good input/output (I/O) system is one of the harder tasks for the designers of a programming language.

Java IO is the Java input/output system. Most programs need to process some input and produce some output from it, so Java provides us with the java.io package.

As qualified developers, we are no strangers to I/O. The Java I/O knowledge system is as follows:

After looking at the figure above, it becomes clear how much the java.io package has to offer. There is no need to be alarmed, though: as the saying goes, however much things change, they never stray from their roots. If we work outward from the source, I believe you can master the I/O knowledge system.

The File class

Read and write operations involve dealing with files, so if you want to master IO streams, start with files.

The name File covers both the singular and the plural: a File object can stand for one particular file or for the set of files in a directory.

Lists

If a File represents the set of files in a directory, how do we get the contents of that directory?

File has the APIs ready for us, and from the return types it’s not hard to guess what each method is for.

The TestFile folder contains the following files:

Name list

If we want the list of names under a given directory, we can use these two APIs:

list()
list(FilenameFilter filter)

The no-argument list() method lists the names of all files and folders in the specified directory. If we want only the names that match a condition, we can use the other method:

We expected to get the file names that contain the keyword test, and that is exactly what we got.
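As a minimal sketch of the two list() variants: the class name NameListDemo, the helper namesContaining(), and the sample file names are all hypothetical, and the directory is created in a temp location only so the example can run anywhere.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;

// Hypothetical demo class; names and sample files are illustrative only.
class NameListDemo {
    // Returns the sorted names in dir that contain the given keyword.
    static String[] namesContaining(File dir, String keyword) {
        // FilenameFilter.accept(File dir, String name) is called once per entry.
        String[] names = dir.list((parent, name) -> name.contains(keyword));
        if (names == null) {      // dir is missing or is not a directory
            return new String[0];
        }
        Arrays.sort(names);       // list() promises no particular order
        return names;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("TestFile").toFile();
        new File(dir, "test01.txt").createNewFile();
        new File(dir, "test02.txt").createNewFile();
        new File(dir, "notes.md").createNewFile();

        System.out.println(Arrays.toString(dir.list()));                   // every name
        System.out.println(Arrays.toString(namesContaining(dir, "test"))); // filtered names
    }
}
```

Note that list() returns null rather than an empty array when the path is not a readable directory, which is why the sketch checks for it.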

File list

Often we operate not on a single file but on a whole set of files. To produce that set, we need another group of File API methods:

listFiles()
listFiles(FilenameFilter filter)
listFiles(FileFilter filter)

With the experience above, we can easily guess that listFiles() lists all the files:

This method also returns an array, but an array of type File.

You are also smart enough to know how to retrieve the set of files whose names contain a given keyword:

It works exactly like listing the file names above. Clever, isn’t it?

But what is different about the parameter of listFiles(FileFilter filter)? Let’s try:

As with FilenameFilter, the interface’s accept() method has to be implemented, but the accept() of FileFilter takes a single File argument instead of a directory and a name. Both parameter types are used for file filtering.
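The difference between the two filter interfaces can be sketched as follows. FileListDemo and its two helpers are hypothetical names chosen for illustration; the temp directory stands in for the TestFile folder.

```java
import java.io.File;
import java.io.FileFilter;
import java.io.FilenameFilter;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical demo class; names and sample files are illustrative only.
class FileListDemo {
    // FilenameFilter.accept(File dir, String name) sees the parent and the bare name.
    static File[] byName(File dir, String keyword) {
        FilenameFilter filter = (parent, name) -> name.contains(keyword);
        File[] files = dir.listFiles(filter);
        return files == null ? new File[0] : files;
    }

    // FileFilter.accept(File pathname) sees the whole File, so attributes can be tested too.
    static File[] regularFilesOnly(File dir) {
        FileFilter filter = File::isFile;
        File[] files = dir.listFiles(filter);
        return files == null ? new File[0] : files;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("TestFile").toFile();
        new File(dir, "test01.txt").createNewFile();
        new File(dir, "test_dir").mkdir();

        for (File f : byName(dir, "test")) {
            System.out.println("name match: " + f.getName());
        }
        for (File f : regularFilesOnly(dir)) {
            System.out.println("regular file: " + f.getName());
        }
    }
}
```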

Directory tools

Create a directory

The great thing about the File class is that it lets you not only work with existing files and directories but also create something out of nothing!

The attributes of a file are nothing more than its name, size, last-modified time, readable/writable flags, type, and so on.

We should be able to obtain all of these through the API:

File does not provide a direct method for getting the type, but we can take the file’s full name and cut off the suffix to get it:

With one small extra step we can work out the file type ourselves. Clever, isn’t it?

Everything above assumes that the file or directory exists. What if the directory we want to operate on doesn’t exist, or we carelessly mistype its name? Would the program still work?

The result is that an exception is thrown. That is to be expected: operating on a directory that doesn’t exist makes no sense.

So when we are not sure whether a directory exists, we can do something like this:

Here we meet two API methods that we have not seen before, exists() and mkdirs().

exists(): checks whether a file or directory exists
mkdirs(): creates a directory

By validating before operating, we avoid the exception. Besides mkdirs(), mkdir() can also create directories. Apart from the missing s, what is the difference between the two methods?

mkdir(): creates a single directory; its parent must already exist
mkdirs(): creates the directory together with any missing parent directories

In our scenario neither the Test directory nor its dir01 subdirectory exists, so two levels of directories have to be created. mkdir() cannot do that and simply returns false, so in this case we should use mkdirs().
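A minimal sketch of reading these attributes and cutting the suffix off the name. FileInfoDemo and extensionOf() are hypothetical names, and treating a name without a dot (or with only a leading dot) as having no type is an assumption of this sketch.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Date;

// Hypothetical demo class; names are illustrative only.
class FileInfoDemo {
    // Returns the text after the last dot, or "" when there is no usable suffix.
    static String extensionOf(File file) {
        String name = file.getName();
        int dot = name.lastIndexOf('.');
        // No dot, or only a leading dot such as ".gitignore": treat as "no extension".
        return dot <= 0 ? "" : name.substring(dot + 1);
    }

    public static void main(String[] args) throws IOException {
        File file = Files.createTempFile("test01", ".txt").toFile();
        Files.write(file.toPath(), "hello".getBytes());

        System.out.println("name:          " + file.getName());
        System.out.println("size (bytes):  " + file.length());
        System.out.println("last modified: " + new Date(file.lastModified()));
        System.out.println("readable:      " + file.canRead());
        System.out.println("writable:      " + file.canWrite());
        System.out.println("type:          " + extensionOf(file));
    }
}
```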
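The check-then-create step and the mkdir()/mkdirs() difference can be sketched like this. MkdirDemo and ensureDirectory() are hypothetical names; the Test/dir01 layout is recreated under a temp directory so the example does not touch a real drive.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical demo class; names are illustrative only.
class MkdirDemo {
    // Returns true if dir is a directory after the call, creating missing levels if needed.
    static boolean ensureDirectory(File dir) {
        return dir.exists() ? dir.isDirectory() : dir.mkdirs();
    }

    public static void main(String[] args) throws IOException {
        File root = Files.createTempDirectory("mkdir-demo").toFile();
        File nested = new File(new File(root, "Test"), "dir01");

        // mkdir() creates one level only, so it fails while "Test" is missing.
        System.out.println("mkdir():  " + nested.mkdir());
        // mkdirs() creates "Test" and "dir01" in one call.
        System.out.println("mkdirs(): " + ensureDirectory(nested));
        System.out.println("exists(): " + nested.exists());
    }
}
```

Both methods report failure through their boolean return value rather than an exception, so the result is worth checking.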

File types

A File can be a single file or a set of files, and that set can contain both files and folders. If we want to read and write files but not touch folders, mixing the two up would be awkward, so we can use isDirectory() to check whether a File is a folder:
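One possible use of isDirectory(), sketched under the assumption that we want to count only regular files while descending into folders. FileKindDemo and countFiles() are hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical demo class; names are illustrative only.
class FileKindDemo {
    // Counts the regular files under root, descending into sub-folders.
    static int countFiles(File root) {
        File[] children = root.listFiles();
        if (children == null) {             // root is a plain file or does not exist
            return root.isFile() ? 1 : 0;
        }
        int count = 0;
        for (File child : children) {
            // Folders are walked, never read or written as if they were files.
            count += child.isDirectory() ? countFiles(child) : 1;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        File root = Files.createTempDirectory("TestFile").toFile();
        File dir01 = new File(root, "dir01");
        dir01.mkdir();
        new File(root, "test01.txt").createNewFile();
        new File(dir01, "test02.txt").createNewFile();

        System.out.println("regular files: " + countFiles(root));
    }
}
```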

Input and output

Above we covered the basic operations of the File class; now we move on to the I/O classes themselves.

For input and output we usually work with the concept of a stream, such as an input stream or an output stream. A stream is an abstraction that represents any data source capable of producing data, or any receiver capable of accepting data. The stream hides the details of how the actual I/O device handles the data!

I/O can be divided into input and output parts.

Input streams are divided into byte input streams (InputStream) and character input streams (Reader). Every class derived from InputStream has a read() method for reading a single byte or an array of bytes; classes derived from Reader read a single character or an array of characters in the same way.

Output streams are divided into byte output streams (OutputStream) and character output streams (Writer). Every class derived from OutputStream has a write() method for writing a single byte or an array of bytes; classes derived from Writer write characters in the same way.

So the rule for byte-oriented I/O in Java is: all input classes inherit from InputStream, and all output classes inherit from OutputStream.

InputStream

InputStream represents classes that produce input from different data sources.

What are those data sources? Common examples include byte arrays, String objects, and files.

Each of these data sources has a corresponding InputStream subclass to operate on:

Class	Function
ByteArrayInputStream	Allows a memory buffer to be used as an InputStream
StringBufferInputStream	Deprecated. Converts a String into an InputStream
FileInputStream	Reads information from a file
PipedInputStream	Produces the data that is written to the associated PipedOutputStream, implementing the “pipe” concept
SequenceInputStream	Converts two or more InputStream objects into a single InputStream
FilterInputStream	Base class for decorators that provide useful functionality to other InputStreams

OutputStream

Classes in this category decide where the output goes: 1. Byte arrays 2. Files 3. Pipes

Common subclasses of OutputStream are:

Class	Function
ByteArrayOutputStream	Creates a buffer in memory; all data sent to the stream is placed in this buffer
FileOutputStream	Writes information to a file
PipedOutputStream	Anything written to it automatically becomes the input of the associated PipedInputStream, implementing the “pipe” concept
FilterOutputStream	Base class for decorators that provide useful functionality to other OutputStreams

A decorator

We have seen that both the input streams and the output streams have a filter class, FilterInputStream and FilterOutputStream. These classes are the base of the decorators. Java I/O needs many different combinations of features, and this is why the decorator pattern is used.

What is a decorator? A decorator must have the same interface as the object it decorates, but it may also extend that interface. This gives us a good deal of flexibility, at the cost of more complex code.

FilterInputStream and FilterOutputStream provide the decorator interface for controlling a particular input stream (InputStream) or output stream (OutputStream).

FilterInputStream

InputStream is a byte input stream, so the data it reads is received in a byte array, as follows:

We have to receive the data in a byte array and then convert it to a string.

Now that we have the decorator base FilterInputStream, can a decorator subclass help us read? Common FilterInputStream subclasses are:

Class	Function
DataInputStream	Used together with DataOutputStream to read primitive types (int, char, long, etc.) from a stream in a portable way
BufferedInputStream	Avoids a physical read every time more data is requested; it means “use a buffer”

DataInputStream lets us read the primitive data types as well as String objects. Paired with DataOutputStream, it lets us move primitive data from one place to another through a stream.
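A minimal sketch of that read loop. ReadBytesDemo and readAll() are hypothetical names, the 1024-byte chunk size is arbitrary, and UTF-8 is assumed as the text encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo class; names are illustrative only.
class ReadBytesDemo {
    // Reads the whole stream chunk by chunk and decodes the bytes as UTF-8 text.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        int n;
        // read(byte[]) returns how many bytes were stored, or -1 at end of stream.
        while ((n = in.read(chunk)) != -1) {
            collected.write(chunk, 0, n);
        }
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        File file = Files.createTempFile("test01", ".txt").toFile();
        Files.write(file.toPath(), "hello io".getBytes(StandardCharsets.UTF_8));

        try (InputStream in = new FileInputStream(file)) {
            System.out.println(readAll(in));
        }
    }
}
```

Decoding only after all bytes are collected avoids splitting a multi-byte character across two chunks.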
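A small round trip through the two data streams, as a sketch. DataStreamDemo, encode(), decode(), and the three sample fields are hypothetical; an in-memory byte array stands in for a file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class; names and fields are illustrative only.
class DataStreamDemo {
    // Writes an int, a double, and a String in DataOutputStream's portable binary format.
    static byte[] encode(int id, double price, String name) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(id);
            out.writeDouble(price);
            out.writeUTF(name);
        }
        return bytes.toByteArray();
    }

    // Reads the values back; the order must match the order they were written in.
    static String decode(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = in.readInt();
            double price = in.readDouble();
            String name = in.readUTF();
            return id + " " + price + " " + name;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(decode(encode(7, 1.5, "cai")));
    }
}
```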

Before getting to BufferedInputStream, let’s look at a test:

The size of test01.txt is about 610 MB, and test02 and test03 are empty text files

Now we copy the text with a plain InputStream + OutputStream pair and with the decorated BufferedInputStream + BufferedOutputStream pair

Plain combination:

Buffered combination:

The two approaches took 4864 ms and 1275 ms respectively. The plain combination takes almost four times as long as the buffered one, and the gap can become huge as the file grows! Surprising as that is, why does it happen?

If you read a file with the read() method, every single byte read goes to the underlying file, which is inefficient. Even if read(byte b[]) is used to read many bytes at a time, reading a large file still causes frequent disk operations.

The BufferedInputStream API documentation explains that creating a BufferedInputStream also creates an internal buffer array. As bytes are read from the stream, the internal buffer is refilled as needed from the underlying input stream, many bytes at a time. In other words, the buffered class allocates a fairly large byte array up front and fills it from the underlying stream in one go. When the program then reads one or more bytes, they come straight from that array, and only when the array has been used up is it refilled from the underlying stream. Reading from memory in this way is much more efficient than going to disk for every read.

BufferedInputStream and BufferedOutputStream do not operate on the data source directly; they wrap other byte streams. In other words, they are processing streams.

The program writes data into the BufferedOutputStream buffer first, not straight to the file. The contents of the buffer are written to the file when:

The buffer is full
flush() is called to flush the buffer
close() is called to close the stream
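The buffered combination can be sketched as below. This is not the code that produced the timings quoted in this article; BufferedCopyDemo and copy() are hypothetical names, and the byte-at-a-time loop is chosen only to show that single-byte calls hit the in-memory buffer rather than the disk.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;

// Hypothetical demo class; names are illustrative only.
class BufferedCopyDemo {
    // Copies source to target one byte at a time through buffered decorators.
    static long copy(File source, File target) throws IOException {
        long total = 0;
        try (InputStream in = new BufferedInputStream(new FileInputStream(source));
             OutputStream out = new BufferedOutputStream(new FileOutputStream(target))) {
            int b;
            // Each read()/write() touches the in-memory buffer; the disk is hit
            // only when the buffer has to be refilled or flushed.
            while ((b = in.read()) != -1) {
                out.write(b);
                total++;
            }
        } // close() flushes whatever is still sitting in the output buffer
        return total;
    }

    public static void main(String[] args) throws IOException {
        File source = Files.createTempFile("test01", ".txt").toFile();
        File target = Files.createTempFile("test03", ".txt").toFile();
        Files.write(source.toPath(), new byte[100_000]);

        System.out.println("copied bytes: " + copy(source, target));
    }
}
```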

FilterOutputStream

The basic operations of OutputStream are as follows:

The value is written to the file by calling the write() method. There are two things to note:

Writing overwrites the file by default

If we call this method twice, we would expect the text file to contain two lines of the same text, but in reality there is only one, because the later write overwrites what was already there. The solution is to pass true as the append argument of the FileOutputStream constructor.

Reading and writing treat a missing file differently: reading reports an error if the file does not exist, while writing creates the file for you by default

OutputStream also has a decorator base class, FilterOutputStream. Its common subclasses are:

Class	Function
DataOutputStream	Used together with DataInputStream to write primitive types (int, char, long, etc.) to a stream in a portable way
BufferedOutputStream	Avoids a physical write every time data is sent; it means “use a buffer”, and flush() can be called to flush the buffer

Both classes were covered above together with their input counterparts, so we won’t repeat them here.
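Both notes can be checked with a short sketch. AppendDemo and writeLine() are hypothetical names; the second constructor argument of FileOutputStream is the append flag.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo class; names are illustrative only.
class AppendDemo {
    // Writes one line; append=false truncates the file first, append=true adds to the end.
    static void writeLine(File file, String line, boolean append) throws IOException {
        try (OutputStream out = new FileOutputStream(file, append)) {
            out.write((line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("append-demo").toFile();
        File file = new File(dir, "test02.txt");     // does not exist yet

        try (FileInputStream in = new FileInputStream(file)) {
            System.out.println("unexpected: file could be read");
        } catch (FileNotFoundException e) {
            System.out.println("reading a missing file fails");
        }

        writeLine(file, "first", false);   // creates the file
        writeLine(file, "second", false);  // overwrites "first"
        writeLine(file, "third", true);    // appended after "second"
        System.out.println(Files.readAllLines(file.toPath()));
    }
}
```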

Reader and Writer

In Java 1.1 the basic I/O stream library was significantly revised, and the Reader and Writer classes were added. With my limited understanding back then, I mistakenly thought these two classes were meant to replace InputStream and OutputStream, but that is not the case.

InputStream and OutputStream provide I/O functionality in byte-oriented form, while Reader and Writer provide I/O functionality in Unicode-compatible character-oriented form.

The two families coexist, and adapters are provided between them: InputStreamReader and OutputStreamWriter.

InputStreamReader converts an InputStream into a Reader
OutputStreamWriter converts an OutputStream into a Writer

The two hierarchies are very similar, though not identical:

Byte stream	Character stream
InputStream	Reader
OutputStream	Writer
FileInputStream	FileReader
FileOutputStream	FileWriter
ByteArrayInputStream	CharArrayReader
ByteArrayOutputStream	CharArrayWriter
PipedInputStream	PipedReader
PipedOutputStream	PipedWriter

Even the decorator classes are almost the same:

Byte stream	Character stream
FilterInputStream	FilterReader
FilterOutputStream	FilterWriter
BufferedInputStream	BufferedReader
BufferedOutputStream	BufferedWriter
PrintStream	PrintWriter

Using Reader and Writer is also quite simple:

Let’s also take a look at the decorators BufferedReader and BufferedWriter:
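A sketch that combines the adapters with the buffered decorators. ReaderWriterDemo, writeLines(), and readLines() are hypothetical names, and UTF-8 is assumed as the encoding; passing the charset explicitly avoids depending on the platform default.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
class ReaderWriterDemo {
    // Byte stream -> OutputStreamWriter (adapter) -> BufferedWriter (decorator).
    static void writeLines(File file, List<String> lines) throws IOException {
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
            }
        }
    }

    // Byte stream -> InputStreamReader (adapter) -> BufferedReader (decorator).
    static List<String> readLines(File file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {  // null marks end of stream
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        File file = Files.createTempFile("test02", ".txt").toFile();
        writeLines(file, Arrays.asList("first line", "second line"));
        System.out.println(readLines(file));
    }
}
```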

RandomAccessFile

RandomAccessFile works on files made up of records of known size, so we can use seek() to move from one record to another and then read or modify it. The records do not all have to be the same size, as long as we can determine how big each one is and where it sits in the file.

We can see that RandomAccessFile does not inherit from InputStream or OutputStream; instead it implements the somewhat unfamiliar DataInput and DataOutput interfaces.

This class is a bit of a maverick. Let’s look at its constructor:

Only part of the constructor is shown here, just the key points

Looking at the constructor, you can see that four modes are defined:

r	Opens the file read-only; write operations on it fail
rw	Opens the file for both reading and writing
rws	Like rw, and every update to the file’s content or metadata is written synchronously to the storage device
rwd	Like rw, and every update to the file’s content is written synchronously to the storage device

How does that help? In plain terms, RandomAccessFile does it all: it can both read and write.

In essence, RandomAccessFile works like a DataInputStream and a DataOutputStream combined, plus a few extra methods: getFilePointer() returns the current position in the file, seek() moves to a new position in the file, and length() returns the current size of the file. The second constructor argument says whether we want read-only access (“r”) or read and write access (“rw”); there is no write-only mode. Let’s actually try it:

Getting a read-only RandomAccessFile:

Getting a readable and writable RandomAccessFile:

We first wrote test to the file, then moved the file pointer past those characters and wrote the four characters File. The result is testFile, because the second write started right after test.
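A sketch of the read/write case, assuming the pointer is moved to offset 4, just past the four bytes of test. RandomAccessDemo and writeThenSeek() are hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo class; names are illustrative only.
class RandomAccessDemo {
    // Writes "test", seeks to offset 4, writes "File", then reads everything back.
    static String writeThenSeek(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);                                    // start from an empty file
            raf.write("test".getBytes(StandardCharsets.US_ASCII));
            raf.seek(4);                                         // offset 4 = right after "test"
            raf.write("File".getBytes(StandardCharsets.US_ASCII));

            raf.seek(0);                                         // back to the start for reading
            byte[] all = new byte[(int) raf.length()];
            raf.readFully(all);
            return new String(all, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        File file = Files.createTempFile("test02", ".txt").toFile();
        System.out.println(writeThenSeek(file));

        // Read-only mode: reading works, and the pointer starts at 0.
        try (RandomAccessFile readOnly = new RandomAccessFile(file, "r")) {
            System.out.println("pointer=" + readOnly.getFilePointer()
                    + " length=" + readOnly.length());
        }
    }
}
```

Seeking to a smaller offset instead would overwrite part of what is already there, since writes replace bytes in place rather than inserting them.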

ZIP

When you see the word ZIP, you naturally think of compressed files, and yes, compressed files matter a great deal in Java I/O. Perhaps it is fairer to say that file compression matters a great deal in everyday development.

Java’s built-in classes include support for ZIP compression. You can use ZipOutputStream and ZipInputStream from the java.util.zip package to compress and decompress files. Let’s first look at how to compress a file.

ZipOutputStream

The ZipOutputStream constructor is as follows:

public ZipOutputStream(OutputStream out) {/* doSomething */}

We need to pass in an OutputStream object, so compressing can be thought of as writing data through a stream into the archive file. Let’s see which APIs ZipOutputStream offers:

Method	Return value	Description
putNextEntry(ZipEntry e)	void	Begins writing a new ZipEntry and positions the stream at the start of the entry data
write(byte[] b, int off, int len)	void	Writes an array of bytes to the current ZIP entry data
setComment(String comment)	void	Sets the comment text for this ZIP file
finish()	void	Finishes writing the contents of the ZIP output stream without closing the underlying OutputStream

Let’s show how to compress files:

Scenario: We need to compress the TestFile folder on drive D into test.zip on drive D

The specific operation logic is as follows:

With the steps above we can easily compress a folder
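One way the compression steps might look, as a sketch rather than the article’s exact code. ZipDemo, zipDirectory(), and addEntries() are hypothetical names, temp paths replace drive D, and empty folders are skipped for brevity.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; names are illustrative only.
class ZipDemo {
    // Compresses sourceDir (recursively) into zipFile. zipFile must live outside sourceDir.
    static void zipDirectory(File sourceDir, File zipFile) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(
                new BufferedOutputStream(new FileOutputStream(zipFile)))) {
            zip.setComment("created by ZipDemo");
            addEntries(zip, sourceDir, sourceDir.getName());
        }
    }

    private static void addEntries(ZipOutputStream zip, File file, String entryName)
            throws IOException {
        if (file.isDirectory()) {
            File[] children = file.listFiles();
            if (children == null) {
                return;
            }
            for (File child : children) {
                // ZIP entry names always use '/' as the separator.
                addEntries(zip, child, entryName + "/" + child.getName());
            }
            return;
        }
        zip.putNextEntry(new ZipEntry(entryName));          // start a new entry
        try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                zip.write(chunk, 0, n);                     // entry data
            }
        }
        zip.closeEntry();
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("TestFile").toFile();
        Files.write(new File(dir, "test01.txt").toPath(), "hello zip".getBytes());
        File zipFile = Files.createTempFile("test", ".zip").toFile();

        zipDirectory(dir, zipFile);
        System.out.println("zip size in bytes: " + zipFile.length());
    }
}
```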

ZipInputStream

Now that we know how to compress a file, the next step is naturally how to decompress one!

public ZipInputStream(InputStream in) {/* doSomethings */}

ZipInputStream mirrors the compressing stream. Its constructor also takes an InputStream object, and its API corresponds closely:

Method	Return value	Description
read(byte[] b, int off, int len)	int	Reads up to len bytes from the current ZIP entry into array b, starting at offset off
available()	int	Returns 0 once the end of the current entry’s data has been reached, otherwise 1
closeEntry()	void	Closes the current ZIP entry and positions the stream for reading the next entry
skip(long n)	long	Skips the specified number of bytes in the current ZIP entry
getNextEntry()	ZipEntry	Reads the next ZipEntry and positions the stream at the beginning of that entry’s data
createZipEntry(String name)	ZipEntry	Creates a new ZipEntry object for the specified entry name

Here’s how to decompress a file:

Don’t be intimidated by the length of the code; read it carefully and you’ll see that decompressing a file is easy:

We obtain each ZipEntry with getNextEntry(), which walks the archive in a way similar to a depth-first traversal, returning entries in the following order:

All files in one directory, such as folder dir01, are traversed before the next folder, dir02, is reached, so we do not need recursion to get every file. Once an entry has been retrieved, its input stream is obtained through ZipFile and written out to the unzipped file. The general process is as follows:
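A decompression sketch under slightly different assumptions from the description above: it reads each entry’s data directly from the ZipInputStream instead of going through ZipFile. UnzipDemo and unzip() are hypothetical names, and the path check that rejects entries pointing outside the target folder is an extra precaution added here.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; names are illustrative only.
class UnzipDemo {
    static void unzip(File zipFile, File targetDir) throws IOException {
        String root = targetDir.getCanonicalPath() + File.separator;
        try (ZipInputStream zip = new ZipInputStream(
                new BufferedInputStream(new FileInputStream(zipFile)))) {
            ZipEntry entry;
            // getNextEntry() returns entries one after another; no recursion is needed.
            while ((entry = zip.getNextEntry()) != null) {
                File out = new File(targetDir, entry.getName());
                if (!out.getCanonicalPath().startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    out.mkdirs();
                } else {
                    out.getParentFile().mkdirs();
                    try (OutputStream os = new BufferedOutputStream(new FileOutputStream(out))) {
                        byte[] chunk = new byte[4096];
                        int n;
                        while ((n = zip.read(chunk)) != -1) {   // -1 = end of this entry
                            os.write(chunk, 0, n);
                        }
                    }
                }
                zip.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny archive first so the sketch is runnable on its own.
        File zipFile = Files.createTempFile("test", ".zip").toFile();
        try (ZipOutputStream zip = new ZipOutputStream(new FileOutputStream(zipFile))) {
            zip.putNextEntry(new ZipEntry("dir01/a.txt"));
            zip.write("hello zip".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        File target = Files.createTempDirectory("unzipped").toFile();
        unzip(zipFile, target);
        File extracted = new File(new File(target, "dir01"), "a.txt");
        System.out.println(new String(Files.readAllBytes(extracted.toPath()), StandardCharsets.UTF_8));
    }
}
```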

New I/O

The new Java I/O library, introduced in the java.nio.* packages in JDK 1.4, has a simple goal: speed. In fact, the old I/O packages have been reimplemented using NIO to take advantage of this speed increase.

Speed comes from using structures that are closer to the way the operating system performs I/O, which gives us two concepts: channels and buffers.

How should we understand channels and buffers? Think of the buffer as a small cart in a coal mine and the channel as the track it runs on: the cart comes back from the source loaded with coal. So we do not interact with the channel directly; we interact with the buffer and send the buffer into the channel. The channel either takes data from the buffer or puts data into it.

ByteBuffer is the only buffer that interacts directly with a channel and can store raw bytes.

ByteBuffer buffer = ByteBuffer.allocate(1024);

A ByteBuffer is usually created with the allocate() method, specifying its size. In addition, ByteBuffer can be created in 4

To better support the new I/O, three classes in the old I/O library were modified so that they can produce a FileChannel: FileInputStream, FileOutputStream, and RandomAccessFile (the last for both reading and writing). Note that all three are byte streams; the character-oriented Reader and Writer classes cannot produce a channel, but the Channels utility class provides methods for producing Readers and Writers from channels.

Getting a channel

As we saw above, three classes can produce a channel. It is done as follows:

These are the three ways to create channels, with read and write operations tested for each. Let’s go over the test code and summarize:

The getChannel() method produces a FileChannel. We pass it a ByteBuffer to read from or write into. One way to put bytes into a ByteBuffer is to fill it directly with the put() methods, which accept one or more bytes or values of primitive data types. It is also possible to “wrap” an existing byte array in a ByteBuffer with the wrap() method. The underlying array is then not copied but used as the storage for the resulting ByteBuffer, which is called an array-backed ByteBuffer.
We can also see the position() method of FileChannel, which moves the FileChannel around within the file; here we move it to the end and then carry on with the remaining reads and writes.
For read-only access, we must explicitly allocate the ByteBuffer with the static allocate() method. For more speed we can use allocateDirect() instead, which produces a “direct” buffer that is coupled more tightly to the operating system. The cost of such an allocation can be higher, though, and the implementation varies from one operating system to another.
Once read() has been called to store bytes in the ByteBuffer, we must call flip() on the buffer so that it is ready for its bytes to be extracted. Here we only receive the bytes and do nothing further with the buffer. If we want to keep using the buffer for more read() calls, we must call clear() before each read().
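The four points above can be sketched with one class. ChannelDemo and its three helpers are hypothetical names, the 1024-byte buffer size is arbitrary, and UTF-8 is assumed for the text.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo class; names are illustrative only.
class ChannelDemo {
    // Channel from FileOutputStream: wrap() turns an existing array into a buffer.
    static void write(File file, String text) throws IOException {
        try (FileChannel channel = new FileOutputStream(file).getChannel()) {
            ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        }
    }

    // Channel from RandomAccessFile: position() moves to the end before writing.
    static void append(File file, String text) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            channel.position(channel.size());
            ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        }
    }

    // Channel from FileInputStream: allocate(), then read / flip / consume / clear.
    static String read(File file) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        try (FileChannel channel = new FileInputStream(file).getChannel()) {
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            while (channel.read(buffer) != -1) {
                buffer.flip();                                       // switch from filling to draining
                collected.write(buffer.array(), 0, buffer.limit());  // limit = bytes just read
                buffer.clear();                                      // ready for the next read()
            }
        }
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        File file = Files.createTempFile("test02", ".txt").toFile();
        write(file, "hello");
        append(file, " nio");
        System.out.println(read(file));
    }
}
```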

Connecting channels

Programmers tend to be lazy, and the FileChannel approach above seems cumbersome. Is there an easier way? There is: transferTo() and transferFrom() connect one channel directly to another. They are used as follows:

Either method 1 or method 2 successfully writes the content to the test03.txt file
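Both directions can be sketched as follows. TransferDemo and its two helpers are hypothetical names; the loops are there because a single transfer call is allowed to move fewer bytes than requested.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;

// Hypothetical demo class; names are illustrative only.
class TransferDemo {
    // Method 1: the source channel pushes its bytes into the target channel.
    static void copyWithTransferTo(File source, File target) throws IOException {
        try (FileChannel in = new FileInputStream(source).getChannel();
             FileChannel out = new FileOutputStream(target).getChannel()) {
            long position = 0;
            long size = in.size();
            while (position < size) {
                position += in.transferTo(position, size - position, out);
            }
        }
    }

    // Method 2: the target channel pulls the bytes from the source channel.
    static void copyWithTransferFrom(File source, File target) throws IOException {
        try (FileChannel in = new FileInputStream(source).getChannel();
             FileChannel out = new FileOutputStream(target).getChannel()) {
            long position = 0;
            long size = in.size();
            while (position < size) {
                position += out.transferFrom(in, position, size - position);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File source = Files.createTempFile("test01", ".txt").toFile();
        File target = Files.createTempFile("test03", ".txt").toFile();
        Files.write(source.toPath(), "connected channels".getBytes());

        copyWithTransferTo(source, target);
        System.out.println(new String(Files.readAllBytes(target.toPath())));
        copyWithTransferFrom(source, target);
        System.out.println(new String(Files.readAllBytes(target.toPath())));
    }
}
```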

END

I/O operations are an integral part of our daily development, so we need to master them too!

The harder you work today, the fewer favors you will have to ask for tomorrow!

I am Xiao Cai, someone who studies alongside you. 💋

