blog.csdn.net/mu\_wind/ar… ?

Java IO stream

  • Preface
  • 1 Introduction to Java IO
    • 1.1 Classification of IO streams
    • 1.2 Hands-on examples
  • 2 IO stream classes
    • 2.1 The File class
    • 2.2 Byte streams
    • 2.3 Character streams
    • 2.4 Serialization
  • 3 IO stream methods
    • 3.1 Byte stream methods
    • 3.2 Character stream methods
  • 4 Additional content
    • 4.1 Bits, bytes, and characters
    • 4.2 IO stream efficiency comparison
    • 4.3 NIO

Preface

Someone once asked the author of Fastjson (an Alibaba technical expert who goes by the nickname “High-Speed Rail”): “Developing Fastjson brought you no benefits, only criticism. Why do you do it?”

He replied, “Because love itself is a reward!”

The answer struck me, and when I think about it, the same is true for me. Writing is a painful process, and writing with care is harder still: every word has to be weighed, and a piece takes shape only after repeated revision. But when a good article finally comes out of your hands, the pain is worth it, and it is even more encouraging when people read and appreciate it. The happiness of a technical person can be that pure and simple.

Follow me so you don’t lose your way, and like, favorite, and share for good luck!

IO streams are an important part of Java and something we work with all the time. I would rank this post on Java IO among the top three on the web (feel free to roast me for saying so!).

If you can answer the following questions (more will be added over time), congratulations: you already have a good grasp of IO and can close this article right away. Otherwise, you can find the answers in what follows.

  1. What are the characteristics of Java IO streams?
  2. How many kinds of Java IO streams are there?
  3. How are byte streams and character streams related, and how do they differ?
  4. Do character streams use a buffer?
  5. Are buffered streams always more efficient? Why?
  6. Which Java design pattern do buffered streams embody?
  7. Why do we need serialization, and how is it implemented?
  8. If a class file is modified after its data has been serialized, reading the data back fails. How can this be solved?

1 Introduction to Java IO

IO stands for input and output: the transfer of data between an application and external devices. Common external devices include files, pipes, and network connections.

Java handles IO through streams, so what is a stream?

A stream is an abstract concept: a sequence of data (characters or bytes) that is transmitted in first-in, first-out order.

When a program needs to read data, it opens a stream to a data source, be it a file, memory, or a network connection. Similarly, when a program needs to write data, it opens a stream to the destination. You can picture the data flowing through the stream.

In general, streams have the following characteristics:

  1. FIFO: the data written to an output stream first is the data read from the corresponding input stream first.
  2. Sequential access: bytes are written to a stream one after another and are read back in the same order; the data in the middle cannot be accessed at random (RandomAccessFile is the exception).
  3. Read-only or write-only: a stream is either an input stream or an output stream. An input stream can only be read and an output stream can only be written, so a transfer channel that needs both reading and writing must provide two streams.
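The first two characteristics can be seen in a small in-memory experiment. The sketch below is illustrative only: the class name StreamOrderDemo and its helper method are made up for this example, and a ByteArrayOutputStream stands in for a real file or network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class StreamOrderDemo {
    // Writes the bytes one by one to an in-memory output stream,
    // then reads them back one by one from an input stream over the same data.
    static String writeThenRead(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : data) {
            out.write(b);
        }
        InputStream in = new ByteArrayInputStream(out.toByteArray());
        StringBuilder order = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) { // -1 marks the end of the stream
            order.append(b).append(' ');
        }
        return order.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        // The bytes come back in exactly the order they were written
        System.out.println(writeThenRead(new byte[] {10, 20, 30}));
    }
}
```

Note that the output side and the input side are two separate objects: neither stream can be used in the other direction.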

1.1 Classification of IO streams

IO streams can be classified in three ways:

  1. By direction of data flow: input streams and output streams
  2. By unit of data processed: byte streams and character streams
  3. By function: node streams and processing streams

1. Input and output streams

Input and output are defined relative to the application. For file access, reading a file uses an input stream and writing a file uses an output stream; it is easy to get the two mixed up.



2. Byte streams and character streams

Byte streams and character streams are used in almost exactly the same way. The difference lies in the unit of data they operate on: byte streams work with 8-bit bytes, while character streams work with 16-bit characters.

Why do we have character streams?

Java characters follow the Unicode standard, and a Java char is a 16-bit UTF-16 code unit. How many bytes a character takes in a file depends on the encoding: in GBK, for example, an English character is one byte and a Chinese character is two bytes.



In UTF-8, a Chinese character is three bytes. For example, the five Chinese characters “云深不知处” (“deep in the clouds, whereabouts unknown”) correspond to 15 bytes: -28, -70, -111, -26, -73, -79, -28, -72, -115, -25, -97, -91, -27, -92, -124.

The problem is this: if you process Chinese text with a byte stream, everything is fine as long as each read or write covers exactly the bytes of whole characters. As soon as the bytes of one character are split apart, the text comes out garbled. To make Chinese and other multi-byte characters easier to handle, Java introduced character streams.
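The garbling can be reproduced without any file at all. The following is a minimal sketch, assuming UTF-8 as the encoding; the class name SplitCharDemo and the helper decodePrefix are hypothetical names chosen for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class SplitCharDemo {
    // Decodes only the first `count` bytes of the UTF-8 form of `text`,
    // which is what happens when a byte buffer ends in the middle of a character.
    static String decodePrefix(String text, int count) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(Arrays.copyOf(bytes, count), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "\u4e91\u6df1"; // two Chinese characters, three UTF-8 bytes each
        // Cut on a character boundary: the first character is decoded intact
        System.out.println(decodePrefix(text, 3).equals("\u4e91"));
        // Cut one byte into the second character: the stray byte cannot be decoded
        System.out.println(decodePrefix(text, 4).contains("\uFFFD"));
    }
}
```

Both lines print true: three bytes give back the first character, while four bytes give the first character followed by the replacement character U+FFFD, which is the garbled output described above.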

Other differences between byte stream and character stream:

  1. Byte streams are generally used for files such as images, video, audio, PPT, and Word documents. Character streams are generally used for plain text files such as TXT files and cannot handle non-text files such as images and videos. In short: byte streams can handle any file, while character streams can handle only plain text files.
  2. A byte stream has no buffer of its own, so a buffered byte stream is far more efficient than a plain one. A character stream already has a buffer of its own, so a buffered character stream brings a much smaller gain. See the efficiency comparison at the end of this article.

Taking file writing as an example, the source code of the character stream shows that it does indeed use a buffer:



3. Node streams and processing streams

Node stream: a stream class that reads and writes the data source directly, such as FileInputStream.

Processing stream: a stream that wraps an existing stream and processes the data passing through it, giving programs more powerful and flexible read and write capabilities, such as BufferedInputStream.

Processing streams and node streams apply the decorator design pattern.

The figure below depicts the relationship between node streams and processing streams: a processing stream wraps a node stream, and the actual reading and writing of data is ultimately done by the node stream.
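In code, the wrapping looks like nested constructor calls. Here is a small sketch under stated assumptions: DecoratorDemo is a made-up class name, and a ByteArrayInputStream plays the role of the node stream so that the example needs no file on disk.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

class DecoratorDemo {
    // Reads one big-endian int through two layers of processing streams.
    static int readInt(byte[] source) throws IOException {
        InputStream node = new ByteArrayInputStream(source);   // node stream: touches the data itself
        InputStream buffered = new BufferedInputStream(node);  // decorator 1: adds a buffer
        try (DataInputStream data = new DataInputStream(buffered)) { // decorator 2: adds readInt()
            return data.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        // The four bytes 0, 0, 1, 2 form the int 0x00000102
        System.out.println(readInt(new byte[] {0, 0, 1, 2}));
    }
}
```

Each layer only adds behavior; the bytes themselves always come from the innermost node stream, and closing the outermost stream closes the whole chain.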



Among the many processing streams, one is particularly important: the buffered stream.

Interaction between a program and the disk is slow compared with memory operations and can easily become a performance bottleneck. Reducing the number of interactions between the program and the disk is an effective way to improve efficiency. Buffered streams apply this idea: a plain stream may read or write as little as one byte per call, while a buffered stream sets up a buffer in memory and interacts with the disk only once enough data has accumulated. With the total amount of data unchanged, the number of interactions falls because each interaction carries more data.

Think of an everyday example. When moving bricks, loading them onto a truck one at a time is clearly inefficient. Instead we can load the bricks onto a cart, push the cart to the truck, and then load the bricks. The cart acts as a buffer: it reduces the number of trips and so improves efficiency.
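The "fewer trips" effect can be counted directly. The sketch below is an illustration, not a benchmark: CountingSink is a hypothetical stand-in for a slow device that merely counts how many write calls reach it, and the 1024-byte buffer size is an arbitrary choice.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class BufferCountDemo {
    // Stand-in for a slow device: discards the data but counts the calls it receives.
    static class CountingSink extends OutputStream {
        int calls;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    // Writes `total` bytes one at a time and returns how many calls reached the sink.
    static int callsReachingSink(int total, boolean buffered) throws IOException {
        CountingSink sink = new CountingSink();
        OutputStream out = buffered ? new BufferedOutputStream(sink, 1024) : sink;
        for (int i = 0; i < total; i++) {
            out.write(i);
        }
        out.close(); // flushes whatever is still sitting in the buffer
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unbuffered calls: " + callsReachingSink(10_000, false));
        System.out.println("buffered calls:   " + callsReachingSink(10_000, true));
    }
}
```

Without the buffer every single-byte write reaches the sink; with the buffer the sink is only called when the 1024-byte "cart" is full or the stream is closed.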

One point to note: are buffered streams always more efficient? Not necessarily. In some cases a buffered stream can even be slower. For details, see the IO stream efficiency comparison.

The complete IO classification diagram is as follows:

1.2 Hands-on examples

Next, let’s look at how to use Java IO.

The example reads and writes text: it writes the poem from the beginning of the article, “Beneath the pines I asked the boy; he said his master had gone to gather herbs. He is on this mountain, but the clouds are too deep to know where.”, to a local file, then reads the file back and prints it to the console.

1. FileInputStream and FileOutputStream

Using plain byte streams for this is not recommended because they are inefficient.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class IOTest {
    public static void main(String[] args) throws IOException {
        File file = new File("D:/test.txt");
        write(file);
        System.out.println(read(file));
    }

    public static void write(File file) throws IOException {
        // Open the file in append mode
        OutputStream os = new FileOutputStream(file, true);
        // The string to write
        String string = "Beneath the pines I asked the boy; he said his master had gone to gather herbs. He is on this mountain, but the clouds are too deep to know where.";
        // Write the file
        os.write(string.getBytes());
        // Close the stream
        os.close();
    }

    public static String read(File file) throws IOException {
        InputStream in = new FileInputStream(file);
        // Read up to 1024 bytes at a time
        byte[] bytes = new byte[1024];
        StringBuilder sb = new StringBuilder();
        // Number of bytes read; -1 means there is no more data
        int length = 0;
        // Loop until all data has been read
        while ((length = in.read(bytes)) != -1) {
            // Convert the bytes that were read into a string
            sb.append(new String(bytes, 0, length));
        }
        // Close the stream
        in.close();
        return sb.toString();
    }
}

2. BufferedInputStream and BufferedOutputStream

Buffered byte streams exist for the sake of efficiency. The actual reading and writing is still done by FileInputStream and FileOutputStream, so it is no surprise that their constructors take objects of those two classes.

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class IOTest {
    public static void write(File file) throws IOException {
        // Buffered byte stream, for better efficiency
        BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(file, true));
        // The string to write
        String string = "Beneath the pines I asked the boy; he said his master had gone to gather herbs. He is on this mountain, but the clouds are too deep to know where.";
        // Write the file
        bos.write(string.getBytes());
        // Close the stream
        bos.close();
    }

    public static String read(File file) throws IOException {
        BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file));
        // Read up to 1024 bytes at a time
        byte[] bytes = new byte[1024];
        StringBuilder sb = new StringBuilder();
        // Number of bytes read; -1 means there is no more data
        int length = 0;
        // Loop until all data has been read
        while ((length = bis.read(bytes)) != -1) {
            // Convert the bytes that were read into a string
            sb.append(new String(bytes, 0, length));
        }
        // Close the stream
        bis.close();
        return sb.toString();
    }
}

3. InputStreamReader and OutputStreamWriter

Character streams are suited to reading and writing text files. OutputStreamWriter also relies on an underlying byte stream such as FileOutputStream to do the actual writing, so its constructor takes a FileOutputStream object.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

public class IOTest {
    public static void write(File file) throws IOException {
        // OutputStreamWriter lets you specify the character set explicitly
        OutputStreamWriter osw = new OutputStreamWriter(new FileOutputStream(file, true), "UTF-8");
        // The string to write
        String string = "Beneath the pines I asked the boy; he said his master had gone to gather herbs. He is on this mountain, but the clouds are too deep to know where.";
        osw.write(string);
        osw.close();
    }

    public static String read(File file) throws IOException {
        InputStreamReader isr = new InputStreamReader(new FileInputStream(file), "UTF-8");
        // Read up to 1024 characters at a time
        char[] chars = new char[1024];
        StringBuilder sb = new StringBuilder();
        // Number of characters read; -1 means there is no more data
        int length;
        // Loop until all data has been read
        while ((length = isr.read(chars)) != -1) {
            // Append the characters that were read
            sb.append(chars, 0, length);
        }
        // Close the stream
        isr.close();
        return sb.toString();
    }
}

4. Character stream convenience classes

Java provides FileWriter and FileReader to simplify reading and writing with character streams. new FileWriter(file, true) is equivalent to new OutputStreamWriter(new FileOutputStream(file, true)) with the default character set.

import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class IOTest {
    public static void write(File file) throws IOException {
        FileWriter fw = new FileWriter(file, true);
        // The string to write
        String string = "Beneath the pines I asked the boy; he said his master had gone to gather herbs. He is on this mountain, but the clouds are too deep to know where.";
        fw.write(string);
        fw.close();
    }

    public static String read(File file) throws IOException {
        FileReader fr = new FileReader(file);
        // Read up to 1024 characters at a time
        char[] chars = new char[1024];
        StringBuilder sb = new StringBuilder();
        // Number of characters read; -1 means there is no more data
        int length;
        // Loop until all data has been read
        while ((length = fr.read(chars)) != -1) {
            // Append the characters that were read
            sb.append(chars, 0, length);
        }
        // Close the stream
        fr.close();
        return sb.toString();
    }
}

5. BufferedReader and BufferedWriter

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class IOTest {
    public static void write(File file) throws IOException {
        // Alternative that specifies the character set explicitly:
        // BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(
        //         new FileOutputStream(file, true), "UTF-8"));
        BufferedWriter bw = new BufferedWriter(new FileWriter(file, true));
        // The string to write
        String string = "Beneath the pines I asked the boy; he said his master had gone to gather herbs. He is on this mountain, but the clouds are too deep to know where.";
        bw.write(string);
        bw.close();
    }

    public static String read(File file) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader(file));
        StringBuilder sb = new StringBuilder();
        // Read line by line; null means there is no more data
        String line;
        while ((line = br.readLine()) != null) {
            // Append the line that was read
            sb.append(line);
        }
        // Close the stream
        br.close();
        return sb.toString();
    }
}

2 IO stream classes

The first section gave us a general picture of IO and walked through a few examples, but we still lack a detailed understanding. Next we break Java IO down and lay out the complete body of knowledge.

The java.io package provides more than 40 classes; we only need to look closely at the most important ones used in everyday work.

2.1 The File class

The File class is used to operate on files and directories themselves, but it cannot read or write the data inside a file.

public class File extends Object implements Serializable, Comparable<File>

The File class implements Serializable and Comparable<File>, which means it supports serialization and sorting.

Constructors of the File class:

  • File(File parent, String child): Creates a new File instance from a parent abstract pathname and a child pathname string.
  • File(String pathname): Creates a new File instance by converting the given pathname string into an abstract pathname.
  • File(String parent, String child): Creates a new File instance from a parent pathname string and a child pathname string.
  • File(URI uri): Creates a new File instance by converting the given file: URI into an abstract pathname.

Common methods of the File class:

  • createNewFile(): Atomically creates a new, empty file if and only if a file with the name given by this abstract pathname does not yet exist.
  • delete(): Deletes the file or directory denoted by this abstract pathname.
  • exists(): Tests whether the file or directory denoted by this abstract pathname exists.
  • getAbsoluteFile(): Returns the absolute form of this abstract pathname.
  • getAbsolutePath(): Returns the absolute pathname string of this abstract pathname.
  • length(): Returns the length, in bytes, of the file denoted by this abstract pathname.
  • mkdir(): Creates the directory named by this abstract pathname.

Example of using the File class:

import java.io.File;
import java.io.IOException;

public class FileTest {
    public static void main(String[] args) throws IOException {
        File file = new File("C:/Mu/fileTest.txt");
        // Check whether the file exists
        if (!file.exists()) {
            // Create it if it does not exist
            file.createNewFile();
        }
        System.out.println("Absolute path: " + file.getAbsolutePath());
        System.out.println("File size: " + file.length());
        // Delete the file
        file.delete();
    }
}

2.2 Byte streams

InputStream and OutputStream are two abstract classes that serve as the base classes of all byte streams. Every concrete byte stream class inherits from one of them.

InputStream, for example, extends Object and implements Closeable:

public abstract class InputStream
extends Object
implements Closeable

InputStream has many subclasses. Here are some of the more common ones:

To elaborate on the classes shown above:

  1. InputStream: the abstract base class of all byte input streams. As an abstract class it cannot be instantiated; it serves as a template that defines the methods every implementation class uses to handle input.
  2. FileInputStream: the file input stream, a very important byte input stream used to read files.
  3. PipedInputStream: the piped byte input stream, used for pipe communication between threads.
  4. ByteArrayInputStream: the byte array input stream, which reads bytes from a byte array (byte[]); in other words, the data source is a byte array held in memory.
  5. FilterInputStream: the decorator base class; concrete decorators inherit from it. These are processing streams that wrap node streams to add special functionality.
  6. DataInputStream: the data input stream, which decorates another input stream to “allow applications to read basic Java data types from the underlying input stream in a machine-independent manner.”
  7. BufferedInputStream: a buffered stream that decorates a node stream. It keeps an internal buffer of bytes and fetches data from the underlying stream a buffer-full at a time rather than one or two bytes at a time, which is more efficient.
  8. ObjectInputStream: the object input stream, which reads back primitive data and objects that have been written to persistent storage. Put simply, it can transfer objects directly and is most often used for deserialization. It is also a processing stream: its constructor takes an InputStream instance.
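As a taste of item 8, here is a minimal round trip through ObjectOutputStream and ObjectInputStream. It is a sketch under stated assumptions: the Note class and the class name ObjectStreamDemo are invented for this illustration, and the objects travel through an in-memory byte array rather than a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ObjectStreamDemo {
    // Hypothetical value class used only for this illustration.
    static class Note implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;
        final int count;

        Note(String text, int count) {
            this.text = text;
            this.count = count;
        }
    }

    // Serializes the object to bytes, then deserializes a copy from those bytes.
    static Note roundTrip(Note original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Note) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Note copy = roundTrip(new Note("hello", 3));
        System.out.println(copy.text + " / " + copy.count);
    }
}
```

Both object streams are processing streams: each constructor wraps another stream, here a byte array stream, which does the actual storage.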

OutputStream class hierarchy:

The OutputStream hierarchy is similar to that of InputStream; the notable addition is PrintStream.

2.3 Character streams

Like byte streams, character streams have two abstract base classes, Reader and Writer. All other character stream classes inherit from one of them.

Taking Reader as an example, its main subclasses are shown below:

A detailed description of each class:

  1. InputStreamReader: the bridge from byte streams to character streams. Its constructor takes an InputStream (for example a FileInputStream); it reads bytes and decodes them into characters using a specified character set. The character set can be specified by name or given explicitly, or the platform’s default character set can be used.
  2. BufferedReader: reads text from a character input stream, buffering the characters to improve efficiency. It typically wraps an InputStreamReader: the constructor of the former takes an instance of the latter.
  3. FileReader: a convenience class for reading character files. new FileReader(File file) is equivalent to new InputStreamReader(new FileInputStream(file)) with the platform’s default character set; this constructor does not let you specify the character encoding or the byte-buffer size.
  4. PipedReader: the piped character input stream, used for pipe communication between threads.
  5. CharArrayReader: a stream that reads data from a char array.
  6. StringReader: a stream that reads data from a String.
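The bridging role of InputStreamReader (item 1) can be sketched as follows. The class name ReaderBridgeDemo and the helper decode are hypothetical, and UTF-8 is assumed as the encoding of the incoming bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ReaderBridgeDemo {
    // Decodes UTF-8 bytes into characters by bridging a byte stream to a Reader.
    static String decode(byte[] utf8Bytes) throws IOException {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8)) {
            StringBuilder sb = new StringBuilder();
            char[] chars = new char[16];
            int length;
            // read(char[]) returns the number of characters read, or -1 at the end
            while ((length = reader.read(chars)) != -1) {
                sb.append(chars, 0, length);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = "\u4e91\u6df1 IO".getBytes(StandardCharsets.UTF_8);
        // 9 bytes go in (3 + 3 + 1 + 1 + 1), 5 characters come out
        System.out.println(bytes.length + " bytes -> " + decode(bytes).length() + " chars");
    }
}
```

Because the Reader decodes whole characters, the caller never sees half of a multi-byte character, whatever the size of the char array.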

Writer mirrors Reader in structure, with the data flowing in the opposite direction. The only notable difference is that Writer has the subclass PrintWriter.

2.4 Serialization

To be continued…

3 IO stream methods

3.1 Byte stream methods

InputStream main methods:

  • read(): Reads the next byte of data from this input stream.
  • read(byte[] b): Reads up to b.length bytes of data from this input stream into a byte array.
  • read(byte[] b, int off, int len): Reads up to len bytes of data from this input stream into a byte array.
  • close(): Closes this input stream and releases all system resources associated with it.

OutputStream main methods:

  • write(byte[] b): Writes b.length bytes from the specified byte array to this output stream.
  • write(byte[] b, int off, int len): Writes len bytes from the specified byte array, starting at offset off, to this output stream.
  • write(int b): Writes the specified byte to this output stream.
  • close(): Closes this output stream and releases all system resources associated with it.
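The overloads listed above can be tried out against in-memory streams. This is only a sketch: ByteMethodsDemo is a made-up class name, and ByteArrayOutputStream and ByteArrayInputStream stand in for file streams so that the return values are easy to follow.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ByteMethodsDemo {
    // Exercises the three write overloads, then the three read overloads.
    static String demo() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(65);                              // write(int b): one byte, 'A'
        out.write(new byte[] {66, 67});             // write(byte[] b): the whole array, 'B' 'C'
        out.write(new byte[] {0, 68, 69, 0}, 1, 2); // write(b, off, len): 2 bytes from offset 1, 'D' 'E'
        out.close();

        InputStream in = new ByteArrayInputStream(out.toByteArray());
        StringBuilder log = new StringBuilder();
        log.append((char) in.read());               // read(): one byte, returned as an int
        byte[] buf = new byte[3];
        int n = in.read(buf);                       // read(byte[] b): up to b.length bytes
        log.append(new String(buf, 0, n, StandardCharsets.US_ASCII));
        n = in.read(buf, 0, 3);                     // read(b, off, len): only one byte is left
        log.append(new String(buf, 0, n, StandardCharsets.US_ASCII));
        log.append(in.read());                      // end of stream: read() returns -1
        in.close();
        return log.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

The important detail is that the array-based read methods return how many bytes were actually read, which can be fewer than requested, so only that many bytes of the array are valid.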

3.2 Character stream methods

Reader main methods:

  • read(): Reads a single character.
  • read(char[] cbuf): Reads characters into an array.
  • read(char[] cbuf, int off, int len): Reads characters into a portion of an array.
  • read(CharBuffer target): Attempts to read characters into the specified character buffer.
  • close(): Closes the stream and releases any system resources associated with it.

Writer main methods:

  • write(char[] cbuf): Writes an array of characters.
  • write(char[] cbuf, int off, int len): Writes a portion of an array of characters.
  • write(int c): Writes a single character.
  • write(String str): Writes a string.
  • write(String str, int off, int len): Writes a portion of a string.
  • flush(): Flushes the stream.
  • close(): Closes the stream, flushing it first.

In addition, the buffered character streams each have a method of their own:

  • newLine() of the BufferedWriter class: Writes a line separator, automatically using the line separator of the current system.
  • readLine() of the BufferedReader class: Reads a line of text.
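These two methods are usually used as a pair. A minimal sketch, assuming in-memory StringWriter and StringReader objects in place of files; LineDemo and roundTrip are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LineDemo {
    // Writes each string as one line, then reads the lines back.
    static List<String> roundTrip(List<String> lines) throws IOException {
        StringWriter target = new StringWriter();
        try (BufferedWriter bw = new BufferedWriter(target)) {
            for (String line : lines) {
                bw.write(line);
                bw.newLine(); // writes the line separator of the current system
            }
        }
        List<String> result = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(target.toString()))) {
            String line;
            while ((line = br.readLine()) != null) { // null marks the end of the stream
                result.add(line);                    // the line terminator is not included
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(Arrays.asList("first line", "second line")));
    }
}
```

Since readLine() strips the line terminator, code that copies text line by line has to call newLine() itself if the line breaks are to be preserved.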

4 Additional Content

4.1 Bits, bytes, and characters

A byte is a unit of measurement for the amount of data; in computing it is used to measure storage capacity. One byte usually equals eight bits.

A character is a letter, digit, word, or symbol used in a computer, such as ‘A’, ‘B’, ‘$’, or ‘&’.

As a rule of thumb, in legacy encodings such as GBK an English letter or symbol occupies one byte and a Chinese character occupies two bytes.

Bytes and characters:

  • In ASCII, one English letter (upper or lower case) is one byte. ASCII itself cannot represent Chinese characters; in ASCII-compatible encodings such as GB2312 and GBK, one Chinese character is two bytes.
  • In UTF-8, an English letter is one byte and a Chinese character is three bytes.
  • Unicode itself assigns code points rather than byte lengths; the size in bytes depends on the encoding form (UTF-8, UTF-16, or UTF-32) described in the other items.
  • Symbols: in GBK, English punctuation is one byte and Chinese punctuation is two bytes. For example, the English full stop “.” takes 1 byte and the Chinese full stop “。” takes 2 bytes.
  • In UTF-16, each letter, digit, or Chinese character takes two bytes (some Chinese characters in the Unicode extension blocks take four bytes).
  • In UTF-32, every character takes four bytes.
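The sizes in this list are easy to check with String.getBytes. The sketch below makes a few assumptions: EncodingSizeDemo is a made-up class name, the big-endian variants UTF-16BE and UTF-32BE are used so that no byte order mark is counted, and UTF-32BE is looked up by name on the assumption that the runtime provides it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSizeDemo {
    // Number of bytes the text occupies in the given encoding.
    static int size(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String letter = "A";
        String hanzi = "\u4e91";            // a common Chinese character
        String extension = "\uD840\uDC00";  // U+20000, a character from the extension blocks

        Charset[] charsets = {
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16BE,
            Charset.forName("UTF-32BE")     // assumed to be available on this runtime
        };
        for (Charset cs : charsets) {
            System.out.println(cs.name()
                    + ": letter=" + size(letter, cs)
                    + ", hanzi=" + size(hanzi, cs)
                    + ", extension=" + size(extension, cs));
        }
    }
}
```

In UTF-8 the three strings take 1, 3, and 4 bytes, and in UTF-16BE they take 2, 2, and 4 bytes, matching the list above.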

4.2 IO stream efficiency comparison

First, compare the efficiency of the plain byte stream with that of the buffered byte stream:

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class MyTest {
    public static void main(String[] args) throws IOException {
        File file = new File("C:/Mu/test.txt");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 3000000; i++) {
            sb.append("abcdefghigklmnopqrstuvwsyz");
        }
        byte[] bytes = sb.toString().getBytes();

        long start = System.currentTimeMillis();
        write(file, bytes);
        long end = System.currentTimeMillis();

        long start2 = System.currentTimeMillis();
        bufferedWrite(file, bytes);
        long end2 = System.currentTimeMillis();

        System.out.println("Common byte stream time: " + (end - start) + " ms");
        System.out.println("Buffered byte stream time: " + (end2 - start2) + " ms");
    }

    // Plain byte stream
    public static void write(File file, byte[] bytes) throws IOException {
        OutputStream os = new FileOutputStream(file);
        os.write(bytes);
        os.close();
    }

    // Buffered byte stream
    public static void bufferedWrite(File file, byte[] bytes) throws IOException {
        BufferedOutputStream bo = new BufferedOutputStream(new FileOutputStream(file));
        bo.write(bytes);
        bo.close();
    }
}

Running results:

Common byte stream time: 250 ms
Buffered byte stream time: 268 ms

The result surprised me. Aren’t buffered streams supposed to be more efficient? To find out why, the only option is to look for the answer in the source code. Here is the write method of the buffered byte stream:

public synchronized void write(byte b[], int off, int len) throws IOException {
    if (len >= buf.length) {
        /* If the request length exceeds the size of the output buffer,
           flush the output buffer and then write the data directly.
           In this way buffered streams will cascade harmlessly. */
        flushBuffer();
        out.write(b, off, len);
        return;
    }
    if (len > buf.length - count) {
        flushBuffer();
    }
    System.arraycopy(b, off, buf, count, len);
    count += len;
}

The comment makes it clear: if the requested length is at least the size of the output buffer, the method flushes the output buffer and then writes the data directly. In this way buffered streams cascade harmlessly.

As for why it is designed this way, I have not figured it out; if you understand the reason, please leave a comment and enlighten me.
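The behavior itself is easy to observe, even if the motive is not stated in the source. One plausible reading, offered here as an interpretation rather than a documented fact, is that copying a large array into the buffer first would add a memory copy without saving any call to the underlying stream. The sketch below uses a hypothetical RecordingSink that records the size of every chunk it receives; the class names and the 1024-byte buffer are choices made for this illustration.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

class LargeWriteDemo {
    // Records the size of every chunk that actually reaches the underlying stream.
    static class RecordingSink extends OutputStream {
        final List<Integer> chunkSizes = new ArrayList<>();

        @Override
        public void write(int b) {
            chunkSizes.add(1);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            chunkSizes.add(len);
        }
    }

    // Writes `piece` to a BufferedOutputStream `times` times and
    // returns the chunk sizes seen by the underlying sink.
    static List<Integer> chunksFor(byte[] piece, int times, int bufferSize) throws IOException {
        RecordingSink sink = new RecordingSink();
        try (BufferedOutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int i = 0; i < times; i++) {
                out.write(piece, 0, piece.length);
            }
        }
        return sink.chunkSizes;
    }

    public static void main(String[] args) throws IOException {
        // Small pieces are collected in the buffer and leave as one chunk
        System.out.println(chunksFor(new byte[100], 2, 1024));
        // Pieces at least as large as the buffer bypass it and are passed through as they are
        System.out.println(chunksFor(new byte[4096], 2, 1024));
    }
}
```

The first call prints [200] and the second prints [4096, 4096]: for large arrays the buffered stream behaves exactly like the plain stream, which is consistent with the nearly identical timings measured above.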

Given this, a fair comparison between the plain byte stream and the buffered byte stream must avoid writing one long array in a single call. The next test is therefore designed differently: copy a file with a plain byte stream and with a buffered byte stream, one byte at a time.

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class MyTest {
    public static void main(String[] args) throws IOException {
        File data = new File("C:/Mu/data.zip");
        File a = new File("C:/Mu/a.zip");
        File b = new File("C:/Mu/b.zip");

        long start = System.currentTimeMillis();
        copy(data, a);
        long end = System.currentTimeMillis();

        long start2 = System.currentTimeMillis();
        bufferedCopy(data, b);
        long end2 = System.currentTimeMillis();

        System.out.println("Common byte stream time: " + (end - start) + " ms");
        System.out.println("Buffered byte stream time: " + (end2 - start2) + " ms");
    }

    // Plain byte stream, one byte at a time
    public static void copy(File in, File out) throws IOException {
        // Wrap the data source and the destination
        InputStream is = new FileInputStream(in);
        OutputStream os = new FileOutputStream(out);
        int by = 0;
        while ((by = is.read()) != -1) {
            os.write(by);
        }
        is.close();
        os.close();
    }

    // Buffered byte stream, one byte at a time
    public static void bufferedCopy(File in, File out) throws IOException {
        // Wrap the data source and the destination
        BufferedInputStream bi = new BufferedInputStream(new FileInputStream(in));
        BufferedOutputStream bo = new BufferedOutputStream(new FileOutputStream(out));
        int by = 0;
        while ((by = bi.read()) != -1) {
            bo.write(by);
        }
        bo.close();
        bi.close();
    }
}

Running results:

Common byte stream time: 184867 ms
Buffered byte stream time: 752 ms

This time the difference between the plain byte stream and the buffered byte stream is obvious: about 245 times.

Now let’s compare the efficiency of the plain character stream with that of the buffered character stream:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Reader;
import java.io.Writer;

public class IOTest {
    public static void main(String[] args) throws IOException {
        // Data preparation
        dataReady();

        File data = new File("C:/Mu/data.txt");
        File a = new File("C:/Mu/a.txt");
        File b = new File("C:/Mu/b.txt");
        File c = new File("C:/Mu/c.txt");

        long start = System.currentTimeMillis();
        copy(data, a);
        long end = System.currentTimeMillis();

        long start2 = System.currentTimeMillis();
        copyChars(data, b);
        long end2 = System.currentTimeMillis();

        long start3 = System.currentTimeMillis();
        bufferedCopy(data, c);
        long end3 = System.currentTimeMillis();

        System.out.println("Common character stream 1 time: " + (end - start) + " ms, file size: " + a.length() / 1024 + " KB");
        System.out.println("Common character stream 2 time: " + (end2 - start2) + " ms, file size: " + b.length() / 1024 + " KB");
        System.out.println("Buffered character stream time: " + (end3 - start3) + " ms, file size: " + c.length() / 1024 + " KB");
    }

    // Plain character stream, one character at a time
    public static void copy(File in, File out) throws IOException {
        Reader reader = new FileReader(in);
        Writer writer = new FileWriter(out);
        int ch = 0;
        while ((ch = reader.read()) != -1) {
            writer.write((char) ch);
        }
        reader.close();
        writer.close();
    }

    // Plain character stream, one char array at a time
    public static void copyChars(File in, File out) throws IOException {
        Reader reader = new FileReader(in);
        Writer writer = new FileWriter(out);
        char[] chs = new char[1024];
        // Note: the whole array is written even when fewer characters were read,
        // so the last chunk is padded; writer.write(chs, 0, len) would avoid this.
        while ((reader.read(chs)) != -1) {
            writer.write(chs);
        }
        reader.close();
        writer.close();
    }

    // Buffered character stream, one line at a time
    public static void bufferedCopy(File in, File out) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader(in));
        BufferedWriter bw = new BufferedWriter(new FileWriter(out));
        String line = null;
        while ((line = br.readLine()) != null) {
            bw.write(line);
            bw.newLine();
            bw.flush();
        }
        // Release resources
        bw.close();
        br.close();
    }

    // Data preparation
    public static void dataReady() throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 600000; i++) {
            sb.append("abcdefghijklmnopqrstuvwxyz");
        }
        OutputStream os = new FileOutputStream(new File("C:/Mu/data.txt"));
        os.write(sb.toString().getBytes());
        os.close();
        System.out.println("finished");
    }
}

Running results:

Common character stream 1 time: 1337 ms, file size: 15234 KB
Common character stream 2 time: 82 ms, file size: 15235 KB
Buffered character stream time: 205 ms, file size: 15234 KB

I ran the test several times and the results were similar. Evidently the buffered character stream brings no significant efficiency gain; we use it mostly for its readLine() and newLine() methods.