This article contains 9,185 words. Estimated reading time: 15 minutes
IO and NIO are essential parts of Java. Reading and writing files on disk, reading and writing data over the network: any time you work with files or data transfer, you are dealing with IO and NIO. So let's learn IO and NIO together. The blog will keep publishing more articles; if you find this one useful, feel free to give it a like.
IO stream learning
As we all know, the superclasses in IO are the byte streams InputStream and OutputStream and the character streams Reader and Writer. Let's first look at the IO streams as a whole.
Byte stream input and output comparison diagram:
Character stream input and output comparison diagram:
Structure diagram of classification by operation object:
IO stands for Input/Output: input and output, seen from the point of view of memory.
-
Input means reading data from outside into memory, for example reading a file from disk into memory or reading data from the network into memory.
-
Output means writing data from memory to the outside, for example writing data from memory to a file or to the network.
**Why read data into memory?** Because our Java code runs in memory, data must be read into memory before it can be processed, in the form of strings, byte arrays, and so on.
What is the difference between byte streams and character streams, and why do they exist?
First, let's pin down the sizes of a byte and a character:
-
1 byte = 8 bits
-
1 char = 2 bytes = 16 bits (Java uses UTF-16 internally)
A bit really is the smallest unit of data, but a single bit carries too little information.
To represent anything useful you need several bits together, so in most cases the byte is the smallest basic unit of data (single-bit registers exist only at the hardware level). The primitive types we are familiar with are all multiples of 8 bits (that is, 1 byte): short is 2 bytes, int is 4 bytes, long is 8 bytes, and so on.
Originally, in the Western world, there was hardly a character problem at all: one byte was enough. A byte has 8 bits and can encode up to 256 values, which easily covers the 26 English letters plus common symbols and punctuation. This is the familiar ASCII code (which itself uses only the first 128 values).
However, one byte cannot cover the languages of many other countries, which led to a variety of encodings such as ISO-8859-1, GBK, UTF-8, UTF-16, and many others (not covered here). In short, everything is a byte stream; you could say there is really no such thing as a character stream. A character is simply the byte stream translated according to an encoding. InputStream and OutputStream are the basis of everything: the data that actually flows is always a byte stream, and a character stream is obtained by decoding that byte stream.
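To make the byte/character distinction concrete, here is a minimal sketch that counts how many bytes the same text occupies under different encodings. The class and method names (`EncodingSizes`, `utf8Length`, `utf16Length`, `misreadAsLatin1`) are made up for illustration; only `String.getBytes` and `StandardCharsets` come from the JDK.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows that a "character" is just bytes plus an encoding.
class EncodingSizes {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies when encoded as UTF-16 (big-endian, no BOM).
    static int utf16Length(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // Decodes UTF-8 bytes with the wrong charset to show how the result changes.
    static int misreadAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1).length();
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d"; // the Chinese character "zhong"
        System.out.println(utf8Length("A"));          // 1
        System.out.println(utf8Length(chinese));      // 3
        System.out.println(utf16Length(chinese));     // 2
        System.out.println(misreadAsLatin1(chinese)); // 3 (one bogus char per byte)
    }
}
```

The same three bytes are one character under UTF-8 and three meaningless characters under ISO-8859-1, which is why the encoding always has to be known when turning a byte stream into characters.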
-
Reading from a byte stream returns one byte per read. A character stream uses a byte stream underneath and reads one or more bytes per character, depending on the encoding (a Chinese character takes two bytes in GBK and three bytes in UTF-8).
-
A plain byte stream has no buffer of its own and writes directly to its destination, whereas a character stream writes into a buffer first. So byte-stream output appears even if close() is never called, while character-stream output is only written out when the buffer is flushed, which close() does. To push output through while a character stream is still open, call flush() manually.
-
Byte streams are the most basic. All subclasses of InputStream and OutputStream are byte streams; they process binary data byte by byte. Character streams are the subclasses of Reader and Writer; they work with characters, character arrays, and strings, decoding bytes into 2-byte (UTF-16) char values.
-
Byte streams and character streams are linked through InputStreamReader and OutputStreamWriter (the conversion streams); underneath, this is the link between byte[] and String.
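The buffering and bridging points in the list above can be tried out with in-memory streams. The following is a sketch under a few assumptions: `BridgeDemo`, `sizesAroundFlush`, and `decode` are hypothetical names, and the in-memory `ByteArrayOutputStream`/`ByteArrayInputStream` stand in for a real file. The `OutputStreamWriter` documentation states that encoded bytes are accumulated in a buffer before being written, so the "before flush" size is typically 0 for short text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for the conversion streams.
class BridgeDemo {
    // Returns {bytes visible in the sink before flush(), bytes visible after flush()}.
    static int[] sizesAroundFlush(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            writer.write(text);        // characters go into the writer's internal buffer
            int before = sink.size();  // usually still 0 for short text
            writer.flush();            // push the encoded bytes into the byte stream
            int after = sink.size();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decodes UTF-8 bytes back into a String through an InputStreamReader.
    static String decode(byte[] utf8Bytes) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8)) {
            StringBuilder text = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) { // read() returns one char, or -1 at the end
                text.append((char) c);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Passing the charset explicitly, rather than relying on the platform default, keeps the byte[] to String mapping the same on every machine.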
Next we’ll learn to use byte streams and character streams
Before that, I need to introduce the File and Path objects. Files are a very important form of storage in a computer system. Java's standard library java.io provides the File object for working with files and directories. We construct a File object by passing in a path (either relative or absolute). Windows uses \ as the path separator, so a Java string needs \\ to represent a single \; Linux uses / as the path separator.
File f = new File("d:\\test.txt");
A File object can represent either a file or a directory.
Note in particular that constructing a File object does not cause any disk operation, even if the file or directory passed in does not exist. Disk operations only happen when we call certain methods of the File object. The File object lets you create and delete files, traverse files and directories, and so on. The Java standard library also provides a Path object in the java.nio.file package. Path is similar to File, but simpler to work with:
Path p1 = Paths.get(".", "a", "b"); // Construct a Path object
Path p2 = p1.toAbsolutePath(); // Convert to an absolute path
Path p3 = p2.normalize(); // Convert to the canonical path
File f = p3.toFile(); // Convert to a File object
for (Path p : Paths.get("..").toAbsolutePath()) { // Iterate over the elements of the path
    System.out.println("  " + p);
}
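The claim that constructing a File touches nothing on disk is easy to check. This is a small sketch that works in a temporary directory; `FileLazyDemo` and `lifecycle` are illustrative names, not part of any library.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: a File object is only a path until a method hits the disk.
class FileLazyDemo {
    // Returns {exists right after construction, exists after create, exists after delete}.
    static boolean[] lifecycle() {
        try {
            Path dir = Files.createTempDirectory("io-demo");
            File file = new File(dir.toFile(), "test.txt"); // no disk operation yet
            boolean afterConstruct = file.exists();         // first real disk access
            file.createNewFile();                           // actually creates the file
            boolean afterCreate = file.exists();
            file.delete();                                  // actually removes it
            boolean afterDelete = file.exists();
            Files.delete(dir);                              // clean up the temp directory
            return new boolean[] {afterConstruct, afterCreate, afterDelete};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```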
Streams can be divided into byte streams and character streams. Functionally, they can also be divided into node streams (which read or write data at a specific location, or node) and processing streams (which wrap and enhance an existing stream).
1. Byte input stream: InputStream
InputStream is the parent class of all input byte streams. It is an abstract class
-
FileInputStream: file input stream, usually used to read files
-
BufferedInputStream: a buffered stream that decorates another input stream with an internal byte buffer. Data is fetched from the underlying stream in large blocks to refill the buffer, instead of one or two bytes at a time, which is more efficient. (For large files, BufferedInputStream performs better than a bare FileInputStream.)
-
ByteArrayInputStream: a byte array input stream. It reads bytes from a byte array (byte[]); that is, the resource is first stored in a byte array, and we then read it back from that array
-
PipedInputStream: pipe byte input stream, used with PipedOutputStream to implement pipe communication between multiple threads
-
DataInputStream: a data input stream that decorates other input streams and “allows applications to read basic Java data types from the underlying input stream in a machine-independent manner.”
-
ObjectInputStream: an object input stream that supports persistent storage of “basic data or objects”. In plain terms, it can read whole objects directly (used in deserialization)
-
FilterInputStream: the decorator base class in the Decorator pattern. Concrete decorators inherit from it, so all of its subclasses are used to decorate other streams; they are processing streams
2. Byte output stream: OutputStream
OutputStream is the parent of all byte output streams. It is an abstract class (the counterpart of InputStream above; the two are usually used in pairs).
ByteArrayOutputStream and FileOutputStream are two basic media streams that write data to byte arrays and local files, respectively. PipedOutputStream writes data to a pipe shared with other threads. BufferedOutputStream is a buffered stream that decorates another stream with a buffer and is better suited to handling large files
ObjectOutputStream (used in serialization) and all subclasses of FilterOutputStream are decorator streams
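As an illustration of how node streams and decorator streams combine, the sketch below stacks DataOutputStream on BufferedOutputStream on ByteArrayOutputStream, then reads the data back through the matching input stack. `DecoratorDemo`, `write`, and `read` are hypothetical names, and the byte array stands in for a file; with a real file the innermost stream would be a FileOutputStream or FileInputStream.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class: node stream (byte array) + decorators (buffer, data types).
class DecoratorDemo {
    // Writes an int, a double and a string in a machine-independent binary format.
    static byte[] write(int id, double price, String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new BufferedOutputStream(bytes))) {
            out.writeInt(id);       // 4 bytes
            out.writeDouble(price); // 8 bytes
            out.writeUTF(name);     // 2-byte length prefix + encoded characters
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The try block has closed (and therefore flushed) the buffered stream by now.
        return bytes.toByteArray();
    }

    // Reads the fields back in exactly the order they were written.
    static String read(byte[] data) {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new ByteArrayInputStream(data)))) {
            int id = in.readInt();
            double price = in.readDouble();
            String name = in.readUTF();
            return id + "," + price + "," + name;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each layer adds one capability (buffering, typed reads and writes) without the inner stream knowing about it, which is the point of the decorator design.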
Which do you prefer to use, byte stream or character stream?
Personally, I prefer character streams because they are newer. Many features of character streams do not exist in byte streams. For example, with BufferedReader (rather than BufferedInputStream or DataInputStream) we can call readLine() to read the next line, whereas with a byte stream we have to do extra work
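A minimal sketch of line-by-line reading with BufferedReader follows. `ReadLinesDemo` and `lines` are illustrative names; the method accepts any Reader, so a FileReader or InputStreamReader could be passed in place of the StringReader used in `main`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: reading text line by line with a character stream.
class ReadLinesDemo {
    static List<String> lines(Reader source) {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // readLine() strips the line terminator and returns null at end of input
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Mixed Windows (\r\n) and Unix (\n) line endings are both handled.
        System.out.println(lines(new StringReader("a\r\nb\nc"))); // [a, b, c]
    }
}
```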
IO utility classes: IOUtils.readLines() and FileUtils.readFileToString() (Apache Commons IO), and readAllLines() in the JDK's Files utility class.
Java's IO standard library provides InputStream implementations for different data sources, including but not limited to:
-
FileInputStream: Reads data from a file and is the final data source
-
ServletInputStream: Reads data from HTTP requests and is the final data source
-
Socket.getInputStream(): Reads data from the TCP connection and is the final data source
The Filter pattern for adding functionality:
The JDK divides InputStream into two broad categories. One is the basic InputStream that provides data directly, for example
FileInputStream, ByteArrayInputStream, and ServletInputStream. The other is the InputStream that provides additional functionality, for example BufferedInputStream, DigestInputStream, and CipherInputStream.
When we need to add various functions to a “base” InputStream, we first decide which InputStream provides the data source and then wrap it in the classes that provide the additional functions. When several FilterInputStreams are stacked, we only need to hold the outermost InputStream. When the outermost InputStream is closed (automatically, at the end of a try-with-resources block), the close() method of each inner InputStream is called in turn, all the way down to the core “base” InputStream, so no resource is leaked
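The close() propagation described above can be observed with a base stream that records when it is closed. This is a sketch: `CloseChainDemo` and `TrackedInput` are hypothetical helper names written for this example.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class: closing the outermost stream closes the whole chain.
class CloseChainDemo {
    // A "base" stream that remembers whether close() reached it.
    static class TrackedInput extends ByteArrayInputStream {
        boolean closed = false;

        TrackedInput(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Wraps the base stream twice, reads one int, and reports whether the base was closed.
    static boolean innerClosedAfterTry() {
        TrackedInput base = new TrackedInput(new byte[] {0, 0, 0, 42});
        // We only hold a reference to the outermost stream inside the try header.
        try (DataInputStream in = new DataInputStream(new BufferedInputStream(base))) {
            in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return base.closed;
    }
}
```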
The path of a configuration file differs between environments. How should the file be read (is there a path-independent way to read it)?
Reading files from the classpath avoids the problem of inconsistent file paths across environments: if we put the xxx.properties file on the classpath, we don't care where it is actually stored. For resource files on the classpath, the path always begins with a /. We get the current Class object and then call getResourceAsStream() to read any resource file directly from the classpath. If the resource file does not exist, the method returns null. We therefore need to check whether the returned InputStream is null; if it is, the resource file was not found on the classpath
try (InputStream input = getClass().getResourceAsStream("/xxx.properties")) {
    if (input != null) {
        // TODO:
    }
}
What does SequenceInputStream do?
This is useful when copying multiple files into one target file. SequenceInputStream can merge two or more InputStreams into one. It first reads all the bytes of the first InputStream and then all the bytes of the second. That is why it is called SequenceInputStream: the InputStream instances are read in sequence
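Here is a short sketch of SequenceInputStream merging two sources. `SequenceDemo` and `concat` are made-up names, and the two in-memory streams stand in for the two files being copied.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: two input streams read back-to-back as one.
class SequenceDemo {
    static String concat(String first, String second) {
        InputStream a = new ByteArrayInputStream(first.getBytes(StandardCharsets.UTF_8));
        InputStream b = new ByteArrayInputStream(second.getBytes(StandardCharsets.UTF_8));
        // Closing the merged stream also closes the underlying streams.
        try (InputStream merged = new SequenceInputStream(a, b)) {
            ByteArrayOutputStream target = new ByteArrayOutputStream();
            byte[] buffer = new byte[64];
            int n;
            // read() drains the first stream, then continues with the second
            while ((n = merged.read(buffer)) != -1) {
                target.write(buffer, 0, n);
            }
            return new String(target.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```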
Serialization in IO streams
Serialization means turning a Java object into binary content, essentially a byte[] array. Why serialize a Java object? Because a byte[] can be saved to a file or sent over the network, serialization lets Java objects be stored in files or transferred to remote machines. Where there is serialization, there is deserialization: converting binary content (a byte[] array) back into a Java object. With deserialization, a byte[] saved in a file can be turned back into a Java object, and so can a byte[] read from the network
Security: because Java's serialization mechanism can create an instance directly from a byte[] array without calling a constructor, it has security implications. A carefully crafted byte[] array can cause specific Java code to run during deserialization, which is a serious security vulnerability. In fact, Java's built-in object serialization and deserialization mechanism has both security and compatibility problems. A better approach is to serialize through a generic data format such as JSON, which outputs only the contents of basic types (including String) and stores no code-related information
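The round trip, and the point that deserialization does not run the class's own constructor, can be sketched as follows. `SerializationDemo`, `Person`, and the `constructorCalls` counter are hypothetical names invented for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class: object -> byte[] -> object.
class SerializationDemo {
    // Counts how often the Person constructor actually runs.
    static int constructorCalls = 0;

    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int age;

        Person(String name, int age) {
            constructorCalls++;
            this.name = name;
            this.age = age;
        }
    }

    // Serialization: turn the object into a byte[] array.
    static byte[] serialize(Person person) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(person);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Deserialization: rebuild the object from the byte[] array.
    // Only call this on data you produced yourself; untrusted bytes are dangerous.
    static Person deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Person) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After a round trip, the copy carries the same field values while `constructorCalls` stays unchanged, so any validation placed in the constructor is bypassed. That is one reason the text recommends a plain data format such as JSON.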
Examples of byte stream code:
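import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public static List<String> readFile1(File file) {
    List<String> list = new ArrayList<>();
    StringBuilder stringBuilder = new StringBuilder();
    // try-with-resources closes the stream even if reading fails
    try (InputStream inputStream = new FileInputStream(file)) {
        int byteToInt;
        // read() returns one byte (0-255) per call, or -1 at the end of the file
        while ((byteToInt = inputStream.read()) != -1) {
            // Casting a byte to char is only correct for single-byte (ASCII/Latin-1) text
            char c = (char) byteToInt;
            if (c != '\r' && c != '\n') {
                stringBuilder.append(c);
            } else if (stringBuilder.length() > 0) {
                // A line break ends the current line
                list.add(stringBuilder.toString());
                stringBuilder.setLength(0);
            }
        }
        // Keep the last line when the file does not end with a line break
        if (stringBuilder.length() > 0) {
            list.add(stringBuilder.toString());
        }
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
    return list;
}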
3. Character input stream: Reader
Reader is the parent of all input character streams. It is an abstract class. The purpose and methods of the Reader classes are basically the same as those of the InputStream classes, because many of the Reader implementation classes are implemented using the underlying methods of the InputStream implementation classes
-
InputStreamReader is a bridge between a byte stream and a character stream that converts a byte stream into a character stream
-
FileReader is a character stream for files; its source code shows that it simply wraps a FileInputStream and converts it into a Reader. We can pick up some tricks from this class
-
BufferedReader is obviously a decorator, and it and its subclasses decorate other Reader objects. (BufferedReader is more efficient with large files wrapped than FileReader)
-
CharArrayReader and StringReader are two basic media streams that read data from a char array and a String, respectively
-
PipedReader reads data from pipes shared with other threads
-
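FilterReader is the parent of all custom concrete decorator streams. Its subclass, PushbackReader, decorates a Reader object with the ability to push characters back into the stream (line numbers are tracked by LineNumberReader, a subclass of BufferedReader)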
4. Character output stream: Writer
Writer is the parent of all output character streams and is an abstract class. The functions and methods of each class in Writer are basically the same as those of the classes in OutputStream, because many of the methods of the implementation class of Writer are implemented using the underlying methods of the implementation class of OutputStream
-
FileWriter is a character stream that operates on files
-
BufferedWriter is a decorator that provides buffering for writers. (Compared with FileWriter, BufferedWriter is more efficient in handling large files.)
-
CharArrayWriter and StringWriter are two basic media streams that write data to Char arrays and Strings, respectively
-
PipedWriter writes data to a pipe shared with other threads
-
PrintWriter and PrintStream are very similar in function and usage
-
OutputStreamWriter is a bridge between OutputStream and Writer, and its subclass FileWriter is a concrete class that implements this function
Character stream and byte stream conversion:
InputStreamReader converts bytes to characters, and OutputStreamWriter converts characters to bytes; together they are the conversion streams. A text file is stored on disk as bytes; InputStreamReader reads that byte stream and turns it into a character stream for the program to process. The character stream produced by the program is turned back into a byte stream by OutputStreamWriter for storage. (Both can convert using a specified encoding.)
The System class supports IO:
For some frequently used device interactions, the Java system provides three stream objects that can be used directly:
-
System.in (standard input): usually keyboard input
-
System.out (standard output): usually written to the display
-
System.err (standard error output): usually written to the display
What are piped streams used for?
The main purpose of piped streams is communication between two threads. One thread writes to the piped output stream and the other reads from the piped input stream. Just call connect() to join the two ends before starting the threads. This is a very convenient way to implement communication between two threads
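A minimal sketch of two threads talking through a pipe is shown below. `PipeThreadsDemo` and `sendThroughPipe` are illustrative names; the producer thread writes and closes its end, and the calling thread reads until the end of the stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one thread writes into the pipe, another reads from it.
class PipeThreadsDemo {
    static String sendThroughPipe(String message) {
        try {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream();
            in.connect(out); // join the two ends before starting the thread

            Thread producer = new Thread(() -> {
                // Closing the output end tells the reader that no more data will come.
                try (OutputStream o = out) {
                    o.write(message.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            producer.start();

            ByteArrayOutputStream received = new ByteArrayOutputStream();
            try (InputStream i = in) {
                int b;
                while ((b = i.read()) != -1) { // blocks until data arrives or the pipe is closed
                    received.write(b);
                }
            }
            producer.join();
            return new String(received.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Reading and writing both ends from the same thread is not recommended, since the pipe's internal buffer is small and a single thread can deadlock on it.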
RandomAccessFile?
It is a special class in the java.io package that is neither an input stream nor an output stream, yet it can do both jobs. It is a direct subclass of Object. Typically a stream has only one function, either reading or writing, but RandomAccessFile can read files as well as write them. The methods of DataInputStream and DataOutputStream are also available in RandomAccessFile
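The read-and-write nature of RandomAccessFile, plus its ability to jump to any offset with seek(), can be sketched like this. `RandomAccessDemo` and `overwriteMiddle` are made-up names, and the temporary file exists only for the demonstration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: write, seek back, overwrite, and read with one object.
class RandomAccessDemo {
    static String overwriteMiddle() {
        try {
            Path file = Files.createTempFile("raf-demo", ".bin");
            try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
                raf.writeBytes("0123456789"); // write 10 ASCII bytes
                raf.seek(3);                  // jump back to offset 3
                raf.writeBytes("abc");        // overwrite the bytes at offsets 3..5
                raf.seek(0);                  // jump to the start for reading
                byte[] all = new byte[(int) raf.length()];
                raf.readFully(all);           // same method name as in DataInputStream
                return new String(all, StandardCharsets.US_ASCII);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(overwriteMiddle()); // 012abc6789
    }
}
```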
NIO stream learning
Before we start, we need to learn three models of network communication: IO, NIO, and AIO.
IO (synchronous blocking): the traditional network communication model, also known as BIO (blocking IO). Each client connects to the server's ServerSocket, and the server creates a Socket and a dedicated thread to communicate with that client. Communication between client and server is blocking: while the server has not returned data, the client cannot do anything else. When there are too many client requests, the server becomes overloaded and crashes. This model suits scenarios with few connections, each of which is resource-intensive.
Lao Li is boiling water: he stands by the kettle and waits, unable to do anything else, until the water boils; only then does he move on to the next thing
NIO (synchronous non-blocking): NIO is synchronous non-blocking IO based on the Reactor model. In effect, a single thread handles a large number of client requests: one thread polls many channels, each time picks up the batch of channels that have events, and then hands each request to a thread for processing. The core idea is non-blocking: a single selector thread can poll the channels over and over, and client requests are never blocked; they simply come in and, at most, wait in a queue. The key optimization over BIO is this: a client is not exchanging data all the time, so there is no need to keep a thread idle for it. Instead, the thread is released, and only when the client actually does something is a notification raised and a thread assigned to handle the request
Lao Li is boiling water again and feels that what he did before was a bit silly. So he walks away and comes back every so often to check whether the water has boiled. If it has not, he carries on with other things; if it has, he collects the water
AIO (asynchronous non-blocking): AIO is NIO 2, an improved version of NIO introduced in Java 7, and it is an asynchronous, non-blocking IO model. Asynchronous IO is implemented with events and callbacks: the application issues an operation and the call returns immediately without blocking; when the background processing completes, the operating system notifies the appropriate thread to carry on with the follow-up work. AIO is short for asynchronous IO. Although NIO provides non-blocking network operations, its IO behavior is still synchronous: the business thread is notified when an IO operation is ready, and then that thread performs the IO itself. From what I could find online, AIO is not widely used so far; Netty tried AIO and then abandoned it
Lao Li is boiling water once more and still finds it a hassle, so he buys a whistling kettle online. He walks away to do other things; when the water boils, the kettle whistles, and Lao Li hears it and comes to collect the water
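The "whistling kettle" callback style can be sketched with AsynchronousFileChannel and a CompletionHandler. This is only an illustration under stated assumptions: `AioDemo` and `readAsync` are hypothetical names, the temporary file stands in for a slow data source, and the content is assumed to fit into the 1024-byte buffer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class: issue a read, return at once, get called back when it is done.
class AioDemo {
    static String readAsync(String content) {
        try {
            Path file = Files.createTempFile("aio-demo", ".txt");
            try (AsynchronousFileChannel channel = prepare(file, content)) {
                ByteBuffer buffer = ByteBuffer.allocate(1024); // assumes content fits
                CompletableFuture<String> result = new CompletableFuture<>();

                // read() returns immediately; the handler is invoked on completion.
                channel.read(buffer, 0, buffer, new CompletionHandler<Integer, ByteBuffer>() {
                    @Override
                    public void completed(Integer bytesRead, ByteBuffer buf) {
                        buf.flip(); // switch the buffer from filling to draining
                        result.complete(StandardCharsets.UTF_8.decode(buf).toString());
                    }

                    @Override
                    public void failed(Throwable error, ByteBuffer buf) {
                        result.completeExceptionally(error);
                    }
                });

                // The calling thread is free to do other work here; we simply wait.
                return result.get(5, TimeUnit.SECONDS);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException | InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        }
    }

    // Writes the demo content and opens the file for asynchronous reading.
    private static AsynchronousFileChannel prepare(Path file, String content) throws IOException {
        Files.write(file, content.getBytes(StandardCharsets.UTF_8));
        return AsynchronousFileChannel.open(file, StandardOpenOption.READ);
    }
}
```

The thread that calls `read` never performs the IO itself, which is the difference from the NIO model described earlier.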
JDK 1.7 added asynchronous channels under the java.nio.channels package, such as AsynchronousSocketChannel, AsynchronousServerSocketChannel, and AsynchronousFileChannel.
Now to the main topic:
NIO in Java is built from three core components: Channels, Buffers, and Selectors. Other components, such as Pipe and FileLock, are just utility classes used together with these three
Whereas traditional IO operates on byte streams and character streams, NIO operates on channels and buffers, where data is always read from a Channel into a Buffer or written from a Buffer into a Channel. A Selector is used to listen for events on multiple channels (such as a connection opening, data arrival), so a single thread can listen for multiple data channels
IO is stream-oriented, NIO is buffer-oriented. Stream-oriented means that one or more bytes are read from the stream at a time until all bytes have been read; the read blocks, and the bytes are not cached anywhere. You cannot move back and forth in the data of a stream; if you need to, you must first cache the data in a buffer yourself. Buffer-oriented means that data is read into a buffer and processed later, and you can move back and forth in the buffer as needed, which makes processing more flexible. Because IO is stream-oriented in this way, IO is blocking, while NIO is non-blocking. With non-blocking IO, a thread can use the time it would otherwise spend waiting to perform IO on other channels, so a single thread can manage multiple input and output channels.
Channel
First, the Channel. A Channel is at the same level as a Stream in IO, except that a Stream is one-way (for example InputStream and OutputStream), while a Channel is bidirectional: data can be read from a Channel into a Buffer or written from a Buffer into a Channel
Below are the main Channel implementations in Java NIO, covering UDP and TCP network IO as well as file IO
-
FileChannel
-
DatagramChannel
-
SocketChannel
-
ServerSocketChannel
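Of the channels listed above, FileChannel is the easiest to try without a network. The sketch below writes from a Buffer into the channel and then reads from the same channel back into a Buffer, showing that one Channel works in both directions. `FileChannelDemo` and `roundTrip` are illustrative names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: one bidirectional channel, data always passes through a buffer.
class FileChannelDemo {
    static String roundTrip(String text) {
        try {
            Path file = Files.createTempFile("channel-demo", ".txt");
            try (FileChannel channel = FileChannel.open(
                    file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Buffer -> Channel
                ByteBuffer out = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
                while (out.hasRemaining()) {
                    channel.write(out);
                }

                // Channel -> Buffer, using the same channel object
                channel.position(0);
                ByteBuffer in = ByteBuffer.allocate((int) channel.size());
                while (in.hasRemaining() && channel.read(in) != -1) {
                    // keep reading until the buffer is full or the file ends
                }
                in.flip(); // switch the buffer from filling to draining
                return StandardCharsets.UTF_8.decode(in).toString();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```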
Buffer
The key Buffer implementations in NIO are ByteBuffer, CharBuffer, DoubleBuffer, FloatBuffer, IntBuffer, LongBuffer, and ShortBuffer, corresponding to the primitive types byte, char, double, float, int, long, and short respectively. Java NIO also has MappedByteBuffer, which represents a memory-mapped file
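What makes a Buffer different from a stream is that it tracks a position and a limit, so data can be revisited. The following sketch records those two values after each common operation; `BufferDemo` and `trace` are hypothetical names, and each entry has the form position/limit.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: how put/flip/get/rewind/clear move position and limit.
class BufferDemo {
    static List<String> trace() {
        List<String> states = new ArrayList<>();
        ByteBuffer buffer = ByteBuffer.allocate(8);
        states.add(state(buffer)); // empty buffer, ready for writing

        buffer.put((byte) 1).put((byte) 2).put((byte) 3);
        states.add(state(buffer)); // three bytes written

        buffer.flip();             // limit = old position, position = 0: ready for reading
        states.add(state(buffer));

        buffer.get();              // consume one byte
        states.add(state(buffer));

        buffer.rewind();           // go back to the start and read the same data again
        states.add(state(buffer));

        buffer.clear();            // position = 0, limit = capacity: ready for writing again
        states.add(state(buffer));
        return states;
    }

    private static String state(ByteBuffer buffer) {
        return buffer.position() + "/" + buffer.limit();
    }

    public static void main(String[] args) {
        System.out.println(trace()); // [0/8, 3/8, 0/3, 1/3, 0/3, 0/8]
    }
}
```

Forgetting flip() before reading is the most common mistake: the position would still point past the written data, and the read would return whatever lies between it and the limit.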
Selector
Selector allows a single thread to handle multiple channels. If your application has multiple connections (channels) open, but each connection has low traffic, using Selector can be handy. For example, in a chat server. Here’s an illustration of handling three channels in a single thread with a Selector:
To use a Selector, you register a Channel with a Selector and then call its select() method. This method blocks until a registered channel is ready for an event. Once the method returns, the thread can process the events, such as new connections coming in, data receiving, and so on
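The register-then-select loop can be sketched without any network by using three Pipe channels as stand-ins for three connections. Everything named here (`SelectorDemo`, `pollThreeChannels`, the `channel-N` labels) is made up for the example; in a real server the registered channels would be a ServerSocketChannel and SocketChannels instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class: one thread watches three channels through one Selector.
class SelectorDemo {
    static Map<String, String> pollThreeChannels() {
        Map<String, String> received = new TreeMap<>();
        try (Selector selector = Selector.open()) {
            Pipe[] pipes = {Pipe.open(), Pipe.open(), Pipe.open()};
            for (int i = 0; i < pipes.length; i++) {
                // A channel must be non-blocking before it can be registered.
                pipes[i].source().configureBlocking(false);
                pipes[i].source().register(selector, SelectionKey.OP_READ, "channel-" + i);
            }

            // Only channels 0 and 2 receive data; channel 1 stays idle.
            pipes[0].sink().write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8)));
            pipes[2].sink().write(ByteBuffer.wrap("world".getBytes(StandardCharsets.UTF_8)));

            long deadline = System.currentTimeMillis() + 2000;
            while (received.size() < 2 && System.currentTimeMillis() < deadline) {
                selector.select(200); // blocks until a channel is ready or 200 ms pass
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove(); // handled keys must be removed by hand
                    if (key.isReadable()) {
                        ByteBuffer buffer = ByteBuffer.allocate(64);
                        int n = ((Pipe.SourceChannel) key.channel()).read(buffer);
                        if (n > 0) {
                            buffer.flip();
                            received.put((String) key.attachment(),
                                    StandardCharsets.UTF_8.decode(buffer).toString());
                        }
                    }
                }
            }
            for (Pipe pipe : pipes) {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return received;
    }
}
```

The idle channel costs nothing: the single thread only touches channels that the selector reports as ready.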
Conclusion
If you found this article helpful, please don't be shy about giving it a like and a share!