First, an overview

NIO has three core parts: channels, buffers, and selectors.

The first big difference between NIO and traditional IO is that IO is stream-oriented while NIO is buffer-oriented. Stream-oriented means that one or more bytes are read from the stream at a time, until all bytes have been read, and they are not cached anywhere. You also cannot move back and forth through the data in a stream; if you need to do that, you must first cache the data in a buffer yourself. NIO’s buffer-oriented approach is slightly different: data is read into a buffer and processed later, and you can move back and forth within the buffer as needed. This adds flexibility. However, you need to check that the buffer contains all the data you need before processing it, and you need to make sure that reading more data into the buffer does not overwrite data that has not been processed yet.

IO streams are blocking. When a thread calls read() or write(), it blocks until some data has been read or the data has been written completely, and it can do nothing else in the meantime. In NIO’s non-blocking mode, a thread can ask a channel to read data and get only what is currently available; if nothing is available, it gets nothing. Rather than staying blocked until data becomes readable, the thread can go on doing other things. The same is true for non-blocking writes: a thread can ask a channel to write some data without waiting for the write to complete, and do something else in the meantime. Threads typically use the time they would otherwise spend blocked to perform IO on other channels, so a single thread can manage multiple input and output channels.

(1) Channel

A Channel sits at about the same level as a Stream in traditional IO. The difference is that a Stream is one-way (for example, InputStream or OutputStream), whereas a Channel is bidirectional and can be used for both read and write operations.

The main implementations of channels in NIO are:

FileChannel, DatagramChannel, SocketChannel, ServerSocketChannel

These correspond to file IO, UDP, and TCP (client and server) respectively. The examples that follow are mostly about these four types of channels.

(2) Buffer

The key Buffer implementations in NIO are ByteBuffer, CharBuffer, DoubleBuffer, FloatBuffer, IntBuffer, LongBuffer, and ShortBuffer, which correspond to the primitive types byte, char, double, float, int, long, and short respectively. NIO also has MappedByteBuffer, HeapByteBuffer, DirectByteBuffer, and so on.

(3) Selector

A Selector lets a single thread handle multiple channels, which is handy if your application has many channels open but the traffic on each connection is low, as in a chat server. To use a Selector, you register a Channel with it and then call its select() method. This method blocks until one of the registered channels is ready for an event. Once it returns, the thread can process the events, such as an incoming connection or received data.

Second, FileChannel

For anyone meeting NIO for the first time, the overview above can be confusing: it only mentions some concepts, little of it sticks, and it says nothing about how to use them. To give you a sense of what NIO is all about, we start by comparing traditional IO with NIO.

(1) Traditional IO vs NIO

Case 1 uses FileInputStream to read the contents of a file:
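A minimal sketch of what such a stream-based read might look like. The class name StreamReadExample and the method readWithStream are hypothetical, and the sketch assumes the file contains ASCII text so that each byte can be treated as one character.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class StreamReadExample {
    // Reads the whole file through a blocking stream and returns it as text.
    // Assumption: the file is ASCII, so one byte maps to one char.
    static String readWithStream(String path) throws IOException {
        StringBuilder out = new StringBuilder();
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            byte[] chunk = new byte[1024];
            int bytesRead;
            // read() blocks until data is available and returns -1 at end of stream
            while ((bytesRead = in.read(chunk)) != -1) {
                for (int i = 0; i < bytesRead; i++) {
                    out.append((char) chunk[i]);
                }
            }
        }
        return out.toString();
    }
}
```

Calling StreamReadExample.readWithStream("some-file.txt") returns the file contents; the path here is only a placeholder.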

Output: (omitted). Case 2 is the corresponding NIO version (it works through a RandomAccessFile here; you could also obtain the channel with FileInputStream.getChannel()):
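A rough sketch of the channel-based equivalent, assuming ASCII content as before. ChannelReadExample and readWithChannel are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

class ChannelReadExample {
    // Reads the whole file through a FileChannel and a ByteBuffer.
    // Assumption: the file is ASCII, so one byte maps to one char.
    static String readWithChannel(String path) throws IOException {
        StringBuilder out = new StringBuilder();
        try (RandomAccessFile file = new RandomAccessFile(path, "r");
             FileChannel channel = file.getChannel()) {
            ByteBuffer buf = ByteBuffer.allocate(1024);
            // read() fills the buffer and returns -1 at end of file
            while (channel.read(buf) != -1) {
                buf.flip();                  // switch the buffer from writing to reading
                while (buf.hasRemaining()) {
                    out.append((char) buf.get());
                }
                buf.clear();                 // make the buffer ready for the next read
            }
        }
        return out.toString();
    }
}
```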

By carefully comparing Case 1 and Case 2, you should be able to get an idea, at the very least, of how NIO is implemented. With a general impression we can move on to the next step.

(2) Use of Buffer

From Case 2 we can conclude that using a Buffer generally follows these steps:

1. Allocate space (ByteBuffer buf = ByteBuffer.allocate(1024);)
2. Write data into the Buffer (int bytesRead = fileChannel.read(buf);)
3. Call flip() (buf.flip();)
4. Read data from the Buffer (System.out.print((char) buf.get());)
5. Call either clear() or compact()

A Buffer is, as its name suggests, a container: a contiguous array. A Channel provides the path for reading data from files and the network, but all data read or written must pass through a Buffer, as the diagram below shows:


Write data to Buffer:

1. Write from Channel to Buffer (fileChannel.read(buf))
2. Use put() on the Buffer (buf.put(…))

Read data from Buffer:

3. Read from Buffer to Channel (channel.write(buf))
4. Use get() to read from Buffer (buf.get())

A Buffer can be thought of simply as a list of elements of a primitive type, which tracks the current state of its data through four variables: capacity, position, limit, and mark:


For example, we create an 11-byte buffer with ByteBuffer.allocate(11). In the initial state shown above, position is 0, and capacity and limit both default to the length of the array. After we write 5 bytes, the state changes like this:

To write those 5 bytes from the buffer to a Channel, we call ByteBuffer.flip(), which sets limit to the previous position and position back to 0:

At this point, the underlying operating system can correctly read the five bytes from the buffer and send them. Before the next write we call clear(), which puts the buffer’s indexes back to their initial state.
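The state changes described above can be traced with a small sketch. BufferStateExample and its state helper are hypothetical names; the byte values written are arbitrary.

```java
import java.nio.ByteBuffer;

class BufferStateExample {
    // Formats the three indexes that describe a buffer's state.
    static String state(ByteBuffer buf) {
        return "position=" + buf.position()
                + " limit=" + buf.limit()
                + " capacity=" + buf.capacity();
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(11);
        System.out.println(state(buf)); // position=0 limit=11 capacity=11

        buf.put(new byte[] {1, 2, 3, 4, 5});
        System.out.println(state(buf)); // position=5 limit=11 capacity=11

        buf.flip();                     // limit = old position, position = 0
        System.out.println(state(buf)); // position=0 limit=5 capacity=11

        buf.clear();                    // back to the initial indexes
        System.out.println(state(buf)); // position=0 limit=11 capacity=11
    }
}
```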

When clear() is called, position is set back to 0 and limit is set to capacity. In other words, the Buffer is treated as empty, although the data in it is not actually erased. If the Buffer holds unread data when you call clear(), that data is “forgotten”: nothing marks which data has been read and which has not. If the Buffer still holds unread data that you need later, but you want to write some data first, use compact(). The compact() method copies all unread data to the start of the Buffer and sets position to just after the last unread element. As with clear(), limit is set to capacity. The Buffer is then ready for writing, but new data will not overwrite the unread data.

A specific position in a Buffer can be marked by calling Buffer.mark(), and you can later return to it by calling Buffer.reset(). The Buffer.rewind() method sets position back to 0, so you can re-read all the data in the Buffer. Limit stays the same and still indicates how many elements can be read from the Buffer.
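As a rough illustration of mark(), reset(), compact(), and rewind(), the sketch below walks one small buffer through each call and records the resulting indexes. BufferCompactExample and demo are hypothetical names, and the byte values are arbitrary.

```java
import java.nio.ByteBuffer;

class BufferCompactExample {
    // Returns {position after reset, position after compact, limit after compact,
    //          position after rewind, limit after rewind, first byte after compact}.
    static int[] demo() {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.put(new byte[] {10, 20, 30, 40, 50});
        buf.flip();                    // position=0, limit=5

        buf.get();                     // reads 10, position=1
        buf.mark();                    // remember position 1
        buf.get();                     // reads 20
        buf.get();                     // reads 30, position=3
        buf.reset();                   // back to the marked position
        int afterReset = buf.position();

        buf.get();                     // reads 20 again, position=2
        buf.compact();                 // unread 30, 40, 50 move to the start
        int compactPosition = buf.position();
        int compactLimit = buf.limit();

        buf.put((byte) 60);            // written after the unread data
        buf.flip();                    // position=0, limit=4
        buf.get();                     // reads 30
        buf.rewind();                  // position back to 0, limit unchanged

        return new int[] {afterReset, compactPosition, compactLimit,
                buf.position(), buf.limit(), buf.get(0)};
    }
}
```

After compact(), the three unread bytes occupy indexes 0 to 2, so position is 3 and limit equals the capacity of 8; rewind() then leaves limit at 4 while resetting position to 0.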

Third, SocketChannel

With FileChannel and Buffer covered, you should be familiar with how a Buffer is used; here we continue our discussion of NIO with SocketChannel. Part of NIO’s power comes from the non-blocking nature of channels. Some operations on sockets may block indefinitely. For example, a call to accept() may block while waiting for a client connection, and a call to read() may block because there is nothing to read until the other end of the connection sends new data. In general, I/O calls such as creating or accepting connections, or reading and writing data, can block indefinitely until something happens in the underlying network implementation. Slow, lossy networks, or simple network failures, can cause arbitrary delays. Unfortunately, you cannot tell whether a method will block until you call it. An important feature of NIO’s channel abstraction is that its blocking behavior can be configured, which makes non-blocking channels possible.

Calling a method on a non-blocking channel always returns immediately. The return value indicates how much of the requested operation was completed. For example, calling accept() on a non-blocking ServerSocketChannel returns the client SocketChannel if a connection request is pending, or null otherwise.

Here’s an example of a TCP application where the client uses NIO and the server still uses BIO.

Client code (Case 3):

Server code (Case 4):

Output: (omitted). Based on these cases, we can summarize the usage of SocketChannel.

Open a SocketChannel:

Close:

Read data:

Note that socketChannel.write() is called inside a while loop. The write() method makes no guarantee about how many bytes it writes to the SocketChannel, so we call write() repeatedly until the Buffer has no bytes left to write.

In non-blocking mode, the read() method may return before any data has been read. So you need to pay attention to its int return value, which will tell you how many bytes were read.
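The open, write, read, and close steps could be put together roughly as follows. This is only a sketch in blocking mode: SocketChannelClientExample and sendAndReceive are hypothetical names, and it assumes a peer that reads the whole request and then echoes it back before closing the connection.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

class SocketChannelClientExample {
    // Sends a message and returns everything the peer sends back.
    // Assumption: the peer replies after it sees end-of-stream on its input.
    static String sendAndReceive(String host, int port, String message) throws IOException {
        // Open: create the channel and connect (channels start in blocking mode).
        try (SocketChannel channel = SocketChannel.open(new InetSocketAddress(host, port))) {
            // Write: write() may take only part of the buffer, so loop until it is drained.
            ByteBuffer out = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
            while (out.hasRemaining()) {
                channel.write(out);
            }
            channel.shutdownOutput(); // tell the peer that the request is complete

            // Read: the return value is the number of bytes read, or -1 at end of stream.
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            ByteBuffer in = ByteBuffer.allocate(64);
            while (channel.read(in) != -1) {
                in.flip();
                reply.write(in.array(), in.position(), in.remaining());
                in.clear();
            }
            return new String(reply.toByteArray(), StandardCharsets.UTF_8);
        } // Close: try-with-resources calls channel.close()
    }
}
```

In non-blocking mode the same read loop would also have to handle a return value of 0, which means no data was available at that moment.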

Fourth, a TCP server written with NIO

So far, none of our cases has involved a Selector. Good things are worth waiting for. The Selector class can be used to avoid the wasteful “busy-waiting” approach to handling clients. For example, consider an instant-messaging server such as QQ or Wangwang. Tens of thousands or even tens of millions of clients may be connected to the server at the same time, but at any given moment only a very small amount of information needs to be read and distributed. This calls for a way to block and wait until at least one channel is ready for I/O, and to indicate which channel it is. NIO’s selectors do just that. A Selector instance can check the I/O status of a set of channels at the same time. In technical terms, a selector is a multiplexer, because it can manage I/O operations on multiple channels.

The traditional way to deal with so many clients is to loop over all of them, one by one, to see whether each has I/O to perform. If a client has I/O pending, it may be handed to a thread pool for processing; if not, the loop moves on to the next client. When all clients have been polled, polling starts over from the beginning. This approach is clumsy and wastes resources, because most clients have no I/O pending and yet each one still has to be checked.

A Selector, on the other hand, manages many I/O sources internally. When a channel has I/O pending, the Selector is notified, remembers that channel, and knows what kind of I/O it is: a read, a write, or a new connection to accept. A call to a Selector therefore has only two kinds of results: zero, meaning no client needed I/O at the time of the call, or a set of clients that do need I/O. You do not need to check anything yourself, because the Selector hands you exactly the channels you want. This kind of notification is much more efficient than active polling.

To use a Selector, you create a Selector instance (using the static factory method open()) and register it with each channel you want to monitor (note that this is done with a method of the channel, not a method of the Selector). Finally, you call the selector’s select() method. This method blocks until one or more channels are ready for I/O or the wait times out, and returns the number of channels that are ready for I/O. A single thread can thus check whether multiple channels are ready for I/O by calling select(). If no channel becomes ready within the timeout, select() returns 0 and lets the program continue with other tasks.

Here is how to rewrite the TCP server code with NIO (Case 5):

Let’s walk through the code step by step.

(1) ServerSocketChannel

Open the ServerSocketChannel:

Close the ServerSocketChannel:

Listen for incoming connections:

ServerSocketChannel can be set to non-blocking mode. In non-blocking mode, accept() returns immediately, and returns null if no new connection has arrived. You therefore need to check whether the returned SocketChannel is null. For example:
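A minimal sketch of opening, configuring, polling, and closing a ServerSocketChannel. NonBlockingAcceptExample and pollOnce are hypothetical names; a real server would call accept() repeatedly, usually driven by a Selector, rather than once.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

class NonBlockingAcceptExample {
    // Returns true if a connection was pending, false if accept() returned null.
    static boolean pollOnce(int port) throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) { // open
            server.bind(new InetSocketAddress(port));  // port 0 picks any free port
            server.configureBlocking(false);           // switch to non-blocking mode

            SocketChannel client = server.accept();    // returns immediately
            if (client == null) {
                return false;                          // nobody is connecting right now
            }
            client.close();                            // a real server would serve the client
            return true;
        } // close: try-with-resources calls server.close()
    }
}
```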

(2) Selector

Create a Selector: Selector selector = Selector.open();
To use a Channel with a Selector, the Channel must be registered with the Selector. This is done with the SelectableChannel.register() method, as in this part of the code from Case 5:

When used with Selector, a Channel must be in non-blocking mode. This means that FileChannel cannot be used with a Selector because FileChannel cannot switch to non-blocking mode. Socket channels work.

Notice the second parameter to the register() method. This is the “interest set”: the events you are interested in when listening to the Channel through the Selector. You can listen for four different types of events:

1. Connect
2. Accept
3. Read
4. Write

A channel that raises an event is said to be ready for that event. A channel that has successfully connected to another server is “connect ready”. A server socket channel that is ready to accept an incoming connection is “accept ready”. A channel with data to read is “read ready”. A channel that is ready for data to be written to it is “write ready”.

The four kinds of events are represented by four constants of SelectionKey:

1. SelectionKey.OP_CONNECT
2. SelectionKey.OP_ACCEPT
3. SelectionKey.OP_READ
4. SelectionKey.OP_WRITE
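Registration with an interest set might look like the sketch below. RegisterExample and registerForReadAndWrite are hypothetical names; the point shown is that several events are combined with the bitwise OR operator and that the channel must be non-blocking before it is registered.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

class RegisterExample {
    // Registers a channel for read and write events and returns the key's interest set.
    static int registerForReadAndWrite() throws IOException {
        try (Selector selector = Selector.open();
             SocketChannel channel = SocketChannel.open()) {
            // A blocking channel would make register() throw IllegalBlockingModeException.
            channel.configureBlocking(false);

            // Combine several events into one interest set with bitwise OR.
            int interestSet = SelectionKey.OP_READ | SelectionKey.OP_WRITE;
            SelectionKey key = channel.register(selector, interestSet);
            return key.interestOps();
        }
    }
}
```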

(3) SelectionKey

When you register a Channel with a Selector, the register() method returns a SelectionKey object. This object contains several properties of interest:

1. The interest set
2. The ready set
3. The Channel
4. The Selector
5. An attached object (optional)

Interest set: as described in the section on registering channels with a Selector, the interest set is the collection of events you have chosen to be interested in. It can be read and written through the SelectionKey.

Ready set: the ready set is the collection of operations the channel is ready for. After a selection, you usually access the ready set first. Selection is explained in the next section. The ready set can be accessed like this:

You can test which events or operations the channel is ready for in the same way you test the interest set. Alternatively, you can use the following four methods, all of which return a boolean:

Accessing the channel and the selector from a SelectionKey is as simple as this:

An object, or more information, can be attached to a SelectionKey to make it easy to identify a given channel. For example, you can attach the Buffer that is used with the channel, or an object that holds aggregated data. The usage is as follows:

You can also attach an object when registering the Channel with the Selector through the register() method:
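The SelectionKey properties discussed in this section can be exercised together in one sketch. SelectionKeyExample and inspectKey are hypothetical names, and the attached string "session-42" is an arbitrary placeholder; the channel is never connected, so its ready set stays empty.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

class SelectionKeyExample {
    // Returns true if every property of the key has the value this sketch expects.
    static boolean inspectKey() throws IOException {
        try (Selector selector = Selector.open();
             SocketChannel channel = SocketChannel.open()) {
            channel.configureBlocking(false);

            // Attach an object at registration time (third argument of register()).
            ByteBuffer perChannelBuffer = ByteBuffer.allocate(256);
            SelectionKey key = channel.register(selector, SelectionKey.OP_READ, perChannelBuffer);
            boolean attachedAtRegistration = key.attachment() == perChannelBuffer;

            // Interest set: test individual events with bitwise AND.
            int interestSet = key.interestOps();
            boolean interestedInRead = (interestSet & SelectionKey.OP_READ) != 0;
            boolean interestedInWrite = (interestSet & SelectionKey.OP_WRITE) != 0;

            // Ready set: zero until a selection finds the channel ready.
            int readySet = key.readyOps();
            // Convenience methods: isAcceptable(), isConnectable(), isReadable(), isWritable().
            boolean readable = key.isReadable();

            // Channel and Selector behind the key.
            boolean sameChannel = key.channel() == channel;
            boolean sameSelector = key.selector() == selector;

            // Attach (or replace) an object later and read it back.
            key.attach("session-42");
            Object attached = key.attachment();

            return attachedAtRegistration && interestedInRead && !interestedInWrite
                    && readySet == 0 && !readable
                    && sameChannel && sameSelector && "session-42".equals(attached);
        }
    }
}
```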

(4) Selecting channels through a Selector

Once one or more channels have been registered with a Selector, you can call one of the overloaded select() methods. These methods report the channels that are ready for the events you are interested in (such as connect, accept, read, or write). In other words, if you are interested in “read ready” channels, select() reports the channels that are ready for reading.

Here are the select() methods:

1. int select()
2. int select(long timeout)
3. int selectNow()

select() blocks until at least one channel is ready for an event you registered for.

select(long timeout) is the same as select(), except that it blocks for at most timeout milliseconds.

selectNow() does not block; it returns immediately with whatever channels are ready. If no channels have become selectable since the previous selection, this method returns zero.

The int returned by select() indicates how many channels are ready, that is, how many channels have become ready since select() was last called. If you call select() and it returns 1 because one channel has become ready, and you call select() again after a second channel has become ready, it returns 1 again. If you have done nothing with the first ready channel, there are now two ready channels, but only one channel became ready between the two select() calls.

Once select() has been called and its return value indicates that one or more channels are ready, you can access the ready channels through the selected key set by calling the Selector’s selectedKeys() method, as follows:

When a Channel is registered with a Selector, Channel.register() returns a SelectionKey object. This object represents the channel’s registration with that Selector.

Note the keyIterator.remove() call at the end of each iteration. The Selector does not remove SelectionKey instances from the selected key set itself; you must remove each key yourself once you have processed the channel. The next time the channel becomes ready, the Selector adds it to the selected key set again.

The channel returned by SelectionKey.channel() must be cast to the type you want to work with, such as ServerSocketChannel or SocketChannel.

For a complete example of using Selector and ServerSocketChannel, see the selector() method in Case 5.
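To make the selection loop concrete, here is a small echo server sketch built on a Selector. It is not the Case 5 listing: SelectorEchoServer and its methods are hypothetical, the one-second select timeout and 1024-byte buffers are arbitrary, and error handling is deliberately thin (an I/O error on any client stops the loop).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

class SelectorEchoServer implements Runnable, AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;

    SelectorEchoServer(int port) throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(port));   // port 0 picks any free port
        server.configureBlocking(false);            // required before register()
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() {
        return server.socket().getLocalPort();
    }

    @Override
    public void run() {
        try {
            while (selector.isOpen()) {
                if (selector.select(1000) == 0) {
                    continue;                       // timed out with nothing ready
                }
                Iterator<SelectionKey> keyIterator = selector.selectedKeys().iterator();
                while (keyIterator.hasNext()) {
                    SelectionKey key = keyIterator.next();
                    keyIterator.remove();           // the Selector never removes keys itself
                    if (!key.isValid()) {
                        continue;
                    }
                    if (key.isAcceptable()) {
                        handleAccept(key);
                    } else if (key.isReadable()) {
                        handleRead(key);
                    }
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // The selector was closed or an I/O error occurred: leave the loop.
        }
    }

    private void handleAccept(SelectionKey key) throws IOException {
        SocketChannel client = ((ServerSocketChannel) key.channel()).accept();
        if (client != null) {                       // may be null in non-blocking mode
            client.configureBlocking(false);
            // Attach a per-connection buffer to the new key.
            client.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(1024));
        }
    }

    private void handleRead(SelectionKey key) throws IOException {
        SocketChannel client = (SocketChannel) key.channel();
        ByteBuffer buf = (ByteBuffer) key.attachment();
        if (client.read(buf) == -1) {
            client.close();                         // closing the channel cancels its key
            return;
        }
        buf.flip();
        while (buf.hasRemaining()) {
            client.write(buf);  // simplification: spins if the socket's send buffer is full
        }
        buf.clear();
    }

    @Override
    public void close() throws IOException {
        selector.close();
        server.close();
    }
}
```

A caller would typically run the server on its own thread, for example new Thread(new SelectorEchoServer(0)).start(). A more careful server would register OP_WRITE interest instead of looping on write(), and would handle a failing client without stopping the whole loop.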

Fifth, memory-mapped files

To process large files in Java, buffered IO classes such as BufferedReader and BufferedInputStream are commonly used, but if the file is very large, MappedByteBuffer is a faster way.

MappedByteBuffer is the file memory-mapping scheme introduced by NIO, and it offers high read and write performance. The main feature of NIO is its support for non-blocking operation: you register a socket channel with a Selector and call its select method from time to time, which returns the matching SelectionKeys containing information about socket events. That is the select model.

SocketChannel reads and writes go through a class called ByteBuffer. The class is well designed and much more convenient than manipulating a byte[] directly. ByteBuffer has two modes: direct and indirect. The typical (and only) indirect form is HeapByteBuffer, which operates on heap memory (a byte[]). But memory is limited, so what if you want to send a 1 GB file? Allocating 1 GB of memory for it is not practical. In that case you need the “direct” mode: MappedByteBuffer, that is, file mapping.

Let’s pause and talk about how the operating system manages memory. Operating system memory is generally divided into two parts: physical memory and virtual memory. Virtual memory typically uses a page file, that is, a special file on the hard disk. The operating system is responsible for reading and writing the contents of the page file, a process called paging (or page swapping). MappedByteBuffer works in a similar way: you can treat an entire file as a ByteBuffer. MappedByteBuffer is just a special kind of ByteBuffer; it is a subclass of ByteBuffer. It maps a file directly into memory (virtual memory, not physical memory). You can usually map the entire file, or, if the file is large, map it in segments by specifying the part of the file you want (a single mapping is limited to Integer.MAX_VALUE bytes).

(1) Concept

FileChannel provides the map method for mapping a file into memory: MappedByteBuffer map(FileChannel.MapMode mode, long position, long size). It maps the region of the file that starts at position and is size bytes long into memory. Mode specifies how the mapped region can be accessed:

1. READ_ONLY (read-only): attempts to modify the resulting buffer throw ReadOnlyBufferException. (MapMode.READ_ONLY)
2. READ_WRITE (read/write): changes to the resulting buffer are eventually propagated to the file; they may or may not be visible to other programs that have mapped the same file. (MapMode.READ_WRITE)
3. PRIVATE (private): changes to the resulting buffer are not propagated to the file and are not visible to other programs that have mapped the same file; instead, private copies of the modified portions of the buffer are created. (MapMode.PRIVATE)

MappedByteBuffer is a subclass of ByteBuffer that adds three methods:

1. force(): if the buffer is in READ_WRITE mode, this method forces changes made to the buffer to be written to the file;
2. load(): loads the contents of the buffer into physical memory and returns a reference to the buffer;
3. isLoaded(): returns true if the contents of the buffer are in physical memory, false otherwise.
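A short sketch of READ_WRITE mapping, assuming the target file is empty or newly created. MappedFileExample and writeThenRead are hypothetical names; the sketch writes text through the mapping, calls force(), and then reads the file back through ordinary file IO to show that the change reached the file.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedFileExample {
    // Writes text into the file through a mapping, then reads the file back.
    // Assumption: the file is empty or does not exist yet.
    static String writeThenRead(Path file, String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // Map the first bytes.length bytes; the file grows to that size if needed.
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, 0, bytes.length);
            map.put(bytes);   // writes go to the mapped memory
            map.force();      // push the changes out to the file
        }
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }
}
```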

(2) Case comparison

method3() uses MappedByteBuffer to read the file “src/1.ppt”, which is about 5 MB in size; method4() does the same with a plain ByteBuffer.

Running both from the entry function main():

Output (running on an ordinary PC):

The output shows the difference between the two approaches. A single example may be a coincidence, so replace the 5 MB file with a 200 MB file:

You can see the gap widening.

Note: MappedByteBuffer has a resource release problem: a file mapped by a MappedByteBuffer is released only when the buffer is garbage-collected, and the time at which that happens is undefined. The Javadoc describes it this way: “A mapped byte buffer and the file mapping that it represents remain valid until the buffer itself is garbage-collected.” Refer to resources 5 and 6 for details.


Sixth, other functions

Having read the sections above, we now have a reasonably detailed understanding of NIO. What follows illustrates the remaining NIO features mainly through a few cases, with more code and less description.

(1) Scatter/Gather

A scattering read from a Channel writes the data it reads into multiple buffers. The Channel thus “scatters” the data it reads across multiple buffers.

A gathering write to a Channel writes data from multiple buffers into the same Channel. The Channel thus “gathers” the data from multiple buffers and sends it out.

Scatter/gather is often used when the data being transmitted needs to be handled in separate parts. For example, when you transmit a message consisting of a header and a body, you might keep the header and the body in different buffers so that you can process each of them easily.

Case study:
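One possible sketch of the header-and-body scenario, using a FileChannel so it runs without a network. ScatterGatherExample and roundTrip are hypothetical names, and the sketch assumes ASCII text and a fixed, known header length on the reading side.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ScatterGatherExample {
    // Gathers header and body into the file, then scatters them back into two buffers.
    // Returns {header, body} as read back from the file.
    static String[] roundTrip(Path file, String header, String body) throws IOException {
        byte[] h = header.getBytes(StandardCharsets.US_ASCII);
        byte[] b = body.getBytes(StandardCharsets.US_ASCII);
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // Gathering write: the buffers are written in array order.
            ByteBuffer[] sources = {ByteBuffer.wrap(h), ByteBuffer.wrap(b)};
            while (sources[0].hasRemaining() || sources[1].hasRemaining()) {
                channel.write(sources);
            }

            // Scattering read: the first buffer is filled before the second one.
            channel.position(0);
            ByteBuffer headerBuf = ByteBuffer.allocate(h.length);
            ByteBuffer bodyBuf = ByteBuffer.allocate(b.length);
            ByteBuffer[] targets = {headerBuf, bodyBuf};
            while ((headerBuf.hasRemaining() || bodyBuf.hasRemaining())
                    && channel.read(targets) != -1) {
                // keep reading until both buffers are full or the file ends
            }
            return new String[] {
                new String(headerBuf.array(), StandardCharsets.US_ASCII),
                new String(bodyBuf.array(), StandardCharsets.US_ASCII)
            };
        }
    }
}
```

Because a scattering read fills each buffer completely before moving to the next, it suits messages whose leading parts have a fixed size.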

(2) transferFrom & transferTo

FileChannel’s transferFrom() method transfers data from a source channel into the FileChannel:

The position argument indicates the position in the target file at which writing starts, and count indicates the maximum number of bytes to transfer. If the source channel has fewer than count bytes remaining, fewer bytes are transferred than requested. Also note that in the SocketChannel implementation, a SocketChannel transfers only the data it has ready at that moment (possibly fewer than count bytes). A SocketChannel may therefore not transfer all of the requested data (count bytes) into the FileChannel.

The transferTo() method transfers data from the FileChannel to another channel:

The SocketChannel issue mentioned above also applies to transferTo(): a SocketChannel accepts data only until its send buffer is full, and then the transfer stops.
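Both methods can be tried out with a file-to-file copy, as in this sketch. TransferExample and its two copy methods are hypothetical names. Each call may move fewer bytes than requested, so the sketch loops until everything has been transferred.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class TransferExample {
    // Copies source to target by pulling data into the target channel.
    static void copyWithTransferFrom(Path source, Path target) throws IOException {
        try (FileChannel from = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel to = FileChannel.open(target, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;               // where writing starts in the target file
            long count = from.size();        // maximum number of bytes to transfer
            while (position < count) {
                // Reads from the source channel's current position, which advances.
                position += to.transferFrom(from, position, count - position);
            }
        }
    }

    // Copies source to target by pushing data out of the source channel.
    static void copyWithTransferTo(Path source, Path target) throws IOException {
        try (FileChannel from = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel to = FileChannel.open(target, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;               // where reading starts in the source file
            long count = from.size();
            while (position < count) {
                // Writes at the target channel's current position, which advances.
                position += from.transferTo(position, count - position, to);
            }
        }
    }
}
```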

(3) Pipe

A Java NIO pipe is a one-way data connection between two threads. Pipe has a source channel and a sink channel. Data will be written to the sink channel and read from the source channel.
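A minimal sketch of a Pipe in use. PipeExample and passThrough are hypothetical names. For brevity the writer and the reader share one thread, which only works because the message is assumed to be small enough to fit in the pipe’s internal buffer; normally the two ends are used by different threads.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

class PipeExample {
    // Writes a short message into the sink and reads it back from the source.
    // Assumption: the message is small enough to fit in the pipe's buffer.
    static String passThrough(String message) throws IOException {
        byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
        Pipe pipe = Pipe.open();
        try (Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            // Data goes into the sink channel...
            ByteBuffer out = ByteBuffer.wrap(bytes);
            while (out.hasRemaining()) {
                sink.write(out);
            }
            // ...and comes out of the source channel.
            ByteBuffer in = ByteBuffer.allocate(bytes.length);
            while (in.hasRemaining() && source.read(in) != -1) {
                // keep reading until the whole message has arrived
            }
            return new String(in.array(), StandardCharsets.UTF_8);
        }
    }
}
```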

(4) DatagramChannel

DatagramChannel in Java NIO is a channel for sending and receiving UDP packets. Because UDP is a connectionless network protocol, you cannot read from and write to a DatagramChannel as a continuous stream the way you can with other channels. Instead, it sends and receives packets of data.
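Sending and receiving a packet might be sketched as follows. DatagramExample and sendToSelf are hypothetical names; the sketch sends one datagram over the loopback interface to a channel in the same process and assumes the packet is not lost, which UDP itself does not guarantee.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.charset.StandardCharsets;

class DatagramExample {
    // Sends one UDP packet over loopback and returns the text that was received.
    static String sendToSelf(String message) throws IOException {
        try (DatagramChannel receiver = DatagramChannel.open();
             DatagramChannel sender = DatagramChannel.open()) {
            // Bind the receiver to any free port on the loopback interface.
            receiver.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            SocketAddress target = receiver.getLocalAddress();

            // send() transmits the buffer's remaining bytes as a single packet.
            sender.send(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)), target);

            // receive() blocks (in blocking mode) until a packet arrives;
            // bytes that do not fit in the buffer are silently discarded.
            ByteBuffer buf = ByteBuffer.allocate(1024);
            receiver.receive(buf);
            buf.flip();
            return StandardCharsets.UTF_8.decode(buf).toString();
        }
    }
}
```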