Topics covered in this article:
- Using traditional Java IO
- Why is Java IO designed the way it is?
- The IO library Okio
Traditional Java IO doesn’t work very well
I remember the first time I used Java IO to manipulate a file; I copied a piece of code like this from the Internet:
InputStream in = null;
InputStream binStream = null;
try {
    in = new FileInputStream("./test.txt");
    binStream = new BufferedInputStream(in);
    byte[] data = new byte[128];
    while (binStream.read(data) != -1) {
        // ...
    }
} catch (IOException e) {
    e.printStackTrace();
} finally {
    if (in != null) {
        try {
            in.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    if (binStream != null) {
        try {
            binStream.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
At the time I found it very cumbersome and didn’t understand why it had to be written this way. Because I didn’t understand the principles behind it, every time I needed similar code I went back to the Internet and copied it again.
Now when I face this code, my first instinct is to figure out why I can’t simply write it from scratch myself. Learning a technology means understanding the nature behind it, so a journey of discovery began.
The most perplexing question in this exploration was: you can read the contents of a file directly with a FileInputStream, so what is the point of BufferedInputStream? If BufferedInputStream, as its name suggests, provides buffering, why can’t it be used on its own? Is it really necessary to pass a FileInputStream into it?
To solve this puzzle, take a look at the implementation mechanism behind the source code:
class FileInputStream extends InputStream { }

class BufferedInputStream extends FilterInputStream { }

class FilterInputStream extends InputStream {
    protected volatile InputStream in;

    protected FilterInputStream(InputStream in) {
        this.in = in;
    }

    public int read() throws IOException {
        return in.read();
    }

    public void close() throws IOException {
        in.close();
    }
    // ...
}
As you can see, FileInputStream extends the InputStream abstract class directly, while BufferedInputStream extends FilterInputStream. FilterInputStream itself simply overrides the methods of InputStream, and each override does nothing more than delegate to the wrapped stream.
The puzzle here is, what is the purpose of this design? It’s just a simple encapsulation. I don’t see the point.
If you are familiar with the decorator design pattern, it is easy to see that the design here is the application of the decorator design pattern. If you don’t, you’re probably as confused as I am.
An important characteristic of decorator-pattern scenarios is that the decorator class adds enhancements related to the original class.
BufferedInputStream is such a decorator class, and the enhancement it provides is buffering. Data from the underlying stream is first read into an internal buffer and then handed to the destination (memory or the network). The benefit is fewer read interactions with the underlying file, which are expensive.
Here’s an example of how buffering can improve performance:
Suppose you have an 8K file and you read it 1K at a time. With a plain InputStream you need eight reads, which means eight interactions with the file. With a BufferedInputStream, the first read pulls all 8K from the file into the buffer at once; you still end up reading 1K eight times, but those eight reads come from the in-memory buffer, which is far cheaper than touching the file each time. The larger the file, the more obvious the gain from buffering.
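To make that concrete, here is a minimal sketch, assuming a local file ./test.txt exists; the timings it prints are only illustrative and will vary by machine and file size.

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferDemo {
    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        try (InputStream in = new FileInputStream("./test.txt")) {
            while (in.read() != -1) { /* every read() hits the file */ }
        }
        System.out.println("unbuffered: " + (System.nanoTime() - start) / 1_000_000 + " ms");

        start = System.nanoTime();
        try (InputStream in = new BufferedInputStream(new FileInputStream("./test.txt"))) {
            while (in.read() != -1) { /* most read() calls are served from the 8K buffer */ }
        }
        System.out.println("buffered:   " + (System.nanoTime() - start) / 1_000_000 + " ms");
    }
}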
So why use the decorator pattern here? Let’s discuss it briefly. What would happen if, instead of decoration, every enhancement were simply added through inheritance?
For example, if we want buffered file reading, we add a BufferedFileInputStream; a single extra class is perfectly acceptable.
But suppose we need another enhancement, such as reading data by primitive type (int, boolean, long, and so on), as DataInputStream does. Continuing with inheritance, we would have to derive classes like DataFileInputStream and DataPipedInputStream. And if we also want buffering on top of the typed reads, we keep deriving: BufferedDataFileInputStream, BufferedDataPipedInputStream, and so on.
That is with only two enhancements. With m enhancements and n base classes, inheritance requires m × n classes, the inheritance hierarchy becomes complex, and the code becomes hard to extend and maintain.
With the decorator pattern, developers compose the features they need themselves. Composition solves the class explosion: only m + n classes are required.
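As an illustration of that m + n composition, here is a minimal sketch of a home-made decorator in the same style as FilterInputStream. The class UpperCaseInputStream is hypothetical, not part of the JDK; a full implementation would also override read(byte[], int, int).

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// One new decorator class adds one enhancement on top of ANY InputStream.
class UpperCaseInputStream extends FilterInputStream {
    UpperCaseInputStream(InputStream in) {
        super(in);
    }

    @Override public int read() throws IOException {
        int b = super.read();
        return (b == -1) ? b : Character.toUpperCase(b);
    }
}

// Usage: stack enhancements freely instead of deriving a subclass per combination.
// InputStream in = new UpperCaseInputStream(
//         new BufferedInputStream(new FileInputStream("./test.txt")));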
Here is how the JDK applies the decorator pattern across its byte-input-stream classes; there are quite a few of them:
Now that you understand the core idea behind the design, the first piece of code is easy to understand.
Still, it doesn’t feel great: I just want to read and write a file, and doing so remains somewhat cumbersome. Is there an easier way, such as an API that encapsulates these operations, or a new IO framework?
I’m sure many people have shared this question. A search turned up an IO gem whose name I had already seen in the source code of the OkHttp framework: Okio.
The IO library Okio
Here’s the official introduction to Okio:
Okio is a library that complements java.io and java.nio to make it much easier to access, store, and process your data. It started as a component of OkHttp, the capable HTTP client included in Android. It’s well-exercised and ready to solve new problems.
In other words, it makes it easier to access, store, and process data. Since Okio complements java.io, is it actually better than traditional IO?
Take a look at what Okio does. Try using it to read and write a file:
// Okio: write a file
private static void writeFileByOKio() {
    try (Sink sink = Okio.sink(new File(path));
         BufferedSink bufferedSink = Okio.buffer(sink)) {
        bufferedSink.writeUtf8("write" + "\n" + "success!");
    } catch (IOException e) {
        e.printStackTrace();
    }
}

// Okio: read a file
private static void readFileByOKio() {
    try (Source source = Okio.source(new File(path));
         BufferedSource bufferedSource = Okio.buffer(source)) {
        for (String line; (line = bufferedSource.readUtf8Line()) != null; ) {
            System.out.println(line);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
The parentheses after try are a Java 7 feature, try-with-resources: resources declared there are automatically released at the end of the try statement, provided they implement the java.lang.AutoCloseable interface.
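Here is a minimal sketch of that feature, using a hypothetical resource class:

class MyResource implements AutoCloseable {
    @Override public void close() {
        System.out.println("released");
    }

    public static void main(String[] args) {
        try (MyResource r = new MyResource()) {
            System.out.println("using the resource");
        } // close() runs here automatically, even if an exception had been thrown
    }
}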
Back to the Okio code: the key step in reading or writing a file is creating a BufferedSource or BufferedSink object. With these two objects you can read and write files directly.
But so far this is not much cleaner than traditional IO, because the example is simple. As mentioned earlier, traditional IO relies on the decorator pattern to stack enhancements.
Make the scenario a little more complicated: to read an int or a float you need a DataInputStream for the typed reads, and for efficiency you also want a BufferedInputStream layer. Something like this (core code):
fileStream = new FileInputStream(path);
binStream = new BufferedInputStream(fileStream);
dataInputStream = new DataInputStream(binStream);
dataInputStream.readInt();
Okio’s BufferedSink and BufferedSource, however, do almost all of the above without stringing together a chain of decorators. Something like this (core code):
Source source = Okio.source(new File(path));
BufferedSource bufferedSource = Okio.buffer(source);
bufferedSource.readInt();
As you can see here, Okio simplifies the complex scenarios of traditional IO and does make it easier to access data. At this point, my original need has been solved.
Now I wonder how Okio was designed to work so well. Take a look at its class design:
In Okio reads and writes, the key types are Source, Sink, BufferedSource, and BufferedSink.
Source and Sink
Source and Sink are interfaces, analogous to traditional IO’s InputStream and OutputStream: they play the input and output roles.
The Source interface is used for reading data, which can come from disk, the network, memory, and so on:
public interface Source extends Closeable {
    long read(Buffer sink, long byteCount) throws IOException;

    Timeout timeout();

    @Override void close() throws IOException;
}
The Sink interface is used for writing data:
public interface Sink extends Closeable, Flushable {
    void write(Buffer source, long byteCount) throws IOException;

    @Override void flush() throws IOException;

    Timeout timeout();

    @Override void close() throws IOException;
}
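One nice consequence of these two small interfaces is that a Source or Sink can sit on top of almost anything. A sketch using the Okio 1.x Java API (the file path and data are arbitrary examples):

import okio.Okio;
import okio.Sink;
import okio.Source;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;

public class AdapterDemo {
    public static void main(String[] args) throws IOException {
        // Disk, memory, and (via Okio.source(socket)) the network all look the same.
        try (Source fromFile = Okio.source(new File("./test.txt"));
             Source fromMemory = Okio.source(new ByteArrayInputStream("hello".getBytes()));
             Sink toMemory = Okio.sink(new ByteArrayOutputStream())) {
            // All three can now be wrapped with Okio.buffer(...) and used uniformly.
        }
    }
}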
BufferedSource and BufferedSink
BufferedSource and BufferedSink are extensions of the Source and Sink interfaces. Okio gathers the commonly used operations into these two interfaces, so the underlying byte stream can be consumed directly as typed data. This dispenses with Java IO’s habit of nesting assorted input and output streams, and it provides many convenient APIs such as readInt() and readString().
public interface BufferedSource extends Source, ReadableByteChannel {
    Buffer getBuffer();

    int readInt() throws IOException;

    String readString(long byteCount, Charset charset) throws IOException;
}
public interface BufferedSink extends Sink, WritableByteChannel {
    Buffer buffer();

    BufferedSink writeInt(int i) throws IOException;

    BufferedSink writeString(String string, int beginIndex, int endIndex, Charset charset)
            throws IOException;
}
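A small usage sketch of these typed APIs (Okio 1.x; the file name is an arbitrary example): a round trip that writes an int and a string, then reads them back, with no decorator chain in sight.

import okio.BufferedSink;
import okio.BufferedSource;
import okio.Okio;

import java.io.File;
import java.io.IOException;

public class TypedIoDemo {
    public static void main(String[] args) throws IOException {
        File file = new File("typed.bin");
        try (BufferedSink sink = Okio.buffer(Okio.sink(file))) {
            sink.writeInt(42);       // 4 bytes, big-endian
            sink.writeUtf8("hello");
        }
        try (BufferedSource source = Okio.buffer(Okio.source(file))) {
            System.out.println(source.readInt());  // 42
            System.out.println(source.readUtf8()); // hello
        }
    }
}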
RealBufferedSink and RealBufferedSource
BufferedSource and BufferedSink are both interfaces; their implementation classes are RealBufferedSource and RealBufferedSink.
final class RealBufferedSource implements BufferedSource {
    public final Buffer buffer = new Buffer();
    public final Source source;

    RealBufferedSource(Source source) {
        this.source = source;
    }

    @Override public String readString(Charset charset) throws IOException {
        if (charset == null) throw new IllegalArgumentException("charset == null");
        buffer.writeAll(source);
        return buffer.readString(charset);
    }
    // ...
}
final class RealBufferedSink implements BufferedSink {
    public final Buffer buffer = new Buffer();
    public final Sink sink;
    boolean closed;

    RealBufferedSink(Sink sink) {
        this.sink = sink;
    }

    @Override public BufferedSink writeString(String string, int beginIndex, int endIndex,
            Charset charset) throws IOException {
        if (closed) throw new IllegalStateException("closed");
        buffer.writeString(string, beginIndex, endIndex, charset);
        return emitCompleteSegments(); // pushes any complete segments on to the wrapped sink
    }
    // ...
}
Both RealBufferedSource and RealBufferedSink maintain a Buffer object internally, and their methods ultimately delegate the real work to that Buffer. The Buffer class is therefore the soul of Okio; more on it below.
Buffer
The benefit of Buffer is that data is read from the underlying source in whole segments rather than a byte at a time, which is far more efficient. It is a space-for-time strategy.
public final class Buffer implements BufferedSource, BufferedSink, Cloneable, ByteChannel {
    Segment head;
    long size;

    @Override public Buffer getBuffer() {
        return this;
    }

    @Override public String readString(long byteCount, Charset charset) throws EOFException {
        checkOffsetAndCount(size, 0, byteCount);
        if (charset == null) throw new IllegalArgumentException("charset == null");
        if (byteCount > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("byteCount > Integer.MAX_VALUE: " + byteCount);
        }
        if (byteCount == 0) return "";

        Segment s = head;
        if (s.pos + byteCount > s.limit) {
            // If the string spans multiple segments, delegate to readByteArray().
            return new String(readByteArray(byteCount), charset);
        }

        String result = new String(s.data, s.pos, (int) byteCount, charset);
        s.pos += byteCount;
        size -= byteCount;
        if (s.pos == s.limit) {
            head = s.pop();
            SegmentPool.recycle(s);
        }
        return result;
    }
    // ...
}
As you can see from the code, Buffer is an aggregator: it implements both the BufferedSink and BufferedSource interfaces, which means it can be written to and read from at the same time.
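A tiny sketch of that dual role: the same Buffer object acts as a BufferedSink when written and a BufferedSource when read.

import okio.Buffer;

import java.io.IOException;

public class BufferBothWays {
    public static void main(String[] args) throws IOException {
        Buffer buffer = new Buffer();
        buffer.writeUtf8("hello okio");        // Buffer as a BufferedSink
        System.out.println(buffer.readUtf8()); // Buffer as a BufferedSource: "hello okio"
    }
}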
Internally it maintains a chain of Segment data blocks. What is a Segment?
final class Segment {
    // Each segment holds up to 8KB
    static final int SIZE = 8192;

    // The bytes held by this segment
    byte[] data;

    // The position of the next byte to read
    int pos;

    // The position of the next byte to write
    int limit;

    // True if other segments or byte strings share this byte array
    boolean shared;

    // The next segment in the circular linked list
    Segment next;

    // The previous segment in the circular linked list
    Segment prev;

    // Remove this segment from the list and return the next node, or null if the list is now empty
    public final @Nullable Segment pop() {
        Segment result = next != this ? next : null;
        prev.next = next;
        next.prev = prev;
        next = null;
        prev = null;
        return result;
    }

    // Insert the given segment after this node and return it
    public final Segment push(Segment segment) {
        segment.prev = this;
        segment.next = next;
        next.prev = segment;
        next = segment;
        return segment;
    }

    // If a single segment cannot hold the data being written, split it into two segments
    public final Segment split(int byteCount) {
        // ...
    }

    // Merge adjacent segments that are not full
    public final void compact() {
        // ...
    }
}
As the pop and push methods show, segments form a circular doubly linked list, with each Segment holding 8KB. It is this structure that makes Okio’s reads and writes so efficient.
Closely related to Segment is the SegmentPool.
final class SegmentPool {
    // Maximum bytes held by the pool: 64KB, i.e. 8 segments
    static final long MAX_SIZE = 64 * 1024;

    // Head of the singly linked list of pooled segments
    static @Nullable Segment next;

    // Total bytes currently held in the pool
    static long byteCount;

    // Hand out a pooled segment if one is free; otherwise create a new one
    static Segment take() {
        synchronized (SegmentPool.class) {
            if (next != null) {
                Segment result = next;
                next = result.next;
                result.next = null;
                byteCount -= Segment.SIZE;
                return result;
            }
        }
        return new Segment(); // Pool is empty. Don't zero-fill while holding a lock.
    }

    // Recycle a segment so it can be reused later
    static void recycle(Segment segment) {
        if (segment.next != null || segment.prev != null) throw new IllegalArgumentException();
        if (segment.shared) return; // This segment cannot be recycled.
        synchronized (SegmentPool.class) {
            if (byteCount + Segment.SIZE > MAX_SIZE) return; // Pool is full.
            byteCount += Segment.SIZE;
            segment.next = next;
            segment.pos = segment.limit = 0;
            next = segment;
        }
    }
}
SegmentPool is a cache of segments, capped at 64KB, i.e. 8 segments. Like a thread pool, its purpose is reuse: recycle() takes a discarded Segment and inserts it at the head of the pool’s singly linked list so it can be handed out again.
The SegmentPool keeps allocated segments from being garbage-collected and reallocated over and over, increasing resource reuse and reducing GC pressure; overly frequent GC degrades performance.
As you can see, Okio has put a lot of effort into memory optimization to improve resource utilization, which in turn improves performance.
It is also worth noting that a Buffer can be used as a standalone tool. To move data from a Source into a Buffer, Okio uses the source’s read() method rather than a write() method on the buffer:
try (Source source = Okio.source(new File(path))) {
    Buffer buffer = new Buffer();
    source.read(buffer, 1024);
    System.out.println("okio buffer read: " + buffer.readUtf8Line());
} catch (IOException e) {
    e.printStackTrace();
}
Conclusion
Not only that, but Okio also offers other useful features:
For example, it provides a series of convenient utilities (see the sketch after this list):
- Transparent GZip handling
- MD5 and SHA-1 hashing, which is very convenient for data verification
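A hedged sketch of both features, assuming the Okio 1.x Java API (GzipSink, HashingSink, and Okio.blackhole() are real Okio types; the file name is an arbitrary example):

import okio.BufferedSink;
import okio.GzipSink;
import okio.HashingSink;
import okio.Okio;

import java.io.File;
import java.io.IOException;

public class ExtrasDemo {
    public static void main(String[] args) throws IOException {
        // Transparent gzip: callers just see an ordinary BufferedSink.
        try (BufferedSink sink = Okio.buffer(new GzipSink(Okio.sink(new File("out.gz"))))) {
            sink.writeUtf8("compressed on the way out");
        }

        // Hashing as data flows through; HashingSink.md5(...) works the same way.
        HashingSink hashingSink = HashingSink.sha1(Okio.blackhole());
        try (BufferedSink sink = Okio.buffer(hashingSink)) {
            sink.writeUtf8("data to verify");
        }
        System.out.println("sha1 = " + hashingSink.hash().hex());
    }
}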
It also provides a timeout mechanism whose internal design is quite interesting; interested readers can refer to
Okio (part 4)
The takeaway of this article is that Okio is more efficient and easier to use than traditional IO, although it is not as flexible as traditional IO when it comes to composing arbitrary enhancements.
If anything here remains unclear, you can also refer to the following articles.
This is a more detailed explanation of the source:
Okio principle analysis
This one explains the content in detail and also covers Socket communication:
The Socket is used for communication
Other relevant bits and pieces:
What buffer size is appropriate for input and output streams