A brief analysis of Java character streams and byte streams

This is the seventh day of my participation in the More Text Challenge.

Picking up where we left off: last time we looked at the concepts of bytes and characters (if you missed it, please read the earlier analysis of bytes, Strings and characters first). Now we can look at byte streams and character streams. First we need to understand one concept: all data a computer transmits is exchanged as a stream of bytes, and a so-called character stream is nothing more than those bytes run through an encoding.

Since a character stream is just a byte stream with an encoding applied, let's look at the byte stream first and take it step by step.

First, here is the code I will be testing with, so that we don't get lost **(the code is only there to make the JDK source easier to follow)**

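import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;

InputStream fileInputStream = new FileInputStream("C:\\Users\\Desktop\\new 50.txt");
BufferedInputStream bf = new BufferedInputStream(fileInputStream);
InputStreamReader inputStreamReader = new InputStreamReader(fileInputStream);
char[] a = new char[1024];
byte[] b = new byte[1024];
int len;
// Read as characters through the character stream
while ((len = inputStreamReader.read(a)) != -1) {
    System.out.println(new String(a, 0, len));
}
// Read as raw bytes; the reader above has already consumed the stream,
// so this loop only sees whatever bytes are left
while ((len = fileInputStream.read(b)) != -1) {
    System.out.println(new String(b, 0, len));
}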

Byte stream InputStream

Since the runtime type of our object is FileInputStream, a call to the abstract InputStream.read() ends up in FileInputStream.read(), which calls an underlying native method. This version reads one byte at a time (the batch read comes next).

When we pass an array to read(), this native method is called instead:

private native int readBytes(byte b[], int off, int len) throws IOException;

In this case the byte stream effectively has a buffer, and its size is whatever array you pass in. If you make the byte array too small, the console will congratulate you with partly garbled output. That is why our demos usually use 1024 bytes: all the bytes of a character have to arrive together. If they do not, you are holding only part of a character and it cannot be decoded. (As for why you rarely run into garbled text in practice, we will come to that when we talk about characters.)

PS: for example, if you read Chinese text with the no-argument read() and decode each byte on its own, everything comes out garbled. Why? Because a Chinese character takes up more than one byte (two in GBK, usually three in UTF-8). You can try writing such a character to a file one byte at a time with a sleep in between, and watch the file go from garbled to normal over time.
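To make the "half a character" problem concrete, here is a minimal sketch. It assumes UTF-8 and uses the character U+4E2D as sample data; the class name SplitCharacterDemo and the helper decodePrefix are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SplitCharacterDemo {

    // Decodes only the first `count` bytes of `data` as UTF-8.
    static String decodePrefix(byte[] data, int count) {
        return new String(Arrays.copyOf(data, count), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+4E2D (a common Chinese character) takes three bytes in UTF-8.
        byte[] data = "\u4e2d".getBytes(StandardCharsets.UTF_8);
        System.out.println("bytes in the character: " + data.length);

        // An incomplete byte sequence cannot be decoded into the character.
        System.out.println("first 2 bytes: " + decodePrefix(data, 2));
        // With all the bytes present, decoding succeeds.
        System.out.println("all 3 bytes:   " + decodePrefix(data, 3));
    }
}
```

The same thing happens when a read happens to end in the middle of a character and each chunk is turned into a String on its own.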

Byte buffer stream BufferedInputStream

Let's see what BufferedInputStream does with the InputStream when the object is created

InputStream fileInputStream = new FileInputStream("C:\\Users\\Desktop\\new 50.txt");
BufferedInputStream bf = new BufferedInputStream(fileInputStream);

The one-argument constructor delegates through this(...) to the constructor below in BufferedInputStream.java, which stores the byte stream and allocates a byte array (8192 bytes by default)

public BufferedInputStream(InputStream in, int size) {
        super(in);
        if (size <= 0) {
            throw new IllegalArgumentException("Buffer size <= 0");
        }
        buf = new byte[size];
    }

So when you call its read() method, data is first pulled into the buffer and the bytes are then served from there. If you pass an array, as in read(byte[]), it still reads repeatedly until the array is filled or no more data is available.

**Note: new FileInputStream() only opens the file**

I'm sure you can see the difference by now. Without a BufferedInputStream, every read goes straight to the file on disk (how much is fetched depends on the argument you pass to read). With a BufferedInputStream, 8192 bytes at a time are pulled into memory and your reads are then served from memory. And whether or not you use a buffered stream, as soon as you pass an array to read you are in effect using a cache.

To sum up, the larger the file, the more the buffer pays off.
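The effect of the buffer can be observed without touching the disk. The following is a rough sketch, not a benchmark: it stands in an in-memory ByteArrayInputStream for the file and wraps it in a hypothetical CountingInputStream (my own helper, not a JDK class) to count how often the underlying stream is actually asked for data when we read one byte at a time.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class BufferedReadCountDemo {

    // Hypothetical helper: counts how often the wrapped stream is asked for data.
    static class CountingInputStream extends FilterInputStream {
        int calls;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads `size` bytes one at a time and returns {bytesRead, callsToSource}.
    static int[] drain(int size, boolean buffered) {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        try (InputStream in = buffered ? new BufferedInputStream(source) : source) {
            int total = 0;
            while (in.read() != -1) {
                total++;
            }
            return new int[] {total, source.calls};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] direct = drain(20_000, false);
        int[] viaBuffer = drain(20_000, true);
        System.out.println("unbuffered: " + direct[0] + " bytes, " + direct[1] + " calls to the source");
        System.out.println("buffered:   " + viaBuffer[0] + " bytes, " + viaBuffer[1] + " calls to the source");
    }
}
```

With a real FileInputStream each of those calls to the source is a native read, which is why cutting them from one per byte to one per buffer refill matters more as the file grows.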

Character stream InputStreamReader

Why do character streams exist at all? Arguably byte streams are enough. The answer is garbled text. Look at how a character stream works: it first reads the bytes from the file and then decodes them into characters. If the data is written back to disk, the characters have to be encoded into bytes again before they are stored, because, as already said, everything inside a computer is exchanged as bytes. This process also shows that character streams are meant for text files. With a non-text file, byte sequences that have no mapping in the charset are decoded into garbage characters, and encoding those back into bytes produces something different from the original, so the data ends up corrupted.
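The bytes-to-characters-to-bytes round trip described above can be tried directly. This is a small sketch assuming UTF-8; the four "binary" bytes are the usual start of a JPEG file, picked only as sample data, and the class name RoundTripDemo is made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RoundTripDemo {

    // Decodes bytes to a String as UTF-8 and encodes the String back to bytes.
    static byte[] roundTrip(byte[] original) {
        String decoded = new String(original, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] text = "hello \u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        // Sample binary data; the byte 0xFF is never valid in UTF-8.
        byte[] binary = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};

        System.out.println("text survives:   " + Arrays.equals(text, roundTrip(text)));
        System.out.println("binary survives: " + Arrays.equals(binary, roundTrip(binary)));
    }
}
```

The text comes back unchanged, while the binary bytes do not, because the invalid sequences were replaced with a substitute character during decoding.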

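An InputStreamReader is constructed around an InputStream. If you do not specify a charset, it uses the default charset, which is taken from the file.encoding system property and only falls back to UTF-8 when that lookup fails (since JDK 18 the default is UTF-8 on every platform). You can also specify the charset yourself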

// Specifying the charset explicitly
InputStreamReader inputStreamReader = new InputStreamReader(fileInputStream, StandardCharsets.UTF_16);

// java.nio.charset.Charset: how the default charset is resolved
public static Charset defaultCharset() {
    if (defaultCharset == null) {
        synchronized (Charset.class) {
            String csn = AccessController.doPrivileged(
                new GetPropertyAction("file.encoding"));
            Charset cs = lookup(csn);
            if (cs != null)
                defaultCharset = cs;
            else
                defaultCharset = forName("UTF-8");
        }
    }
    return defaultCharset;
}
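To see why the charset passed to InputStreamReader matters, here is a minimal sketch that decodes the same bytes twice. It reads from an in-memory array instead of a file, and the class name ReaderCharsetDemo and the helper decode are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReaderCharsetDemo {

    // Decodes `data` through an InputStreamReader that uses the given charset.
    static String decode(byte[] data, Charset charset) {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            char[] chunk = new char[1024];
            int len;
            while ((len = reader.read(chunk)) != -1) {
                out.append(chunk, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Two Chinese characters, six bytes in UTF-8.
        byte[] data = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("as UTF-8:        " + decode(data, StandardCharsets.UTF_8));
        System.out.println("as ISO-8859-1:   " + decode(data, StandardCharsets.ISO_8859_1));
    }
}
```

Decoding with the charset the data was written in gives back the two characters; decoding with ISO-8859-1 turns each of the six bytes into its own unrelated character.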

Next, look at how it executes the no-argument read() method.

Does this look familiar? It reads bytes and decodes them into a char, but why does the code ask for two chars? Isn't it supposed to read one character at a time? If you don't understand, go to my last article to…

The reason is characters outside the BMP: some Chinese characters and other symbols may need four bytes, so they have to be represented by two chars (a surrogate pair).
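Here is a small sketch of that case, assuming UTF-8 input. U+20BB7 is used as an example of a character outside the BMP, and the class name SurrogatePairDemo and the helper readThree are made up for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class SurrogatePairDemo {

    // Calls the no-argument read() three times and returns the raw results.
    static int[] readThree(byte[] utf8) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            return new int[] {reader.read(), reader.read(), reader.read()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // U+20BB7 is a CJK ideograph outside the BMP.
        String s = new String(Character.toChars(0x20BB7));
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);

        System.out.println("code points: " + s.codePointCount(0, s.length()));
        System.out.println("chars:       " + s.length());
        System.out.println("UTF-8 bytes: " + utf8.length);

        int[] r = readThree(utf8);
        System.out.println("1st read is a high surrogate: " + Character.isHighSurrogate((char) r[0]));
        System.out.println("2nd read is a low surrogate:  " + Character.isLowSurrogate((char) r[1]));
        System.out.println("3rd read is end of stream:    " + (r[2] == -1));
    }
}
```

One character on screen, four bytes in the file, and two calls to read() before the stream is exhausted.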

read(char[] a) reads up to a.length characters at a time.

Character buffer stream BufferedReader

After everything above, this one is similar: it is constructed around a Reader such as an InputStreamReader,

and creates a buffered character stream with an input buffer of the given size (8192 chars by default). Its read() method reads a single character, and read(char[] a) reads up to a.length characters.
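A minimal sketch of the two read methods on a BufferedReader, again using an in-memory source in place of a file; the class name BufferedReaderDemo and the helper readInTwoSteps are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class BufferedReaderDemo {

    // Reads one char with read(), then the rest with read(char[]).
    static String[] readInTwoSteps(String text) {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            // A single character, or -1 at end of stream.
            int first = reader.read();
            // Up to chunk.length characters, or -1 at end of stream.
            char[] chunk = new char[16];
            int len = reader.read(chunk);
            String head = first == -1 ? "" : String.valueOf((char) first);
            String rest = len == -1 ? "" : new String(chunk, 0, len);
            return new String[] {head, rest};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] parts = readInTwoSteps("hello");
        System.out.println("read():       " + parts[0]);
        System.out.println("read(char[]): " + parts[1]);
    }
}
```

Both calls are served from the reader's internal char buffer, which was filled from the InputStreamReader on the first read.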

Conclusion

In summary: for non-text files, or when you only need to read the data without manipulating it, use a byte stream, and a byte buffer stream if the file is large. For text files whose content you need to manipulate, use a character stream, and a character buffer stream if the file is large.
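As a rough sketch of that rule of thumb, the two helpers below (copy and countLetters, both hypothetical names) show one task of each kind. In-memory arrays stand in for files so that the example runs anywhere; with real files you would wrap a FileInputStream instead.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ChooseStreamDemo {

    // Data that is only moved around: buffered byte streams.
    static byte[] copy(byte[] source) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(source));
             OutputStream out = new BufferedOutputStream(sink)) {
            byte[] chunk = new byte[1024];
            int len;
            while ((len = in.read(chunk)) != -1) {
                out.write(chunk, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The output stream was flushed and closed by try-with-resources above.
        return sink.toByteArray();
    }

    // Text whose content is inspected: a buffered character stream.
    static int countLetters(byte[] source, Charset charset) {
        int letters = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(source), charset))) {
            int c;
            while ((c = reader.read()) != -1) {
                if (Character.isLetter(c)) {
                    letters++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return letters;
    }

    public static void main(String[] args) {
        byte[] binary = {0, (byte) 0xFF, 1, (byte) 0xFE};
        System.out.println("copied bytes: " + copy(binary).length);

        byte[] text = "a1 \u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        System.out.println("letters: " + countLetters(text, StandardCharsets.UTF_8));
    }
}
```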
