Preface
As Java developers, String is probably the class we use most often in everyday coding. Many people know the String API well but have never read its source code. In my view, when you start using an API it is enough to read the official documentation, but as your development experience grows you should try reading the source as well, which does a lot to improve your skills. Once you understand the source, using the API becomes much more natural.
Note: the following notes are based on JDK 8.
String is just a class
String is really just a class, and we can roughly analyze it from the following angles:
- Class inheritance relation
- Class member variable
- Class constructor
- Class member methods
- Dependent static method
Inheritance relationships
The UML class diagram of String, exported from an IDEA plugin, shows the following:
String implements Serializable, Comparable, and CharSequence.
- Serializable: a class that implements this interface can be serialized. The interface declares no methods and serves only as a marker.
- Comparable: a class that implements this interface can be ordered. For example, a list of objects implementing it can be sorted automatically with the static sort method of the Collections class (and an array with Arrays.sort).
- CharSequence: a unified interface for character sequences. It provides common operations on character sequences, usually read-only methods. Many character-related classes implement it, such as String, StringBuffer, and so on.
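As a rough illustration of what these three interfaces give you in practice, the sketch below sorts strings through Comparable, passes a String where a CharSequence is expected, and round-trips a String through Java serialization. The class and method names (InterfacesDemo, sortedCopy, countDigits, roundTrip) are made up for this example and are not part of the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class InterfacesDemo {

    // Comparable: Collections.sort relies on String.compareTo for the natural order.
    public static List<String> sortedCopy(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }

    // CharSequence: the same method accepts String, StringBuilder, StringBuffer, ...
    public static int countDigits(CharSequence text) {
        int digits = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isDigit(text.charAt(i))) {
                digits++;
            }
        }
        return digits;
    }

    // Serializable: write the string into a byte buffer and read it back.
    public static String roundTrip(String s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(s);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (String) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sortedCopy(Arrays.asList("pear", "apple", "fig"))); // [apple, fig, pear]
        System.out.println(countDigits("a1b22"));                              // 3
        System.out.println(countDigits(new StringBuilder("42")));              // 2
        System.out.println(roundTrip("hello").equals("hello"));                // true
    }
}
```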
The String class is defined as follows:
public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence { ... }
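The final modifier means that the String class cannot be inherited. On its own this does not make String immutable; immutability also depends on how the class stores and protects its characters, which the member variable section covers.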
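If you want to confirm the final modifier without opening the source, reflection can report it. The following is only a small sketch; FinalClassDemo and isFinalClass are hypothetical names chosen for this example.

```java
import java.lang.reflect.Modifier;
import java.util.ArrayList;

final class FinalClassDemo {

    // True when the class is declared final and therefore cannot be extended.
    public static boolean isFinalClass(Class<?> type) {
        return Modifier.isFinal(type.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(isFinalClass(String.class));    // true
        System.out.println(isFinalClass(ArrayList.class)); // false
        // class MyString extends String {}  // would not compile: String is final
    }
}
```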
Class member variable
Here we mainly introduce the most critical member variable value[], which is defined as follows:
/** The value is used for character storage. */
private final char value[];
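A String is a sequence of char values, so internally it is simply a character array, represented by value[]. Note that value[] is final, which means the reference cannot be reassigned once the instance has been constructed. The array elements themselves could still be changed in principle, but String keeps the array private and never modifies it, and that is what makes a string immutable.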
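The difference between a final reference and frozen contents is easy to see on an ordinary array field. The sketch below uses a made-up class, FinalArrayDemo, that deliberately does what String never does (it changes an element of its own final array), and then shows that String operations such as concat return a new object instead of changing the original.

```java
final class FinalArrayDemo {

    // A final array reference cannot be reassigned, but its elements can still change.
    private final char[] value = {'a', 'b', 'c'};

    public String mutateAndRead() {
        value[0] = 'X';          // legal: final does not freeze the elements
        // value = new char[3];  // would not compile: cannot assign a value to a final field
        return new String(value);
    }

    // String itself never modifies its array; "modifying" methods return new strings.
    public static boolean concatLeavesOriginalUntouched() {
        String s = "abc";
        String t = s.concat("def");
        return s.equals("abc") && t.equals("abcdef");
    }

    public static void main(String[] args) {
        System.out.println(new FinalArrayDemo().mutateAndRead()); // Xbc
        System.out.println(concatLeavesOriginalUntouched());      // true
    }
}
```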
Class constructor
There are a number of overloaded String constructors, as described below:
- The no-argument constructor initializes an instance that represents the empty string. In theory you do not need this constructor, because writing
  String s = "";
  already gives you an empty String object. The constructor simply assigns the value[] of the empty string literal to the new instance. The source code is as follows:
  public String() {
      this.value = "".value;
  }
- Constructs a String object from another string. It assigns the argument's value and hash to the new instance. Note that this copies only the reference to the character array, not the array itself, so it is not a deep copy; sharing the array is safe because a String never modifies it. The source code is as follows:
  public String(String original) {
      this.value = original.value;
      this.hash = original.hash;
  }
- Builds a new String object from a character array. The Arrays.copyOf method is used to copy the array, so later changes to the caller's array do not affect the string:
  public String(char value[]) {
      this.value = Arrays.copyOf(value, value.length);
  }
- Builds a new String object from part of a source character array, selected by an offset (the starting position) and a count (the number of characters):
  public String(char value[], int offset, int count) {
      // A negative offset is out of bounds
      if (offset < 0) {
          throw new StringIndexOutOfBoundsException(offset);
      }
      if (count <= 0) {
          // A negative count is out of bounds
          if (count < 0) {
              throw new StringIndexOutOfBoundsException(count);
          }
          // count == 0 with an offset inside the array yields the empty string
          if (offset <= value.length) {
              this.value = "".value;
              return;
          }
      }
      // Note: offset or count might be near -1>>>1.
      // If offset + count runs past the end of the array, throw an out-of-bounds exception
      if (offset > value.length - count) {
          throw new StringIndexOutOfBoundsException(offset + count);
      }
      // Arrays.copyOfRange copies the characters in [offset, offset + count)
      // into the array of the current instance
      this.value = Arrays.copyOfRange(value, offset, offset+count);
  }
- Builds a new String object from part of a source integer array, selected by an offset (the starting position) and a count (the number of elements). Each integer in the array is a Unicode code point, not just an ASCII value:
  public String(int[] codePoints, int offset, int count) {
      // A negative offset is out of bounds
      if (offset < 0) {
          throw new StringIndexOutOfBoundsException(offset);
      }
      if (count <= 0) {
          // A negative count is out of bounds
          if (count < 0) {
              throw new StringIndexOutOfBoundsException(count);
          }
          // count == 0 with an offset inside the array yields the empty string
          if (offset <= codePoints.length) {
              this.value = "".value;
              return;
          }
      }
      // Note: offset or count might be near -1>>>1.
      // If offset + count runs past the end of the array, throw an out-of-bounds exception
      if (offset > codePoints.length - count) {
          throw new StringIndexOutOfBoundsException(offset + count);
      }

      final int end = offset + count;

      // Pass 1: compute the exact size n of the char array. Each supplementary
      // code point needs one extra char; invalid code points are rejected.
      int n = count;
      for (int i = offset; i < end; i++) {
          int c = codePoints[i];
          if (Character.isBmpCodePoint(c))
              continue;
          else if (Character.isValidCodePoint(c))
              n++;
          else throw new IllegalArgumentException(Integer.toString(c));
      }

      // Pass 2: allocate the array with the size computed above and fill it
      final char[] v = new char[n];

      for (int i = offset, j = 0; i < end; i++, j++) {
          int c = codePoints[i];
          if (Character.isBmpCodePoint(c))
              v[j] = (char)c;
          else
              Character.toSurrogates(c, v, j++);
      }

      // Assign the char array to the current instance
      this.value = v;
  }
- Initializes a String instance by decoding length bytes of the source byte array, starting at offset, with the character encoding named by charsetName:
  public String(byte bytes[], int offset, int length, String charsetName)
          throws UnsupportedEncodingException {
      // A null charset name results in a NullPointerException
      if (charsetName == null)
          throw new NullPointerException("charsetName");
      // Static helper that checks whether offset and length stay within the byte array
      checkBounds(bytes, offset, length);
      // StringCoding.decode decodes length bytes, starting at offset, into a char array
      this.value = StringCoding.decode(charsetName, bytes, offset, length);
  }
- Same as the String(byte[], int, int, String) constructor, except that the encoding parameter of this overload has type Charset:
  public String(byte bytes[], int offset, int length, Charset charset) {
      if (charset == null)
          throw new NullPointerException("charset");
      checkBounds(bytes, offset, length);
      this.value = StringCoding.decode(charset, bytes, offset, length);
  }
- Constructs a String instance from the whole source byte array with the character encoding named by charsetName. It delegates to the String(byte[], int, int, String) constructor, starting at offset 0 and covering the full length of the byte array:
  public String(byte bytes[], String charsetName)
          throws UnsupportedEncodingException {
      this(bytes, 0, bytes.length, charsetName);
  }
- Constructs a String instance from the whole source byte array with the given Charset. It delegates to the String(byte[], int, int, Charset) constructor, starting at offset 0 and covering the full length of the byte array:
  public String(byte bytes[], Charset charset) {
      this(bytes, 0, bytes.length, charset);
  }
- Initializes a String instance by decoding length bytes of the source byte array, starting at offset. Unlike the String(byte[], int, int, String) constructor, it uses the system default character encoding:
  public String(byte bytes[], int offset, int length) {
      // Check that offset and length stay within the byte array
      checkBounds(bytes, offset, length);
      // Decode the bytes into a char array using the system default character encoding
      this.value = StringCoding.decode(bytes, offset, length);
  }
- Constructs a String instance from the whole source byte array with the system default encoding. It delegates to the String(byte[], int, int) constructor, starting at offset 0 and covering the full length of the byte array:
  public String(byte bytes[]) {
      this(bytes, 0, bytes.length);
  }
- Builds a new String from a StringBuffer. What is special about this constructor is the synchronized block: it locks the buffer so that only one thread at a time can work on it while the String object is being built, which makes the copy thread-safe:
  public String(StringBuffer buffer) {
      // Lock the StringBuffer argument
      synchronized(buffer) {
          // Copy the StringBuffer's character array into the array of the current instance
          this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
      }
  }
- Builds a new String from a StringBuilder. Unlike the StringBuffer constructor, this one does no locking and is therefore not thread-safe:
  public String(StringBuilder builder) {
      this.value = Arrays.copyOf(builder.getValue(), builder.length());
  }
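The behaviour of several of the constructors above can be checked from the outside, without access to the private value array. The following is a minimal sketch; ConstructorDemo and its method names are invented for this example, and the observations rely only on the public String API.

```java
import java.nio.charset.StandardCharsets;

final class ConstructorDemo {

    // String(char[]) copies the array, so later changes to the source are not visible.
    public static String defensiveCopy() {
        char[] source = {'j', 'a', 'v', 'a'};
        String s = new String(source);
        source[0] = 'J';
        return s;
    }

    // String(String) yields a distinct object with equal content.
    public static boolean copyIsDistinctButEqual() {
        String original = "hello";
        String copy = new String(original);
        return copy != original && copy.equals(original);
    }

    // String(char[], offset, count) selects count chars starting at offset.
    public static String slice() {
        char[] source = {'h', 'e', 'l', 'l', 'o'};
        return new String(source, 1, 3);
    }

    // A range that runs past the end of the array is rejected.
    public static boolean rejectsBadRange() {
        try {
            new String(new char[] {'a', 'b'}, 1, 5);
            return false;
        } catch (StringIndexOutOfBoundsException expected) {
            return true;
        }
    }

    // String(int[], offset, count): a supplementary code point occupies two chars.
    public static int lengthOfSupplementary() {
        int[] codePoints = {0x1D56B}; // a code point above U+FFFF
        return new String(codePoints, 0, 1).length();
    }

    // String(byte[], Charset): decode with an explicit charset instead of the default.
    public static String decodeUtf8() {
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA9}; // UTF-8 bytes of U+00E9
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(defensiveCopy());               // java
        System.out.println(copyIsDistinctButEqual());      // true
        System.out.println(slice());                       // ell
        System.out.println(rejectsBadRange());             // true
        System.out.println(lengthOfSupplementary());       // 2
        System.out.println(decodeUtf8().codePointAt(0));   // 233
    }
}
```

Passing an explicit Charset, as in decodeUtf8, avoids depending on the platform default encoding that the byte[]-only constructors use.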
Class member methods
- Gets the string length, which is simply the length of the character array:
  public int length() {
      return value.length;
  }
- Checks whether the string is empty by checking whether the length of the character array is zero:
  public boolean isEmpty() {
      return value.length == 0;
  }
- Gets the character at the given index:
  public char charAt(int index) {
      // If the index is negative, or greater than or equal to the array length,
      // an out-of-bounds exception is thrown
      if ((index < 0) || (index >= value.length)) {
          throw new StringIndexOutOfBoundsException(index);
      }
      // Return the character at the given index
      return value[index];
  }
- Gets the Unicode code point (int) of the character at the given index:
  public int codePointAt(int index) {
      // If the index is negative, or greater than or equal to the array length,
      // an out-of-bounds exception is thrown
      if ((index < 0) || (index >= value.length)) {
          throw new StringIndexOutOfBoundsException(index);
      }
      // Return the code point (int) at the index position
      return Character.codePointAtImpl(value, index, value.length);
  }
- Returns the Unicode code point (int) of the element that precedes the index position:
  public int codePointBefore(int index) {
      // Index of the element that precedes index
      int i = index - 1;
      // Check that it is within bounds
      if ((i < 0) || (i >= value.length)) {
          throw new StringIndexOutOfBoundsException(index);
      }
      return Character.codePointBeforeImpl(value, index, 0);
  }
- The codePointCount() method returns the number of Unicode code points in the given range, that is, the actual number of characters, which makes it similar to length(). For a string that contains only characters from the Basic Multilingual Plane, the two give the same result. They differ when the string contains supplementary characters, which are stored as surrogate pairs. For example, for String str = "\uD835\uDD6B" (a double-struck z, U+1D56B), length() is 2 while codePointCount(0, str.length()) is 1:
  public int codePointCount(int beginIndex, int endIndex) {
      if (beginIndex < 0 || endIndex > value.length || beginIndex > endIndex) {
          throw new IndexOutOfBoundsException();
      }
      return Character.codePointCountImpl(value, beginIndex, endIndex - beginIndex);
  }
- Returns the index that lies codePointOffset code points away from the given index. A surrogate pair counts as a single code point, so the returned index can move by more than codePointOffset chars:
  public int offsetByCodePoints(int index, int codePointOffset) {
      if (index < 0 || index > value.length) {
          throw new IndexOutOfBoundsException();
      }
      return Character.offsetByCodePointsImpl(value, 0, value.length,
              index, codePointOffset);
  }
- This method is meant for internal use. It has no access modifier, so it is package-private rather than private: only classes in the same package can call it. Parameters: dst[] is the destination array, and dstBegin is the offset in the destination array at which copying starts. It copies the entire value array of the String into the dst character array:
  void getChars(char dst[], int dstBegin) {
      System.arraycopy(value, 0, dst, dstBegin, value.length);
  }
- Copies characters from the string into a target character array. Parameters: srcBegin is the index of the first character to copy; srcEnd is the index after the last character to copy (the copied range does not include srcEnd); dst[] is the target character array; dstBegin is the offset in dst at which the copied characters are written, overwriting whatever was stored there:
  public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
      if (srcBegin < 0) {
          throw new StringIndexOutOfBoundsException(srcBegin);
      }
      if (srcEnd > value.length) {
          throw new StringIndexOutOfBoundsException(srcEnd);
      }
      if (srcBegin > srcEnd) {
          throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
      }
      System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
  }
- Gets the bytes of the string by encoding it into a byte array with the character encoding named by charsetName:
  public byte[] getBytes(String charsetName)
          throws UnsupportedEncodingException {
      if (charsetName == null) throw new NullPointerException();
      return StringCoding.encode(charsetName, value, 0, value.length);
  }
- Gets the bytes of the string by encoding it into a byte array with the given Charset:
  public byte[] getBytes(Charset charset) {
      if (charset == null) throw new NullPointerException();
      return StringCoding.encode(charset, value, 0, value.length);
  }
- Gets the bytes of the string by encoding it into a byte array with the system default character encoding:
  public byte[] getBytes() {
      return StringCoding.encode(value, 0, value.length);
  }
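To make the code point methods and the copy and encode methods above more concrete, here is a small sketch that calls them on a string containing the surrogate pair mentioned earlier. MethodsDemo and its helper names are hypothetical; only the String and Character calls come from the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class MethodsDemo {

    // U+1D56B written as a surrogate pair: two chars, one code point.
    static final String Z = "\uD835\uDD6B";

    // Returns {length(), codePointCount()} for the surrogate pair.
    public static int[] lengthVsCodePoints() {
        return new int[] {Z.length(), Z.codePointCount(0, Z.length())};
    }

    // Both calls see the whole pair and return the supplementary code point.
    public static int codePointAtStart() {
        return Z.codePointAt(0);
    }

    public static int codePointBeforeEnd() {
        return Z.codePointBefore(Z.length());
    }

    // Moving one code point forward from index 1 skips both chars of the pair.
    public static int skipOneCodePoint() {
        String s = "a" + Z + "b";
        return s.offsetByCodePoints(1, 1);
    }

    // getChars copies the range [1, 4) of "hello" into dst starting at dst[1].
    public static char[] copyRange() {
        char[] dst = {'-', '-', '-', '-', '-'};
        "hello".getChars(1, 4, dst, 1);
        return dst;
    }

    // getBytes encodes chars into bytes; U+00E9 takes two bytes in UTF-8.
    public static byte[] utf8Bytes() {
        return "\u00E9".getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(lengthVsCodePoints()));    // [2, 1]
        System.out.println(Integer.toHexString(codePointAtStart()));  // 1d56b
        System.out.println(Integer.toHexString(codePointBeforeEnd())); // 1d56b
        System.out.println(skipOneCodePoint());                       // 3
        System.out.println(new String(copyRange()));                  // -ell-
        System.out.println(Arrays.toString(utf8Bytes()));             // [-61, -87]
    }
}
```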
Simple summary
- String is declared final, so it cannot be inherited; together with its private final character array, which is never modified, this makes it an immutable class
- String implements the Serializable interface, so it can be serialized
- String implements the Comparable interface, so strings can be compared and ordered
- String implements the CharSequence interface, which represents an ordered sequence of characters and defines the common character-sequence methods
- String is a sequence of characters whose internal data structure is a character array, and all operations work on that array
- String frequently uses the arraycopy method of the System class to copy character arrays
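Since System.arraycopy shows up so often, a short sketch of how it behaves may help. The class ArrayCopyDemo and its two helpers are made up for this example: copyInto mirrors the shape of the package-private getChars shown earlier, and join is only an assumption about how two character arrays can be combined with Arrays.copyOf plus arraycopy, not a copy of any JDK method.

```java
import java.util.Arrays;

final class ArrayCopyDemo {

    // Copy all of src into dst, starting at dst[dstBegin].
    // Arguments of arraycopy: source, source position, destination,
    // destination position, number of elements.
    public static char[] copyInto(char[] src, char[] dst, int dstBegin) {
        System.arraycopy(src, 0, dst, dstBegin, src.length);
        return dst;
    }

    // Grow a copy of a to make room for b, then append b behind it.
    public static char[] join(char[] a, char[] b) {
        char[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    public static void main(String[] args) {
        char[] dst = {'-', '-', '-', '-'};
        System.out.println(new String(copyInto("ab".toCharArray(), dst, 1)));            // -ab-
        System.out.println(new String(join("foo".toCharArray(), "bar".toCharArray()))); // foobar
    }
}
```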
Finally
Because of space constraints, this first part of the String summary ends here, and the remaining parts will follow in later notes. They will be pushed to the official account [Zhang Shaolin] as soon as possible. You are welcome to follow!