Preface
The String class lives in the java.lang package and is one of the most frequently used classes in day-to-day development. This article collects my notes from reading the String source code.
Member variables
private final char value[];
private int hash;
String is declared as a final class. Its characters are stored in the char[] field value, and the hash field caches the result of hashCode().
Constructors
String provides many constructors. They fall mainly into the following categories:
- Construct from char[].
- Construct from int[].
- Construct from byte[].
- Construct from StringBuffer.
- Construct from StringBuilder.
Among these, the int[]-based constructors work with Unicode code points, and the byte[]-based constructors decode bytes according to a charset (for example ASCII or a Unicode encoding). Since I do not yet have in-depth knowledge of code points and Unicode, I will not analyze these constructors and their related functions in depth for now. I hope to fill this gap in the future.
Construct a String from char[]
public String(char value[]) {
    this.value = Arrays.copyOf(value, value.length);
}
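A char[] is a heap object, so assigning this.value = value directly would make the String share the caller's array, and any later change the caller made to that array would also change the supposedly immutable String. The constructor therefore uses Arrays.copyOf to make its own copy of the array.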
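The effect of that copy can be observed from the outside. The following is a minimal sketch, not part of the JDK source; the class name DefensiveCopyDemo and the helper buildThenMutate are made up for illustration.

```java
final class DefensiveCopyDemo {
    // Hypothetical helper: builds a String from an array, then mutates the array.
    static String buildThenMutate() {
        char[] chars = {'a', 'b', 'c'};
        String s = new String(chars); // the constructor copies chars
        chars[0] = 'z';               // change the caller's array afterwards
        return s;                     // the String keeps its own copy
    }

    public static void main(String[] args) {
        System.out.println(buildThenMutate()); // prints "abc"
    }
}
```

Because the String owns a private copy, the later write to chars[0] is not visible through s.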
Construct from byte[]
public String(byte bytes[], int offset, int length, String charsetName)
        throws UnsupportedEncodingException {
    if (charsetName == null)
        throw new NullPointerException("charsetName");
    checkBounds(bytes, offset, length);
    this.value = StringCoding.decode(charsetName, bytes, offset, length);
}
private static void checkBounds(byte[] bytes, int offset, int length) {
    if (length < 0)
        throw new StringIndexOutOfBoundsException(length);
    if (offset < 0)
        throw new StringIndexOutOfBoundsException(offset);
    if (offset > bytes.length - length)
        throw new StringIndexOutOfBoundsException(offset + length);
}
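Before constructing the string, the source code calls checkBounds to verify that length and offset are non-negative and that the requested range stays inside the array, that is, that offset + length does not exceed bytes.length. The check is written as offset > bytes.length - length, which avoids an integer overflow in offset + length. If any check fails, a StringIndexOutOfBoundsException is thrown.
After the checks pass, StringCoding.decode converts the bytes into the char[] that backs the String.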
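To make the bounds check concrete, here is a small sketch that restates it as a boolean and contrasts it with a naive form. The class ByteConstructorDemo and the helpers inBounds, naiveInBounds, and decodeSlice are hypothetical names introduced for this example; only the String constructor itself comes from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

final class ByteConstructorDemo {
    // Hypothetical restatement of the three checks in checkBounds.
    static boolean inBounds(int arrayLength, int offset, int length) {
        return length >= 0 && offset >= 0 && offset <= arrayLength - length;
    }

    // Naive variant, shown only for contrast: offset + length can overflow int.
    static boolean naiveInBounds(int arrayLength, int offset, int length) {
        return length >= 0 && offset >= 0 && offset + length <= arrayLength;
    }

    // Decodes a slice of bytes with the charset-name constructor discussed above.
    static String decodeSlice(byte[] bytes, int offset, int length) {
        try {
            return new String(bytes, offset, length, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on the Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "hello world".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeSlice(bytes, 6, 5));                          // world
        System.out.println(inBounds(bytes.length, Integer.MAX_VALUE, 1));      // false
        System.out.println(naiveInBounds(bytes.length, Integer.MAX_VALUE, 1)); // true (overflow)
    }
}
```

With offset = Integer.MAX_VALUE and length = 1, the sum wraps around to a negative number, so the naive comparison wrongly accepts the range, while the subtraction form rejects it.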
Construct from StringBuffer
public String(StringBuffer buffer) {
    synchronized(buffer) {
        this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
    }
}
StringBuffer is designed to be shared between threads, so the constructor synchronizes on the buffer to keep other threads from modifying it while its contents are being copied. The StringBuilder constructor is almost identical, but without synchronized, because StringBuilder is not thread-safe.
charAt source code analysis
public char charAt(int index) {
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return value[index];
}
charAt operates directly on the value array: after checking that index is within bounds, it returns value[index].
getChars source code analysis
public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
    if (srcBegin < 0) {
        throw new StringIndexOutOfBoundsException(srcBegin);
    }
    if (srcEnd > value.length) {
        throw new StringIndexOutOfBoundsException(srcEnd);
    }
    if (srcBegin > srcEnd) {
        throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
    }
    System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
}
The method first validates its arguments. Once they pass, it uses System.arraycopy to copy the characters from srcBegin (inclusive) to srcEnd (exclusive) out of value into dst, starting at dstBegin.
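A short usage sketch may help with the index conventions. The class GetCharsDemo and the helper copyRange are hypothetical names for this example; getChars is the real String method.

```java
final class GetCharsDemo {
    // Hypothetical helper: copies s[srcBegin, srcEnd) into a fresh array.
    static char[] copyRange(String s, int srcBegin, int srcEnd) {
        char[] dst = new char[srcEnd - srcBegin];
        s.getChars(srcBegin, srcEnd, dst, 0); // srcEnd is exclusive
        return dst;
    }

    public static void main(String[] args) {
        char[] word = copyRange("hello world", 6, 11);
        System.out.println(new String(word)); // prints "world"
    }
}
```

The destination array must have room for srcEnd - srcBegin characters starting at dstBegin; the helper sizes it exactly.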
equals source code analysis
public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = value.length;
        if (n == anotherString.value.length) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = 0;
            while (n-- != 0) {
                if (v1[i] != v2[i])
                    return false;
                i++;
            }
            return true;
        }
    }
    return false;
}
Like the equals method of most classes, String.equals first checks this == anObject and returns true immediately if both references point to the same object. Otherwise it goes on to compare the contents.
Before comparing contents, it checks that anObject is a String and that the two strings have the same length. If the lengths match, it walks both char arrays position by position and returns false at the first mismatch; if no mismatch is found, it returns true.
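The same sequence of checks can be written against the public API. This is only a sketch: the class EqualsSketch and the method sketchEquals are hypothetical, and the real method reads the private value array rather than calling length() and charAt().

```java
final class EqualsSketch {
    // Hypothetical re-implementation of the steps described above.
    static boolean sketchEquals(String self, Object other) {
        if (self == other) {
            return true;                      // same reference
        }
        if (other instanceof String) {
            String that = (String) other;
            int n = self.length();
            if (n == that.length()) {         // lengths must match first
                for (int i = 0; i < n; i++) {
                    if (self.charAt(i) != that.charAt(i)) {
                        return false;         // first mismatch ends the loop
                    }
                }
                return true;
            }
        }
        return false;                         // not a String, or different length
    }

    public static void main(String[] args) {
        String a = "java";
        String b = new String("java");        // distinct object, same content
        System.out.println(a == b);                  // false
        System.out.println(sketchEquals(a, b));      // true
        System.out.println(sketchEquals(a, "jav"));  // false
    }
}
```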
nonSyncContentEquals source code analysis
private boolean nonSyncContentEquals(AbstractStringBuilder sb) {
    char v1[] = value;
    char v2[] = sb.getValue();
    int n = v1.length;
    if (n != sb.length()) {
        return false;
    }
    for (int i = 0; i < n; i++) {
        if (v1[i] != v2[i]) {
            return false;
        }
    }
    return true;
}
This method determines whether a String has the same content as a StringBuffer or StringBuilder instance. It compares the lengths first, and then compares the characters one by one.
contentEquals source code analysis
public boolean contentEquals(CharSequence cs) {
    // Argument is a StringBuffer, StringBuilder
    if (cs instanceof AbstractStringBuilder) {
        if (cs instanceof StringBuffer) {
            synchronized(cs) {
                return nonSyncContentEquals((AbstractStringBuilder)cs);
            }
        } else {
            return nonSyncContentEquals((AbstractStringBuilder)cs);
        }
    }
    // Argument is a String
    if (cs instanceof String) {
        return equals(cs);
    }
    // Argument is a generic CharSequence
    char v1[] = value;
    int n = v1.length;
    if (n != cs.length()) {
        return false;
    }
    for (int i = 0; i < n; i++) {
        if (v1[i] != cs.charAt(i)) {
            return false;
        }
    }
    return true;
}
CharSequence is an interface whose implementations include StringBuilder, StringBuffer, and String, so contentEquals handles each kind of argument separately.
If cs instanceof AbstractStringBuilder holds, the argument is a StringBuilder or a StringBuffer. A further cs instanceof StringBuffer check decides whether the comparison must run inside a synchronized block.
If cs instanceof String holds, the method simply calls String.equals.
If none of the preceding conditions apply, the argument is treated as a generic CharSequence: the lengths are compared first, and then the characters are compared one by one through charAt.
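The branches can be exercised with one argument of each kind. The sketch below is illustrative only; the class ContentEqualsDemo and the helper compareAll are hypothetical, and CharBuffer is used simply as a CharSequence that is neither a String nor an AbstractStringBuilder.

```java
import java.nio.CharBuffer;

final class ContentEqualsDemo {
    // Hypothetical helper: compares s against several CharSequence kinds.
    static boolean[] compareAll(String s) {
        return new boolean[] {
            s.contentEquals(new StringBuilder(s)),  // AbstractStringBuilder branch
            s.contentEquals(new StringBuffer(s)),   // same branch, synchronized
            s.contentEquals(CharBuffer.wrap(s)),    // generic CharSequence branch
            s.equals(new StringBuilder(s))          // equals only accepts a String
        };
    }

    public static void main(String[] args) {
        for (boolean result : compareAll("abc")) {
            System.out.println(result);             // true, true, true, false
        }
    }
}
```

The last entry shows why contentEquals exists: equals returns false for any argument that is not a String, even when the characters match.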
equalsIgnoreCase source code analysis
public boolean equalsIgnoreCase(String anotherString) {
    return (this == anotherString) ? true
            : (anotherString != null)
            && (anotherString.value.length == value.length)
            && regionMatches(true, 0, anotherString, 0, value.length);
}
This method determines whether two strings are equal when case is ignored. Before invoking regionMatches, it makes sure that anotherString is not null and that its length equals the length of this.value. The regionMatches method is analyzed below.
regionMatches method
When case is ignored, regionMatches first compares each pair of characters directly. If they differ, it converts both characters to uppercase and compares them again; if that still fails, it converts both to lowercase and compares once more. According to the comment in the source, the lowercase step compensates for alphabets (the comment names Georgian) in which conversion to uppercase does not work properly.
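Since the listing for regionMatches is not reproduced here, the per-character comparison can be sketched as follows. The class IgnoreCaseSketch and the helper charsEqualIgnoreCase are hypothetical names, and the sketch assumes the JDK 8 order of steps (direct comparison, then uppercase, then lowercase); it is not the JDK method itself.

```java
final class IgnoreCaseSketch {
    // Hypothetical helper mirroring the three-step comparison.
    static boolean charsEqualIgnoreCase(char c1, char c2) {
        if (c1 == c2) {
            return true;                       // step 1: identical chars
        }
        char u1 = Character.toUpperCase(c1);
        char u2 = Character.toUpperCase(c2);
        if (u1 == u2) {
            return true;                       // step 2: equal after uppercasing
        }
        // step 3: last chance, compare the lowercase forms
        return Character.toLowerCase(u1) == Character.toLowerCase(u2);
    }

    public static void main(String[] args) {
        System.out.println(charsEqualIgnoreCase('a', 'A'));               // true
        System.out.println(charsEqualIgnoreCase('a', 'b'));               // false
        System.out.println("Hello".regionMatches(true, 0, "hELLO", 0, 5)); // true
    }
}
```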
hashCode source code analysis
public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}
The hash code of a String is computed as s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], where s[i] is the i-th character of the string, n is the length of the string, and ^ denotes exponentiation. (The hash value of an empty string is zero.)
From the source code, we can see that hashCode uses 31 as the multiplier, mainly for the following reasons:
- 31 is a moderately sized prime, one of the preferred primes for a hashCode multiplier. Other nearby primes, such as 37, 41, and 43, are also good choices. So why pick 31? See the second reason.
- Multiplication by 31 can be optimized by the JVM into a shift and a subtraction: 31 * i == (i << 5) - i.
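Both points can be checked with a few lines of code. The sketch below uses hypothetical names (HashCodeDemo, polynomialHash, times31) and recomputes the hash through the public API instead of the private value array.

```java
final class HashCodeDemo {
    // Hypothetical helper: the same loop as hashCode, without the cached field.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // int arithmetic, overflow wraps around
        }
        return h;
    }

    // Hypothetical helper: multiplication by 31 as shift and subtract.
    static int times31(int i) {
        return (i << 5) - i;
    }

    public static void main(String[] args) {
        // 'a' is 97 and 'b' is 98, so the hash of "ab" is 97 * 31 + 98.
        System.out.println(polynomialHash("ab"));                         // 3105
        System.out.println(polynomialHash("hello") == "hello".hashCode()); // true
        System.out.println(times31(7) == 31 * 7);                          // true
    }
}
```

The shift form stays equal to 31 * i even when the result overflows, because both expressions wrap around in the same 32-bit arithmetic.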
compareTo source code analysis
public int compareTo(String anotherString) {
    int len1 = value.length;
    int len2 = anotherString.value.length;
    int lim = Math.min(len1, len2);
    char v1[] = value;
    char v2[] = anotherString.value;
    int k = 0;
    while (k < lim) {
        char c1 = v1[k];
        char c2 = v2[k];
        if (c1 != c2) {
            return c1 - c2;
        }
        k++;
    }
    return len1 - len2;
}
The source code for this method is easy to follow: it compares two strings lexicographically, based on the Unicode value of each character. At the first position where the two strings differ, it returns the difference between the two character values (c1 - c2). If no differing character is found within the length of the shorter string, it returns the difference between the two lengths (len1 - len2), which is 0 when the strings are identical.
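A few concrete calls illustrate the two kinds of return value. The class CompareToDemo and the helper results are hypothetical names used only for this example.

```java
final class CompareToDemo {
    // Hypothetical helper: collects a few representative results.
    static int[] results() {
        return new int[] {
            "abc".compareTo("abd"),    // first difference: 'c' - 'd'
            "abc".compareTo("abcde"),  // prefix case: 3 - 5
            "abc".compareTo("abc"),    // identical strings
            "B".compareTo("a")         // 'B' (66) - 'a' (97)
        };
    }

    public static void main(String[] args) {
        for (int result : results()) {
            System.out.println(result); // -1, -2, 0, -31
        }
    }
}
```

The last case shows that the ordering follows character values rather than dictionary order: every uppercase ASCII letter sorts before every lowercase one.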
Reference
- Java training guide: high frequency source code analysis