1. Strings
1.1. String Creation (JDK8)
1.1.1. char[] array creation
String s = new String(new char[] {'a', 'b', 'c'});
1.1.2. byte[] array creation
String s = new String(new byte[] {97, 98, 99});
During construction, the byte[] is decoded into a char[]; how many bytes form one char depends on the character set.
When decoding with the GBK character set, the two bytes 0xD5 and 0xC5 are converted to the single char 0x5F20.
String s = new String(new byte[] {(byte) 0xD5, (byte) 0xC5}, Charset.forName("gbk"));
When decoding with the UTF-8 character set, the three bytes 0xE5, 0xBC, and 0xA0 are converted to the single char 0x5F20.
String s = new String(new byte[] {(byte) 0xE5, (byte) 0xBC, (byte) 0xA0}, Charset.forName("utf-8"));
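The decoding described above can be checked with a small program. This is a minimal sketch: the class name DecodeDemo and the helper firstCharHex are made up for illustration, and the GBK branch is guarded because GBK is an extended charset that may be missing from a trimmed-down runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Hypothetical helper: decodes the bytes with the given charset
    // and returns the first char of the result as a hex string.
    public static String firstCharHex(byte[] bytes, Charset charset) {
        String s = new String(bytes, charset);
        return Integer.toHexString(s.charAt(0));
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xE5, (byte) 0xBC, (byte) 0xA0};
        System.out.println(firstCharHex(utf8, StandardCharsets.UTF_8)); // 5f20

        // GBK is not guaranteed to be present on every Java runtime.
        if (Charset.isSupported("GBK")) {
            byte[] gbk = {(byte) 0xD5, (byte) 0xC5};
            System.out.println(firstCharHex(gbk, Charset.forName("GBK")));
        }
    }
}
```

Both byte sequences encode the same character, so the resulting strings are equal even though the inputs differ in length.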
1.1.3. int[] Array creation
Sometimes two chars are needed to represent one character, such as 😂, whose Unicode code point is 0x1F602. That value is beyond 0xFFFF, the largest value a char can hold, so an int[] is used to construct such a string.
String s = new String(new int[] {0x1F602}, 0, 1);
The int 0x1F602 is converted to the two chars 0xD83D and 0xDE02, a surrogate pair that together represents 😂.
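The conversion from one code point to two chars can be observed directly. A minimal sketch follows; the class name CodePointDemo and the helper fromCodePoint are illustrative names, not part of the JDK.

```java
public class CodePointDemo {
    // Hypothetical helper: builds a string from a single Unicode code point
    // using the String(int[] codePoints, int offset, int count) constructor.
    public static String fromCodePoint(int codePoint) {
        return new String(new int[] {codePoint}, 0, 1);
    }

    public static void main(String[] args) {
        String s = fromCodePoint(0x1F602);
        System.out.println(s.length());                        // 2 (chars)
        System.out.println(s.codePointCount(0, s.length()));   // 1 (code point)
        System.out.println(Integer.toHexString(s.charAt(0)));  // d83d (high surrogate)
        System.out.println(Integer.toHexString(s.charAt(1)));  // de02 (low surrogate)
    }
}
```

Note that length() counts chars, not characters, which is why it reports 2 for a single emoji.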
1.1.4. Create from an existing string
Pass in a source string and create a new string.
public String(String original) {
    this.value = original.value;
    this.coder = original.coder; // the coder field exists only since JDK 9; the JDK 8 constructor has no such line
    this.hash = original.hash;
}
Usage:
String s1 = new String(new char[] {'a', 'b', 'c'});
String s2 = new String(s1);
As the constructor shows, the value fields of both strings refer to the same char[] array.
1.1.5. Literal creation
The most common creation method.
int i = 10; // 10 is a literal
String s = "abc"; // "abc" is a string literal
[Not an object]
A literal does not become a string object until the statement that contains it is executed.
After the Java code above is compiled into a class file, "abc" is stored in the class file's constant pool.
[Lazy loading]
The object is created only when the code containing the literal is executed.
// Suppose there are now 100 String objects
System.out.println("a"); // There are 101 String objects after execution
System.out.println("b"); // There are 102 String objects after execution
System.out.println("c"); // There are 103 String objects after execution
[No duplicates]
For the same literal in the same class, the class file constant pool holds only one entry at compile time, and only one string object is created at run time.
String s1 = "abc";
String s2 = "abc";
System.out.println(s1 == s2); // true
For the same literal in different classes, each class file's constant pool holds its own entry at compile time, but only one string object is created at run time.
// StringTest.java
public class StringTest {
    public static void main(String[] args) {
        String s1 = "abc";
        String s2 = "abc";
        StringTest2.main(new String[] {s1, s2});
    }
}
// StringTest2.java
public class StringTest2 {
    public static void main(String[] args) {
        String s = "abc";
        System.out.println(s == args[0]); // true
        System.out.println(s == args[1]); // true
    }
}
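The cross-class case can also be tried in a single source file. The sketch below relies on the fact that each nested class is compiled into its own class file with its own constant pool; the class names are made up for illustration.

```java
public class LiteralIdentityDemo {
    // Compiled to LiteralIdentityDemo$First.class with its own constant pool.
    static class First {
        static String literal() {
            return "abc";
        }
    }

    // Compiled to LiteralIdentityDemo$Second.class with its own constant pool.
    static class Second {
        static String literal() {
            return "abc";
        }
    }

    // Returns true when both classes hand out the very same String object.
    public static boolean sameObject() {
        return First.literal() == Second.literal();
    }

    public static void main(String[] args) {
        System.out.println(sameObject()); // true
    }
}
```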
1.1.6. Concatenation creation
Concatenate two strings into a new string using the plus operator +.
// ① Literal + literal
String s1 = "a" + "b";
// ② Literal + final variable
final String x = "b";
String s2 = "a" + x;
// ③ Literal + variable (a separate snippet: here x is not final)
String x = "b";
String s3 = "a" + x;
// ④ Literal + non-String variable
int n = 1;
String s4 = "a" + n;
In principle, ① and ② work the same way, and so do ③ and ④.
① : No real concatenation takes place. At compile time, "a" and "b" have already been joined into "ab".
Constant pool:
#1 = Methodref #4.#20 // java/lang/Object."<init>":()V
#2 = String #21 // ab
② : No real concatenation takes place either. final means that the value of x cannot change, so every reference to x can safely be replaced with "b" at compile time.
Constant pool:
#1 = Methodref #5.#22 // java/lang/Object."<init>":()V
#2 = String #23 // b
#3 = String #24 // ab
③ : A real concatenation takes place. At compile time, the plus operator + is replaced with StringBuilder calls that perform the concatenation.
Constant pool:
#1 = Methodref #9.#26 // java/lang/Object."<init>":()V
#2 = String #27 // b
#3 = Class #28 // java/lang/StringBuilder
#4 = Methodref #3.#26 // java/lang/StringBuilder."<init>":()V
#5 = String #29 // a
Because the concatenation really happens at run time, the source code is compiled into StringBuilder calls.
String x = "b";
String s3 = new StringBuilder().append("a").append(x).toString();
In effect, the toString() method creates a new String object from the value array that the StringBuilder maintains.
public final class StringBuilder extends AbstractStringBuilder
        implements java.io.Serializable, Comparable<StringBuilder>, CharSequence {
    // Inherited from AbstractStringBuilder
    char[] value; // JDK 9 uses byte[] value;
    public String toString() {
        return new String(value, 0, count);
    }
}
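That toString() builds a new String on every call can be verified with a short check. A minimal sketch, with an illustrative class name and helper; it uses a non-empty builder, since empty results may be special-cased by some JDK versions.

```java
public class ToStringDemo {
    // Returns true if two toString() calls on the same builder
    // yield two distinct objects with equal content.
    public static boolean distinctObjects(StringBuilder sb) {
        String first = sb.toString();
        String second = sb.toString();
        return first != second && first.equals(second);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder().append("a").append("b");
        System.out.println(distinctObjects(sb));   // true
        // The result is a fresh heap object, not the literal "ab".
        System.out.println(sb.toString() == "ab"); // false
    }
}
```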
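④ : When the non-String operand is a variable, the principle is exactly the same as ③. If both operands are compile-time constants, as in "a" + 1 with the literal 1, javac folds the expression at compile time as in ①.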
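The difference between compile-time folding and run-time concatenation shows up when the results are compared with ==. The sketch below is only an illustration; the class name and the results() helper are made up, and the fifth entry uses a non-final int variable to force run-time concatenation.

```java
import java.util.Arrays;

public class ConcatDemo {
    public static boolean[] results() {
        String ab = "ab";
        String a1 = "a1";

        String s1 = "a" + "b";   // literal + literal: folded by javac
        final String fx = "b";
        String s2 = "a" + fx;    // literal + final constant variable: folded by javac
        String x = "b";
        String s3 = "a" + x;     // literal + non-final variable: concatenated at run time
        String s4 = "a" + 1;     // both operands are literals: folded by javac
        int n = 1;
        String s5 = "a" + n;     // literal + non-final int variable: concatenated at run time

        return new boolean[] {s1 == ab, s2 == ab, s3 == ab, s4 == a1, s5 == a1};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(results())); // [true, true, false, true, false]
    }
}
```

Folded expressions end up as ordinary literals in the constant pool, which is why they compare identical to the matching literal.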
1.2. Changes in JDK 9
1.2.1. Memory structure changes
To save memory, a byte[] is used instead of a char[] to store the characters.
① If the string contains only Latin-1 characters, each character occupies one byte in the byte[], whereas it occupied two bytes in a char[].
String s = new String(new byte[] {97, 98, 99});
② If the string contains only Chinese characters, each character still occupies two bytes, so no memory is saved.
String s = new String(new byte[] {(byte) 0xD5, (byte) 0xC5}, Charset.forName("gbk"));
③ If the string contains both Latin and Chinese characters, the Latin characters are also stored as two-byte characters, so no memory is saved either.
String s = new String(new byte[] {(byte) 0xD5, (byte) 0xC5, 97}, Charset.forName("gbk"));
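The three cases can be summarized as one rule: one byte per char if every char fits in Latin-1, otherwise two bytes per char for the whole string. The sketch below only models that rule; valueBytes is a hypothetical helper and does not inspect the real internal array.

```java
public class CompactStringModel {
    // Hypothetical model of the JDK 9+ storage rule, not the JDK's own code:
    // returns the number of bytes the internal value array would need.
    public static int valueBytes(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return s.length() * 2; // any non-Latin-1 char forces two bytes per char
            }
        }
        return s.length();             // Latin-1 only: one byte per char
    }

    public static void main(String[] args) {
        System.out.println(valueBytes("abc"));     // 3
        System.out.println(valueBytes("\u5F20"));  // 2
        System.out.println(valueBytes("\u5F20a")); // 4
    }
}
```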
1.2.2. Changes to concatenation
JDK 8 replaces the + operator with StringBuilder calls for string concatenation. By default, JDK 9 compiles it to the invokedynamic bytecode instruction, which links to the concatenation method at run time and invokes it through a MethodHandle.
// Requires: import java.lang.invoke.*;
public static void main(String[] args) throws Throwable {
    String x = "b";
    // String s = "a" + x;
    // Bytecode equivalent to the following is generated.
    // The lookup is used to find the MethodHandle; in the real bytecode it is supplied automatically.
    MethodHandles.Lookup lookup = MethodHandles.lookup();
    CallSite callSite = StringConcatFactory.makeConcatWithConstants(
        lookup,
        // The method name is not important; the compiler generates it automatically
        "arbitrary",
        // The first String is the return type, followed by the parameter types
        MethodType.methodType(String.class, String.class),
        // The concatenation recipe, where \1 is a placeholder for an argument, here replaced by x at run time
        "a\1"
    );
    // callSite.getTarget() returns a MethodHandle that performs the concatenation
    String s = (String) callSite.getTarget().invoke(x);
}
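The same factory call works with several arguments of mixed types. The following is a minimal sketch, assuming JDK 9 or later; the class name, the keyValue helper, and the method name "concat" are arbitrary choices for illustration.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

public class IndyConcatDemo {
    // Hypothetical helper: links a call site for the shape key + "=" + value
    // and invokes it once. Real code would link once and reuse the call site.
    public static String keyValue(String key, int value) {
        try {
            CallSite callSite = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "concat", // arbitrary name
                    MethodType.methodType(String.class, String.class, int.class),
                    "\1=\1"); // two argument placeholders around the constant "="
            return (String) callSite.getTarget().invoke(key, value);
        } catch (Throwable t) {
            throw new IllegalStateException("concatenation call site failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(keyValue("size", 42)); // size=42
    }
}
```

Each \1 in the recipe consumes one argument in order, and the text between the placeholders is baked into the generated method.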
Why go to all this trouble? Mainly so that string concatenation can be extended and optimized in different ways. The most important piece is the MethodHandle, which is generated according to a strategy. All strategies that the JDK provides can be found in StringConcatFactory.Strategy:
Strategy name | Internal implementation | Explanation |
---|---|---|
BC_SB | Generates StringBuilder code as bytecode | Equivalent to new StringBuilder() |
BC_SB_SIZED | Generates StringBuilder code as bytecode | Equivalent to new StringBuilder(n), where n is the estimated size |
BC_SB_SIZED_EXACT | Generates StringBuilder code as bytecode | Equivalent to new StringBuilder(n), where n is the exact size |
MH_SB_SIZED | Generates StringBuilder code through MethodHandle | Equivalent to new StringBuilder(n), where n is the estimated size |
MH_SB_SIZED_EXACT | Generates StringBuilder code through MethodHandle | Equivalent to new StringBuilder(n), where n is the exact size |
MH_INLINE_SIZED_EXACT | MethodHandle constructs the string directly from a byte array | The default strategy |
To change the strategy, add JVM parameters at run time. For example, to change the strategy to BC_SB:
-Djava.lang.invoke.stringConcat=BC_SB -Djava.lang.invoke.stringConcat.debug=true -Djava.lang.invoke.stringConcat.dumpClasses=<export path for the generated anonymous classes>
1.2.3. Default concatenation strategy
The default strategy is MH_INLINE_SIZED_EXACT, which uses a byte array to construct the string directly.
For example, take the following string concatenation code:
String x = "b";
String s = "a" + x + "c" + "d";
With the MH_INLINE_SIZED_EXACT strategy, calls equivalent to the following are made internally:
// Illustrative only: StringConcatHelper is internal to the JDK and cannot be called from application code
String x = "b";
// Preallocate the byte array that the string needs
byte[] buf = new byte[4];
// Create the new string; its internal byte array is [0,0,0,0]
String s = StringConcatHelper.newString(buf, 0);
// Concatenate "a"; the internal byte array becomes [97,0,0,0]
StringConcatHelper.prepend(1, buf, "a");
// Concatenate x; the internal byte array becomes [97,98,0,0]
StringConcatHelper.prepend(2, buf, x);
// Concatenate "cd"; the internal byte array becomes [97,98,99,100]
StringConcatHelper.prepend(4, buf, "cd");
// Concatenation is complete at this point.
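The idea behind this strategy, sizing the array exactly once and then writing every part into place, can be imitated in ordinary code. The sketch below is not the JDK implementation: ExactSizeConcat, concat, and prepend are hypothetical names that mirror the calls above, and the code assumes all parts contain Latin-1 characters only.

```java
import java.nio.charset.StandardCharsets;

public class ExactSizeConcat {
    // Hypothetical helper: writes value into buf so that it ends just before index end.
    static void prepend(int end, byte[] buf, String value) {
        int start = end - value.length();
        for (int i = 0; i < value.length(); i++) {
            buf[start + i] = (byte) value.charAt(i); // assumes Latin-1 characters
        }
    }

    public static String concat(String... parts) {
        int total = 0;
        for (String part : parts) {
            total += part.length();      // pass 1: compute the exact size
        }
        byte[] buf = new byte[total];    // single allocation, never resized
        int end = 0;
        for (String part : parts) {
            end += part.length();
            prepend(end, buf, part);     // pass 2: copy each part into place
        }
        return new String(buf, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String x = "b";
        System.out.println(concat("a", x, "cd")); // abcd
    }
}
```

Compared with a default StringBuilder, this avoids growing and copying an intermediate buffer, at the cost of walking the parts twice.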
2. StringTable
2.1. Domestic and wild
Of the six ways to create a string, only literals produce domestic strings; strings created in any other way are wild.
- Domestic: Strings created from literals are added to StringTable. Strings managed by StringTable are never duplicated.
- Wild: Creating strings from char[], byte[], int[], or String, as well as with + on non-constant operands, essentially uses new. Each call creates a new string object in the heap regardless of duplicates, which wastes a lot of memory.
The JDK uses a StringTable (a hash table in terms of data structure), in which string objects act as the keys. Uniqueness of keys is a basic property of a hash table.
[Example code]
String s1 = "abc"; // domestic
String s2 = "abc"; // domestic
String s3 = new String(new char[] {'a', 'b', 'c'}); // wild
String s4 = "a" + "bc"; // domestic
String x = "a";
String s5 = x + "bc"; // wild
System.out.println(s1 == s2); // true
System.out.println(s1 == s3); // false
System.out.println(s1 == s4); // true
System.out.println(s1 == s5); // false
2.2. Storage Location of StringTable
- JDK 1.6 and earlier: StringTable is in the method area.
- JDK 1.7 and later: StringTable is in heap memory.
2.3. The intern() method
String provides the intern() method for deduplication, which gives a string object the chance to be managed by StringTable.
// Try to put the caller into StringTable
public native String intern();
2.3.1. The string already exists in StringTable
intern() returns the string object that already exists in StringTable.
String x = new String(new char[] {'a', 'b', 'c'}); // wild
String y = "abc"; // adds "abc" to StringTable
String z = x.intern(); // StringTable already has "abc", so the object in StringTable is returned
System.out.println(z == x); // false
System.out.println(z == y); // true
2.3.2. The string does not exist in StringTable (≥ JDK 7)
intern() adds the caller itself to StringTable and returns it.
String x = new String(new char[] {'a', 'b', 'c'}); // wild
String z = x.intern(); // StringTable has no "abc": x itself is added and returned
String y = "abc"; // StringTable now contains "abc", so it is used directly
System.out.println(z == x); // true
System.out.println(z == y); // true
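Both outcomes of intern() on JDK 7 and later can be reproduced in one run. The sketch below builds its string from a random UUID so that the content is very unlikely to be in StringTable already; InternDemo and freshWildString are illustrative names.

```java
import java.util.UUID;

public class InternDemo {
    // Hypothetical helper: builds a wild string at run time whose content
    // is (almost certainly) not yet present in StringTable.
    public static String freshWildString() {
        return new StringBuilder("wild-").append(UUID.randomUUID()).toString();
    }

    public static void main(String[] args) {
        String x = freshWildString();
        String z = x.intern();         // not in StringTable yet: x itself is added
        System.out.println(z == x);    // true on JDK 7 and later

        String copy = new String(x.toCharArray()); // another wild object, same content
        String w = copy.intern();      // already in StringTable: the existing object is returned
        System.out.println(w == copy); // false
        System.out.println(w == x);    // true
    }
}
```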
2.3.3. The string does not exist in StringTable (≤ JDK 6)
intern() makes a copy of the caller, adds the copy to StringTable, and returns that copy.
String x = new String(new char[] {'a', 'b', 'c'}); // wild
String z = x.intern(); // StringTable has no "abc": a copy of x is added and returned
String y = "abc"; // StringTable now contains "abc", so it is used directly
System.out.println(z == x); // false
System.out.println(z == y); // true
2.3.4. Principle
- Compute the hash value from the char[] and its length.
- Compute the bucket index in the hash table from the hash value.
- Check whether the string already exists in the hash table:
  - If it already exists: return it directly.
  - If not: create a new string object, add it to the hash table, and return it.
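The steps above can be sketched as a toy hash table. This is not the JVM's StringTable, which is implemented natively; ToyStringTable is a hypothetical class, and for simplicity it stores the caller itself when the string is missing (the JDK 7+ behaviour) instead of creating a copy.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyStringTable {
    private final List<String>[] buckets;

    @SuppressWarnings("unchecked")
    public ToyStringTable(int bucketCount) {
        buckets = new List[bucketCount];
        for (int i = 0; i < bucketCount; i++) {
            buckets[i] = new ArrayList<>();
        }
    }

    public String intern(String s) {
        int hash = s.hashCode();                         // step 1: hash from the characters
        int index = Math.floorMod(hash, buckets.length); // step 2: bucket index
        for (String existing : buckets[index]) {         // step 3: search the bucket
            if (existing.equals(s)) {
                return existing;                         // already present: return it
            }
        }
        buckets[index].add(s);                           // missing: add and return
        return s;
    }

    public static void main(String[] args) {
        ToyStringTable table = new ToyStringTable(16);
        String a = new String(new char[] {'a', 'b', 'c'});
        String b = new String(new char[] {'a', 'b', 'c'});
        System.out.println(table.intern(a) == a); // true
        System.out.println(table.intern(b) == a); // true
    }
}
```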
2.4. G1 deduplication
In JDK 8u20 and later, you can turn on the G1 garbage collector and enable string deduplication.
-XX:+UseG1GC -XX:+UseStringDeduplication
The principle is to save memory by having multiple string objects reference the same char[].
Compared with intern(), G1 deduplication has the advantage of being automatic. The disadvantage is that even when the char[] is shared, each string object itself still needs some memory (object header, value reference, hash), whereas intern() keeps only one string object and therefore uses less memory.
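G1 deduplication happens inside the collector and cannot be observed from plain Java code, so the sketch below illustrates only the intern() side of the comparison: many wild copies collapse into a single object. DedupDemo and its helpers are illustrative names.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class DedupDemo {
    // Counts distinct String objects by identity (==), not by equals().
    public static int distinctObjects(String[] strings) {
        Set<String> seen = Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        Collections.addAll(seen, strings);
        return seen.size();
    }

    // Creates n wild strings with the same content.
    public static String[] wildCopies(String content, int n) {
        String[] copies = new String[n];
        for (int i = 0; i < n; i++) {
            copies[i] = new String(content.toCharArray());
        }
        return copies;
    }

    public static void main(String[] args) {
        String[] copies = wildCopies("abc", 1000);
        System.out.println(distinctObjects(copies)); // 1000
        for (int i = 0; i < copies.length; i++) {
            copies[i] = copies[i].intern();          // replace each copy with the canonical object
        }
        System.out.println(distinctObjects(copies)); // 1
    }
}
```

After interning, the 999 redundant objects are no longer referenced and become eligible for garbage collection.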
3. Interview questions
① Determine the output
String str1 = "string"; // domestic
String str2 = new String("string"); // wild
String str3 = str2.intern(); // domestic
System.out.println(str1 == str2); // false
System.out.println(str1 == str3); // true
② Determine the output
String baseStr = "baseStr";
final String baseFinalStr = "baseStr";
String str1 = "baseStr01"; // domestic
String str2 = "baseStr" + "01"; // domestic
String str3 = baseStr + "01"; // wild
String str4 = baseFinalStr + "01"; // domestic
String str5 = new String("baseStr01").intern(); // from wild to domestic
System.out.println(str1 == str2); // true
System.out.println(str1 == str3); // false
System.out.println(str1 == str4); // true
System.out.println(str1 == str5); // true
③ Determine the output (differs between versions)
String str2 = new String("str") + new String("01");
str2.intern(); // JDK 1.6: a copy is added; JDK 1.7+: str2 itself is added
String str1 = "str01";
System.out.println(str1 == str2); // JDK 1.6: false; JDK 1.7+: true
④ Determine the output
String str1 = "str01";
String str2 = new String("str") + new String("01");
str2.intern(); // "str01" is already in StringTable, so str2 is not added; if the return value were assigned to str3, str1 == str3 would be true
System.out.println(str1 == str2); // false
⑤ String s = new String("xyz");
How many string objects are created?
Two. "xyz" is created in StringTable, and new String() creates another object in heap memory. Both, however, refer to the same char[] array.
⑥ Determine the output
String str1 = "abc"; // domestic
String str2 = "abc"; // domestic
System.out.println(str1 == str2); // true
⑦ Determine the output
String str1 = new String("abc"); // wild
String str2 = new String("abc"); // wild
System.out.println(str1 == str2); // false
⑧ Determine the output
String str1 = "abc"; // domestic
String str2 = "a";
String str3 = "bc";
String str4 = str2 + str3; // wild: variable + variable, concatenated with StringBuilder
System.out.println(str1 == str4); // false
⑨ Determine the output
String str1 = "abc"; // domestic
final String str2 = "a";
final String str3 = "bc";
String str4 = str2 + str3; // domestic: constant + constant, concatenated at compile time
System.out.println(str1 == str4); // true
⑩ Determine the output
String s = new String("abc");
String str1 = "abc";
String str2 = new String("abc");
System.out.println(s == str1.intern()); // false
System.out.println(s == str2.intern()); // false
System.out.println(str1 == str2.intern()); // true