
In Java, a string that is not yet in the string constant pool can be added to it at run time by calling intern(). The method adds a reference to the string object to the global string constant pool, alongside the entries that originate from the constant pools of Class files.

This article follows the previous one on the String source code and takes a more in-depth look at the intern() method of the String class.

String literals

String literals are defined in section 3.10.5, "String Literals", of the Java™ Language Specification. A string literal is written between double quotation marks (""). Since JDK 1.7, the String object for a literal is created in the heap and a reference to it is inserted into the string constant pool. The pooled object is shared globally, so a literal with the same content does not need to be created again. Here is an example:

String str1 = "abc";    // The runtime creates an "abc" object in the heap, stores its reference in the string constant pool, and returns that reference to str1

String str2 = new String("abc");    // The runtime first checks the string constant pool for a reference to an "abc" object; new then creates another "abc" object in the heap, and a reference to that new object is returned to str2
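
The sharing described above can be observed by comparing references instead of contents. The following is a minimal sketch; the class name LiteralSharingDemo and its method names are made up for illustration and do not come from the original article.

```java
// Illustrative sketch; the class and method names are hypothetical.
class LiteralSharingDemo {

    // Two literals with the same content resolve to the same pooled object.
    static boolean literalsShareOneObject() {
        String first = "abc";
        String second = "abc";
        return first == second;
    }

    // new String(...) always allocates a separate object with equal content.
    static boolean newCreatesDistinctObject() {
        String literal = "abc";
        String copy = new String("abc");
        return literal != copy && literal.equals(copy);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());   // true
        System.out.println(newCreatesDistinctObject()); // true
    }
}
```

Both results follow from the language rules for literals and for new, so they should not depend on the JDK version.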


String constant pool

The previous article covered the location and content of the string constant pool. Here we go into more detail.

First, let us verify what is stored in the string constant pool. In JDK 6, the pool is located in the permanent generation (the method area) and stores the string objects themselves. In JDK 7, the pool is located in the heap and stores references to string objects. In JDK 8, the permanent generation is replaced by Metaspace. Let's verify this with an example:

String s1 = new String("abc");

String s2 = s1.intern();

String s3 = "abc";

System.out.println(s1 == s3);

System.out.println(s2 == s3);

System.out.println(System.identityHashCode(s1));

System.out.println(System.identityHashCode(s3));





String s4 = new String("3") + new String("3");

String s5 = s4.intern();

String s6 = "33";

System.out.println(s4 == s6);

System.out.println(s5 == s6);

System.out.println(System.identityHashCode(s4));

System.out.println(System.identityHashCode(s6));


Execution Result:

jdk6

false

true

536468534

796216018

false

true

1032010069

1915296511



jdk7

false

true

1163157884

1956725890

true

true

356573597

356573597


To better explain, let’s use a graphical analysis of what’s going on.

JDK6

String s1 = new String("abc"); The runtime creates two objects, an "abc" object in the heap and an "abc" object in the string constant pool, and returns the address of the heap object to s1.

String s2 = s1.intern(); The runtime looks in the constant pool for an object with the same content as s1, finds that "abc" already exists there, and assigns the address of that pooled object to s2.

String s3 = "abc"; The constant pool already holds an object with the same content, so the address of the pooled "abc" object is assigned to s3.

String s4 = new String("3") + new String("3"); The runtime creates four objects: a "33" object in the heap, a "3" object in the constant pool, and the two anonymous new String("3") objects used as operands, which we will not discuss here.

String s5 = s4.intern(); The runtime searches the constant pool for an object with the content "33" and finds none, so it creates a "33" object in the constant pool and returns the address of that object to s5.

String s6 = "33"; The constant pool now holds an object with the same content, so the address of the pooled "33" object is assigned to s6.

System.out.println(s4 == s6); s4 and s6 do not refer to the same object, so the result is false.

JDK7



String s1 = new String("abc"); The runtime creates two objects, both in the heap: the object created by new, and the object for the literal "abc", whose reference is stored in the constant pool.

String s2 = s1.intern(); The runtime looks in the constant pool for a reference to an object with the same content as s1, finds that a reference to an "abc" object already exists, and assigns that reference to s2.

String s3 = "abc"; The constant pool already holds a reference to an object with the same content, so the reference to that "abc" object is assigned to s3.

String s4 = new String("3") + new String("3"); The runtime creates four objects, all in the heap: the "33" object, a "3" object whose reference is stored in the constant pool, and the two anonymous new String("3") objects used as operands, which we will not discuss here.

String s5 = s4.intern(); The runtime searches the constant pool for a reference to an object with the content "33" and finds none, so it stores the address of the "33" object referenced by s4 in the constant pool and returns it to s5.

String s6 = "33"; The constant pool now holds a reference to an object with the same content, so the reference to that "33" object is assigned to s6.

System.out.println(s4 == s6); s4 and s6 refer to the same object, so the result is true.

The string constant pool holds objects differently in JDK6 than it does in JDK7.

To make sense of the intern() method, tweak the code above and see what happens.

String s1 = new String("abc");

String s3 = "abc";

String s2 = s1.intern();

System.out.println(s1 == s3);

System.out.println(s2 == s3);





String s4 = new String("3") + new String("3");

String s6 = "33";

String s5 = s4.intern();

System.out.println(s4 == s6);

System.out.println(s5 == s6);


Execution Result:

jdk6

false

true

false

true



jdk7

false

true

false

true


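In JDK 7 and later, the result for s4 == s6 now matches JDK 6. Because the literal "33" is resolved before s4.intern() is called, the constant pool already holds a reference to a different "33" object, and intern() returns that reference instead of recording s4. The object referenced from the constant pool and the object referenced by s4 are therefore not the same object, so the comparison is false.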
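
Both orderings can be tried in one self-contained program. The sketch below is an illustration under stated assumptions: the class name InternOrderDemo and the strings "internFirst" and "literalFirst" are hypothetical, chosen so that the result does not depend on whether the JVM has already pooled a short string such as "33". Each scenario is evaluated once during class initialization, because it only behaves this way the first time its literal is resolved.

```java
// Illustrative sketch; names and string contents are hypothetical.
class InternOrderDemo {

    // Evaluated once, when the class is initialized.
    static final boolean INTERN_FIRST_SAME = internBeforeLiteral();
    static final boolean LITERAL_FIRST_SAME = literalBeforeIntern();

    // intern() runs before the literal is first used.
    private static boolean internBeforeLiteral() {
        String built = new StringBuilder("intern").append("First").toString();
        built.intern();                   // no matching entry yet: the pool records built (JDK 7+)
        String literal = "internFirst";   // first use of the literal finds the recorded reference
        return built == literal;
    }

    // The literal is used before intern() runs.
    private static boolean literalBeforeIntern() {
        String built = new StringBuilder("literal").append("First").toString();
        String literal = "literalFirst";  // resolving the literal pools a separate object
        built.intern();                   // finds that entry; built itself is not recorded
        return built == literal;
    }

    public static void main(String[] args) {
        System.out.println(INTERN_FIRST_SAME);   // expected on HotSpot, JDK 7 or later: true
        System.out.println(LITERAL_FIRST_SAME);  // expected on HotSpot, JDK 7 or later: false
    }
}
```

On JDK 6 the first value would be expected to be false as well, since intern() copies the string into the permanent generation instead of recording the existing heap object.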

[Figures omitted: memory layout diagrams for JDK6 and JDK7]

When a string literal enters the string constant pool

// Code 1

String s4 = new String("3") + new String("3");

String s5 = s4.intern();

String s6 = "33";

System.out.println(s4 == s6);

System.out.println(s5 == s6);



// Code 2

String s4 = new String("3") + new String("3");

String s6 = "33";

String s5 = s4.intern();


Compile the code, and then view its bytecode with the javap command.

[Figure omitted: javap output for the compiled class]


The bytecode shows that "33" is present in the Class file constant pool in both cases, but the runtime behavior depends on where the intern() call sits. In code 1, when String s5 = s4.intern(); executes, the string constant pool holds no reference to a "33" object. In code 2, the same statement finds a reference to a "33" object already in the string constant pool. The only difference is whether String s6 = "33"; has run yet. So when does a literal in the Class file constant pool enter the string constant pool? The relationship between the three constant pools was explained in the previous article, Method areas and constant pools in Java. If it is unfamiliar, see the Zhihu question When does the "literal" in new String("literal") enter the String constant pool?, where an expert explains this in great detail.

In a nutshell:

  • In the HotSpot VM implementation, when a class is loaded its string literals go into the runtime constant pool of that class, not into the global string constant pool. Only after resolution (resolve) are String instances for these Class file constants created in the heap and their references interned in the string constant pool.
  • When a literal is used, for example in an assignment, it is compiled to the ldc bytecode instruction, which triggers this lazy resolution. ldc looks up the entry at the given index in the runtime constant pool of the current class (in HotSpot VM, ConstantPool + ConstantPoolCache) and resolves it if that has not already happened. For a String constant, resolution checks StringTable: if it already holds a reference to a java.lang.String with matching content, that reference is returned. Otherwise a String object is created in the Java heap, its reference is recorded in StringTable, and that reference is returned.
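
The lookup rule described above can be modelled in a few lines. This is a toy model only, not HotSpot's StringTable: the class ToyStringTable and its methods resolveLiteral and intern are hypothetical names, and a HashMap keyed by content stands in for the real table. It assumes the JDK 7+ behavior in which intern() records the existing heap object.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the lookup rule; NOT HotSpot's real StringTable.
class ToyStringTable {
    // Maps string content to the one reference recorded for that content.
    private final Map<String, String> refsByContent = new HashMap<>();

    // Models resolving a literal: reuse a recorded reference, else create and record one.
    String resolveLiteral(String content) {
        String existing = refsByContent.get(content);
        if (existing != null) {
            return existing;
        }
        String created = new String(content); // stands in for the object the VM would allocate
        refsByContent.put(content, created);
        return created;
    }

    // Models intern() on JDK 7+: reuse a recorded reference, else record this very instance.
    String intern(String instance) {
        String existing = refsByContent.putIfAbsent(instance, instance);
        return existing == null ? instance : existing;
    }

    public static void main(String[] args) {
        // Ordering of code 1: intern() first, literal second.
        ToyStringTable first = new ToyStringTable();
        String s4 = new String("33");
        first.intern(s4);
        System.out.println(s4 == first.resolveLiteral("33")); // true: the table recorded s4

        // Ordering of code 2: literal first, intern() second.
        ToyStringTable second = new ToyStringTable();
        String t4 = new String("33");
        String t6 = second.resolveLiteral("33");
        System.out.println(t4 == t6);                // false: the literal got its own object
        System.out.println(second.intern(t4) == t6); // true: intern() returns the recorded one
    }
}
```

The model makes the asymmetry explicit: whichever of the two operations runs first decides which object the table records for that content.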

How many objects does String s = new String("xyz") involve?

A common belief is that whenever a string is created, the JVM first looks in the string constant pool for the literal and creates it there if it is missing. That is not accurate.

Consider the classic question: "String s = new String("xyz"); how many String objects are created?" The answer usually given is: if the constant pool already holds a reference to an "xyz" object, only one object is created; otherwise two objects are created.

After studying String objects and JVM memory for a while, I now think the question itself is ambiguous: it does not say which phase it is asking about, so it has no single reasonable answer. Object creation and the class loading mechanism are covered next.

To illustrate the creation of an object:


From the figure we can see that the steps for object creation are as follows

  • Execute the new instruction.
  • Check whether the instruction's operand can locate a symbolic reference to a class in the constant pool, and whether the class it represents has been loaded, resolved, and initialized.
  • If the class has not been loaded, load it first.
  • Once the class is loaded, allocate memory for the object in the JVM heap.
  • Initialize the allocated memory to zero values.
  • Execute the <init> method to initialize the object's fields; the object is now created.
  • A reference on the Java virtual machine stack points to the newly created object.
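
The gap between zero-initialization and the <init> method in the steps above can be observed from plain Java. The sketch below uses hypothetical class names (ZeroInitDemo, Base, Derived): a superclass constructor calls an overridden method before the subclass's field initializer has run, so it sees the zeroed memory.

```java
// Illustrative sketch; all class and member names are hypothetical.
class ZeroInitDemo {

    static class Base {
        final int seenDuringSuperConstructor;

        Base() {
            // Runs before Derived's field initializers, so it observes the zero value.
            seenDuringSuperConstructor = read();
        }

        int read() {
            return -1;
        }
    }

    static class Derived extends Base {
        int value = 42; // assigned inside Derived's <init>, after Base's constructor returns

        @Override
        int read() {
            return value;
        }
    }

    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.seenDuringSuperConstructor); // 0
        System.out.println(d.value);                      // 42
    }
}
```

Calling overridable methods from a constructor is used here only to expose the ordering; it is generally something to avoid in real code.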

The corresponding Java bytecode:

0: new  #2  // class java/lang/String

3: dup

4: ldc  #3  // String xyz

6: invokespecial  #4  // Method java/lang/String."<init>":(Ljava/lang/String;)V

9: astore_1


In the Java language, a "new" expression creates an instance and calls a constructor to initialize it. The constructor itself has a return type of void: it does not "return a reference to the newly created object". Rather, the value of the new expression is the reference to the newly created object.

In the JVM, by contrast, the "new" bytecode instruction only creates the instance (allocating space, setting the type, setting default values for all fields, and so on) and pushes a reference to it onto the top of the operand stack. At this point the reference is in an uninitialized state and cannot be used directly. If a method contains code that tries to call any instance method through an uninitialized reference, that method fails bytecode verification and is rejected by the JVM.

The only thing that can be done with an uninitialized reference is to call an instance constructor through it, which at the Class file level is a special initialization method named "<init>". The call instruction is invokespecial, and the required arguments are pushed onto the operand stack in order before the call. In the bytecode example above, the instructions that push arguments are dup and ldc: they push the hidden argument (a reference to the newly created instance, which is "this" for the instance constructor) and the first explicitly declared argument (a reference to the "xyz" constant), respectively. After the constructor returns, the reference to the newly created instance can be used normally.

This brings us to class loading. Note that what we usually call "loading" often means only the first step of the class loading mechanism, not the whole mechanism. The details are as follows:

Compiling the code produces a binary byte stream file (*.class) that the JVM (Java Virtual Machine) recognizes. The JVM loads the class description data from the Class file into memory, then verifies, resolves, and initializes it, producing Java types that the JVM can use directly. This process is called the JVM's class loading mechanism.

A class in a Class file goes through seven life cycle phases from being loaded into JVM memory to being unloaded from it. The class loading mechanism consists of the first five phases.

As shown below:


The phases of loading, verification, preparation, initialization, and unloading start in a fixed order. Only the order in which they start is fixed; the order in which they finish is not. The resolution phase is the exception: it may begin after initialization.

In addition, class loading does not have to wait for "first use" in the program; the JVM is allowed to preload certain classes (the timing of class loading).
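
While the JVM may load a class early, the initialization phase is tied to the first active use of the class, and that part is easy to observe. The following sketch uses hypothetical names (ClassInitDemo, Holder, EVENTS, TRACE) and records when a nested class's static initializer runs relative to the code that first reads one of its fields.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch; all names are hypothetical.
class ClassInitDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Holder {
        static final String VALUE; // assigned in the static block, so not a compile-time constant

        static {
            EVENTS.add("Holder initialized");
            VALUE = "ready";
        }
    }

    // Recorded once, during initialization of ClassInitDemo itself.
    static final List<String> TRACE = trace();

    private static List<String> trace() {
        EVENTS.add("before first use");
        EVENTS.add("read " + Holder.VALUE); // first active use triggers Holder's static block
        return Collections.unmodifiableList(new ArrayList<>(EVENTS));
    }

    public static void main(String[] args) {
        for (String event : TRACE) {
            System.out.println(event);
        }
    }
}
```

The trace lists "before first use" ahead of "Holder initialized", which shows that Holder's static initializer did not run until its field was read, whenever the class file itself was loaded.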

After the class-loading phase, string literals enter the string constant pool, including those assigned as initial values to static variables. For a tutorial on JVM class loading, see the article JVM class loading process.

The line String s = new String("xyz") is involved in two phases: the class-loading phase and the execution of the statement itself. So when the question is "how many objects does String s = new String("xyz") involve at run time?", a reasonable answer is:

Two. One is the object created in the heap for the string literal "xyz", whose reference is interned in the globally shared string constant pool; the other is the object created and initialized in the heap by new String(String), with the same content as "xyz".

When the question is "how many objects does String s = new String("xyz") involve during class loading?", a reasonable answer is one.

As an aside, if the question is "how many objects does String s = new String("java") involve at run time?", the answer is no longer two but one. For a more detailed explanation, see the article on how to understand the String.intern() example in Understanding the Java Virtual Machine (2nd edition).

Simply put, by the time the code above runs, the string constant pool already contains a "java" string, so no "java" object is created for the literal during class loading.

Implementation of the String “+” symbol

We often use the + operator to concatenate strings, but its implementation for strings has a subtlety. If the addition involves a String variable, the concatenation is implemented underneath with StringBuilder.

Here are some examples:

int n = 3;

String s1 = new String("3"+"3"+n);

s1.intern();

String s2 = "333";

System.out.println(s1 == s2);    // true (JDK 7 and later)



String s3 = new String("a"+"bc");

final String s4 = "re";

String s5 = s4+"rt";


View the compiled bytecode file:


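When a string is added to a String variable or to a variable of a primitive type, and that variable is not final, the concatenation happens at run time, implemented underneath with StringBuilder (javac in JDK 9 and later emits an invokedynamic call to StringConcatFactory instead; either way a new String is created at run time). If string literals are added directly, or final variables initialized with constants are added to string literals, the compiler concatenates them at compile time and no StringBuilder is needed.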
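
The difference between compile-time and run-time concatenation shows up when the result is compared by reference with a literal of the same content. This is a minimal sketch; the class name ConcatFoldingDemo and its method names are hypothetical.

```java
// Illustrative sketch; the class and method names are hypothetical.
class ConcatFoldingDemo {

    // Literal + literal is a constant expression, folded by the compiler.
    static boolean literalsFoldedAtCompileTime() {
        String joined = "a" + "bc";
        return joined == "abc";
    }

    // A final variable initialized with a constant also takes part in folding.
    static boolean finalVariableFolded() {
        final String prefix = "re";
        String joined = prefix + "rt";
        return joined == "rert";
    }

    // A non-final variable forces concatenation at run time, producing a new object.
    static boolean nonFinalConcatenatedAtRuntime() {
        String prefix = "re";
        String joined = prefix + "rt";
        return joined != "rert" && joined.equals("rert");
    }

    public static void main(String[] args) {
        System.out.println(literalsFoldedAtCompileTime());   // true
        System.out.println(finalVariableFolded());           // true
        System.out.println(nonFinalConcatenatedAtRuntime()); // true
    }
}
```

Comparing strings with == is done here only to reveal object identity; equals() remains the right way to compare contents.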

Reference links

  • Please stop using "String s = new String("xyz"); how many String instances are created?" as an interview question
  • How should the example given for String.intern() in Understanding the Java Virtual Machine (2nd edition) be understood?
  • When you new a String, is one really created in the constant pool if the literal is not there? I doubt it.
  • JVM class loading process
  • Java Basics: String — String Constant Pool and Intern (2)
  • A detailed look at how objects are created and accessed in JVM memory