This article covers several String-related topics in Java: how the String class is implemented and why it is immutable, how the related classes StringBuilder and StringBuffer are implemented, and how the String caching mechanism works and is used.
Design and implementation of String class
The core logic of the String class is to implement String objects by encapsulating char arrays, but the implementation details have changed several times as the Java version has evolved.
Java 6
public final class String implements java.io.Serializable, Comparable<String>, CharSequence
{
/** The value is used for character storage. */
private final char value[];
/** The offset is the first index of the storage that is used. */
private final int offset;
/** The count is the number of characters in the String. */
private final int count;
/** Cache the hash code for the string */
private int hash; // Default to 0
}
In Java 6, the String class has four member variables: value (a char array), offset, count, and hash. The value array stores the character sequence, offset and count locate the string's characters within that array, and hash caches the string's hash code.
Locating the string through offset and count lets several String objects share one value array cheaply. For example, substring() returns a string that shares the original string's value array and only records a different offset and count, rather than copying the characters. substring() is implemented as follows:
String(int offset, int count, char value[]) {
this.value = value; // Reuse the original array directly
this.offset = offset;
this.count = count;
}
public String substring(int beginIndex, int endIndex) {
// ... boundary checks omitted ...
return ((beginIndex == 0) && (endIndex == count)) ? this :
new String(offset + beginIndex, endIndex - beginIndex, value);
}
This approach, however, is likely to result in memory leaks. For example, in the following code:
String bigStr = new String(new char[100000]);
String subStr = bigStr.substring(0, 2);
bigStr = null;
After bigStr is set to null, the value array in it is still referenced by subStr, causing the garbage collector to fail to reclaim it. As a result, we only need 2 characters of space, but we actually use 100000 characters of space.
In Java 6, if you want to avoid this type of memory leak, you can use the following methods:
String subStr = bigStr.substring(0, 2) + "";
// or
String subStr = new String(bigStr.substring(0, 2));
After either statement runs, the temporary String returned by substring() is no longer referenced by anything else and can be garbage collected. The new subStr holds its own copy of the two characters and no longer refers to the value array in bigStr, which avoids the memory leak.
Java 7 & Java 8
public final class String implements java.io.Serializable, Comparable<String>, CharSequence {
/** The value is used for character storage. */
private final char value[];
/** Cache the hash code for the string */
private int hash; // Default to 0
}
In Java 7 and Java 8, the String class was changed. It no longer has the offset and count variables, and substring() no longer shares the value array. Instead it copies the requested range into a new array, which eliminates the memory leak described above. substring() is implemented as follows:
public String(char value[], int offset, int count) {
// ... boundary checks omitted ...
// Copy from the original array
this.value = Arrays.copyOfRange(value, offset, offset+count);
}
public String substring(int beginIndex, int endIndex) {
// ... boundary checks omitted ...
int subLen = endIndex - beginIndex;
return ((beginIndex == 0) && (endIndex == value.length)) ? this
: new String(value, beginIndex, subLen);
}
Java 9
public final class String implements java.io.Serializable, Comparable<String>, CharSequence {
/** The value is used for character storage. */
private final byte[] value;
/** The identifier of the encoding used to encode the bytes in {@code value}. */
private final byte coder;
/** Cache the hash code for the string */
private int hash; // Default to 0
}
To save memory, Java 9 optimizes the implementation of String. The value variable changes from char[] to byte[], and a new coder variable is added. In Java a char takes two bytes, which is wasteful for characters that need only one byte (for example a-z and A-Z), so Java 9 stores the character sequence in a byte[] instead. The new coder attribute indicates whether the value array holds single-byte or double-byte encoded characters. It can take two values: 0 for Latin-1 (single-byte encoding) and 1 for UTF-16 (double-byte encoding). When a string is created, Latin-1 is used to save space if every character can be encoded in a single byte; otherwise UTF-16 is used. The main constructor is implemented as follows:
String(char[] value, int off, int len, Void sig) {
if (len == 0) {
this.value = "".value;
this.coder = "".coder;
return;
}
if (COMPACT_STRINGS) {
byte[] val = StringUTF16.compress(value, off, len); // Try to compress the string into single-byte encoding
if (val != null) { // Compression succeeded, so single-byte encoding can be used
this.value = val;
this.coder = LATIN1;
return;
}
}
// Otherwise, use double-byte encoding for storage
this.coder = UTF16;
this.value = StringUTF16.toBytes(value, off, len);
}
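The choice between the two coders can be illustrated with a small sketch. The class and method names below (CompactStringSketch, fitsInLatin1, storageBytes) are hypothetical and are not part of the JDK; the sketch only assumes, as described above, that a string is stored in one byte per character when every char is at most 0xFF, and in two bytes per character otherwise.

```java
final class CompactStringSketch {
    // Hypothetical helper: true if every char fits in a single byte (value <= 0xFF),
    // which is the condition under which Latin-1 storage is possible.
    static boolean fitsInLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Bytes needed for the character data under the two-coder scheme described above.
    static int storageBytes(String s) {
        return fitsInLatin1(s) ? s.length() : s.length() * 2;
    }

    public static void main(String[] args) {
        System.out.println(storageBytes("hello"));        // 5 (Latin-1)
        System.out.println(storageBytes("h\u00e9llo"));   // 5 (U+00E9 still fits in one byte)
        System.out.println(storageBytes("\u4f60\u597d")); // 4 (two chars, UTF-16)
    }
}
```

A single character outside the Latin-1 range forces the whole string into UTF-16 in this model, which is why the check scans every character.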
The invariance of the String class
Notice that the String class is declared final, all of its properties are declared private, and every property except hash is final. This guarantees the following:
- The String class is declared final, so no subclass can inherit from it and change its semantics;
- All properties are declared private, so code outside String cannot directly access or modify them;
- All properties except hash are declared final, so they cannot be reassigned after their initial assignment.
Together with the fact that String never exposes or modifies the contents of its value array, these declarations give the String class an important feature: immutability. Once a String has been created, its contents cannot be changed. Methods such as substring(), concat(), and replace() return the result as another String (usually a newly created one) and leave the original untouched.
The reason the hash property is not final is that the hashCode of a String does not need to be evaluated and assigned immediately when the String is created, but rather when the hashCode() method is called.
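As a small illustration of this behavior, the sketch below calls a few String methods and shows that the original object keeps its contents. The class name ImmutabilityDemo and the helper derive are made up for the example.

```java
import java.util.Locale;

final class ImmutabilityDemo {
    // Hypothetical helper: applies several "modifying" methods and returns their results.
    // None of these calls changes the string passed in.
    static String[] derive(String original) {
        return new String[] {
            original.toUpperCase(Locale.ROOT),
            original.concat(" world"),
            original.replace('l', 'L')
        };
    }

    public static void main(String[] args) {
        String original = "hello";
        String[] derived = derive(original);
        System.out.println(original);   // hello
        System.out.println(derived[0]); // HELLO
        System.out.println(derived[1]); // hello world
        System.out.println(derived[2]); // heLLo
    }
}
```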
Why is the String class designed to be immutable?
- To keep String objects secure. Strings are used widely throughout the JDK as parameters and return values, for example for network connections, opening files, and class loading. If String objects were mutable, they could be maliciously modified, raising security concerns.
- Thread safety. The immutability of the String class naturally makes it thread-safe.
- To keep the hashCode of a String object stable. Because a String is immutable, its hashCode can be cached after the first calculation and never needs to be recomputed. This makes String objects well suited as keys in containers such as HashMap, and more efficient than other objects.
- To make the String constant pool possible. Java provides a String constant pool so that strings can be shared to save memory. If strings were mutable, string objects could not be shared, because changing the value through one reference would change it for every other reference as well.
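The point about hash-based containers can be seen by contrast with a mutable key. The sketch below uses a hypothetical MutableKey class (not from the JDK) whose hashCode depends on a field that can change; once the field is modified, the entry can no longer be found in a HashMap, which is the kind of problem an immutable String avoids.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class MutableKeyDemo {
    // Hypothetical mutable key: equals/hashCode depend on a field that can change.
    static final class MutableKey {
        String name;

        MutableKey(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableKey && Objects.equals(name, ((MutableKey) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    // Puts a key into the map, mutates it, then looks it up again.
    static boolean lookupSurvivesMutation() {
        Map<MutableKey, Integer> map = new HashMap<>();
        MutableKey key = new MutableKey("a");
        map.put(key, 1);
        key.name = "b"; // the hash stored in the map is now stale
        return map.containsKey(key);
    }

    // A String key is found through any equal String, because its hash never changes.
    static boolean stringLookupWorks() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        return map.containsKey(new String("a"));
    }

    public static void main(String[] args) {
        System.out.println(lookupSurvivesMutation()); // false
        System.out.println(stringLookupWorks());      // true
    }
}
```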
Classes related to the String class
In addition to String, there are two classes related to String: StringBuffer and StringBuilder. These classes can be considered mutable versions of String, providing various methods for modifying strings. The difference is that a StringBuffer is thread safe and a StringBuilder is not thread safe.
StringBuffer/StringBuilder implementation
Both StringBuffer and StringBuilder extend AbstractStringBuilder. AbstractStringBuilder uses a mutable char array (a byte array from Java 9 onward) to implement the various string modifications, and both subclasses call its methods to manipulate strings. The difference is that StringBuffer declares its methods with the synchronized modifier and StringBuilder does not, so StringBuffer is thread-safe and StringBuilder is not.
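A minimal sketch of what the synchronized methods buy: several threads append to one shared StringBuffer, and no append is lost. The class and method names are made up for the example. Replacing StringBuffer with StringBuilder here would make the result unpredictable, so that variant is deliberately not shown with an expected output.

```java
final class StringBufferThreadsDemo {
    // Starts the given number of threads, each appending to the same StringBuffer,
    // and returns the final length once all threads have finished.
    static int appendConcurrently(int threads, int appendsPerThread) {
        StringBuffer buffer = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < appendsPerThread; i++) {
                    buffer.append('x'); // synchronized in StringBuffer
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return buffer.length();
    }

    public static void main(String[] args) {
        System.out.println(appendConcurrently(4, 10_000)); // 40000
    }
}
```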
Using Java 8 as an example, look at the AbstractStringBuilder class implementation:
abstract class AbstractStringBuilder implements Appendable, CharSequence {
/** The value is used for character storage. */
char[] value;
/** The count is the number of characters used. */
int count;
}
The value array stores the character sequence, and count records how many characters of the value array are in use. The actual content of the string is the characters in the range [0, count) of the value array. The count attribute is needed because AbstractStringBuilder does not reallocate the value array on every modification; instead it allocates some extra space in advance, which reduces the number of reallocations (similar to ArrayList).
The strategy for expanding the value array is as follows: when a modification needs more space than the current value array provides, a larger array is allocated. Its size is max(original array size × 2 + 2, required array size). For the detailed logic, see the following code:
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
private int newCapacity(int minCapacity) {
// overflow-conscious code
int newCapacity = (value.length << 1) + 2; // The size of the original array ×2 + 2
if (newCapacity - minCapacity < 0) { // If less than the required space size, expand to the required space size
newCapacity = minCapacity;
}
return (newCapacity <= 0 || MAX_ARRAY_SIZE - newCapacity < 0)
? hugeCapacity(minCapacity)
: newCapacity;
}
private int hugeCapacity(int minCapacity) {
if (Integer.MAX_VALUE - minCapacity < 0) { // overflow
throw new OutOfMemoryError();
}
return (minCapacity > MAX_ARRAY_SIZE)
? minCapacity : MAX_ARRAY_SIZE;
}
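The growth rule can be reduced to a few lines for experimentation. The sketch below is a simplified model with made-up names (CapacityGrowthSketch, grownCapacity); it deliberately ignores the overflow and MAX_ARRAY_SIZE handling shown above. The main method also prints what StringBuilder.capacity() reports on the running JDK, without assuming a particular value.

```java
final class CapacityGrowthSketch {
    // Simplified model of the expansion rule: max(old * 2 + 2, required).
    // Overflow handling from the real code is intentionally left out.
    static int grownCapacity(int oldCapacity, int minCapacity) {
        int doubled = oldCapacity * 2 + 2;
        return Math.max(doubled, minCapacity);
    }

    public static void main(String[] args) {
        System.out.println(grownCapacity(16, 17));  // 34
        System.out.println(grownCapacity(16, 100)); // 100

        // Observe the real class; the exact numbers depend on the JDK in use.
        StringBuilder sb = new StringBuilder();
        System.out.println("initial capacity: " + sb.capacity());
        sb.append("12345678901234567"); // 17 characters
        System.out.println("capacity after append: " + sb.capacity());
    }
}
```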
AbstractStringBuilder also provides a trimToSize method to free up excess space:
public void trimToSize() {
if (count < value.length) {
value = Arrays.copyOf(value, count);
}
}
The caching mechanism for String objects
Because strings are so widely used, Java has designed a caching mechanism for strings to improve both time and space efficiency. In the JVM's runtime data area, there is a String Pool that holds all cached strings. When we say a String is interned, we mean that it is in the String Pool.
We understand the String caching mechanism by answering the following three questions:
- Which String objects are cached in the String constant pool?
- Where are String objects cached, and how are they organized?
- When does a String object enter the String constant pool?
Note: Unless otherwise specified, all JVM behavior described in this article refers to Oracle's HotSpot VM. The test code was run without any additional JVM parameters, and optimizations such as escape analysis, scalar replacement, and dead-code elimination are not considered.
Preliminary knowledge
For a better reading experience, before answering the three questions above, it helps to have a basic understanding of the following topics:
- The JVM runtime data areas
- The structure of the class file
- The JVM's stack-based bytecode interpretation and execution engine
- The class loading process
- The several constant pools in Java
For the sake of completeness, we briefly introduce the two most relevant ones below.
The class loading process
The life cycle of a class, from being loaded into the virtual machine's memory to being unloaded from it, has seven stages: Loading, Verification, Preparation, Resolution, Initialization, Using, and Unloading. Verification, Preparation, and Resolution are collectively called Linking. The order of the five stages Loading, Verification, Preparation, Initialization, and Unloading is fixed, and class loading must begin step by step in that order. The Resolution stage is different: in some cases it can start after Initialization, in order to support Java's runtime binding (also called dynamic binding or late binding).
Several constant pools in Java
1. Class file constant pool. Source files with the .java suffix are compiled by javac into class files (bytecode files) with the .class suffix. Part of each class file is the Constant Pool, which stores two main kinds of constants:
- The values of literals and constant expressions in the code;
- Symbolic references, including fully qualified names of classes and interfaces, names and descriptors of fields, and names and descriptors of methods.
2. Run-time Constant Pool. Among the JVM's runtime data areas, the run-time constant pool is part of the method area. It is the run-time representation of the constant pool of each class or interface in the class file: after a class is loaded, the contents of its class file constant pool enter the run-time constant pool in the method area.
3. String Pool. This is the pool mentioned earlier that is used to cache strings. It is shared globally and is part of the runtime data area.
Which Strings will be cached in the String constant pool?
In Java, two kinds of strings are cached in the String constant pool: String literals and String constant expressions defined in the code, and String objects that the program caches explicitly by calling String.intern(). Both are introduced briefly below.
1. Implicit cache: literals and string constant expressions
It is called implicit caching because we do not have to write any caching code ourselves; the compiler and the JVM do it for us.
String literals
The first type of string that is implicitly cached is the string literal. A literal is the source code representation of a value of a primitive type, the String type, or the null type. For example:
int i = 100; // Int literal
double f = 10.2; // A literal of type double
boolean b = true; // Boolean type literals
String s = "hello"; // String literals
Object o = null; // Null literal
A string literal consists of zero or more characters enclosed in double quotes. Java creates String objects for string literals during execution and adds them to the String constant pool. For example, "hello" in the code above is a String literal. During execution, a String containing "hello" is created and cached in the String constant pool, and then s is made to refer to it.
For more details on String literals, see the Java Language Specification (JLS §3.10.5, String Literals).
The other type of string that is implicitly cached is the string constant expression. A constant expression is an expression that denotes a value of a primitive type or a String and whose value can be determined at compile time. A String constant expression is a constant expression of type String. For example:
int a = 1 + 2;
double d = 10 + 2.01;
boolean b = true & false;
String str1 = "abc" + 123;
final int num = 456;
String str2 = "abc" + num;
Java creates a String object for each String constant expression during execution and adds it to the String constant pool. For example, the code above creates two strings, "abc123" and "abc456", which are cached in the String constant pool. str1 points to the String with the value "abc123" in the constant pool, and str2 points to the String with the value "abc456".
See the Java Language specification (JLS-15.28 Constant Expressions) for more details on Constant Expressions.
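The effect of constant folding can be observed with reference comparison. In the sketch below (class and method names are made up), a concatenation involving a final local constant is folded at compile time and refers to the same pooled object as the equivalent literal, while the same concatenation with a non-final variable is evaluated at run time and yields a different object.

```java
final class ConstantExpressionDemo {
    // "abc" + num is a constant expression because num is a final variable
    // initialized with a constant, so it is folded to "abc456" at compile time.
    static boolean compileTimeConcatIsPooled() {
        final int num = 456;
        String folded = "abc" + num;
        String literal = "abc456";
        return folded == literal;
    }

    // Without final, num is not a constant variable, so the concatenation
    // happens at run time and produces a new String object.
    static boolean runtimeConcatIsPooled() {
        int num = 456;
        String built = "abc" + num;
        String literal = "abc456";
        return built == literal;
    }

    public static void main(String[] args) {
        System.out.println(compileTimeConcatIsPooled()); // true
        System.out.println(runtimeConcatIsPooled());     // false
    }
}
```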
2. Active cache: the String.intern() method
In addition to being declared as String literals/String constant expressions, strings obtained in other ways can also be actively added to the String constant pool. Such as:
String str = new String("123") + new String("456");
str.intern();
In the code above, after the first statement runs, the constant pool contains two strings, "123" and "456", but not "123456". After str.intern() runs, the String containing "123456" is added to the String constant pool.
Let's look at the caching mechanism in detail using the documentation comment of String.intern():
When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned. It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.
In plain terms:
When intern() is called, if the constant pool already contains a string with the same content (as determined by equals(Object), which for strings means the same character sequence), the String object from the constant pool is returned. Otherwise, this String is added to the constant pool and a reference to it is returned. It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.
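The documented contract can be checked directly. The sketch below uses made-up class and method names and compares references with == on purpose, since the point is object identity rather than content.

```java
final class InternDemo {
    // A String created with new is a distinct object from the pooled literal.
    static boolean heapCopyIsLiteral() {
        String literal = "hello";
        String heapCopy = new String("hello");
        return heapCopy == literal;
    }

    // intern() returns the pooled instance, which is the literal itself.
    static boolean internedCopyIsLiteral() {
        String literal = "hello";
        String heapCopy = new String("hello");
        return heapCopy.intern() == literal;
    }

    // s.intern() == t.intern() holds exactly when s.equals(t) holds.
    static boolean internAgreesWithEquals(String s, String t) {
        return (s.intern() == t.intern()) == s.equals(t);
    }

    public static void main(String[] args) {
        System.out.println(heapCopyIsLiteral());     // false
        System.out.println(internedCopyIsLiteral()); // true
        System.out.println(internAgreesWithEquals(new String("a"), new String("a"))); // true
        System.out.println(internAgreesWithEquals("a", "b"));                         // true
    }
}
```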
Where are strings cached and how are they organized?
In HotSpot VM, there is a global table for recording cached Strings called StringTable, which is similar in structure and implementation to Java's HashMap or HashSet. It is a hash table that resolves hash collisions by separate chaining. You can simply think of it as a HashSet<String>, noting that it stores only references to strings, not the String instances themselves. In general, when we say that a string is in the string constant pool, we mean that StringTable holds a reference to it; conversely, if it is not in the pool, StringTable holds no reference to it.
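To make the "table of references" idea concrete, here is a toy pool written in plain Java. SimpleStringPool is a hypothetical class for illustration only; it is not how HotSpot implements StringTable (which is C++ inside the VM), but it shows the same lookup-or-add behavior over references.

```java
import java.util.HashMap;
import java.util.Map;

final class SimpleStringPool {
    // Maps content to the one canonical instance. Only references are stored;
    // the String objects themselves live wherever they were allocated.
    private final Map<String, String> table = new HashMap<>();

    // Returns the canonical instance for the given content, registering
    // the argument itself if no equal string has been seen before.
    String intern(String s) {
        String existing = table.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public static void main(String[] args) {
        SimpleStringPool pool = new SimpleStringPool();
        String first = new String("123456");
        String second = new String("123456");
        System.out.println(pool.intern(first) == first);  // true: first becomes canonical
        System.out.println(pool.intern(second) == first); // true: equal content maps to first
        System.out.println(second == first);              // false: still two objects
    }
}
```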
The actual String objects are stored in a different area. In Java 6, String objects in the String constant pool are stored in the permanent generation (HotSpot VM's implementation of the method area before Java 8); from Java 7 onward, they are stored in the heap.
Java 7 moved the objects of the string constant pool to the heap because in Java 6 they are created in the permanent generation, which is usually sized fairly small. Heavy use of the string cache could therefore cause an OutOfMemoryError in the permanent generation.
When does a String enter the pool of String constants?
For a String that the program caches explicitly by calling String.intern(), the answer is obvious: it enters the constant pool when intern() is called.
Let's focus on the two kinds of values that are cached implicitly (string literals and string constant expressions). There are two main questions:
- We never call the String constructor for them, so when are these objects created?
- When do they enter the String constant pool?
Let's analyze these two questions with the following code example:
public class Main {
public static void main(String[] args) {
String str1 = "123" + 123; // String constant expression
String str2 = "123456"; // String literal
String str3 = "123" + 456; // String constant expression
}
}
Bytecode analysis
After compiling the code above, we use javap to inspect the bytecode file. To save space, only the relevant parts are shown: the constant pool table and the information for the main method:
Constant pool:
#1 = Methodref #5.#23 // java/lang/Object."<init>":()V
#2 = String #24 // 123123
#3 = String #25 // 123456
// ... omitted ...
#24 = Utf8 123123
#25 = Utf8 123456
// ... omitted ...
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=1, locals=4, args_size=1
0: ldc #2 // String 123123
2: astore_1
3: ldc #3 // String 123456
5: astore_2
6: ldc #3 // String 123456
8: astore_3
9: return
LineNumberTable:
line 7: 0
line 8: 3
line 9: 6
line 10: 9
LocalVariableTable:
Start Length Slot Name Signature
0 10 0 args [Ljava/lang/String;
3 7 1 str1 Ljava/lang/String;
6 4 2 str2 Ljava/lang/String;
9 1 3 str3 Ljava/lang/String;
In the constant pool, two types of constants are associated with strings: CONSTANT_String and CONSTANT_Utf8. CONSTANT_String represents a constant object of type String. Its content is only an index into the constant pool, and the entry at that index must be of type CONSTANT_Utf8. A CONSTANT_Utf8 constant stores the actual string contents. For example, entries 2 and 3 in the constant pool above are CONSTANT_String entries that store the indexes 24 and 25, and entries 24 and 25 are CONSTANT_Utf8 entries that store the values "123123" and "123456", respectively.
In the method information of a class file, the Code attribute is one of the most important parts. It contains the VM instructions for the executable statements, the exception table, local variable information, and so on. LocalVariableTable holds the local variable information, and Slot can be understood as the index into the local variable table. The ldc instruction fetches the item at the specified index of the run-time constant pool and pushes it onto the stack; the astore_<n> instruction pops a reference-type value off the stack and stores it in the local variable table at the slot given by <n>. You can see that the bytecode for all three assignment statements has the same shape:
ldc #<index> // First push the String in the constant pool to the stack
astore_<n> // Then pop the String from the stack and save it to the specified location of the local variable
Operation process analysis
With the code above in mind, let's examine when literals and string constant expressions are created and cached, following the path from compilation to execution.
First, javac compiles the source code into a class file. During compilation, the two kinds of values mentioned above, string literals ("123456") and string constant expressions ("123" + 456), are stored in the constant pool of the compiled class file as constants of type CONSTANT_String. Two points to note:
- The actual value of a String constant expression is computed at compile time and stored in the class file's constant pool. For example, in the source above, the expression "123" + 123 is represented in the class file's constant pool as 123123, and the expression "123" + 456 as 123456;
- String literals and String constant expressions with the same value share a single constant entry in the class file's constant pool (one entry of type CONSTANT_String and one of type CONSTANT_Utf8). For example, although the source above declares both "123456" and "123" + 456, the class file's constant pool ends up with only one CONSTANT_Utf8 entry with the value 123456 and one corresponding CONSTANT_String entry.
At run time, when the Main class is loaded, the JVM creates a run-time constant pool from the class file's constant pool: the contents of the class file's constant pool enter the run-time constant pool in the method area. Symbolic references in the constant pool are converted to real values during the Resolution phase of class loading. In HotSpot, however, symbolic references are not necessarily resolved immediately when the class is loaded; resolution is deferred until the relevant instruction is first executed (JVMS §5.4.3, Resolution). This is called "lazy" or "late" resolution.
- Constant entries of basic types, such as CONSTANT_Integer_info, CONSTANT_Float_info, CONSTANT_Long_info, and CONSTANT_Double_info, have their values copied from the class file constant pool into the run-time constant pool during class loading, stored as the C++ types int, float, long, and double respectively;
- CONSTANT_Utf8 entries are converted to Symbol objects (C++ objects at the HotSpot VM level) during class loading. HotSpot uses a SymbolTable (similar in structure to StringTable) to cache Symbol objects, so once the class is loaded, the SymbolTable should contain the Symbol objects for all of its CONSTANT_Utf8 constants;
- CONSTANT_String entries contain a symbolic reference (the index of a CONSTANT_Utf8 constant), so they must be resolved: the entry is converted to the oop of a java.lang.String object (an oop can be understood as the HotSpot VM-level representation of a Java object) and cached in StringTable. As mentioned above, however, CONSTANT_String constants are resolved lazily: not immediately at class loading, but when the relevant instruction (generally ldc) is first executed.
As mentioned above, the JVM performs the actual resolution when the instruction is first executed. In the bytecode for the code above, the ldc instructions use symbolic references, so resolution is required when ldc executes. So what does the ldc instruction actually do?
The ldc instruction looks up the constant entry at the specified index of the run-time constant pool and pushes it onto the stack. If the entry is unresolved, it must first be resolved, converting the symbolic reference to a concrete value, before it is pushed. If the unresolved entry is a String constant, the JVM first looks in the String constant pool for a String object with the same content. If one exists, that object is pushed directly. If not, a new String is created, added to the String constant pool, and pushed onto the stack. If the code declares several String literals or String constant expressions with the same content, a String is created only the first time the ldc instruction is executed; when the same ldc instruction executes again, the constant entry is already resolved and is pushed directly onto the stack.
To summarize:
- During compilation, the String literals and String constant expressions in the source code are converted to CONSTANT_String constants in the class file's constant pool.
- During class loading, the CONSTANT_String entries of the class file's constant pool are stored in the run-time constant pool, but what is stored is still an unresolved symbolic reference.
- During instruction execution, when an ldc instruction is first executed and the CONSTANT_String entry in the run-time constant pool is still unresolved, the entry is actually resolved; during resolution the String object is created and added to the String constant pool.
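The three stages summarized above can be modeled in a few lines of Java. This is a toy model with made-up names (LazyLdcSketch, StringConstant, POOL), not HotSpot code: an entry starts out holding only raw character data, and the String object is created and pooled the first time ldc() is called on it. For brevity the sketch builds a candidate String before consulting the pool, whereas the process described above looks the content up first.

```java
import java.util.HashMap;
import java.util.Map;

final class LazyLdcSketch {
    // Stand-in for the global string pool (StringTable in the text).
    private static final Map<String, String> POOL = new HashMap<>();

    // Toy model of one CONSTANT_String entry in a run-time constant pool.
    static final class StringConstant {
        private final char[] utf8Content; // stands in for the CONSTANT_Utf8 data
        private String resolved;          // null until the first ldc

        StringConstant(String content) {
            this.utf8Content = content.toCharArray();
        }

        boolean isResolved() {
            return resolved != null;
        }

        // Models ldc: resolve on first use, then always return the same object.
        String ldc() {
            if (resolved == null) {
                String candidate = new String(utf8Content);
                String pooled = POOL.putIfAbsent(candidate, candidate);
                resolved = (pooled != null) ? pooled : candidate;
            }
            return resolved;
        }
    }

    public static void main(String[] args) {
        StringConstant first = new StringConstant("123456");
        StringConstant second = new StringConstant("123456");
        System.out.println(first.isResolved());          // false: nothing created yet
        System.out.println(first.ldc() == first.ldc());  // true: resolved only once
        System.out.println(first.ldc() == second.ldc()); // true: same pooled object
    }
}
```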
Cache critical source analysis
As you can see, the logic of the ldc instruction when resolving a String constant is very similar to that of the String.intern() method:
- Resolving a String constant in the ldc instruction: first look in the String constant pool for a String object with the same content; if one exists, push it onto the stack; otherwise create a new object, add it to the String constant pool, and push it onto the stack.
- The String.intern() method: first look in the String constant pool for a String object with the same content; if one exists, return a reference to it; otherwise add this string itself to the String constant pool and return it.
In fact, inside HotSpot, the ldc instruction and the native method behind String.intern() call the same internal method. Taking the OpenJDK 8 source as an example, we briefly walk through the process. The code is as follows (source location: src/share/vm/classfile/symbolTable.cpp):
// String.intern() ends up calling this method
// The "oop string" parameter is the String object on which intern() was called
oop StringTable::intern(oop string, TRAPS)
{
if (string == NULL) return NULL;
ResourceMark rm(THREAD);
int length;
Handle h_string (THREAD, string);
jchar* chars = java_lang_String::as_unicode_string(string, length, CHECK_NULL); // Convert a String to a sequence of characters
oop result = intern(h_string, chars, length, CHECK_NULL);
return result;
}
// This method is called when an ldc instruction is executed
// The "Symbol* symbol" parameter is the Symbol object in the run-time constant pool that corresponds to the ldc instruction's operand (index position)
oop StringTable::intern(Symbol* symbol, TRAPS) {
if (symbol == NULL) return NULL;
ResourceMark rm(THREAD);
int length;
jchar* chars = symbol->as_unicode(length); // Convert the Symbol object to a sequence of characters
Handle string;
oop result = intern(string, chars, length, CHECK_NULL);
return result;
}
// Both methods call this method
oop StringTable::intern(Handle string_or_null, jchar* name, int len, TRAPS) {
// Try to find it from the string constant pool
unsigned int hashValue = hash_string(name, len);
int index = the_table()->hash_to_index(hashValue);
oop found_string = the_table()->lookup(index, name, len, hashValue);
// Return if found
if (found_string != NULL) {
ensure_string_alive(found_string);
return found_string;
}
// ... part of the code omitted ...
Handle string;
// Try to reuse the original string. If it cannot be reused, a new string is created
// The implementation here is a bit different in JDK 6. Only when string_or_null already exists in the permanent generation will it be reused
if (!string_or_null.is_null()) {
string = string_or_null;
} else {
string = java_lang_String::create_from_unicode(name, len, CHECK_NULL);
}
// ... part of the code omitted ...
oop added_or_found;
{
MutexLocker ml(StringTable_lock, THREAD);
// Add string to StringTable
added_or_found = the_table()->basic_add(index, string, name, len,
hashValue, CHECK_NULL);
}
ensure_string_alive(added_or_found);
return added_or_found;
}
Case analysis
Note: Because the string constant pool was moved from the permanent generation to the heap after Java 6, some code may behave differently in Java 6 than in later versions. The following code was therefore tested on Java 6 and Java 7 separately. Unless stated otherwise, the results are the same on both versions; where they differ, this is indicated.
final int a = 4;
int b = 4;
String s1 = "123" + a + "567";
String s2 = "123" + b + "567";
String s3 = "1234567";
System.out.println(s1 == s2);
System.out.println(s1 == s3);
System.out.println(s2 == s3);
Results:
false
true
false
Explanation:
- Line 3: because a is declared as a constant, "123" + a + "567" is a constant expression and is compiled to "1234567", so "1234567" is created in the String constant pool and s1 points to it;
- Line 4: b is declared as an ordinary variable, while "123" and "567" are String literals, so "123" and "567" are first created in the String constant pool, and the concatenation is then performed implicitly through StringBuilder, creating "1234567" on the heap. s2 points to the "1234567" on the heap;
- Line 5: "1234567" is a String literal. Because the String constant pool already contains "1234567" at this point, s3 points to the "1234567" in the String constant pool.
String s1 = new String("123");
String s2 = s1.intern();
String s3 = "123";
System.out.println(s1 == s2);
System.out.println(s1 == s3);
System.out.println(s2 == s3);
Results:
false
false
true
Explanation:
- Line 1: "123" is a string literal, so a "123" object is first created in the string constant pool, and then the String constructor creates another "123" object on the heap. s1 points to "123" on the heap.
- Line 2: because the string constant pool already contains "123", s2 points to "123" in the string constant pool.
- Line 3: again because the string constant pool already contains "123", s3 points to "123" in the string constant pool.
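One way to make "how many objects are involved" concrete is to count distinct references. The sketch below is only an illustration; countDistinct is a hypothetical helper, built on an identity-based set so that equal strings held in different objects are counted separately.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class DistinctInstancesDemo {

    // Counts distinct objects by reference (==), not by equals().
    static int countDistinct(Object... refs) {
        Set<Object> seen =
                Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        Collections.addAll(seen, refs);
        return seen.size();
    }

    // Same three statements as in the example above.
    static int instancesInExample() {
        String s1 = new String("123"); // new object on the heap
        String s2 = s1.intern();       // pooled "123"
        String s3 = "123";             // pooled "123" again
        return countDistinct(s1, s2, s3);
    }

    public static void main(String[] args) {
        System.out.println(instancesInExample()); // 2
    }
}
```

The three variables hold equal strings, but only two objects are behind them: the heap copy made by the constructor and the pooled literal.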
String s1 = String.valueOf("123");
String s2 = s1.intern();
String s3 = "123";
System.out.println(s1 == s2);
System.out.println(s1 == s3);
System.out.println(s2 == s3);
Results:
true
true
true
Explanation: The difference from the previous example is that String.valueOf() returns the String that was passed in (for a String argument, valueOf(Object) simply calls toString(), which returns this) and does not create a new object on the heap, so s1 also points to "123" in the string constant pool. All three variables refer to the same object.
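Not every valueOf overload behaves this way. As a small sketch (class and method names are illustrative only), the following contrasts String.valueOf(Object), which hands back the same String, with String.valueOf(char[]), which copies the characters into a new String.

```java
public class ValueOfDemo {

    // For a String argument, valueOf(Object) calls toString(),
    // and String.toString() returns this.
    static boolean valueOfStringReturnsSameObject() {
        String literal = "123";
        return String.valueOf(literal) == literal;
    }

    // valueOf(char[]) builds a new String from the characters, so the
    // result is equal to, but not the same object as, the pooled "123".
    static boolean valueOfCharArrayCreatesNewObject() {
        String fromChars = String.valueOf(new char[] {'1', '2', '3'});
        return fromChars != "123" && fromChars.equals("123");
    }

    public static void main(String[] args) {
        System.out.println(valueOfStringReturnsSameObject());   // true
        System.out.println(valueOfCharArrayCreatesNewObject()); // true
    }
}
```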
String s1 = new String("123") + new String("456");
String s2 = s1.intern();
String s3 = "123456";
System.out.println(s1 == s2);
System.out.println(s1 == s3);
System.out.println(s2 == s3);
The above code has different results in Java 6 and Java 7. In Java 6:
false
false
true
Explanation:
- Line 1: "123" and "456" are string literals, so "123" and "456" are first created in the string constant pool. The + operator, through implicit StringBuilder concatenation, creates "123456" on the heap. s1 points to "123456" on the heap.
- Line 2: intern() caches "123456" in the string constant pool. Because objects in the string constant pool are created in the permanent generation in Java 6, a new "123456" is created in the string constant pool (permanent generation). There is now one "123456" on the heap and one in the permanent generation. s2 points to "123456" in the string constant pool (permanent generation).
- Line 3: "123456" is a string literal. Because "123456" already exists in the string constant pool (permanent generation) at this point, s3 points to "123456" in the string constant pool (permanent generation).
In Java 7:
true
true
true
Explanation: The difference from Java 6 is that objects in the string constant pool live on the heap in Java 7, so the second line, String s2 = s1.intern();, does not create a new String. It simply adds a reference to s1 to the StringTable. All three variables therefore point to "123456" in the constant pool, which is the very object created on the heap in the first line.
The fact that s1 == s2 is true in Java 7 also suggests that the literal "123456" is resolved lazily. If it were instead resolved and placed in the constant pool when the class was loaded, s1.intern() would return the "123456" already in the constant pool rather than adding the heap object that s1 points to. In that case s2 would be equal to s3, not to s1.
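The Java 7 behavior can be observed without relying on when a literal is resolved. The sketch below is an illustration under stated assumptions: it assumes a HotSpot JVM of version 7 or later, and it builds its string at run time from System.nanoTime() so that the content is very unlikely to be in the pool already. The class and method names are hypothetical.

```java
public class InternFreshStringDemo {

    // Builds a string at run time; its content is assumed not to be
    // in the string pool yet.
    static String freshString() {
        return new StringBuilder("intern-demo-")
                .append(System.nanoTime())
                .toString();
    }

    // On HotSpot 7+, interning a string that is not yet pooled stores a
    // reference to that very object, so intern() returns it unchanged.
    // (On Java 6 a copy would be made in the permanent generation.)
    static boolean internReturnsSameObject() {
        String fresh = freshString();
        return fresh.intern() == fresh;
    }

    // Once a string with this content is pooled, interning an equal
    // copy returns the pooled object, not the copy.
    static boolean laterInternReturnsPooledObject() {
        String fresh = freshString();
        String pooled = fresh.intern();
        String copy = new String(fresh);
        return copy != pooled && copy.intern() == pooled;
    }

    public static void main(String[] args) {
        System.out.println(internReturnsSameObject());
        System.out.println(laterInternReturnsPooledObject());
    }
}
```

The second method mirrors the hypothetical case in the text: if the pool already holds an equal string, intern() returns that one and the new heap object stays outside the pool.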
String s1 = new String("123") + new String("456");
String s2 = "123456";
String s3 = s1.intern();
System.out.println(s1 == s2);
System.out.println(s1 == s3);
System.out.println(s2 == s3);
Results:
false
false
true
Explanation:
- Line 1: "123" and "456" are string literals, so "123" and "456" are first created in the string constant pool. The + operator, through implicit StringBuilder concatenation, creates "123456" on the heap. s1 points to "123456" on the heap.
- Line 2: "123456" is a string literal. The string constant pool does not yet contain "123456", so it is created in the string constant pool. s2 points to "123456" in the string constant pool.
- Line 3: because the string constant pool already contains "123456" at this point, s3 points to "123456" in the string constant pool.
References
- Java substring() method memory leak issue and fix
- java - substring method in String class causes memory leak - Stack Overflow
- JLS 3.10.5. String Literals
- JLS 15.28. Constant Expressions
- String.intern in Java 6, 7 and 8 - string pooling
- When does a "literal" in new String("literal") enter the string constant pool? - Wooden girl's answer - Zhihu
- Drill down into String#intern
- JLS 5.4.3. Resolution
- String s = new String("xyz"); How many String instances have you created?
- JVM Internals
- Inside the JVM
- Java virtual machine principle diagram