Java String class

The String class is the most commonly used reference type in Java, so it is worth exploring from the ground up to understand it better.

The String class has two main fields (as of JDK 8):

  1. char value[] (stores the characters)
  2. int hash (caches the hash code; defaults to 0)

A String can be created in two ways:

  1. With a literal, for example String s = "ABC";

  2. With a constructor, for example new String("ABC"). The String class has many overloaded constructors:

String()
/* Creates an empty string by assigning the value array of "" to the value of the new object */
String(String original)
/* Creates a string by assigning the value (and hash) of the argument string to the new object */
String(char value[])
/* Assigns the value of the new object via Arrays.copyOf(value, value.length). Arrays.copyOf is used because assigning one array variable to another makes both refer to the same array, so a change made through one would be visible through the other. Since strings are immutable, sharing the caller's array directly would conflict with that design. The constructor therefore copies the characters into a new array, which preserves the immutability of the string */
String(char value[], int offset, int count)
/* offset: index in the char array of the first character of the string. count: length of the string. If offset or count is negative, a StringIndexOutOfBoundsException is thrown. If count is 0 and offset is within the array, the value of "" is assigned and the constructor returns. If offset + count exceeds the array length, an exception is thrown as well. Otherwise Arrays.copyOfRange(value, offset, offset + count) is called; a copy is used for the same reason as in the constructor above, to preserve immutability */
String(int[] codePoints, int offset, int count)
/* codePoints: an int array of Unicode code points (not just ASCII values). This constructor builds a string from the code points passed in; offset and count are checked in the same way as above. A first loop computes the required char array length: a code point within the char range needs one char, any other valid code point needs one more, and an invalid code point causes an IllegalArgumentException. A second loop fills the char array: a value within the char range is cast and stored directly; any other value is split into a surrogate pair using highSurrogate and lowSurrogate. A code point outside the char range therefore occupies two char values */
// The following two constructors are deprecated
/* String(byte ascii[], int hibyte, int offset, int count)
String(byte ascii[], int hibyte) */
String(byte bytes[], int offset, int length, String charsetName)
/* Throws a NullPointerException if charsetName == null, checks that offset and length are within the bounds of bytes, then calls StringCoding.decode(String charsetName, byte[] ba, int off, int len) to assign value */
String(byte bytes[], int offset, int length, Charset charset)
/* Throws a NullPointerException if charset == null, checks that offset and length are within the bounds of bytes, then calls StringCoding.decode(Charset cs, byte[] ba, int off, int len) to assign value */
String(byte bytes[], String charsetName)
/* Calls String(byte bytes[], int offset, int length, String charsetName) with offset = 0, length = bytes.length */
String(byte bytes[], Charset charset)
/* Calls String(byte bytes[], int offset, int length, Charset charset) with offset = 0, length = bytes.length */
String(byte bytes[], int offset, int length)
/* Checks offset and length, then calls StringCoding.decode(byte[] ba, int off, int len), which uses the platform's default charset */
String(byte bytes[])
/* Calls String(byte bytes[], int offset, int length) with offset = 0, length = bytes.length */
String(StringBuffer buffer)
/* Assigns value via Arrays.copyOf(buffer.getValue(), buffer.length()), copying for the same reason as above. Because StringBuffer is thread-safe, the copy is made inside a block synchronized on the buffer */
String(StringBuilder builder)
/* Assigns value via Arrays.copyOf(builder.getValue(), builder.length()). StringBuilder is not thread-safe, so no synchronization is used */
String(char[] value, boolean share)
/* Package-private. Assigns the given array directly to the new object without copying. The share parameter is not used */
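The two behaviors described above that are easiest to observe from outside the class are the defensive copy made by String(char value[]) and the two-char encoding of code points beyond the char range. The following is a minimal sketch; the class and method names (ConstructorCopyDemo, copyThenMutate, lengthOfSupplementary) are made up for illustration.

```java
class ConstructorCopyDemo {
    // Builds a String from a char[] and then mutates the source array.
    static String copyThenMutate() {
        char[] chars = {'a', 'b', 'c'};
        String s = new String(chars); // the constructor copies the array
        chars[0] = 'x';               // later changes to the array do not reach s
        return s;
    }

    // Builds a String from one code point that lies outside the char range.
    static int lengthOfSupplementary() {
        int[] codePoints = {0x1F600}; // U+1F600 is above U+FFFF
        String s = new String(codePoints, 0, codePoints.length);
        return s.length();            // counted in char units, not code points
    }

    public static void main(String[] args) {
        System.out.println(copyThenMutate());        // abc
        System.out.println(lengthOfSupplementary()); // 2
    }
}
```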

How many objects does calling new String("ab") create in the JVM?

public class StringTest {
    public static void main(String[] args) {
        String s = new String("ab");
    }
}

Corresponding JVM instructions

The JVM instructions show that two objects are created: one String object on the heap (created by new) and one for the literal "ab" in the string constant pool, assuming "ab" is not already in the pool
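The object count itself cannot be read off at run time, but the fact that the heap object and the pooled literal are two different objects can be checked with ==. A small sketch, with a hypothetical class name NewStringDemo:

```java
class NewStringDemo {
    // Compares the reference of a literal with that of a String made by new.
    static boolean sameReference() {
        String literal = "ab";             // refers to the pooled string
        String created = new String("ab"); // a separate object on the heap
        return literal == created;
    }

    // Compares the contents of the same two strings.
    static boolean sameContent() {
        return "ab".equals(new String("ab"));
    }

    public static void main(String[] args) {
        System.out.println(sameReference()); // false
        System.out.println(sameContent());   // true
    }
}
```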

So how many objects does the following code create?

public class StringTest07 {
    public static void main(String[] args) {
        String s = new String("a") + new String("b");
    }
}

Corresponding JVM instructions

1. new creates a StringBuilder object

2. new String("a") creates a String object on the heap

3. The literal "a" is created in the string constant pool

4. new String("b") creates a String object on the heap

5. The literal "b" is created in the string constant pool

Five in all? The answer is no.

The instructions end by calling the toString() method of StringBuilder and assigning the result to s.

StringBuilder toString() source code

toString() calls a String constructor. Note that this constructor only creates an object on the heap, not in the string constant pool.

6. The toString() method creates a new String object with the value "ab"

So there are six objects.

Intern method of the String class

Since a String is backed by a char array, most String methods operate on that array. There are too many methods to explain them all here.

But there is one important method in the String class, intern(), that deserves an explanation.

First we look at the comments in the source code for the intern() method

  /**
     * Returns a canonical representation for the string object.
     * <p>
     * A pool of strings, initially empty, is maintained privately by the
     * class {@code String}.
     * <p>
     * When the intern method is invoked, if the pool already contains a
     * string equal to this {@code String} object as determined by
     * the {@link #equals(Object)} method, then the string from the pool is
     * returned. Otherwise, this {@code String} object is added to the
     * pool and a reference to this {@code String} object is returned.
     * <p>
     * It follows that for any two strings {@code s} and {@code t},
     * {@code s.intern() == t.intern()} is {@code true}
     * if and only if {@code s.equals(t)} is {@code true}.
     * <p>
     * All literal strings and string-valued constant expressions are
     * interned. String literals are defined in section 3.10.5 of the
     * <cite>The Java&trade; Language Specification</cite>.
     *
     * @return  a string that has the same contents as this string, but is
     *          guaranteed to be from a pool of unique strings.
     */

From the source comments we can see:

Calling intern() first checks the string constant pool for a string with the same content, compared using equals. If one exists, the reference held by the pool is returned. If not, a reference to this String object is added to the pool (since JDK 7 the pool records the address of the existing heap object rather than copying it) and that reference is returned
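The contract quoted from the Javadoc, that s.intern() == t.intern() exactly when s.equals(t), can be checked directly. This is a minimal sketch with a made-up class name (InternDemo); it only relies on the documented behavior of intern().

```java
class InternDemo {
    // Returns three comparisons between two equal strings built in different ways.
    static boolean[] check() {
        String a = new String("hello");
        String b = new StringBuilder("hel").append("lo").toString();
        return new boolean[] {
            a == b,                   // two separate heap objects
            a.intern() == b.intern(), // equal contents share one pooled string
            a.intern() == "hello"     // literals are always interned
        };
    }

    public static void main(String[] args) {
        boolean[] results = check();
        System.out.println(results[0]); // false
        System.out.println(results[1]); // true
        System.out.println(results[2]); // true
    }
}
```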

The test code

public class StringTest {
    public static void main(String[] args) {
        String s1 = new String("ab"); // 1
        s1.intern();                  // 2
        String s2 = "ab";             // 3
        System.out.println(s1 == s2);
    }
}

The result is false

Reason: executing line 1 creates two objects, one on the heap and one in the string constant pool. When line 2 runs, "ab" already exists in the pool, so nothing new is added. Line 3 then refers directly to the string in the pool, while s1 still points to the heap object, so the result is false

What about the following code?

public class StringTest {
    public static void main(String[] args) {
        String s1 = new String("a") + new String("b"); // 1
        s1.intern();                                   // 2
        String s2 = "ab";                              // 3
        System.out.println(s1 == s2);
    }
}

The result is true (on JDK 7 and later)

Reason: executing line 1 does not create an "ab" object in the string constant pool. Executing line 2 therefore places a reference to the heap object created by line 1 into the pool. When line 3 looks in the pool, it finds that reference with an equal value and uses the same address, so the result is true

The immutable nature of the String class

Immutable means that once an object is created, its internal state never changes. No thread can modify its internal state or data, and the state never changes on its own. This is what makes the String class safe and reliable in a multithreaded environment

How is the immutable nature of the String class guaranteed?

Looking at the source code, we find:

1. The String class is declared final, and a final class cannot be inherited, so no subclass can override its methods and alter its behavior

2. The most important field of the String class is the char array value, which holds the data of the string. Because the array is private and final, code outside the class cannot access it and the reference cannot be reassigned

3. Methods in the String class that appear to modify the string return a new String instead

This also ensures the immutable nature of String
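The third point is easy to confirm with any of the "modifying" methods. The sketch below uses concat and replace as examples; the class name ImmutableMethodsDemo is made up for illustration.

```java
class ImmutableMethodsDemo {
    // Applies two "modifying" methods and returns {original, joined, replaced}.
    static String[] modify() {
        String original = "abc";
        String joined = original.concat("def");      // returns a new String
        String replaced = original.replace('a', 'x'); // returns a new String
        return new String[] {original, joined, replaced};
    }

    public static void main(String[] args) {
        String[] results = modify();
        System.out.println(results[0]); // abc (unchanged)
        System.out.println(results[1]); // abcdef
        System.out.println(results[2]); // xbc
    }
}
```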

But is String immutability really unbreakable? The answer is no, so let’s test it out

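import java.lang.reflect.Field;

public class StringReflectionTest {
    // Relies on JDK 8 internals, where value is a char[]. On JDK 9 and later the
    // field is a byte[] and reflective access to it is restricted, so this fails there.
    public static void main(String[] args) throws Exception {
        String s1 = "abcd";
        String s2 = s1;
        Class<? extends String> aClass = s2.getClass();
        Field value = aClass.getDeclaredField("value");
        value.setAccessible(true);          // bypass the private modifier
        char[] o = (char[]) value.get(s2);  // the array backing both s1 and s2
        o[0] = 'e';                         // change the contents in place
        System.out.println(s1);
        System.out.println(s2);
        System.out.println(s1 == s2);
    }
}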

On JDK 8 the test prints ebcd for both s1 and s2, followed by true. This shows that the immutable nature of the String class can be broken using reflection.

Conclusion

The immutability of the String class in Java is a convention, a specification that we should adhere to. It is also a design worth learning from: when we need an immutable type in the future, we can imitate the String class to achieve it.
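As one possible way to imitate String, the sketch below applies the same three techniques discussed above (a final class, a private final array with defensive copies, and "modifying" methods that return a new object). ImmutableIntList and its methods are hypothetical names, not part of the JDK.

```java
import java.util.Arrays;

final class ImmutableIntList {          // final: cannot be subclassed
    private final int[] values;         // private and final, like String.value

    ImmutableIntList(int[] values) {
        // Defensive copy, so the caller's array is never shared.
        this.values = Arrays.copyOf(values, values.length);
    }

    int get(int index) {
        return values[index];
    }

    // Returns a new list with one element changed; this list stays as it was.
    ImmutableIntList with(int index, int value) {
        int[] copy = Arrays.copyOf(values, values.length);
        copy[index] = value;
        return new ImmutableIntList(copy);
    }

    public static void main(String[] args) {
        int[] source = {1, 2, 3};
        ImmutableIntList list = new ImmutableIntList(source);
        source[0] = 99;                              // does not affect list
        ImmutableIntList changed = list.with(0, 7);  // does not affect list either
        System.out.println(list.get(0));    // 1
        System.out.println(changed.get(0)); // 7
    }
}
```

As the reflection test showed for String, this kind of immutability holds only for code that goes through the public API.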

String memory allocation

The Java language has eight primitive data types and one special type, String. To make them faster and more memory-efficient at run time, these types come with the concept of a constant pool.

A constant pool is like a cache provided at the Java system level. The constant pools for the eight primitive data types are coordinated by the system, and the String constant pool is a special case. It can be used in two main ways.

  • Strings declared directly in double quotes are stored directly in the constant pool.
    • For example: String info = "ABC";
  • For a String object that was not declared in double quotes, you can use the intern() method provided by String.

In Java 6 and earlier, the string constant pool was stored in the permanent generation.

In Java 7, Oracle engineers made a major change to the string pool logic by moving the string constant pool into the Java heap.

  • All strings are stored in the heap, just like any other ordinary object, so only the heap size needs to be adjusted when tuning an application.
  • The string constant pool concept was already widely used, but this change gives us reason enough to reconsider using String.intern() in Java 7.

In Java 8, the permanent generation is replaced by Metaspace, and string constants remain in the heap.

The underlying structure of a String Pool

The String Pool is a hash table: an array of buckets indexed by the hash value of a string, where each bucket is a linked list holding the strings whose hash values map to that bucket

The original default size of the String Pool is 1009. If too many strings are put into the String Pool, hash collisions become frequent and the linked lists grow long. The direct effect of long lists is that calls to String.intern() become much slower.

Use -XX:StringTableSize to set the length of the StringTable

In JDK 6, the StringTable length is fixed at 1009, so efficiency drops quickly if the constant pool holds too many strings. There is no restriction on the StringTableSize setting

In JDK 7, the default StringTable length is 60013

Starting with JDK 8, 1009 is the minimum StringTable length that can be set
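To make the bucket-and-chain structure concrete, here is a toy model of such a table written in Java. It is only an illustration of the idea: the real StringTable lives inside the JVM, and ToyStringTable, intern and chainLength are hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ToyStringTable {
    // One linked list (chain) per bucket; the bucket count never changes.
    private final List<LinkedList<String>> buckets = new ArrayList<>();

    ToyStringTable(int size) {
        for (int i = 0; i < size; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    private LinkedList<String> bucketFor(String s) {
        return buckets.get(Math.floorMod(s.hashCode(), buckets.size()));
    }

    // Returns the pooled string equal to s, adding s itself if none exists yet.
    String intern(String s) {
        LinkedList<String> chain = bucketFor(s);
        for (String pooled : chain) {   // cost grows with the chain length
            if (pooled.equals(s)) {
                return pooled;
            }
        }
        chain.add(s);
        return s;
    }

    int chainLength(String s) {
        return bucketFor(s).size();
    }

    public static void main(String[] args) {
        ToyStringTable table = new ToyStringTable(1009);
        String first = new String("ab");
        String second = new String("ab");
        System.out.println(table.intern(first) == first);  // true
        System.out.println(table.intern(second) == first); // true

        ToyStringTable tiny = new ToyStringTable(1); // every string collides
        tiny.intern("a");
        tiny.intern("b");
        tiny.intern("c");
        System.out.println(tiny.chainLength("a")); // 3
    }
}
```

With a single bucket every lookup walks the whole chain, which is the slowdown described above for a table that is too small for the number of strings.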

String splicing operation

First, concatenating constants with the + sign is optimized at compile time, as we can see from the bytecode file

Java code

Bytecode file
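A small check of this compile-time optimization: because a concatenation of literals is a constant expression, the result is the same pooled object as the equivalent literal. The class name ConstantFoldingDemo is made up for this sketch.

```java
class ConstantFoldingDemo {
    // Compares a concatenation of literals with the equivalent single literal.
    static boolean foldedIsPooled() {
        String s1 = "a" + "b" + "c"; // folded to "abc" by the compiler
        String s2 = "abc";
        return s1 == s2;
    }

    public static void main(String[] args) {
        System.out.println(foldedIsPooled()); // true
    }
}
```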

Second, when literals and variables are concatenated with the + sign, we analyze what happens from the JVM instructions

Java code

Java code corresponds to instructions

Analysis of instructions:

1. Create a StringBuilder object

2. Call the append method of StringBuilder to append "ab", then call append again to append the value of s1, giving the final StringBuilder object

3. Assign s2 by calling toString
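The visible consequence of these steps is that the result is a new heap object rather than the pooled literal. The sketch below checks that; the class name VariableConcatDemo is hypothetical. Newer JDKs (9 and later) compile the concatenation differently from the StringBuilder sequence above, but the result is still a new object.

```java
class VariableConcatDemo {
    // Concatenates a literal with a non-final variable and compares references.
    static boolean sameReference() {
        String s1 = "c";
        String s2 = "ab" + s1; // built at run time, so a new heap object
        return s2 == "abc";
    }

    // The contents are still equal.
    static boolean sameContent() {
        String s1 = "c";
        String s2 = "ab" + s1;
        return s2.equals("abc");
    }

    public static void main(String[] args) {
        System.out.println(sameReference()); // false
        System.out.println(sameContent());   // true
    }
}
```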

But does using a variable always mean that a StringBuilder object is created for the concatenation? Let's continue testing with code

Java code

Class file code

As the class file shows, variables declared final are optimized at compile time. A final String variable initialized with a constant is itself a compile-time constant, so its value is placed in the class's constant pool and the concatenation is resolved at compile time
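The following sketch contrasts a final variable initialized with a literal against a final variable whose initializer is not a constant expression; only the first is folded at compile time. FinalConcatDemo and its method names are made up for illustration.

```java
class FinalConcatDemo {
    // final + literal initializer: a compile-time constant, so "ab" + s1 is folded.
    static boolean finalConstant() {
        final String s1 = "c";
        String s2 = "ab" + s1;
        return s2 == "abc";
    }

    // final, but initialized with new: not a constant, so concatenated at run time.
    static boolean finalNonConstant() {
        final String s1 = new String("c");
        String s2 = "ab" + s1;
        return s2 == "abc";
    }

    public static void main(String[] args) {
        System.out.println(finalConstant());    // true
        System.out.println(finalNonConstant()); // false
    }
}
```

So the final modifier alone is not enough; the initializer also has to be a constant expression.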

Finally, let’s do a little quiz

Java code

The test results

Analysis of the results:

1. True, because constants concatenated with the + sign are optimized at compile time, so both sides refer to the same string in the constant pool

2. False, because concatenating a variable with the + sign creates a StringBuilder underneath and calls its append method, and the resulting string is a new object on the heap

3. For the same reason as 2

4. True, because final String variables initialized with constants are treated as constants and resolved at compile time

5. For the same reason as 2

6. For the same reason as 2