Preface

A common interview question is: what is the difference between String, StringBuilder, and StringBuffer? The usual answer includes the claim that concatenating Strings with + creates a lot of temporary objects and performs poorly. But is this true?

Concatenation of strings

So when Strings are concatenated with +, are temporary objects created at all? The question is easy to answer: decompile the generated class file with javap and look at what the concatenation was compiled to. Let’s use the example from the strings chapter of Thinking in Java.

Let’s start with the following code:

public class Test {
    public static void main(String[] args) {
        String mango = "mango";
        String s = "abc" + mango + "def" + 47;
        System.out.println(s);
    }
}

This code is a typical use of + to concatenate strings. Next, compile it with javac Test.java and decompile the generated Test.class file with javap -c Test.class. With the extraneous parts removed, the bytecode of main() looks like the following, and something interesting shows up in it.

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=3, args_size=1
         0: ldc           #2 // String mango
         2: astore_1
         3: new           #3 // class java/lang/StringBuilder
         6: dup
         7: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V
        10: ldc           #5 // String abc
        12: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        15: aload_1
        16: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        19: ldc           #7 // String def
        21: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        24: bipush        47
        26: invokevirtual #8 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
        29: invokevirtual #9 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        32: astore_2
        33: getstatic     #10 // Field java/lang/System.out:Ljava/io/PrintStream;
        36: aload_2
        37: invokevirtual #11 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        40: return

This listing is JVM bytecode, which reads much like assembly language. Many instruction reference tables are available online, so the reader can look up what each instruction means; I won’t go into detail here. An instruction may be followed by a constant-pool index (#n) and a // comment naming the object the instruction operates on. Careful readers will have noticed that the compiler automatically introduces java.lang.StringBuilder (the L in front of java/lang/... in the descriptors marks a reference type; those who want to learn more should read the chapter on class file structure in Understanding the Java Virtual Machine). Although we didn’t use StringBuilder in our source code, the compiler took the liberty of using it because it is more efficient.

If you look at the bytecode above, you can see that after the compiler creates the StringBuilder object, it calls append() four times, once for each operand joined by +, and finally calls toString() to produce the result. (Note: if you’re interested, rewrite the code above to use StringBuilder explicitly, recompile it with javac, and decompile it with javap; you’ll see that the bytecode generated for main() is the same.)
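The hand-written equivalent mentioned in the note could look something like the sketch below. The class name ExplicitBuilder and the two helper methods are made up for illustration; the point is only that the chained append() calls mirror, one for one, the four invokevirtual append instructions in the listing above.

```java
public class ExplicitBuilder {
    // Hand-written equivalent of: "abc" + mango + "def" + 47
    // (hypothetical helper, not part of the original example)
    static String concatWithBuilder(String mango) {
        return new StringBuilder()
                .append("abc")   // append(String)
                .append(mango)   // append(String)
                .append("def")   // append(String)
                .append(47)      // append(int)
                .toString();
    }

    // The original form, using the + operator
    static String concatWithPlus(String mango) {
        return "abc" + mango + "def" + 47;
    }

    public static void main(String[] args) {
        String mango = "mango";
        System.out.println(concatWithBuilder(mango)); // prints abcmangodef47
        // Both forms produce the same String value
        System.out.println(concatWithPlus(mango).equals(concatWithBuilder(mango))); // prints true
    }
}
```

Whether the two methods compile to identical bytecode depends on the compiler version, so it is worth checking with javap rather than assuming it.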

Conclusion

From the example above, we can see that when we concatenate strings with + in a single expression, the compiler automatically rewrites it to use StringBuilder, so it does not suffer from the pile of temporary String objects that is often mentioned on the web. Note: StringBuilder was introduced in JDK 5.0; before that, the concatenation was compiled to StringBuffer calls, which interested readers can verify for themselves. On JDK 9 and later, javac by default emits an invokedynamic call to StringConcatFactory instead, so the bytecode shown here is what you get from JDK 5 through 8.

Extension

At this point you might be tempted to use + on strings wherever you like, since the compiler optimizes it for you. Don’t get too excited, because sometimes the compiler’s optimization isn’t what you want. Let’s look at the following code.

The following program builds a String in two ways: the first method uses + on Strings, and the second uses a StringBuilder explicitly.

public class Test {
    public String testString(String[] fields) {
        String result = "";
        for (int i = 0; i < fields.length; i++) {
            result += fields[i];
        }
        return result;
    }

    public String testStringBuilder(String[] fields) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            result.append(fields[i]);
        }
        return result.toString();
    }
}

The two methods above do the same thing: each takes an array of strings and concatenates its elements in a for loop. The only difference is that the first method concatenates with String and +=, while the second uses a StringBuilder. We decompile the class again with javap, strip out the extraneous parts, and look at the bytecode of both methods.

The first is the bytecode of the testString() method:

public java.lang.String testString(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)Ljava/lang/String;
    flags: ACC_PUBLIC
    Code:
      stack=3, locals=4, args_size=2
         0: ldc           #2 // String
         2: astore_2
         3: iconst_0
         4: istore_3
         5: iload_3
         6: aload_1
         7: arraylength
         8: if_icmpge     38
        11: new           #3 // class java/lang/StringBuilder
        14: dup
        15: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V
        18: aload_2
        19: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        22: aload_1
        23: iload_3
        24: aaload
        25: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        28: invokevirtual #6 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        31: astore_2
        32: iinc          3, 1
        35: goto          5
        38: aload_2
        39: areturn

Look at the if_icmpge instruction at offset 8 (the number at the start of each line). It compares i with fields.length and, when i is greater than or equal to the length, jumps to offset 38, which exits the for loop. The loop body therefore runs from offset 8 to offset 35, and the goto at offset 35 jumps back to the start of the loop (offset 5). Now look at offset 11 inside the loop body: it is a new instruction, which is all too familiar, since it creates an object. Because it sits inside the loop, a new StringBuilder object is created on every iteration, and each iteration also calls toString() to produce a new String. This is clearly wasteful.
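Translated back into source code, the loop in that bytecode behaves roughly like the sketch below. This is a reconstruction for illustration only, not compiler output, and the class and method names are hypothetical.

```java
public class LoopDesugared {
    // Roughly what the bytecode of testString() does for: result += fields[i]
    static String testStringDesugared(String[] fields) {
        String result = "";
        for (int i = 0; i < fields.length; i++) {
            // A fresh StringBuilder is allocated on every pass through the loop,
            // the accumulated result is copied into it, and toString() then
            // copies everything again into a new String.
            result = new StringBuilder()
                    .append(result)
                    .append(fields[i])
                    .toString();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(testStringDesugared(new String[] {"a", "b", "c"})); // prints abc
    }
}
```

Written this way, the cost is easier to see: the longer result gets, the more characters are copied on each iteration.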

Next is the bytecode of the testStringBuilder() method:

public java.lang.String testStringBuilder(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)Ljava/lang/String;
    flags: ACC_PUBLIC
    Code:
      stack=3, locals=4, args_size=2
         0: new           #3 // class java/lang/StringBuilder
         3: dup
         4: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V
         7: astore_2
         8: iconst_0
         9: istore_3
        10: iload_3
        11: aload_1
        12: arraylength
        13: if_icmpge     30
        16: aload_2
        17: aload_1
        18: iload_3
        19: aaload
        20: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        23: pop
        24: iinc          3, 1
        27: goto          10
        30: aload_2
        31: invokevirtual #6 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        34: areturn

As you can see, not only is the loop body shorter and simpler, but new is executed only once, before the loop starts, so only one StringBuilder object is created.

Extended conclusion

So, when you write a toString() method for a class, if the string operations are simple, you can rely on the compiler to construct the final string sensibly for you. However, if you are going to use a loop in toString(), it is better to create a StringBuilder object yourself. And of course, when in doubt, you can always use javap to analyze your program.
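As a minimal sketch of that advice, here is a toString() that loops over its data with a single StringBuilder. The Basket class and its items field are invented for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class used only to illustrate a loop inside toString()
public class Basket {
    private final List<String> items;

    public Basket(List<String> items) {
        this.items = items;
    }

    @Override
    public String toString() {
        // One StringBuilder is created up front and reused for every element
        StringBuilder sb = new StringBuilder("Basket[");
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(items.get(i));
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(new Basket(Arrays.asList("mango", "apple", "kiwi")));
        // prints Basket[mango, apple, kiwi]
    }
}
```

Running javap -c on this class should show a single new for StringBuilder in toString(), outside the loop.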

A question to leave you with: enums are familiar to most Java developers, but do you know how they are actually implemented for the JVM? (It is easy to find out: write some enum code and decompile it with javap.)

Reference

– Thinking in Java