This is the 27th day of my participation in Gwen Challenge

This article is part 14 of jakewharton’s series on D8 and R8.

  • D8 Optimizations
  • Originally written by jakewharton
  • Translator: Antway

No, that’s not a typo! While the optimizations in this series so far have been done by R8 during overall program tuning, D8 can also perform some simple optimizations.

D8 was introduced to Android in this article as the new Java-to-Dalvik bytecode compiler. It handles porting of Java8 (and Java9 and later) language features to run on Android. It can also resolve vendor – and version-specific errors in the platform.

We’ve seen this from the D8 series so far, but it has two other responsibilities that we’ll cover in this and the next installment:

  • There is no oldAPILevel of place to useBackportingMethods.
  • Perform local optimizations to reduce bytecode size and/or improve performance.

We will discuss API adaptation in the next article in this series. Now, let’s look at some of the local optimizations that D8 might perform.

1. Rewrite the Switch

In the past two articles, we’ve covered tuning the Switch. Both D8 and R8 are slightly lying about the bytecode they generate for certain switch statements. Let’s look at one of those examples again.

enum Greeting {
  FORMAL, INFORMAL;
  
  static String greetingType(Greeting greeting) {
    switch (greeting) {
      case FORMAL: return "formal";
      case INFORMAL: return "informal";
      default: throw newAssertionError(); }}}Copy the code

The lookupswitch bytecode shown in the Java bytecode above for the greetingType method uses the offset of the jump position when matching values.

static java.lang.String greetingType(Greeting);
  Code:
     0: getstatic     #2      // Field Main$1.$SwitchMap$Greeting:[I
     3: aload_0
     4: invokevirtual #3      // Method Greeting.ordinal:()I
     7: iaload
     8: lookupswitch  {
                   1: 36
                   2: 39
             default: 42
        }
    36: ldc           #4      // String formal
    38: areturn
    39: ldc           #5      // String informal
    41: areturn
    42: new           #6      // class java/lang/AssertionError
    45: dup
    46: invokespecial #7      // Method java/lang/AssertionError."<init>":()V
    49: athrow
Copy the code

Tableswitch Java bytecode is rewritten as Packed -switch bytecode when converted to Dalvik bytecode.

[000584] Main.greetingType:(LGreeting;) Ljava/lang/String;0000: sget-object v0, LMain$1; .$SwitchMap$Greeting:[I0002: invoke-virtual {v2}, LGreeting; .ordinal:()I0005: move-result v1
0006: aget v0, v0, v1
0008: packed-switch v0, 00000017
000b: new-instance v0, Ljava/lang/AssertionError;
000d: invoke-direct {v0}, Ljava/lang/AssertionError; .<init>:()V0010: throw v0
0011: const-string v0, "formal"
0013: return-object v0
0014: const-string v0, "informal"
0016: return-object v0
0017: packed-switch-data (8 units)
Copy the code

If we actually compiled and indexed the source file above with D8, its Dalvik bytecode output would be different.

[0005f0] Main.greetingType:(LGreeting;) Ljava/lang/String;0000: sget-object v0, LMain$1; .$SwitchMap$Greeting:[I0002: invoke-virtual {v1}, LGreeting; .ordinal:()I0005: move-result v1
 0006: aget v0, v0, v1
-0008: packed-switch v0, 00000017+ 0008:const/4 v1, #int 1+ 0009:if-eq v0, v1, 0014
+000b: const/4 v1, #int 2
+000c: if-eq v0, v1, 0017
 000e: new-instance v0, Ljava/lang/AssertionError;
 0010: invoke-direct {v0}, Ljava/lang/AssertionError; .<init>:()V0013: throw v0
 0014: const-string v0, "formal"
 0016: return-object v0
 0017: const-string v0, "informal"
 0019: return-object v0
-0017: packed-switch-data (8 units)
Copy the code

You can see that the Packed -switch at position 008 is replaced with a series of if/else if checks. Based on the index, you might think this would result in a larger binary, but in fact the opposite is true. The original Packed -switch instruction is compiled into Packed -switch-data bytecode and takes up 8 cell lengths. So the whole Packed -switch needs 26 bytes of instruction while if/else if only needs 20 bytes of instruction.

The switch is typically overridden only when bytecode savings are available. This depends on the number of case blocks, whether there is a Fallthrough, and whether the values are continuous. D8 calculates the cost of both forms and chooses the smaller one.

2. String optimization

Back in February, I wrote an article about R8’s string constant manipulation. It describes an OkHttp example where a constant is optimized for a string.length call.

static String patternHost(String pattern) {
  return pattern.startsWith(WILDCARD)
      ? pattern.substring(WILDCARD.length())
      : pattern;
}
Copy the code

When using the old DX compiler, the output was a simple transcoding.

[0001a8] Test.patternHost:(Ljava/lang/String;) Ljava/lang/String;0000: const-string v0, "*."
0002: invoke-virtual {v2, v0}, Ljava/lang/String; .startsWith:(Ljava/lang/String;) Z0005: move-result v1
0006: if-eqz v1, 00100008: invoke-virtual {v0}, Ljava/lang/String; .length:()I0011: move-result v1
0012: invoke-virtual {v2, v1}, Ljava/lang/String; .substring:(I)Ljava/lang/String;000f: move-result-object v2
0010: return-object v2
Copy the code

The bytecode at line 008 is a call to the string. length method and returns a value corresponding to the constant at position 0000.

However, for D8, this method calls a constant, which is detected at compile time and evaluated to the corresponding value.

[0001a8] Test.patternHost:(Ljava/lang/String;) Ljava/lang/String;0000: const-string v0, "*."
 0002: invoke-virtual {v1, v0}, Ljava/lang/String; .startsWith:(Ljava/lang/String;) Z0005: move-result v0
 0006: if-eqz v0, 000d-0008: invoke-virtual {v0}, Ljava/lang/String; .length:()I -0011: move-result v1
+0008: const/4 v0, #int 20009: invoke-virtual {v1, v0}, Ljava/lang/String; .substring:(I)Ljava/lang/String; 000c: move-result-object v1000d: return-object v1
Copy the code

Deleting method calls is not something D8 or even R8 would normally do. It is safe to apply this optimization because String is the last class in the framework with well-defined behavior.

In the nine months since the first article was published, the number of methods that can be optimized on a string has grown dramatically. Both D8 and R8 will evaluate isEmpty(), StartsWith (String), endsWith(String), contains(String), Equals(String), equalsIgnoreCase(String), contentEquals(String), hashC Ode (), Length (), indexOf(String), indexOf(int), lastIndexOf(String), lastIndexOf(int), compareTo(String), compareToIgnoreCase(S) Tring), subString (int), subString (int, int), and trim () on constant strings. Obviously, most of these methods are unlikely to be applied without R8 inlining, but they are there when it happens.

3. The array length is known

Just as you would call the Length () method on a constant string, it is not unusual to call the lenght() method on a fixed-length array.

Let’s take a look at an example from OkHttp:

private fun decodeIpv6(input: String, pos: Int, limit: Int): InetAddress? {
  val address = ByteArray(16)
  var b = 0

  var i = pos
  while (i < limit) {
    if (b == address.size) return null // Too many groups.
Copy the code

Address.size (at the bytecode level, calling lenght) is 16 constants that can be used to avoid duplication. The disadvantage is that each iteration of this parsing loop resolves the array length in the DX output.

[00020c] OkHttpKt.decodeIpv6:(Ljava/lang/String; II)Ljava/net/InetAddress; 0000: const/16 v5,#int 16
0002: new-array v0, v5, [B
0004: const/4 v1, #int 0
0005: const/4 v2, #int 0
0006: if-ge v2, v8, 0036
0008: array-length v6, v0
0009: if-ne v1, v6, 000b
 ⋮
Copy the code

The constant 16 in the bytecode at 0000 is registered in V5 and referenced by the bytecode at 0002, so the length of the array is stored in V0. The loop I < limit from position 0006, inside the loop, the array length identified by V0 at position 0008 is loaded into V6 and used in if at position 0009.

D8 recognizes that the lookup for Lenght is done on an array reference that does not change and whose size is known at compile time.

[00020c] OkHttpKt.decodeIpv6:(Ljava/lang/String; II)Ljava/net/InetAddress; 0000: const/16 v5,#int 16
 0002: new-array v0, v5, [B
 0004: const/4 v1, #int 0
 0005: const/4 v2, #int 00006: IF-GE v2, V8, 0036-0008: Array-length v6, V0-0009: IF-NE v1, v6, 000b +0009: IF-NE v1, V5, 000b ⋮Copy the code

The call to array-length was removed, and the if statement was rewritten to reuse values from V5.

In itself, this pattern is not very common. R8 works well again when it is inlined and the method that checks array.length is inlined into the caller that declares a new array.


Each optimization is small. D8 can only perform optimizations without external visibility and without changing the behavior of the program. This largely limits the optimizations it can make in a single method body.

At run time, you cannot tell if the switch is reconnected to the if/else condition. Cannot tell if the length() call to a constant string has been replaced with its equivalent constant value. Cannot tell if the call to Length on an array initialized in the same method has been replaced with an input size. Every optimization that D8 can perform (and a few others) results in smaller, more efficient bytecode. Of course, their impact is multiplied when you call the full functionality of R8.

In the next article, we’ll start talking about how D8 can be compatible with new apis on existing types in order to work at the old API level.