Exploring Douyin Android package size optimization: shrinking DEX at the class bytecode stage

Preface

It is well known that the size of an installation package strongly affects how quickly an application downloads and installs. According to data released by Google Play on the impact of package size on conversion rates, the install conversion rate generally decreases as package size increases.

Therefore, to improve our download conversion rate (the ratio of users who download, install and activate the app to potential users), we must optimize and control package size.

The installation package we provide for download in app stores uses the APK format defined by Android. In essence, it is a ZIP archive containing everything the application needs, made up of several components:

The main component is the DEX files, which are compiled from Java/Kotlin code. Over the past two years, the number of DEX files in Douyin has grown from 8 to 21, and their total size from 26 MB to 48 MB. With Douyin's rapid development and growing business complexity, the amount of code inevitably increases. Finding general code optimizations that are transparent to business code is therefore an important direction for us.

Before introducing specific optimization methods, we first need to understand the overall approach to DEX optimization.

General approach to DEX optimization

During an AGP (Android Gradle Plugin) build, Java or Kotlin source code is first compiled into class bytecode files. At this stage AGP provides the Transform API for processing bytecode, and ProGuard also works here. The class files are then converted into a set of smaller DEX files by dexBuilder, merged into the final DEX files by mergeDex, and packaged into the APK. The process is shown in the figure below:

DEX optimization can therefore happen at three stages: the .kt or .java source files, the class files, and the DEX files:

Processing source files means modifying code by hand, which intrudes on the program design itself and is severely limited;
At the class bytecode stage, changes are invisible to developers and most optimizations are possible, but not those involving the DEX format itself, such as cross-DEX reference optimization;
The DEX file stage gives the best results, because we can optimize both the DEX bytecode itself and the DEX file format.

In general, the optimization techniques are redundancy removal, content simplification, format optimization, and so on.

Because Douyin's class bytecode modification tooling matured early, much of the package size optimization was done by modifying class bytecode. As the work deepened, many optimizations were later handled at the DEX file stage as well; those will be covered in follow-up articles. This article focuses on optimizations at the class bytecode stage, which fall into two categories:

Simple removal of useless instructions, including removing redundant assignments and deleting side-effect-free code;
Optimizations that, besides reducing the number of instructions, also reduce the number of methods and fields, and thus the number of DEX files. The number of method or field references in a single DEX file cannot exceed 65,536 (the 64K limit); once the limit is reached, a new DEX file must be created. Reducing the number of methods and fields can therefore reduce the number of DEX files. Short method inlining, constant field elimination, and R constant inlining belong to this category.

We’ll go into the background, ideas, and benefits of each optimization in detail.

Remove redundant assignments

In our normal code development, we might write the following code:

class MyClass {
    private boolean aBoolean = false;
    private static boolean aBooleanStatic = false;

    private void boo() {
        if (!aBoolean) {
            System.out.println("in aBoolean false!");
        }
        if (!aBooleanStatic) {
            System.out.println("in aBooleanStatic false!");
        }
    }
}

We often explicitly initialize member variables to make sure their initial values are what we expect, as with aBoolean and aBooleanStatic above. This is logically safe, but is it really necessary?

The official Java Virtual Machine Specification (docs.oracle.com/javase/spec…) states that when a class is loaded by the virtual machine, all of its static fields (static member variables; member variables are referred to as fields below) are first given a default value. Instance fields are likewise given default values when an object is created.

2.3 Primitive Types and Values

...

The integral types are:

byte, whose values are 8-bit signed two's-complement integers, and whose default value is zero

short, ... whose default value is zero

int, ... whose default value is zero

long, ... whose default value is zero

char, ... whose default value is the null code point ('\u0000')

The floating-point types are:

float, ... whose default value is positive zero

double, ... whose default value is positive zero

2.4. Reference Types and Values

... The null reference initially has no run-time type, but may be cast to any type. The default value of a reference type is null.

To summarize, both primitive and reference fields in Java are given a default value when the class is loaded: byte, short, int, long, float, and double fields are set to 0, boolean fields to false, char fields to '\u0000', and reference fields to null.
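The default values summarized above can be observed directly. The following is a minimal sketch; the class name DefaultValues and its field names are made up for illustration. None of the fields has an initializer, yet each one already holds its default value.

```java
// Minimal sketch: fields without an initializer already hold their default values.
// DefaultValues and its field names are hypothetical, chosen only for this example.
class DefaultValues {
    boolean flag;               // default: false
    int count;                  // default: 0
    long total;                 // default: 0L
    double ratio;               // default: positive zero
    char letter;                // default: '\u0000'
    String name;                // default: null
    static boolean staticFlag;  // static fields are defaulted as well

    public static void main(String[] args) {
        DefaultValues v = new DefaultValues();
        // Print every field; no explicit assignment has happened anywhere.
        System.out.println(v.flag + " " + v.count + " " + v.total + " " + v.ratio
                + " " + (int) v.letter + " " + v.name + " " + staticFlag);
    }
}
```

Writing "= false" or "= 0" on any of these declarations would not change the observable values; it would only add instructions to the constructor or static initializer.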

We convert the code at the beginning of this section to bytecode with the command javap -p -v:

public com.bytedance.android.dexoptimizer.MyClass();
    Code:
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: aload_0
         5: iconst_0
         6: putfield      #2                  // Field aBoolean:Z
         9: return

static {};
    Code:
         0: iconst_0
         1: putstatic     #3                  // Field aBooleanStatic:Z
         4: return

private void boo();
    Code:
         0: aload_0
         1: getfield      #2                  // Field aBoolean:Z
         4: ifne          15
         7: getstatic     #4                  // Field java/lang/System.out:Ljava/io/PrintStream;
        10: ldc           #5                  // String in aBoolean false!
        12: invokevirtual #6                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        15: getstatic     #3                  // Field aBooleanStatic:Z
        18: ifne          29
        21: getstatic     #4                  // Field java/lang/System.out:Ljava/io/PrintStream;
        24: ldc           #7                  // String in aBooleanStatic false!
        26: invokevirtual #6                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        29: return

From the bytecode above, we can see that although the JVM already sets aBoolean to 0 at run time, the bytecode assigns 0 to aBoolean again; the same is true of aBooleanStatic.

public com.bytedance.android.dexoptimizer.MyClass();
    Code:
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: aload_0
         5: iconst_0
         6: putfield      #2                  // Field aBoolean:Z
         9: return

The redundant assignment is the aload_0, iconst_0, putfield sequence at offsets 4 to 6 above (highlighted in red in the original); removing it does not affect runtime logic. We therefore remove such redundant bytecode during class bytecode processing to reduce package size.

Optimization idea

Once you understand the cause of the problem, it’s easy to find a solution. First, Field assignments that can be optimized need to satisfy these three conditions:

The field is defined directly in the current class, not inherited from a parent class;
The assignment occurs in the class's <clinit> or <init> method. This restriction mainly reduces complexity: private methods called only from these two methods could be optimized in the same way, but analyzing them is far more complex;
The assigned value is the default value, and when a field is assigned several times, no assignment that follows a non-default assignment can be optimized.

Let’s use the following code to specify whether each case can be optimized:

class MyClass {
    // Can be optimized: directly defined, and the value is the default
    private boolean aBoolean = false;
    // Cannot be optimized: the value is not the default
    private boolean bBoolean = true;
    // Can be optimized: directly defined, and the value is the default
    private static boolean aBooleanStatic = false;

    static {
        // Can be optimized: first assignment, and the value is the default
        aBooleanStatic = false;
        // Other code...

        // Can be optimized: no non-default assignment has occurred yet
        aBooleanStatic = false;
        // Other code...

        // Cannot be optimized: the value is not the default
        aBooleanStatic = true;
        // Other code...

        // Cannot be optimized: it follows a non-default assignment
        aBooleanStatic = false;
    }

    private void boo() {
        // Cannot be optimized: the method is neither <clinit> nor <init>
        aBoolean = false;
    }
}

Specifically, our optimization idea is as follows:

Traverse all methods of the class and find the <clinit> and <init> methods;
Traverse the bytecode instructions of these two methods from top to bottom and find every putfield (or putstatic) instruction. Join the instruction's target ClassName and FieldName with "-" to build a unique key, then:
- If the target class of the putfield is not the current class, skip it;
- If the load instruction before the putfield is not one of iconst_0, fconst_0, dconst_0, lconst_0 or aconst_null, add the key to the set of traversed keys;
- If the load instruction before the putfield is one of iconst_0, fconst_0, dconst_0, lconst_0 or aconst_null, and the key is not yet in the set of traversed keys, mark the assignment as removable bytecode;
When the traversal is complete, delete all bytecode instructions marked as removable.
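The traversal above can be sketched in plain Java. This is only an illustration under simplifying assumptions, not the ByteX implementation: the Insn type, the findRemovable method and the example helper are hypothetical, instructions are modeled as opcode names rather than real class file instructions, and only the index of the store instruction is reported (a real tool would also delete the loads that feed it).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RedundantAssignmentFinder {
    /** Hypothetical simplified instruction: an opcode plus, for field stores, the field's owner and name. */
    static final class Insn {
        final String opcode;
        final String owner;
        final String field;

        Insn(String opcode, String owner, String field) {
            this.opcode = opcode;
            this.owner = owner;
            this.field = field;
        }

        static Insn of(String opcode) {
            return new Insn(opcode, null, null);
        }
    }

    /** Instructions that push a default value (0, 0.0f, 0.0d, 0L or null). */
    private static final Set<String> DEFAULT_LOADS = new HashSet<>(
            Arrays.asList("iconst_0", "fconst_0", "dconst_0", "lconst_0", "aconst_null"));

    /** Returns the indexes of field stores in a <clinit>/<init> body that only re-assign the default value. */
    static List<Integer> findRemovable(String currentClass, List<Insn> code) {
        Set<String> nonDefaultKeys = new HashSet<>(); // fields that already received a non-default value
        List<Integer> removable = new ArrayList<>();
        for (int i = 1; i < code.size(); i++) {
            Insn insn = code.get(i);
            boolean isStore = insn.opcode.equals("putfield") || insn.opcode.equals("putstatic");
            if (!isStore || !currentClass.equals(insn.owner)) {
                continue; // not a field store, or the field belongs to another class
            }
            String key = insn.owner + "-" + insn.field;
            if (!DEFAULT_LOADS.contains(code.get(i - 1).opcode)) {
                nonDefaultKeys.add(key);   // non-default value: later stores must be kept
            } else if (!nonDefaultKeys.contains(key)) {
                removable.add(i);          // default value stored before any non-default one
            }
        }
        return removable;
    }

    /** A constructor body that stores false, then true, then false into MyClass.aBoolean. */
    static List<Insn> example() {
        return Arrays.asList(
                Insn.of("aload_0"), Insn.of("invokespecial"),
                Insn.of("aload_0"), Insn.of("iconst_0"), new Insn("putfield", "MyClass", "aBoolean"),
                Insn.of("aload_0"), Insn.of("iconst_1"), new Insn("putfield", "MyClass", "aBoolean"),
                Insn.of("aload_0"), Insn.of("iconst_0"), new Insn("putfield", "MyClass", "aBoolean"),
                Insn.of("return"));
    }
}
```

For the example body, only the first store (index 4) is reported: the second stores a non-default value, and the third follows a non-default assignment.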

Let’s use a simple example to illustrate our thinking:

public com.bytedance.android.dexoptimizer.MyClass();    // 1. This is an <init> method, so enter the optimization logic
    Code:                                                // 2. Traverse the instructions from top to bottom
         0: aload_0
         1: invokespecial #1    // Method java/lang/Object."<init>":()V
         4: aload_0
         5: iconst_0
         6: putfield #Field MyClass.aBoolean:Z    // 3. Default value, and MyClass-aBoolean is not in the key set: mark as removable
         7: aload_0
         8: iconst_1
         9: putfield #Field MyClass.aBoolean:Z    // 4. Non-default value: add MyClass-aBoolean to the set of traversed keys
        10: aload_0
        11: iconst_0
        12: putfield #Field MyClass.aBoolean:Z    // 5. MyClass-aBoolean is already in the key set: cannot be removed, continue
        15: return

In the end, only the first default assignment, at offsets 4 to 6 (marked red in the original), is found to be removable; deleting those instructions completes the optimization.

With ByteX, the bytecode processing framework that Douyin open-sourced earlier, it is easy to obtain the class a field belongs to and to traverse all methods of a class and their bytecode. This optimization is open source as well; interested readers can look at the detailed code:

Github.com/bytedance/B…

Remove side-effect-free code

Removing redundant assignments relies on the virtual machine assigning default values to fields. Our code also contains statements that serve no purpose in the production package, the most common being log printing. Besides taking up package size, logging can cause performance problems and security risks, so we tend to remove it. Let's use the Log.i call as an example of how to remove useless method calls from code, such as the log statement below:


public static void click() {
    clickSelf();
    Log.i("Logger", "click time:" + System.currentTimeMillis());
}

We started with ProGuard's -assumenosideeffects option, which requires us to assume that the method calls to be removed have no side effects; from a program analysis point of view, such a method does not modify objects on the heap or method parameters on the stack. With the following configuration, ProGuard removes log-related method calls during its optimize phase.

-assumenosideeffects class android.util.Log {
    public static boolean isLoggable(java.lang.String, int);
    public static int v(...);
    public static int i(...);
    public static int w(...);
    public static int d(...);
    public static int e(...);
}

This removal is incomplete, however: it only removes the method call instruction itself. In the code above, for example, removing the Log.i call leaves behind the creation of a StringBuilder object:

public static void click() {
    clickSelf();
    new StringBuilder("click time:").append(System.currentTimeMillis());
}

Creating this object is useless too, but simple static instruction analysis cannot prove that it is useless, so ProGuard does not remove it.

Since -assumenosideeffects does not remove everything, we implemented a more thorough optimization ourselves.

Optimization idea

public static void click();
    Code:
         0: invokestatic  #6                  // Method clickSelf:()V
         3: ldc           #7                  // String Logger
         5: new           #8                  // class java/lang/StringBuilder
         8: dup
         9: invokespecial #9                  // Method java/lang/StringBuilder."<init>":()V
        12: ldc           #10                 // String click time:
        14: invokevirtual #11                 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        17: invokestatic  #12                 // Method java/lang/System.currentTimeMillis:()J
        20: invokevirtual #13                 // Method java/lang/StringBuilder.append:(J)Ljava/lang/StringBuilder;
        23: invokevirtual #14                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        26: invokestatic  #2                  // Method android/util/Log.i:(Ljava/lang/String;Ljava/lang/String;)I
        29: pop

The statement Log.i("Logger", "click time:" + System.currentTimeMillis()); compiles into multiple instructions (from ldc to pop). Besides the invokestatic instruction that calls the target method Log.i, there are many instructions that create the arguments and push them onto the stack.

To delete the method call, we mainly need to find the start and end instructions generated for this line of code; everything between them is what we need to delete.

1. Locate the termination instruction

Finding the termination instruction is relatively simple: find the call instruction of the target method to delete, then decide from the method's return type whether the following pop or pop2 instruction must be included.

In the code above, for example, walking through the instructions finds the target invokestatic #2. Because the return type of Log.i is int, the termination instruction is the pop that follows it.

Note that the pop instruction removes the int value from the stack and discards it, which means the return value of the method is not used. Only in this case can we safely delete the target method call; otherwise we cannot. If the method returns void, there is of course no pop instruction.
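The choice between pop and pop2 follows from the return type in the method descriptor. A minimal sketch, assuming a hypothetical helper named expectedPop: long (J) and double (D) values occupy two stack slots and are discarded with pop2, every other non-void value is discarded with pop, and a void method leaves nothing to discard.

```java
class TerminationInsn {
    /**
     * Returns the opcode expected right after a call whose result is unused,
     * or null when the method returns void. Hypothetical helper for illustration.
     */
    static String expectedPop(String methodDescriptor) {
        // The return type is the part of the descriptor after the closing parenthesis.
        char returnType = methodDescriptor.charAt(methodDescriptor.indexOf(')') + 1);
        switch (returnType) {
            case 'V':
                return null;    // void: nothing is pushed, so nothing is popped
            case 'J':
            case 'D':
                return "pop2";  // long and double occupy two stack slots
            default:
                return "pop";   // int, float, references, and other one-slot values
        }
    }
}
```

For Log.i, whose descriptor is (Ljava/lang/String;Ljava/lang/String;)I, this yields pop. If the instruction after the call is anything else, the return value is being used and the call must be kept.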

2. Locate the start instruction

To find the start instruction, we need some basic knowledge of Java bytecode design: Java bytecode is stack-based, each instruction pops and pushes a fixed number of operands on the operand stack, and the operand stack is in the same state before and after a complete, independent statement or code block executes.

Therefore, after finding the termination instruction, we traverse the instructions in reverse order and apply the inverse of each instruction's stack effect. When the size of our simulated stack drops to 0, we have found the start instruction. Note that the type of each value must be recorded when it is pushed, and checked for a match when it is popped. For the example above:

pop removes a single-slot value (such as an int or float) from the stack, so in reverse we push one single-slot value onto our simulated stack;
invokestatic depends on the parameters and return value of the method. Its normal effect is to pop the method's arguments and then push the return value, here an int. In reverse, we pop one value from our stack and check that it matches the int return type, then push the two String parameters of Log.i;
invokevirtual normally pops the method's arguments and the this object, then pushes the return value, here a String. In reverse, we pop the value on top of our stack, find that it matches String, and push the StringBuilder type corresponding to this. Because toString takes no parameters, nothing more needs to be pushed;
The instructions in between are handled similarly, until we reach the ldc instruction, which pushes an int, float, or String constant. In reverse, we pop one value and find that it matches String; the stack size drops to zero, so this is the start instruction.
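The backward walk described above can be sketched as follows. This is a simplified illustration, not the plugin's code: the Insn type and the findStart and clickExample names are hypothetical, and the sketch only counts operand stack slots, whereas the approach described here also records and checks value types.

```java
import java.util.Arrays;
import java.util.List;

class StartInsnFinder {
    /** Hypothetical simplified instruction: a name plus the operand stack slots it pops and pushes. */
    static final class Insn {
        final String name;
        final int pops;
        final int pushes;

        Insn(String name, int pops, int pushes) {
            this.name = name;
            this.pops = pops;
            this.pushes = pushes;
        }
    }

    /** Walks backward from the termination instruction; returns the start index, or -1 if none is found. */
    static int findStart(List<Insn> code, int endIndex) {
        int needed = 0; // slots that earlier instructions still have to supply
        for (int i = endIndex; i >= 0; i--) {
            Insn insn = code.get(i);
            needed -= insn.pushes;
            if (needed < 0) {
                return -1; // this instruction produces a value that the statement does not consume
            }
            needed += insn.pops;
            if (needed == 0) {
                return i;  // the stack is balanced again: the statement starts here
            }
        }
        return -1;
    }

    /** The instructions of click() from the bytecode listing, with their slot effects. */
    static List<Insn> clickExample() {
        return Arrays.asList(
                new Insn("invokestatic clickSelf", 0, 0),
                new Insn("ldc Logger", 0, 1),
                new Insn("new StringBuilder", 0, 1),
                new Insn("dup", 1, 2),
                new Insn("invokespecial <init>", 1, 0),
                new Insn("ldc click time:", 0, 1),
                new Insn("invokevirtual append(String)", 2, 1),
                new Insn("invokestatic currentTimeMillis", 0, 2), // a long takes two slots
                new Insn("invokevirtual append(long)", 3, 1),
                new Insn("invokevirtual toString", 1, 1),
                new Insn("invokestatic Log.i", 2, 1),
                new Insn("pop", 1, 0));
    }
}
```

Starting from the pop at index 11, the walk reaches zero at index 1, the ldc that pushes "Logger", so indexes 1 to 11 form the statement to delete and the clickSelf call at index 0 is left alone.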

Drawbacks of this approach

However, the approach above has two drawbacks:

Because the analysis is confined to a single method, when Log calls are wrapped in another method, the wrapper itself must be configured as a target method for the removal to be complete. For example, AccountLog.d below must be configured in order to remove the StringBuilder created at its call sites.

object AccountLog {
    @JvmStatic
    fun d(tag: String, msg: String) = Log.d(tag, msg)
}

Useful instructions may be removed by mistake. We cannot assume that every instruction that builds the two arguments of Log.i is useless; we can only be sure that creating the StringBuilder is useless. Other method calls among them may change the state of some objects, so there is some risk.

The ProGuard approach

After the approach above had been running in production for a year, we tried to address these shortcomings and found that ProGuard also provides the -assumenoexternalsideeffects option, which lets us specify methods that have no external side effects.

A method specified this way may modify only the instance it is called on, not any other object. The following configuration removes useless StringBuilder construction.

-assumenoexternalsideeffects class java.lang.StringBuilder {
    public java.lang.StringBuilder();
    public java.lang.StringBuilder(int);
    public java.lang.StringBuilder(java.lang.String);
    public java.lang.StringBuilder append(java.lang.Object);
    public java.lang.StringBuilder append(java.lang.String);
    public java.lang.StringBuilder append(java.lang.StringBuffer);
    public java.lang.StringBuilder append(char[]);
    public java.lang.StringBuilder append(char[], int, int);
    public java.lang.StringBuilder append(boolean);
    public java.lang.StringBuilder append(char);
    public java.lang.StringBuilder append(int);
    public java.lang.StringBuilder append(long);
    public java.lang.StringBuilder append(float);
    public java.lang.StringBuilder append(double);
    public java.lang.String toString();
}
-assumenoexternalreturnvalues public final class java.lang.StringBuilder {
    public java.lang.StringBuilder append(java.lang.Object);
    public java.lang.StringBuilder append(java.lang.String);
    public java.lang.StringBuilder append(java.lang.StringBuffer);
    public java.lang.StringBuilder append(char[]);
    public java.lang.StringBuilder append(char[], int, int);
    public java.lang.StringBuilder append(boolean);
    public java.lang.StringBuilder append(char);
    public java.lang.StringBuilder append(int);
    public java.lang.StringBuilder append(long);
    public java.lang.StringBuilder append(float);
    public java.lang.StringBuilder append(double);
}

However, this configuration only works when only strings are passed to Log. For a call such as int Log.w(String tag, Throwable tr), the construction of the Throwable argument cannot be removed. In such cases we still rely on our own plugin for a clean removal.

This optimization reduced Douyin's package size by about 520 KB.

Short method inlining

The two optimizations above remove useless instructions. As mentioned at the beginning, reducing the number of defined methods or fields in order to reduce the number of DEX files is another common approach. Short method inlining both simplifies instructions and reduces the number of defined methods.

When comparing against competing overseas products, we found that the number of methods defined in a single Douyin DEX file was far higher than theirs. Further analysis showed that Douyin's DEX files contained a large number of access$ and getter/setter methods, while the competing products had almost none. We therefore decided to inline short methods to reduce the number of defined methods.

Before introducing the solution, let's review the basics of inlining. As the most common code optimization, inlining is known as the mother of optimizations. Some languages, such as C++ and Kotlin, provide an inline keyword that lets programmers request inlining, whereas Java gives programmers no way to control or suggest inlining, and javac does not inline methods during compilation. To make this concrete, here is a simple example; in this code, callMethod calls print:

public class InlineTest {
    public static void callMethod(int a) {
        int result = a + 5;
        print(result);
    }

    public static void print(int result) {
        System.out.println(result);
    }
}

After inlining, the body of print is expanded directly into callMethod. At the bytecode level, the change is as follows:

Before inlining:

public static void callMethod(int);
    Code:
       0: iload_0
       1: iconst_5
       2: iadd
       3: istore_1
       4: iload_1
       5: invokestatic  #2                  // Method print:(I)V
       8: return

After inlining:

public static void callMethod(int);
    Code:
       0: iload_0
       1: iconst_5
       2: iadd
       3: dup
       4: istore_0
       5: istore_0
       6: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
       9: iload_0
      10: invokevirtual #6                  // Method java/io/PrintStream.println:(I)V
      13: return

In terms of execution time, there is one less function call, which improves performance. From a space footprint perspective, there is one less function declaration and therefore less code volume.

Are all methods suitable for inlining?

Obviously not. Inlining a method that is called only once yields both time and space benefits. For methods called many times, the length of the method body must be considered: the expanded instructions of the print method above, for example, are much longer than the invokestatic instruction they replace. Methods such as access$ bridges and getters/setters, however, are short and well suited to inlining.

Inlining access methods

public class Foo {
    private int mValue;

    private void doStuff(int value) {
        System.out.println("Value is " + value);
    }

    private class Inner {
        void stuff() {
            Foo.this.doStuff(Foo.this.mValue);
        }
    }
}

As shown above, Java allows Foo$Inner to access the private members of the outer class Foo directly. The JVM, however, has no notion of inner and outer classes and considers it illegal for one class to access another class's private members. To implement this syntactic sugar, the compiler generates the following static methods at compile time:

static int Foo.access$100(Foo foo) {
    return foo.mValue;
}
 static void Foo.access$200(Foo foo, int value) {
    foo.doStuff(value);
}

An inner class object holds a reference to the outer class object, and whenever the inner class needs to access mValue or call doStuff() on the outer class, it calls these static methods. They are generated because the accessed members are private, but private access control is mainly a source-level constraint that protects the program's design. As long as bytecode-level semantics are preserved, we can simply change these private members to public and remove the bridge methods generated by the compiler.

Optimization idea

Specific optimization is divided into the following steps:

Collect access methods in bytecode.

static int access$000(com.bytedance.android.demo.inline.Foo);
    descriptor: (Lcom/bytedance/android/demo/inline/Foo;)I
    flags: ACC_STATIC, ACC_SYNTHETIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: getfield      #2                  // Field mValue:I
         4: ireturn

static void access$100(com.bytedance.android.demo.inline.Foo, int);
    descriptor: (Lcom/bytedance/android/demo/inline/Foo;I)V
    flags: ACC_STATIC, ACC_SYNTHETIC
    Code:
      stack=2, locals=2, args_size=2
         0: aload_0
         1: iload_1
         2: invokespecial #1                  // Method doStuff:(I)V
         5: return

As the bytecode above shows, these methods have clear signatures of compiler generation: they carry the synthetic flag, they are static, and their names begin with "access$". These characteristics make them easy to match in ClassVisitor.visitMethod.
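The three characteristics translate into a simple predicate. A minimal sketch, assuming a hypothetical helper named isAccessBridge; the two flag values are the ACC_STATIC and ACC_SYNTHETIC constants of the class file format, redefined here so that the example needs no bytecode library.

```java
class AccessMethodMatcher {
    // Access flag values as defined by the JVM class file format.
    static final int ACC_STATIC = 0x0008;
    static final int ACC_SYNTHETIC = 0x1000;

    /** Returns true for a static, synthetic method whose name starts with "access$". */
    static boolean isAccessBridge(int accessFlags, String methodName) {
        return (accessFlags & ACC_STATIC) != 0
                && (accessFlags & ACC_SYNTHETIC) != 0
                && methodName.startsWith("access$");
    }
}
```

With ASM, the two inputs correspond to the access and name arguments that ClassVisitor.visitMethod receives for each method.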

Analyze and record the target instruction that will replace each access method call.

An access bridge accesses only one of two things: a field or a method. The corresponding instructions are method invocation instructions (invokevirtual, invokespecial, etc.) and field access instructions (getfield, putfield, etc.), so we only need to traverse the bridge method to find the relevant instruction. We also resolve the field or method that the instruction accesses and change that private member to public. In the access$000 method, for example, we find the following instruction, which accesses the mValue field of class Foo.

getfield      #2                  // Field mValue:I

Replace the invokestatic at each access method call site with the corresponding target instruction, and delete the access method definition.

Traverse all call sites of the access method, such as the invokestatic instruction below, replace the call with the getfield instruction collected from the access method in the previous step, and then delete the Foo.access$000 method itself.

invokestatic  #3                  // Method com/bytedance/android/demo/inline/Foo.access$000:(Lcom/bytedance/android/demo/inline/Foo;)I

Inlining getters and setters

Encapsulation is one of the basic features of object-oriented programming (OOP). The use of getter and setter methods is one of the common encapsulation methods in programming. In daily development, we often write getter-setter methods for classes like this:

public class People {
    private int age;

    public int getAge() {
        return this.age;
    }

    public void setAge(int age) {
        this.age = age;
    }
}

These methods are exactly the best case for short method inlining.

Optimization idea

Getter/setter inlining is implemented much like access method inlining and likewise has three steps: collection, analysis, and deletion.

public int getAge();
    descriptor: ()I
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: getfield      #2                  // Field age:I
         4: ireturn

public void setAge(int);
    descriptor: (I)V
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=2, args_size=2
         0: aload_0
         1: iload_1
         2: putfield      #2                  // Field age:I
         5: return

Collect the getter and setter methods to be inlined. Referring to the bytecode above, look for methods that consist only of argument loading (the load family of instructions), a field access (getfield or putfield), and a return instruction. It is important to exclude methods kept by ProGuard rules: deleting them is risky because they may be called from plugins or through reflection;
Record the field access instruction and target field of each such method. If the field is not public, change it to public;
At each call site, replace the getter or setter call with the corresponding field access instruction, and remove the method definition.
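The collection step amounts to matching a method body against two short instruction shapes. The sketch below is illustrative only: the GetterSetterMatcher class and its methods are hypothetical, instructions are represented by opcode names, and only instance getters and setters like getAge and setAge are covered.

```java
import java.util.Arrays;
import java.util.List;

class GetterSetterMatcher {
    /** Matches aload_0, getfield, xreturn (ireturn, lreturn, freturn, dreturn or areturn). */
    static boolean isSimpleGetter(List<String> opcodes) {
        return opcodes.size() == 3
                && opcodes.get(0).equals("aload_0")
                && opcodes.get(1).equals("getfield")
                && opcodes.get(2).endsWith("return")
                && !opcodes.get(2).equals("return"); // a getter must return a value
    }

    /** Matches aload_0, xload_1 (iload_1, lload_1, fload_1, dload_1 or aload_1), putfield, return. */
    static boolean isSimpleSetter(List<String> opcodes) {
        return opcodes.size() == 4
                && opcodes.get(0).equals("aload_0")
                && opcodes.get(1).endsWith("load_1")
                && opcodes.get(2).equals("putfield")
                && opcodes.get(3).equals("return");
    }

    public static void main(String[] args) {
        // The bodies of getAge and setAge from the bytecode listing.
        System.out.println(isSimpleGetter(Arrays.asList("aload_0", "getfield", "ireturn")));
        System.out.println(isSimpleSetter(Arrays.asList("aload_0", "iload_1", "putfield", "return")));
    }
}
```

A method that matches still has to pass the keep-rule check from the first step before it is inlined and deleted.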

Why not use ProGuard

Besides obfuscating and shrinking unused code, ProGuard also performs many code optimizations, including short method inlining and unique method inlining. Why didn't our app use them directly? Mainly because of the Robust hotfix framework: its auto-patch does not cope well with deep inlining levels or with builder methods, which causes patch generation to fail. Access methods and getter/setter methods, however, are very short and involve at most one level of inlining, so inlining them does not affect patch generation. ProGuard cannot be configured to inline only specific methods, so we implemented this ourselves.

Inlining these two kinds of short methods in Douyin reduced the number of defined methods by more than 70,000, removed one DEX file, and reduced package size by 1.7 MB.

Constant field elimination

Short method inlining expands a method body at its call sites. Some constants in our code can be treated similarly: we replace each use of a constant with its value, which removes the need for the field declaration. This is the simplest form of constant field elimination.

javac already performs this optimization for some final variables, as in the following code:

public class ConstJava {
    public static final int INTEGER = 1024;
    public static final String STRING = "this is long str";

    public static void constPropagation() {
        System.out.println("integer:" + INTEGER);
        System.out.println("string:" + STRING);
    }
}

After compilation, constPropagation looks as follows: the constants have been replaced by literals. The corresponding final fields are then unused, and ProGuard can shrink them.

public static void constPropagation() {
    System.out.println("integer:1024");
    System.out.println("string:this is long str");
}
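One way to observe this folding without reading bytecode is a small sketch like the one below (the ConstFolding class is hypothetical). It relies on a language guarantee: string-valued compile-time constant expressions are computed by the compiler and interned, so they are the same object as an equal string literal.

```java
class ConstFolding {
    static final int INTEGER = 1024;
    static final String STRING = "this is long str";

    /** Returns true when both concatenations were folded into literals at compile time. */
    static boolean foldedAtCompileTime() {
        // Reference comparison is intentional here: a folded constant expression
        // is interned and therefore identical to the equal literal.
        return ("integer:" + INTEGER) == "integer:1024"
                && ("string:" + STRING) == "string:this is long str";
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime());
    }
}
```

If the final modifier were removed from the fields, the concatenations would be evaluated at run time and the reference comparisons would no longer hold.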

Some Kotlin code, however, is compiled without such propagation, as shown below. (Adding the const modifier would enable the optimization.)

class ConstKotlin {
    companion object {
        val INTEGER = 1024
        val STRING = "this is long str"
    }

    private val b = 6

    fun constPropagation(){
        println("a:$INTEGER")
        println("s:$STRING")
    }

}

Compiled code:

private static final int INTEGER = 1024;
@NotNull
private static final String STRING = "this is long str";

public final void constPropagation() {
   String var1 = "a:" + INTEGER;
   System.out.println(var1);
   var1 = "s:" + STRING;
   System.out.println(var1);
}

We can therefore optimize such cases ourselves.

In addition, as mentioned above, once constant fields have been eliminated, ProGuard can delete the corresponding field declarations. In practice, however, projects often contain overly broad keep rules. The following rule, for example, causes constant field declarations to be retained; in such cases we can delete the fields ourselves.

-keep class com.bytedance.android.demo.ConstJava { *; }

Optimization idea

Collect static final variables and record their literal values, excluding some special fields, to determine which fields can be deleted. Two kinds of fields are excluded:

The serialVersionUID field, which represents the version of a serializable class;
Fields used via reflection. Reflection rarely accesses final variables, but we analyze reflective field accesses in the code and retain any field that is accessed this way.

For each getstatic instruction in the code, check the field it accesses. If it is one of the fields collected in the first step, replace the instruction with the constant-pushing instruction for its literal, and delete the field. Below is a getstatic access to STRING, whose literal value was collected in the first step:

getstatic     #48                 // Field STRING:Ljava/lang/String;

It is changed to an ldc instruction:

ldc           #25                 // String s:this is long str

Some readers may wonder: if a large string is propagated into multiple classes, doesn't that increase the package size?

It can. All classes in a DEX share one constant pool, so propagation costs nothing when the classes involved are in the same DEX file. When they are in different DEX files, each DEX stores its own copy of the string and the effect is negative.
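The trade-off can be made concrete with a rough model. This is only a sketch: it assumes each DEX stores every distinct string exactly once and measures cost as UTF-8 length, and DexStringCost and totalBytes are hypothetical names, not part of ByteX.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Rough model (an assumption): each DEX keeps one copy of every distinct string. */
final class DexStringCost {
    /** Sums the UTF-8 length of the distinct strings referenced in each DEX. */
    static int totalBytes(List<List<String>> stringsPerDex) {
        int total = 0;
        for (List<String> dex : stringsPerDex) {
            Set<String> pool = new HashSet<>(dex); // duplicates inside one DEX are shared
            for (String s : pool) {
                total += s.getBytes(StandardCharsets.UTF_8).length;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String big = "this is long str";
        // Two classes in the same DEX use the string: still a single copy.
        System.out.println(totalBytes(List.of(List.of(big, big))));
        // The two classes live in different DEX files: each DEX carries a copy.
        System.out.println(totalBytes(List.of(List.of(big), List.of(big))));
    }
}
```

The second call reports twice the bytes of the first, which is the negative case described above.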

Constant field elimination optimization brings a total package gain of about 400KB.
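The two steps above can be sketched on a toy instruction model. This is an illustration only: ConstFieldEliminator, Insn and the string opcodes are hypothetical stand-ins for a real bytecode library, and it is not the ByteX implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of constant field elimination (hypothetical, not the ByteX code). */
final class ConstFieldEliminator {
    /** One instruction: an opcode plus a single operand (field key or literal). */
    static final class Insn {
        final String opcode;
        final Object operand;
        Insn(String opcode, Object operand) { this.opcode = opcode; this.operand = operand; }
        @Override public String toString() { return opcode + " " + operand; }
    }

    private final Map<String, Object> literals = new HashMap<>(); // "Owner.field" -> literal
    private final Set<String> removable = new HashSet<>();

    /** Step 1: record a static final field unless it falls in an excluded category. */
    void collect(String owner, String name, Object literal, Set<String> reflectedFields) {
        String key = owner + "." + name;
        if (name.equals("serialVersionUID") || reflectedFields.contains(key)) {
            return; // keep the declaration and leave its accesses untouched
        }
        literals.put(key, literal);
        removable.add(key);
    }

    /** Step 2: replace getstatic on a collected field with a constant push. */
    List<Insn> rewrite(List<Insn> code) {
        List<Insn> out = new ArrayList<>();
        for (Insn insn : code) {
            if (insn.opcode.equals("getstatic") && literals.containsKey(insn.operand)) {
                out.add(new Insn("ldc", literals.get(insn.operand)));
            } else {
                out.add(insn);
            }
        }
        return out;
    }

    /** Fields whose declarations can be deleted after the rewrite. */
    Set<String> removableFields() { return removable; }
}
```

In this model every literal is pushed with ldc for brevity. A real transform would pick the shortest suitable instruction for small integers, such as iconst, bipush or sipush.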

R.class constant inlining

Constant field elimination targets regular final static fields, but our code contains another kind of constant that can also be inlined.

In Android development we constantly use the R class, which is the most common way to reference resources. In practice, however, the way R files are generated is wasteful in several respects, and this has a significant impact on both performance and package size. To understand the problem, we first need to revisit what an R file is.

In our normal code development, we often write the following common code:

public class MainActivity extends AppCompatActivity {
    @Override
    protected void onCreate(@Nullable Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // Use the ID in R to get the layout resource for MainActivity
        setContentView(R.layout.activity_main);
    }
}

In this example, we use R.layout.activity_main to get the layout resource for MainActivity. What does this become in bytecode? There are two cases:

When MainActivity is under the Application Module, its bytecode is:

protected void onCreate(android.os.Bundle);
  Code:
     0: aload_0
     1: aload_1
     2: invokespecial #2   // Method android/support/v7/app/AppCompatActivity.onCreate:(Landroid/os/Bundle;)V
     5: aload_0
     6: ldc #4             // int 2131296285
     8: invokevirtual #5   // Method setContentView:(I)V
    11: return

You can see that R.layout.activity_main has been replaced directly with a constant.
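As a side note, the constant pushed by ldc in the bytecode above follows the usual 0xPPTTEEEE layout of Android resource IDs (package, type, entry). The sketch below decodes it. ResourceId and its methods are hypothetical helper names, and which resource type a given type number maps to depends on the build, so no type name is assumed here.

```java
/** Splits an Android resource ID of the form 0xPPTTEEEE into its parts. */
final class ResourceId {
    static int packageId(int id) { return (id >>> 24) & 0xFF; }
    static int typeId(int id)    { return (id >>> 16) & 0xFF; }
    static int entryId(int id)   { return id & 0xFFFF; }

    /** Formats the ID the way it usually appears in R.java. */
    static String toHex(int id) { return String.format("0x%08x", id); }

    public static void main(String[] args) {
        int id = 2131296285; // the value from the application-module bytecode
        System.out.println(toHex(id));
        System.out.printf("package=0x%02x type=0x%02x entry=0x%04x%n",
                packageId(id), typeId(id), entryId(id));
    }
}
```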

However, when MainActivity is in the Library Module, its bytecode is:

protected void onCreate(android.os.Bundle);
  Code:
     0: aload_0
     1: aload_1
     2: invokespecial #2   // Method android/support/v7/app/AppCompatActivity.onCreate:(Landroid/os/Bundle;)V
     5: aload_0
     6: getstatic #3       // Field com/bytedance/android/R$layout.activity_main:I
     9: invokevirtual #4   // Method setContentView:(I)V
    12: return

You can see that the ldc instruction that pushed a constant has been replaced by a getstatic instruction that reads the activity_main field of R$layout.

Why the difference

Library modules are delivered to application modules in aar form. So that javac can compile a library module when it is packaged, AGP by default gives the library module a temporary R.java file (which does not end up in the library module's package) and declares the fields in R as public static, without final, so that javac does not inline them. The R fields are therefore not compile-time constants, and references to them escape javac inlining and survive as field accesses until the application module is compiled.
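The effect of the missing final modifier can be observed from plain Java. The sketch below relies on the fact that javac folds string concatenations of compile-time constants into a single interned literal. AppR and LibR are hypothetical stand-ins for the generated R classes, flattened to one level for brevity.

```java
/** Contrasts final (application-style) and non-final (library-style) R fields. */
final class RInlineDemo {
    /** Application-module style: final field with a constant initializer. */
    static final class AppR {
        public static final int activity_main = 2131296285;
    }

    /** Library-module style: same value, but the field is not final. */
    static final class LibR {
        public static int activity_main = 2131296285;
    }

    static boolean appReferenceIsFolded() {
        String s = "id:" + AppR.activity_main; // constant expression, folded by javac
        return s == "id:2131296285";           // same interned literal
    }

    static boolean libReferenceIsFolded() {
        String s = "id:" + LibR.activity_main; // getstatic plus concatenation at run time
        return s == "id:2131296285";           // a fresh String, so the references differ
    }

    public static void main(String[] args) {
        System.out.println("AppR folded: " + appReferenceIsFolded());
        System.out.println("LibR folded: " + libReferenceIsFolded());
    }
}
```

The reference comparison with == is deliberate here: it is true only when the compiler has already replaced the field access with its value.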

Why isn't the Library Module inlined

In Android, every resource ID must be unique, so packaging has to guarantee that no two resources share an ID. If resource IDs were fixed while a library module is compiled, they could easily collide with IDs from other library modules. AGP therefore compiles every use of a resource ID in a library module as a field access and records the resources the module uses in R.txt. When the application module is compiled, the R.txt files of all library modules are collected, combined with the application module's own R file, and passed to aapt. With this global input, aapt assigns each resource a unique ID in sequence, which avoids the conflict. By this point, however, the library modules have already been compiled, so the only option is to generate R.java files from which the library code looks up resource IDs at run time.

Why doesn't ProGuard optimize it

When using ProGuard, Google officially recommends including some default keep rules. A newly created application is configured with them by default:

buildTypes {
    release {
        proguardFiles getDefaultProguardFile('proguard-android-optimize.txt')
    }
}

The official keep rules (android.googlesource.com/platform/sd…) cover the fields of the R class, so the following rule is added:

-keepclassmembers class **.R$* {
    public static <fields>;
}

This keep rule preserves all public static fields of R and its inner classes, so they are not optimized. R.class therefore still exists in the final APK, which inflates the package.

The growth is not caused only by the definition and assignment of the R fields. In Android, a single DEX can hold at most 65536 fields. Once that limit is exceeded, the DEX has to be split in two. More DEX files mean less data can be shared within each DEX, which inflates the package further. Optimizing R therefore also brings large benefits at the DEX level.

The solution

Once the root cause is understood, the solution is simple. Since the value of each field in R.class never changes after it has been determined, every access to a resource ID through R can be inlined and the corresponding field deleted.

The optimization idea is as follows:

1. Traverse all methods and locate every getstatic instruction.
2. If the target class name of the getstatic instruction has the form **.R or **.R$*:

A. If the target field is a public static int, replace the getstatic instruction with an ldc instruction that pushes the field's actual value.

B. If the target field is a public static int[], replace the getstatic instruction with a newarray instruction and copy in the array initialization of that field from <clinit>.
3. After the traversal, check whether all fields of R.class have been deleted. If so, remove R.class as well.

We use the above case to illustrate as follows:

protected void onCreate(android.os.Bundle);
  Code:
     0: aload_0
     1: aload_1
     2: invokespecial #2   // Method android/support/v7/app/AppCompatActivity.onCreate:(Landroid/os/Bundle;)V
     5: aload_0
  -  6: getstatic #3       // Field com/bytedance/android/R$layout.activity_main:I
  +  6: ldc #4             // int 2131296285
     8: invokevirtual #5   // Method setContentView:(I)V
    11: return
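The rewrite described above can be sketched over textual instructions. This is a simplified illustration, not the ByteX implementation: RInliner is a hypothetical name, the field values are supplied directly instead of being read from the <clinit> of R, every constant is pushed with ldc for brevity, and the final removal of R.class is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of R-field inlining over textual instructions (hypothetical). */
final class RInliner {
    /** Known R fields: "owner.name" -> Integer or int[]. */
    private final Map<String, Object> values = new LinkedHashMap<>();

    void addField(String ownerAndName, Object value) {
        values.put(ownerAndName, value);
    }

    /** Rewrites "getstatic owner.name" lines; everything else is copied unchanged. */
    List<String> inline(List<String> code) {
        List<String> out = new ArrayList<>();
        for (String insn : code) {
            String target = insn.startsWith("getstatic ") ? insn.substring(10) : null;
            Object value = target == null ? null : values.get(target);
            if (value instanceof Integer) {
                out.add("ldc " + value);            // case A: plain int ID
            } else if (value instanceof int[]) {
                int[] ids = (int[]) value;          // case B: rebuild the array in place
                out.add("ldc " + ids.length);
                out.add("newarray int");
                for (int i = 0; i < ids.length; i++) {
                    out.add("dup");
                    out.add("ldc " + i);
                    out.add("ldc " + ids[i]);
                    out.add("iastore");
                }
            } else {
                out.add(insn);                      // not a known R field: leave untouched
            }
        }
        return out;
    }
}
```

For the onCreate example, registering com/bytedance/android/R$layout.activity_main with the value 2131296285 turns the getstatic line into ldc 2131296285 and leaves the surrounding instructions as they are.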

In practice, not every ID can be inlined. If the code looks up a resource by name at run time by reflecting on R.class, inlining and deleting that field would cause an exception because the ID can no longer be found. To prevent this, the scheme adds a whitelist: fields on the whitelist are not inlined. Step 2 of the scheme accordingly becomes:

If the target class name of the getstatic instruction has the form **.R or **.R$*:

A. If the target field of the getstatic instruction is on the whitelist, skip it.

B. If the target field is a public static int, replace the getstatic instruction with an ldc instruction that pushes the field's actual value.

C. If the target field is a public static int[], replace the getstatic instruction with a newarray instruction and copy in the array initialization of that field from <clinit>.
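The class-name check and the whitelist test in the revised step can be sketched as a small predicate. RInlinePolicy and the "className.field" whitelist key format are assumptions made for illustration. How ByteX actually configures its whitelist is not shown in this article.

```java
import java.util.Set;

/** Decides whether a field access is a candidate for R inlining (hypothetical). */
final class RInlinePolicy {
    /** Matches dot-separated class names of the form **.R or **.R$*. */
    static boolean isRClass(String className) {
        String simple = className.substring(className.lastIndexOf('.') + 1);
        return simple.equals("R") || simple.startsWith("R$");
    }

    /** A field is inlined only if it lives in an R class and is not whitelisted. */
    static boolean shouldInline(String className, String field, Set<String> whitelist) {
        return isRClass(className) && !whitelist.contains(className + "." + field);
    }
}
```

Checking the whitelist before any rewriting keeps reflectively accessed fields and their getstatic instructions exactly as they were.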

After this optimization, the package size is reduced by 30.5M. Douyin gains this much because its R classes are huge and contain a very large number of fields. Since a single DEX can define at most 65536 fields, leaving them in place would sharply increase the number of DEX files and therefore the total DEX size.

summary

The optimizations introduced here greatly reduce the DEX size, which supports Douyin's user growth and also shortens the time the VM spends loading DEX files at startup. They are, however, only the tip of the iceberg of what can be done at the bytecode level. All of the solutions described in this article are implemented in ByteX, the tool we open-sourced earlier:

Github.com/bytedance/B…

Of course, there are many more DEX-related optimizations. For example, we also optimized the code that Kotlin generates, which pays off considerably now that Kotlin is so widely used. Douyin has also implemented a number of advanced solutions that optimize the format and content of DEX itself. For reasons of space they are not covered here.

Later articles in this series will continue to explain in depth Douyin's technical exploration of package size from the angles of DEX, resources, SO libraries and business governance. Stay tuned.

Join us

The Douyin Android basic technology team pursues depth and excellence. We focus on performance, architecture, package size, stability, base libraries and build tooling, safeguarding the development efficiency of a very large team and the experience of hundreds of millions of users. We are currently hiring in Beijing, Shanghai, Hangzhou and Shenzhen, and we welcome like-minded people to join us in building a global app with 100 million users!

You can follow the links below to ByteDance's official recruitment website and search for positions related to "Douyin Basic Technology Android":

Beijing, Shanghai: job.toutiao.com/referral/mo…

Hangzhou: job.toutiao.com/s/8V74RjJ

You can also contact us by email: [email protected] or [email protected] for more information or send your resume to us directly!
