Background

In recent years, the mobile business has grown rapidly. To meet user demands and keep pace with fast-growing business functionality, mobile client architecture has gradually evolved from a single monolithic application into a modular, componentized structure. Amap is one example: its Android codebase has reached roughly one million lines, and more than 100 modules take part in the final build.

Without a standard tool for detecting and monitoring dependencies, it does not take long for the relationships between modules to become a mess.

From a module owner’s perspective, why is dependency analysis important?

  • As a module owner, the first thing I want to know is “Who depends on me, and which interfaces do they depend on?” Only then can I assess the impact of a change to the module and judge whether the exposed interfaces are reasonable.

  • I also want to know “Whom do I depend on, and which external interfaces do I call?” so that I understand the external capabilities the module requires.

From a global perspective, a healthy dependency structure prevents “lower-level modules” from depending directly on “upper-level modules” and contains no cyclic dependencies. Analyzing global dependencies makes it possible to locate unreasonable dependencies quickly and expose problems early.

Dependency analysis is therefore a very important part of the R&D process.
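
The cyclic-dependency check mentioned above can be sketched as a depth-first search over the module graph. This is only an illustration under stated assumptions, not the tool described in this article: the class name CycleFinder, the graph representation (module name mapped to its direct dependencies), and the module names in main are all made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical checker: returns one dependency cycle from a module graph, or null.
final class CycleFinder {

    // graph maps each module name to the modules it depends on directly.
    static List<String> findCycle(Map<String, List<String>> graph) {
        Set<String> done = new HashSet<>();     // modules whose dependencies are fully explored
        List<String> path = new ArrayList<>();  // modules on the current depth-first path
        for (String start : graph.keySet()) {
            List<String> cycle = visit(start, graph, done, path);
            if (cycle != null) {
                return cycle;
            }
        }
        return null;
    }

    private static List<String> visit(String module, Map<String, List<String>> graph,
                                      Set<String> done, List<String> path) {
        if (done.contains(module)) {
            return null;
        }
        int index = path.indexOf(module);
        if (index >= 0) {
            // The module is already on the current path, so the path tail forms a cycle.
            List<String> cycle = new ArrayList<>(path.subList(index, path.size()));
            cycle.add(module);
            return cycle;
        }
        path.add(module);
        List<String> dependencies = graph.getOrDefault(module, Collections.<String>emptyList());
        for (String dependency : dependencies) {
            List<String> cycle = visit(dependency, graph, done, path);
            if (cycle != null) {
                return cycle;
            }
        }
        path.remove(path.size() - 1);
        done.add(module);
        return null;
    }

    public static void main(String[] args) {
        // Module names are invented for the example.
        Map<String, List<String>> graph = new LinkedHashMap<>();
        graph.put("app", Collections.singletonList("search"));
        graph.put("search", Collections.singletonList("common"));
        graph.put("common", Collections.singletonList("search")); // a lower module depends on an upper one
        System.out.println(findCycle(graph));
    }
}
```

The sketch stops at the first cycle it finds. A real rule set would probably report every cycle and also check layering, which requires knowing which layer each module belongs to.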

Common dependency analysis methods

When it comes to Android dependency analysis, the following approaches probably spring to mind:

  • Analyze the Gradle dependency tree.

  • Scan the import statements in the code.

  • Use the analysis features built into Android Studio.

Let’s look at them one by one:

1. Gradle dependency tree

    ./gradlew :<module>:dependencies --configuration releaseCompileClasspath -q

As you can see, there are two problems with this approach:

  • A declaration counts as a dependency: a library is printed in the result even if the code never uses it.

  • Analysis can only be done at the module level, not at the method level.

2. Scanning import statements

Scan the import statements in Java files to obtain the reference relationships between files (classes).

The mapping between modules and files (classes) is easy to obtain by scanning directories. Once we have the file (class) dependencies, we therefore also have the file (class) level dependencies between modules.

Compared with scanning Gradle dependencies, this approach adds a level of detail: the results can be analyzed at the file (class) level. It also has some disadvantages:

  • Wildcard imports (import *) cannot be handled.

  • Detecting classes that are imported but never used is inefficient, because it requires a string search through the source.
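
The import-scanning approach can be sketched in a few lines of Java. This is a minimal illustration rather than the scanner used on the project: the class name ImportScanner and the regular expression are assumptions made for the example, and the pattern does not skip imports that appear inside comments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the names imported by one Java source file.
final class ImportScanner {
    // Matches "import a.b.C;", "import static a.b.C.d;" and "import a.b.*;".
    private static final Pattern IMPORT = Pattern.compile(
            "^\\s*import\\s+(?:static\\s+)?([\\w.]+(?:\\.\\*)?)\\s*;", Pattern.MULTILINE);

    static List<String> scan(String source) {
        List<String> imports = new ArrayList<>();
        Matcher m = IMPORT.matcher(source);
        while (m.find()) {
            imports.add(m.group(1));
        }
        return imports;
    }

    public static void main(String[] args) {
        String source = String.join("\n",
                "package com.account;",
                "import com.account.B;",
                "import java.util.*;",
                "public class A { }");
        for (String name : scan(source)) {
            // A trailing ".*" means the concrete class cannot be determined from the import alone.
            System.out.println(name + (name.endsWith(".*") ? "  (wildcard: class unknown)" : ""));
        }
    }
}
```

The wildcard case shows the first limitation directly: the scanner learns the package but not which classes in it are actually used.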

3. Using the IDE’s built-in analysis

In Android Studio, choose “Analyze” -> “Analyze Dependencies” to obtain method-level dependency data between modules, as shown in the figure:

Android Studio can accurately analyze method-level references between modules, supports jumping to the referenced code in the IDE, and also scans references to the Android SDK.

The main advantage of this approach over the previous two is its accuracy. But it also has several problems:

  • It is time-consuming: a full analysis of the Amap source code takes about 10 minutes.

  • The analysis results cannot be reused by third parties, and no visual dependency graph can be generated.

  • Analyzing forward and reverse dependencies requires two separate scans.

To summarize the three approaches above:

  • The Gradle dependency tree is based on project configuration; its granularity is too coarse and its results are inaccurate.
  • Import scanning yields file-level dependencies, but the data is incomplete.
  • The IDE scan is accurate, but its data is hard to reuse and hard to integrate into an engineering workflow.

Why use bytecode for analysis?

Refer to the Android build flowchart. All Java source code, together with the R.java files generated by AAPT, is compiled into .class files, which are then compiled into dex files and finally packaged into an APK by ApkBuilder. The .class files in the figure are what we call Java bytecode, a binary representation of the Java source code.

On Android, common uses of bytecode include:

  • Bytecode instrumentation: used to monitor the performance of UI, memory, network, and other modules.

  • Modifying JAR packages: for libraries without source code, simple logic changes can be made by editing the bytecode.

Returning to the subject of this article, why analyze bytecode rather than Java source code or dex files?

We do not use Java source code because some libraries are provided as JARs or AARs, so their source is not available to us. We do not use dex files because convenient tools for parsing them are lacking. Parsing bytecode is therefore almost our only option.

How do I use bytecode to analyze dependencies?

To obtain the dependencies between modules, we first need the dependencies between the classes in those modules. To determine how classes relate to each other, we analyze the statements in each class’s bytecode.

1. When to analyze?

Anyone familiar with the Android build process will know the transform task. It is the bytecode hook provided by the Android Gradle plugin.

In the transform task, all bytecode files (including third-party libraries) are supplied as input.

Taking JarInput as an example, the module name can be derived from its file field, and parsing that file yields all the bytecode files of the module.

With the module name and the paths of its class files, we have established the correspondence between modules and classes. This is the first key piece of data.
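
The module-to-class mapping can be sketched with the standard java.util.jar API. This is an illustration under assumptions, not the plugin’s actual code: the ModuleClassIndex class is hypothetical, and the module name is simply passed in as a parameter, whereas in a real transform both the jar file and the name would come from JarInput.

```java
import java.io.File;
import java.io.IOException;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical index from fully qualified class name to the module that contains it.
final class ModuleClassIndex {
    private final Map<String, String> classToModule = new HashMap<>();

    // Converts a jar entry path such as "com/account/A.class" to "com.account.A".
    static String toClassName(String entryName) {
        String withoutSuffix = entryName.substring(0, entryName.length() - ".class".length());
        return withoutSuffix.replace('/', '.');
    }

    // Records every .class entry of the jar as belonging to moduleName.
    void addJar(String moduleName, File jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar)) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
                    classToModule.put(toClassName(entry.getName()), moduleName);
                }
            }
        }
    }

    // Returns the owning module, or null if the class was never indexed.
    String moduleOf(String className) {
        return classToModule.get(className);
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ModuleClassIndex.java <moduleName> <path-to-jar>
        ModuleClassIndex index = new ModuleClassIndex();
        index.addJar(args[0], new File(args[1]));
        index.classToModule.forEach((cls, module) -> System.out.println(cls + " -> " + module));
    }
}
```

Keying the map by class name makes the later lookup cheap: for any class referenced in a bytecode statement, the owning module is a single map access away.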

2. What tools are used for analysis?

The most common tools for parsing Java bytecode include Javassist, ASM, and CGLib. ASM is a lightweight, high-performance library, but it requires working directly with JVM instructions. CGLib is a wrapper around ASM that provides higher-level interfaces.

Javassist, by contrast, is much simpler to use: it offers a Java-based API, so there is no need to manipulate JVM instructions. Its performance is lower, because Javassist adds a layer of abstraction. During the prototype phase we chose Javassist in order to verify the results quickly.

3. What is the specific plan?

Let’s start with a simple example of how to parse the following code:

1: package com.account;
2: import com.account.B;
3: public class A {
4:     void methodA() {
5:         B b = new B();  // create an instance of class B
6:         b.methodB();    // call methodB on instance b
7:     }
8: }

Step 1: Initialize the environment, load the bytecode of A.class, and register the expression parser.

    import javassist.ClassPool
    import javassist.CtClass

    // Initialize the ClassPool and register the bytecode directory with it.
    ClassPool pool = ClassPool.getDefault()
    pool.insertClassPath('<bytecode directory>')
    // Load class A.
    CtClass ctCls = pool.get("com.account.A")
    // Register the expression parser on class A.
    MyExprEditor editor = new MyExprEditor(ctCls)
    ctCls.instrument(editor)

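Step 2: Define a custom expression parser to analyze class A (using method calls as the example).

    import javassist.CtClass
    import javassist.expr.ExprEditor
    import javassist.expr.MethodCall

    class MyExprEditor extends ExprEditor {
        CtClass ctCls
        MyExprEditor(CtClass ctCls) { this.ctCls = ctCls }

        @Override
        void edit(MethodCall m) {
            def clsAName = ctCls.name                   // class containing the call
            def where = m.where().methodInfo.getName()  // method containing the call
            def line = m.lineNumber                     // line of the call
            def clsBName = m.className                  // class whose method is called
            def methodBName = m.methodName              // method that is called
        }
    }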

ExprEditor’s edit(MethodCall m) callback intercepts every method call in class A.

Besides MethodCall, used in this example, ExprEditor also supports NewExpr (new), NewArray, ConstructorCall, FieldAccess, Instanceof, Cast, and Handler (the catch clause of a try-catch statement).

Finally, we obtain the dependencies of class A on class B:


Class1         Class2         Expr        method1  method2  lineNo
com.account.A  com.account.B  NewExpr     methodA  -        5
com.account.A  com.account.B  MethodCall  methodA  methodB  6

Here’s a brief explanation:

Line 5 of com.account.A (in methodA) calls the constructor of com.account.B.

Line 6 of com.account.A (in methodA) calls methodB of com.account.B.

This is method-level dependency data between classes. Combined with the module-to-class mapping obtained in section 1 (“When to analyze?”), it finally gives us method-level dependency data between modules.
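
The combination step can be sketched as a simple join. The code below is illustrative only: the ModuleDependencyJoin and ClassDep types and the module names moduleA and moduleB are hypothetical, and the example assumes that classes A and B belong to different modules. The constructor is written as <init>, which is the name constructors carry in bytecode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical join of class-level records with the class-to-module mapping.
final class ModuleDependencyJoin {

    // One row of the class-level table: who references whom, how, and where.
    static final class ClassDep {
        final String fromClass;
        final String toClass;
        final String expr;
        final String fromMethod;
        final String toMethod;
        final int line;

        ClassDep(String fromClass, String toClass, String expr,
                 String fromMethod, String toMethod, int line) {
            this.fromClass = fromClass;
            this.toClass = toClass;
            this.expr = expr;
            this.fromMethod = fromMethod;
            this.toMethod = toMethod;
            this.line = line;
        }
    }

    // Keeps only references that cross a module boundary and labels them with module names.
    static List<String> toModuleDeps(List<ClassDep> deps, Map<String, String> classToModule) {
        List<String> result = new ArrayList<>();
        for (ClassDep d : deps) {
            String from = classToModule.get(d.fromClass);
            String to = classToModule.get(d.toClass);
            // Skip classes missing from the index and references inside a single module.
            if (from == null || to == null || from.equals(to)) {
                continue;
            }
            result.add(from + " -> " + to + ": " + d.fromClass + "." + d.fromMethod
                    + " uses " + d.toClass + "." + d.toMethod
                    + " (" + d.expr + ", line " + d.line + ")");
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption for the example: A and B live in different modules.
        Map<String, String> classToModule = new HashMap<>();
        classToModule.put("com.account.A", "moduleA");
        classToModule.put("com.account.B", "moduleB");
        List<ClassDep> deps = Arrays.asList(
                new ClassDep("com.account.A", "com.account.B", "NewExpr", "methodA", "<init>", 5),
                new ClassDep("com.account.A", "com.account.B", "MethodCall", "methodA", "methodB", 6));
        toModuleDeps(deps, classToModule).forEach(System.out::println);
    }
}
```

Dropping references inside a single module is a design choice for this sketch; a tool that also audits a module’s internal structure would keep them.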

Based on this data, we can also define custom dependency detection rules, generate a global module dependency graph, and so on. Those topics are not covered in this article.

Summary

This article explained why module dependency analysis matters in the R&D process and reviewed the common approaches to dependency analysis on Android: Gradle dependency tree analysis, import scanning, IDE analysis, and finally bytecode analysis, each a step beyond the last. The closer an approach gets to the root of the problem, the more fundamental the solution.