Preface
First of all, why analyze the installation package at all? What is the goal, and what prompted us to do it?
- To reduce the package size. Product and operations colleagues believe that the smaller the package, the more downloads it gets.
- App markets such as the App Store and Google Play place restrictions on package size.
- To reduce the storage and memory footprint. Both ROM and RAM usage tend to grow with the package size, so shrinking the package indirectly reduces them as well.
These are the reasons we do this work in practice. It is especially visible for colleagues working on consumer-facing (toC) apps, who are constantly optimizing package size. To optimize the size you first have to know the structure of the package, so that you can choose an appropriate method. So let's look at the package.
Package composition analysis
Using Android Studio's visualization tool, we open an ordinary APK installation package, as shown below:
To be more convincing, I also analyzed the WX (WeChat) package, as follows:
From the two screenshots, the rough composition of a package is as follows:
- Dex files and so libraries. Note that WX keeps so libraries only for the armeabi-v7a architecture, and they still account for 50% of the package, which is quite high.
- The res directory and assets, which store images, audio files, and other resource files.
- resources.arsc, which is as large as 6.8MB. This is the index table for resources, generated by the AAPT tool during packaging.
- META-INF, which holds the signature information.
- AndroidManifest.xml, the manifest configuration file.
On the whole, dex, so, res, assets and resources.arsc account for most of the package, so these are the areas to start from.
How can I reduce the size of the installation package
1. Resource compression
Compress large images losslessly, convert PNG images that do not need an alpha channel to JPG, or use the smaller WebP format (see: WebP Introduction). Audio files stored in assets can be hosted remotely instead and cached after the first download.
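Whether a PNG really needs its alpha channel can be checked mechanically before converting it. The following is a minimal sketch that uses only the JDK's `BufferedImage`; the names `AlphaCheck`, `usesTransparency` and `filled` are made up for illustration and do not belong to any tool mentioned in this article. In a real project the image would typically be loaded with `ImageIO.read(file)` instead of being built in memory.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, for illustration only.
final class AlphaCheck {

    /** Returns true if at least one pixel is not fully opaque. */
    static boolean usesTransparency(BufferedImage image) {
        if (!image.getColorModel().hasAlpha()) {
            return false; // no alpha channel at all
        }
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                int alpha = image.getRGB(x, y) >>> 24; // top 8 bits of ARGB
                if (alpha != 0xFF) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Builds an ARGB image filled with a single ARGB color. */
    static BufferedImage filled(int width, int height, int argb) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                image.setRGB(x, y, argb);
            }
        }
        return image;
    }

    public static void main(String[] args) {
        BufferedImage opaque = filled(4, 4, 0xFF336699);
        BufferedImage translucent = filled(4, 4, 0xFF336699);
        translucent.setRGB(0, 0, 0x80336699); // one half-transparent pixel
        System.out.println("opaque image needs alpha: " + usesTransparency(opaque));
        System.out.println("translucent image needs alpha: " + usesTransparency(translucent));
    }
}
```

An image for which `usesTransparency` returns false is a candidate for JPG or lossy WebP; one that returns true has to stay in a format that keeps the alpha channel.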
2. Shrink and obfuscate through the compiler
The R8 compiler shrinks and obfuscates the code. It requires Android Gradle plugin 3.4.0 or later.
3. resources.arsc file reduction
How can this file be reduced? After some research, it turns out that the size of the file depends on the language of the strings and on the encoding used to store them.
- For pure English, UTF-8 is recommended
- For Chinese, UTF-16 is recommended
See the aapt-related commands for how to do this in practice.
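The reason behind this recommendation is simply the number of bytes each encoding needs per character. A minimal sketch that compares the two encodings on one English and one Chinese string follows; the class name `StringPoolEncodingDemo` is hypothetical, and the sketch ignores the length prefixes and terminators that a real string pool stores alongside each string.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only measures raw encoded sizes.
final class StringPoolEncodingDemo {

    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    static int utf16Bytes(String s) {
        // UTF_16LE is used so that no byte-order mark is counted.
        return s.getBytes(StandardCharsets.UTF_16LE).length;
    }

    public static void main(String[] args) {
        // "\u8BBE\u7F6E" is a two-character Chinese string.
        for (String s : new String[] {"Settings", "\u8BBE\u7F6E"}) {
            System.out.println(s + ": UTF-8 = " + utf8Bytes(s)
                    + " bytes, UTF-16 = " + utf16Bytes(s) + " bytes");
        }
    }
}
```

ASCII characters take one byte in UTF-8 and two in UTF-16, while common Chinese characters take three bytes in UTF-8 and two in UTF-16, which is why the better choice depends on the language of the strings.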
4. so libraries
From the second screenshot above, we can see that WX keeps so libraries only for armeabi-v7a, which is currently the most widely used architecture. An app with as many users as WX dares to keep only one architecture and removes x86 and arm64-v8a. This is the superficial optimization. The deeper one is to slim down the so library code itself, for example by removing standalone libraries and reducing redundant code. It is also recommended to use stlport_shared as the C++ runtime library, which reduces the package size and saves a bit of memory. Note that with this approach the application has to load the shared runtime library first and then load the other native modules that depend on it:
static {
    System.loadLibrary("stlport_shared");
    System.loadLibrary("xxxxx");
}
5. Optimize the number of dex files
Once we use multiDex, that is, once the number of methods reaches 65535, the code has to be split across several dex files. What problems does this splitting cause?
- Unreasonable allocation of method ids leads to more dex files. Because method ids are duplicated across dex files, fewer classes fit into each dex.
- Information is stored redundantly, because each dex contains the details of every method it calls. For example, if a method of a class is referenced from another dex, the method information is stored not only in the dex that defines it but also in the dex that references it. Too much of this redundancy increases the number of dex files.
So how do we solve the problem? The answer is to keep method references within the same dex as far as possible, which reduces redundancy and avoids additional dex files. The currently recommended solution is ReDex, an open source tool from Facebook. For usage, please refer to the documentation: github.com/facebook/re… I will not go into detail here.
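To make the redundancy concrete, here is a toy model, not a dex parser: every dex has to list each method that its classes reference, so the same method is listed once per dex that uses it. The class `DexRefCountDemo`, its methods and the sample class names are all invented for this sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model with invented names; it only counts distinct references per dex.
final class DexRefCountDemo {

    /** Sums, over all dex files, the number of distinct methods each dex must list. */
    static int totalMethodIds(List<List<String>> dexes, Map<String, Set<String>> refsByClass) {
        int total = 0;
        for (List<String> dex : dexes) {
            Set<String> ids = new HashSet<>();
            for (String cls : dex) {
                ids.addAll(refsByClass.get(cls));
            }
            total += ids.size();
        }
        return total;
    }

    /** Sample data: A and B both call Util.log, C and D both call Net.get. */
    static Map<String, Set<String>> sampleRefs() {
        Map<String, Set<String>> refs = new HashMap<>();
        refs.put("A", new HashSet<>(Arrays.asList("A.run", "Util.log")));
        refs.put("B", new HashSet<>(Arrays.asList("B.run", "Util.log")));
        refs.put("C", new HashSet<>(Arrays.asList("C.run", "Net.get")));
        refs.put("D", new HashSet<>(Arrays.asList("D.run", "Net.get")));
        return refs;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> refs = sampleRefs();
        List<List<String>> grouped = Arrays.asList(Arrays.asList("A", "B"), Arrays.asList("C", "D"));
        List<List<String>> scattered = Arrays.asList(Arrays.asList("A", "C"), Arrays.asList("B", "D"));
        System.out.println("callers kept together: " + totalMethodIds(grouped, refs));
        System.out.println("callers scattered:     " + totalMethodIds(scattered, refs));
    }
}
```

With the callers of a shared method kept in the same dex, the two dex files list 6 methods in total; with the callers scattered, the same code needs 8 entries. Minimizing this kind of duplication is the idea behind keeping references inside one dex.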
6. Dex compression
This method also comes from the Facebook app. The real dex code is placed in the assets directory and compressed with the XZ algorithm (whose compression ratio is about 30% higher than Zip), then decompressed when the application starts for the first time. Because the decompression is multi-threaded, the extra time is not very noticeable.
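The multi-threaded decompression step can be sketched with the JDK alone. Note the substitution: the JDK ships no XZ codec, so this sketch uses Deflate (`java.util.zip`) as a stand-in, and it is not how the Facebook app is implemented. The class `ParallelInflateDemo` and its methods are hypothetical; each payload stands for one compressed dex file.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: Deflate stands in for XZ, which the JDK does not provide.
final class ParallelInflateDemo {

    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    static byte[] inflate(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && inflater.needsInput()) {
                    throw new IllegalArgumentException("truncated input");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt input", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Decompresses every payload on a pool of the given size, keeping the input order. */
    static List<byte[]> inflateAll(List<byte[]> compressed, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<byte[]>> jobs = new ArrayList<>();
            for (byte[] payload : compressed) {
                jobs.add(() -> inflate(payload));
            }
            List<byte[]> result = new ArrayList<>();
            for (Future<byte[]> future : pool.invokeAll(jobs)) {
                result.add(future.get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<byte[]> packed = new ArrayList<>();
        packed.add(deflate(new byte[100_000])); // stands for classes.dex
        packed.add(deflate(new byte[200_000])); // stands for classes2.dex
        for (byte[] restored : inflateAll(packed, 2)) {
            System.out.println("restored " + restored.length + " bytes");
        }
    }
}
```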
Summary
We have covered many optimization methods, and if you want to push them to the limit there is always a way. But we are now in the 5G era: will users still notice a difference of 10MB or even 100MB? Some extreme optimizations will definitely hurt the user experience, so this is a trade-off that should be made according to actual needs.
Matrix App Checker
Now that we have seen the package structure and the common reduction methods, what can App Checker do to help us shrink the package? Let's find out.
The code directory
The jar package apktool-lib-2.4.0.jar decompiles the APK to produce the dex, libs, manifest and other files. Now let's look at the code:
- The exception directory defines two exceptions, TaskExecuteException and TaskInitException, for task execution and task initialization failures respectively, so that they are easy to catch.
- The job directory defines ApkJob, which manages all tasks and the JobResult.
- The output directory writes the results to a file in JSON or HTML format.
- The result directory defines and implements JobResult and TaskResult.
- The task directory contains the implementations of all tasks, including CountClassTask, which counts classes, MethodCountTask, which counts methods, and UnzipTask, which decompresses the APK into its files.
- ApkChecker is responsible for creating the job and then calling its run method.
The class diagram
Text descriptions are always a bit weak, so here is a class structure diagram to help us understand the code.
ApkChecker is the entry class. It creates the ApkJob, the ApkJob builds the task list, the actual task objects are created through a factory (the Factory pattern), and each task returns a result, so the structure looks like this.
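The relationship between the job, the factory and the tasks can be reduced to a few lines. The following is a deliberately simplified sketch and not the Matrix source: `MiniChecker`, its nested `Task`, `TaskFactory` and `Job` types, and the task type strings are stand-ins chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the roles described above; not the real Matrix classes.
final class MiniChecker {

    /** A unit of work that produces a textual result. */
    interface Task {
        String call();
    }

    /** Creates tasks by type name, so callers never name a concrete task class. */
    static final class TaskFactory {
        static Task create(String type, String apkPath) {
            switch (type) {
                case "unzip":
                    return () -> "unzipped " + apkPath;
                case "countMethod":
                    return () -> "counted methods in " + apkPath;
                default:
                    return null; // unknown task type
            }
        }
    }

    /** Builds the task list from the requested types and runs it in order. */
    static final class Job {
        private final List<Task> tasks = new ArrayList<>();

        Job(String apkPath, String... taskTypes) {
            for (String type : taskTypes) {
                Task task = TaskFactory.create(type, apkPath);
                if (task != null) {
                    tasks.add(task);
                }
            }
        }

        List<String> run() {
            List<String> results = new ArrayList<>();
            for (Task task : tasks) {
                results.add(task.call());
            }
            return results;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Job("app.apk", "unzip", "countMethod").run());
    }
}
```

The point of the factory is that adding a new kind of task only touches the factory and the new task itself, while the job keeps iterating over the same interface.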
Scan the source code
ApkChecker
public final class ApkChecker {
    public static void main(String... args) {
        if (args.length > 0) {
            // Create the ApkChecker object
            ApkChecker m = new ApkChecker();
            m.run(args);
        } else {
            System.out.println(INTRODUCT + HELP);
            System.exit(0);
        }
    }

    private void run(String[] args) {
        // Create the Job object and call run
        ApkJob job = new ApkJob(args);
        try {
            job.run();
        } catch (Exception e) {
            e.printStackTrace();
            System.exit(-1);
        }
    }
}
Let’s look at the run method for ApkJob:
public void run() throws Exception {
    // Parse the parameters and create the corresponding tasks
    if (parseParams()) {
        // Create the unzip task that decompresses the APK
        ApkTask unzipTask = TaskFactory.factory(TaskFactory.TASK_TYPE_UNZIP, jobConfig, new HashMap<String, String>());
        // Put the task into the preTasks collection
        preTasks.add(unzipTask);
        // Get the output format configuration from jobConfig
        for (String format : jobConfig.getOutputFormatList()) {
            // Get a JobJsonResult or JobHtmlResult
            JobResult result = JobResultFactory.factory(format, jobConfig);
            if (result != null) {
                jobResults.add(result);
            } else {
                Log.w(TAG, "Unknown output format name '%s' !", format);
            }
        }
        // Start executing
        execute();
    } else {
        ApkChecker.printHelp();
    }
}

// This function creates tasks based on the parameters passed from the command line and puts them into the taskList collection
private boolean parseParams() {
    if (args != null && args.length >= 2) {
        int paramLen = parseGlobalParams();
        for (int i = paramLen; i < args.length; i++) {
            if (args[i].startsWith("-") && !args[i].startsWith("--")) {
                Map<String, String> params = new HashMap<>();
                paramLen = parseParams(i + 1, args, params);
                if (!params.containsKey(JobConstants.PARAM_R_TXT)) {
                    String inputDir = jobConfig.getInputDir();
                    if (!Util.isNullOrNil(inputDir)) {
                        params.put(JobConstants.PARAM_R_TXT, inputDir + "/" + ApkConstants.DEFAULT_RTXT_FILENAME);
                    }
                }
                ApkTask task = createTask(args[i], params);
                if (task != null) {
                    taskList.add(task);
                }
                i += paramLen;
            }
        }
    } else {
        return false;
    }
    return true;
}
![image.png](https://p6-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/80454d55247f41b1a2db8e8fe3d5d16b~tplv-k3u1fbpfcp-watermark.image)
// Finally, look at the execution process
private void execute() throws Exception {
    try {
        // Run the preparation tasks in preTasks first and collect their results
        for (ApkTask preTask : preTasks) {
            // Task initialization
            preTask.init();
            // The task is executed synchronously and the result is obtained
            TaskResult taskResult = preTask.call();
            if (taskResult != null) {
                TaskResult formatResult = null;
                // Iterate over the preconfigured result set
                for (JobResult jobResult : jobResults) {
                    // Convert taskResult to the format configured for jobResult
                    formatResult = TaskResultFactory.transferTaskResult(taskResult.taskType, taskResult, jobResult.getFormat(), jobConfig);
                    if (formatResult != null) {
                        // Put the result of the unzip task into jobResult
                        jobResult.addTaskResult(formatResult);
                    }
                }
            }
        }
        // Initialize the other tasks
        for (ApkTask task : taskList) {
            task.init();
        }
        // The executor is a thread pool that has only one thread by default
        List<Future<TaskResult>> futures = executor.invokeAll(taskList, timeoutSeconds, TimeUnit.SECONDS);
        for (Future<TaskResult> future : futures) {
            // Because the pool has only one thread, this looks concurrent but is not
            TaskResult taskResult = future.get();
            if (taskResult != null) {
                TaskResult formatResult = null;
                for (JobResult jobResult : jobResults) {
                    // The converted result is finally put into jobResult
                    formatResult = TaskResultFactory.transferTaskResult(taskResult.taskType, taskResult, jobResult.getFormat(), jobConfig);
                    if (formatResult != null) {
                        jobResult.addTaskResult(formatResult);
                    }
                }
            }
        }
        // Shut down the thread pool
        executor.shutdownNow();
        for (JobResult jobResult : jobResults) {
            // Output to a file
            jobResult.output();
        }
        Log.d(TAG, "parse apk end, try to delete tmp un zip files");
        // Delete the directory of decompressed APK files
        FileUtils.deleteDirectory(new File(jobConfig.getUnzipPath()));
    } catch (Exception e) {
        Log.e(TAG, "Task executor execute with error:" + e.getMessage());
        throw e;
    }
}
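The two phases of execute, synchronous preparation tasks followed by `invokeAll` on a pool, can be tried out in isolation. This is a minimal sketch under simplifying assumptions: tasks return plain strings instead of TaskResult objects, and the class `ExecuteFlowDemo` and its `execute` signature are invented for the example. Only `Executors.newFixedThreadPool`, `invokeAll` with a timeout, `Future.get` and `shutdownNow` are real JDK calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical reduction of the flow above to plain strings.
final class ExecuteFlowDemo {

    static List<String> execute(List<Callable<String>> preTasks,
                                List<Callable<String>> tasks,
                                long timeoutSeconds) {
        List<String> results = new ArrayList<>();
        // A pool with a single thread runs the submitted tasks one after another.
        ExecutorService executor = Executors.newFixedThreadPool(1);
        try {
            // Phase 1: preparation tasks run synchronously on the calling thread.
            for (Callable<String> preTask : preTasks) {
                results.add(preTask.call());
            }
            // Phase 2: the remaining tasks are submitted together with a shared timeout.
            List<Future<String>> futures = executor.invokeAll(tasks, timeoutSeconds, TimeUnit.SECONDS);
            for (Future<String> future : futures) {
                // get() throws CancellationException for tasks cut off by the timeout.
                results.add(future.get());
            }
            return results;
        } catch (Exception e) {
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt();
            }
            throw new IllegalStateException("task execution failed", e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        List<Callable<String>> preTasks = new ArrayList<>();
        preTasks.add(() -> "unzip");
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(() -> "countMethod");
        tasks.add(() -> "countClass");
        System.out.println(execute(preTasks, tasks, 5));
    }
}
```

`invokeAll` returns its futures in the same order as the submitted tasks, so the results stay in task order regardless of the pool size.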
The overall flow is actually quite simple. But even after walking through it, we still do not know what code each task runs to obtain its information. Let's look at a few specific tasks.
MethodCountTask
Why start with method counting? Because we often need statistics on the number of methods in a project. How is that done, and can we learn it from this code? Let's go straight to the source.
// Take a look at the initialization method first. What does it do?
@Override
public void init() throws TaskInitException {
    super.init();
    // Get the directory of the decompressed APK files
    String inputPath = config.getUnzipPath();
    if (Util.isNullOrNil(inputPath)) {
        throw new TaskInitException(TAG + "---APK-UNZIP-PATH can not be null!");
    }
    Log.i(TAG, "input path:%s", inputPath);
    // Create a File object for the path and check that it exists and is a directory
    inputFile = new File(inputPath);
    if (!inputFile.exists()) {
        throw new TaskInitException(TAG + "---APK-UNZIP-PATH '" + inputPath + "' is not exist!");
    } else if (!inputFile.isDirectory()) {
        throw new TaskInitException(TAG + "---APK-UNZIP-PATH '" + inputPath + "' is not directory!");
    }
    // If the checks pass, list all files in the directory
    File[] files = inputFile.listFiles();
    try {
        if (files != null) {
            for (File file : files) {
                // Find the files with the dex suffix
                if (file.isFile() && file.getName().endsWith(ApkConstants.DEX_FILE_SUFFIX)) {
                    // Cache the file name in dexFileNameList
                    dexFileNameList.add(file.getName());
                    // RandomAccessFile supports both reading and writing and, unlike ordinary
                    // I/O streams, random access: a program can jump to any position in the file.
                    RandomAccessFile randomAccessFile = new RandomAccessFile(file, "rw");
                    // Add the RandomAccessFile object to dexFileList
                    dexFileList.add(randomAccessFile);
                }
            }
        }
    } catch (FileNotFoundException e) {
        throw new TaskInitException(e.getMessage(), e);
    }
    // Get the grouping configuration: group by class or by package
    if (params.containsKey(JobConstants.PARAM_GROUP)) {
        if (JobConstants.GROUP_PACKAGE.equals(params.get(JobConstants.PARAM_GROUP))) {
            group = JobConstants.GROUP_PACKAGE;
        } else if (JobConstants.GROUP_CLASS.equals(params.get(JobConstants.PARAM_GROUP))) {
            group = JobConstants.GROUP_CLASS;
        } else {
            Log.e(TAG, "GROUP-BY '" + params.get(JobConstants.PARAM_GROUP) + "' is not correct!");
        }
    }
}
Once initialization is complete, the task is executed. Let's continue:
// The method is a bit long, so let's go through it step by step
@Override
public TaskResult call() throws TaskExecuteException {
    try {
        // Get the TaskResult object (TaskJsonResult or TaskHtmlResult) for the task type and result type.
        // The JSON result type is used here.
        TaskResult taskResult = TaskResultFactory.factory(getType(), TASK_RESULT_TYPE_JSON, config);
        if (taskResult == null) {
            return null;
        }
        // Record the start time
        long startTime = System.currentTimeMillis();
        JsonArray jsonArray = new JsonArray();
        // Loop through the dex files (RandomAccessFile objects) collected during initialization
        for (int i = 0; i < dexFileList.size(); i++) {
            RandomAccessFile dexFile = dexFileList.get(i);
            // See the analysis of this function below
            countDex(dexFile);
            // Close the file to prevent leaks
            dexFile.close();
            // Count the internal methods
            int totalInternalMethods = sumOfValue(classInternalMethod);
            // Count the external methods
            int totalExternalMethods = sumOfValue(classExternalMethod);
            JsonObject jsonObject = new JsonObject();
            jsonObject.addProperty("dex-file", dexFileNameList.get(i));
            // Group according to the configuration
            if (JobConstants.GROUP_CLASS.equals(group)) {
                List<String> sortList = sortKeyByValue(classInternalMethod);
                JsonArray classes = new JsonArray();
                for (String className : sortList) {
                    JsonObject classObj = new JsonObject();
                    classObj.addProperty("name", className);
                    classObj.addProperty("methods", classInternalMethod.get(className));
                    classes.add(classObj);
                }
                jsonObject.add("internal-classes", classes);
            } else if (JobConstants.GROUP_PACKAGE.equals(group)) {
                String packageName;
                for (Map.Entry<String, Integer> entry : classInternalMethod.entrySet()) {
                    packageName = ApkUtil.getPackageName(entry.getKey());
                    if (!Util.isNullOrNil(packageName)) {
                        if (!pkgInternalRefMethod.containsKey(packageName)) {
                            pkgInternalRefMethod.put(packageName, entry.getValue());
                        } else {
                            pkgInternalRefMethod.put(packageName, pkgInternalRefMethod.get(packageName) + entry.getValue());
                        }
                    }
                }
                List<String> sortList = sortKeyByValue(pkgInternalRefMethod);
                JsonArray packages = new JsonArray();
                for (String pkgName : sortList) {
                    JsonObject pkgObj = new JsonObject();
                    pkgObj.addProperty("name", pkgName);
                    pkgObj.addProperty("methods", pkgInternalRefMethod.get(pkgName));
                    packages.add(pkgObj);
                }
                jsonObject.add("internal-packages", packages);
            }
            jsonObject.addProperty("total-internal-classes", classInternalMethod.size());
            jsonObject.addProperty("total-internal-methods", totalInternalMethods);
            if (JobConstants.GROUP_CLASS.equals(group)) {
                List<String> sortList = sortKeyByValue(classExternalMethod);
                JsonArray classes = new JsonArray();
                for (String className : sortList) {
                    JsonObject classObj = new JsonObject();
                    classObj.addProperty("name", className);
                    classObj.addProperty("methods", classExternalMethod.get(className));
                    classes.add(classObj);
                }
                jsonObject.add("external-classes", classes);
            } else if (JobConstants.GROUP_PACKAGE.equals(group)) {
                String packageName = "";
                for (Map.Entry<String, Integer> entry : classExternalMethod.entrySet()) {
                    packageName = ApkUtil.getPackageName(entry.getKey());
                    if (!Util.isNullOrNil(packageName)) {
                        if (!pkgExternalMethod.containsKey(packageName)) {
                            pkgExternalMethod.put(packageName, entry.getValue());
                        } else {
                            pkgExternalMethod.put(packageName, pkgExternalMethod.get(packageName) + entry.getValue());
                        }
                    }
                }
                List<String> sortList = sortKeyByValue(pkgExternalMethod);
                JsonArray packages = new JsonArray();
                for (String pkgName : sortList) {
                    JsonObject pkgObj = new JsonObject();
                    pkgObj.addProperty("name", pkgName);
                    pkgObj.addProperty("methods", pkgExternalMethod.get(pkgName));
                    packages.add(pkgObj);
                }
                jsonObject.add("external-packages", packages);
            }
            jsonObject.addProperty("total-external-classes", classExternalMethod.size());
            jsonObject.addProperty("total-external-methods", totalExternalMethods);
            jsonArray.add(jsonObject);
        }
        ((TaskJsonResult) taskResult).add("dex-files", jsonArray);
        taskResult.setStartTime(startTime);
        taskResult.setEndTime(System.currentTimeMillis());
        // Return the result
        return taskResult;
    } catch (Exception e) {
        throw new TaskExecuteException(e.getMessage(), e);
    }
}
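The package grouping inside call is essentially a sum of per-class method counts keyed by package name. A standalone sketch of just that step follows. `PackageGroupingDemo` and `packageOf` are hypothetical; in particular, `packageOf` only approximates what `ApkUtil.getPackageName` might do, since that method's body is not shown in this article.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the group-by-package step.
final class PackageGroupingDemo {

    /** Assumed behavior: everything before the last dot, or "" if there is no dot. */
    static String packageOf(String className) {
        int idx = className.lastIndexOf('.');
        return idx < 0 ? "" : className.substring(0, idx);
    }

    /** Adds up the per-class method counts under each class's package. */
    static Map<String, Integer> groupByPackage(Map<String, Integer> methodsPerClass) {
        Map<String, Integer> methodsPerPackage = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : methodsPerClass.entrySet()) {
            String pkg = packageOf(entry.getKey());
            if (!pkg.isEmpty()) {
                // merge() covers both the first-seen and the already-present case.
                methodsPerPackage.merge(pkg, entry.getValue(), Integer::sum);
            }
        }
        return methodsPerPackage;
    }

    public static void main(String[] args) {
        Map<String, Integer> perClass = new LinkedHashMap<>();
        perClass.put("com.a.Foo", 3);
        perClass.put("com.a.Bar", 2);
        perClass.put("com.b.Baz", 5);
        System.out.println(groupByPackage(perClass));
    }
}
```

`Map.merge` replaces the containsKey/put/else-put pattern with a single call, which is one way the same logic could be written more compactly.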
countDex(dexFile) function analysis
private void countDex(RandomAccessFile dexFile) throws IOException {
    // Clear the map of internal methods, grouped by class
    classInternalMethod.clear();
    // Clear the map of external (dependency) methods, grouped by class
    classExternalMethod.clear();
    // Same for the maps grouped by package
    pkgInternalRefMethod.clear();
    pkgExternalMethod.clear();
    // Parse the dex file with the DexData class from the com.android.dexdeps package
    DexData dexData = new DexData(dexFile);
    dexData.load();
    // Get all method references, both internal and external
    MethodRef[] methodRefs = dexData.getMethodRefs();
    // Get the references to all external classes
    ClassRef[] externalClassRefs = dexData.getExternalReferences();
    // Get the mapping from obfuscated class names to original class names
    Map<String, String> proguardClassMap = config.getProguardClassMap();
    String className = null;
    for (ClassRef classRef : externalClassRefs) {
        // Get the class name
        className = ApkUtil.getNormalClassName(classRef.getName());
        if (proguardClassMap.containsKey(className)) {
            // Restore the class name from before obfuscation
            className = proguardClassMap.get(className);
        }
        if (className.indexOf('.') == -1) {
            continue;
        }
        // Put the class name into the external method map for use when matching external methods below
        classExternalMethod.put(className, 0);
    }
    // Go through all the methods and count them in classExternalMethod or classInternalMethod
    for (MethodRef methodRef : methodRefs) {
        // Get the name of the class that declares the method
        className = ApkUtil.getNormalClassName(methodRef.getDeclClassName());
        // Restore the class name from before obfuscation
        if (proguardClassMap.containsKey(className)) {
            className = proguardClassMap.get(className);
        }
        if (!Util.isNullOrNil(className)) {
            if (className.indexOf('.') == -1) {
                continue;
            }
            // If the class is one of the external classes, the method is external; otherwise it is internal
            if (classExternalMethod.containsKey(className)) {
                classExternalMethod.put(className, classExternalMethod.get(className) + 1);
            } else if (classInternalMethod.containsKey(className)) {
                classInternalMethod.put(className, classInternalMethod.get(className) + 1);
            } else {
                classInternalMethod.put(className, 1);
            }
        }
    }
    // Delete the external classes that have no method references
    Iterator<String> iterator = classExternalMethod.keySet().iterator();
    while (iterator.hasNext()) {
        if (classExternalMethod.get(iterator.next()) == 0) {
            iterator.remove();
        }
    }
}
From the source analysis, the principle is this: load the dex file with a DexData object, obtain all method references with getMethodRefs and the external class references with getExternalReferences, and then classify each method. A method whose declaring class is in the external class set (classExternalMethod) is counted as external, and every other method is counted as internal.
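The classification rule can be exercised without a dex parser by feeding it plain class names. In this sketch the inputs that countDex gets from DexData are replaced by a list of declaring class names (one per method reference) and a set of external class names. `MethodClassifierDemo` and its fields are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: plain strings stand in for MethodRef and ClassRef objects.
final class MethodClassifierDemo {

    final Map<String, Integer> internal = new HashMap<>();
    final Map<String, Integer> external = new HashMap<>();

    /**
     * @param methodOwners    declaring class of each method reference, one entry per method
     * @param externalClasses classes that are referenced but not defined in this dex
     */
    static MethodClassifierDemo classify(List<String> methodOwners, Set<String> externalClasses) {
        MethodClassifierDemo result = new MethodClassifierDemo();
        // Seed the external map with zero counts so membership can be tested below.
        for (String cls : externalClasses) {
            result.external.put(cls, 0);
        }
        for (String owner : methodOwners) {
            if (result.external.containsKey(owner)) {
                result.external.merge(owner, 1, Integer::sum);
            } else {
                result.internal.merge(owner, 1, Integer::sum);
            }
        }
        // Drop external classes that were referenced without any method reference.
        result.external.values().removeIf(count -> count == 0);
        return result;
    }

    public static void main(String[] args) {
        List<String> owners = Arrays.asList(
                "com.app.Main", "com.app.Main", "android.util.Log", "com.app.Util");
        Set<String> externals = new HashSet<>(Arrays.asList("android.util.Log", "java.lang.String"));
        MethodClassifierDemo result = classify(owners, externals);
        System.out.println("internal: " + result.internal);
        System.out.println("external: " + result.external);
    }
}
```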
CountClassTask
With the experience from the analysis above, CountClassTask is much easier to analyze. The core code is as follows:
DexData dexData = new DexData(dexFile);
dexData.load();
dexFile.close();
ClassRef[] defClassRefs = dexData.getInternalReferences();
Set<String> classNameSet = new HashSet<>();
for (ClassRef classRef : defClassRefs) {
    String className = ApkUtil.getNormalClassName(classRef.getName());
    if (classProguardMap.containsKey(className)) {
        className = classProguardMap.get(className);
    }
    if (className.indexOf('.') == -1) {
        continue;
    }
    classNameSet.add(className);
}
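The name handling in this loop can be tried without DexData. The sketch below assumes that class references arrive as dex type descriptors such as `Lcom/app/Main;` and that the mapping goes from obfuscated names to original names. Both are assumptions, as is the body of `normalClassName`, which only guesses at what `ApkUtil.getNormalClassName` does. `ClassNameDemo` is a hypothetical class.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; the descriptor format and mapping direction are assumptions.
final class ClassNameDemo {

    /** Turns a descriptor like "Lcom/app/Main;" into "com.app.Main". */
    static String normalClassName(String descriptor) {
        String name = descriptor;
        if (name.startsWith("L") && name.endsWith(";")) {
            name = name.substring(1, name.length() - 1);
        }
        return name.replace('/', '.');
    }

    /** Collects distinct class names, restoring original names via the mapping. */
    static Set<String> collect(List<String> descriptors, Map<String, String> obfuscatedToOriginal) {
        Set<String> names = new TreeSet<>();
        for (String descriptor : descriptors) {
            String name = normalClassName(descriptor);
            name = obfuscatedToOriginal.getOrDefault(name, name);
            if (name.indexOf('.') == -1) {
                continue; // skip classes without a package, as the loop above does
            }
            names.add(name); // the set removes duplicates
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, String> mapping = new HashMap<>();
        mapping.put("a.b", "com.app.Secret");
        List<String> descriptors = Arrays.asList("La/b;", "La/b;", "Lcom/app/Main;", "LNoPackage;");
        System.out.println(collect(descriptors, mapping));
    }
}
```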
Where the previous task used getExternalReferences, this one calls getInternalReferences to get the references to all internal classes, that is, the classes defined in the dex itself, and adds their names to a HashSet. The whole process is straightforward. Are all tasks implemented with DexData objects? No, as the next one shows.
UnusedAssetsTask
As the name suggests, this task looks for assets that are never referenced. Let's look at how it works, starting with init:
// Initialization method
@Override
public void init() throws TaskInitException {
    super.init();
    // Get the path of the decompressed APK files
    String inputPath = config.getUnzipPath();
    if (Util.isNullOrNil(inputPath)) {
        throw new TaskInitException(TAG + "---APK-UNZIP-PATH can not be null!");
    }
    inputFile = new File(inputPath);
    // Again, check that the path exists and is a directory
    if (!inputFile.exists()) {
        throw new TaskInitException(TAG + "---APK-UNZIP-PATH '" + inputPath + "' is not exist!");
    } else if (!inputFile.isDirectory()) {
        throw new TaskInitException(TAG + "---APK-UNZIP-PATH '" + inputPath + "' is not directory!");
    }
    // Put the assets listed as ignored in the configuration into ignoreSet. Assets that are
    // known to be in use do not need to be checked, which narrows the scope of the scan.
    if (params.containsKey(JobConstants.PARAM_IGNORE_ASSETS_LIST) && !Util.isNullOrNil(params.get(JobConstants.PARAM_IGNORE_ASSETS_LIST))) {
        String[] ignoreAssets = params.get(JobConstants.PARAM_IGNORE_ASSETS_LIST).split(",");
        Log.i(TAG, "ignore assets %d", ignoreAssets.length);
        for (String ignore : ignoreAssets) {
            ignoreSet.add(Util.globToRegexp(ignore));
        }
    }
    File[] files = inputFile.listFiles();
    if (files != null) {
        for (File file : files) {
            if (file.isFile() && file.getName().endsWith(ApkConstants.DEX_FILE_SUFFIX)) {
                // Again, pick out the dex files and put their names into dexFileNameList
                dexFileNameList.add(file.getName());
            }
        }
    }
}
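The ignore list is written as glob patterns and converted to regular expressions before matching. The real `Util.globToRegexp` is not shown in this article, so the following is only a guess at the idea, supporting nothing but `*` and `?`. `GlobDemo`, `globToRegex` and `matchesAny` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; it is not the Matrix implementation of globToRegexp.
final class GlobDemo {

    /** Converts a glob to a regex: '*' matches any run of characters, '?' exactly one. */
    static String globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            switch (c) {
                case '*':
                    regex.append(".*");
                    break;
                case '?':
                    regex.append('.');
                    break;
                default:
                    // Quote everything else so that '.' and similar stay literal.
                    regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return regex.toString();
    }

    /** Returns true if the path matches at least one of the glob patterns. */
    static boolean matchesAny(String path, List<String> globs) {
        for (String glob : globs) {
            if (Pattern.matches(globToRegex(glob), path)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> ignore = Arrays.asList("fonts/*.ttf", "icon?.png");
        for (String path : new String[] {"fonts/regular.ttf", "icon1.png", "data/config.json"}) {
            System.out.println(path + " ignored: " + matchesAny(path, ignore));
        }
    }
}
```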
During initialization, the usual file checks are performed, the params are processed, and the names of the dex files are cached in a list for later processing. Now let's look at how the call function finds the unreferenced resources.
@Override
public TaskResult call() throws TaskExecuteException {
    try {
        TaskResult taskResult = TaskResultFactory.factory(type, TaskResultFactory.TASK_RESULT_TYPE_JSON, config);
        long startTime = System.currentTimeMillis();
        // Create a File object for the assets directory
        File assetDir = new File(inputFile, ApkConstants.ASSETS_DIR_NAME);
        // Find all files recursively and store them in assetsPathSet
        findAssetsFile(assetDir);
        // Remove the ignored files from assetsPathSet
        generateAssetsSet(assetDir.getAbsolutePath());
        Log.i(TAG, "find all assets count: %d", assetsPathSet.size());
        // The core step: find the assets referenced in the code (see decodeCode below)
        decodeCode();
        Log.i(TAG, "find reference assets count: %d", assetRefSet.size());
        assetsPathSet.removeAll(assetRefSet);
        JsonArray jsonArray = new JsonArray();
        for (String name : assetsPathSet) {
            jsonArray.add(name);
        }
        ((TaskJsonResult) taskResult).add("unused-assets", jsonArray);
        taskResult.setStartTime(startTime);
        taskResult.setEndTime(System.currentTimeMillis());
        return taskResult;
    } catch (Exception e) {
        throw new TaskExecuteException(e.getMessage(), e);
    }
}

// Find the references to assets in the code
private void decodeCode() throws IOException {
    for (String dexFileName : dexFileNameList) {
        // Load the dex file as a DexBackedDexFile object using classes from the apktool-lib jar
        DexBackedDexFile dexFile = DexFileFactory.loadDexFile(new File(inputFile, dexFileName), Opcodes.forApi(15));
        BaksmaliOptions options = new BaksmaliOptions();
        // Get the sorted list of class definitions
        List<? extends ClassDef> classDefs = Ordering.natural().sortedCopy(dexFile.getClasses());
        for (ClassDef classDef : classDefs) {
            // Disassemble the class into an array of smali lines
            String[] lines = ApkUtil.disassembleClass(classDef, options);
            if (lines != null) {
                // Look for references to asset files
                readSmaliLines(lines);
            }
        }
    }
}

private void readSmaliLines(String[] lines) {
    if (lines == null) {
        return;
    }
    for (String line : lines) {
        line = line.trim();
        // Find the string constants
        if (!Util.isNullOrNil(line) && line.startsWith("const-string")) {
            String[] columns = line.split(",");
            if (columns.length == 2) {
                // Get the asset name from the string constant (strip the quotes)
                String assetFileName = columns[1].trim();
                assetFileName = assetFileName.substring(1, assetFileName.length() - 1);
                if (!Util.isNullOrNil(assetFileName)) {
                    for (String path : assetsPathSet) {
                        // Every asset path that the constant ends with is added to assetRefSet
                        if (assetFileName.endsWith(path)) {
                            assetRefSet.add(path);
                        }
                    }
                }
            }
        }
    }
}
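The matching step can be reproduced on a handful of hand-written smali lines, without apktool. In this sketch, `UnusedAssetsDemo`, `constString` and `unusedAssets` are hypothetical names, and the sample lines are made up. One deliberate difference from the code above: the string literal is extracted by its quotes rather than by splitting on a comma, so constants that themselves contain commas are still read.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch that works on hand-written smali lines.
final class UnusedAssetsDemo {

    /** Returns the literal of a line such as: const-string v0, "fonts/a.ttf" (or null). */
    static String constString(String line) {
        String trimmed = line.trim();
        if (!trimmed.startsWith("const-string")) {
            return null;
        }
        int first = trimmed.indexOf('"');
        int last = trimmed.lastIndexOf('"');
        if (first < 0 || last <= first) {
            return null;
        }
        return trimmed.substring(first + 1, last);
    }

    /** Returns the asset paths that no string constant ends with. */
    static Set<String> unusedAssets(Set<String> assetPaths, List<String> smaliLines) {
        Set<String> unused = new TreeSet<>(assetPaths);
        for (String line : smaliLines) {
            String value = constString(line);
            if (value == null || value.isEmpty()) {
                continue;
            }
            for (String path : assetPaths) {
                if (value.endsWith(path)) {
                    unused.remove(path); // referenced, so not unused
                }
            }
        }
        return unused;
    }

    public static void main(String[] args) {
        Set<String> assets = new HashSet<>(Arrays.asList("fonts/a.ttf", "data/old.json"));
        List<String> smali = Arrays.asList(
                "    const-string v0, \"fonts/a.ttf\"",
                "    invoke-virtual {v1, v0}, Lfoo;->open(Ljava/lang/String;)V",
                "    const-string v1, \"hello, world\"");
        System.out.println("unused assets: " + unusedAssets(assets, smali));
    }
}
```

Like the original, this is a heuristic: an asset whose path is assembled at runtime from several strings never appears as a single constant and would be reported as unused, which is presumably what the ignore list is for.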
After this analysis we know the principle: the string constants in each class are matched against the file names in the assets directory. If a constant matches, the asset is referenced. If nothing matches, it is not. By now you may be curious about the other tasks and have a method for analyzing them yourself. For reasons of space we will not go into more detail, so please feel free to ask any questions in the comments section.
Summary
This issue stops here. Detailed hands-on examples will be added later; after all, real knowledge only comes from practice. For lack of time, 8 more remain uncovered, and we will keep updating in follow-up posts. Thank you for your understanding.