Author: Dapeng
The Cloud Music iOS client is an old project that began in 2013. After nearly 10 years of business development it has grown from a standalone music app into a huge, platform-level app that carries many different businesses. As the business grew, the package became more and more bloated, which hurts the user experience and even the brand's reputation. Before the author started optimizing, the package size of Cloud Music in the App Store had reached 420 MB, so the team started a package size optimization project.
Package size optimization is an old topic in client development, and most iOS developers have a general idea of what to do. But as Apple's platform evolves, some once-feasible measures no longer apply to new versions. This article therefore focuses on some newer practices from our optimization work and on how they were rolled out in a large project. Without further ado, let's begin.
Metrics
Before starting, we need to understand the different metrics for package size and how they relate. Some later optimizations move these metrics in different directions, so the final target metric has to be fixed first. The installation size and download size of the app, for the different device models, can be seen in Apple's developer console.
See the following image for how the download and installation sizes are produced. After upload, Apple applies DRM encryption to the binary (which also increases the package size) and App Thinning. App Thinning tailors the resources and code of the original package to each device model in order to generate a model-specific variant. Apple also produces a universal variant that contains the complete set, but it is rarely useful in practice. DRM and App Thinning are not covered in detail here; there are links at the bottom of the article.
As shown above, there are three metrics before and after upload:
- Original package size: the actual size of the .app bundle extracted from the IPA before upload
- Download size: the size shown in the download-traffic prompt in the App Store
- Installation size: the size shown on the app's detail page in the App Store
After working out how these metrics relate, we chose installation size, the one users perceive most directly, as the core metric and the final optimization goal.
Analysis
With the goal set, we still need to analyze the current situation before optimizing, find the largest contributors, and target them to get the best return. The breakdown of the Cloud Music iOS package is shown below: the resources section (in red) accounts for more than half of the size, and the binary accounts for more than a quarter. Subsequent optimization therefore focuses on resources and the binary.
Resources
There are several ways to deal with resources: cleanup, organization, compression, migration to the cloud, merging, and so on. In short, we try to reduce the local space that resources occupy. The following briefly describes what we did in Cloud Music.
Resource cleanup
Before the overall resource optimization, the first step is to clean up resources that are no longer used, including images, configuration files, audio, and video. The main idea is static detection of whether a resource is referenced, for example checking imageNamed: calls to see whether an image is used. Online detection is more accurate, but it is not necessary for resources. Because Cloud Music is an old project, images are referenced in all sorts of ways: non-standard file names, images outside the Asset Catalog, image names assembled by hand, @2x/@3x suffix problems, and so on. This requires customized lookups to rule out the abnormal cases. Other apps can adjust to their own situation; the idea is the same, and there are tools available online.
After several rounds of cleanup, Cloud Music removed 1200+ images and other files, reducing the original package size by 12+ MB, which is quite considerable.
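The static check described above can be sketched in a few lines of Python. This is a minimal illustration, not the script Cloud Music used: the function names, the extension lists, and the assumption that images are loose files referenced through quoted string literals are all hypothetical.

```python
import re
from pathlib import Path

# Assumed extension lists; adjust to the project.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}
SOURCE_EXTS = {".h", ".m", ".mm", ".swift", ".xib", ".storyboard", ".plist", ".json"}
SCALE_SUFFIX = re.compile(r"@[23]x$")
STRING_LITERAL = re.compile(r'"([^"\n]+)"')


def base_name(file_name: str) -> str:
    """'icon@2x.png' -> 'icon': drop the extension and any @2x/@3x suffix."""
    return SCALE_SUFFIX.sub("", Path(file_name).stem)


def find_unreferenced_images(resource_dir: Path, source_dir: Path) -> list[str]:
    """Return image names that never appear as a string literal in the sources."""
    images = {
        base_name(p.name)
        for p in resource_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    }
    referenced = set()
    for src in source_dir.rglob("*"):
        if src.is_file() and src.suffix.lower() in SOURCE_EXTS:
            text = src.read_text(encoding="utf-8", errors="ignore")
            for literal in STRING_LITERAL.findall(text):
                referenced.add(literal)             # exact match, e.g. "icon"
                referenced.add(base_name(literal))  # "icon@2x.png" or "dir/icon.png"
    return sorted(images - referenced)
```

Names assembled at runtime (for example from a format string) are invisible to this kind of scan, which is why the results still need the customized lookups and manual review mentioned above.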
Resource organization
Assets.car: as we all know, Apple introduced Asset Catalogs in iOS 7 to help developers manage resources, the most important being images. After compilation an Assets.car file is generated and placed in the IPA. As mentioned above, Cloud Music is an old project, so some images were not managed by an Asset Catalog. Using an Asset Catalog benefits package size, so existing resources needed to be migrated. But why does an Asset Catalog reduce package size?
To answer that, we start from how an Asset Catalog works. During Asset Catalog compilation, the images in each image set are first compressed losslessly, and multiple image sets are combined into one large image. After compilation the images can therefore no longer be read from a bundle path; you have to use Apple's imageNamed: API, which extracts the image from the composite by coordinates and other metadata. The advantage is that compression and composition reduce image size. However, we found that not all images benefit: some large images become larger after lossless compression and composition. We guess this is related to composing large images; the smaller the image, the more likely it is to benefit.
In addition, do not put GIFs in an image set: compression and composition corrupt the GIF, so the data read through imageNamed: is incorrect and the animation cannot play. A data set can be used instead, because data sets do not take part in composition or compression. Other resource types can also be stored as data sets and read with NSDataAsset. To see what the Asset Catalog did with each resource, run the following command on the compiled Assets.car:
xcrun --sdk iphoneos assetutil --info Assets.car
As the figure above shows, resources of type Data are not compressed, while for type Image the specific compression algorithm is listed together with some image information. We also found that Apple uses different compression algorithms for different images and stores multiple compressed copies. This is why the IPA grew after we moved some resources from the bundle into the Asset Catalog. That does not matter, because the installation size still decreases: the greatest benefit of an Asset Catalog comes from App Thinning, which delivers a device-specific Assets.car. For example, the 1x, 2x, and 3x variants are all compiled in, but each device receives only the one it uses. So even though the IPA gets bigger, the installed app gets smaller.
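To get an overview instead of reading the dump entry by entry, the output of the command can be aggregated. The sketch below assumes that `assetutil --info` prints a JSON array and that each asset entry carries `AssetType`, `Compression`, and `SizeOnDisk` keys; these key names come from commonly observed output rather than from this article, so verify them against your own dump. The function name is hypothetical.

```python
import json
from collections import defaultdict


def summarize_assets(info_json: str) -> dict[tuple[str, str], int]:
    """Sum SizeOnDisk per (AssetType, Compression) pair."""
    totals: dict[tuple[str, str], int] = defaultdict(int)
    for entry in json.loads(info_json):
        # Entries without an AssetType (such as a catalog header) are skipped.
        asset_type = entry.get("AssetType")
        if asset_type is None:
            continue
        key = (asset_type, entry.get("Compression", "n/a"))  # assumed key name
        totals[key] += int(entry.get("SizeOnDisk", 0))       # assumed key name
    return dict(totals)
```

A possible workflow is to redirect the command's output to a file, read it, and pass the text to `summarize_assets`; large totals for a single compression type point at the images worth inspecting first.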
Finally, because there is no way to compare the size of each image before and after Asset Catalog compilation, we recommend putting small images (under 5 KB) and images with multiple variants (2x, 3x) into the Asset Catalog. Other resources are better stored separately rather than as data sets, because separate storage makes it easier to apply various compression methods without worrying about Apple's processing. This point is discussed in more detail in the resource compression section below.
With this optimization, the Cloud Music iOS client migrated 2400+ images of various sizes, reducing the installation size by 22+ MB.
Resource compression
Resource compression is easy to understand: as the name implies, it compresses resources in various ways. The most important resources in Cloud Music are images; other types make up a small share. The common image formats are PNG, APNG, and WebP, and the vast majority of images in the Cloud Music package use them. After the previous work, almost all images are managed in the Asset Catalog, and as mentioned above Apple losslessly compresses Asset Catalog images itself. Lossless compression that we apply beforehand therefore has no effect, because Apple compresses again and its result is final. To gain anything from compression, the image itself has to get substantially smaller, which means lossy compression.
For conventional image formats we use pngquant, TinyPNG, and similar tools. With pngquant, after testing on a large sample, we settled on a lossy ratio of 80%, because that is the highest point on the yield curve with little impact on image quality. The curve may differ between projects because their resources differ, so each project needs to gather its own data: try different compression rates with a script and record the results to form a curve.
In addition, our package contained many large animated WebP files that cannot be compressed in the usual ways. After some investigation we found that Google provides webpmux, which can split an animated WebP into frames and reassemble it. Based on it we wrote a script that splits, compresses, and reassembles animated WebP files frame by frame.
Finally, we integrated the compression of all common image formats into one large script that compresses every image in the package. This script is also useful for preventing regressions later.
In total we compressed 5000+ PNG images of all sizes, 100+ APNG animations, and 100+ animated WebP files, for a gain of 42+ MB (original package size).
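The "try different compression rates and record a curve" step can be sketched as follows. This is an illustration under assumptions, not the team's script: the function names are hypothetical, the article's "lossy ratio" is interpreted here as pngquant's `--quality` upper bound, and the compressor is injectable so the sweep can be exercised without pngquant installed.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Iterable

Compressor = Callable[[Path, Path, int], None]


def pngquant_compress(src: Path, dst: Path, quality: int) -> None:
    """Compress one PNG with pngquant; keep the original if pngquant declines."""
    result = subprocess.run(
        ["pngquant", f"--quality=0-{quality}", "--force", "--output", str(dst), str(src)],
        check=False,
    )
    if result.returncode != 0 or not dst.exists():
        shutil.copyfile(src, dst)


def sweep_quality(
    images: Iterable[Path],
    qualities: Iterable[int],
    compress: Compressor = pngquant_compress,
) -> dict[int, float]:
    """Return {quality: compressed_total / original_total} for each setting."""
    images = list(images)
    original_total = sum(p.stat().st_size for p in images)
    if original_total == 0:
        raise ValueError("no image data to compress")
    curve: dict[int, float] = {}
    for quality in qualities:
        with tempfile.TemporaryDirectory() as tmp:
            total = 0
            for index, src in enumerate(images):
                dst = Path(tmp) / f"{index}.png"
                compress(src, dst, quality)
                total += dst.stat().st_size
        curve[quality] = total / original_total
    return curve
```

Plotting the returned ratios against the quality values gives the size side of the curve; the impact on image quality still has to be judged separately, for example by visual review of a sample.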
Resource Cloud Migration
After cleanup, organization, and compression, the resources section was still large, so we started a dedicated project to move large resources to the cloud. We targeted large resources because they yield the most. After discussion, and considering Cloud Music's actual situation, we set the baseline at 50 KB: anything above 50 KB counts as a large resource. We did consider a solution in which all resources are migrated and downloaded in a unified way, but weighing experience and cost we chose to handle the high-ROI part in the traditional way. After screening, 150+ resources in the Cloud Music package qualified as large, and more than 85% of them could be moved to the cloud. As for whether a resource should be local or remote, we worked out usage rules for images and animations together with the designers, while purely technical resources were judged by engineers.
When the migration was complete, 100+ large resources had been moved, about 31+ MB in total (original package size).
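The screening step can be approximated with a short scan over the unpacked bundle. A minimal sketch, assuming the 50 KB baseline means 50 × 1024 bytes; the function and constant names are hypothetical.

```python
from pathlib import Path

LARGE_RESOURCE_BYTES = 50 * 1024  # the 50 KB baseline, assumed to be binary kilobytes


def find_large_resources(
    bundle_dir: Path, threshold: int = LARGE_RESOURCE_BYTES
) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for files above the threshold, largest first."""
    hits = [
        (p.relative_to(bundle_dir).as_posix(), p.stat().st_size)
        for p in bundle_dir.rglob("*")
        if p.is_file() and p.stat().st_size > threshold
    ]
    return sorted(hits, key=lambda item: (-item[1], item[0]))
```

The scan also lists the executable and other non-resource files, so the output is a candidate list for review rather than a migration list.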
Resource merging
Resource merging has two main parts. One is removing near-duplicate images: we spent some effort running a similar-image analysis algorithm over all of Cloud Music's image resources, and the result did not match our expectations. There are in fact few similar images, icons included, so this part brought no gain. The other is merging Asset Catalogs, which brings no benefit for Cloud Music either, mainly because its resources are already managed centrally.
Binary
Each app is eventually compiled into a main binary, with all static library dependencies linked in. The size of this part is mainly determined by the amount of code and the build parameters, so the optimization ideas below focus on reducing the amount of code and tuning build parameters.
Dead code detection
To reduce the amount of code, the first step is to clean up useless code. Which code is useless? That requires useless-code detection, which is generally divided into online dynamic detection and offline static detection. Dynamic detection is much more accurate than static detection, and the compiler already supports some static stripping, such as dead code stripping. We therefore adopted the more accurate online dynamic detection based on large-scale data. Its only drawback is that it has to run in production and takes a long time to collect data.
The initial idea was to hook the class initialization method +initialize to determine whether a class is used, but this has several problems. The first is timing: because we sample through A/B testing, the hook can only be enabled at some point after A/B initialization, so classes initialized earlier cannot be recorded. The alternative is to record for all users and sample only at upload time, but that affects users outside the gray release. The second is when +initialize itself is called: it is not invoked for every class. In Objective-C, however, each class has its own metadata, and a flag bit in that metadata stores whether the class has been initialized. This flag is not affected by any other factor and is set whenever the class is initialized.
struct objc_class : objc_object {
    bool isInitialized() {
        return getMeta()->data()->flags & RW_INITIALIZED;
    }
};
The app cannot call this method directly, because it is internal to the Objective-C runtime. The RW_INITIALIZED flag data still exists, though, so we can use existing interfaces plus knowledge of the runtime source to mimic the code above and read the flag to determine whether a class has been initialized. The code is as follows:
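#import <objc/runtime.h>

// Values mirror the private 64-bit runtime layout and may change between OS versions.
#define FAST_DATA_MASK 0x00007ffffffffff8UL
#define RW_INITIALIZED (1 << 29)

- (BOOL)isUsedClass:(NSString *)cls {
    Class metaCls = objc_getMetaClass(cls.UTF8String);
    if (metaCls) {
        // The data bits sit 32 bytes into objc_class, after isa, superclass, and cache.
        uint64_t *bits = (__bridge void *)metaCls + 32;
        // Masking the bits yields the class_rw_t pointer; its first field is the flags word.
        uint32_t *data = (uint32_t *)(*bits & FAST_DATA_MASK);
        if ((*data & RW_INITIALIZED) > 0) {
            return YES;
        }
    }
    return NO;
}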
With the code above we can tell whether a class has been used and report that information. Analyzing the aggregated reports shows which classes can be cleaned up. In this way we identified thousands of unused classes, but not all of them can actually be removed: some belong to features still under A/B testing, some are embedded ahead of time, and so on. The results therefore have to be filtered again by the business teams. In the end we processed 1200+ classes and removed 300+, for a gain of about 2+ MB (all figures in the binary chapter are original package sizes). The remaining classes are being handled as a long-term effort.
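On the analysis side, the core of the aggregation is a set difference: every class in the binary minus every class that any sampled device reported as initialized. A minimal sketch; the function name and the report format (one collection of class names per device) are assumptions, since the article does not describe its reporting pipeline.

```python
from typing import Iterable


def unused_classes(
    all_classes: Iterable[str], device_reports: Iterable[Iterable[str]]
) -> list[str]:
    """Classes that no sampled device ever reported as initialized."""
    seen: set[str] = set()
    for report in device_reports:
        seen.update(report)
    return sorted(set(all_classes) - seen)
```

The result is only a candidate list: as noted above, classes behind A/B experiments or embedded ahead of time still need review by the business teams.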
Removing second- and third-party libraries
Based on the unused-class data above, cluster analysis reveals business components and second- or third-party libraries that are no longer in use. During optimization we identified several such libraries that could be removed, for a gain of 4+ MB.
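One simple way to approximate this step is to group the unused classes by the library that owns them and look for libraries whose classes are all unused. The sketch below uses plain grouping as a stand-in for the cluster analysis mentioned above; the function names and the class-to-library mapping (which could come from a link map, for example) are assumptions.

```python
from collections import defaultdict


def library_usage(
    class_to_library: dict[str, str], unused: set[str]
) -> list[tuple[str, int, int]]:
    """Return (library, unused_count, total_count), most-unused libraries first."""
    totals: dict[str, int] = defaultdict(int)
    dead: dict[str, int] = defaultdict(int)
    for cls, library in class_to_library.items():
        totals[library] += 1
        if cls in unused:
            dead[library] += 1
    rows = [(lib, dead[lib], total) for lib, total in totals.items()]
    return sorted(rows, key=lambda row: (-row[1] / row[2], row[0]))


def removable_libraries(class_to_library: dict[str, str], unused: set[str]) -> list[str]:
    """Libraries in which every known class is unused."""
    return [lib for lib, dead, total in library_usage(class_to_library, unused) if dead == total]
```

Libraries with a high but not complete unused ratio are worth a second look too, since they may be kept alive by a single small feature.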
Dynamic library dependency clipping
In addition to business code, Cloud Music also depends on some dynamic libraries, and for historical reasons some static dependencies of these dynamic libraries are duplicated, as shown in the figure below:
This is an extreme case: a copy of the OpenSSL symbols exists in the main program, in dynamic library A, and in dynamic library B, which duplicates code and wastes binary size. The best solution is to convert the dynamic libraries into static libraries linked into the main program, remove the original dependencies, and use the symbols in the main binary. This also improves startup speed to some extent, because fewer dynamic libraries are loaded. Solving problems like this gained 3+ MB in total.
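Duplication of this kind can be spotted by comparing the defined symbols of each image. The sketch below assumes symbol listings in the three-column form produced by a tool such as `nm -gU` (address, type, name), captured as text per image; the function names and image labels are hypothetical.

```python
from collections import defaultdict


def defined_symbols(nm_output: str) -> set[str]:
    """Parse '<address> <type> <name>' lines into a set of symbol names."""
    symbols = set()
    for line in nm_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3:  # skip headers and blank lines
            symbols.add(parts[2])
    return symbols


def duplicated_symbols(images: dict[str, str]) -> dict[str, list[str]]:
    """Map each symbol defined in more than one image to the images that define it."""
    owners: dict[str, list[str]] = defaultdict(list)
    for image, nm_output in images.items():
        for symbol in defined_symbols(nm_output):
            owners[symbol].append(image)
    return {
        symbol: sorted(names)
        for symbol, names in sorted(owners.items())
        if len(names) > 1
    }
```

Grouping the duplicated symbols by prefix (for example `_SSL_`) then shows which dependency is carried more than once.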
Compiler optimization
With the code trimmed in various ways, it is time to optimize the other factor that affects binary size: build parameters. There are many of them, divided into compile-time and link-time parameters. Below I go through basically all the parameters that affect binary size, for your reference.
Asset Catalog Compiler Optimization
Asset Catalog compiler optimization reduces the size of the generated Assets.car. Cloud Music enabled it only for the main project, not for the components. After this optimization, the gain was a little under 2.1 MB.
EXPORTED_SYMBOLS_FILE
An app can be regarded as one large "dynamic library": when the user taps to open it, the system loads it like one. A dynamic library always carries exported symbols, but an app's symbols are not called from anywhere else in iOS; far more often the app calls into system services. So the exported symbols can be trimmed. Fortunately the build system provides EXPORTED_SYMBOLS_FILE to restrict exported symbols and reduce binary size. The setting points to a text file that lists the symbols to keep, and everything not declared there is trimmed.
The following figure shows the trimmed symbol section after enabling it.
Note that if the app uses Firebase, the exports cannot be trimmed completely; otherwise Firebase fails to start and crash information cannot be collected. Firebase relies on the __mh_execute_header symbol in the Export Info above, so add __mh_execute_header to the text file mentioned above and it will be retained when trimming.
This is a link-time optimization and only needs to be enabled for the main project. After Cloud Music enabled it, the gain was 2.4 MB.
Link-Time Optimization
LTO mainly provides cross-file dead code elimination, removal of logic that can never execute, and inlining, where a function body is copied directly into its call sites to reduce call depth and improve execution efficiency and stack usage. See the official LLVM documentation for details, which are not repeated here.
Our tests also showed that LTO only helps statically dispatched code. Objective-C is a dynamic language: any method may be invoked dynamically at runtime, so none of them can be stripped. This is also why, when linking a static library, a C library that looks huge contributes only the small part that is actually used, whereas an Objective-C library is linked in almost completely. If your app's source contains a lot of C or C++ code, you may benefit more from LTO.
Although the name suggests a link-time optimization, LTO also has to be enabled at compile time, otherwise it has no effect. This is because cross-file optimization needs information that is produced at compile time and consumed at link time.
After enabling LTO, the gain for Cloud Music was 1 MB.
GCC_OPTIMIZATION_LEVEL
This setting applies more aggressive compiler optimizations to produce smaller binaries. Xcode defaults to O0 for Debug and Os for Release, but Oz can also be used to achieve a smaller size.
In fact, Oz works in the opposite direction to LTO: outlining rather than inlining. Oz extracts repeated instruction sequences into shared functions, which shrinks the code but makes the call stack deeper and reduces execution efficiency; as the figure above shows, it becomes "slow". It is essentially a trade-off between time and space. If you want to enable it, refer to Douyin's experience: they ran into some objc_retainAutoreleaseReturnValue problems. We have not encountered them in practice so far, but for stability reasons Oz is not yet enabled in the Cloud Music production build. It is enabled in the DEBUG environment for testing and is under continued observation. If enabled, the estimated gain from our tests is around 10+ MB.
Other compiler optimizations
- Enable C++ Exceptions and Enable Objective-C Exceptions: disabling these reduces binary size but affects try/catch. Use as appropriate; in Cloud Music they are disabled
- Architectures: the instruction-set architectures to build for. Check whether the configured architectures include instruction sets you do not need
- Strip Symbols: symbol stripping, not expanded here; the relevant settings are listed below
  - Strip Linked Product = YES
  - Strip Style = All Symbols. Note: this setting does not take effect if Strip Linked Product is not enabled
  - Note: Deployment Postprocessing is set to YES by default here, no matter how it is set
- Symbols Hidden by Default = YES, sets symbol visibility
- Make Strings Read-Only = YES
- Dead Code Stripping = YES, strips code that static analysis at build time determines to be unused
- Optimization Level: generally None for Debug and Os for Release
Binary summary
Beyond the binary optimizations above, the industry uses many other measures that Cloud Music did not adopt, for various reasons. One example is renaming the __TEXT segment to bypass Apple's DRM encryption and so reduce binary size; Apple became aware of the problem after iOS 13 and has solved it to a certain extent, so this method is basically dead. Binary compression is not used yet either, from a risk and benefit point of view. There is also making properties dynamic, mainly for model classes with a large number of properties: the getter/setter methods are added at runtime, saving the size of those methods. The estimated gain is very small, so it is not used. There are many optimization methods, but for a specific app you can choose the most appropriate ones for your situation; they do not have to be the same, since ROI always has to be considered.
Preventing regressions
In the course of optimization we found that the project also deteriorated quickly, at as much as 40% to 50% of each iteration's optimization gain. In other words, if an iteration optimized 10 MB, the same iteration regressed by about 4.5 MB, so anti-regression work has to start alongside the cleanup. We have developed a set of anti-regression measures; some are already live and the rest are still under development. So far the results are good: size growth has been effectively contained and the optimization gains have been maintained.
- Large-resource checkpoint: resources are checked when code is merged, and the check is enforced
- Second- and third-party library checkpoint: second- and third-party libraries are checked when code is merged, covering both additions and upgrades
- Automatic compression: resources are compressed automatically; remote hosting remains the first choice, with local storage only when necessary
- Regular resource detection: resources across the whole app are scanned regularly and automatically, and problems are traced
- Regular code inspection: code across the whole app is inspected regularly and automatically, and unused code is removed
- Guidelines developed with UED for image and animation resources: which may be local and which must be remote, plus optimization of animation resources
Results
After a period of optimization, the installation size of Cloud Music decreased by 87 MB, from 420+ MB to 330+ MB. The difference is clearly noticeable. The download size decreased by 65 MB to 160+ MB, dropping below Apple's 200 MB OTA limit.
References
- What is app thinning?
- Asset Catalog Format Reference
- pngquant
- webpmux
- Code Size Performance Guidelines
- From Exported Symbols applied to package size optimization to symbol binding
- LLVM Link Time Optimization
- Reducing Code Size Using Outlining
- Interprocedural MIR-level outlining pass
This article is published by the NetEase Cloud Music technology team. Unauthorized reprinting in any form is prohibited. We recruit for technical positions all year round. If you are ready to change jobs and you like Cloud Music, join us!