1. Changes to iOS App size limits

1. Changes to the App download size limit

  • In September 2013, after the official release of iOS 7, Apple raised the size limit for apps downloaded from the App Store over cellular networks from 50 MB to 100 MB.

  • In September 2017, with the official release of iOS 11, the limit was raised from 100 MB to 150 MB, and in late May 2019 it was quietly relaxed from 150 MB to 200 MB.

  • In September 2019, with the official release of iOS 13, Apple effectively lifted the cellular download size limit: as long as the user has enough data allowance, apps of any size can be downloaded over cellular.

  • The download size is the size of the compressed app package (.ipa file). The install size currently shown in the App Store is the size of that package after decompression.

2. Changes to the executable file size limit

  • Under Apple's App Store review requirements, the size of the executable file uploaded to the App Store is limited. The size in question is not the size of the whole binary (Mach-O) file but the size of the __TEXT portion of that file.

  • Before iOS 7 (that is, for apps whose minimum supported version is earlier than iOS 7.0), the sum of all __TEXT parts of the binary must not exceed 80 MB.

  • From iOS 7.x to iOS 8.x, the __TEXT portion of each architecture slice in the binary must not exceed 60 MB.

    • An architecture slice is the part of a fat binary that targets one specific architecture. For example, a fat binary might contain slices for both 32-bit and 64-bit architectures.
  • From iOS 9.0 onward, the sum of all __TEXT parts of the binary must not exceed 500 MB. See Apple's "Maximum build file sizes" documentation for details.

  • As of April 1, 2020, almost all iOS apps support iOS 9 or later as their minimum version. For example, the WeChat, Meituan, and Meituan Takeout iOS apps require at least iOS 10, while the Alipay, Taobao, Didi, Douyin, and Kuaishou iOS apps require at least iOS 9.
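The three __TEXT limits above can be written down as a small check. This is only a sketch: the function name `text_limit_violations` and its inputs are hypothetical, and it assumes 1 MB means 1,000,000 bytes, which the text does not specify.

```python
MB = 1000 * 1000  # assumption: decimal megabytes; the article does not say which unit applies

def text_limit_violations(min_ios_major, text_bytes_per_slice):
    """Return a list of violation messages (empty if the build is within limits).

    min_ios_major: the app's minimum supported major iOS version.
    text_bytes_per_slice: dict mapping an architecture name to its __TEXT size in bytes.
    """
    total = sum(text_bytes_per_slice.values())
    problems = []
    if min_ios_major < 7:
        # Before iOS 7: 80 MB for the sum of all __TEXT parts
        if total > 80 * MB:
            problems.append(f"total __TEXT is {total} bytes, limit is 80 MB")
    elif min_ios_major < 9:
        # iOS 7.x to 8.x: 60 MB per architecture slice
        for arch, size in text_bytes_per_slice.items():
            if size > 60 * MB:
                problems.append(f"{arch} __TEXT is {size} bytes, limit is 60 MB per slice")
    else:
        # iOS 9.0 and later: 500 MB for the sum of all __TEXT parts
        if total > 500 * MB:
            problems.append(f"total __TEXT is {total} bytes, limit is 500 MB")
    return problems
```

For example, a 70 MB arm64 slice violates the per-slice rule when the minimum version is iOS 8, but passes when the minimum version is iOS 9.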

3. Summary

  • With the spread of 4G and the arrival of 5G, the cost of mobile data has dropped sharply, Apple has relaxed its limits on app size, and iOS users are more willing to upgrade the system. iOS developers can therefore worry less about package size (for example, exceeding the __TEXT limit of the binary is rarely a concern any more), give priority to business iteration, and slim the package down later when manpower allows.
  • If you want a higher-quality app that stands out among competitors, you still need to put real effort into package size. In general, the highest ROI comes from cleaning up unused resources (mostly images), followed by reducing the binary size. The binary can be reduced either by tuning compiler options or by removing unused classes, functions, and code blocks.
  • Many package-slimming blog posts are already available online, so this article focuses on the knowledge surrounding Mach-O files: the Mach-O format itself, analysis tools, Link Map files, and so on.

2. Introduction to Mach-O files

1. Overview

  • Mach-O (Mach Object) is the executable file format on macOS and iOS, comparable to the Portable Executable (PE) format on Windows and ELF (Executable and Linking Format) on Linux.
  • Mach-O files fall into the following five categories:
    • Executable: the executable file of an application
    • Dylib: a dynamically linked library (called a DSO or DLL on other platforms)
    • Static Library: a statically linked library
    • Bundle: a dylib that cannot be linked against and can only be loaded at runtime with dlopen(); used as a plug-in on macOS
    • Relocatable Object File: a relocatable object file

2. Composition of Mach-O files

The Mach-O file consists of three parts: Header, Load Commands, and Data.

  • Header: specifies basic information such as the CPU architecture, byte order (big- or little-endian), file type, and number of load commands. The header is used to verify that the Mach-O file is valid and to determine the environment it is meant to run in. For 64-bit architectures, the header structure is defined as follows:

    struct mach_header_64 {
        uint32_t      magic;       /* Mach magic number; quickly tells whether the file is 64-bit or 32-bit */
        cpu_type_t    cputype;     /* CPU specifier: the CPU type, such as ARM */
        cpu_subtype_t cpusubtype;  /* machine specifier: the specific type, such as arm64 or armv7 */
        uint32_t      filetype;    /* type of file */
        uint32_t      ncmds;       /* number of load commands */
        uint32_t      sizeofcmds;  /* the size of all the load commands */
        uint32_t      flags;       /* flags */
        uint32_t      reserved;    /* reserved */
    };

    filetype can take the following values:

    • OBJECT: .o files or .a files;
    • EXECUTE: the main executable found after unpacking the IPA;
    • DYLIB: .dylib or .framework files;
    • DYLINKER: the dynamic linker;
    • DSYM: a file that stores symbol information used to analyze crash reports.
  • Load Commands: describe the type, name, and location within the binary of each command in the Mach-O file. For 64-bit architectures, the segment load command structure is defined as follows:

    struct segment_command_64 { /* for 64-bit architectures */
        uint32_t  cmd;          /* type of load command; LC_SEGMENT_64 maps a 64-bit segment of the file into the address space of the process */
        uint32_t  cmdsize;      /* includes sizeof section_64 structs */
        char      segname[16];  /* segment name */
        uint64_t  vmaddr;       /* memory address of this segment */
        uint64_t  vmsize;       /* memory size of this segment */
        uint64_t  fileoff;      /* file offset of this segment */
        uint64_t  filesize;     /* amount to map from the file */
        vm_prot_t maxprot;      /* maximum VM protection */
        vm_prot_t initprot;     /* initial VM protection */
        uint32_t  nsects;       /* number of sections in segment */
        uint32_t  flags;        /* flags */
    };
    • Load commands tell the loader how to handle the binary data. Some are handled by the kernel, others by the dynamic linker. LC_SEGMENT_64 and LC_SEGMENT are the main load commands; they instruct the kernel how to set up the memory space of the process.
  • Data: consists of the contents of the segments. It is the largest part of a Mach-O file and holds both code and data, such as the symbol tables. Data has three main segments: __TEXT (executable code and other read-only data), __DATA (writable program data), and __LINKEDIT (symbols and other tables used by the linker).

    • __TEXT and __DATA each contain one or more sections. __LINKEDIT has no sections; the LC_SYMTAB load command is needed to locate and parse its symbol table and string table. Together these segments hold the main data of a Mach-O file.

    • For 64-bit architectures, the Section structure is defined as follows:

      struct section_64 { /* for 64-bit architectures */
          char      sectname[16];  /* name of this section, such as __text or __stubs */
          char      segname[16];   /* segment this section belongs to, such as __TEXT */
          uint64_t  addr;          /* memory address of this section */
          uint64_t  size;          /* size in bytes of this section */
          uint32_t  offset;        /* file offset of this section */
          uint32_t  align;         /* section alignment (power of 2) */
          uint32_t  reloff;        /* file offset of relocation entries */
          uint32_t  nreloc;        /* number of relocation entries */
          uint32_t  flags;         /* flags (section type and attributes) */
          uint32_t  reserved1;     /* reserved (for offset or index) */
          uint32_t  reserved2;     /* reserved (for count or sizeof) */
          uint32_t  reserved3;     /* reserved */
      };
    • Note: uppercase __TEXT denotes a segment, while lowercase __text denotes a section.
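To make the Header / Load Commands / Data layout concrete, here is a minimal Python sketch that reads the three structures above from raw bytes. It assumes a thin (non-fat), 64-bit, little-endian Mach-O; the function name `parse_macho64` and the returned dictionary keys are hypothetical choices for illustration, and only LC_SEGMENT_64 commands are decoded.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF   # magic of a 64-bit, little-endian Mach-O
LC_SEGMENT_64 = 0x19       # load command type for a 64-bit segment

def _name(raw):
    """Decode a NUL-padded 16-byte segment or section name."""
    return raw.split(b"\0", 1)[0].decode("ascii")

def parse_macho64(data):
    """Return (header, segments) for a thin 64-bit little-endian Mach-O."""
    # struct mach_header_64: eight 32-bit fields, 32 bytes in total
    (magic, cputype, cpusubtype, filetype,
     ncmds, sizeofcmds, flags, _reserved) = struct.unpack_from("<IiiIIIII", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin 64-bit little-endian Mach-O")
    header = {"cputype": cputype, "cpusubtype": cpusubtype, "filetype": filetype,
              "ncmds": ncmds, "sizeofcmds": sizeofcmds, "flags": flags}

    segments = []
    offset = 32  # load commands start right after the header
    for _ in range(ncmds):
        # every load command begins with its type and its total size
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmd == LC_SEGMENT_64:
            # rest of struct segment_command_64 (72 bytes including cmd/cmdsize)
            (segname, vmaddr, vmsize, fileoff, filesize,
             _maxprot, _initprot, nsects, _segflags) = struct.unpack_from(
                "<16sQQQQiiII", data, offset + 8)
            sections = []
            for i in range(nsects):
                # struct section_64 is 80 bytes; only the first four fields are read
                sectname, _seg, addr, size = struct.unpack_from(
                    "<16s16sQQ", data, offset + 72 + 80 * i)
                sections.append((_name(sectname), addr, size))
            segments.append({"name": _name(segname), "vmaddr": vmaddr,
                             "vmsize": vmsize, "fileoff": fileoff,
                             "filesize": filesize, "sections": sections})
        offset += cmdsize
    return header, segments
```

On a Mac, something like `parse_macho64(open(path, "rb").read())` could be tried on a thin arm64 executable; a fat file has to be split into a single architecture first.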

3. FatFile / FatBinary

  • A FatFile (FatBinary, or fat binary) is a collection of Mach-O artifacts compiled for different architectures. A Mach-O built for one architecture can only run on machines or simulators of that architecture, so a collection is needed to support several architectures at once.
  • Architecture here means the CPU instruction set. iOS devices use ARM processors, which support two kinds of instruction sets: 32-bit ARM (armv7, armv7s) and 64-bit ARM (arm64, arm64e).
  • In addition, Intel-based Mac processors use the i386 (32-bit) and x86_64 (64-bit) instruction sets. The iOS simulator on these Macs does not run ARM instructions, so apps that run in the simulator must be built for i386 or x86_64.
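As an illustration of how the slices of a fat binary are laid out, the sketch below reads the fat header (stored big-endian) and lists or extracts the slices. The function names `fat_slices` and `thin` are hypothetical, and the name table only covers CPU families; distinguishing armv7 from armv7s, or arm64 from arm64e, would additionally require decoding `cpusubtype`, which is left out here.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # fat header fields are stored big-endian
CPU_NAMES = {           # CPU family only; subtypes (armv7s, arm64e, ...) are not decoded
    7: "i386",
    0x01000007: "x86_64",
    12: "arm",
    0x0100000C: "arm64",
}

def fat_slices(data):
    """Return one dict per architecture slice found in a fat binary."""
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat binary")
    slices = []
    for i in range(nfat_arch):
        # struct fat_arch: cputype, cpusubtype, offset, size, align (20 bytes)
        cputype, cpusubtype, offset, size, _align = struct.unpack_from(
            ">iiIII", data, 8 + 20 * i)
        slices.append({"arch": CPU_NAMES.get(cputype, hex(cputype)),
                       "cpusubtype": cpusubtype, "offset": offset, "size": size})
    return slices

def thin(data, arch):
    """Return the bytes of the first slice whose CPU family matches arch."""
    for s in fat_slices(data):
        if s["arch"] == arch:
            return data[s["offset"]:s["offset"] + s["size"]]
    raise KeyError(arch)
```

Each extracted slice is itself a complete single-architecture Mach-O (or archive), which is why a slice can be handed to tools that do not understand fat files.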

How XNU loads an App's Mach-O

  • iOS, which runs on the ARM architecture, can be divided into four layers from top to bottom: the user experience layer (SpringBoard, Spotlight, etc.), the application framework layer (Cocoa and other frameworks), the core framework layer (Metal and other core graphics and media frameworks), and the Darwin layer (the operating system core, including the XNU kernel, drivers, etc.).

  • XNU is the core of Darwin and consists mainly of Mach, BSD, and IOKit. Mach is the inner ring and handles the most basic work of the operating system: process and thread abstraction, processor scheduling, IPC (interprocess communication), the message mechanism, virtual memory management, memory protection, and so on. The BSD layer can be seen as an outer ring around Mach, providing process management, file systems, networking, and similar functions.

  • The binary file of an App is loaded by the iOS kernel, XNU. Loading means creating a new process for the Mach-O, creating a virtual memory space, parsing the Mach-O file, and finally mapping it into that memory space.

  • The loading process can be summarized as follows: fork a new process; allocate memory for the Mach-O; parse the Mach-O; read the Mach-O header; traverse the load commands and map the Mach-O into memory; start dyld (which runs in user mode, outside XNU, and loads the dynamic libraries, among other things).

    In Mach, everything is implemented through its own objects: processes, threads, and virtual memory are all "objects". Mach objects cannot call each other directly and communicate only through message passing. Messages, the most basic concept in Mach, are sent between two ports; this is the core of Mach IPC (interprocess communication).
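The "traverse the load commands and map the Mach-O into memory" step of the loading process can be pictured with a toy model. This is not how the kernel is implemented (XNU maps pages with the protections given by each segment rather than copying bytes); the function `map_segments`, the dictionary-based "memory", and the `slide` parameter are all hypothetical names used only to show how `fileoff`/`filesize` relate to `vmaddr`/`vmsize`.

```python
def map_segments(file_bytes, segments, slide=0):
    """Toy model of segment mapping.

    segments: list of dicts with vmaddr, vmsize, fileoff, filesize
              (the fields of segment_command_64).
    slide:    a constant added to every vmaddr, standing in for a load-address offset.
    Returns {start_address: bytearray} with one region per segment.
    """
    memory = {}
    for seg in segments:
        if seg["filesize"] > seg["vmsize"]:
            raise ValueError("segment maps more file bytes than its memory size")
        # the region is vmsize bytes long and starts out zero-filled
        region = bytearray(seg["vmsize"])
        # the first filesize bytes come from the file, starting at fileoff
        chunk = file_bytes[seg["fileoff"]:seg["fileoff"] + seg["filesize"]]
        region[:len(chunk)] = chunk
        memory[seg["vmaddr"] + slide] = region
    return memory
```

The point of the model is that a segment may occupy more memory than file space: whatever lies beyond `filesize` is simply zero-filled.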

Basic commands for analyzing Mach-O

1. The lipo command

  • A tool for managing fat files: it can list CPU architectures, extract a specific architecture, and merge or split library files.

  • Common usage:

    # List the CPU architectures contained in a fat binary
    lipo -info xxxx.a/xxxx.framework/xxxx

    # Extract the binary for one CPU architecture (cpu_type: armv7s, arm64, etc.) from a fat binary
    lipo xxxx.a -thin cpu_type -output xx_cpu_type

    # Merge several binaries into one fat file
    lipo -create xxxx1 xxxx2 -output xxxxfat

    # Remove one CPU architecture (cpu_type: armv7s, arm64, etc.) from a fat file
    lipo -remove cpu_type xxxx -output xxxx

2. The ar command

  • Commonly used to create and modify libraries, and to extract individual modules from a library.

  • Running ar directly on a .a file may fail with the error: xx.a is a fat file (use libtool(1) or lipo(1) and ar(1) on it).

  • This happens because the .a file is a fat binary that contains several CPU architectures. Use lipo first to extract the binary for one specific CPU architecture, as follows:

    # Split out the binary for the arm64 architecture
    lipo xx.a -thin arm64 -output xx_arm64.a
    # Unpack the .a file
    ar -x xx_arm64.a
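To show what ar is unpacking, here is a small sketch that lists the members of a thin .a file by reading the archive format directly. It assumes the common ar layout (an 8-byte `!<arch>` magic followed by 60-byte member headers) and handles the BSD `#1/<length>` convention for long names; the function name `ar_members` is hypothetical.

```python
AR_MAGIC = b"!<arch>\n"

def ar_members(data):
    """Return [(member_name, payload_size)] for a thin ar archive."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive (a fat .a must be thinned with lipo first)")
    members = []
    pos = len(AR_MAGIC)
    while pos + 60 <= len(data):
        header = data[pos:pos + 60]
        name = header[0:16].decode("ascii").rstrip()   # space-padded name field
        size = int(header[48:58].decode("ascii"))      # decimal size of the member data
        body_start = pos + 60
        if name.startswith("#1/"):
            # BSD long name: the real name is stored at the start of the member data
            name_len = int(name[3:])
            raw_name = data[body_start:body_start + name_len]
            name = raw_name.rstrip(b"\0").decode("utf-8")
            payload = size - name_len
        else:
            name = name.rstrip("/")                    # GNU-style names end with '/'
            payload = size
        members.append((name, payload))
        pos = body_start + size + (size % 2)           # member data is padded to an even length
    return members
```

The explicit magic check also mirrors the error discussed above: a fat .a starts with the fat header rather than `!<arch>`, so it is rejected until a single architecture has been extracted.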

3. The nm command

  • Displays the symbol table of a binary object file.

  • Common usage:

    # Get the program symbol table in Mach-O
    nm path
    # List all symbols of the target file, sorted by address (-n) and shown in Mach-O format (-m)
    nm -nm path
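What nm prints comes from the symbol table and string table located by the LC_SYMTAB load command. The sketch below decodes 64-bit symbol entries (`struct nlist_64`, 16 bytes each) from raw bytes; it assumes little-endian data and that `symoff`, `nsyms`, and `stroff` have already been read from LC_SYMTAB. The function name `read_symbols` is hypothetical.

```python
import struct

N_EXT = 0x01  # bit in n_type that marks an external (global) symbol

def read_symbols(data, symoff, nsyms, stroff):
    """Return [(address, name, is_external)] sorted by address, similar to `nm -n`."""
    symbols = []
    for i in range(nsyms):
        # struct nlist_64: n_strx (4), n_type (1), n_sect (1), n_desc (2), n_value (8)
        n_strx, n_type, _n_sect, _n_desc, n_value = struct.unpack_from(
            "<IBBHQ", data, symoff + 16 * i)
        # n_strx is the offset of a NUL-terminated name inside the string table
        start = stroff + n_strx
        end = data.index(b"\0", start)
        name = data[start:end].decode("utf-8", "replace")
        symbols.append((n_value, name, bool(n_type & N_EXT)))
    return sorted(symbols)
```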

4. The grep command

  • Used to check whether a string is present.

  • Common usage:

    # Check whether the string XXX is present
    grep -r "XXX" path

Introduction to otool

otool (the object file displaying tool) is a powerful tool for analyzing Mach-O files; it interprets and displays the specified object files or library files in the requested way. (otool is normally installed together with Xcode.)

1. View the Mach-O header

otool -h app_name.app/app_name
  • The header information includes magic, cputype, cpusubtype, caps, filetype, ncmds, sizeofcmds, and flags.

2. View the Mach-O load commands

otool -l app_name.app/app_name
  • The output mainly covers the type, name, and location within the binary of each load command in the Mach-O file.

3. Check the dynamic libraries that Mach-O depends on

otool -L app_name.app/app_name
  • Dynamic library information includes dynamic library name, current version number, and compatible version number

4. Check the encryption information of the Mach-O file

otool -l app_name.app/app_name | grep crypt
  • In the command output, the value of cryptid is either 0 (unencrypted) or 1 (encrypted).
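The cryptid value shown by otool lives in the LC_ENCRYPTION_INFO (32-bit) or LC_ENCRYPTION_INFO_64 (64-bit) load command. As a sketch, the same value can be read by walking the load commands directly; this assumes a thin 64-bit little-endian Mach-O, and the function name `read_cryptid` is hypothetical.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
LC_ENCRYPTION_INFO = 0x21
LC_ENCRYPTION_INFO_64 = 0x2C

def read_cryptid(data):
    """Return cryptid (0 or 1), or None if the file has no encryption load command."""
    magic, _cputype, _cpusubtype, _filetype, ncmds = struct.unpack_from("<IiiII", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin 64-bit little-endian Mach-O")
    offset = 32  # sizeof(struct mach_header_64)
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmd in (LC_ENCRYPTION_INFO, LC_ENCRYPTION_INFO_64):
            # after cmd and cmdsize come cryptoff, cryptsize, cryptid
            _cryptoff, _cryptsize, cryptid = struct.unpack_from("<III", data, offset + 8)
            return cryptid
        offset += cmdsize
    return None
```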

5. List all classes and all referenced classes in the Mach-O file

# Get the addresses of all classes
otool -v -s __DATA __objc_classlist app_name.app/app_name
# Get the addresses of all referenced classes
otool -v -s __DATA __objc_classrefs app_name.app/app_name
  • Taking the difference between the two results and then symbolicating it yields the classes that are never referenced. Note, however, that an unreferenced class is not the same as an unused class: a class that is only used dynamically (through dynamic calls, etc.) can be mistaken for an unused one.
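The "difference between the two results" is a plain set difference over class addresses. The sketch below works on the raw bytes of the two sections, assuming each holds an array of 8-byte little-endian pointers that can be compared directly (binaries whose pointers are encoded, for example by newer fixup formats, would need extra decoding first). The function names are hypothetical.

```python
import struct

def read_pointers(section_bytes):
    """Interpret a section's bytes as an array of 8-byte little-endian pointers."""
    count = len(section_bytes) // 8
    return set(struct.unpack_from(f"<{count}Q", section_bytes, 0))

def unreferenced_class_addresses(classlist_bytes, classrefs_bytes):
    """Addresses present in __objc_classlist but absent from __objc_classrefs."""
    return sorted(read_pointers(classlist_bytes) - read_pointers(classrefs_bytes))
```

The resulting addresses still need to be symbolicated to obtain class names, and, as noted above, they are only candidates: a class reached solely through dynamic calls will also show up here.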

6. Extension: MachOView tool

  • Besides otool, the MachOView tool (a Mach-O file browser) makes it easy to view the header, load commands, and other information in a Mach-O file.
  • The MachOView window has RAW and RVA options in the upper-left corner.
    • RAW is the absolute offset of a byte from the beginning of the file; the file header starts at address 0x000.
    • RVA is the offset relative to a base address, that is, the absolute file offset plus a base address; the file header starts at that base address.

Introduction to class-dump

1. Overview

  • class-dump dumps the class information of a Mach-O file. It relies on the runtime features of Objective-C to extract the header information stored in the Mach-O file and generate the corresponding .h files.
  • class-dump is also commonly used in reverse engineering.

2. Download and install

  • Download the latest DMG file from the class-dump download page.
  • Open the DMG file and copy class-dump to a directory such as $HOME/custom-tool/bin.
  • Open ~/.bash_profile with vi ~/.bash_profile, add the line export PATH=$HOME/custom-tool/bin/:$PATH at the top of the file, then save and exit.
  • Run source ~/.bash_profile.
  • class-dump is now available on the command line.

3. Usage

  • Obtain the IPA file, change its extension to .zip, decompress it, and take the .app file from the Payload folder.

  • Note that apps downloaded from the App Store are encrypted: the executable is wrapped in a protective shell. class-dump does not work directly on such files, so the shell must first be removed (the binary decrypted) by other means.

  • Put the .app file in a directory of your choice, go to that directory, and run the following command:

    # Export the header files from the Mach-O file
    class-dump -H <Mach-O file path> -o <directory for storing header files>
    • -H generates header files
    • -o specifies the directory for storing the header files
  • Supplementary commands for counting files and folders:

    # Count the files in a directory, including subdirectories
    ls -lR|grep "^-"|wc -l
    
    # Count the folders in a directory, including subfolders
    ls -lR|grep "^d"|wc -l

Link Map File

1. Overview

  • After the compilation stage, each source file produces a corresponding .o file (object file). In the linking stage, the .o files and dynamic libraries are linked together to produce the final executable.
  • The Link Map is an intermediate product of the iOS build process. It records the layout of the binary, including the path of the executable, the CPU architecture, object files, symbols, and other information.
  • The Link Map File helps you understand how memory is divided into segments and sections, and analyze how much space each class or library occupies in the executable (useful for App slimming).
  • To generate it, set Project -> Build Settings -> Write Link Map File to YES; the Link Map File is then produced after each build. You can also set Path to Link Map File to specify where it is stored.

2. Important components of the Link Map File

  • Path & Arch: Path is the path to the executable and Arch is the architecture type.

    # Path: /Users/xxx/Library/Developer/Xcode/DerivedData/.... /app_name.app/app_name
    # Arch: arm64
  • Object Files: the paths and file numbers of the link units (.o files and dylibs) used to generate the binary. Each file number corresponds to a specific object file (typically one class); the Symbols section below refers to these numbers.

    # Object files:
    [ 0] linker synthesized
    [ 1] /Users/xxxx/Library/Developer/Xcode/DerivedData/.... /AppDelegate.o
    [ 2] /Users/xxxx/Library/Developer/Xcode/DerivedData/.... /main.o
    # ...
  • Sections: records the address range of each segment/section in Mach-O. Mach-O has three main kinds of segments (__TEXT, __DATA, and __LINKEDIT); each segment is divided into sections, and different sections store different information:

    • __TEXT contains the Mach header, executable code, and read-only constants (such as C strings); it is read-only and executable.
    • __DATA contains global variables, static variables, etc.; it is readable and writable.
    • __LINKEDIT contains metadata for the loader, such as the names and addresses of functions; it is read-only.
    # The first column is the start address of the section, the second is the size it occupies, the third is the segment, and the fourth is the section.
    # Sections:
    # Address Size Segment Section
    0x100002780 0x0129617D  __TEXT  __text
    0x1012988FE 0x000015E4  __TEXT  __stubs
    # ...
  • Symbols: records the address range of each symbol, in order.

    # Symbols: (the __text code area)
    # Address Size File Name
    0x100002780 0x00000450  [  2] -[UIButton(SSEdgeInsets) setImageUpTitleDownWithSpacing:]
    0x100002BD0 0x00000070  [  2] _UIEdgeInsetsMake
    # ...
    • From the Address you can tell which region a symbol falls in, for example the __text section of the __TEXT segment (code), the __objc_methname section of the __TEXT segment (method names), or the __objc_classlist section of the __DATA segment (all classes).
    • From the Address you can also look up the corresponding method Name in the symbol table (the longer the method name, the more space it ends up using).
    • From the File number you can find which class the code belongs to.
    • Each value stored in the __objc_classlist section is a pointer to the virtual address of a class.

3. Uses

  • Analyze the size of classes and libraries in the binary: in the Symbols section, add up the sizes of the symbols that share the same file number to get the size of a class; add up the classes that belong to one library to get the size of that library. An off-the-shelf analysis tool for this is LinkMap.
  • Find unreferenced classes: compare __objc_classname (all class names) with __objc_classrefs (referenced classes) to find unreferenced classes (an unreferenced class is not necessarily an unused class).
  • Find unreferenced methods: compare __objc_methname (all methods) with __objc_selrefs (referenced methods) to find unreferenced methods (an unreferenced method is not necessarily an unused method).
  • Link Map files have many more uses.
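The first use above (adding up symbol sizes per file number) can be sketched in a few lines of Python. The parser assumes the Link Map layout shown earlier: a "# Object files:" part with "[ n] path" lines and a "# Symbols:" part whose rows are address, size, "[ n]", and name. The function name `size_per_object` and the regular expressions are hypothetical; this is not the LinkMap tool mentioned above.

```python
import re
from collections import defaultdict

OBJECT_RE = re.compile(r"^\[\s*(\d+)\]\s+(.+)$")
SYMBOL_RE = re.compile(r"^(0x[0-9A-Fa-f]+)\s+(0x[0-9A-Fa-f]+)\s+\[\s*(\d+)\]\s+(.*)$")

def size_per_object(linkmap_text):
    """Return {object file path: total bytes of its symbols}."""
    files = {}                 # file number -> path
    sizes = defaultdict(int)   # file number -> summed symbol size
    part = None
    for line in linkmap_text.splitlines():
        # track which part of the Link Map the current line belongs to
        if line.startswith("# Object files:"):
            part = "objects"
        elif line.startswith("# Sections:"):
            part = "sections"
        elif line.startswith("# Symbols:"):
            part = "symbols"
        elif line.startswith("# Dead Stripped Symbols:"):
            part = "dead"      # dead-stripped symbols are not in the binary, so skip them
        elif part == "objects" and (m := OBJECT_RE.match(line)):
            files[int(m.group(1))] = m.group(2)
        elif part == "symbols" and (m := SYMBOL_RE.match(line)):
            sizes[int(m.group(3))] += int(m.group(2), 16)
    return {files.get(n, f"[{n}]"): total for n, total in sizes.items()}
```

Grouping the resulting paths by library or framework name would then give per-library totals in the same way.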
