Welcome to the iOS Reverse series.
- iOS reverse: RSA theory
- iOS reverse: hash theory
- iOS reverse: application re-signing + WeChat re-signing in practice
- iOS reverse: shell scripts + scripted re-signing
- iOS reverse: code injection + hooking
- iOS reverse: MachO files
- iOS reverse: the dyld process
- iOS reverse: how fishhook hooks work
- iOS reverse: LLDB debugging
Preface
MachO has come up several times earlier in the iOS Reverse series; this article walks through it in detail
Tool: MachOView (download password: y2wr)
I. A first look at MachO
1. Definition
MachO is short for Mach Object file format. It is the executable file format on macOS and iOS, similar to the Portable Executable (PE) format on Windows and the Executable and Linkable Format (ELF) on Linux
It is a file format for executables, object code, and dynamic libraries. As a replacement for the a.out format, MachO offers greater extensibility
2. Common MachO files
- Object files: .o
- Library files
  - .a
  - .dylib
  - .framework
- Executable files
- dyld (the dynamic linker)
- .dSYM (symbol table; generated when building in the Release configuration)
3. Check the file type
$ file xxx.xx
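As a rough illustration of what `file` looks at first, the sketch below classifies a file by its first four bytes. The names `classify` and `classify_path` are made up for this example, and the real `file` tool performs many more checks (for instance, Java class files also start with the bytes `cafebabe`).

```python
# First four bytes as stored on disk -> a short description.
MAGICS = {
    bytes.fromhex("cafebabe"): "universal (fat) binary",
    bytes.fromhex("cafebabf"): "universal (fat) binary, 64-bit offsets",
    bytes.fromhex("feedface"): "32-bit Mach-O, big-endian",
    bytes.fromhex("cefaedfe"): "32-bit Mach-O, little-endian",
    bytes.fromhex("feedfacf"): "64-bit Mach-O, big-endian",
    bytes.fromhex("cffaedfe"): "64-bit Mach-O, little-endian",
}


def classify(first_bytes: bytes) -> str:
    """Describe a file given (at least) its first four bytes."""
    return MAGICS.get(first_bytes[:4], "not a Mach-O file")


def classify_path(path: str) -> str:
    """Read the magic number from a file on disk and classify it."""
    with open(path, "rb") as f:
        return classify(f.read(4))
```

An arm64 executable built for a real device would be expected to fall into the "64-bit Mach-O, little-endian" case.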
II. Architectures
1. Architecture table
Different iPhone models use different CPU architectures
Architecture | Devices |
---|---|
i386 | 32-bit simulator |
x86_64 | 64-bit simulator |
armv7 | iPhone 4, iPhone 4s |
armv7s | iPhone 5, iPhone 5c |
arm64 | iPhone 5s to iPhone X |
arm64e | iPhone XS, iPhone XS Max, iPhone XR, iPhone 11… |
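2. Generate multiple architectures
Create a new project, run it on a real device, and inspect the executable: it contains only the arm64 architecture
To get more architectures, set the deployment target to iOS 9.0 and build in the Release configuration
Why iOS 9.0? Because 32-bit armv7 and armv7s devices such as the iPhone 5c cannot run iOS 11.0, so a project that targets iOS 11 or later does not build those architectures
Why the Release configuration? Because by default Xcode's Debug configuration builds only the active architecture
How do I generate all architectures? In Xcode 10 the default setting contains only armv7 and arm64; the others (such as armv7s) have to be added under Architectures in Build Settings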
III. Universal binaries
1. Definition
A universal binary is also called a fat binary.
- A binary format introduced by Apple in which one file contains native code for several architectures
- It gives optimal performance on each supported architecture from a single package
- Because it stores code for several architectures, a universal binary is generally larger than a single-architecture binary
- However, because the architectures share the common non-executable resources, it is not twice the size of a single-architecture version
- And because only the code for the current architecture is used at run time, no extra memory is needed to run it
2. Split/merge architectures
Splitting architectures
Merging architectures
Universal binary
Why is the result not simply 1+1=2?
Because code cannot be shared between architectures (the same machine-code bytes can mean different things on different CPUs), whereas common resource files are shared
The methods above can be used to slim down our app
Conclusion:
① Splitting a fat binary and merging the pieces again reproduces the original fat binary
② The size of the universal binary may be greater than, smaller than, or equal to the sum of the single-architecture sizes, depending on how many resource files are shared
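3. Terminal commands
List the architectures in a file:
$ lipo -info xxx
Extract a single architecture:
$ lipo xxx -thin armv7 -output xxx_armv7
Merge architectures:
$ lipo -create xxx_armv7 xxx_arm64 -output xxx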
IV. MachO file structure
1. Overall structure
Opening the universal binary with MachOView shows that it consists of a Fat Header and four executables
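Executable file: Header + Load commands + Data
Universal binary: Fat Header + one executable (Header + Load commands + Data) per architecture
- Header: the byte order, architecture, file type, and number of load commands
- Load commands: a table describing where the segments, the symbol table, the dynamic symbol table, and so on are located
- Data: the largest part of the file, holding the actual contents such as the __TEXT and __DATA segments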
The MachO file structure can also be inspected from the command line with otool; the -f option prints the fat headers
$ otool -f universe
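To make the fat header more concrete, here is a minimal sketch of reading it by hand, assuming the classic layout with 32-bit offsets: a big-endian `fat_header` (magic, number of architectures) followed by one 20-byte `fat_arch` record per architecture. The helper `parse_fat`, the `ARCH_NAMES` table, and the `DEMO` bytes are invented for illustration; the demo is synthetic data, not a real binary.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # fat_header.magic, stored big-endian on disk

# (cputype, cpusubtype) -> name, limited to the architectures in the table above.
ARCH_NAMES = {
    (7, 3): "i386",
    (0x01000007, 3): "x86_64",
    (12, 9): "armv7",
    (12, 11): "armv7s",
    (0x0100000C, 0): "arm64",
    (0x0100000C, 2): "arm64e",
}


def parse_fat(data: bytes):
    """Return (arch name, file offset, size) for every slice of a fat binary."""
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a universal binary with 32-bit offsets")
    slices = []
    for i in range(nfat_arch):
        # fat_arch: cputype, cpusubtype, offset, size, align (all 32-bit)
        cputype, cpusubtype, offset, size, _align = struct.unpack_from(
            ">5I", data, 8 + 20 * i
        )
        key = (cputype, cpusubtype & 0x00FFFFFF)  # drop capability bits
        slices.append((ARCH_NAMES.get(key, "unknown"), offset, size))
    return slices


# Synthetic header describing two slices (armv7 and arm64), for demonstration only.
DEMO = (
    struct.pack(">II", FAT_MAGIC, 2)
    + struct.pack(">5I", 12, 9, 16384, 1000, 14)
    + struct.pack(">5I", 0x0100000C, 0, 32768, 2000, 14)
)

for name, offset, size in parse_fat(DEMO):
    print(name, offset, size)
```

Each slice found this way is itself a complete MachO file that starts at the given offset.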
2. Header
The header contains the byte order, architecture type, number of load commands, and so on of the binary file. It makes it possible to quickly check whether the file is 32-bit or 64-bit, which processor it targets, and what type of file it is
In Xcode, press Shift+Command+O and search for loader.h to see the definition:
struct mach_header_64 {
    uint32_t      magic;       /* magic number; quickly identifies 64-bit vs 32-bit */
    cpu_type_t    cputype;     /* CPU type, such as ARM */
    cpu_subtype_t cpusubtype;  /* specific CPU subtype, such as arm64 or armv7 */
    uint32_t      filetype;    /* file type, such as executable */
    uint32_t      ncmds;       /* number of load commands */
    uint32_t      sizeofcmds;  /* total size of all load commands */
    uint32_t      flags;       /* flags describing features of the binary, mainly related to loading and linking */
    uint32_t      reserved;    /* reserved */
};
mach_header_64 (64-bit) has only one field more than mach_header (32-bit): reserved
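The struct maps directly onto eight 32-bit values, so it can be decoded in a few lines. The sketch below assumes a thin, little-endian 64-bit file (the usual case on arm64 and x86_64); `parse_header64`, the partial `FILETYPES` table, and the `DEMO_HEADER` bytes are illustrative names and made-up data, not part of any Apple API.

```python
import struct
from collections import namedtuple

MachHeader64 = namedtuple(
    "MachHeader64",
    "magic cputype cpusubtype filetype ncmds sizeofcmds flags reserved",
)

MH_MAGIC_64 = 0xFEEDFACF
HEADER_FMT = "<8I"  # eight little-endian 32-bit fields, 32 bytes in total

# A few filetype values from loader.h (not the full list).
FILETYPES = {1: "MH_OBJECT", 2: "MH_EXECUTE", 6: "MH_DYLIB", 7: "MH_DYLINKER", 10: "MH_DSYM"}


def parse_header64(data: bytes) -> MachHeader64:
    """Decode the first 32 bytes of a little-endian 64-bit MachO file."""
    header = MachHeader64._make(struct.unpack_from(HEADER_FMT, data, 0))
    if header.magic != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit MachO header")
    return header


# Synthetic arm64 executable header with invented counts, for demonstration only.
DEMO_HEADER = struct.pack(HEADER_FMT, MH_MAGIC_64, 0x0100000C, 0, 2, 22, 2800, 0x00200085, 0)

h = parse_header64(DEMO_HEADER)
print(FILETYPES.get(h.filetype, "other"), h.ncmds, h.sizeofcmds)
```

For the 32-bit mach_header the same approach would use seven fields instead of eight, since only the reserved field is missing.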
3. Load commands
The load commands form a table that records where each region is located, along with the symbol table, the dynamic symbol table, and so on. They hold the details of the loading instructions and tell the loader how to load the MachO file. Looking at the addresses shows that the load commands sit immediately after the header
Name | Description |
---|---|
LC_SEGMENT_64 | Maps a segment of the file into the process address space (LC_SEGMENT is the 32-bit counterpart) |
LC_DYLD_INFO_ONLY | Dynamic linking information |
LC_SYMTAB | Location of the symbol table |
LC_DYSYMTAB | Location of the dynamic symbol table |
LC_LOAD_DYLINKER | Path of the dynamic linker to use |
LC_UUID | Unique identifier (UUID) of the binary |
LC_VERSION_MIN_MACOSX | Minimum supported operating system version |
LC_SOURCE_VERSION | Source code version |
LC_MAIN | Entry point of the main thread and its stack size |
LC_LOAD_DYLIB | Path of a dependent library, including third-party libraries |
LC_FUNCTION_STARTS | Table of function start addresses |
LC_CODE_SIGNATURE | Code signature |
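Every load command starts with the same two 32-bit fields, `cmd` and `cmdsize`, so the table can be walked without understanding each command in detail. The following is a sketch under the same assumptions as before (thin, little-endian, 64-bit file with a 32-byte header); `list_load_commands`, the partial `LC_NAMES` table, and the `DEMO` bytes are hypothetical and only meant to show the traversal.

```python
import struct

HEADER_SIZE = 32  # sizeof(struct mach_header_64)

# A subset of the command values from loader.h.
LC_NAMES = {
    0x2: "LC_SYMTAB",
    0xB: "LC_DYSYMTAB",
    0xC: "LC_LOAD_DYLIB",
    0xE: "LC_LOAD_DYLINKER",
    0x19: "LC_SEGMENT_64",
    0x1B: "LC_UUID",
    0x1D: "LC_CODE_SIGNATURE",
    0x26: "LC_FUNCTION_STARTS",
    0x2A: "LC_SOURCE_VERSION",
    0x80000022: "LC_DYLD_INFO_ONLY",
    0x80000028: "LC_MAIN",
}


def list_load_commands(data: bytes):
    """Return (name, cmdsize) for each load command following the header."""
    ncmds, _sizeofcmds = struct.unpack_from("<II", data, 16)  # header fields 5 and 6
    offset = HEADER_SIZE  # load commands start right after the header
    commands = []
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command")
        commands.append((LC_NAMES.get(cmd, hex(cmd)), cmdsize))
        offset += cmdsize  # cmdsize covers the whole command, including cmd/cmdsize
    return commands


# Synthetic file: header + LC_UUID (8 + 16 bytes) + LC_SYMTAB (8 + 16 bytes).
DEMO = (
    struct.pack("<8I", 0xFEEDFACF, 0x0100000C, 0, 2, 2, 48, 0, 0)
    + struct.pack("<II", 0x1B, 24) + bytes(16)
    + struct.pack("<II", 0x2, 24) + bytes(16)
)

print(list_load_commands(DEMO))
```

Stepping by `cmdsize` rather than by a fixed size is what lets unknown commands be skipped safely.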
4. Data
Data is the largest part of the MachO file; its __TEXT and __DATA segments reveal a lot of information
There is still a lot of free space between the load commands and the data, which leaves a gap that can be used for code injection
__TEXT segment
Name | Role |
---|---|
__text | Main program code |
__stubs, __stub_helper | Dynamic linking |
__objc_methname | Method names |
__objc_classname | Class names |
__objc_methtype | Method types (v@:) |
__cstring | Static string constants |
__DATA segment
Name | Role |
---|---|
__got => Non-Lazy Symbol Pointers | Non-lazy symbol pointer table |
__la_symbol_ptr => Lazy Symbol Pointers | Lazy symbol pointer table |
__objc_classlist | Class list |
... | ... |
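The sections in these tables are described inside the corresponding LC_SEGMENT_64 load command: a 72-byte `segment_command_64` followed by one 80-byte `section_64` record per section. The sketch below decodes such a command, assuming little-endian data; `sections_of` and the synthetic `DEMO_SEGMENT` (with invented addresses and sizes) are illustrative only.

```python
import struct

SEGMENT_FMT = "<II16sQQQQiiII"  # segment_command_64, 72 bytes
SECTION_FMT = "<16s16sQQ8I"     # section_64, 80 bytes


def _name(raw: bytes) -> str:
    """Names are fixed 16-byte fields padded with NUL bytes."""
    return raw.split(b"\0", 1)[0].decode("ascii")


def _pad16(text: str) -> bytes:
    return text.encode("ascii").ljust(16, b"\0")


def sections_of(segment_cmd: bytes):
    """Return (segment name, [(section name, size), ...]) for one LC_SEGMENT_64."""
    fields = struct.unpack_from(SEGMENT_FMT, segment_cmd, 0)
    segname, nsects = _name(fields[2]), fields[9]
    sections = []
    for i in range(nsects):
        sect = struct.unpack_from(SECTION_FMT, segment_cmd, 72 + 80 * i)
        sections.append((_name(sect[0]), sect[3]))  # sectname, size
    return segname, sections


# Synthetic __TEXT segment with two sections; all numbers are made up.
DEMO_SEGMENT = (
    struct.pack(SEGMENT_FMT, 0x19, 72 + 2 * 80, _pad16("__TEXT"),
                0x100000000, 0x8000, 0, 0x8000, 5, 5, 2, 0)
    + struct.pack(SECTION_FMT, _pad16("__text"), _pad16("__TEXT"),
                  0x100004000, 0x1234, 0x4000, 2, 0, 0, 0, 0, 0, 0)
    + struct.pack(SECTION_FMT, _pad16("__cstring"), _pad16("__TEXT"),
                  0x100005234, 0x80, 0x5234, 0, 0, 0, 0, 0, 0, 0)
)

print(sections_of(DEMO_SEGMENT))
```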
V. dyld
dyld (the dynamic link editor) is Apple's dynamic linker and an important part of Apple's operating systems. After the system has prepared the program, dyld is responsible for the remaining work
Because the functions of the system libraries are public and stored in the shared cache, dyld binds their symbols to the shared cache when the MachO calls system functions. Symbols are stripped automatically in Release builds, which is why we often need to restore the symbol table when using crash-collection tools
Afterword
A general understanding of the content above is enough