MachO file
MachO is short for Mach Object file format. It is the executable file format on macOS and iOS, similar to the PE (Portable Executable) format on Windows. Common MachO files on macOS and iOS include .o, .a, .dylib, Framework, dyld, and .dSYM files.
The MachO file format is as follows:
Header
: contains general information about the binary file: byte order, architecture type, number of load commands, processor, and file type.
Load commands
: a table with many entries, including the locations of regions, the symbol table, the dynamic symbol table, and so on.
Load command | Description |
---|---|
LC_SEGMENT_64 | Maps a segment of the file into the process address space (LC_SEGMENT is the 32-bit counterpart) |
LC_DYLD_INFO_ONLY | Dynamic linking information |
LC_SYMTAB | Symbol table address |
LC_DYSYMTAB | Dynamic symbol table address |
LC_LOAD_DYLINKER | Which dynamic linker (dyld) to load |
LC_UUID | UUID of the file |
LC_VERSION_MIN_MACOSX | Minimum supported operating system version |
LC_SOURCE_VERSION | Source code version |
LC_MAIN | Program entry point address and stack size |
LC_LOAD_DYLIB | Paths of the dependent libraries, including third-party libraries |
LC_FUNCTION_STARTS | Function start address table |
LC_CODE_SIGNATURE | Code signature |
Data
: Mainly stores the code and data. MachO organizes data in Segments. A Segment can contain zero or more Sections, and each Segment is mapped according to its corresponding Load command. A Section in a Segment can be read as code, constants, or other data types. When the file is loaded into memory, memory mapping is also done per Segment;
Introduction to MachO files
Let’s take a look at the MachO file generated by the above code:
Header
Magic Number
: whether the file is 32-bit or 64-bit;
CPU Type
: the current CPU type; here arm64;
File Type
: the current file type; here an executable file, MH_EXECUTE;
Number of Load Commands
: the number of Load Commands to be loaded;
Size of Load Commands
: the total size of the Load Commands;
Flags
: flag bits;
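To make the field list concrete, here is a minimal sketch that decodes these header fields from raw bytes. It assumes the 64-bit header layout (eight consecutive 4-byte fields stored little-endian). The names MachHeader64, readUInt32LE, and parseHeader, as well as the sample bytes, are hypothetical and chosen only for illustration.

```swift
// Hypothetical model of the 64-bit MachO header: eight consecutive UInt32 fields.
struct MachHeader64 {
    let magic: UInt32
    let cpuType: UInt32
    let cpuSubtype: UInt32
    let fileType: UInt32
    let numberOfLoadCommands: UInt32
    let sizeOfLoadCommands: UInt32
    let flags: UInt32
    let reserved: UInt32
}

let magic64: UInt32 = 0xFEEDFACF        // MH_MAGIC_64
let cpuTypeARM64: UInt32 = 0x0100000C   // CPU_TYPE_ARM64
let fileTypeExecute: UInt32 = 0x2       // MH_EXECUTE

// Reads 4 bytes at `offset` as a little-endian UInt32.
func readUInt32LE(_ bytes: [UInt8], at offset: Int) -> UInt32 {
    var value: UInt32 = 0
    for i in 0..<4 {
        value |= UInt32(bytes[offset + i]) << (8 * i)
    }
    return value
}

// Returns nil when the buffer is too short or the magic number is not the 64-bit one.
func parseHeader(_ bytes: [UInt8]) -> MachHeader64? {
    guard bytes.count >= 32 else { return nil }
    let f = (0..<8).map { readUInt32LE(bytes, at: $0 * 4) }
    guard f[0] == magic64 else { return nil }
    return MachHeader64(magic: f[0], cpuType: f[1], cpuSubtype: f[2], fileType: f[3],
                        numberOfLoadCommands: f[4], sizeOfLoadCommands: f[5],
                        flags: f[6], reserved: f[7])
}

// Made-up sample header; only magic, CPU type, and file type mirror the values named above.
let sampleHeader: [UInt8] = [
    0xCF, 0xFA, 0xED, 0xFE,  // magic
    0x0C, 0x00, 0x00, 0x01,  // CPU type: arm64
    0x00, 0x00, 0x00, 0x00,  // CPU subtype
    0x02, 0x00, 0x00, 0x00,  // file type: MH_EXECUTE
    0x16, 0x00, 0x00, 0x00,  // number of load commands (arbitrary)
    0x00, 0x0B, 0x00, 0x00,  // size of load commands (arbitrary)
    0x00, 0x00, 0x00, 0x00,  // flags (arbitrary)
    0x00, 0x00, 0x00, 0x00,  // reserved
]

if let header = parseHeader(sampleHeader) {
    print(header.cpuType == cpuTypeARM64, header.fileType == fileTypeExecute)
}
```

Reading the fields by hand keeps the sketch independent of any system headers; on Apple platforms the same layout is available through the system's own MachO declarations.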
Binary instructions for code
A hard-coded string
__objc_classlist records the OC classes
__swift5_types records the Swift classes and structs
So __swift5_types holds the address of the Descriptor of the Swift class (TargetClassDescriptor); by computing from that address we can locate the methods of the class;
Verify function address
- Because the data is stored in little-endian mode, the bytes must be read in reverse order. The value read is 0xFFFFFB8C; adding the offset 0xBB8C gives 0x10000B718.
- Because virtual memory addresses start from 0x1000...., we subtract the virtual memory base address to get 0xB718, and then look up 0xB718 in the MachO file:
  0xB718 is where the TargetClassDescriptor structure of our Teacher class starts. According to the definition of this structure, offsetting by twelve 4-byte fields brings us to size and the vTable:
  size is followed by the vTable, so the first address in the vTable is 0xB740 plus 12 bytes, which is exactly 0xB74C. This is the address of our teach function in the MachO file;
- Use the image list command to get the ASLR base address of the image, then add 0xB74C to get the address of teach in memory:
  **0x0000000102188000** + 0xB74C = 0x10219374C
  This is the memory address of our teach function while the program is running;
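The arithmetic in the steps above can be replayed as a short sketch. All constants are the values quoted in the text; the on-disk byte order of the stored value and the helper names (littleEndianUInt32, hex) are assumptions made for illustration, and the virtual base 0x100000000 is inferred from 0x10000B718 minus 0xB718.

```swift
// Reads 4 bytes as a little-endian UInt32, i.e. the "reversed" reading described above.
func littleEndianUInt32(_ bytes: [UInt8]) -> UInt32 {
    precondition(bytes.count == 4)
    return bytes.reversed().reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
}

// Formats a value the way the addresses are written in the text.
func hex(_ value: UInt64) -> String {
    return "0x" + String(value, radix: 16, uppercase: true)
}

// Step 1: value stored at file offset 0xBB8C (assumed on-disk bytes: 8C FB FF FF).
let entryFileOffset: UInt64 = 0xBB8C
let storedValue = UInt64(littleEndianUInt32([0x8C, 0xFB, 0xFF, 0xFF]))  // 0xFFFFFB8C
let descriptorVirtualAddress = storedValue + entryFileOffset            // 0x10000B718

// Step 2: subtract the virtual base to get the offset inside the MachO file.
let virtualBase: UInt64 = 0x1_0000_0000  // inferred, see lead-in
let descriptorFileOffset = descriptorVirtualAddress - virtualBase       // 0xB718

// Step 3: skip twelve 4-byte fields to reach size, then 4 more bytes to reach the vTable.
let vtableSizeOffset = descriptorFileOffset + 12 * 4                    // 0xB748
let firstVTableEntryOffset = vtableSizeOffset + 4                       // 0xB74C

// Step 4: add the load address reported by image list.
let imageLoadAddress: UInt64 = 0x0000_0001_0218_8000
let teachDescriptorAddress = imageLoadAddress + firstVTableEntryOffset

print(hex(descriptorFileOffset))     // 0xB718
print(hex(firstVTableEntryOffset))   // 0xB74C
print(hex(teachDescriptorAddress))   // 0x10219374C
```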
The data structure of a Swift method in memory is as follows:
So the 0x10219374C we calculated is the first address of the data structure of our teach method. To find imp, we still need to skip Flags (4 bytes). Note that Impl in this structure is a relative pointer that stores an offset, so we then add that offset to find imp;
Flags takes 4 bytes, and in the MachO file we can see that the second four bytes at 0xB74C, namely 0xFFFFAEF0, are the offset. The address of teach in memory is therefore: 0x10219374C + 0x4 + 0xFFFFAEF0 = 0x20218E640; subtracting the base 0x1000.... from 0x20218E640 gives 0x10218E640. (The offset is a signed 32-bit value, so 0xFFFFAEF0 really means -0x5110; adding it as an unsigned number overshoots by 0x100000000, which is why the subtraction gives the right result.)
- Through the above steps, we verified that the methods of a Swift class are indeed stored in the vTable.
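The last step, resolving imp from the method's data structure, can be sketched as follows. It assumes the layout described above (4 bytes of Flags followed by a 4-byte relative offset) and treats the offset as a signed 32-bit value relative to the address of the offset field itself. The function name resolveRelativePointer is hypothetical.

```swift
// Resolves a 32-bit relative pointer: the target is the address of the field
// plus the signed offset stored in that field.
func resolveRelativePointer(fieldAddress: UInt64, rawOffset: UInt32) -> UInt64 {
    let signedOffset = Int64(Int32(bitPattern: rawOffset))
    return UInt64(Int64(fieldAddress) + signedOffset)
}

let teachDescriptorAddress: UInt64 = 0x10219374C   // start of the method structure
let implFieldAddress = teachDescriptorAddress + 4  // skip Flags (4 bytes)
let rawOffset: UInt32 = 0xFFFFAEF0                 // value read from the MachO file

// Signed interpretation: 0xFFFFAEF0 is -0x5110.
let imp = resolveRelativePointer(fieldAddress: implFieldAddress, rawOffset: rawOffset)

// Unsigned interpretation, as in the text: overshoots by 0x100000000.
let unsignedSum = implFieldAddress + UInt64(rawOffset)   // 0x20218E640
let corrected = unsignedSum - 0x1_0000_0000              // 0x10218E640

print("0x" + String(imp, radix: 16, uppercase: true))    // 0x10218E640
print(imp == corrected)                                  // true
```

Both routes land on the same address; the signed version simply avoids the extra subtraction.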
So why are the methods called in assembly at metadata offsets 0x50, 0x58, and 0x60?
The answer is in initClassVTable():
While the program is running, a vtableOffset is obtained, and the vTable is then loaded into the metadata at vtableOffset, so each method sits at a fixed offset from the metadata;
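As a rough illustration of where 0x50, 0x58, and 0x60 come from, the sketch below computes the byte offset of each vTable slot from a vtableOffset counted in pointer-sized words. The value 10 is an assumption inferred from 0x50 / 8 on a 64-bit target, not a number taken from the source, and vtableSlotByteOffset is a hypothetical helper.

```swift
let pointerSize = 8            // bytes per pointer on a 64-bit target
let vtableOffsetInWords = 10   // assumption: 0x50 / 8, inferred from the assembly offsets

// Byte offset, from the start of the metadata, of the vTable slot for a given method index.
func vtableSlotByteOffset(methodIndex: Int) -> Int {
    return (vtableOffsetInWords + methodIndex) * pointerSize
}

let slotOffsets = (0..<3).map { vtableSlotByteOffset(methodIndex: $0) }
print(slotOffsets.map { "0x" + String($0, radix: 16) })  // ["0x50", "0x58", "0x60"]
```

Each method occupies one pointer-sized slot, so consecutive methods are 8 bytes apart.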
Struct method dispatch
Change the class to a struct:
Run the program and view the assembly instructions:
In a struct, methods are called directly by address, which means struct methods use static dispatch: the address of the function is fixed once compilation finishes. This is because structs are value types that cannot be inherited, so their methods belong only to themselves and no extra memory is needed to record a function table;
We can also verify this in the Swift source code, in StructContextDescriptorBuilder:
You can see that nothing in it deals with a vTable;
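For reference, a minimal struct of the kind discussed here might look like the sketch below. The names Teacher and teach come from the text; the String return value is an assumption added so the result can be printed.

```swift
// A value type: it cannot be subclassed, so the compiler always knows which teach() is meant.
struct Teacher {
    func teach() -> String {
        return "teach"
    }
}

let teacher = Teacher()
print(teacher.teach())  // teach
```

Setting a breakpoint on the call and inspecting the disassembly is one way to check how the call is emitted.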
Method calls in extensions
Calling a method in a struct extension
Add an extension to the struct and create a teach3 method:
View the assembly instructions:
Methods in a struct extension are still called directly (static dispatch);
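A minimal sketch of this setup, assuming the same Teacher struct as before (the String return values are an assumption for illustration):

```swift
struct Teacher {
    func teach() -> String {
        return "teach"
    }
}

// A method added in an extension; for a struct it is resolved at compile time as well.
extension Teacher {
    func teach3() -> String {
        return "teach3"
    }
}

let teacher = Teacher()
print(teacher.teach(), teacher.teach3())  // teach teach3
```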
Calling a method in a class extension
Let's change the struct to a class and look at how teach3 is called:
As the assembly instructions show:
Methods in a class extension are also called directly (static dispatch);
We create a subclass of Teacher, SubTeacher, and add a teach4 method:
Let's generate the SIL file and look at its vTable:
The methods of Teacher also appear in the vtable of the SubTeacher class, but the extension method does not.
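The class version of the experiment could be set up as in the sketch below. Teacher, SubTeacher, teach, teach3, and teach4 are the names used in the text; the String return values are an assumption for illustration.

```swift
class Teacher {
    // Declared in the class body: recorded in the class vtable.
    func teach() -> String {
        return "teach"
    }
}

extension Teacher {
    // Declared in an extension: called directly, not through the vtable.
    func teach3() -> String {
        return "teach3"
    }
}

class SubTeacher: Teacher {
    // Declared in the subclass body: added to the SubTeacher vtable.
    func teach4() -> String {
        return "teach4"
    }
}

let sub = SubTeacher()
print(sub.teach(), sub.teach3(), sub.teach4())  // teach teach3 teach4
```

Running `swiftc -emit-sil` on a file like this prints the SIL, including one `sil_vtable` block per class, which can be compared against the observation above.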
Summary of method dispatch
type | dispatch | extension |
---|---|---|
Value type (struct) | Static dispatch | Static dispatch |
Class | Function table dispatch | Static dispatch |
NSObject subclass | Function table dispatch | Static dispatch |
Keywords that affect how functions are dispatched
final
: a function marked with the final keyword cannot be overridden; it uses static dispatch, no longer appears in the vtable, and is not visible to the objc runtime.
dynamic
: the dynamic keyword can be added to any function; it gives dynamic behavior to functions of non-objc classes and value types, but they are still dispatched through the function table.
@objc
: this keyword exposes a Swift function to the objc runtime; the function is still dispatched through the function table;
In actual development, @objc is often used together with dynamic; the method is then dispatched through the objc_msgSend message mechanism, so method swizzling can be used. To be called from OC, however, the class must inherit from NSObject.
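A minimal sketch of how these keywords are written is shown below. The class names and String return values are assumptions for illustration; the @objc dynamic part is guarded because it only compiles where the Objective-C runtime is available.

```swift
#if canImport(ObjectiveC)
import Foundation
#endif

class Teacher {
    // final: cannot be overridden, so the call can be bound statically.
    final func teach() -> String {
        return "teach"
    }

    // No modifier: an ordinary class method, dispatched through the function table.
    func teach1() -> String {
        return "teach1"
    }
}

#if canImport(ObjectiveC)
// An NSObject subclass whose method is exposed to the objc runtime and sent as a message.
class ObjCTeacher: NSObject {
    @objc dynamic func teach2() -> String {
        return "teach2"
    }
}
#endif

let teacher = Teacher()
print(teacher.teach(), teacher.teach1())  // teach teach1

#if canImport(ObjectiveC)
// The selector is known to the objc runtime, which is what method swizzling relies on.
print(ObjCTeacher().responds(to: #selector(ObjCTeacher.teach2)))
#endif
```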