One: LLVM interception optimization
Continuing from the previous article: after analyzing the underlying call logic of the alloc method, I thought I had figured out its underlying call flow. Unexpectedly, the function call stack seen while debugging told a different story. The analysis found:
- Before the alloc method is called, objc_alloc and callAlloc are called first. The entire call order is objc_alloc -> callAlloc -> alloc -> _objc_rootAlloc -> callAlloc -> _objc_rootAllocWithZone -> _class_createInstanceFromZone. (NSObject is a special case: its alloc call order is objc_alloc -> callAlloc -> _objc_rootAllocWithZone -> _class_createInstanceFromZone.)
- The callAlloc method is called twice, and the path taken inside the method is different each time. Stepping in with control + command + step into does not get you into objc_alloc or into the first callAlloc call.
So why are the objc_alloc method and the first callAlloc method in the call flow? Why on earth would Apple do this? Here are some questions to explore.
You first need to find out when and where the objc_alloc method was called, but there is no clue in the way of function call stacks, assembly code, etc. The only way to do it is the original way: search the source code project globally for objc_alloc methods and find them one by one.
The search leads to the fixupMessageRef function: if msg->sel == @selector(alloc), the IMP is replaced with objc_alloc.
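The replacement rule just described can be pictured with a small C model. This is only a sketch under stated assumptions: `message_ref`, `imp_t`, `msg_send_fixup_stub`, `objc_alloc_stub`, and `fixup_message_ref` are hypothetical stand-ins, selectors are compared as plain strings, and other selectors are simply left untouched. It is not the runtime's actual code.

```c
#include <string.h>

/* Hypothetical stand-in for an IMP: a function that returns its own label. */
typedef const char *(*imp_t)(void);

const char *msg_send_fixup_stub(void) { return "objc_msgSend_fixup"; }
const char *objc_alloc_stub(void)     { return "objc_alloc"; }

/* Simplified message reference: an implementation pointer plus a selector name. */
typedef struct {
    imp_t imp;
    const char *sel;
} message_ref;

/* If the reference still points at the fixup placeholder and the selector
 * is "alloc", swap in the objc_alloc stand-in. Everything else is left
 * unchanged in this model. */
void fixup_message_ref(message_ref *msg) {
    if (msg->imp == msg_send_fixup_stub && strcmp(msg->sel, "alloc") == 0) {
        msg->imp = objc_alloc_stub;
    }
}
```

In this model, a reference whose selector is "alloc" ends up pointing at `objc_alloc_stub`, while a reference for any other selector keeps its placeholder.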
Next, follow the steps (reverse lookup) to figure out the call order and see under what circumstances the program comes in to replace the IMP.
- A global search for fixupMessageRef finds its caller, _read_images.
- The comment on _read_images indicates that its caller is map_images_nolock.
- According to the comment, the caller of map_images_nolock may be map_images; a global search for map_images_nolock confirms that the caller really is map_images.
- A global search for map_images finds it inside the _dyld_objc_notify_register call in _objc_init. Searching further for _objc_init and _dyld_objc_notify_register yields nothing more, so the reverse lookup of the call flow related to fixupMessageRef ends here.
- Forward verification: according to the results above, the forward call order is _objc_init -> _dyld_objc_notify_register -> map_images -> map_images_nolock -> _read_images -> fixupMessageRef. Set breakpoints in these functions and run the source project in the debugger to verify.
Conclusion: tracing the source code call flow backward and verifying it forward by running the program shows that the alloc method is indeed replaced with objc_alloc, but the replacement does not happen in the fixupMessageRef function above. So why would Apple provide a fix-up function that might never execute? Thinking more broadly: perhaps a similar operation already takes place at compile time, and this function is only there for fault tolerance.
With that in mind, download the LLVM source code and open it in Visual Studio Code to explore and verify:
Inspecting the .app executable with MachOView also shows the objc_alloc symbol, already present at the assembly stage:
Final conclusion: the alloc -> objc_alloc replacement is made during LLVM compilation, and the same holds for retain, release, autorelease, and so on. As for why these functions are hooked, presumably it is because creating and releasing objects are memory-related operations, so the system monitors them.
One question remains puzzling: why does the objc source code include fixupMessageRef at all? Under what circumstances does the LLVM compiler fail to make the replacement, so that the fixupMessageRef function is triggered? This question is left for further exploration.
Process summary:
alloc and other special methods are hooked by LLVM at compile time: alloc is replaced with the objc_alloc function, so when an object of the XJPerson class is created at run time and memory is allocated for it, the alloc call is handled by objc_alloc first. Execution then enters callAlloc. The first time through, the condition if (fastpath(!cls->ISA()->hasCustomAWZ())) is not met, so ((id(*)(id, SEL))objc_msgSend)(cls, @selector(alloc)) runs and sends the alloc message to XJPerson. This is when the alloc method is actually called. The chain _objc_rootAlloc -> callAlloc -> _objc_rootAllocWithZone -> _class_createInstanceFromZone then does three things: byte alignment, memory allocation (zero-filled), and binding the object to its class.
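The two passes through callAlloc can be traced with a toy C model. This is a sketch only: every `toy_` name and the `default_awz_known` flag are hypothetical, and flipping that flag inside `toy_alloc` is an assumption of the model, made so that the second pass takes the fast path as observed in the debugger. It records the order of calls and does not allocate anything.

```c
#include <stdbool.h>
#include <string.h>

typedef struct {
    bool default_awz_known; /* hypothetical: true once !hasCustomAWZ() would hold */
    char trace[256];        /* call order, joined with "->" */
} toy_class;

/* Append one function name to the trace. */
void toy_record(toy_class *cls, const char *name) {
    size_t room = sizeof cls->trace - strlen(cls->trace) - 1;
    if (cls->trace[0] != '\0') {
        strncat(cls->trace, "->", room);
        room = sizeof cls->trace - strlen(cls->trace) - 1;
    }
    strncat(cls->trace, name, room);
}

void toy_call_alloc(toy_class *cls); /* forward declaration: the chain is recursive */

void toy_create_instance(toy_class *cls) {
    toy_record(cls, "_class_createInstanceFromZone");
}

void toy_root_alloc_with_zone(toy_class *cls) {
    toy_record(cls, "_objc_rootAllocWithZone");
    toy_create_instance(cls);
}

void toy_root_alloc(toy_class *cls) {
    toy_record(cls, "_objc_rootAlloc");
    toy_call_alloc(cls);
}

/* Models the real +alloc being reached through a message send. */
void toy_alloc(toy_class *cls) {
    toy_record(cls, "alloc");
    cls->default_awz_known = true; /* assumption of this model */
    toy_root_alloc(cls);
}

void toy_call_alloc(toy_class *cls) {
    toy_record(cls, "callAlloc");
    if (cls->default_awz_known) {
        toy_root_alloc_with_zone(cls); /* fast path */
    } else {
        toy_alloc(cls);                /* models objc_msgSend(cls, @selector(alloc)) */
    }
}

void toy_objc_alloc(toy_class *cls) {
    toy_record(cls, "objc_alloc");
    toy_call_alloc(cls);
}
```

Starting with the flag cleared reproduces the long chain seen for XJPerson; starting with it set reproduces the shorter chain described for NSObject.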
Flow chart:
Two: the influencing factors of object memory
Exploration Direction:
- An empty object that does not declare any member variables, properties, or methods.
- Only properties (member variables) are declared.
- Declare only methods.
Conclusion:
- The memory size of an instance of a class that declares no member variables, properties, or methods is 8 bytes: the isa pointer inherited from NSObject.
- Methods have no effect on the memory size of an instance of the class. Methods are not stored in the object (instance methods live in the class's method_list, class methods in the metaclass's method_list).
- When properties (member variables) are added, the memory size of an instance of the class is the sum of the sizes of the member variables plus the 8 bytes of the isa pointer, then aligned to 8 bytes.
Three: byte alignment
8-byte alignment algorithm:
In instanceSize, the size is rounded with (x + WORD_MASK) & ~WORD_MASK, where x is the unaligned size and WORD_MASK is 7. For example, with x = 8:
(8 + 7) & ~7
= 15 & ~7 (7 = 0000 0111, ~7 = 1111 1000)
= 0000 1111 & 1111 1000
= 0000 1000
= 8
16-byte alignment algorithm:
In instanceSize, the size is rounded with (x + size_t(15)) & ~size_t(15), where x is the known parameter, of type size_t, representing the number of bytes occupied by the member variables declared on the current object. For example, with x = 21:
(21 + 15) & ~15
= 36 & ~15 (15 = 0000 1111, ~15 = 1111 0000)
= 0010 0100 & 1111 0000
= 0010 0000
= 32
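The two rounding formulas in this section can be sketched as plain C functions. The names `word_align` and `align16` are chosen here for illustration, and the constant 7 assumes a 64-bit target where a word is 8 bytes.

```c
#include <stddef.h>

/* Round x up to the next multiple of 8.
 * 7 plays the role of WORD_MASK on a 64-bit target (assumption). */
size_t word_align(size_t x) {
    return (x + (size_t)7) & ~(size_t)7;
}

/* Round x up to the next multiple of 16. */
size_t align16(size_t x) {
    return (x + (size_t)15) & ~(size_t)15;
}
```

Adding the mask first pushes any value that is not already a multiple past the next boundary, and clearing the low bits then drops it back onto that boundary. Values that are already multiples stay unchanged, which is why `word_align(8)` is 8 and `align16(21)` is 32, matching the worked examples.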
Why memory alignment?
Principle diagram:
Four: internal alignment of the structure
Memory alignment principles:
- Data member alignment rule: for the data members of a structure (struct) or union, the first member is placed at offset 0. Each subsequent member is stored starting at an offset that is an integer multiple of that member's size (or of the size of its sub-members, if the member itself has sub-members, such as an array or a structure). For example, an int is 4 bytes, so it starts at an integer multiple of 4.
- Structure as a member: if a structure contains structure members, each structure member starts at an offset that is an integer multiple of the size of its largest internal element. For example, if struct a contains struct b, and b contains char, int, and double members, then b starts at an integer multiple of 8.
- The total size of the structure, that is, the result of sizeof, must be an integer multiple of the size of its largest internal member; any shortfall is padded.
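The three rules above can be checked with `offsetof` and `sizeof`. The structs below are hypothetical examples, and the offsets in the comments assume a typical 64-bit ABI (such as x86-64 or arm64) where double is 8 bytes with 8-byte alignment, int is 4 bytes, and short is 2 bytes.

```c
#include <stddef.h>

/* a at 0, b at 8, c at 12 (next multiple of 4), d at 16;
 * 18 bytes used, padded to 24 (multiple of 8). */
struct Struct1 {
    double a;
    char   b;
    int    c;
    short  d;
};

/* a at 0, b at 8, c at 12, d at 14 (next multiple of 2);
 * 16 bytes used, already a multiple of 8. */
struct Struct2 {
    double a;
    int    b;
    char   c;
    short  d;
};

/* inner starts at 8, a multiple of the size of its largest element
 * (double); total size is 8 + 16 = 24. */
struct Struct3 {
    char           a;
    struct Struct2 inner;
};
```

Struct1 and Struct2 hold the same members, yet under these assumptions their sizes are 24 and 16 bytes, so member order alone changes how much padding is needed.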
Supplementary information: Number of bytes occupied by the base type
Five: Malloc exploration
Why explore Malloc?
When printing the memory footprint of instance objects, unexpected results occur:
I located libmalloc-317.40.8 and found the core code in it:
#define SHIFT_NANO_QUANTUM 4
#define NANO_REGIME_QUANTA_SIZE (1 << SHIFT_NANO_QUANTUM) // 16

static MALLOC_INLINE size_t
segregated_size_to_fit(nanozone_t *nanozone, size_t size, size_t *pKey)
{
    size_t k, slot_bytes;

    if (0 == size) {
        size = NANO_REGIME_QUANTA_SIZE; // Historical behavior
    }
    k = (size + NANO_REGIME_QUANTA_SIZE - 1) >> SHIFT_NANO_QUANTUM; // round up and shift for number of quanta
    slot_bytes = k << SHIFT_NANO_QUANTUM; // multiply by power of two quanta size
    *pKey = k - 1; // Zero-based!

    return slot_bytes;
}
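To see the rounding in isolation, the arithmetic of segregated_size_to_fit can be lifted into a standalone function. This is a simplified sketch: `nano_slot_bytes` is a hypothetical name, and the nanozone and pKey parameters are dropped because they do not affect the returned size.

```c
#include <stddef.h>

#define SHIFT_NANO_QUANTUM 4
#define NANO_REGIME_QUANTA_SIZE (1 << SHIFT_NANO_QUANTUM) /* 16 */

/* Round a requested size up to a whole number of 16-byte quanta. */
size_t nano_slot_bytes(size_t size) {
    if (size == 0) {
        size = NANO_REGIME_QUANTA_SIZE; /* a zero-byte request still gets one quantum */
    }
    /* Add 15, then shift right by 4: the number of quanta, rounded up. */
    size_t k = (size + NANO_REGIME_QUANTA_SIZE - 1) >> SHIFT_NANO_QUANTUM;
    /* Shift left by 4: back to bytes. */
    return k << SHIFT_NANO_QUANTUM;
}
```

A request of 24 bytes yields (24 + 15) >> 4 = 2 quanta, so 32 bytes, and a request of 40 bytes yields 48. This is the same 16-byte rounding as (x + 15) & ~15, expressed with shifts.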
Malloc source code flow chart
Conclusion:
- The memory allocated for an object is 16-byte aligned.
- Member variables are 8-byte aligned; members smaller than 8 bytes are packed together where possible, and any shortfall is padded with zeros.
- From one object to the next, alignment is 16 bytes.