Preface: After five years as an iOS developer, I suddenly realized how weak my knowledge of the underlying layers was: I did not even have a clear picture of what happens when an app starts. After some study, I learned that launching an app involves a series of function calls and the loading of supporting libraries. The details are expanded step by step below.
One, the main line of exploring the principles of OC objects
1.1 Analysis of the program startup process
- When the app starts, the system first invokes dyld, the dynamic linker, which loads the required system libraries.
- It then loads the other image files as needed.
- Finally, the GCD and Runtime environments are initialized, which prepares the program for launch.
The following is a brief overview of the startup process described above:
0 _dyld_start // start loading with the dynamic linker
...
x libSystem_initializer
x libdispatch_init
y lib_object_init // load the Runtime library
z _objc_init // perform Runtime-related operations
1.2 Leading to the topic of this article: the underlying nature of object alloc
- As we have seen in the past, alloc opens up a piece of memory. Here is an example of an object created with alloc, which we use to see how its memory changes:
LGPerson *p1 = [LGPerson alloc];
Next, we explore the pointer addresses and memory of objects created with alloc.
Two, pointer addresses and memory of alloc objects
1. Exploring the influence of alloc on memory and pointers
Let’s implement a demo to check whether the conjecture that alloc opens up memory is correct:
Comparing the memory addresses printed by the demo, we can see some patterns:
(1) The address 0x6000019FC5e0 created by alloc lies in heap space.
(2) The addresses of the pointer variables p1, p2, and p3 themselves are 8 bytes apart from one another; they are contiguous pointers stored on the stack.
(3) The pointers p1, p2, and p3 all point to the same memory space.
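The three observations can be reproduced in plain C. This is only a rough analogy, not the original demo: it assumes the demo assigned p2 and p3 from [p1 init] (NSObject's init returns self), and person_t and pointer_demo are hypothetical names invented for the sketch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for LGPerson: one pointer-sized slot, like isa. */
typedef struct {
    void *isa;
} person_t;

/* Returns 1 when the three pointers share one heap block while living in
   three distinct stack slots, and 0 otherwise. */
int pointer_demo(void) {
    person_t *p1 = calloc(1, 16);   /* plays the role of [LGPerson alloc] */
    if (p1 == NULL) {
        return 0;
    }
    person_t *p2 = p1;              /* plays the role of [p1 init] */
    person_t *p3 = p1;

    /* First column: the heap address held by each pointer (identical).
       Second column: the stack address of each pointer variable (distinct). */
    printf("%p - %p\n", (void *)p1, (void *)&p1);
    printf("%p - %p\n", (void *)p2, (void *)&p2);
    printf("%p - %p\n", (void *)p3, (void *)&p3);

    int same_target = (p1 == p2) && (p2 == p3);
    int distinct_slots = (&p1 != &p2) && (&p2 != &p3) && (&p1 != &p3);

    free(p1);
    return same_target && distinct_slots;
}
```

The exact addresses and the spacing between the stack slots depend on the platform and compiler settings, so the sketch only checks that the targets are equal and the slots are distinct.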
2. Conclusions and open questions
Looking at the patterns above, it is natural to wonder how pointers and memory relate to alloc and init, which raises the following questions:
- (1) Does the object already have a memory address and a pointer after alloc?
- (2) After init is called, why do the pointer variables p2 and p3 have different addresses of their own?
- (3) How does alloc open up the space? What does init do?
Obviously, the observations above alone cannot prove that alloc is what opens up the memory space, nor do they give us a clear understanding of the allocation process. So let us take these questions into the next stage: exploring the underlying implementation of object alloc.
Three, three methods for exploring the underlying layer
As a programmer's career progresses, with more years of work and accumulated knowledge, we keep digging deeper into the underlying layers to see how things really work. In many cases it is not enough to understand the knowledge and the problem itself; more importantly, we need to understand how to analyze, think, and explore, and keep going deeper. So let us first look at three common methods for exploring the underlying layer.
1. Add symbolic breakpoints and step through the program
Method: (1) Set a breakpoint at the code to be debugged and step through it. For example, to debug alloc, set a breakpoint manually on the line that calls alloc and step in. (2) This can be combined with a manually added symbolic breakpoint, for example on objc_alloc, which stops at libobjc.A.dylib`objc_alloc.
2. Trace the assembly code
Usage: (1) In Xcode, enable Debug > Debug Workflow > Always Show Disassembly. (2) Stop at a breakpoint with the option from (1) enabled, read the assembly, and follow the function calls to track the execution flow and find the symbol of interest, for example objc_alloc.
3. Use a known function name and manually add a symbolic breakpoint to locate it
(1) Turn off Debug > Debug Workflow > Always Show Disassembly. (2) Decide which method to trace, such as alloc, then manually add a symbolic breakpoint for it and step through. For alloc, execution stops at libobjc.A.dylib`+[NSObject alloc].
4. More ways to explore
In addition to the three methods above, we can also use disassembly, the LLDB tools, the call stack, and other means to explore the underlying principles.
Now that we understand the exploration methods, the hands-on part begins.
Four, debugging with assembly and source code combined: alloc source code analysis in practice
By stepping through the Runtime source code, we can gain a deeper understanding of the internal mechanism of alloc. First we need to obtain the Runtime source code, and then analyze the part that implements alloc.
1. Source code download addresses
1) Apple open source: opensource.apple.com
2) Open source tarballs: https://opensource.apple.com/tarballs/
3) GitHub: github.com/LGCooci/obj… (already compiled)
2. Source code analysis
- After compiling objc4-818, you can view the execution flow of the alloc function, as shown in the figure:
Figure address: www.processon.com/view/link/6…
Analysis process: (1) Use method three: take a known function name and manually add a symbolic breakpoint to locate it. (2) Following the known flow, add the functions involved in alloc as symbolic breakpoints and step through them one by one. This gives you the execution flow of the alloc source code.
Five, compiler optimization
1. From Objective-C source to program
At build time, Objective-C source is compiled and optimized by the Clang compiler into assembly code, which is then turned into a binary (a Mach-O file) that the machine can execute.
2. Compiler optimization strategy
- Compiler optimization can be analyzed with a simple function call such as lgSum:
(1) If compiler optimization is enabled in Xcode, c = lgSum(a, b) becomes equivalent to c = a + b.
(2) Xcode has several built-in optimization strategies. Generally speaking, they trade off space against time.
(3) With compiler optimization enabled, simple function calls may be inlined, so we generally turn optimization off when tracing code. Xcode enables Fastest, Smallest [-Os] by default for Release builds, which is the configuration used when packaging for a real device.
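A minimal sketch of the kind of function this refers to, written in plain C. The two-int signature of lgSum and the wrapper compute are assumptions made for illustration; the original function body is not shown in this article.

```c
/* Assumed shape of the lgSum helper mentioned above. */
static int lgSum(int a, int b) {
    return a + b;
}

/* Hypothetical caller used to observe the effect of the optimization level. */
int compute(void) {
    int a = 10;
    int b = 20;
    /* With optimization off (-O0) this stays a real function call in the
       disassembly. With -Os or -O2 the compiler is free to inline lgSum
       and may fold the whole expression into the constant 30. */
    int c = lgSum(a, b);
    return c;
}
```

Comparing the disassembly of compute under the Debug and Release configurations, with Always Show Disassembly enabled, is one way to see the difference; the exact instructions depend on the compiler version and target.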
Six, the main line flow of alloc
1. alloc process analysis diagram
2. Source code process analysis
With the methods and experience of tracing source code from above, we now debug the alloc flow with symbolic breakpoints. Keep the main line of the current research firmly in mind: how alloc is implemented underneath, and how an object's space and memory relate to alloc. To save time, we go straight to the third source option listed earlier, the objc4 project already compiled by Cooci. The source code analysis follows:
- 2.1 From the earlier analysis of memory and pointers, the LGPerson pointer p points to a space created by alloc:
LGPerson *p = [LGPerson alloc];
- 2.2 In NSObject.mm, find the alloc method and trace it:
+ (id)alloc {
    return _objc_rootAlloc(self);
}
id _objc_rootAlloc(Class cls)
{
    return callAlloc(cls, false/*checkNil*/, true/*allocWithZone*/);
}
- 2.3 Here comes the core part of Alloc, the implementation of callAlloc. Next, we analyze the details in the source code:
static ALWAYS_INLINE id
callAlloc(Class cls, bool checkNil, bool allocWithZone=false)
{
#if __OBJC2__
    if (slowpath(checkNil && !cls)) return nil;
    if (fastpath(!cls->ISA()->hasCustomAWZ())) {
        return _objc_rootAllocWithZone(cls, nil);
    }
#endif

    // No shortcuts available.
    if (allocWithZone) {
        return ((id(*)(id, SEL, struct _NSZone *))objc_msgSend)(cls, @selector(allocWithZone:), nil);
    }
    return ((id(*)(id, SEL))objc_msgSend)(cls, @selector(alloc));
}
1. The cls, checkNil, and allocWithZone values passed in determine which path the rest of the call takes. hasCustomAWZ reads a flag kept on the class that records whether the class has a custom alloc or allocWithZone: implementation. If it does not, which is the common case marked by fastpath, _objc_rootAllocWithZone is called directly. Otherwise the call falls through to objc_msgSend, sending allocWithZone: or alloc depending on the allocWithZone argument. 2. Tracing objc_msgSend shows that its implementation is written in assembly, so from here on we focus on tracing _objc_rootAllocWithZone.
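The branching in callAlloc can be modeled in a few lines of plain C. This is a simplified sketch and not runtime code: model_class_t, alloc_path_t, and model_call_alloc are hypothetical names, and the fastpath and slowpath macros are written here with __builtin_expect (a GCC/Clang builtin) as branch-prediction hints that do not change the result.

```c
#include <stdbool.h>
#include <stddef.h>

/* Branch-prediction hints; they only tell the compiler which outcome is likely. */
#define fastpath(x) (__builtin_expect(!!(x), 1))
#define slowpath(x) (__builtin_expect(!!(x), 0))

/* Hypothetical model of the one class flag that callAlloc consults. */
typedef struct {
    bool has_custom_awz;   /* class overrides alloc / allocWithZone: */
} model_class_t;

/* The four possible outcomes of the dispatch. */
typedef enum {
    PATH_NIL,
    PATH_ROOT_ALLOC_WITH_ZONE,
    PATH_MSGSEND_ALLOC_WITH_ZONE,
    PATH_MSGSEND_ALLOC
} alloc_path_t;

static alloc_path_t model_call_alloc(const model_class_t *cls,
                                     bool check_nil, bool alloc_with_zone) {
    if (slowpath(check_nil && !cls)) {
        return PATH_NIL;
    }
    if (fastpath(!cls->has_custom_awz)) {
        return PATH_ROOT_ALLOC_WITH_ZONE;      /* the shortcut */
    }
    /* No shortcuts available: fall back to message sending. */
    if (alloc_with_zone) {
        return PATH_MSGSEND_ALLOC_WITH_ZONE;
    }
    return PATH_MSGSEND_ALLOC;
}
```

For a class without custom allocation the model always lands on PATH_ROOT_ALLOC_WITH_ZONE, which matches the decision to keep tracing _objc_rootAllocWithZone.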
- 2.4 The implementations of the _objc_rootAllocWithZone and _class_createInstanceFromZone methods:
NEVER_INLINE
id
_objc_rootAllocWithZone(Class cls, malloc_zone_t *zone __unused)
{
    // allocWithZone under __OBJC2__ ignores the zone parameter
    return _class_createInstanceFromZone(cls, 0, nil,
                                         OBJECT_CONSTRUCT_CALL_BADALLOC);
}

#pragma mark - _class_createInstanceFromZone (alloc)
static ALWAYS_INLINE id
_class_createInstanceFromZone(Class cls, size_t extraBytes, void *zone,
                              int construct_flags = OBJECT_CONSTRUCT_NONE,
                              bool cxxConstruct = true,
                              size_t *outAllocatedSize = nil)
{
    ASSERT(cls->isRealized());

    // Read class's info bits all at once for performance
    bool hasCxxCtor = cxxConstruct && cls->hasCxxCtor();
    bool hasCxxDtor = cls->hasCxxDtor();
    bool fast = cls->canAllocNonpointer();
    size_t size;

    size = cls->instanceSize(extraBytes);
    if (outAllocatedSize) *outAllocatedSize = size;

    id obj;
    if (zone) {
        obj = (id)malloc_zone_calloc((malloc_zone_t *)zone, 1, size);
    } else {
        obj = (id)calloc(1, size);
    }
    if (slowpath(!obj)) {
        if (construct_flags & OBJECT_CONSTRUCT_CALL_BADALLOC) {
            return _objc_callBadAllocHandler(cls);
        }
        return nil;
    }

    if (!zone && fast) {
        obj->initInstanceIsa(cls, hasCxxDtor);
    } else {
        // Use raw pointer isa on the assumption that they might be
        // doing something weird with the zone or RR.
        obj->initIsa(cls);
    }

    if (fastpath(!hasCxxCtor)) {
        return obj;
    }

    construct_flags |= OBJECT_CONSTRUCT_FREE_ONFAILURE;
    return object_cxxConstructFromClass(obj, cls, construct_flags);
}
1. Through breakpoint debugging, we find that the cls passed in is the LGPerson class, and from the return value obj we can tell that the purpose of this method is to create an instance of LGPerson. 2. instanceSize computes how much memory the instance needs. If zone is nil, calloc(1, size) opens up the memory; if a zone exists, malloc_zone_calloc is used instead. At this point the memory is zero-filled but is not yet associated with any class, because no isa has been bound. 3. The zone and fast conditions are then checked. If there is no zone and the class supports non-pointer isa (fast is true), initInstanceIsa binds the class to the newly allocated memory; otherwise initIsa binds a raw-pointer isa. Neither call allocates memory. 4. So the core steps of alloc are instanceSize (compute the size), calloc (open up the memory), and initInstanceIsa or initIsa (bind isa). Finally, an object whose memory has an isa bound to it is returned.
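The three core steps (compute the size, open up zeroed memory, bind isa) can be sketched in plain C. Everything here is a simplified model under stated assumptions: model_class_t, model_object_t, and the model_ functions are hypothetical, the isa is a plain pointer rather than the packed non-pointer isa the runtime normally uses, and a 64-bit target is assumed for the 8-byte rounding.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical class record: only what this sketch needs. */
typedef struct model_class {
    const char *name;
    size_t unaligned_instance_size;   /* bytes used by isa plus ivars */
} model_class_t;

/* Hypothetical object header: the first slot is the isa. */
typedef struct model_object {
    const model_class_t *isa;
} model_object_t;

/* Step 1: size rounded up to 8 bytes, with a 16-byte minimum. */
static size_t model_instance_size(const model_class_t *cls, size_t extra_bytes) {
    size_t size = ((cls->unaligned_instance_size + 7) & ~(size_t)7) + extra_bytes;
    if (size < 16) {
        size = 16;
    }
    return size;
}

/* Steps 2 and 3: allocate zero-filled memory, then bind the class to it. */
static model_object_t *model_create_instance(const model_class_t *cls) {
    size_t size = model_instance_size(cls, 0);
    model_object_t *obj = calloc(1, size);   /* zeroed, not yet tied to a class */
    if (obj == NULL) {
        return NULL;
    }
    obj->isa = cls;                          /* now the memory is an instance of cls */
    return obj;
}
```

Note that the binding step only writes into memory that calloc already returned; it never allocates anything itself. The caller owns the returned block and releases it with free.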
Seven, byte alignment and its principles
After understanding how alloc opens up memory and binds the isa pointer, let’s see how the size of the memory space is determined, as shown in the figure:
Through breakpoint debugging, we find that cls is the LGPerson class we are allocating. The operation here fetches the amount of space an LGPerson object occupies. The implementation of instanceSize is as follows:
inline size_t instanceSize(size_t extraBytes) const {
    if (fastpath(cache.hasFastInstanceSize(extraBytes))) {
        return cache.fastInstanceSize(extraBytes);
    }

    size_t size = alignedInstanceSize() + extraBytes;
    // CF requires all objects be at least 16 bytes.
    if (size < 16) size = 16;
    return size;
}

// Class's ivar size rounded up to a pointer-size boundary.
uint32_t alignedInstanceSize() const {
    return word_align(unalignedInstanceSize());
}

#ifdef __LP64__
#   define WORD_SHIFT 3UL
#   define WORD_MASK 7UL
#   define WORD_BITS 64
#else
#   define WORD_SHIFT 2UL
#   define WORD_MASK 3UL
#   define WORD_BITS 32
#endif

static inline uint32_t word_align(uint32_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}
static inline size_t word_align(size_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}
static inline size_t align16(size_t x) {
    return (x + size_t(15)) & ~size_t(15);
}
1. Minimum object size
- The instanceSize code above contains the runtime's rule for the minimum object size, in the comment “CF requires all objects be at least 16 bytes”: any size below 16 bytes is raised to 16, so every object occupies at least 16 bytes.
2. Byte alignment
- Following alignedInstanceSize -> word_align gives the rule for byte alignment.
word_align applies the byte alignment rule for the current environment. WORD_MASK differs by platform (7 when __LP64__ is defined, 3 otherwise), and the result is computed as (x + WORD_MASK) & ~WORD_MASK, where x is the unaligned size of the object, obtained through unalignedInstanceSize -> data() -> ro() -> instanceSize. For example, an LGPerson object that holds only isa has x = 8 bytes. On a 64-bit platform: (8 + 7) & ~7 = 0b1111 & ~0b0111 = 0b1000 = 8.
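The alignment arithmetic is easy to check in isolation. The sketch below restates word_align and align16 from the listing in plain C, assuming a 64-bit target so the mask is 7; min_instance_size is a hypothetical helper that mirrors the slow path of instanceSize and is not a runtime function.

```c
#include <stddef.h>

/* Assumption: 64-bit target, so the word mask is 7 as in the __LP64__ branch. */
#define WORD_MASK ((size_t)7)

/* Round x up to the next multiple of 8. */
static inline size_t word_align(size_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}

/* Round x up to the next multiple of 16. */
static inline size_t align16(size_t x) {
    return (x + (size_t)15) & ~(size_t)15;
}

/* Hypothetical helper: word-align the ivar size, add any extra bytes,
   then enforce the 16-byte minimum. */
static inline size_t min_instance_size(size_t unaligned, size_t extra_bytes) {
    size_t size = word_align(unaligned) + extra_bytes;
    if (size < 16) {
        size = 16;
    }
    return size;
}
```

With these definitions word_align(8) is 8 and word_align(9) is 16, align16(17) is 32, and min_instance_size(8, 0) is 16. Adding the mask pushes any value that is not already a multiple over the next boundary, and clearing the low bits with ~WORD_MASK then drops it back onto that boundary.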
Eight, object memory space
To be updated….