Contents

  • 1. Background
  • 2. From isa to classes to metaclasses
  • 3. The classic isa diagram and metaclass inheritance (a common interview question)
  • 4. The structure of a class
  • 5. Pointers and memory translation
  • 6. Class structure and memory calculation
  • 7. Analyzing the structure of a class with LLDB
  • 8. Member variables and class methods
  • 9. Summary

1. The background

No more confusion in learning: let's take off! Hi, I'm Tommy! First, an apology: work has been heavy this month, so I published only one of the two articles I had planned. Thank you for your patience. Without further ado, let's get started. (Every article is written with care.)

This article explores the underlying structure of Objective-C classes. Over the previous three articles we built up a new understanding of how objects work under the hood. Next, we explore what the familiar Class really is.

2. From isa to class to metaclass

  • Two ways to get information about a class
    • (1) As covered in an earlier article, running x/4gx on an object gives you its isa. ANDing the isa with the mask yields the class address directly. Alternatively, p/x ZXPerson.class also prints the class address.

    • (2) To go one step further, open the MachO file and look at the __DATA,__objc_classrefs section (the class reference list). Find the ZXPerson class you want to inspect and note its Offset value. Use the LLDB command image list to obtain the ASLR value, then compute ASLR + Offset; the result is the address where the ZXPerson class address is stored. Use the x command to read the value stored there, then po to print it. The result matches what we got in (1).

    PS: Background on the MachO file format and ASLR is given at the end of this article.

  • isa leads to metaclasses
    • Here is a thought experiment. Running x/4gx on an object shows the memory layout of the object, so what happens if we apply the same command to a class? Let's try it.

    • First we run x/4gx zxp to get the isa of the object, then AND it with the mask to get the class address. The value matches the class address obtained earlier.
    • Next we run x/4gx on that class address and use po to print the values. The first 8-byte slot prints ZXPerson and the second prints NSObject. Although the first slot prints the same class name, ZXPerson, its address differs from the address of the class we created. This class, which has the same name as ours but a different address, is the metaclass.

    • We know that the isa of the zxp object, after masking, is 0x0000000102099540 and points to the class of the object. The class itself also has an isa, 0x0000000102099518, which points to the metaclass of that class. We can verify this with the mask.
    • Finally, the metaclass is generated automatically at compile time, which we can confirm by inspecting the MachO file.

  • Section summary:
    • Through verification we have established a chain: the isa of an object points to its class, and the isa of a class points to its metaclass. This leads us to the classic isa diagram, which the next section explains. That concludes this section.

3. The classic isa diagram and metaclass inheritance (a common interview question)

  • The isa diagram
    • In the previous section we saw that a metaclass is created at compile time for every class we define. Let's look at a picture:

    • You have probably seen this before; the original diagram comes from Apple's official documentation and illustrates the isa chain and the class inheritance chain. But do you really understand it? Some readers still find it confusing, so I will walk through it to clear up how the isa pointers are chained.

    • The isa chain, concept by concept. Let's first get the overall picture straight; once these points are clear, the rest follows:

    • ☆ Every class has a corresponding metaclass.

    • ☆ The isa of every metaclass points to the root metaclass.

    • ☆ What is the root metaclass? You can think of it as the metaclass of NSObject.

    • Let's verify the diagram and check that it matches the picture. First, we can print each class in code with object_getClass(); remember to import the <objc/runtime.h> header file.

    The printed result is the same as the one obtained with LLDB.

    • At this point we have printed the information for the ZXPerson class, so now let's print the information for NSObject and check whether the "Root class (meta)" in the picture really is the metaclass of NSObject. We add the following code and look at the output.

    • The printed output confirms that the isa diagram is accurate, so we can now draw our own version of the diagram to consolidate the conclusions we verified ourselves.

  • Summary of the isa chain:
    • ☆ Instances, classes, and metaclasses all have an isa. The isa of an instance points to its class, the isa of a class points to its metaclass, and the isa of a metaclass points to the root metaclass (the metaclass of NSObject).
    • ☆ Only the metaclass of NSObject (the root metaclass) is special: its isa points to itself.

    That settles where each isa points. Now let's work through the inheritance chain.

  • Metaclass inheritance:
    • Yes, you read that right! Although metaclasses are created automatically at compile time, each one is still a Class and has the features of a class, including inheritance.

    • The inheritance relationship between metaclasses is easy to understand: it mirrors the inheritance relationship of the corresponding ordinary classes. For example, if ZXPerson inherits from ZXHuman, then the metaclass of ZXPerson inherits from the metaclass of ZXHuman. Something special happens when the inheritance chain reaches NSObject, so let's verify and explore it in code.

    • We first print the inheritance chains of the ordinary class and of the metaclass separately.

    • Then look at the output:

    • From the printed results we find that the superclass of the ZXPerson metaclass (red box) is the ZXHuman metaclass; its second-level superclass (blue box) is the NSObject metaclass; its third-level superclass (green box) is the NSObject class; and its fourth-level superclass is nil, which means there are no more superclasses.

  • Section summary:
    • After the verification and explanation above, you should have a deeper understanding of the isa chain and of metaclasses. The previous articles covered the underlying structure of objects; now let's explore what the underlying structure of a class looks like.

4. The structure of a class

  • View the memory layout of a class using LLDB
    • Now that we understand isa and metaclasses, let's study the structure of a class. When we explored the structure of objects, we first inspected the object's memory with LLDB and then analyzed the source code. We can use the same approach for classes.
    • Let's stick with our old friend x/4gx to inspect the class, again using ZXPerson as the example. The first 8 bytes are, unsurprisingly, the isa pointer of the class.

    • The second 8 bytes turn out to be a pointer to the superclass. Nothing more can be learned from the raw memory alone, but we can dig further by reading the source code.

    (PS: I hope this gives you a method you can reuse when you run into similar problems in the future.)

  • View the class structure in the source code
    • In a previous article I showed that you can use the xcrun or clang command to compile an OC file to C++. There we found that the Class type in OC is replaced by the objc_class structure after compilation, so we can start exploring by searching the source code for the keyword objc_class.
    • Open the objc source, press command+shift+O, search for objc_class, and select the result shown below.

    • The first member is the isa inherited from objc_object, and the second is the superclass we just printed. The last two are cache (of type cache_t) and bits (of type class_data_bits_t), followed by some methods. Overall, the structure contains four member variables, two of which are class pointers, so it is a safe guess that the key data of the class is held in the remaining two, cache and bits.

    • Reading further down, bits is used in many of the methods, so most of the important data is stored in bits.

  • Expand your knowledge
    • Before going further, a short digression on __has_feature(ptrauth_calls), which may also be useful in day-to-day development.
    • __has_feature(ptrauth_calls): checks whether the compiler is building with pointer authentication enabled. This mainly concerns the arm64e capabilities supported on Apple A12 processors and later; ptrauth_calls refers to authentication of pointers used in calls.
  • Section summary:
    • This section took a first look at the structure of Class, in which bits is one of the key members. We do not yet know how to read it, though. For that we need one more piece of knowledge: the concept of pointers and memory translation.

5. Pointers and memory translation

  • The concept of memory translation is not hard to understand. We already used it in section 2, when we added the offset from the MachO file to the ASLR value to find the class information we wanted. The principle is to reach the desired data by changing the offset of the current memory pointer. A few examples will make this clearer.

  • Memory addresses of ordinary types:
    • Let's start with some simple code. We define 3 variables of type int, assign 10 to the first one, and then assign it to each of the other 2. Then we print the values and the memory addresses.

    • Although all 3 variables have the value 10, their addresses are different. We can read each address separately to see how the values are stored.

    • The printed addresses are 0x7ffeefbff3fc, 0x7ffeefbff3f8, and 0x7ffeefbff3f4, and the value stored at each is 0x0a, that is, 10. For ordinary types, each variable keeps its own copy of the value in memory.
  • Memory addresses of pointer types:
    • Now let's look at pointer types. We create several NSObject objects and print them one by one. Analyzing them with LLDB reveals how the addresses relate. Take aObjc as an example: the address &aObjc is 0x7ffeefbff3e8, and the bytes stored there are a0 59 54 00 01 00 00 00 (little-endian, so read them from right to left), which is exactly the address held in aObjc, 0x1005459a0. The bytes stored at that address are 89 13 7a 80 ff ff 1d 01, which is exactly the isa of the aObjc object, 0x011dffff807a1389.

  • Memory addresses of array types:
    • Finally, let's look at the memory address of an array. We create an array and print it, and find that the address of the array is the address of its first element. Can we then reach the other elements of the array by manipulating memory? The answer is yes.

  • Memory translation test:
    • We run a memory translation test on the array array[10] we just created, as follows:

    • In a loop we apply + to the pointer point. Each + 1 moves the pointer by the size of one element. Our array holds int values of 4 bytes each, so every + 1 on the pointer translates the memory address by 4 bytes. (Consecutive addresses are 4 bytes apart.)

  • Section summary:
    • We can now use memory translation to reach the other members of the objc_class structure, but we still need to work out how far to translate to reach the data we want. That is the subject of the next section: the memory calculation for the class structure.

6. Class structure and memory calculation

  • Before going on, let's organize our thinking, so that what we have found so far and what we want to achieve next are in a clear order.

  • Ideas:

    • 1. Through the source code we found that important data is stored in the bits member variable, but so far we cannot print it with LLDB.

    • 2. We have learned memory translation: we can reach the data we want by taking the start address of the class and adding an offset.

    • 3. isa and superclass each occupy 8 bytes. If we know the size of cache, we can reach bits by translation.

  • Cache size:
  • Let's look at the size of the cache member of objc_class in the source code. Its type is cache_t, so we click through to take a look.

  • As explained in previous articles, only the member variables inside a structure affect its size.

  • First variable: a uintptr_t, an unsigned integer wide enough to store a pointer address, so it takes 8 bytes.

  • Second variable: a union, so we need to look at how much space its members occupy.

    • First member of the struct inside the union: _maybeMask has type mask_t. Clicking through shows typedef uint32_t mask_t, so it is a uint32_t and takes 4 bytes.
    • Second member of the struct inside the union: _flags is a uint16_t and takes 2 bytes.
    • Third member of the struct inside the union: _occupied is also a uint16_t and takes 2 bytes.
    • The other member of the union: _originalPreoptCache is a (preopt_cache_t *) pointer, so it takes 8 bytes. The struct (4 + 2 + 2) and the pointer both take 8 bytes, so the union takes 8 bytes and cache as a whole takes 16 bytes.
  • To reach bits, we therefore offset the start address by 8 + 8 + 16 = 32 bytes (isa + superclass + cache), which is 0x20 in hexadecimal.

  • Validation:
  • 1. Get the start address of ZXPerson.class.

  • 2. Translate the start address by an offset of 0x20.

  • 3. Treat the result as a pointer to class_data_bits_t.

  • 4. Print it and call the data() function of class_data_bits_t to see whether the call succeeds.

  • Section summary:

    With the obstacles to exploring the class internals out of the way, we can now examine the detailed structure of the class through LLDB. That concludes this section.

7. Analyzing the structure of a class with LLDB

  • Before starting this section, let's look more closely at class_data_bits_t bits. So far we only know that bits holds important data; we do not know what that data actually is. Let's find out.

  • class_data_bits_t bits holds the important information
    • Looking carefully at the class_data_bits_t structure, we find a data() method that returns a pointer to a class_rw_t structure (details about rw and ro will be covered in later articles). Following class_rw_t, we find that it provides methods(), properties(), and protocols(). This is exactly the information we define in our own classes, so next I will use LLDB to analyze the structure of a class.

  • Structure analysis
    • Following the steps above, we used LLDB to obtain the result of data() on bits, that is, the class_rw_t structure. Now let's call the methods of class_rw_t from LLDB and see what happens.

    • Printing properties() returns data normally, which shows that we have obtained the object that properties() returns. Let's go back to the source code to see whether there is a way to print the properties themselves. The source shows that properties() returns a structure called property_array_t. (LLDB printed the same type above.)

    • Following the source of property_array_t, we see that it is built on list_array_tt. (LLDB printed this above as well.)

    • According to the LLDB output, list_array_tt should contain a member named list, which in turn contains a member called ptr. With these questions in mind I opened the source of list_array_tt. Let's start with its comments.

    • Generic implementation for metadata that can be augmented by categories.
    • Parameter Element: the underlying metadata type (for example, method_t).
    • Parameter List: the list type of the metadata (for example, method_list_t).
    • Parameter Ptr: a template applied to Element to make Element*, useful for applying qualifiers to the pointer type. (The Ptr argument passed in by property_array_t is RawPtr, which is itself a template, most likely used as a wrapper.)

    • Looking further down, there is a member Ptr<List> list, which matches what LLDB printed. Let's see whether we can read it through LLDB.

    • We print $28.list.ptr and get a result, then run p *$34 and get a property_list_t, which turns out to contain a structure called entsize_list_tt. So what is the relationship between property_list_t and entsize_list_tt?
    • Going back to the definition of property_array_t, we see that it passes 3 parameters to list_array_tt, corresponding to the Element, List, and Ptr described in the comments above. So let's see how property_list_t is defined.

    • Clicking does not jump to the definition from here, so we search with command+shift+O.

    • It turns out that property_list_t inherits from entsize_list_tt and implements nothing of its own. So let's go straight to entsize_list_tt.

    • The property_list_t we just printed shows that the count inside entsize_list_tt is 4, which suggests that entsize_list_tt is itself a collection structure. The source code shows that it contains an iterator and also provides a get method for fetching an element. After all that effort, we have finally found it! Can the get method print the properties of our class?

    • From LLDB we call $35.get(i) with i from 0 to 3. Entries 0-3 are the properties we defined: zxName, zxAge, zxSex, and zxHeight. (Asking for index 4 reports that it is out of bounds.)

    • Now that we are done with the property list, let's look at the method list.
  • Print the method list data
    • We follow the same steps but get a different result: calling get() does not print the method information. Why? See the next section.

  • Section summary:

    • A structure diagram is attached to make the relationships between the structures easier to follow.
    • In this section we used LLDB together with the source code to analyze the class_rw_t structure in detail and successfully printed the property information. Using the same approach on the method information did not give the expected result, so we explore that in the next section. That concludes this section.

8. Member variables and class methods

  • Picking up the problem left over from the previous section: the same approach prints the property information correctly, but the method information comes out empty, which is not what we expected. Let's first narrow down where the problem is.

    • 1. Going by the UML diagram above, p $6.get(0) returns a method_t. The structure is in fact not empty; it just is not printed the way we expected.
    • 2. So does the problem lie in the method_t structure? Let's look at the source code again.
  • method_t source code analysis
    • Inside method_t there is a structure called big, which has 3 member variables. If we print through it, do we get what we want? Let's give it a try!

    • We run p $7.get(1).big() and, sure enough, the name of the method is printed.

  • Class methods and member variables
    • Let's define a class with member variables, properties, class methods, and instance methods, and run an experiment. The goal is to see whether methods() and properties() include all of this information.

    • properties() result: it contains only the zxName property; the member variables are not shown.

    • methods() result: it contains 5 methods: two instance methods (including init), the getter and setter of the zxName property, and .cxx_destruct, which is C++ related. The class methods are not shown.

    • So how do we find the member variable nikeName? We go back to class_rw_t, because that is where methods() and properties() live, so the member variables must be reachable from there too, perhaps buried deeper. Working upward from methods(), the first things we come across are ro() and set_ro.

    • set_ro is just an assignment, so there is no point in studying it. Looking at ro(), we see that it returns a class_ro_t structure. Let's go in and see whether there are any surprises!

    • Let's print it. Here it is! We can also see that the zxName property automatically generated a member variable, _zxName. With the member variable question solved, where are the class methods?

    • Still around class_rw_t, there is a function called baseMethods() (a member of class_ro_t). Could this be it? My own experiments showed that it contains no information about class methods. (I have omitted the verification steps here.) Instance methods can be obtained from the class, so can class methods be obtained from the metaclass? Let's verify that.

    • Let me get the metaclass of ZXPerson in code and print it.

    • Then we print step by step as before, which confirms our hypothesis: the class methods can be obtained from the metaclass.

  • Section summary:
    • In this section, by analyzing the class_rw_t source code, we managed to track down the member variables and the class methods, so we should now have an overall understanding of the data stored in a class. That concludes this section.

9. Summary

  • 1. This article first led you from isa to the metaclass, then analyzed and verified concepts such as the isa chain and metaclass inheritance in detail. (There is no need to memorize this; work through it once and you will understand it.)
  • 2. Using memory translation, we studied the overall structure of a class, focusing on which structures store the properties and methods of the class, and verified the findings with LLDB.
  • 3. Finally, we ran into the class_rw_t structure. What does this structure actually mean? See the next article.
Supplementary notes
  • MachO file: MachO is a file format used for executables, static libraries, dynamic libraries, dyld, and so on. An executable can be a bundle of several architectures, for example armv7 and arm64.
  • ASLR: when iOS launches an App, the binary data of the App is copied from disk into memory, where it is mapped to memory addresses. For security and other reasons these are virtual memory addresses, and the start address is randomized so that it differs on every launch. This technique is ASLR. So when dyld loads a MachO, the first step is to relocate the data, and this relocation is called rebase.

Closing words
  • The code examples for this article are available for download on GitHub.
  • That is all for this article! If you liked it, don't forget to give it a thumbs-up; your support is my biggest motivation!