The WWDC2020 video on improvements to the Objective-C runtime describes some changes to the data structures of classes. This article is a translation of some of the changes covered in that video.
If you want more detail, watch the video itself: developer.apple.com/wwdc20/1016… .
The video introduces three changes:
- The first is a change to the data structures that the Objective-C runtime uses to track classes.
- The second is a change to Objective-C method lists.
- The third is a change to the tagged pointer format.
1. Changes in data structures
The class object itself contains the most frequently accessed information: pointers to the metaclass, the superclass, and the method cache. It also has a pointer to more data. The structure that stores this additional information is called class_ro_t.
"ro" stands for read-only. It includes information such as the class's name, methods, protocols, and instance variables. Swift classes and Objective-C classes share this infrastructure, so each Swift class has these data structures as well.
Classes start like this when they are first loaded from disk into memory, but they change as they are used.
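The layout described above can be pictured with a small C sketch. This is only an illustration: the `sketch_` type names, the field names, and the field order are assumptions made for this article, not the runtime's real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for class_ro_t.
   Field names and order are illustrative assumptions. */
typedef struct sketch_class_ro {
    const char *name;      /* class name */
    const void *methods;   /* method list (opaque in this sketch) */
    const void *protocols; /* protocol list (opaque in this sketch) */
    const void *ivars;     /* instance variable list (opaque in this sketch) */
} sketch_class_ro;

/* Hypothetical stand-in for the class object itself: the frequently
   accessed pointers plus one pointer to the read-only data. */
typedef struct sketch_class {
    struct sketch_class *metaclass;  /* pointer to the metaclass */
    struct sketch_class *superclass; /* pointer to the superclass */
    void *method_cache;              /* written by the runtime once the class is used */
    const sketch_class_ro *ro;       /* pointer to the read-only data */
} sketch_class;

/* Read the class name by following the pointer to the read-only data. */
static const char *sketch_class_name(const sketch_class *cls) {
    assert(cls != NULL && cls->ro != NULL);
    return cls->ro->name;
}
```

In this sketch the read-only part is declared `const`, which mirrors the idea that it never changes after loading, while the class object holds the fields that the runtime may write to later.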
To understand these changes, first consider the difference between clean memory and dirty memory.
* clean memory: memory that does not change after it is loaded. class_ro_t belongs to clean memory because it is read-only.
* dirty memory: memory that changes while the process is running. The class structure becomes dirty memory as soon as the class is used, because the runtime writes new data to it. For example, it creates a new method cache and points to it from the class.
Dirty memory is much more expensive than clean memory, because it must be kept around for as long as the process is running. Clean memory, on the other hand, can be evicted to free up memory, since the system can always reload it from disk when it is needed again.
macOS has the option to swap out dirty memory, but iOS doesn't use swap, so dirty memory is especially costly on iOS. This is why the class data is split into two pieces: the more data that can be kept clean, the better.
While this data is enough to get us started, the runtime needs to track more information about each class, so when a class is first used, the runtime allocates additional storage for it.
This runtime-allocated storage is class_rw_t, for read-write data. In this data structure we store new information that is only generated at runtime, such as First Subclass and Next Sibling Class.
For example, all classes are linked into a tree structure through the First Subclass and Next Sibling Class pointers. This lets the runtime traverse all classes currently in use, which is useful for invalidating method caches.
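As a rough sketch of how such a tree can be walked, the C code below links nodes through a first-subclass pointer and a next-sibling pointer and invalidates a cache flag on a class and everything below it. The `sketch_rw` type, the `cache_valid` flag, and both functions are hypothetical names invented for this example; the real runtime's traversal is more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node holding only the two link fields named in the text,
   plus a flag that stands in for a method cache. */
typedef struct sketch_rw {
    struct sketch_rw *first_subclass;
    struct sketch_rw *next_sibling_class;
    int cache_valid;
} sketch_rw;

/* Attach child under parent by pushing it onto the front of the
   parent's subclass list. */
static void sketch_add_subclass(sketch_rw *parent, sketch_rw *child) {
    assert(parent != NULL && child != NULL);
    child->next_sibling_class = parent->first_subclass;
    parent->first_subclass = child;
}

/* Invalidate the cache of cls and of every class below it.
   Returns the number of classes visited. */
static int sketch_invalidate_caches(sketch_rw *cls) {
    if (cls == NULL) return 0;
    int visited = 1;
    cls->cache_valid = 0;
    for (sketch_rw *sub = cls->first_subclass; sub != NULL;
         sub = sub->next_sibling_class) {
        visited += sketch_invalidate_caches(sub);
    }
    return visited;
}
```

Two pointers per class are enough to reach every subclass without storing a variable-length child array, which keeps each node a fixed size.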
But why does class_rw_t also hold methods and properties when they are already in the read-only data?
Because they can change at runtime. When a category is loaded, it can add new methods to the class, and programmers can add them dynamically through the runtime API. Since class_ro_t is read-only, we need to keep track of these things in class_rw_t.
Only Swift classes use the demangled name field, and even Swift classes don't need it unless something asks for their Objective-C name.
So we can split off the parts that are not usually used into an extension record, class_rw_ext_t, which cuts the size of class_rw_t in half.
For classes that really do need the additional information, we can allocate one of these extension records and slide it into the class for its use.
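The idea of allocating the extension record only on demand could look roughly like the C sketch below. Everything here is an assumption made for illustration: the `sketch_` types, the choice of fields, and the use of two separate pointers (`ro` and `ext`) are not taken from the runtime's source.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct sketch_ro { const char *name; } sketch_ro;

/* Hypothetical extension record: the rarely needed fields live here. */
typedef struct sketch_rw_ext {
    const sketch_ro *ro;
    void *methods;
    void *properties;
    void *protocols;
    const char *demangled_name;
} sketch_rw_ext;

/* Hypothetical slimmed-down read-write record. */
typedef struct sketch_rw {
    struct sketch_rw *first_subclass;
    struct sketch_rw *next_sibling_class;
    const sketch_ro *ro; /* read-only data */
    sketch_rw_ext *ext;  /* NULL until the class needs the extra fields */
} sketch_rw;

/* Return the extension record, allocating it on the first request.
   Returns NULL if the allocation fails. */
static sketch_rw_ext *sketch_rw_extend(sketch_rw *rw) {
    assert(rw != NULL);
    if (rw->ext == NULL) {
        sketch_rw_ext *ext = calloc(1, sizeof *ext);
        if (ext == NULL) return NULL;
        ext->ro = rw->ro;
        rw->ext = ext;
    }
    return rw->ext;
}
```

A class that never has a category loaded or a method added at runtime never calls `sketch_rw_extend` in this sketch, so it only pays for the smaller record.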
2. Changes to the Objective-C method list
Every class has a list of methods attached to it. When you write a new method on a class, it is added to that list. The runtime uses these lists to resolve message sends.
Each method contains three pieces of information.
- The first is the name of the method, also called the selector. Selectors are strings, but they are unique, so they can be compared using pointer equality.
- Next comes the type encoding of the method. This is a string that represents the parameter and return types. It is not used for sending messages, but it is required for runtime introspection and message forwarding.
- Finally, there is a pointer to the implementation of the method, the actual code for the method. When you write a method, it is compiled into a C function that contains your implementation, and the entry in the method list points to that function.
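The three fields above can be sketched in C as follows. The `sketch_` names are hypothetical, the two empty functions stand in for compiled method bodies, and the lookup is a plain linear scan chosen for clarity rather than a description of how the runtime actually searches.

```c
#include <assert.h>
#include <stddef.h>

typedef const char *sketch_sel;   /* a selector: a unique string */
typedef void (*sketch_imp)(void); /* generic pointer to an implementation */

/* Hypothetical method list entry with the three fields from the text. */
typedef struct sketch_method {
    sketch_sel name;   /* selector */
    const char *types; /* type encoding string */
    sketch_imp imp;    /* pointer to the compiled function */
} sketch_method;

/* Illustrative unique selector strings and stand-in implementations. */
static const char sketch_sel_init[] = "init";
static const char sketch_sel_description[] = "description";
static void sketch_impl_init(void) {}
static void sketch_impl_description(void) {}

/* Find the implementation for sel. Because selectors are unique,
   entries are compared by pointer, not by string contents. */
static sketch_imp sketch_lookup(const sketch_method *list, size_t count,
                                sketch_sel sel) {
    assert(list != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (list[i].name == sel) return list[i].imp;
    }
    return NULL;
}
```

Note that a different string with the same characters does not match in this sketch, which is exactly why selectors have to be unique for pointer comparison to work.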
3. Tagged pointers
What is a tagged pointer? Take an object pointer such as 0x00000001003041E0. Broken down into binary, it has 64 bits:
0000 0000 0000 0000 0000 0000 0000 0001 0000 0000 0011 0000 0100 0001 1110 0000
We don't really use all of those bits.
- Only the middle bits are used in a real object pointer.
- The low bits are always 0 because of alignment requirements: objects must always be located at an address that is a multiple of the pointer size.
- The high bits are always 0 because the address space is limited; we don't actually use addresses all the way up to 2 to the 64.
- Since these high and low bits are always 0, we can pick one of them and set it to 1. That tells us immediately that this is not really a pointer to an object, and we can then give all the other bits some other meaning. We call this kind of pointer a tagged pointer.
0000 0000 0000 0000 0000 0000 0000 0001 0000 0000 0011 0000 0100 0001 1110 0001 (the last group changes from 0000 to 0001)
For example, we could put a numeric value into the other bits. As long as we teach NSNumber how to read those bits and make the runtime handle tagged pointers appropriately, the rest of the system can treat these values as object pointers and never know the difference.
This saves us the cost of allocating a small number object for every case like this, which is a major improvement.
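A minimal C sketch of the idea, assuming the lowest bit is used as the tag and the remaining bits hold a small non-negative integer. This is not Apple's actual tagged pointer format; the `SKETCH_TAG_BIT` macro and the three functions are hypothetical names for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: the lowest bit is the tag. It is always 0
   in a properly aligned object pointer. */
#define SKETCH_TAG_BIT ((uintptr_t)1)

/* A value is tagged if its low bit is set. */
static int sketch_is_tagged(uintptr_t p) {
    return (p & SKETCH_TAG_BIT) != 0;
}

/* Pack a small non-negative integer into a pointer-sized value:
   the payload sits above the tag bit. */
static uintptr_t sketch_make_tagged(uintptr_t value) {
    assert(value <= (UINTPTR_MAX >> 1)); /* must fit in the remaining bits */
    return (value << 1) | SKETCH_TAG_BIT;
}

/* Recover the integer from a tagged value. */
static uintptr_t sketch_tagged_value(uintptr_t p) {
    assert(sketch_is_tagged(p));
    return p >> 1;
}
```

With this scheme, reading the number never touches the heap: the value travels inside the pointer-sized word itself, and an ordinary aligned address such as 0x00000001003041E0 is still recognized as a real pointer because its low bit is 0.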