Foreword: Exploring the underlying principles of iOS is a series I have built up from my everyday development and study as I work toward more advanced topics. It records my ongoing journey of exploration, and I hope it is helpful to all readers.

The directory is as follows:

  1. iOS underlying principles of alloc exploration
  2. The underlying principles of iOS are explored
  3. The underlying principles of iOS explore the nature of objects & isa’s underlying implementation
  4. Isa-basic Principles of iOS (Part 1)
  5. Isa-basic Principles of iOS (Middle)
  6. Isa-class Basic Principles of iOS Exploration (2)
  7. iOS fundamentals explore the nature of Runtime & methods
  8. Objc_msgSend: Exploring the underlying principles of iOS
  9. Slow lookups in iOS Runtime
  10. A dynamic approach to iOS fundamentals
  11. The underlying principles of iOS explore the message forwarding process
  12. Dyld (part 1)
  13. iOS Basic Principles of application loading principle dyld (ii)
  14. iOS basic principles explore the loading of classes
  15. The underlying principles of iOS explore the loading of categories
  16. iOS underlying principles to explore the associated object
  17. iOS underlying principle of the wizard KVC exploration
  18. Exploring the underlying principles of iOS: KVO Principles | More challenges in August
  19. Exploring the underlying principles of iOS: Rewritten KVO | More challenges in August
  20. The underlying principles of iOS: Multi-threading | More challenges in August
  21. GCD functions and queues in iOS
  22. GCD principles of iOS (Part 1)
  23. iOS Low-level – What do you know about deadlocks?
  24. iOS Low-level – Singleton destruction is possible?
  25. iOS Low-level – Dispatch Source
  26. iOS bottom – a fence letter blocks the number
  27. iOS low-level – Be there or be Square semaphore
  28. iOS underlying GCD – In and out into a scheduling group
  29. Basic principles of iOS – Basic use of locks
  30. iOS underlying – @synchronized Flow analysis
  31. iOS low-level – The principle of lock exploration
  32. iOS Low-level – allows you to implement a read/write lock
  33. Implementation of Objective-C Block
  34. Implementation of Objective-C Block
  35. iOS bottom – Block, comprehensive resolution!
  36. iOS Basics – Startup Optimization (part 1)
  37. iOS Basics – Startup Optimization (2)
  38. Exploration of basic principles of iOS: Memory management of memory five areas

Summary of the above column

  • Summary of iOS underlying principles of exploration

Sort out the details

  • Summary of iOS development details

Preface

Following on from the last article, we look at some of the optimizations Apple has made to the Objective-C runtime. According to Apple, our apps run faster even if we make no optimizations ourselves. So what does Apple optimize under the hood to make that possible? Today, we’re going to find out.

Memory management scheme

When it comes to memory management, most people think of ARC and MRC. Here we will look at the following three schemes:

First, let’s review the changes Apple has made to the runtime.

Changes to the Objective-C runtime announced at Apple’s WWDC 2020

Advancements in the Objective-C runtime

Dive into the microscopic world of low-level bits and bytes that underlie every Objective-C and Swift class. Find out how recent changes to internal data structures, method lists, and tagged pointers provide better performance and lower memory usage. We’ll demonstrate how to recognize and fix crashes in code that depends on internal details, and show you how to keep your code unaffected by changes to the runtime.

Tagged Pointer Format Changes

What is Tagged Pointer

Let’s start by looking at the structure of a normal object pointer. When we see these pointers, they are usually printed as big hexadecimal numbers. Let’s break one out into its binary representation. We have 64 bits, but we don’t really use all of them; a real object pointer only ever uses the bits in the middle. The low bits are always zero because of alignment requirements: objects must always be located at an address that is a multiple of the pointer size. The high bits are always zero because the address space is limited; we don’t actually go all the way up to 2^64.

So these high and low bits are always 0. Let’s pick one of these always-zero bits and set it to 1. That immediately tells us that this is not a real object pointer, and then we can assign some other meaning to all the other bits.

We call this kind of pointer a tagged pointer.

For example, we might stuff a numeric value into the other bits.

As long as we teach NSNumber how to read those bits and have the runtime handle tagged pointers appropriately, the rest of the system can treat these things like object pointers and never know the difference.

This saves us the overhead of allocating a tiny number object for every case like this, which is a significant win.
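The idea of borrowing an always-zero bit can be sketched in a few lines of C. This is a toy layout made up for illustration (the `toy_` names and the choice of bit 0 as the marker are assumptions, not Apple's actual format), and it assumes a 64-bit platform:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy layout, for illustration only (not Apple's format):
   bit 0 marks a tagged value, the upper bits hold a small integer. */
_Static_assert(sizeof(uintptr_t) == 8, "sketch assumes 64-bit pointers");

#define TOY_TAG_BIT ((uintptr_t)1)

/* Real object pointers are aligned, so bit 0 is clear for them. */
static int toy_is_tagged(uintptr_t word) {
    return (word & TOY_TAG_BIT) != 0;
}

/* Store the number directly in the pointer-sized word; nothing is allocated. */
static uintptr_t toy_make_tagged(uint32_t number) {
    return ((uintptr_t)number << 1) | TOY_TAG_BIT;
}

/* Read the number back out of the word. */
static uint32_t toy_payload(uintptr_t word) {
    assert(toy_is_tagged(word));
    return (uint32_t)(word >> 1);
}
```

Calling `toy_make_tagged(42)` yields a word whose low bit is set, so `toy_is_tagged` can tell it apart from any aligned address, and `toy_payload` returns 42 without touching the heap.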

By the way, these values are actually obfuscated by combining them with a randomized value that is initialized at process startup.

This is a security measure that makes it difficult to forge a tagged pointer.

We’ll ignore this for the rest of the discussion, since it is just an extra layer on top. Just be aware that if you actually try to look at these values in memory, they will be scrambled.

Tagged Pointers On Intel

This is the full format of a tagged pointer on Intel.

We set the low bit to 1 to indicate that this is a tagged pointer.

As we discussed, this bit must always be zero for a real pointer, so this allows us to tell them apart.

The next three bits are the tag number, which indicates the type of the tagged pointer. For example, 3 means it’s an NSNumber and 6 means it’s an NSDate.

Since we have three tag bits, there are eight possible tag types.

The rest of the bits are the payload, which is data that the particular type can use however it likes.

For a tagged NSNumber, this is the actual number.

Now, tag 7 is a special case: it indicates an extended tag. An extended tag uses the next 8 bits to encode the type, which allows 256 more tag types at the cost of a smaller payload.

This allows us to use tagged pointers for more types, as long as they can fit their data into a smaller space.

This gets used for things like tagged UI colors or NSIndexSets.
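The Intel layout described above can be written out as shifts and masks. The following is a minimal sketch: the bit positions follow the description (marker in bit 0, tag in bits 1 to 3, an optional 8-bit extended tag after that), but the `SK_` and `sk_` names are invented for this example, the payload is treated as a raw unsigned value, and obfuscation is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Names are hypothetical; only the bit positions come from the text. */
#define SK_TAG_BIT            ((uint64_t)1)    /* bit 0: tagged-pointer marker */
#define SK_TAG_SHIFT          1                /* bits 1-3: tag number         */
#define SK_TAG_MASK           ((uint64_t)0x7)
#define SK_PAYLOAD_SHIFT      4                /* bits 4-63: 60-bit payload    */
#define SK_EXT_TAG_SHIFT      4                /* bits 4-11: extended tag      */
#define SK_EXT_TAG_MASK       ((uint64_t)0xff)
#define SK_EXT_PAYLOAD_SHIFT  12               /* bits 12-63: 52-bit payload   */

static int sk_is_tagged(uint64_t p) { return (p & SK_TAG_BIT) != 0; }

static unsigned sk_tag(uint64_t p) {
    return (unsigned)((p >> SK_TAG_SHIFT) & SK_TAG_MASK);
}

/* Tag 7 means "look at the next 8 bits for the real type". */
static int sk_is_extended(uint64_t p) { return sk_tag(p) == 7; }

static unsigned sk_ext_tag(uint64_t p) {
    return (unsigned)((p >> SK_EXT_TAG_SHIFT) & SK_EXT_TAG_MASK);
}

static uint64_t sk_payload(uint64_t p) {
    return sk_is_extended(p) ? p >> SK_EXT_PAYLOAD_SHIFT : p >> SK_PAYLOAD_SHIFT;
}

/* Build a basic tagged value (tags 0-6, 60-bit payload). */
static uint64_t sk_make(unsigned tag, uint64_t payload) {
    assert(tag < 7);
    assert(payload < ((uint64_t)1 << 60));
    return (payload << SK_PAYLOAD_SHIFT) | ((uint64_t)tag << SK_TAG_SHIFT) | SK_TAG_BIT;
}

/* Build an extended tagged value (8-bit extended tag, 52-bit payload). */
static uint64_t sk_make_ext(unsigned ext_tag, uint64_t payload) {
    assert(ext_tag < 256);
    assert(payload < ((uint64_t)1 << 52));
    return (payload << SK_EXT_PAYLOAD_SHIFT) |
           ((uint64_t)ext_tag << SK_EXT_TAG_SHIFT) |
           ((uint64_t)7 << SK_TAG_SHIFT) | SK_TAG_BIT;
}
```

For instance, `sk_make(3, 42)` produces `0x2a7`: the payload 42 in the high bits, tag 3 in bits 1 to 3, and the marker in bit 0. How a real class interprets its payload bits is up to that class, so the raw value here should not be read as what NSNumber actually stores.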

Now, if this seems really handy to you, you might be disappointed to hear that only the runtime maintainer, Apple, can add tagged pointer types.

But if you’re a Swift programmer, you’ll be happy to know that you can create your own kinds of tagged pointers. If you’ve ever used an enum with an associated value that is a class, that is like a tagged pointer.

The Swift runtime stores the enum discriminator in the spare bits of the associated value payload.

Moreover, Swift’s use of value types actually makes tagged pointers less important, because values no longer need to be exactly pointer-sized.

For example, a Swift UUID type can be two words and held inline, instead of allocating a separate object because it doesn’t fit inside a pointer.

So that’s tagged pointers on Intel. Let’s have a look at ARM.

Tagged Pointers On ARM64

On arm64, things are flipped.

Instead of the bottom bit, the top bit is set to 1 to indicate a tagged pointer.

Then the tag number comes in the next three bits.

The payload uses the remaining bits.

Why do we use the top bit to indicate a tagged pointer on ARM, instead of the bottom bit as we do on Intel?

Well, it’s actually a tiny optimization for objc_msgSend.

We want the most common path in msgSend to be as fast as possible, and the most common path is a normal pointer.

We have two less common cases:

tagged pointers and nil.

It turns out that when we use the top bit, we can check for both of them with a single comparison, instead of checking for nil and tagged pointers separately. This saves a conditional branch for the common case in msgSend.
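To see why the top bit makes a single comparison possible, consider reading the receiver as a signed 64-bit integer: nil is zero, and any word with bit 63 set is negative. The sketch below contrasts the two approaches in C. The function names are made up for illustration, and the real check lives in the runtime's hand-written assembly rather than in C like this.

```c
#include <assert.h>
#include <stdint.h>

_Static_assert(sizeof(intptr_t) == 8, "sketch assumes 64-bit pointers");

/* Two separate tests: one branch for nil, another for the tag bit. */
static int needs_uncommon_path_two_checks(uintptr_t receiver) {
    if (receiver == 0) return 1;            /* nil */
    if ((receiver >> 63) != 0) return 1;    /* top bit set: tagged pointer */
    return 0;
}

/* One test: as a signed value, nil is 0 and a tagged pointer is negative,
   so "<= 0" covers both. (The unsigned-to-signed conversion is
   implementation-defined in C; mainstream compilers wrap as two's complement.) */
static int needs_uncommon_path_one_check(uintptr_t receiver) {
    return (intptr_t)receiver <= 0;
}
```

Both functions agree on every input, but the second needs only one compare-and-branch before the fast path for ordinary pointers continues.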

And just like on Intel, we have a special case for tag 7, where the next 8 bits are used as the extended tag and the remaining bits are used for the payload. Or rather, that was the old format, used in iOS 13.

In the 2020 release, we’re moving things around a bit. The tag bit stays at the top, because that msgSend optimization is still really useful.

The tag number now moves to the bottom three bits.

The extended tag, if in use, occupies the high eight bits following the tag bit.

Why do we do this? Well, let’s look at a normal pointer again.

Existing tools, like the dynamic linker, ignore the top eight bits of a pointer due to an ARM feature called Top Byte Ignore.

And we’ll put the extended tag in the Top Byte Ignore bits.

For an aligned pointer, the bottom three bits are always 0.

But we can fiddle with that just by adding a small number to the pointer.

We’ll add 7 to set the low bits to 1. Remember, 7 means this is an extended tag.

This means we can actually fit the pointer above into an extended tagged pointer’s payload.

The result is a tagged pointer with a normal pointer in its payload.
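As a rough illustration of this trick, the sketch below wraps an aligned address into the newer arm64 layout and recovers it again. The bit positions are taken from the description above (marker in bit 63, extended tag in the eight bits below it, tag number in bits 0 to 2); the macro and function names are hypothetical, the extended tag is simply left at 0, and obfuscation is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Names are hypothetical; bit positions follow the description in the text. */
#define NEW_TAG_BIT    ((uint64_t)1 << 63)   /* top bit: tagged-pointer marker */
#define NEW_TAG_MASK   ((uint64_t)0x7)       /* bits 0-2: tag number           */
#define NEW_EXT_SHIFT  55                    /* bits 55-62: extended tag       */
#define NEW_EXT_MASK   ((uint64_t)0xff)

/* Wrap an 8-byte-aligned address: adding 7 fills the three low bits
   (tag 7 = extended tag), then the marker bit is set at the top. */
static uint64_t wrap_pointer(uint64_t address) {
    assert((address & NEW_TAG_MASK) == 0);    /* must be aligned               */
    assert((address >> NEW_EXT_SHIFT) == 0);  /* real addresses leave these 0  */
    return (address + 7) | NEW_TAG_BIT;
}

/* Recover the address by clearing the marker, the extended tag byte,
   and the low three tag bits. */
static uint64_t unwrap_pointer(uint64_t tagged) {
    return tagged & ~(NEW_TAG_BIT | (NEW_EXT_MASK << NEW_EXT_SHIFT) | NEW_TAG_MASK);
}
```

With the address `0x100003f50`, `wrap_pointer` gives `0x8000000100003f57`, and `unwrap_pointer` returns the original address unchanged.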

Why is this useful? Well, it opens up the ability for a tagged pointer to refer to constant data in your binary, such as strings or other data structures that would otherwise have to occupy dirty memory.

Of course, these changes mean that code which accesses these bits directly will no longer work when iOS 14 is released later this year.

A bitwise check like this would have worked in the past.

But on future OS versions it will give you the wrong answer, and your app will start mysteriously corrupting user data.

So don’t use code that relies on anything we’ve just talked about. Instead, you can probably guess what I’m going to say, which is: use the APIs.

Type checks like isKindOfClass worked on the old tagged pointer format, and they will continue to work on the new tagged pointer format. All NSString and NSNumber methods just keep on working. All of the information in these tagged pointers can be retrieved through the standard APIs.


According to Apple, tagged pointers are 3x more space efficient and 106x faster to create and destroy.


Code validation

Simulator environment

NSTaggedPointerString

Its address, 0xe162FF2824E12CCF, looks rather odd. As noted in the previous article, an address usually starts with 0x7 if it is on the stack, 0x6 if it is on the heap, and 0x1 if it is in the global area.

Let’s take a look at the runtime source:

_objc_decodeTaggedPointer


// These values are obfuscated by combining them with a random value initialized when the process starts.
// This security measure makes it difficult to forge a tagged pointer.
static inline uintptr_t
_objc_decodeTaggedPointer_noPermute(const void * _Nullable ptr)
{
    uintptr_t value = (uintptr_t)ptr;
#if OBJC_SPLIT_TAGGED_POINTERS
    if ((value & _OBJC_TAG_NO_OBFUSCATION_MASK) == _OBJC_TAG_NO_OBFUSCATION_MASK)
        return value;
#endif
    return value ^ objc_debug_taggedpointer_obfuscator;
}

static inline uintptr_t
_objc_decodeTaggedPointer(const void * _Nullable ptr)
{
    uintptr_t value = _objc_decodeTaggedPointer_noPermute(ptr);
#if OBJC_SPLIT_TAGGED_POINTERS
    uintptr_t basicTag = (value >> _OBJC_TAG_INDEX_SHIFT) & _OBJC_TAG_INDEX_MASK;

    value &= ~(_OBJC_TAG_INDEX_MASK << _OBJC_TAG_INDEX_SHIFT);
    value |= _objc_obfuscatedTagToBasicTag(basicTag) << _OBJC_TAG_INDEX_SHIFT;
#endif
    return value;
}

/***********************************************************************
* initializeTaggedPointerObfuscator
* Initialize objc_debug_taggedpointer_obfuscator with randomness.
*
* The tagged pointer obfuscator is intended to make it more difficult
* for an attacker to construct a particular object as a tagged pointer,
* in the presence of a buffer overflow or other write control over some
* memory. The obfuscator is XORed with the tagged pointers when setting
* or retrieving payload values. They are filled with randomness on first
* use.
**********************************************************************/
static void
initializeTaggedPointerObfuscator(void)
{
    if (!DisableTaggedPointerObfuscation) {
        // Pull random data into the variable, then shift away all non-payload bits.
        arc4random_buf(&objc_debug_taggedpointer_obfuscator,
                       sizeof(objc_debug_taggedpointer_obfuscator));
        objc_debug_taggedpointer_obfuscator &= ~_OBJC_TAG_MASK;

#if OBJC_SPLIT_TAGGED_POINTERS
        // The obfuscator doesn't apply to any of the extended tag mask or the no-obfuscation bit.
        objc_debug_taggedpointer_obfuscator &= ~(_OBJC_TAG_EXT_MASK | _OBJC_TAG_NO_OBFUSCATION_MASK);

        // Shuffle the first seven entries of the tag permutator.
        int max = 7;
        for (int i = max - 1; i >= 0; i--) {
            int target = arc4random_uniform(i + 1);
            swap(objc_debug_tag60_permutations[i],
                 objc_debug_tag60_permutations[target]);
        }
#endif
    } else {
        // Set the obfuscator to zero for apps linked against older SDKs,
        // in case they're relying on the tagged pointer representation.
        objc_debug_taggedpointer_obfuscator = 0;
    }
}

_objc_makeTaggedPointer

static inline void * _Nonnull
_objc_makeTaggedPointer(objc_tag_index_t tag, uintptr_t value)
{
    // PAYLOAD_LSHIFT and PAYLOAD_RSHIFT are the payload extraction shifts.
    // They are reversed here for payload insertion.

    // ASSERT(_objc_taggedPointersEnabled());
    if (tag <= OBJC_TAG_Last60BitPayload) {
        // ASSERT(((value << _OBJC_TAG_PAYLOAD_RSHIFT) >> _OBJC_TAG_PAYLOAD_LSHIFT) == value);
        uintptr_t result =
            (_OBJC_TAG_MASK |
             ((uintptr_t)tag << _OBJC_TAG_INDEX_SHIFT) |
             ((value << _OBJC_TAG_PAYLOAD_RSHIFT) >> _OBJC_TAG_PAYLOAD_LSHIFT));
        return _objc_encodeTaggedPointer(result);
    } else {
        // ASSERT(tag >= OBJC_TAG_First52BitPayload);
        // ASSERT(tag <= OBJC_TAG_Last52BitPayload);
        // ASSERT(((value << _OBJC_TAG_EXT_PAYLOAD_RSHIFT) >> _OBJC_TAG_EXT_PAYLOAD_LSHIFT) == value);
        uintptr_t result =
            (_OBJC_TAG_EXT_MASK |
             ((uintptr_t)(tag - OBJC_TAG_First52BitPayload) << _OBJC_TAG_EXT_INDEX_SHIFT) |
             ((value << _OBJC_TAG_EXT_PAYLOAD_RSHIFT) >> _OBJC_TAG_EXT_PAYLOAD_LSHIFT));
        return _objc_encodeTaggedPointer(result);
    }
}

This random value is initialized in read_images.

We can take the obfuscated value, decode it ourselves, and recover the real, unobfuscated tagged pointer value.
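The decoding step is just an XOR with the same random value, as in the `return value ^ objc_debug_taggedpointer_obfuscator;` line of the source. A minimal sketch of that round trip follows. The `demo_` names and the sample values are assumptions made for illustration, the marker is assumed to sit in bit 63, and the extra tag permutation used when `OBJC_SPLIT_TAGGED_POINTERS` is set is left out.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TAG_BIT ((uint64_t)1 << 63)   /* assumed marker position */

/* Mirrors `obfuscator &= ~_OBJC_TAG_MASK`: clear the marker bit from the
   random value so that XOR can never flip "is this a tagged pointer?". */
static uint64_t demo_prepare_obfuscator(uint64_t random_bits) {
    return random_bits & ~DEMO_TAG_BIT;
}

/* Encoding and decoding are the same operation, because x ^ k ^ k == x. */
static uint64_t demo_encode(uint64_t plain, uint64_t obfuscator) {
    return plain ^ obfuscator;
}

static uint64_t demo_decode(uint64_t scrambled, uint64_t obfuscator) {
    return scrambled ^ obfuscator;
}
```

Because the marker bit is masked out of the obfuscator, a scrambled value still looks like a tagged pointer to the runtime, while its tag and payload bits are unreadable until they are XORed with the process's random value again.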

Let’s look at the next example:

How do the types correspond here?

In the following definition:

Real device environment

What is actually printed here still differs from what Ben described at WWDC, which suggests that Apple has made further small changes (the system version of the device I tested on is not 14.2);

After the string, let’s look at NSNumber:

Supplement

Finally, we add an environment variable to the debug scheme to turn off obfuscation (the DisableTaggedPointerObfuscation flag seen in the source above) and run the print test again.