My interviews are almost over, and unless something unexpected happens (for example, the employer withdrawing the offer at the last minute), I will join a company that I think gives me more room to grow. While preparing for the interviews, I bought a copy of iOS Interview Tips and found some flaws in its technical section. I emailed the co-author of that section, but later discovered that my email itself contained mistakes. Now that I have time, I have proofread and expanded the material and decided to publish it on the Chinese web. Because one of the topics involves the lengthy details of the Swift object model and runtime, it is published as a separate article, with the rest in a second article. This first article discusses the linked-list code example on page 34 of iOS Interview Tips.

Automatic reference counting + long list release == stack overflow?

I noticed that the linked-list code example on page 34 of this book uses automatic reference counting to manage the successor node of each list node. In my experience, however, this construction can crash the app if the list grows long enough to overflow the stack when it is released.

class ListNode {
    var val: Int
    var next: ListNode?
    
    init(_ val: Int) {
        self.val = val
        self.next = nil
    }
}

Open Xcode, create a new macOS command-line project, and type:

import Foundation

print("Main thread stack size: ", Thread.main.stackSize)

class ListNode {
    var val: Int
    var next: ListNode?
    
    init(_ val: Int) {
        self.val = val
        self.next = nil
    }
}

autoreleasepool {
    let root = ListNode(0)
    
    var current = root
    
    for _ in 0..<100000 {
        let next = ListNode(0)
        current.next = next
        current = next
    }
}

print("Foo")

Question to consider: Do you notice anything strange in the above code?

If the stack does not overflow, the program will eventually print “Foo”. But at runtime the program crashes; Xcode switches the left Navigator panel to the Debug Navigator and displays the following:

Intuitively, we can guess that the code the compiler emitted to release next in ListNode’s automatically generated deinit is recursive, and that 100,000 nodes make the recursion deep enough to overflow the call stack. But two questions remain:

  • Is it really so?
  • Doesn’t Swift perform tail-call optimization?

Tail-call optimization means that when a recursive function calls itself as the very last action of its body, the compiler can reuse the current stack frame instead of creating a new one, avoiding the danger of overflowing the stack when the recursion gets deep. This optimization is an optional part of compiler optimization and is at the discretion of the compiler vendor.
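To make the term concrete, here is a minimal sketch of mine (not from the book) contrasting a tail-recursive function with its iterative rewrite; since Swift does not guarantee tail-call optimization, the recursive form may still overflow the stack for very large inputs:

```swift
// Sketch (illustration only): the same sum written tail-recursively
// and iteratively. Swift does NOT guarantee tail-call optimization,
// so the recursive version can still overflow the stack for huge n;
// the loop version cannot.
func sumRecursive(_ n: Int, _ acc: Int = 0) -> Int {
    if n == 0 { return acc }
    // Tail call: the recursive call is the last action in the body.
    return sumRecursive(n - 1, acc + n)
}

func sumIterative(_ n: Int) -> Int {
    var acc = 0
    var i = n
    while i > 0 {
        acc += i
        i -= 1
    }
    return acc
}
```

Note that even at -O, nothing in the language definition obliges the compiler to perform this optimization.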

Understanding tail recursion in depth, and discussing tail-call optimization in iOS development, involves the procedure abstraction of the runtime environment in compiler theory. I will write about procedure abstraction in iOS development later; we won’t go into it here.

All of these issues require a basic understanding of how Swift works when releasing class instances.

Trace the ListNode instance release process

To understand how Swift works in more detail, it is often tempting to look directly at the compiled binary. We can save the ListNode code above in a file named CallStack.swift, compile it on the command line with swiftc CallStack.swift, and open the result in Hopper:

Apparently, the Swift compiler uses a technique similar to the C++ compiler’s name mangling to adapt the names of the members of a translation unit, which minimizes symbol collisions between functions and types in Swift. But it also means that, in this figure, beyond guessing the procedures corresponding to the accessors of the val and next properties (that is, the machine code of those functions in the binary), we can no longer tell which procedure corresponds to the automatically generated deinit function.

The distinction between “function” and “procedure” varies with context. In the early days of programming, a function returned a value while a procedure did not. Nowadays people barely distinguish the two. Here I use “function” for a function in Swift source, and “procedure” for the corresponding machine-code sequence in the binary.

If we could go back to a higher-level stage of compilation at this point, we might find the answer. Many readers, however, only know the pipeline of converting source code to an AST (abstract syntax tree), the AST to LLVM IR (LLVM intermediate representation), LLVM IR to machine code, and then the linker combining object files into an executable or library; for them, viewing and analyzing an AST or LLVM IR is already new. But Swift’s compilation pipeline differs a little from the traditional Clang + LLVM pipeline: a product called SIL (Swift Intermediate Language) is generated between the AST and the LLVM IR. SIL is a layer of abstraction between the Swift AST and LLVM IR, an intermediate compiler representation in SSA form (static single-assignment form) with high-level (as opposed to machine-level) semantics. If that means nothing to you, think of SIL as a combination of high-level language semantics (an “instruction set” abstracted over the Swift runtime, not just the target platform) and low-level machine abstractions (for example, using %0 %1 %2 … %n to denote virtual registers, while still allowing partial manipulation of memory layout). Since it has both high-level semantics and a low-level form, maybe we can find information about ListNode’s deinit function here. (If you want to know more about what SIL is, you can check out this video on YouTube, though it requires a little compiler knowledge.)

Run swiftc -emit-sil CallStack.swift on the command line to obtain the SIL of the source code.

Both -emit-sil and -emit-silgen can be used to generate Swift SIL; the former emits canonical SIL after the diagnostic passes have run, while the latter emits the raw SIL.

Deinit process analysis: SIL perspective

By searching for ListNode.deinit we can find the following:

// ListNode.deinit
sil hidden @$s9CallStack8ListNodeCfd : $@convention(method) (@guaranteed ListNode) -> @owned Builtin.NativeObject {
// %0                                             // users: %4, %2, %1
bb0(%0 : $ListNode):
  debug_value %0 : $ListNode, let, name "self", argno 1 // id: %1
  %2 = ref_element_addr %0 : $ListNode, #ListNode.next // user: %3
  destroy_addr %2 : $*Optional<ListNode>          // id: %3
  %4 = unchecked_ref_cast %0 : $ListNode to $Builtin.NativeObject // user: %5
  return %4 : $Builtin.NativeObject               // id: %5
} // end sil function '$s9CallStack8ListNodeCfd'

Interestingly, even if you write an empty deinit in ListNode yourself, the Swift compiler generates the same SIL. As you can see, the compiler automatically appends the destroy code for all instance members to your custom deinit.

The key part of $s9CallStack8ListNodeCfd is:

bb0(%0 : $ListNode):
  ...
  %2 = ref_element_addr %0 : $ListNode, #ListNode.next // user: %3
  destroy_addr %2 : $*Optional<ListNode>          // id: %3
  ...
}

First of all, notice the bb0:

bb stands for basic block, a concept from LLVM: a basic block of code.

A Swift function lowered to SIL may contain only one basic block, or many, depending on control flow (if-else, switch-case, and so on). bb0 is the 0th basic block in the function body.
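For intuition, a hypothetical function like the sketch below (mine, not from the article's sources) would typically lower to several basic blocks, roughly one per branch arm plus the fall-through; the exact block numbering is up to the compiler:

```swift
// Illustration only: a function with control flow like this typically
// lowers to multiple SIL basic blocks -- roughly one per branch arm.
func sign(_ x: Int) -> Int {
    if x > 0 {          // one basic block for this arm
        return 1
    } else if x < 0 {   // another basic block for this arm
        return -1
    }
    return 0            // and one for the fall-through
}
```

Running swiftc -emit-sil on a function like this and counting the bbN labels is a quick way to see the split for yourself.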

Then, we can also notice that bb0 is followed by (%0 : $ListNode):

Here, %0 is essentially a local variable of the basic block bb0; it refers to virtual register 0 and has type ListNode. In particular, virtual register 0 acts as the first “formal parameter” of this basic block. bb0 is not a function, of course; I’m only borrowing the “formal parameter” concept to help you understand what %0 is. Later in the SIL you will see %1 %2 %3 … %n; these all stand for “the nth virtual register” and are also called “local variables”. Again, this notation comes from LLVM IR, from which SIL borrows.

Finally, there is the question of what ref_element_addr and destroy_addr are:

These are SIL instructions, and they are why SIL is called SIL (Swift Intermediate Language): they are abstractions not over the target platform’s instruction set but over the Swift runtime. We can think of these SIL instructions as functions; once the LLVM IR is generated, they actually do call the corresponding C++ functions in the Swift runtime.

Let’s talk about what this SIL does:

This excerpt passes %0 of bb0 into the ref_element_addr instruction, catches the return value in %2, and passes %2 into the destroy_addr instruction.

We can find a description of the ref_element_addr instruction here:

sil-instruction ::= 'ref_element_addr' sil-operand ',' sil-decl-ref

%1 = ref_element_addr %0 : $C, #C.field
// %0 must be a value of class type $C
// #C.field must be a non-static physical field of $C
// %1 will be of type $*U where U is the type of the selected field
//   of C

Given an instance of a class, derives the address of a physical instance variable inside the instance. It is undefined behavior if the class value is null.

So we know that this instruction is used to get the address of the instance variable.

sil-instruction ::= 'ref_element_addr' sil-operand ',' sil-decl-ref is the grammar of this instruction, while

%2 = ref_element_addr %0 : $ListNode, #ListNode.next

This gets the address of the ListNode.next instance variable of the ListNode instance in %0.

The next line:

destroy_addr %2 : $*Optional<ListNode>

We can find the documentation for destroy_addr here:

sil-instruction ::= 'destroy_addr' sil-operand

destroy_addr %0 : $*T
// %0 must be of an address $*T type

Destroys the value in memory at address %0. If T is a non-trivial type, this is equivalent to:

%1 = load %0
strong_release %1

except that destroy_addr may be used even if %0 is of an address-only type. This does not deallocate memory; it only destroys the pointed-to value, leaving the memory uninitialized.

If T is a trivial type, then destroy_addr is a no-op.

We can learn from this that the destroy_addr line above is in effect equivalent to executing:

strong_release %2 : $*Optional<ListNode>

The documentation explains strong_release as follows:

strong_release %0 : $T
// $T must be a reference type.

Decrements the strong reference count of the heap object referenced by %0. If the release operation brings the strong reference count of the object to zero, the object is destroyed and @weak references are cleared. When both its strong and unowned reference counts reach zero, the object’s memory is deallocated.

So we know that strong_release decrements the object’s strong reference count; when that count reaches zero, the object is destroyed and its weak references are set to nil. When both the strong and unowned reference counts reach zero, the object’s memory is deallocated.
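A minimal sketch of this behaviour at the Swift level (my illustration, not from the runtime sources): dropping the last strong reference destroys the object, and the runtime clears any weak references to it:

```swift
// Payload is a hypothetical class used only for this illustration.
class Payload {}

var strongRef: Payload? = Payload()
weak var weakRef = strongRef

// Dropping the last strong reference brings the strong count to zero:
// the object is destroyed and weak references are cleared, exactly as
// the strong_release documentation describes.
strongRef = nil
// weakRef is now nil
```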

Because nothing holds an unowned reference to the next node, only the strong reference inside its predecessor ListNode, it is not hard to conclude that this SIL line destroys the instance variable next, which in turn causes the memory that next points to to be deallocated.

But the deinit function’s SIL does not deallocate the ListNode instance itself. Is there a function that “wraps” deinit and releases the ListNode? Or is freeing an object instance actually done by the Swift runtime?

A “surprise”…

Let’s test the first guess. Searching our generated SIL for $s9CallStack8ListNodeCfd, the procedure for deinit, we can see that the Swift compiler also generates a procedure named $s9CallStack8ListNodeCfD (note the capital D), whose Swift-level name, according to the comment in the SIL, is ListNode.__deallocating_deinit. This procedure wraps our deinit function:

// ListNode.__deallocating_deinit
sil hidden @$s9CallStack8ListNodeCfD : $@convention(method) (@owned ListNode) -> () {
// %0                                             // users: %3, %1
bb0(%0 : $ListNode):
  debug_value %0 : $ListNode, let, name "self", argno 1 // id: %1
  // function_ref ListNode.deinit
  %2 = function_ref @$s9CallStack8ListNodeCfd : $@convention(method) (@guaranteed ListNode) -> @owned Builtin.NativeObject // user: %3
  %3 = apply %2(%0) : $@convention(method) (@guaranteed ListNode) -> @owned Builtin.NativeObject // user: %4
  %4 = unchecked_ref_cast %3 : $Builtin.NativeObject to $ListNode // user: %5
  dealloc_ref %4 : $ListNode                      // id: %5
  %6 = tuple ()                                   // user: %7
  return %6 : $()                                 // id: %7
} // end sil function '$s9CallStack8ListNodeCfD'

So our first guess is true.

__deallocating_deinit Process analysis: SIL perspective

The main contents of the SIL of the ListNode.__deallocating_deinit function are as follows:

bb0(%0 : $ListNode):
  ...
  %2 = function_ref @$s9CallStack8ListNodeCfd : $@convention(method) (@guaranteed ListNode) -> @owned Builtin.NativeObject // user: %3
  %3 = apply %2(%0) : $@convention(method) (@guaranteed ListNode) -> @owned Builtin.NativeObject // user: %4
  %4 = unchecked_ref_cast %3 : $Builtin.NativeObject to $ListNode // user: %5
  dealloc_ref %4 : $ListNode                      // id: %5
  ...
}

This basic block loads a reference to the deinit procedure $s9CallStack8ListNodeCfd into %2, applies it to %0, and catches the returned value in %3; it then casts that value to ListNode, stores the result in %4, and passes %4 as the actual argument to the dealloc_ref SIL instruction.

We can find the dealloc_ref documentation here:

sil-instruction ::= 'dealloc_ref' ('[' 'stack' ']')? sil-operand

dealloc_ref [stack] %0 : $T
// $T must be a class type

Deallocates an uninitialized class type instance, bypassing the reference counting mechanism.

The type of the operand must match the allocated type exactly, or else undefined behavior results.

The instance must have a retain count of one.

This does not destroy stored properties of the instance. The contents of stored properties must be fully uninitialized at the time dealloc_ref is applied.

The stack attribute indicates that the instruction is the balanced deallocation of its operand which must be a alloc_ref [stack]. In this case the instruction marks the end of the object’s lifetime but has no other effect.

So we know that the dealloc_ref SIL instruction deallocates an already de-initialized class instance, bypassing the reference counting mechanism and requiring the instance’s retain count to be one. Moreover, this instruction does not destroy the stored properties inside the instance; their contents must already be fully uninitialized by the time dealloc_ref executes (which is exactly what ListNode.deinit takes care of).

You may wonder about the notation sil-instruction ::= 'dealloc_ref' ('[' 'stack' ']')? sil-operand. I once saw this notation in a book on compiler construction published in the United States in the 1990s: it is called EBNF, short for Extended Backus-Naur Form, a notation for describing context-free grammars (CFG). If “context-free grammar” means nothing to you, think of it as a description of a language’s constructs stripped of everything that requires semantic checking, such as the requirement that argument types match parameter types in a function call, or that both sides of an assignment have compatible types. And this does not apply only to programming languages; natural languages can be described this way too. EBNF extends the BNF notation we usually study in compiler courses so that context-free grammars can be written more concisely.

Read aloud, this EBNF rule says:

A sil-instruction can derive 'dealloc_ref', optionally followed by '[' 'stack' ']', followed by a SIL operand (sil-operand); the '[' 'stack' ']' part is optional.

Examples of input accepted by the context-free grammar rule this EBNF describes are:

dealloc_ref [ stack ] %foo : $Bar

dealloc_ref %foo : $Bar

If we rewrite it in plain BNF, we have to write it like this:

sil-instruction ::= 'dealloc_ref' stack-operand-or-epsilon sil-operand
stack-operand-or-epsilon ::= '[' 'stack' ']' | epsilon

Isn’t the EBNF much simpler?

You may also notice that the documentation for dealloc_ref mentions alloc_ref, the SIL instruction that allocates a class instance, and an attribute called stack. From the EBNF explained above, you can easily work out that Swift supports allocating class instances on stack memory. But I haven’t had time to explore how to trigger this from code, so I won’t do it here.

SIL summary of the ListNode instance release process

From the SIL excerpts above we can infer that the ListNode.__deallocating_deinit function deallocates the heap memory occupied by an object, and that it does so by first calling ListNode.deinit, which is responsible for destroying the instance’s members. Since nothing but its predecessor holds a strong (or unowned) reference to a node, destroying the next member also causes the memory of the next node to be deallocated. Clearly, this procedure forms an indirect recursion.
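Since the recursion comes from each node releasing its successor, a common workaround, sketched here as my own illustration rather than anything from the book, is to unlink the chain iteratively in a hand-written deinit so that every node is destroyed with next == nil:

```swift
final class ListNode {
    // Instrumentation for this sketch only, to verify the cleanup.
    static var liveCount = 0
    var val: Int
    var next: ListNode?

    init(_ val: Int) {
        self.val = val
        ListNode.liveCount += 1
    }

    // Workaround sketch: hand the successor chain over to a loop so each
    // node is destroyed with next == nil, making the release O(1) stack
    // depth instead of an O(n) (indirect) recursion.
    deinit {
        ListNode.liveCount -= 1
        var node = next
        next = nil
        while let current = node {
            node = current.next
            current.next = nil // `current` is released here, non-recursively
        }
    }
}

func buildAndDropList(of count: Int) {
    precondition(count > 0)
    let root = ListNode(0)
    var cur = root
    for i in 1..<count {
        let n = ListNode(i)
        cur.next = n
        cur = n
    }
    // root goes out of scope here; the deinit loop frees the whole chain.
}
```

With this deinit, releasing the head of a 100,000-node list walks the chain in a loop instead of recursing once per node.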

Now we can open Hopper again and find the procedure for deinit.

There we find that the machine code swiftc generated for deinit eventually calls a procedure named _$s9CallStack8ListNodeCSgWOh, so let’s find it:

Yes: you’ll see that the procedure ends up calling swift_release which, based on our earlier study of the deinit SIL, should correspond to the destroy_addr (that is, strong_release) instruction, and it should be this procedure that finally performs the release of the next node.

Question to consider: guess the design intent behind Swift generating a separate function for the destruction of ListNode’s members and splitting it out of deinit (outlining).

We can guess from the name that this function probably does the same job as -[NSObject release] or CFRelease; from our general knowledge of reference counting, it should also free the memory once the reference count drops to zero. But this is just a guess, and we need to verify it against the source code.

Note here that swift_release is part of the Swift runtime and is provided by the Swift standard library, which is linked into the object file when a Swift program is compiled. In the analysis below we should not forget that we are examining the make-up of the Swift runtime, not of the ListNode program we wrote.

Tracking swift_release

Clone the Swift source code with Git and switch to the swift-5.0-branch branch.

Searching the Swift code for swift_release, we can find the following in include/swift/Runtime/HeapObject.h:

namespace swift {

...

SWIFT_RUNTIME_EXPORT
void swift_release(HeapObject *object);

...

}

swift_release is a function in the swift namespace with a formal parameter of type HeapObject *, but we can’t yet be sure it is what we’re looking for: this is the Swift runtime implemented in C++, and in C++ developers can choose which name mangling rule a symbol uses, C’s or C++’s.

Mangled under C’s rules, swift_release becomes _swift_release; mangled under C++’s rules… sorry, the C++ mangling rules are too complicated for me to remember, but the result would be a long way from _swift_release.

The extern “C” prefix is what makes a function or variable name in a C++ header/source file use C’s name mangling rules.

To verify that this is the swift_release function we are looking for, we need to find the SWIFT_RUNTIME_EXPORT macro. We can find its definition in stdlib/public/SwiftShims/Visibility.h:

#if defined(__cplusplus)
#define SWIFT_RUNTIME_EXPORT extern "C" SWIFT_EXPORT_ATTRIBUTE
#else
#define SWIFT_RUNTIME_EXPORT SWIFT_EXPORT_ATTRIBUTE
#endif

That is, the macro expands to extern “C” SWIFT_EXPORT_ATTRIBUTE when compiled as C++, and to plain SWIFT_EXPORT_ATTRIBUTE when compiled as C (including C source files). So we know that when the swift_release function in the swift namespace is compiled, its binary symbol is mangled in C style, not C++ style, and we can be sure this is the function we are looking for.

Brief description of swift_release signature

Here we have another question: why is the formal parameter of this function of type HeapObject *?

Many people think it is probably because Swift has a common root class like Objective-C’s, and simply doesn’t expose it.

This is half right, but it misses the point here. Indeed, as we will see later, when the Objective-C runtime is selected during Swift compilation, then in addition to the @objc message-forwarding level of interoperability between Swift and Objective-C objects, each Swift class gets a corresponding Objective-C class to guarantee the most basic life-cycle interoperability between the two languages, and all these Objective-C counterparts of Swift classes inherit from a common root class called SwiftObject.

As for “all Swift classes have a common root class called HeapObject”, this is not even close. Why is that?

First of all, in C++ there is no substantial difference between a struct and a class; their memory layouts are exactly the same, and the creators of C++ kept struct only for compatibility with C and to preserve the unity of “value semantics” with C. So even if the statement above were true, it should at least be rephrased as “all Swift classes have a common root type called HeapObject”.

What are “value semantics”?

The opposite of “value semantics” is “reference semantics”. Expressed in Swift code, the difference looks like this:

var a: Int = 4
let b = a

In the code above, a and b are entirely independent after the copy; that is “value semantics”.

class Foo {
    var int: Int = 0
    init(int: Int) { self.int = int }
}

var a = Foo(int: 0)
let b = a
b.int = 10 // a.int == 10; b.int == 10;

In the code above, a still points to the same object after being copied to b, and modifying the contents through b changes what a sees as well; that is “reference semantics”.
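As an aside, a Swift struct gives you value semantics directly; a minimal sketch (the FooValue name is just for illustration):

```swift
// In Swift, a struct is copied as an independent value, unlike the
// class example above where both names share one object.
struct FooValue {
    var int: Int = 0
}

let va = FooValue()
var vb = va      // a full copy, not a shared reference
vb.int = 10
// va.int == 0; vb.int == 10
```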

But if we write something like this in C++:

struct Foo {
    int value;
    Foo(int val): value(val) {}
};

auto a = Foo(0);
auto b = a;
b.value = 10; // a.value == 0; b.value == 10;

Then a’s value does not change when b’s member is assigned.

In general, though, we do this in C++:

struct Foo {
    int value;
    Foo(int val): value(val) {}
};

auto a = new Foo(0);
auto b = a;
b->value = 10; // a->value == 10; b->value == 10;

We can see that we changed Foo(0) to new Foo(0) and b.value = 10; to b->value = 10;. new Foo(0) allocates a Foo instance on the heap and returns a pointer, which lets us achieve the same effect as the Swift code above. But this is still not “reference semantics”: new Foo(0) returns a pointer to a Foo instance rather than a Foo itself, and -> is an operator that acts on pointers, so the code above revolves around pointers rather than the type. Simulating reference semantics with pointers does not mean that a language with pointers has reference semantics per se. (Of course, in modern C++ practice we should use smart pointers, specifically std::shared_ptr, but that does not stop us from illustrating the unity of C++ and C in “value semantics”.)

Question to consider: After understanding the difference between “value semantics” and “reference semantics,” if I ask you when to use struct and when to use class in Swift, will you still give those so-called “standard answers” on the Internet?

Second, we can look at it in terms of the C++ object model (that is, memory layout), because if a Swift class were to have a C++ type as its root type, then Swift objects and C++ objects would have to share at least the same memory layout.

We can find the type definition of HeapObject in include/swift/SwiftShims/HeapObject.h:

#define SWIFT_HEAPOBJECT_NON_OBJC_MEMBERS       \
  InlineRefCounts refCounts

/// The Swift heap-object header.
/// This must match RefCountedStructTy in IRGen.
struct HeapObject {
  /// This is always a valid pointer to a metadata object.
  HeapMetadata const *metadata;

  SWIFT_HEAPOBJECT_NON_OBJC_MEMBERS;

#ifdef __cplusplus
  HeapObject() = default;

  // Initialize a HeapObject header as appropriate for a newly-allocated object.
  constexpr HeapObject(HeapMetadata const *newMetadata) 
    : metadata(newMetadata)
    , refCounts(InlineRefCounts::Initialized)
  { }

#ifndef NDEBUG
  void dump() const LLVM_ATTRIBUTE_USED;
#endif

#endif // __cplusplus
};

This is a struct containing a data member metadata and the SWIFT_HEAPOBJECT_NON_OBJC_MEMBERS macro.

Expanding the SWIFT_HEAPOBJECT_NON_OBJC_MEMBERS macro gives us InlineRefCounts refCounts, and we can find the definition of InlineRefCounts in include/swift/SwiftShims/RefCount.h:

// This definition is a placeholder for importing into Swift.
// It provides size and alignment but cannot be manipulated safely there.
typedef struct {
  __swift_uintptr_t refCounts SWIFT_ATTRIBUTE_UNAVAILABLE;
} InlineRefCountsPlaceholder;

#if !defined(__cplusplus)

typedef InlineRefCountsPlaceholder InlineRefCounts;

#else

...

typedef RefCounts<InlineRefCountBits> InlineRefCounts;

...

#endif

We find that this type looks different in C and in C++:

  • You can’t do anything with this type in C
  • In C++ this type is RefCounts<InlineRefCountBits>

Finally, we can find the definitions of RefCounts and InlineRefCountBits in stdlib/public/SwiftShims/RefCount.h:

enum RefCountInlinedness { RefCountNotInline = false, RefCountIsInline = true };

...

typedef RefCountBitsT<RefCountIsInline> InlineRefCountBits;

...

template <typename RefCountBits>
class RefCounts {
  std::atomic<RefCountBits> refCounts;
  ...
};

We continue with the definition of RefCountBitsT in the same file:


// Basic encoding of refcount and flag data into the object's header.
template <RefCountInlinedness refcountIsInline>
class RefCountBitsT {
  ...
  BitsType bits;
  ...
};

So we know that HeapObject eventually looks like this:

struct HeapObject {
  HeapMetadata const *metadata;

  RefCounts<RefCountBitsT<true>> refCounts;

#ifdef __cplusplus
  HeapObject() = default;

  // Initialize a HeapObject header as appropriate for a newly-allocated object.
  constexpr HeapObject(HeapMetadata const *newMetadata) 
    : metadata(newMetadata)
    , refCounts(InlineRefCounts::Initialized)
  { }

#ifndef NDEBUG
  void dump() const LLVM_ATTRIBUTE_USED;
#endif

#endif // __cplusplus
};

In C++, structs (including classes) can take one of two memory layouts:

  • One is the same memory layout as a C struct.

    This layout ensures interoperability with the C API.

  • One is a memory layout with C++ runtime features.

    This layout does not guarantee interoperability with the C API, but it supports C++-specific runtime features: virtual functions, plus the constructors, destructors, copy constructors, and assignment operators that C++ value semantics depends on.

So if Swift classes all have a C++ type as their root type, then:

  • Either an instance of the Swift class will have the same layout as a C struct;
  • Either an instance of a Swift class will have the same memory layout as an instance of a C++ type with C++ features.

Obviously, a Swift class is inheritable, and a class not marked final (or whose member functions are not marked final) needs a technique similar to the C++ vtable to support overriding, so instances of such a Swift class cannot take the same layout as a C struct.
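A small sketch of why this is so (the Animal/Dog names are just for illustration): a call through the base type must pick the override at runtime, which requires a vtable-like lookup that a plain C struct layout has no room for:

```swift
// Non-final classes require dynamic dispatch: the static type below is
// Animal, yet Dog's override must run, so the call goes through a
// vtable-like mechanism rather than a direct, statically bound call.
class Animal {
    func sound() -> String { return "..." }
}

final class Dog: Animal {
    override func sound() -> String { return "woof" }
}

let pet: Animal = Dog()
// pet.sound() resolves to Dog's implementation at runtime
```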

Thus, if we want to prove or refute that Swift classes have a “hidden” C++ root type, namely HeapObject, we simply need to check whether HeapObject is itself a C++ type with C++ runtime features.

We can see that the source code for the HeapObject above has the following macro:

#ifdef __cplusplus

...

#endif // __cplusplus

This is the typical macro pattern for detecting whether a header is included from C or from C++ source: if it is included from C++ source, the guarded contents are compiled in; if from C, they are left out.

At the same time, the contents of this macro:

  HeapObject() = default;

  // Initialize a HeapObject header as appropriate for a newly-allocated object.
  constexpr HeapObject(HeapMetadata const *newMetadata) 
    : metadata(newMetadata)
    , refCounts(InlineRefCounts::Initialized)
  { }

#ifndef NDEBUG
  void dump() const LLVM_ATTRIBUTE_USED;
#endif

Their purpose is to define an explicitly defaulted default constructor, a constructor marked as a constant expression (constexpr), and a debug member function. Among these C++ “goodies”:

  • HeapObject(HeapMetadata const *newMetadata) is a constexpr function, which C++ evaluates at compile time, writing the result into the compilation product;

    The default constructor HeapObject() = default; is explicitly defaulted.

    Therefore, these constructors are trivial. “Trivial” is a concept borrowed from mathematics into C/C++, meaning an operation that can be accomplished by simple assignment. These constructors do not make HeapObject a C++-only type, because they need no C++ runtime features;

  • void dump() const is a non-virtual function.

    This is equivalent to adding a global function void HeapObjectDump(HeapObject * this), and it does not make HeapObject a C++-only type either, since non-virtual member functions also need no C++ runtime features.

From this point of view, HeapObject is compatible with C; that is, HeapObject takes the same layout as a C struct. Now we can give the memory layout of a HeapObject instance:

How can a C++ type that uses a C struct layout be the root type of all Swift classes?

So we know that “all Swift classes have a common root type called HeapObject” should not be true.

So why is swift_release’s formal parameter of type HeapObject *?

In fact, as we’ll see later, giving swift_release a formal parameter of type HeapObject * is a programming trick that gives C/C++ a way to access the memory region that the pointer points to.

Continue exploring swift_release

Good. Now let’s continue exploring swift_release. Since swift_release is a function in the swift namespace, we can try searching for void swift::swift_release( to find the implementation file of this function. In stdlib/public/runtime/HeapObject.cpp we can find the following:

...

void swift::swift_release(HeapObject *object) {
  _swift_release(object);
}

static void _swift_release_(HeapObject *object) {
  SWIFT_RT_TRACK_INVOCATION(object, swift_release);
  if (isValidPointerForNativeRetain(object))
    object->refCounts.decrementAndMaybeDeinit(1);
}

auto swift::_swift_release = _swift_release_;

...

We can see that swift_release finally jumps to _swift_release_, which, after reference-count tracking (SWIFT_RT_TRACK_INVOCATION) and a native-Swift-object check (isValidPointerForNativeRetain), calls the function decrementAndMaybeDeinit.

Searching for decrementAndMaybeDeinit leads us to its header file, stdlib/public/SwiftShims/RefCount.h, where we can find the following:

enum PerformDeinit { DontPerformDeinit = false, DoPerformDeinit = true };

...

template <typename RefCountBits>
class RefCounts {
  std::atomic<RefCountBits> refCounts;

  ...

  LLVM_ATTRIBUTE_ALWAYS_INLINE
  void decrementAndMaybeDeinit(uint32_t dec) {
    doDecrement<DoPerformDeinit>(dec);
  }

  ...
};

We see that decrementAndMaybeDeinit calls the doDecrement template function, passing DoPerformDeinit (i.e. true) as the template argument. We can find this template function in the same file:

template <typename RefCountBits>
class RefCounts {
  ...

  template <PerformDeinit performDeinit>
  bool doDecrement(uint32_t dec) {
    auto oldbits = refCounts.load(SWIFT_MEMORY_ORDER_CONSUME);
    RefCountBits newbits;

    do {
      newbits = oldbits;
      bool fast =
        newbits.decrementStrongExtraRefCount(dec);
      if (!fast)
        // Slow paths include side table; deinit; underflow
        return doDecrementSlow<performDeinit>(oldbits, dec);
    } while (!refCounts.compare_exchange_weak(oldbits, newbits,
                                              std::memory_order_release,
                                              std::memory_order_relaxed));

    return false;  // don't deinit
  }

  ...
};

This function uses the lock-free programming technique we all love to talk about, CAS (compare-and-swap), but that is not our concern here. What we care about is the `if (!fast)` branch inside the do-while loop body. Overall, this function decrements the object's reference count, and calls the template function doDecrementSlow whenever the fast path does not apply (the slow paths include side tables, triggering deinit, and reference-count underflow).
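The fast path above can be mimicked in a few lines of standalone C++. This is a simplified sketch of the CAS loop only (no side table, no RefCountBits, no deinit) — not the runtime's actual logic:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Simplified sketch of doDecrement's CAS loop: returns true when the count
// reaches (or would drop below) zero, i.e. when "deinit" should begin.
bool decrementStrongRefCount(std::atomic<std::uint32_t> &refCount,
                             std::uint32_t dec) {
  std::uint32_t oldbits = refCount.load(std::memory_order_acquire);
  std::uint32_t newbits;
  do {
    if (oldbits < dec)
      return true;              // underflow: would drop below zero
    newbits = oldbits - dec;    // speculative decrement on a local copy
    // compare_exchange_weak publishes newbits only if refCount still
    // holds oldbits; on failure it reloads oldbits and we retry.
  } while (!refCount.compare_exchange_weak(oldbits, newbits,
                                           std::memory_order_release,
                                           std::memory_order_relaxed));
  return newbits == 0;          // reached zero: last reference released
}

// Demo helper (illustrative): release n references held on a fresh count of n,
// returning how many releases reported "deinit now". Exactly one should.
int demoReleases(std::uint32_t n) {
  std::atomic<std::uint32_t> rc(n);
  int deinits = 0;
  for (std::uint32_t i = 0; i < n; ++i)
    if (decrementStrongRefCount(rc, 1))
      ++deinits;
  return deinits;
}
```

The shape is the same as the runtime's: mutate a local copy, then attempt to publish it atomically, retrying if another thread won the race.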

We saw a concept called underflow in the source code above. From my study of the Swift source and documentation, underflow carries at least three meanings in Swift, each of which can be regarded as the opposite of overflow in its context:

  1. Unsigned integer underflow: you declare an unsigned integer holding 90 and subtract 91 from it;
  2. Buffer underrun (toward the lower address space): you declare a buffer of 10 elements and try to access the element at index -1;
  3. Reference-count underflow: the reference count becomes less than zero;
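The first meaning is easy to demonstrate in two lines of C++ (unsigned arithmetic in C/C++ wraps modulo 2^N; Swift, by contrast, traps on such a subtraction rather than wrapping):

```cpp
#include <cassert>
#include <cstdint>

// In C++, unsigned arithmetic wraps instead of trapping.
std::uint32_t unsignedUnderflowDemo() {
  std::uint32_t x = 90;
  x -= 91;          // underflows: wraps around to 2^32 - 1
  return x;
}
```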

We can then find the doDecrementSlow template function in the same file:

template <typename RefCountBits>
class RefCounts {
  ...

  template <PerformDeinit performDeinit>
  bool doDecrementSlow(RefCountBits oldbits, uint32_t dec) {
    RefCountBits newbits;

    bool deinitNow;
    do {
      newbits = oldbits;

      bool fast =
        newbits.decrementStrongExtraRefCount(dec);
      if (fast) {
        // Decrement completed normally. New refcount is not zero.
        deinitNow = false;
      }
      else if (oldbits.hasSideTable()) {
        // Decrement failed because we're on some other slow path.
        return doDecrementSideTable<performDeinit>(oldbits, dec);
      }
      else {
        // Decrement underflowed. Begin deinit.
        // LIVE -> DEINITING
        deinitNow = true;
        assert(!oldbits.getIsDeiniting());  // FIXME: make this an error?
        newbits = oldbits;  // Undo failed decrement of newbits.
        newbits.setStrongExtraRefCount(0);
        newbits.setIsDeiniting(true);
      }
    } while (!refCounts.compare_exchange_weak(oldbits, newbits,
                                              std::memory_order_release,
                                              std::memory_order_relaxed));
    if (performDeinit && deinitNow) {
      std::atomic_thread_fence(std::memory_order_acquire);
      _swift_release_dealloc(getHeapObject());
    }

    return deinitNow;
  }

  ...
};

After confirming that deinit should happen, this function executes _swift_release_dealloc(getHeapObject()) to free the memory allocated for the object on the heap — the very _swift_release_dealloc we already saw in the Debug Navigator's call stack.

But what does _swift_release_dealloc do?

Unveiling _swift_release_dealloc

We continue by searching for `void _swift_release_dealloc(` to find its implementation file, include/swift/Runtime/HeapObject.h, whose content is:

void _swift_release_dealloc(HeapObject *object) {
  asFullMetadata(object->metadata)->destroy(object);
}

As you can see, this function passes the metadata member of the instance pointed to by the HeapObject * to the asFullMetadata function, and then calls a member of the return value named destroy that looks like a "function".

Now let’s look at the template function asFullMetadata. We can search for the source code, in include/Swift/ABI/Metadata. H find related content:

...

/// Given a canonical metadata pointer, produce the adjusted metadata pointer.
template <class T>
static inline FullMetadata<T> *asFullMetadata(T *metadata) {
  return (FullMetadata<T>*) (((typename T::HeaderType*) metadata) - 1);
}

...

As you can see, asFullMetadata is actually a template function. In this template function, the actual action is:

  1. Cast metadata to type T::HeaderType *;
  2. Then shift it back (toward the lower address space) by the length of one T::HeaderType;
  3. Finally cast the result to type FullMetadata<T> * and return it.
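These three steps are the classic "recover the enclosing struct from a pointer to its trailing part" trick. Here is a small self-contained C++ sketch of the same pointer arithmetic (the type names are stand-ins, not Swift's; the layout assumption — the trailing subobject sits immediately after the header — matches common ABIs but is stated, not guaranteed, by this sketch):

```cpp
#include <cassert>

// Stand-ins for T::HeaderType and T in FullMetadata<T> = { HeaderType, T }.
struct HeaderSketch { void *destroy; void *valueWitnesses; };
struct MetadataSketch { unsigned long kind; };
struct FullSketch : HeaderSketch, MetadataSketch {};

// Sketch of asFullMetadata: given a pointer to the Metadata subobject,
// step back one HeaderSketch to reach the start of the full structure.
FullSketch *asFullSketch(MetadataSketch *metadata) {
  return (FullSketch *)((HeaderSketch *)(void *)metadata - 1);
}

// Round trip: take the Metadata subobject of a FullSketch, recover the whole.
bool roundTripWorks() {
  FullSketch full{};
  MetadataSketch *metadata = &full;   // points sizeof(HeaderSketch) past the start
  return (void *)asFullSketch(metadata) == (void *)&full;
}
```

In other words, asFullMetadata never dereferences anything — it only re-interprets and shifts an address.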

To understand what asFullMetadata(object->metadata)->destroy(object); does, we must answer:

  1. What is the HeaderType of metadata's type?
  2. What is the FullMetadata<T> * type that metadata is cast to after being shifted?
  3. What is FullMetadata<T>'s member destroy?
  4. What is stored at the location one T::HeaderType length below metadata in the address space?

1. What is the HeaderType of the metadata type?

We already know from HeapObject's header file that the type of metadata is HeapMetadata. In include/swift/Runtime/HeapObject.h we can find:

struct InProcess;

template <typename Target> struct TargetHeapMetadata;
using HeapMetadata = TargetHeapMetadata<InProcess>;

So we can see that the real body of HeapMetadata is TargetHeapMetadata<InProcess>.

We now verify the contents of TargetHeapMetadata. Searching for struct TargetHeapMetadata, we find the following in include/swift/ABI/Metadata.h:

template <typename Runtime>
struct TargetHeapMetadata : TargetMetadata<Runtime> {
  using HeaderType = TargetHeapMetadataHeader<Runtime>;

  TargetHeapMetadata() = default;
  constexpr TargetHeapMetadata(MetadataKind kind)
    : TargetMetadata<Runtime>(kind) {}
#if SWIFT_OBJC_INTEROP
  constexpr TargetHeapMetadata(TargetAnyClassMetadata<Runtime> *isa)
    : TargetMetadata<Runtime>(isa) {}
#endif
};

So we know that metadata's HeaderType is TargetHeapMetadataHeader<InProcess>, whose definition can also be found in include/swift/ABI/Metadata.h:

template <typename Runtime>
struct TargetHeapMetadataHeader
    : TargetHeapMetadataHeaderPrefix<Runtime>,
      TargetTypeMetadataHeader<Runtime> {
  constexpr TargetHeapMetadataHeader(
      const TargetHeapMetadataHeaderPrefix<Runtime> &heapPrefix,
      const TargetTypeMetadataHeader<Runtime> &typePrefix)
    : TargetHeapMetadataHeaderPrefix<Runtime>(heapPrefix),
      TargetTypeMetadataHeader<Runtime>(typePrefix) {}
};

2. What is the FullMetadata<T> * type that metadata is cast to after being shifted?

We can also find the following in include/swift/ABI/Metadata.h:

template <class T> struct FullMetadata : T::HeaderType, T {
  typedef typename T::HeaderType HeaderType;

  FullMetadata() = default;
  constexpr FullMetadata(const HeaderType &header, const T &metadata)
    : HeaderType(header), T(metadata) {}
};

Here, metadata is cast to FullMetadata<HeapMetadata> *.

By the definition of FullMetadata, FullMetadata<T> is specialized to FullMetadata<HeapMetadata>, which inherits both TargetHeapMetadataHeader<InProcess> (via T::HeaderType) and TargetHeapMetadata<InProcess>.



And according to our previous research:

  • FullMetadata's first inheritance target, TargetHeapMetadataHeader, has no data members of its own. To find its data members we only need to expand its own inheritance targets, TargetHeapMetadataHeaderPrefix and TargetTypeMetadataHeader; then we know part of FullMetadata's data members.

    We can find the definitions of these two types in include/swift/ABI/Metadata.h:

    template <typename Runtime>
    struct TargetTypeMetadataHeader {
      /// A pointer to the value-witnesses for this type. This is only
      /// present for type metadata.
      TargetPointer<Runtime, const ValueWitnessTable> ValueWitnesses;
    };

    ...

    template <typename Runtime>
    struct TargetHeapMetadataHeaderPrefix {
      /// Destroy the object, returning the allocated size of the object
      /// or 0 if the object shouldn't be deallocated.
      TargetPointer<Runtime, HeapObjectDestroyer> destroy;
    };

    The first data member of FullMetadata<HeapMetadata> is:

    TargetPointer<Runtime, HeapObjectDestroyer> destroy;

    The second data member is:

    TargetPointer<Runtime, const ValueWitnessTable> ValueWitnesses;
  • FullMetadata's second inheritance target, TargetHeapMetadata, has no data members of its own either. To find them we only need to expand its inheritance target, TargetMetadata; then we know the remaining data members of FullMetadata.

    We can find the definition of this type in include/swift/ABI/Metadata.h:

    template <typename Runtime>
    struct TargetMetadata {
      using StoredPointer = typename Runtime::StoredPointer;

      ...

    private:
      /// The kind. Only valid for non-class metadata; getKind() must be used to get
      /// the kind value.
      StoredPointer Kind;

      ...
    };

    The third data member of FullMetadata<HeapMetadata> is:

    StoredPointer Kind;

So what are TargetPointer and StoredPointer?

In include/swift/ABI/Metadata.h, we can find the definition of TargetPointer:

template <typename Runtime, typename T>
using TargetPointer = typename Runtime::template Pointer<T>;

We see that TargetPointer ultimately uses the Pointer type inside the template argument Runtime. So the task "find the definition of TargetPointer" is now converted into finding the definition of the type Pointer inside the specialization target of the template parameter Runtime.

The definitions of both StoredPointer and Pointer can be found in InProcess, the specialization target of the template parameter Runtime above:

struct InProcess {
  using StoredPointer = uintptr_t;
  ...
  template <typename T>
  using Pointer = T*;
  ...
};

After we substitute in the Pointer and StoredPointer definitions above, we get the following structure:

struct FullMetadata<HeapMetadata> {
  HeapObjectDestroyer * destroy;
  const ValueWitnessTable * ValueWitnesses;
  uintptr_t Kind;
};
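We can sanity-check this flattened picture with a small C++ mock (field names mirror the structure above, but this is a mock, not the runtime's types): three pointer-sized fields at offsets 0, 8, and 16 on a 64-bit platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mock of the flattened FullMetadata<HeapMetadata> described in the text.
struct HeapObjectStub;
typedef void (*HeapObjectDestroyerStub)(HeapObjectStub *);

struct FullMetadataMock {
  HeapObjectDestroyerStub destroy;   // offset 0: one function pointer
  const void *valueWitnesses;        // offset 8: one pointer
  std::uintptr_t kind;               // offset 16: one pointer-sized word
};

std::size_t destroyOffset()   { return offsetof(FullMetadataMock, destroy); }
std::size_t witnessesOffset() { return offsetof(FullMetadataMock, valueWitnesses); }
std::size_t kindOffset()      { return offsetof(FullMetadataMock, kind); }
```

This makes the earlier shift concrete: a pointer at the kind field minus two pointer lengths lands on destroy.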

Finally, we can find the definition of HeapObjectDestroyer in include/swift/ABI/Metadata.h:

using HeapObjectDestroyer =
  SWIFT_CC(swift) void(SWIFT_CONTEXT HeapObject *);

We can see that destroy is a pointer member of type SWIFT_CC(swift) void (*)(SWIFT_CONTEXT HeapObject *). SWIFT_CC and SWIFT_CONTEXT are both macros, defined in include/swift/Runtime/Config.h, that give the compiler identifiers for Swift's own calling convention. So this function pointer ultimately points to a Swift function.

Substituting the definition of HeapObjectDestroyer into FullMetadata<HeapMetadata>:

3. What is the FullMetadata member destroy?

As shown above, destroy is a pointer member referring to a void (HeapObject *) function that uses the Swift calling convention.

4. What is stored at the location one T::HeaderType length below metadata in the address space?

We can see that metadata is first cast to T::HeaderType *, which after specialization is HeapMetadata::HeaderType *. Since HeapMetadata::HeaderType is two pointers long, moving metadata one HeapMetadata::HeaderType toward the lower address space lands exactly on the destroy member. Finally, we cast the pointer to FullMetadata<HeapMetadata> *, so we view the memory behind the pointer as a FullMetadata<HeapMetadata>.

Here I can draw a picture to illustrate it visually:

This also proves that swift_release taking a formal parameter of type HeapObject * is just a programming trick — it provides a way to access the memory behind the pointer from C/C++.

But where does the metadata in HeapObject come from? Why does the memory two pointer lengths below the address metadata points to hold the class's destroy function? Logically, a better way to answer these questions is to examine the ListNode instance allocation process rather than the release process.

Tracking the ListNode instance memory-allocation process

Again, we see part of the SIL we generated earlier:

// ListNode.init(_:)
sil hidden @$s9CallStack8ListNodeCyACSicfc : $@convention(method) (Int, @owned ListNode) -> @owned ListNode {
  ...
} // end sil function '$s9CallStack8ListNodeCyACSicfc'

Yes, the first clue to look for is in the SIL function corresponding to init(_:). Unfortunately, there is nothing about metadata there. But when we search for the SIL name of the init(_:) function, $s9CallStack8ListNodeCyACSicfc, we find a SIL function named $s9CallStack8ListNodeCyACSicfC (corresponding Swift function name: ListNode.__allocating_init(_:)) that calls the init(_:) function.

// ListNode.__allocating_init(_:)
sil hidden @$s9CallStack8ListNodeCyACSicfC : $@convention(method) (Int, @thick ListNode.Type) -> @owned ListNode {
// %0                                             // user: %4
bb0(%0 : $Int, %1 : $@thick ListNode.Type):
  %2 = alloc_ref $ListNode                        // user: %4
  // function_ref ListNode.init(_:)
  %3 = function_ref @$s9CallStack8ListNodeCyACSicfc : $@convention(method) (Int, @owned ListNode) -> @owned ListNode // user: %4
  %4 = apply %3(%0, %2) : $@convention(method) (Int, @owned ListNode) -> @owned ListNode // user: %5
  return %4 : $ListNode                           // id: %5
} // end sil function '$s9CallStack8ListNodeCyACSicfC'

__allocating_init(_:) uses the alloc_ref SIL instruction to allocate memory for the ListNode instance. But the code here still offers no clue about metadata.

Investigate __allocating_init: LLVM IR perspective

At this point, let’s take the horizon down a notch: let’s look directly at LLVM IR and see what’s in it. Swiftc-ema-ir callstack.swift > callstack.swift. Ll to generate LLVM IR for callstack.swift. We can be found by searching the $s9CallStack8ListNodeCyACSicfC in line 232 ListNode. __allocating_init (_) function in the LLVM IR corresponding functions:

define hidden swiftcc %T9CallStack8ListNodeC* @"$s9CallStack8ListNodeCyACSicfC"(i64, %swift.type* swiftself) #0 {
entry:
  %2 = call swiftcc %swift.metadata_response @"$s9CallStack8ListNodeCMa"(i64 0) #7
  %3 = extractvalue %swift.metadata_response %2, 0
  %4 = call noalias %swift.refcounted* @swift_allocObject(%swift.type* %3, i64 32, i64 7) #2
  %5 = bitcast %swift.refcounted* %4 to %T9CallStack8ListNodeC*
  %6 = call swiftcc %T9CallStack8ListNodeC* @"$s9CallStack8ListNodeCyACSicfc"(i64 %0, %T9CallStack8ListNodeC* swiftself %5)
  ret %T9CallStack8ListNodeC* %6
}

We can see the word metadata clearly in the LLVM IR function. So let’s analyze the LLVM IR function:

Here are some basics of LLVM IR:

  • LLVM IR is a compiler intermediate representation with a type system;
  • identifiers beginning with % are local variables;
  • identifiers beginning with @ are global variables.

First, the first statement:

%2 = call swiftcc %swift.metadata_response @"$s9CallStack8ListNodeCMa"(i64 0)

This calls the LLVM function $s9CallStack8ListNodeCMa using the Swift calling convention, passing 0 as an i64, and assigns the return value, of type %swift.metadata_response, to %2.

So what is $s9CallStack8ListNodeCMa? In fact, if you are careful during profiling, you may find records like Type Metadata Accessor in the call stack that allocates Swift object instances (sorry for cheating). In later explorations, it will be easy to know that $s9CallStack8ListNodeCMa is the Type Metadata accessor.

We can find the definition of %swift.metadata_response in the header of our exported LLVM IR:

%swift.type = type { i64 }
%swift.metadata_response = type { %swift.type*, i64 }

Here, type is LLVM IR's keyword for defining a custom struct type. %swift.type is a struct with a single 64-bit integer member — this is the Objective-C metaclass for a Swift type. %swift.type* is a pointer to %swift.type. So rewriting %swift.metadata_response as C code actually gives:

struct SwiftType {
    int64_t metaclass;
};

struct MetadataResponse {
    struct SwiftType * swiftType;
    int64_t foo;
};

Then the second statement:

%3 = extractvalue %swift.metadata_response %2, 0

Reading the documentation for the LLVM IR extractvalue instruction, it is not hard to see that this statement views the local variable %2 as a struct of type %swift.metadata_response and extracts element 0. So we actually get a value of type %swift.type* and assign it to %3.

Then the third statement:

%4 = call noalias %swift.refcounted* @swift_allocObject(%swift.type* %3, i64 32, i64 7) #2

This calls @swift_allocObject, passing the contents of local variable %3 as the first argument of type %swift.type*, 32 as the second argument of type i64, and 7 as the third argument of type i64 (the required instance size in bytes and the alignment mask, respectively). Finally, the return value is assigned to local variable %4 as type %swift.refcounted*.

You may not know what noalias means, and that's fine. noalias on the return value means that, during the execution of the function, the memory behind @swift_allocObject's return value is not also accessed through the function's arguments — the returned pointer does not alias them.

Then the fourth statement:

%5 = bitcast %swift.refcounted* %4 to %T9CallStack8ListNodeC*

This casts the contents of local variable %4 from type %swift.refcounted* to type %T9CallStack8ListNodeC* (in Swift terms: a reference to the ListNode class in the CallStack module), and assigns the result to local variable %5.

Then the fifth statement:

%6 = call swiftcc %T9CallStack8ListNodeC* @"$s9CallStack8ListNodeCyACSicfc"(i64 %0, %T9CallStack8ListNodeC* swiftself %5)

This statement calls the function @"$s9CallStack8ListNodeCyACSicfc" (i.e. ListNode.init(_:)) using the Swift calling convention, passing %0 as the first argument of type i64 and the contents of local variable %5 as the second argument of type %T9CallStack8ListNodeC* (a reference to ListNode), marked as self (swiftself). The return value is then assigned to local variable %6 as type %T9CallStack8ListNodeC* (a reference to ListNode).

Finally, the sixth statement:

ret %T9CallStack8ListNodeC* %6

Return the contents of the local variable %6 as type %T9CallStack8ListNodeC* (a reference to ListNode).

To sum up, we can rewrite the above LLVM IR into the following C-like language:

typealias %swift.type * SwiftTypeRef;
typealias %swift.refcounted * SwiftRefCountedRef;
typealias %swift.metadata_response MetadataResponse;
#define ListNodeTypeMetadataAccessor $s9CallStack8ListNodeCMa

ListNode * ListNode__allocating_init(int64_t arg1, SwiftTypeRef self) {
    MetadataResponse %2 = ListNodeTypeMetadataAccessor(0);
    SwiftTypeRef %3 = %2.swiftType;
    SwiftRefCountedRef %4 = swift_allocObject(%3, 32, 7);
    ListNode * %5 = (ListNode *) %4;
    ListNode * %6 = ListNodeInit(arg1, %5);
    return %6;
}

So we can see that ListNode.__allocating_init(_:) does access ListNode's metadata — by calling the type metadata accessor ($s9CallStack8ListNodeCMa) — when it allocates memory for a ListNode instance.

Investigate Type Metadata Accessor: LLVM IR perspective

Now let's look at $s9CallStack8ListNodeCMa to see how Swift retrieves a class's metadata at runtime. Searching for $s9CallStack8ListNodeCMa, we can find the definition of this function at line 242:

; Function Attrs: nounwind readnone
define hidden swiftcc %swift.metadata_response @"$s9CallStack8ListNodeCMa"(i64) #4 {
entry:
  %1 = load %swift.type*, %swift.type** @"$s9CallStack8ListNodeCML", align 8
  %2 = icmp eq %swift.type* %1, null
  br i1 %2, label %cacheIsNull, label %cont

cacheIsNull:                                      ; preds = %entry
  %3 = call %objc_class* @swift_getInitializedObjCClass(%objc_class* bitcast (i64* getelementptr inbounds (<{ void (%T9CallStack8ListNodeC*)*, i8**, i64, %objc_class*, %swift.opaque*, %swift.opaque*, i64, i32, i32, i32, i16, i16, i32, i32, <{ i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor }>*, i8*, i64, i64, i64 (%T9CallStack8ListNodeC*)*, void (i64, %T9CallStack8ListNodeC*)*, { i8*, %TSi* } (i8*, %T9CallStack8ListNodeC*)*, i64 (%T9CallStack8ListNodeC*)*, void (i64, %T9CallStack8ListNodeC*)*, { i8*, %T9CallStack8ListNodeCSg* } (i8*, %T9CallStack8ListNodeC*)*, %T9CallStack8ListNodeC* (i64, %swift.type*)* }>, <{ void (%T9CallStack8ListNodeC*)*, i8**, i64, %objc_class*, %swift.opaque*, %swift.opaque*, i64, i32, i32, i32, i16, i16, i32, i32, <{ i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor }>*, i8*, i64, i64, i64 (%T9CallStack8ListNodeC*)*, void (i64, %T9CallStack8ListNodeC*)*, { i8*, %TSi* } (i8*, %T9CallStack8ListNodeC*)*, i64 (%T9CallStack8ListNodeC*)*, void (i64, %T9CallStack8ListNodeC*)*, { i8*, %T9CallStack8ListNodeCSg* } (i8*, %T9CallStack8ListNodeC*)*, %T9CallStack8ListNodeC* (i64, %swift.type*)* }>* @"$s9CallStack8ListNodeCMf", i32 0, i32 2) to %objc_class*))
  %4 = bitcast %objc_class* %3 to %swift.type*
  store atomic %swift.type* %4, %swift.type** @"$s9CallStack8ListNodeCML" release, align 8
  br label %cont

cont:                                             ; preds = %cacheIsNull, %entry
  %5 = phi %swift.type* [ %1, %entry ], [ %4, %cacheIsNull ]
  %6 = insertvalue %swift.metadata_response undef, %swift.type* %5, 0
  %7 = insertvalue %swift.metadata_response %6, i64 0, 1
  ret %swift.metadata_response %7
}

Rewriting this function requires undoing the phi nodes of SSA form, which is tedious to walk through here. If you are interested, you can obtain the free, legal SSA Book to learn about SSA form. I'll directly show what this looks like rewritten as C-like pseudocode:

typealias %swift.type * SwiftTypeRef;
typealias %swift.metadata_response MetadataResponse;
#define ListNodeTypeMetadataAccessor $s9CallStack8ListNodeCMa
#define listNodeMetadataRecord $s9CallStack8ListNodeCMf
static SwiftTypeRef * ListNodeSwiftType = $s9CallStack8ListNodeCML;

MetadataResponse ListNodeTypeMetadataAccessor(i64 arg1) {
    SwiftTypeRef swiftType;

    if (*ListNodeSwiftType == NULL) {
        swiftType = (SwiftTypeRef)swift_getInitializedObjCClass(&(listNodeMetadataRecord -> objc_class));
        *ListNodeSwiftType = swiftType;
    } else {
        swiftType = *ListNodeSwiftType;
    }

    MetadataResponse metadataResponse = {
        swiftType,
        0
    };
    return metadataResponse;
}

We can see:

  • First this function checks the global variable ListNodeSwiftType (i.e. the content of the symbol $s9CallStack8ListNodeCML):
    • If it is 0, it passes listNodeMetadataRecord -> objc_class to swift_getInitializedObjCClass to obtain ListNode's swiftType;
    • If it is not 0, it uses ListNodeSwiftType directly as ListNode's swiftType;
  • It then wraps ListNode's swiftType into a metadataResponse and returns it.
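This check-the-cache-then-initialize shape is a common lazy-initialization pattern. Here is a tiny C++ sketch of the same control flow (single-threaded for clarity — the real accessor uses an atomic store; all names here are made up for illustration):

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the class's in-binary type metadata record
// (the analogue of $s9CallStack8ListNodeCMf).
struct TypeMetadataStub { std::uintptr_t kind; };

static TypeMetadataStub gMetadataRecord{7};
static TypeMetadataStub *gCachedMetadata = nullptr;  // analogue of the CML cache slot
static int gInitializeCalls = 0;

// Stand-in for swift_getInitializedObjCClass: "initialize" and return the record.
static TypeMetadataStub *initializeRecord(TypeMetadataStub *record) {
  ++gInitializeCalls;
  return record;
}

// Sketch of the accessor: initialize on first call, serve from the cache after.
TypeMetadataStub *metadataAccessor() {
  TypeMetadataStub *type = gCachedMetadata;
  if (type == nullptr) {
    type = initializeRecord(&gMetadataRecord);
    gCachedMetadata = type;
  }
  return type;
}
```

Calling the accessor repeatedly initializes the record only once; every later call is a single load and compare.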

To determine which conditional branch we will execute in the above code, we must first know the contents of ListNodeSwiftType, $s9CallStack8ListNodeCML. In line 37 of our exported LLVM IR file we can find the following:

@"$s9CallStack8ListNodeCML" = internal global %swift.type* null, align 8

So we know that $s9CallStack8ListNodeCML (our ListNodeSwiftType) starts out null (0) in the code we compiled. We can also verify this by searching for $s9CallStack8ListNodeCML in Hopper.

You can see that $s9CallStack8ListNodeCML is indeed 0. That is, during the run of our code, ListNode’s Type Metadata Accessor calls swift_getInitializedObjCClass to help generate ListNode type metadata records.

In the pseudocode above, swift_getInitializedObjCClass(&(listNodeMetadataRecord -> objc_class)) corresponds to a stretch of LLVM IR that is generally considered a point of confusion for most LLVM learners. I'll pull that LLVM IR out here for a separate look:

%3 = call %objc_class* @swift_getInitializedObjCClass(%objc_class* bitcast (i64* getelementptr inbounds (<{ void (%T9CallStack8ListNodeC*)*, i8**, i64, %objc_class*, %swift.opaque*, %swift.opaque*, i64, i32, i32, i32, i16, i16, i32, i32, <{ i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor }>*, i8*, i64, i64, i64 (%T9CallStack8ListNodeC*)*, void (i64, %T9CallStack8ListNodeC*)*, { i8*, %TSi* } (i8*, %T9CallStack8ListNodeC*)*, i64 (%T9CallStack8ListNodeC*)*, void (i64, %T9CallStack8ListNodeC*)*, { i8*, %T9CallStack8ListNodeCSg* } (i8*, %T9CallStack8ListNodeC*)*, %T9CallStack8ListNodeC* (i64, %swift.type*)* }>, <{ void (%T9CallStack8ListNodeC*)*, i8**, i64, %objc_class*, %swift.opaque*, %swift.opaque*, i64, i32, i32, i32, i16, i16, i32, i32, <{ i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor, %swift.method_descriptor }>*, i8*, i64, i64, i64 (%T9CallStack8ListNodeC*)*, void (i64, %T9CallStack8ListNodeC*)*, { i8*, %TSi* } (i8*, %T9CallStack8ListNodeC*)*, i64 (%T9CallStack8ListNodeC*)*, void (i64, %T9CallStack8ListNodeC*)*, { i8*, %T9CallStack8ListNodeCSg* } (i8*, %T9CallStack8ListNodeC*)*, %T9CallStack8ListNodeC* (i64, %swift.type*)* }>* @"$s9CallStack8ListNodeCMf", i32 0, i32 2) to %objc_class*))

As you can see, this line is really long. We can tidy it up by extracting the struct type written inline in the instruction, naming it %SwiftTypeMetadata, and splitting the IR into multiple statements:

%0 = getelementptr inbounds %SwiftTypeMetadata, %SwiftTypeMetadata* @"$s9CallStack8ListNodeCMf", i32 0, i32 2
%1 = bitcast i64* %0 to %objc_class*
%3 = call %objc_class* @swift_getInitializedObjCClass(%objc_class* %1)

Now we can explain it very clearly:

  • The getelementptr instruction computes the address of an element relative to a base address. Many LLVM learners read it as "fetch the element at an offset from a base address", which is incorrect — it performs no memory access, only address arithmetic.

    In this case, the instruction starts from the base address $s9CallStack8ListNodeCMf, first views that address as type %SwiftTypeMetadata * and moves to element 0 (the first i32 0), then views the current address as type %SwiftTypeMetadata and moves to element 2, counting from 0 (the i32 2), and finally assigns the address after these two moves to %0.

    So here we actually get a pointer to field 2 (counting from 0) of $s9CallStack8ListNodeCMf;

  • Then we use bitcast to view %0 as %objc_class* and assign it to %1;

  • Then we call @swift_getInitializedObjCClass, passing in %1 as the %objc_class* argument.
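If getelementptr still feels abstract, note that it corresponds to C/C++'s `&base->field`: pure address computation, no load. A quick sketch (the struct and field names are invented for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct RecordSketch {
  std::uintptr_t field0;
  std::uintptr_t field1;
  std::uintptr_t field2;   // the analogue of "i32 2": field 2, counting from 0
};

static RecordSketch gRecord{1, 2, 3};

// Analogue of: getelementptr inbounds %RecordSketch, %RecordSketch* %base, i32 0, i32 2
// It only computes an address; *no* memory is read.
std::uintptr_t *addressOfField2(RecordSketch *base) {
  return &base->field2;
}

// The computed address sits exactly two words past the base.
std::ptrdiff_t demoOffset() {
  return (char *)addressOfField2(&gRecord) - (char *)&gRecord;
}
```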

So what does %SwiftTypeMetadata look like? Since we looked at the $s9CallStack8ListNodeCMf symbol with %SwiftTypeMetadata in the LLVM IR above, let’s take a look at what $s9CallStack8ListNodeCMf is. You can see the contents of $s9CallStack8ListNodeCMf in line 38 of our generated LLVM IR. For your convenience, I have drawn the contents of this structure here:

We also know that we’re acting the pointer to the second field from 0 in the field shown above as %objc_class* (a pointer to objective-C class), $s9CallStack8ListNodeCMf $s9CallStack8ListNodeCMf $s9CallStack8ListNodeCMf $s9CallStack8ListNodeCMf

As you can see, there is indeed an Objective-C class structure implicit in this structure.

In fact, if you have a good memory, you'll also notice that the first member of this structure is our ListNode.__deallocating_deinit function — the destroy function. From our earlier study of _swift_release_dealloc in the Swift runtime, it is not hard to intuit that HeapObject's metadata pointer points at field 2 (counting from 0) of this structure.

We don’t know for sure yet, because the code hasn’t been studied yet, and we don’t know what’s going to happen.

Next we must know what the swift_getInitializedObjCClass function does.

Searching our exported LLVM IR file, we can see that swift_getInitializedObjCClass has only a declare and no define. This means the function is resolved when the object file is linked — it is a function that lives in the Swift runtime library.

Investigating swift_getInitializedObjCClass

So we can search for swift_getInitializedObjCClass in the Swift source, and in stdlib/public/runtime/SwiftObject.mm find the following:

Class swift::swift_getInitializedObjCClass(Class c) {
  // Used when we have class metadata and we want to ensure a class has been
  // initialized by the Objective-C runtime. We need to do this because the
  // class "c" might be valid metadata, but it hasn't been initialized yet.
  return [c class];
}

Yes, this function simply sends the class message to the passed parameter c which, per Objective-C runtime lore, ensures that c has been initialized by the Objective-C runtime (i.e. +initialize has been called). The function then returns [c class] — c itself. So this function does what its name says: it just ensures the Objective-C class is initialized.

Type Metadata Accessor

We can see that swift_getInitializedObjCClass doesn’t really work here, so our intuition was right: The HeapObject metadata pointer points to the second element from 0 on the structure starting at $s9CallStack8ListNodeCMf.

We can draw a picture to illustrate:

How type metadata is generated

How is type metadata generated? In the file lib/IRGen/GenMeta.cpp we can find the C++ class SingletonClassMetadataBuilder. This class generates the LLVM IR of ListNode's type metadata using the LLVM API.

class SingletonClassMetadataBuilder :
    public ClassMetadataBuilderBase<SingletonClassMetadataBuilder> {
    
    ...
}

You’ll find other class metadata Builders in this file. In fact, depending on whether the class uses the @_fixed_layout property, whether it is defined outside the module, and whether it is generic, Swift has different types of metadata generation strategies. I haven’t had time to go through them all yet.

We can see that it inherits from ClassMetadataBuilderBase<SingletonClassMetadataBuilder>, and we can find the definition of ClassMetadataBuilderBase in the same file:

template<class Impl>
class ClassMetadataBuilderBase : public ClassMetadataVisitor<Impl> {
  ...
}

ClassMetadataBuilderBase<Impl> inherits from ClassMetadataVisitor<Impl>, and ClassMetadataVisitor is a very classic application of the visitor pattern.

Let's look at ClassMetadataVisitor's layout function, which is where the type metadata layout is produced.

template <class Impl> class ClassMetadataVisitor
    : public NominalMetadataVisitor<Impl>,
      public SILVTableVisitor<Impl> {
public:
  void layout(a) {
    // HeapMetadata header.
    asImpl().addDestructorFunction();

    // Metadata header.
    super::layout();

    asImpl().addSuperclass();
    asImpl().addClassCacheData();
    asImpl().addClassDataPointer();

    asImpl().addClassFlags();
    asImpl().addInstanceAddressPoint();
    asImpl().addInstanceSize();
    asImpl().addInstanceAlignMask();
    asImpl().addRuntimeReservedBits();
    asImpl().addClassSize();
    asImpl().addClassAddressPoint();
    asImpl().addNominalTypeDescriptor();
    asImpl().addIVarDestroyer();

    // Class members.
    asImpl().addClassMembers(Target);
  }
};

super::layout() calls the layout function of NominalMetadataVisitor, which we also reproduce here:

template <class Impl> class NominalMetadataVisitor {
public:
  void layout() {
    // Common fields.
    asImpl().addValueWitnessTable();
    asImpl().noteAddressPoint();
    asImpl().addMetadataFlags();
  }
};

Substituting the calls in the layout body one by one, we can obtain all the fields of ListNode's type metadata:

Common non-generic, non-@_fixed_layout Swift object memory layout and type metadata layout

The type metadata layout for the common non-generic, non-@_fixed_layout class is as follows:

Question to ponder: Can you explain why Swift did it this way?

Summary: why an overlong linked list bursts the stack

At this point, we can summarize the release process of a linked list built from ListNode:

  1. The root node enters the release process first.
  2. The root node's `__deallocating_deinit` calls the root node's `deinit` function.
  3. The root node's `deinit` calls a compiler-generated outlined `ListNode` member-destruction routine.
  4. That compiler-generated routine calls `swift_release` to drop the strong reference to the `next` node.
  5. `swift_release` calls `_swift_release_` to drop the strong reference to the `next` node.
  6. Because this is the only strong reference to the `next` node and there are no `unowned` references to it, `_swift_release_` eventually drops the reference through `_swift_release_dealloc`.
  7. `_swift_release_dealloc` destroys the `next` node by calling `ListNode`'s `__deallocating_deinit`.
  8. The `next` node's reference count reaches 0 and it enters the release process, taking us back to step 2 one node further down the chain.

We can verify that this statement is true by looking at the full call stack.

To see the complete call stack instead of the call stack summary in the Debug Navigator, click on the “Four Squares” icon in the upper left corner of the Xcode edit area, then click on Disassembly, then click on the current stack frame ListNode.deinit to open the full call stack. Finally, we can see the following image:

Note that the figure above contains no activation records for the swift_release and _swift_release_ functions. This is because the compiler applied tail-call optimization (TCO) to these two calls, replacing call-family instructions with jmp-family instructions (I say "family" because instructions that perform the same job have different names on 32-bit and 64-bit platforms). This lets the subsequent _swift_release_ and _swift_release_dealloc functions reuse the stack frame of the original swift_release call, which is exactly what we observe.

We can verify this by typing symbolic breakpoints in Xcode.

Press Command + 8 to switch the Navigator panel to the Breakpoint Navigator, then press the "+" button in the lower left corner of the panel to add two symbolic breakpoints: swift_release and _swift_release_.

Note that reference counting is also triggered while the nodes are being created, so swift_release would be hit at that point as well. We therefore disable the two breakpoints first:

We also set a breakpoint at the print("Bar") line:

Run the program until the print("Bar") breakpoint is hit, then re-enable swift_release and _swift_release_ and continue. We will then see the program pause at the entries of swift_release and _swift_release_.

We can see that swift_release's call to _swift_release_, and _swift_release_'s call to _swift_release_dealloc, are both made with jmp-family instructions. This is tail-call optimization (TCO) as seen at the instruction level.

libswiftCore.dylib`swift_release:
    ....
    0x7fff65730905 <+5>:  jmpq   *0x3351ac9d(%rip)         ; _swift_release
    ....
libswiftCore.dylib`_swift_release_:
    ...
    0x7fff65730ce1 <+145>: jmp    0x7fff65731a50            ; _swift_release_dealloc
    ...

By observing these functions' activation records, we have verified that the description of the ListNode release process above is correct. As we can see, the successor node begins to be released before the root node has been completely destroyed, and it enters its own automatic destruction because nothing else holds a strong reference to it. Like nuclear fission, this destruction is a chain reaction: once the chain is long enough, that is, once the list itself is long enough, the stack bursts. And the result of a burst stack is, of course, an application crash.

In addition, we can observe that the compiler does not perform tail-call optimization on this indirectly recursive destruction process.

Linked list implementations that do not burst the stack

So how can we implement a linked list that never bursts the stack on release?

Method 1: Handle the release manually

When releasing a ListNode, we can consider releasing the successor nodes in reverse order. This maps onto a classic interview question: reversing a linked list. So we can write the following code:

class ListNode {
    var val: Int
    var next: ListNode?
    
    init(_ val: Int) {
        self.val = val
        self.next = nil
    }
    
    deinit {
        // Start from self.next: retaining self inside its own deinit
        // is not allowed, so we only collect the successor nodes.
        var currentOrNil: ListNode? = self.next
        var nodes = [ListNode]()
        while let current = currentOrNil {
            nodes.append(current)
            currentOrNil = current.next
        }
        
        for each in nodes.reversed() {
            each.next = nil
        }
    }
}

In this way, we convert the recursion into a loop and avoid the stack burst caused by the chained release of ListNode. I use exactly this method in the deinit of the doubly linked list inside a toy-level LRUCache I once wrote.

However, a linked list built this way must insert all of its nodes into a dynamic array every time it is released. Although dynamic array insertion is amortized O(1), this still consumes O(n) auxiliary space, plus O(n) extra time (which is why I call the LRU cache I wrote toy-level). Is there any way to eliminate this?
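Incidentally, the same recursion-to-loop idea can be expressed with O(1) auxiliary space by detaching nodes front to back instead of collecting them into an array. The sketch below is my own, not code from the book:

```swift
import Foundation

class ListNode {
    var val: Int
    var next: ListNode?

    init(_ val: Int) {
        self.val = val
        self.next = nil
    }

    deinit {
        // Detach the chain front to back. Each node we unlink already has
        // next == nil by the time its own deinit runs, so there is no
        // recursion and no auxiliary array.
        var current = self.next
        self.next = nil
        while let node = current {
            current = node.next
            node.next = nil
            // `node` is released at the end of this iteration; its deinit
            // sees next == nil and returns immediately.
        }
    }
}

// A chain this long would burst the stack under the naive implementation.
autoreleasepool {
    let root = ListNode(0)
    var current = root
    for _ in 0..<100_000 {
        let next = ListNode(0)
        current.next = next
        current = next
    }
}
print("released without stack overflow")
```

Each node's deinit only ever sees an already-detached successor, so both the stack depth and the auxiliary space stay constant regardless of list length.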

Method 2: Conduct Pooling of linked list nodes

A linked list is a data structure that exposes its internal structure (predecessor and successor nodes) to its users, so while using one we must also think about maintaining that internal structure, which imposes a considerable mental burden. We can reduce this burden by encapsulating the linked list inside a type that alone is allowed to operate on it, and then wrapping some collection-style functionality on top.

For that, check out a series of articles by a developer of StarCraft I about his battles with linked lists during the game's development:

I. Tough Times on the Road to Starcraft

II. Avoiding Game Crashes Related to Linked Lists

III. Explaining the Implementation Details of the Fix (sorry, he has owed this one for seven years)

In this way, we can apply the pooling pattern inside that type. Pooling is a common design pattern in compute-intensive fields (such as game clients, large desktop applications, and large servers), but it is rarely seen in iOS development. You could even say that by deprecating the [NSObject +allocWithZone:] API in Objective-C 2.0, Apple signaled that it does not encourage pooling in Objective-C; you can only really pool in C or C++. The concrete approach of the pooling pattern is to pre-allocate a pool of memory and then carve the storage for new objects directly out of that pool. This avoids the overhead of the system's memory allocation functions, which must negotiate with the operating system for resource-scheduling information, and thus improves computational performance. Of course, these benefits are only a bonus here; they are not the fundamental reason to reach for pooling when solving the problem of overlong ListNode lists crashing on release.

The basic purpose of the pooling pattern here is that if we create a pool for each instance of the encapsulation type that uses the linked list, then we only need to release the pool when we release the linked list.

As mentioned, Apple does not seem to encourage pooling in Objective-C, and applying pooling in Swift is not easy either, because the memory allocation of class instances is controlled by the Swift runtime. We can find the following function in stdlib/public/runtime/Heap.cpp:

void *swift::swift_slowAlloc(size_t size, size_t alignMask) {
  void *p;
  // This check also forces "default" alignment to use AlignedAlloc.
  if (alignMask <= MALLOC_ALIGN_MASK) {
    p = malloc(size);
  } else {
    size_t alignment = (alignMask == ~(size_t(0)))
                           ? _swift_MinAllocationAlignment
                           : alignMask + 1;
    p = AlignedAlloc(size, alignment);
  }
  if (!p) swift::crash("Could not allocate memory.");
  return p;
}

This is the Swift runtime function that allocates heap memory, and it is what ultimately allocates memory for class instances. As we can see, its current implementation uses the C standard library's malloc; the AlignedAlloc branch on Darwin likewise uses posix_memalign. Developers can hardly hook into this process. To use the pooling pattern, we can only implement a memory pool ourselves with struct and UnsafeMutablePointer from the Swift standard library.
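To make this concrete, here is a minimal sketch (my own illustration, with invented names) of carving fixed-size buckets out of a single UnsafeMutablePointer allocation:

```swift
// Illustrative only: a fixed-capacity pool of (val, next) buckets carved out
// of one raw allocation. next == -1 plays the role of the null pointer.
struct NodePool {
    struct Bucket { var val: Int; var next: Int }

    let capacity: Int
    private let storage: UnsafeMutablePointer<Bucket>

    init(capacity: Int) {
        self.capacity = capacity
        storage = .allocate(capacity: capacity)
        // Initially, chain every bucket onto the reuse list.
        for i in 0..<capacity {
            storage.advanced(by: i)
                .initialize(to: Bucket(val: 0, next: i == capacity - 1 ? -1 : i + 1))
        }
    }

    subscript(offset: Int) -> Bucket {
        get { storage[offset] }
        nonmutating set { storage[offset] = newValue }
    }

    // Must be called exactly once; nothing is deallocated automatically.
    func destroy() {
        storage.deinitialize(count: capacity)
        storage.deallocate()
    }
}
```

A real pool would also need growth, bounds checking, and copy-on-write; this sketch only shows the allocation side.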

Of course, you can try a "wilder" approach: create a memory pool in C, lay out every instance in the pool to match Objective-C's object memory model, and then bridge the pool's contents to Swift as CF objects. One advantage of this is that you can obtain class instances without manipulating raw pointers.

If you're interested in this approach, check out this article on Qiita (a Japanese developer community):

Reading Swift's toll-free bridging (article in Japanese)

The overall design idea is as follows:

As shown in the figure, _ListStorage maintains a memory pool buffer, a head-node pointer headOffset (the head of the in-use node chain), and a reuse-node pointer reuseHeadOffset (the head of the reuse node chain). In this linked list implementation, because all nodes live inside the memory pool, we can record the position of each node's successor as an offset into the pool.

  • At initialization, the head-node pointer is null (-1); the reuse-node pointer points to the first cell of the memory pool; each cell's next pointer points to the following cell, and the last cell's next pointer is null (-1). At this point, every cell of the memory pool is on the reuse chain.
  • On insertion, the head cell is taken off the reuse chain (growing the memory pool if the chain is empty) and inserted at the head of the in-use chain.
  • On deletion, the head cell of the in-use chain is removed and inserted at the head of the reuse chain.

Since we always allocate cells of a single fixed size, we do not need extra bookkeeping space to record each allocation's size the way malloc does, so the implementation of this memory pool is very simple.
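The three operations above can be sketched like this (my own illustrative code: names such as PooledIntList and pushFront are invented and differ from the actual GitHub implementation):

```swift
// Free-list sketch: nodes live in one array and are linked by offsets;
// -1 plays the role of the null pointer.
struct PooledIntList {
    private struct Bucket { var val: Int; var next: Int }

    private var buffer: [Bucket] = []
    private var headOffset = -1       // head of the in-use chain
    private var reuseHeadOffset = -1  // head of the reuse chain

    mutating func pushFront(_ val: Int) {
        let slot: Int
        if reuseHeadOffset != -1 {
            // Take a cell off the reuse chain.
            slot = reuseHeadOffset
            reuseHeadOffset = buffer[slot].next
            buffer[slot] = Bucket(val: val, next: headOffset)
        } else {
            // Reuse chain is empty: grow the pool.
            slot = buffer.count
            buffer.append(Bucket(val: val, next: headOffset))
        }
        headOffset = slot
    }

    mutating func popFront() -> Int? {
        guard headOffset != -1 else { return nil }
        let slot = headOffset
        headOffset = buffer[slot].next
        let val = buffer[slot].val
        // Put the freed cell back on the reuse chain.
        buffer[slot].next = reuseHeadOffset
        reuseHeadOffset = slot
        return val
    }
}
```

Releasing such a list just releases buffer, a single flat array deallocation, so there is no recursive deinit chain no matter how long the list is.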

I posted the full code for this implementation on GitHub, and here are a few highlights:

  • Instead of storing a raw pointer to the successor node in each cell of the buffer (cells correspond to the bucket concept in the code), I use the node's offset within the memory pool as the pointer value. The advantage is that when copy-on-write (COW) kicks in, the whole container can simply be copied as-is; otherwise, we would have to fix up the pointers one by one.

  • I set the growth factor for the memory pool to 1.5

    let newCapacity = max(1, _capacity + ((_capacity + 1) >> 1))

    Because with a growth factor of 1.5, the memory allocator eventually gets an opportunity to reuse previously freed space, which reduces memory fragmentation.

    Examples are as follows:

    First allocation:   1 2
    Second allocation:  . . 1 2 3
    Third allocation:   . . . . . 1 2 3 4 5
    Fourth allocation:  . . . . . . . . . . 1 2 3 4 5 6 7 8
    Fifth allocation:   . . . . . . . . . . . . . . . . . . 1 2 3 4 5 6 7 8 9 A B C
    Sixth allocation:   1 2 3 4 5 6 7 8 9 A B C D E F 10 11 12

    As you can see, on the sixth memory allocation the allocator has an opportunity to reuse the previously freed space.

    The growth factor of Array and ContiguousArray in Swift is 2; in stdlib/public/core/ArrayShared.swift we can find the following:

    @inlinable
    internal func _growArrayCapacity(_ capacity: Int) -> Int {
        return capacity * 2
    }

    This function controls the growth factor of Swift's Array and ContiguousArray. So for Array and ContiguousArray, the allocator never gets the opportunity to reuse previously freed space as the array grows.

    In real-world projects, Tencent's RapidJSON and Facebook's Folly both use 1.5 as the growth factor for their dynamic-array containers.

    By contrast, std::vector in Clang's libc++ and GCC's libstdc++ uses a growth factor of 2.

  • We can make this type easier to use in Swift by conforming our list type to Sequence and Collection (I have already done this in the sample code).

    However, be careful: index-based access into a linked list costs O(n), and once the subscript getter is implemented, Collection automatically provides an iterator built on that subscript. So if we traverse the list using the usual Swift idioms for Sequence and Collection (such as a for-in loop), performance will be very poor: up to O(n^2) in time.

    So you should implement an iterator of your own to achieve O(n) traversal complexity (I have already done that in the sample code).

  • If you want to use this in a production environment, you should also conform the list type to RangeReplaceableCollection. That protocol has a removeAll(keepingCapacity:) function, which lets us offer an interface for releasing unneeded space in the memory pool.
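A self-contained sketch of such a hand-written iterator (illustrative names only; the actual sample code differs): it follows the stored next offsets directly, so a full for-in pass is O(n) overall.

```swift
// A list stored as parallel arrays linked by offsets, with a custom
// iterator that walks the offset chain instead of using index subscripts.
struct OffsetList: Sequence {
    var vals: [Int] = []
    var nexts: [Int] = []   // -1 terminates the chain
    var headOffset = -1

    struct Iterator: IteratorProtocol {
        let list: OffsetList
        var offset: Int

        mutating func next() -> Int? {
            guard offset != -1 else { return nil }
            // Advance along the chain in O(1) per element.
            defer { offset = list.nexts[offset] }
            return list.vals[offset]
        }
    }

    func makeIterator() -> Iterator {
        Iterator(list: self, offset: headOffset)
    }

    mutating func pushFront(_ val: Int) {
        vals.append(val)
        nexts.append(headOffset)
        headOffset = vals.count - 1
    }
}
```

With this conformance, `for value in list { ... }` uses our iterator rather than Collection's subscript-based default, avoiding the O(n^2) trap described above.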

Method 3: Pool only the references to linked list nodes

The method above is rather cumbersome: it requires building your own memory pool (even though this one is a very simple special case). Is there an easier way?

There is.

We can go back to the basic approach of modeling list nodes with a class, and then pool only the references to the list nodes. This is a bit like how reusable cells are handled inside UITableView; Facebook's ComponentKit likewise creates a reuse pool inside each view of the view hierarchy to reuse the subviews it controls. In these examples, only object pointers are pooled, not the objects' raw memory.

The overall design idea is as follows:

As shown, _ListStorage maintains an array nodes holding the linked list nodes, the index headNodeIndex of the head node within that array, and the index of the head reusable node within the array. As before, each list node records the position of its successor as an offset into the array. The main benefit of this arrangement is that it circumvents the chained reference-counting release.
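A minimal sketch of this idea (my own illustration; the names here are invented, and the real GitHub code is more complete):

```swift
// Nodes remain class instances, but they are all owned by one array and
// linked by array indices, so no node strongly references another node.
// Releasing the storage releases every node in a flat loop over `nodes`.
final class PooledNode {
    var val = 0
    var next = -1   // index into `nodes`; -1 plays the role of nil
}

struct RefPoolList {
    private var nodes: [PooledNode] = []
    private var headNodeIndex = -1
    private var reuseHeadNodeIndex = -1

    mutating func pushFront(_ val: Int) {
        let index: Int
        if reuseHeadNodeIndex != -1 {
            // Recycle an existing node object instead of allocating.
            index = reuseHeadNodeIndex
            reuseHeadNodeIndex = nodes[index].next
        } else {
            index = nodes.count
            nodes.append(PooledNode())
        }
        nodes[index].val = val
        nodes[index].next = headNodeIndex
        headNodeIndex = index
    }

    mutating func popFront() -> Int? {
        guard headNodeIndex != -1 else { return nil }
        let index = headNodeIndex
        headNodeIndex = nodes[index].next
        // Keep the node object alive on the reuse chain for later pushes.
        nodes[index].next = reuseHeadNodeIndex
        reuseHeadNodeIndex = index
        return nodes[index].val
    }
}
```

Since no PooledNode ever holds a strong reference to another PooledNode, releasing the list releases the nodes array in one iterative pass, with no recursive deinit chain.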

I also put the complete code for this implementation on GitHub.

Likewise, if you want to use this list in a production environment, you should conform it to RangeReplaceableCollection; its removeAll(keepingCapacity:) function lets us offer an interface for releasing unneeded space in the data structure.

I also ran a simple benchmark in the sample project; Method 2 performed significantly better than Method 3.


This article refers to resource indexes

SSA Book

Tough Times on the Road to Starcraft

Avoiding Game Crashes Related to Linked Lists

LinkedListExamples: the sample code I wrote for linked lists that do not burst the stack

For reading the Swift source code, I recommend CLion (not a paid promotion).


The Chinese-variant conversion of this article was done with OpenCC.