I. Function types

In Swift, every function has its own type, which consists of its parameter types and its return type. When a function is used as a variable and several functions share the same name, the compiler reports an error unless the type is specified, because the reference is ambiguous.
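As a minimal sketch of the ambiguity described above (the two add overloads here are illustrative and are not taken from the original listings), a type annotation is what selects the overload:

```swift
// Two overloads that share the name `add` but have different function types.
func add(_ a: Int, _ b: Int) -> Int { a + b }
func add(_ a: Double, _ b: Double) -> Double { a + b }

// let f = add   // does not compile: the reference to 'add' is ambiguous

// A type annotation tells the compiler which overload is meant.
let intAdd: (Int, Int) -> Int = add
let doubleAdd: (Double, Double) -> Double = add

print(intAdd(1, 2))          // 3
print(doubleAdd(1.5, 2.0))   // 3.5
```

The annotation is the function type itself: the parameter types in parentheses, then the return type.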

So what is a function type underneath? Open the Swift source and find TargetFunctionTypeMetadata in the Metadata.h file.

TargetFunctionTypeMetadata inherits from TargetMetadata, so it must have a Kind field. It adds its own Flags and ResultType, where ResultType is the metadata of the return type. Next, look at the getParameters function, which casts (this + 1) to Parameter * via reinterpret_cast. It returns a pointer, so the function returns the start of a contiguous region of memory, located right after the structure, that stores values of type Parameter.

For details on this + 1, see point 5 of the article “Meta-types and Mirror source code and HandyJson parse Metadata for enumerations, structs, and classes.” Reconstructed in Swift, TargetFunctionTypeMetadata looks like this:

struct TargetFunctionTypeMetadata {
    var Kind: Int
    var Flags: Int
    var ResultType: Any.Type
    var parameters: ParametersBuffer<Any.Type>
}

struct ParametersBuffer<Element> {
    var element: Element

    mutating func buffer(n: Int) -> UnsafeBufferPointer<Element> {
        return withUnsafePointer(to: &self) {
            let ptr = $0.withMemoryRebound(to: Element.self, capacity: 1) { start in
                return start
            }
            return UnsafeBufferPointer(start: ptr, count: n)
        }
    }

    mutating func index(of i: Int) -> UnsafeMutablePointer<Element> {
        return withUnsafePointer(to: &self) {
            return UnsafeMutablePointer(mutating: UnsafeRawPointer($0).assumingMemoryBound(to: Element.self).advanced(by: i))
        }
    }
}

With the structure reconstructed, the next goal is to get the number of parameters. In the source, the count is read from Flags.

Flags is of type TargetFunctionTypeFlags, and the count comes from its getNumParameters method.

TargetFunctionTypeFlags stores its information as bit masks. Because its only member variable is Data, the number of parameters is Data & NumParametersMask.

Translated to Swift, the getNumParameters function looks like this:

extension TargetFunctionTypeMetadata {
    // The low 16 bits of Flags hold the number of parameters
    func getNumParameters() -> Int { self.Flags & 0x0000FFFF }
}
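To make the masking step concrete, here is a small standalone sketch. The helper name numParameters(inFlags:) and the sample flags value are hypothetical and chosen only for illustration; the mask is the 0x0000FFFF value used above.

```swift
// The low 16 bits of the flags word hold the parameter count.
let numParametersMask = 0x0000_FFFF

// Hypothetical helper: extract the count from a raw flags word.
func numParameters(inFlags flags: Int) -> Int {
    flags & numParametersMask
}

// Hypothetical flags word: some high bits set, a count of 2 in the low bits.
let sampleFlags = 0x0400_0002
print(numParameters(inFlags: sampleFlags))   // 2
```

Whatever the higher bits encode, the mask discards them and leaves only the count.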

Next, let's print the information stored in TargetFunctionTypeMetadata:

func add(_ a: Double, _ b: Int) -> Double {
    return a + Double(b)
}

let functionType = unsafeBitCast(type(of: add) as Any.Type, to: UnsafeMutablePointer<TargetFunctionTypeMetadata>.self)
let numParameters = functionType.pointee.getNumParameters()

print("Number of function parameters: \(numParameters)")
for i in 0..<numParameters {
    print("Type of parameter \(i): \(functionType.pointee.parameters.index(of: i).pointee)")
}
print("Return type of the function: \(functionType.pointee.ResultType)")

Result:

Number of function parameters: 2
Type of parameter 0: Double
Type of parameter 1: Int
Return type of the function: Double

II. Closure expressions

In Swift, a function can be defined with func or with a closure expression.

1. Closure expression writing

A closure expression consists of curly braces, a parameter list, a return type, the keyword in, and the function body. It is written as follows:

{
    (parameter list) -> return type in
    function body
}

The expression is enclosed in { }. The parameter list and return type come before in, and the function body comes after it.

A function defined with a closure expression:

var add = {
    (a: Int, b: Int) -> Int in
    return a + b
}

The same function defined with func:

func add(_ a: Int, _ b: Int) -> Int {
    return a + b
}

Either way, the call is the same and the result is the same.

print(add(10, 20)) // 30

2. Shorthand for closure expressions

We define a function, exec, that takes three parameters. The first and second are of type Int, and the third is a function type that takes two Ints and returns an Int.

func exec(v1: Int, v2: Int, fn: (Int, Int) -> Int) {
    print(fn(v1, v2))
}

Here are a few ways to call this function:

// First form
exec(v1: 10, v2: 20, fn: { (v1: Int, v2: Int) -> Int in
    return v1 + v2
})
// Second form
exec(v1: 10, v2: 20, fn: { v1, v2 in
    return v1 + v2
})
// Third form
exec(v1: 10, v2: 20, fn: { v1, v2 in
    v1 + v2
})
// Fourth form
exec(v1: 10, v2: 20, fn: { $0 + $1 })
// Fifth form
exec(v1: 10, v2: 20, fn: +)

  • The first form uses no shorthand. The closure expression's parameter names, parameter types, return type, and function body are all written out.

  • The second form omits the parameter types and the return type, which the compiler infers from the type of fn.

  • The third form also omits return. In Swift, return can be omitted when the function body consists of a single expression.

  • The fourth form also omits the parameter names and in. Because the body is so simple, $0 and $1 can stand for v1 and v2 directly.

  • The fifth form omits the function body as well. The body only adds its two parameters, and the + operator is itself a function of that type, so + can be passed directly.
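The same five forms apply to any function that takes a closure. As an illustrative sketch (the numbers array is made up for this example), here they are with the standard library's sorted(by:), which makes the results easy to compare:

```swift
let numbers = [3, 1, 2]

// Fully spelled out: parameter names, types, return type, and return.
let a = numbers.sorted(by: { (x: Int, y: Int) -> Bool in return x < y })
// Parameter and return types inferred.
let b = numbers.sorted(by: { x, y in return x < y })
// Single-expression body: return omitted.
let c = numbers.sorted(by: { x, y in x < y })
// Shorthand argument names.
let d = numbers.sorted(by: { $0 < $1 })
// The operator itself is a function of type (Int, Int) -> Bool.
let e = numbers.sorted(by: <)

print(a, b, c, d, e)   // every array is [1, 2, 3]
```

All five calls hand sorted(by:) a value of the same function type; only the amount of syntax differs.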

3. Trailing closures

If a long closure expression is the last argument to a function, a trailing closure can make the call more readable. A trailing closure is a closure expression written outside (after) the parentheses of the function call. For example, the exec function from point 2 can be called with a trailing closure:

exec(v1: 10, v2: 20) { v1, v2 in
    return v1 + v2
}

If the closure expression is the only argument to the function and trailing closure syntax is used, the parentheses after the function name can be omitted.

func exec(fn: (Int, Int) -> Int) {
    print(fn(10, 20))
}

exec(fn: { $0 + $1 })
exec() { $0 + $1 }
exec { $0 + $1 }
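A short sketch of both rules with values that can be inspected afterwards. The helper combine is hypothetical; it is modeled on exec but returns its result instead of printing it.

```swift
// Hypothetical helper modeled on exec, returning the result.
func combine(_ v1: Int, _ v2: Int, using fn: (Int, Int) -> Int) -> Int {
    fn(v1, v2)
}

// The closure is the last argument, so it can trail the parentheses.
let sum = combine(10, 20) { $0 + $1 }
let product = combine(10, 20) { v1, v2 in v1 * v2 }

// The closure is the only argument, so the parentheses disappear entirely.
let evens = [1, 2, 3, 4].filter { $0 % 2 == 0 }

print(sum, product, evens)   // 30 200 [2, 4]
```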

III. Closures

A function combined with the environment of variables and constants it captures is called a closure. The term generally refers to a function defined inside another function that captures local variables or constants of the outer function.

1. Capture local variables

We use typealias to define a function type Fn, and define a function getFn that returns a value of that type. This is the same getFn that is analyzed with IR later in this article.

The parts highlighted in red in the original listing are collectively called the closure. num is a local variable of getFn. Call getFn and see what happens:

let fn = getFn()
print(fn(1))   // 1
print(fn(1))   // 2
print(fn(1))   // 3

Each call to fn passes 1, yet each call prints a different value. It looks as though num is kept alive somewhere after getFn returns, presumably on the heap. Let's look at the call to getFn in assembly.

Notice that getFn's assembly calls swift_allocObject, which allocates heap memory. So the closure allocates heap memory and puts the value of num there; every call to fn accesses that heap memory and performs the += operation.
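The effect of that per-call allocation can be observed without assembly. The sketch below reuses the getFn definition that appears later in this article and calls it twice; the names fn1 and fn2 are only for illustration. Each call to getFn gets its own num.

```swift
typealias Fn = (Int) -> Int

func getFn() -> Fn {
    var num = 0                    // local variable captured by plus
    func plus(_ i: Int) -> Int {
        num += i
        return num
    }
    return plus
}

let fn1 = getFn()
let fn2 = getFn()

print(fn1(1))   // 1
print(fn1(1))   // 2
print(fn2(1))   // 1: fn2 has its own num
```

If num lived in a single shared location, the last line would print 3 rather than 1.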

To check this, set a breakpoint after the call to swift_allocObject and read its return value, which is the address of the newly allocated heap object.

Then set a breakpoint at return num. The heap object's address is 0x0000000101019D10.

Dumping the memory at that address shows the value of num stored at 0x101019d20, 16 bytes into the heap object, so num does live on the heap.

2. Capture global variables

When a function captures a local variable or constant, heap memory is allocated to store it. What happens when the function captures a global variable instead? The code is the same getFn, with num declared as a global variable.

Set breakpoints at return plus and return num to see what getFn does in assembly.

getFn does not allocate any heap memory in its assembly. It returns the address of the plus function directly.

Note that plus modifies the global variable num directly through its address. Functions do not capture global variables or constants, so strictly speaking this is not a closure.
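The listing for this case did not survive in the text, so the following is a minimal sketch that assumes it is the earlier getFn with num moved to global scope. It shows the observable difference: every function returned by getFn works on the same num.

```swift
typealias Fn = (Int) -> Int

var num = 0   // global variable: nothing needs to be captured

func getFn() -> Fn {
    func plus(_ i: Int) -> Int {
        num += i          // reads and writes the global directly
        return num
    }
    return plus
}

let fn1 = getFn()
let fn2 = getFn()

print(fn1(1))   // 1
print(fn2(1))   // 2: fn2 sees the change made through fn1
print(num)      // 2
```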

IV. The nature of closures

When exploring the nature of closures, we need to use IR code for analysis, so let’s familiarize ourselves with some IR syntax.

1. Syntax of IR

Array:

[<element count> x <element type>]
// examples
alloca [24 x i8], align 8   // an array of 24 i8 elements
alloca [4 x i32]            // an array of 4 i32 elements

Structure:

%swift.refcounted = type { %swift.type*, i64 }

// general form
%T = type { <type list> } // similar to a struct in C

Pointer type:

<type> *

// example
i64* // pointer to a 64-bit integer

getelementptr instruction:

In LLVM, the addresses of the members of arrays and structures are computed with getelementptr. The syntax is as follows:

<result> = getelementptr <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
<result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*

Here is an example from the LLVM website:

struct munger_struct {
    int f1;
    int f2;
};

void munge(struct munger_struct *P) {
    P[0].f1 = P[1].f1 + P[2].f2;
}

getelementptr inbounds %struct.munger_struct, %struct.munger_struct %1, i64
getelementptr inbounds %struct.munger_struct, %struct.munger_struct %1, i32

int main(int argc, const char * argv[]) {
    int array[4] = {1, 2, 3, 4};
    int a = array[0];
    return 0;
}

The statement int a = array[0] corresponds to LLVM code along these lines (getelementptr only computes the address of array[0]; a separate load reads the value):

a = getelementptr inbounds [4 x i32], [4 x i32]* array, i64 0, i32 0

The summary is as follows:

  • The first index does not change the type of the returned pointer: the result points to whatever type the * in front of ptrval refers to.

  • The offset contributed by the first index is determined by the value of the first index and by the base type given as the first ty.

  • The second index selects an element inside the array or structure; its offset is determined by the value of the second index.

  • Each additional index removes one layer from the base type used by that index and from the pointer type returned.

For example, given the address of a [4 x i32] array, the first index works on the type [4 x i32], and the second index, with one layer removed, works on i32.
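The same address arithmetic can be reproduced in Swift with MemoryLayout. This is only an analogy for the two kinds of index, not generated IR. MungerStruct below is a Swift stand-in for the C munger_struct, and the sample values are made up.

```swift
// Swift stand-in for: struct munger_struct { int f1; int f2; };
struct MungerStruct {
    var f1: Int32
    var f2: Int32
}

// First index: steps over whole elements of the base type.
let stride = MemoryLayout<MungerStruct>.stride                      // 8 bytes
// Second index: selects a field inside one element.
let f2Offset = MemoryLayout<MungerStruct>.offset(of: \MungerStruct.f2)!  // 4 bytes

// Address of P[2].f2 relative to P, as "getelementptr ..., 2, 1" would compute it.
let byteOffset = 2 * stride + f2Offset                               // 20

let p = [MungerStruct(f1: 0, f2: 1),
         MungerStruct(f1: 10, f2: 11),
         MungerStruct(f1: 20, f2: 21)]
let loaded = p.withUnsafeBytes { raw in
    raw.load(fromByteOffset: byteOffset, as: Int32.self)
}
print(byteOffset, loaded)   // 20 21
```

The first index multiplies by the size of the whole structure, while the second only moves within it, which matches the summary above.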

2. Analyzing the closure with IR

The code is as follows:

typealias Fn = (Int) -> Int

func getFn() -> Fn {
    var num = 0
    func plus(_ i: Int) -> Int {
        num += i
        return num
    }

    return plus
}

let fn = getFn()

2.1. Analysis of main function

Compile the current main.swift file into main.ll (the method and command are covered in the article “method”). Open the generated main.ll and find the main function:

define i32 @main(i32 %0, i8** %1) #0 {
entry:
    %2 = bitcast i8** %1 to i8*
    // Call getFn, which returns a { i8*, %swift.refcounted* } structure
    %3 = call swiftcc { i8*, %swift.refcounted* } @"main.getFn() -> (Swift.Int) -> Swift.Int"()
    %4 = extractvalue { i8*, %swift.refcounted* } %3, 0
    %5 = extractvalue { i8*, %swift.refcounted* } %3, 1
    store i8* %4, i8** getelementptr inbounds (%swift.function, %swift.function* @"main.fn : (Swift.Int) -> Swift.Int", i32 0, i32 0), align 8
    store %swift.refcounted* %5, %swift.refcounted** getelementptr inbounds (%swift.function, %swift.function* @"main.fn : (Swift.Int) -> Swift.Int", i32 0, i32 1), align 8
    ret i32 0
}

Note that the %3 line calls the getFn function, and its return value is { i8*, %swift.refcounted* }. Searching main.ll for the types involved gives:

%swift.function = type { i8*, %swift.refcounted* }
%swift.refcounted = type { %swift.type*, i64 }
%swift.type = type { i64 }
%swift.full_boxmetadata = type { void (%swift.refcounted*)*, i8**, %swift.type, i32, i8* }

Reading these with the IR syntax above:

  • { i8*, %swift.refcounted* } is a structure with two members, one of type i8* and one of type %swift.refcounted*.

  • %swift.refcounted* is a pointer to a structure of the form { %swift.type*, i64 }, which has two members, one of type %swift.type* and one of type i64.

  • %swift.type* is a pointer to a structure of the form { i64 }, which has a single i64 member.

  • %swift.full_boxmetadata appears to be the metadata of the box. It is what gets passed to swift_allocObject when the heap memory is allocated below.

2.2. Analysis of getFn function

The getFn function is implemented as follows:

define hidden swiftcc { i8*, %swift.refcounted* } @"main.getFn() -> (Swift.Int) -> Swift.Int"() #0 {
entry:
    %num.debug = alloca %TSi*, align 8
    %0 = bitcast %TSi** %num.debug to i8*
    call void @llvm.memset.p0i8.i64(i8* align 8 %0, i8 0, i64 8, i1 false)

    // Call swift_allocObject to create an instance. As described in the earlier article on structs and classes, it returns a HeapObject *, so %swift.refcounted* should be a HeapObject *.
    // The first parameter of swift_allocObject is the metadata; i64 24 and i64 7 should be the allocation size and the alignment mask.
    %1 = call noalias %swift.refcounted* @swift_allocObject(%swift.type* getelementptr inbounds (%swift.full_boxmetadata, %swift.full_boxmetadata* @metadata, i32 0, i32 2), i64 24, i64 7) #1

    // Cast the HeapObject * returned by swift_allocObject to a pointer to the structure <{ %swift.refcounted, [8 x i8] }>.
    %2 = bitcast %swift.refcounted* %1 to <{ %swift.refcounted, [8 x i8] }>*

    // Get the address of the [8 x i8] member of <{ %swift.refcounted, [8 x i8] }>.
    %3 = getelementptr inbounds <{ %swift.refcounted, [8 x i8] }>, <{ %swift.refcounted, [8 x i8] }>* %2, i32 0, i32 1
    %4 = bitcast [8 x i8]* %3 to %TSi*
    store %TSi* %4, %TSi** %num.debug, align 8
    %._value = getelementptr inbounds %TSi, %TSi* %4, i32 0, i32 0
    store i64 0, i64* %._value, align 8
    %5 = call %swift.refcounted* @swift_retain(%swift.refcounted* returned %1) #1
    call void @swift_release(%swift.refcounted* %1) #1

    // insertvalue inserts a value into a member of an aggregate.
    // bitcast casts the address of the plus function to i8*, which becomes the i8* member of { i8*, %swift.refcounted* }.
    // Then %1, the pointer to <{ %swift.refcounted, [8 x i8] }>, is inserted into the %swift.refcounted* member of { i8*, %swift.refcounted* }.
    %6 = insertvalue { i8*, %swift.refcounted* } { i8* bitcast (i64 (i64, %swift.refcounted*)* @"partial apply forwarder for plus #1 (Swift.Int) -> Swift.Int in main.getFn() -> (Swift.Int) -> Swift.Int" to i8*), %swift.refcounted* undef }, %swift.refcounted* %1, 1

    // Return the { i8*, %swift.refcounted* } structure.
    ret { i8*, %swift.refcounted* } %6
}

Note that the %1 line calls swift_allocObject to create an instance. As described in the earlier article on structs and classes, it returns a HeapObject *, so %swift.refcounted* should be a HeapObject *. i64 24 and i64 7 correspond to the allocation size and the alignment mask.

The middle part casts the HeapObject * returned by swift_allocObject to a pointer to <{ %swift.refcounted, [8 x i8] }> and stores the initial value of num, 0, in the [8 x i8] member.

Finally, look at the %6 line. The address of the plus function (through its partial apply forwarder) and the pointer to <{ %swift.refcounted, [8 x i8] }> are inserted into the structure { i8*, %swift.refcounted* }, and that structure is returned.

3. Structural restoration of closures

Based on the above analysis, it can be concluded that:

  • A closure is essentially a structure of the form { i8*, %swift.refcounted* }. The i8* member stores the address of the function, and the %swift.refcounted* member is a pointer to a box (<{ %swift.refcounted, [8 x i8] }>).

  • The box consists of a HeapObject followed by a 64-bit value.

  • The HeapObject stores the metadata pointer and the refcount.

The final structure of the closure can be restored as follows:

struct ClosureData<Box> {
    /// Function address
    var ptr: UnsafeRawPointer
    /// Address of the heap box that stores the captured value
    var object: UnsafePointer<Box>
}

struct Box<T> {
    var heapObject: HeapObject
    // The value of the captured variable/constant
    var value: T
}

struct HeapObject {
    var metadata: UnsafeRawPointer
    var refcount: Int
}
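One part of this reconstruction can be checked without touching raw memory: a function value should occupy exactly two pointer-sized words. The sketch below assumes a current 64-bit toolchain; the layout is an implementation detail rather than a documented guarantee, and ClosureWords is a hypothetical stand-in for ClosureData.

```swift
// Hypothetical stand-in for ClosureData: a function pointer and a context pointer.
struct ClosureWords {
    var ptr: UnsafeRawPointer
    var object: UnsafeRawPointer?
}

let functionSize = MemoryLayout<(Int) -> Int>.size
let wordSize = MemoryLayout<UnsafeRawPointer>.size

print(functionSize == 2 * wordSize)                      // true
print(functionSize == MemoryLayout<ClosureWords>.size)   // true
```

A matching size does not prove what each word contains. That is what the pointer-based test in this section checks.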

Verify this with code:

// Wrap the closure in a structure to make the pointer conversion easier
struct ClosureStruct {
    var closure: (Int) -> Int
}

var fn = ClosureStruct(closure: getFn())
fn.closure(10)

// Initialize a pointer of type ClosureStruct with fn
let ptr = UnsafeMutablePointer<ClosureStruct>.allocate(capacity: 1)
ptr.initialize(to: fn)

// Rebind the pointer to ClosureData<Box<Int>>
let ctx = ptr.withMemoryRebound(to: ClosureData<Box<Int>>.self, capacity: 1) {
    $0.pointee
}
print("Closure call address:", ctx.ptr)
print("Heap space address:", ctx.object)
print("Heap space stored value:", ctx.object.pointee.value)
ptr.deinitialize(count: 1)
ptr.deallocate()

To verify that the printed function address is the address of the plus function, set a breakpoint on return plus and read the address of the plus function in the assembly.

There, the address of the plus function is 0x0000000100002C00. Resume execution and compare it with the function address and heap address printed by the code.

The address of the plus function matches, and the value of the captured variable is stored on the heap right after the HeapObject, so the reconstructed structure is correct.

4. Capture reference types

Capturing a value type allocates memory on the heap. What happens when a reference type is captured? Let's analyze it in assembly with the following code:

typealias Fn = (Int) -> Int

class SHPerson {
    var age = 0
}

func getFn() -> Fn {
    let person = SHPerson()
    func plus(_ i: Int) -> Int {
        person.age += i
        return person.age
    }
    return plus
}

let fn = getFn()

Set breakpoints at return plus and at person.age += i, then look at the assembly of getFn.

As you can see, the only heap allocation is the initialization of SHPerson. So what does the closure capture? Reading rax gives 0x000000010126C330, which is the memory address of person. Next, resume execution and print the closure's structure with our test code.

As you can see, the address stored in object in ClosureData is the memory address of person itself. Because heap memory was already allocated when SHPerson was initialized, there is no need to allocate another box to capture person; storing its address directly in ClosureData avoids unnecessary memory overhead.
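A consequence of the closure holding the instance directly is that the closure keeps it alive. The sketch below observes this with a weak reference. WeakBox and the observing: parameter are hypothetical additions for observation only and are not part of the original getFn.

```swift
final class SHPerson {
    var age = 0
}

// Hypothetical helper: holds a weak reference so the instance can be observed
// without keeping it alive.
final class WeakBox {
    weak var person: SHPerson?
}

func getFn(observing box: WeakBox) -> (Int) -> Int {
    let person = SHPerson()
    box.person = person
    func plus(_ i: Int) -> Int {
        person.age += i
        return person.age
    }
    return plus
}

let box = WeakBox()
var fn: ((Int) -> Int)? = getFn(observing: box)

print(fn!(5))                // 5
print(box.person != nil)     // true: the closure still references person
fn = nil                     // drop the closure
print(box.person == nil)     // true: person is released with it
```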

5. Capture multiple values

5.1. Analyze the getFn function

If multiple values are captured, does the closure have the same structure as in point 3?

typealias Fn = (Int) -> Int
func getFn() -> Fn {
    var num1 = 0
    var num2 = 0
    func plus(_ i: Int) -> Int {
        num1 += i
        num2 += (num1 + 1)
        return num2
    }

    return plus
}

let fn = getFn()

Compile the current main.swift file into main.ll and look directly at the implementation of getFn.

The code is long, so only the key parts are considered here. After capturing multiple values, swift_allocObject is called several times. The first and second calls allocate the boxes that store the values of num1 and num2.

Interestingly, the instance returned by the third call to swift_allocObject is cast to a pointer to this structure:

<{ %swift.refcounted, %swift.refcounted*, %swift.refcounted* }>*

Notice that getelementptr is used twice after the third call to swift_allocObject. These two instructions store the objects returned by the first two swift_allocObject calls into the structure %13.

5.2. Restore the closure structure

Based on the above analysis, we restored the structure of the closure as follows:

struct ClosureData<MutiValue> {
    /// Function address
    var ptr: UnsafeRawPointer
    /// Address of the heap object that holds the captured values
    var object: UnsafePointer<MutiValue>
}

struct MutiValue<T1, T2> {
    var object: HeapObject
    var value:  UnsafePointer<Box<T1>>
    var value1: UnsafePointer<Box<T2>>
}

struct Box<T> {
    var object: HeapObject
    var value: T
}

struct HeapObject {
    var metadata: UnsafeRawPointer
    var refcount: Int
}

The test code is as follows:

// Wrap the closure in a structure to make the pointer conversion easier
struct ClosureStruct {
    var closure: (Int) -> Int
}

var fn = ClosureStruct(closure: getFn())
fn.closure(10)

// Initialize a pointer of type ClosureStruct with fn
let ptr = UnsafeMutablePointer<ClosureStruct>.allocate(capacity: 1)
ptr.initialize(to: fn)

// Rebind the pointer to ClosureData<MutiValue<Int, Int>>
let ctx = ptr.withMemoryRebound(to: ClosureData<MutiValue<Int, Int>>.self, capacity: 1) {
    $0.pointee
}
print("Closure call address:", ctx.ptr)
print("Heap space address:", ctx.object)
print("Heap space stored values:", ctx.object.pointee.value.pointee.value, ctx.object.pointee.value1.pointee.value)
ptr.deinitialize(count: 1)
ptr.deallocate()

Result:

Closure call address: 0x0000000100002840
Heap space address: 0x000000010111f400
Heap space stored values: 10 11

5.3. Closure structure summary for capturing single and multiple values

Based on the above analysis, the difference between capturing a single value and multiple values is:

  • When a single value is captured, the heap address stored in ClosureData is directly the address of the box that holds the value.

  • When multiple values are captured, the heap address stored in ClosureData points to a structure that holds pointers to the boxes of all the captured values.

Overall, capturing multiple values adds one more layer of wrapping than capturing a single value. The structures are summarized as follows:

// ClosureData when a single value is captured
struct ClosureData<Box> {
    /// Function address
    var ptr: UnsafeRawPointer
    /// Address of the heap box that stores the captured value
    var object: UnsafePointer<Box>
}

// ClosureData when multiple values are captured
struct ClosureData<MutiValue> {
    /// Function address
    var ptr: UnsafeRawPointer
    /// Address of the heap object that holds the captured values
    var object: UnsafePointer<MutiValue>
}

struct MutiValue<T1, T2, ...> {
    var object: HeapObject
    var value:  UnsafePointer<Box<T1>>
    var value1: UnsafePointer<Box<T2>>
    // more values
    ...
}

struct Box<T> {
    var object: HeapObject
    var value: T
}

struct HeapObject {
    var metadata: UnsafeRawPointer
    var refcount: Int
}

V. Escaping closures

When a closure is passed to a function as an argument and is called after the function returns, the closure is said to escape. When declaring a function that takes a closure as a parameter, write @escaping before the parameter's type to state that the closure is allowed to escape.

  • When a closure is stored in a property, its lifetime extends beyond the return of the function.

  • When a closure is executed asynchronously, its lifetime extends beyond the return of the function.

  • Closures of optional type are escaping by default.

The following closure is also treated as escaping: it is assigned to a variable, so the compiler assumes it might be executed somewhere else.

func test() -> Int {
    var age = 10
    let completeHandler = {
        age += 10
    }

    completeHandler()
    return age
}

Conditions for a closure to escape:

  • It is passed to a function as an argument.

  • Inside the function, it is executed asynchronously or stored.

  • It is called after the function has returned, so its lifetime outlasts the function.
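The conditions above can be sketched with a small queue that stores closures for later. TaskQueue and its methods are hypothetical names chosen for this illustration.

```swift
final class TaskQueue {
    private var pending: [() -> Void] = []

    // The closure is stored and run later, so it must be marked @escaping.
    func enqueue(_ work: @escaping () -> Void) {
        pending.append(work)
    }

    // The closure is called before runNow returns, so it does not escape.
    func runNow(_ work: () -> Void) {
        work()
    }

    // Runs everything stored so far, after the enqueue calls have returned.
    func drain() {
        let work = pending
        pending.removeAll()
        work.forEach { $0() }
    }
}

func demo() -> [String] {
    var log: [String] = []
    let queue = TaskQueue()
    queue.enqueue { log.append("escaping") }      // runs later, in drain()
    queue.runNow { log.append("non-escaping") }   // runs immediately
    queue.drain()
    return log
}

print(demo())   // ["non-escaping", "escaping"]
```

Removing @escaping from enqueue makes the pending.append(work) line fail to compile, because a non-escaping parameter may not be stored.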

VI. Autoclosures

@autoclosure automatically creates a closure that wraps the expression passed as an argument. The closure takes no arguments and, when called, returns the value of the wrapped expression. This convenience syntax lets you omit the curly braces at the call site.

What does this mean? Let’s look at the following code:

// If the first number is greater than 0, return it; otherwise return the second number
func getFirstPositive(_ v1: Int, _ v2: Int) -> Int {
    return v1 > 0 ? v1 : v2
}

print(getFirstPositive(10, 20))    // 10
print(getFirstPositive(-2, 20))    // 20
print(getFirstPositive(0, -4))     // -4

A ternary operator decides whether v1 or v2 is returned. Now add a test function:

func getNum() -> Int {
    print("getNum")
    let a = 10
    let b = 20
    return a + b
}

print(getFirstPositive(10, getNum()))

Result:

getNum
10

Passing 10 as v1 to getFirstPositive does return 10, but getNum is printed as well. Since v1 > 0, the result of getNum() is never needed, yet the argument is still evaluated before the call.

To avoid this, v2 can be turned into a function, so that a function is passed in:

func getFirstPositive(_ v1: Int, _ v2: () -> Int) -> Int {
    return v1 > 0 ? v1 : v2()
}

print(getFirstPositive(10, {
    print("test")
    return 30
}))

Result:

10

As you can see, the call to getFirstPositive does not print test, so the unnecessary code is never executed. When the argument is as simple as in the earlier example, it can be written like this:

print(getFirstPositive(10, { 20 }))

Even this form still needs a pair of curly braces at the call site. An autoclosure removes them:

func getFirstPositive(_ v1: Int, _ v2: @autoclosure () -> Int) -> Int {
    return v1 > 0 ? v1 : v2()
}

print(getFirstPositive(10, 20))

Adding @autoclosure before the type of v2 creates the closure automatically, so the call works without the curly braces.

  • @autoclosure only supports parameters of the form () -> T.

  • @autoclosure is not limited to the last parameter.

  • The nil-coalescing operator (??) uses @autoclosure.

  • A version of a function with @autoclosure and a version without it count as overloads of each other.
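The deferred evaluation can be made visible by counting how often the wrapped expression actually runs. In this sketch, demo and the evaluations counter are hypothetical names, and the last two lines show the same behavior for the ?? operator.

```swift
func getFirstPositive(_ v1: Int, _ v2: @autoclosure () -> Int) -> Int {
    return v1 > 0 ? v1 : v2()
}

func demo() -> (results: [Int], evaluations: Int) {
    var evaluations = 0
    func getNum() -> Int {
        evaluations += 1          // count every real evaluation
        return 30
    }

    let first = getFirstPositive(10, getNum())    // v1 > 0: getNum() never runs
    let second = getFirstPositive(-2, getNum())   // v1 <= 0: getNum() runs once

    let missing: Int? = nil
    let present: Int? = 7
    let third = missing ?? getNum()               // left side is nil: getNum() runs
    let fourth = present ?? getNum()              // left side has a value: not run

    return ([first, second, third, fourth], evaluations)
}

let outcome = demo()
print(outcome.results, outcome.evaluations)   // [10, 30, 30, 7] 2
```

getNum() appears four times in the source but runs only twice, because each occurrence is wrapped in a closure and called only when its value is needed.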