Start with an RxSwift call

Discovering the problem

While building a business screen, we need to show a loading page first and render the actual screen once the data has loaded. Data can be loaded multiple times, and the loading page is shown each time a load starts. So we build a signal in the ViewModel that sends an optional value:

let data: BehaviorRelay<DataType?> = BehaviorRelay(value: nil)

If this signal sends nil, the data is being loaded and the view needs to display the loading state. If it sends a non-nil value, new data has been loaded, and the view layer uses that moment to present the page. To make the view layer more reactive, these two cases are exposed separately in the ViewModel.

To filter out nil we use the compactMap operator, and to keep the view layer from touching the data directly we add a map operator that maps the data to a constant. So the signal that the view layer subscribes to actually reads like this:

let dataReady = data.compactMap { $0 }.map { true } 

The logic of our focus can be summarized as follows:

The view layer performs the view-build action when the ViewModel receives non-empty data
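The intended pipeline can be sketched without RxSwift. The snippet below uses the standard library's Sequence.compactMap as a stand-in for the Rx operator (an assumption made purely for illustration; the emitted values are hypothetical), and spells out both closure signatures so that nothing is left to inference:

```swift
// Stand-in for the values the relay would emit over time (hypothetical data).
let emissions: [Int?] = [nil, 1, nil, 2]

// Drop nil, then replace each remaining value with a constant `true`.
// Both closures carry explicit signatures so their types are unambiguous.
let dataReady: [Bool] = emissions
    .compactMap { (value: Int?) -> Int? in value }
    .map { (_: Int) -> Bool in true }

print(dataReady) // one `true` per non-nil emission
```

With nil as the only emission, the resulting sequence would be empty, which is the behavior expected from the real signal.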

The view layer listens for signals like this:

viewModel.dataReady.subscribe(onNext: { _ in
    // build the content view here
})

This is the code that sets up the binding. As we all know, subscribing to a signal that replays its current value triggers the subscription closure immediately. But because the source signal is a BehaviorRelay with an initial value of nil, and compactMap filters nil out, the first subscription should in theory not trigger the closure. The expected data flow is shown as follows:

But did it live up to expectations?

Of course not

When it runs, the call stack shows that subscribing immediately triggers the listening closure (that is, lines 83 and 84 are executed synchronously, one after the other). Huh? Shouldn’t the nil value have been filtered out by compactMap? The signal shouldn’t be emitted at all.

Initial diagnosis

The first reaction to this phenomenon is that RxSwift is being used incorrectly; after all, it has so many operators and concepts that it is hard to be fully confident in one’s usage. So we go straight to the source of the compactMap operator:

You can see that when the source sends a next value, the processing logic in CompactMapSink (red box) performs an optional binding, and only if the result is not nil does it forward the value downstream. The _transform closure is the { $0 } part above, so self._transform(element) should be element itself. In other words, with a nil element this if should never be entered.
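The logic described here boils down to an optional binding around the transform. A rough sketch follows; it is not RxSwift's actual implementation, and compactMapSketch is a hypothetical array-based helper:

```swift
// Hypothetical array-based stand-in for the sink's `if let` logic.
func compactMapSketch<Element, Mapped>(
    _ elements: [Element],
    _ transform: (Element) -> Mapped?
) -> [Mapped] {
    var forwarded: [Mapped] = []
    for element in elements {
        // Only non-nil transform results are sent downstream.
        if let mappedElement = transform(element) {
            forwarded.append(mappedElement)
        }
    }
    return forwarded
}

let input: [Int?] = [nil]

// Identity transform: nil stays nil, so nothing is forwarded.
let identity = compactMapSketch(input) { (element: Int?) -> Int? in element }

// A transform that ignores its input and returns Void: `()` is wrapped as
// Optional(()), which is not nil, so the binding succeeds.
let discarding = compactMapSketch(input) { (_: Int?) -> Void? in () }

print(identity.count, discarding.count) // 0 1
```

The second call shows how a value can get past the binding even though the element is nil: everything depends on the type the transform closure ends up with.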

We set a breakpoint:

You can see that execution actually goes inside the if, and mappedElement is (()).

Huh? What is this (())? You can clearly see that element is nil, yet after the { $0 } closure is evaluated the result becomes (()), so it binds to mappedElement and is sent downstream.

Then it clicked: (()) is an empty tuple. In Swift, Void is the empty tuple, and an empty tuple is not nil, which is what causes this problem. So why is { $0 }(nil) equal to ()? Is there some kind of black magic in RxSwift that does something special to nil?
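The two facts this relies on can be checked in a few lines: Void is the empty tuple type, and an optional wrapping () is a non-nil value. The variable names are only for illustration:

```swift
// Void is a typealias for the empty tuple type.
let unit: Void = ()
let sameType = type(of: unit) == Void.self   // true

// Wrapping () in an Optional yields .some(()), which is not nil.
let wrapped: Void? = unit
let isNil = wrapped == nil                   // false

print(sameType, isNil) // true false
```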

Broadening the search

When tracking down a problem stalls, try looking at it from other angles, such as writing the code differently. After a few attempts, things got even weirder. I found that if I wrote the signal conversion like this:

data.compactMap { $0 }.map { _ in true } 

The problem is solved...

(Slightly changed to facilitate debugging of mappedElement values)

Compare this to the original:

// Before
data.compactMap { $0 }.map { true }
// After
data.compactMap { $0 }.map { _ in true }

This raises two new questions:

  • The most immediate question: does adding “_ in” change the logic of the code?
  • The weirdest question: why does changing how the map closure is written affect the logic of compactMap?

Locating the root cause

Simplify the problem

At this point I came to realize that the problem is probably not in RxSwift but in the language itself. To check this, I built a simplified Demo:

struct Transform<E> {
    let element: E
    func foo<T>(transform: (E) -> T) -> TransformB<T> {
        let mappedElement = transform(element)
        return TransformB<T>(element: mappedElement)
    }
}

struct TransformB<E> {
    let element: E
    func foo<T>(transform: (E) -> T) -> T {
        return transform(element)
    }
}

let instance = Transform(element: "1")
let result = instance.foo { $0 }.foo { true }

The Transform struct has a foo method that takes a closure, applies it to the wrapped element, and wraps the result in a TransformB. TransformB’s foo also takes a closure and returns the converted value directly.

The first mappedElement is not “1”, but ().

You can see that mappedElement is indeed a tuple, and an empty one.

At this point it is worth recalling that Swift treats functions with no return value as returning Void, that is, the empty tuple. Is something wrong with { $0 }?

With that in mind, I tried to separate the two calls to see what type the first closure would get.

The first call results in a TransformB, which looks fine.

When I tried to inspect the second type, the compiler reported an error:

The closure’s contextual type expects an explicit parameter list; since I am not using the parameter, the compiler suggests inserting _ in.

Huh? Why is there no compilation error when the calls are chained together? It felt like the compiler was treating the two cases differently.

So I chained them again, and it still compiled. But there was a warning that I had not paid attention to before:

The compiler says the $0 variable is unused. (Huh? It is used as the return value.)

Then I realized that the compiler itself had probably compiled the { $0 } closure as a closure returning Void, evaluating $0 without using it and returning an empty tuple. That would explain the phenomenon.

After trying various ways of writing the expression, I collected the similar cases and their behavior:

// The compiler passes, …
… .foo { true }
instance.foo { e in e }.foo { return true }
instance.foo { $0 }.foo { return true }
instance.foo { $0 }. …
instance.foo { e in return e }.foo { true }
instance.foo { e in return e }.foo { return true }
… .foo { return $0 }.foo { true }
instance.foo { $0 }.foo { _ in true }
instance.foo { return $0 }.foo { _ in true }
instance … }.foo { _ in true }

For let result = instance.foo { $0 }.foo { true }, two observations stand out:

  1. Under certain conditions the compiler compiles the first closure, { $0 }, as returning Void, so the result of the closure is () regardless of the input argument.
  2. Syntactic-sugar changes to the two closures (for example, omitting the return keyword or omitting the closure parameter list) should not affect the logic of either closure, yet they affect how the whole expression is compiled.
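One way to make the expression independent of which sugar is used is to spell out both closure signatures, so the type checker has nothing left to infer. The sketch below reuses the Demo's types; the literal "1" as the element is an assumption taken from the expected output mentioned above:

```swift
struct Transform<E> {
    let element: E
    func foo<T>(transform: (E) -> T) -> TransformB<T> {
        return TransformB<T>(element: transform(element))
    }
}

struct TransformB<E> {
    let element: E
    func foo<T>(transform: (E) -> T) -> T {
        return transform(element)
    }
}

let instance = Transform(element: "1")

// Explicit parameter and return types pin T to String for the first call.
let first = instance.foo { (e: String) -> String in e }

// The second closure explicitly takes the String, even though it ignores it.
let result = first.foo { (_: String) -> Bool in true }

print(first.element, result) // 1 true
```

This is more verbose than the sugared form, but the element that reaches the second call is the String itself rather than ().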

Locating the cause

Since the source-level information available at runtime is limited, we dig into the assembly code. In Xcode, enable Debug -> Debug Workflow -> Always Show Disassembly.

When the program runs, it reaches the call site of the first transform closure:

The address of the transform closure is already in the RDX register. Read it with the LLDB register read command:

TestOnMac partial apply forwarder for reabstraction thunk helper from @callee_guaranteed (@guaranteed Swift.String) -> () to @escaping @callee_guaranteed (@in_guaranteed Swift.String) -> (@out ()) at

The { $0 } transform closure is of type (String) -> (), not (String) -> String. The executable code generated by the compiler is indeed not what we expected.

So where in the compilation chain did things go wrong? I decided to get to the bottom of it.

Compared with Objective-C, C and C++, the Swift compilation chain does much more before the IR stage: type checking, the SIL intermediate language and a range of optimizations.

Now, let’s look at the generated SIL.

The Swift compiler driver is swiftc (in fact the same binary as the swift command), and the command to emit SIL is:

swiftc -emit-sil xxx.swift

The first transform, after simplification, is compiled into the SIL output:

// closure #1 in
sil private @$s4mainySSXEfU_ : $@convention(thin) (@guaranteed String) -> () {

// %0 "$0"                                        // user: %1
bb0(%0 : $String):
  debug_value %0 : $String, let, name "$0", argno 1 // id: %1
  %2 = tuple ()                                   // user: %3
  return %2 : $()                                 // id: %3
} // end sil function '$s4mainySSXEfU_'

Closure #1 is the { $0 } closure. In SIL it already has type (String) -> (). Looking at bb0 (a basic block), %0 is the closure argument, that is, the String passed in; %2 is the empty tuple; and the closure returns that empty tuple directly. So something has already gone wrong by the SIL stage.

Moving further up the compilation chain: since the problem lies in type determination, we look at the AST after type resolution with the command:

swiftc -dump-ast main.swift

We can see the AST structure in the output:

This is the AST after type resolution. Searching the massive output for our key transform closure type:

You can see that the type-checked AST already treats the transform as a closure of type (String) -> (). Let’s dump the AST again, this time without type resolution:

swiftc -dump-parse main.swift

There is no type resolution at this stage; it is just the AST parsed from the Swift source text.

So the original AST that the Swift compiler builds carries no type information, and it is only after semantic analysis of that AST that the wrong type appears. We have finally narrowed down where the problem occurs.

How do we look inside the semantic analysis stage? Swift’s official documentation states:

Run the following command:

swiftc -dump-ast main.swift -Xfrontend -debug-constraints

This prints the Log of the type inference process during semantic analysis. At first sight the output is rather confusing, so before going further we should get to know the main character of this article: Swift type inference.

Exploring Swift type inference

Overview

One of the features of the Swift language is its powerful type inference. It greatly simplifies the developer’s workload. Not only can we omit the explicit designation of many variable types, but type inference also supports code completion. Swift type inference is very powerful, including not only type inference for variables and expressions, but also support for various generic structures, generic functions, associated types of protocols, and so on.

Swift type inference system has three important features:

  1. Type inference works both ways
  2. Type inference is limited to a single expression or statement
  3. The type inference process is implemented based on the constraint system

Type inference works both ways

Swift’s type inference allows type information to flow in two directions.

As an example, we define an add function and call it:

func add(_ lhs: Int, _ rhs: Int) -> Int {
    lhs + rhs
}

let result = add(4, 5) //result: Int

As we know, when an expression or statement is built into an AST, the result of the expression is usually the root node and the parts that make up the value are the leaves. For let result = add(4, 5), the AST looks like this:

In this case, the type of result is determined from the add function and the types of the literals 4 and 5, and resolves to Int. The flow of type inference here “looks” like leaf -> root, as shown in the figure below:

As mentioned above, Swift type judgment is bidirectional, so let’s continue with the following example:

func add(_ lhs: Int, _ rhs: Int) -> Int {
    lhs + rhs
}


func add(_ lhs: Double, _ rhs: Double) -> Double {
    lhs + rhs
}

let result: Double = add(4, 5) //4: Double, 5: Double

For let result: Double = add(4, 5), the structure of the AST is the same as in the previous example, but this time result is explicitly declared as Double, so type inference works the other way round to determine the literals 4 and 5. By checking the return type and parameter types of the add functions against the type of result, the literals 4 and 5 are inferred to be Double, hence the call to the second add function. In this example, the flow of type inference “looks” like root -> leaf.

Note:

  • In the two examples above, 4 and 5 are literals (integer_literal), which are not simply equivalent to Int, so their types also need to be inferred.
  • The direction of type inference is described as “looks” in quotes because Swift type inference does not actually have a notion of direction, as explained below.
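Both apparent directions can be observed with type(of:). A small sketch; the variable names are only for illustration:

```swift
func add(_ lhs: Int, _ rhs: Int) -> Int { lhs + rhs }
func add(_ lhs: Double, _ rhs: Double) -> Double { lhs + rhs }

// No annotation: integer literals default to Int, so the Int overload wins.
let a = add(4, 5)

// The annotation on the result flows down to the literals and the overload.
let b: Double = add(4, 5)

// A floating-point literal among the arguments rules out the Int overload.
let c = add(4, 5.0)

print(type(of: a), type(of: b), type(of: c)) // Int Double Double
```

In the third call the literal 4 ends up as a Double as well, which shows that the literals themselves have no fixed type until the whole expression is solved.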

Bi-directional Type Inference is common in ML-like languages, but it is not available in mainstream C++, Java, C#, Objective-C and other languages.

Type inference is limited to a single expression or statement

Swift limits the scope of type inference to a single expression or statement.

Swift is a strongly typed language and has a complex type system. In addition to ordinary expression type inference, it also needs to deal with generics, function overloading, protocol constraints and other complex situations. Therefore, Swift type inference limits the scope to a single expression or statement for the sake of execution performance and more accurate diagnosis.
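A consequence worth seeing in code: once a statement has been type-checked, later statements cannot change what was inferred for it. A minimal sketch:

```swift
// The literal's type is settled within this statement alone: Int.
let count = 4

// A later statement cannot retroactively turn `count` into a Double;
// `let ratio: Double = count` would be rejected, so convert explicitly.
let ratio: Double = Double(count) / 2

print(type(of: count), ratio) // Int 2.0
```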

The type inference process is implemented based on the constraint system

Much like layout constraints, type inference is built on type constraints: equality constraints, conversion constraints, member constraints, subtype constraints and so on. Swift is designed so that the constraint system itself is decoupled from the solving process. The definition and description of the constraints rarely change, while the solving process can be optimized continuously.

Workflow

As mentioned above, Swift type inference works on a single expression or statement at a time. In the solver, each such run is carried out within a concept called a Scope. For one Scope, the type inference process is as follows:

Constraint Generation

The first step is to generate all type constraints for this Scope solution. This process may be generated based on some existing context (such as the solution result of the previous Scope).

The type checker traverses the AST and generates a set of constraints that describe the relationships between the types of an expression and its subexpressions, such as member constraints, subtype constraints, conversion constraints and so on. Types that are not yet known are represented as type variables, written Tx (T0, T1, ...). These type variables are then resolved in the Solving phase.

Constraint Solving

Once all type constraints in the Scope have been generated, type inference proceeds to this step, which is also the most important one. It computes all undetermined type variables from the constraints generated in the first phase, using solving strategies that depend on the kind of constraint.

The Solving process involves a large number of strategies, may apply conversions such as function type conversions, and may obtain several candidate solutions. It has a set of Solution comparison rules to pick the best one.

Of course, solving can fail because of type mismatches in the code. In that case type inference records the error and attempts a fix, which is later used to produce the compiler’s diagnostics and is finally reported as a warning or an error.

Solution Application

After all type variables have been solved successfully, type inference writes the results back to the corresponding nodes by rewriting the AST. In this step, all types must be fully resolved to concrete types.
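The three phases can be mimicked in a toy solver. The sketch below is a deliberately small miniature, not the Swift compiler's algorithm: it knows only equality constraints and solves them by plain unification. Ty, Equal, apply and solve are hypothetical names introduced for this example.

```swift
// Toy types: a concrete named type, a numbered type variable, or a function.
indirect enum Ty: Equatable {
    case named(String)
    case variable(Int)
    case function(Ty, Ty)
}

// The only constraint kind in this toy; the real system has many more.
struct Equal {
    let lhs: Ty
    let rhs: Ty
}

// Solution application: replace solved variables inside a type.
func apply(_ bindings: [Int: Ty], to ty: Ty) -> Ty {
    switch ty {
    case .named:
        return ty
    case .variable(let id):
        guard let bound = bindings[id] else { return ty }
        return apply(bindings, to: bound)
    case .function(let input, let output):
        return .function(apply(bindings, to: input), apply(bindings, to: output))
    }
}

// Constraint solving by unification (no occurs check, no backtracking).
// Returns nil when two concrete types conflict.
func solve(_ constraints: [Equal]) -> [Int: Ty]? {
    var bindings: [Int: Ty] = [:]
    var work = constraints
    while let next = work.popLast() {
        let lhs = apply(bindings, to: next.lhs)
        let rhs = apply(bindings, to: next.rhs)
        switch (lhs, rhs) {
        case let (.variable(id), other), let (other, .variable(id)):
            if other != .variable(id) { bindings[id] = other }
        case let (.function(in1, out1), .function(in2, out2)):
            work.append(Equal(lhs: in1, rhs: in2))
            work.append(Equal(lhs: out1, rhs: out2))
        default:
            if lhs != rhs { return nil }
        }
    }
    return bindings
}

// Constraint generation for `let r = f(x)` with f: (String) -> Bool,
// using $T0 for x, $T1 for the call result and $T2 for r.
let generated = [
    Equal(lhs: .variable(2), rhs: .variable(1)),
    Equal(lhs: .function(.variable(0), .variable(1)),
          rhs: .function(.named("String"), .named("Bool"))),
]

let bindings = solve(generated)
let typeOfR = bindings.map { apply($0, to: Ty.variable(2)) }
print(typeOfR == Ty.named("Bool")) // true
```

The real solver differs in important ways: it has subtype and conversion constraints, it scores competing solutions, and it backtracks when a binding fails.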

Solving process analysis

Step

Solving is the core process of type inference, and it can be understood as a series of SolverSteps working through the type constraints. Steps of several different kinds are executed one by one until a complete solution is obtained. In the source code, the base class representing a solving step is SolverStep, whose header file definition, simplified, is as follows:

class SolverStep {
protected:
  ConstraintSystem &CS;
  StepState State; // Step state (Setup, Ready, Running, Suspended, Done)
  SmallVectorImpl<Solution> &Solutions;

public:
  // Initializer: holds the constraint system passed in (to read the
  // constraints) and the container for solutions
  explicit SolverStep(ConstraintSystem &cs, SmallVectorImpl<Solution> &solutions)
      : CS(cs), Solutions(solutions) {}

  virtual ~SolverStep() {} // Destructor

  virtual void setup() {} // Setup method

  // take: the solving logic of this step; returns a StepResult
  virtual StepResult take(bool prevFailed) = 0;

  // resume: called when a suspended step is woken up; returns a StepResult
  virtual StepResult resume(bool prevFailed) = 0;
};

SolverStep abstracts what a solving step does; its internal state flow is shown below.

SolverStep’s flow is driven by 5 states and 2 functions. The five states mean the following:

State | Meaning | StepResult it can produce
Setup | The step is in its initial state | None
Ready | The step is ready to execute | None
Running | The step is executing | None
Suspended | The step is suspended; solving is not finished and it is waiting to be resumed | Unsolved
Done | The step has ended, with solving either succeeded or failed | Solved / Error

The transitions between these five states are essential to a step’s solving process; they can be driven internally or from outside.

The take() and resume() functions drive the actual solving logic.

  • take() is generally the core logic of the solving process, and each Step type implements its own.
  • resume() is the entry point after a suspended step is woken up. It usually decides, based on previously saved data and the result of the step that woke it, whether solving can continue; if so, it goes on to call take().

SolverStep is a base class that abstracts the Step state flow. The actual solving is implemented in its subclasses. There are four of them, and BindingStep is further divided into two subclasses. Their roles are shown in the following table:

Subclass | Main role
SplitterStep | Splits the whole Constraint Graph into independently solvable components and merges their solutions afterwards
DependentComponentSplitterStep | Merges the solutions of a dependent component with those of other components
ComponentStep | Solves one independently solvable component; it can produce BindingSteps
BindingStep | Performs type binding and can be regarded as the basic unit of execution of the solver. It has two subclasses, TypeVariableStep and DisjunctionStep

Solver

A SolverStep is a single step of solving, so there must be a “hand” that schedules all the steps. This is the solveImpl function in CSSolver.cpp, whose core source is as follows:

void ConstraintSystem::solveImpl(SmallVectorImpl<Solution> &solutions) {
  // Put the whole ConstraintSystem into the Solving phase
  setPhase(ConstraintSystemPhase::Solving);

  // On function exit, move the ConstraintSystem to the Finalization phase
  SWIFT_DEFER { setPhase(ConstraintSystemPhase::Finalization); };

  // Create the solver Scope
  SolverScope scope(*this);

  // Create the workList and push the first step
  SmallVector<std::unique_ptr<SolverStep>, 16> workList;
  workList.push_back(std::make_unique<SplitterStep>(*this, solutions));

  // Initialized to false
  bool prevFailed = false;

  // The advance function drives a single step
  auto advance = [](SolverStep *step, bool prevFailed) -> StepResult {
    auto currentState = step->getState();
    if (currentState == StepState::Setup) {
      step->setup();
      step->transitionTo(StepState::Ready);
    }

    currentState = step->getState();
    step->transitionTo(StepState::Running);
    // Check the state before running: if Ready, call take;
    // if Suspended, call resume to wake the step up
    return currentState == StepState::Ready ? step->take(prevFailed)
                                            : step->resume(prevFailed);
  };

  // Execute steps in LIFO order
  while (!workList.empty()) {
    // Take the last (most recent) step
    auto &step = workList.back();

    // Advance it and get the StepResult
    auto result = advance(step.get(), prevFailed);
    switch (result.getKind()) {
    // An error also counts as finished: no result was found for this step,
    // so the step has to be removed as well
    case SolutionKind::Error:
      LLVM_FALLTHROUGH;

    // Solved: remove the step from the workList
    case SolutionKind::Solved: {
      workList.pop_back();
      break;
    }

    // Unsolved: keep the step in the workList
    case SolutionKind::Unsolved:
      break;
    }

    // Record whether this step ended in an error
    prevFailed = result.getKind() == SolutionKind::Error;
    // Update the workList from the result: append new steps to the end,
    // or do nothing (append 0 steps)
    result.transfer(workList);
  }
}

Steps are pushed onto the top of a stack. When the top step is Solved (or Error), it is popped off the stack and the next step is processed, until the stack is empty and the whole Scope is solved.
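The scheduling loop can be imitated in Swift with a small LIFO work list. This is only a sketch of the control flow described above; ToyStep, LeafStep, SplitStep and solve(startingWith:) are hypothetical names, and the steps do no real type solving:

```swift
enum StepState { case ready, running, suspended, done }

enum StepResult {
    case solved(next: [ToyStep])    // finished; may hand over follow-up steps
    case unsolved(next: [ToyStep])  // suspended; wants to be resumed later
    case error
}

class ToyStep {
    var state = StepState.ready
    let name: String
    init(name: String) { self.name = name }
    func take(prevFailed: Bool) -> StepResult { return .solved(next: []) }
    func resume(prevFailed: Bool) -> StepResult { return .solved(next: []) }
}

// A step that finishes as soon as it is taken.
final class LeafStep: ToyStep {}

// A step that spawns two sub-steps and waits for them before finishing.
final class SplitStep: ToyStep {
    override func take(prevFailed: Bool) -> StepResult {
        return .unsolved(next: [LeafStep(name: "A"), LeafStep(name: "B")])
    }
}

// Runs steps in LIFO order and returns a trace of what was executed.
func solve(startingWith first: ToyStep) -> [String] {
    var trace: [String] = []
    var workList: [ToyStep] = [first]
    var prevFailed = false
    while let step = workList.last {
        let wasSuspended = step.state == .suspended
        step.state = .running
        trace.append((wasSuspended ? "resume " : "take ") + step.name)
        let result = wasSuspended ? step.resume(prevFailed: prevFailed)
                                  : step.take(prevFailed: prevFailed)
        switch result {
        case .error:
            step.state = .done
            workList.removeLast()
            prevFailed = true
        case .solved(let next):
            step.state = .done
            workList.removeLast()
            prevFailed = false
            workList.append(contentsOf: next)
        case .unsolved(let next):
            step.state = .suspended
            prevFailed = false
            workList.append(contentsOf: next)
        }
    }
    return trace
}

let trace = solve(startingWith: SplitStep(name: "split"))
print(trace) // ["take split", "take B", "take A", "resume split"]
```

The suspended step stays at the bottom of the stack while the steps it spawned run on top of it, and it is resumed only after they have been popped.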

Back to the problem

Reading the Log

Now that we understand the type inference process, we return to the problem. With these concepts in hand, the Log output starts to make sense. For our expression, the Log of the whole solving process is structured roughly like this:

// let result = instance.foo { $0 }.foo { true }
---Constraint solving at [/Users/Rico/Rico/Program/TestOnMac/TestOnMac/main.swift:62:15 - line:62:42]---
---Initial constraints for the given expression---
Score: 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Type Variables:
  $T0 : ......
  $T1 : ......
  .
  .
  .
  $T13: .....

Active Constraints:
  $T4 arg conv (String) -> $T3
  $T11 arg conv ($T3) -> $T10

Inactive Constraints:
  $T4 closure can default to ($T5) -> $T6
  $T11 closure can default to () -> $T12

Opened Types:
  ....

---Solver statistics---
Maximum depth reached while exploring solutions: 9
Time: 7.490000e+00ms

---Solution---
Fixed score: 0000 2 0000 00
$T0 as Transform<String>
$T1 as ((String) -> ()) -> TransformB<()>
.
.
.
$T13 as Bool

As we can see, the Log follows the solving process described above. The whole Log is rather long, so let’s go through it step by step.

Type variables

First, the type checker takes the untyped AST for the expression and walks it, generating 14 type variables. To make the rest of the solving process easier to follow, let’s break down what these 14 type variables represent.

Review the source code again:

struct Transform<E> {
    let element: E
    func foo<T>(transform: (E) -> T) -> TransformB<T> {
        let mappedElement = transform(element)
        return TransformB<T>(element: mappedElement)
    }
}

struct TransformB<E> {
    let element: E
    func foo<T>(transform: (E) -> T) -> T {
        return transform(element)
    }
}

// instance is resolved to Transform<String>
let instance = Transform(element: "1")
let result = instance.foo { $0 }.foo { true }

These 14 type variables represent the following types:

Type variable | Description | Can be equivalent to
T0 | The type of instance | Transform<String>
T1 | The type of the first foo function | ((String) -> $T3) -> TransformB<$T3>
T2 | The input parameter (E) of the first foo’s transform | String
T3 | The output parameter (T) of the first foo’s transform |
T4 | The type of the first foo’s transform | ($T2) -> $T3
T5 | The type of $0 in the first closure { $0 } |
T6 | The result of the first closure |
T7 | The result of the first foo function | TransformB<$T3>
T8 | The type of the second foo function | (($T3) -> $T10) -> $T10
T9 | The input parameter (E) of the second foo’s transform | $T3
T10 | The output parameter (T) of the second foo’s transform |
T11 | The type of the second foo’s transform | ($T3) -> $T10
T12 | The result of the second closure |
T13 | The result of the second foo | $T10
As you can see, this single expression generates 14 type variables, largely because each closure generates several of them. Isn’t the closure’s type simply the type of the transform parameter? A closure’s own type is ordinarily given by its explicit signature (e.g. { (str: String) -> Bool in ... }), but here the expression is simple enough that we omitted the signature, so a type variable is created for each part that is still undecided.

Type resolution result

Because the solving process is somewhat complicated, let’s skip ahead to the results first, so we know what to expect, and then go through the process with the results in mind. The results are as follows:

Type variable | Description | Resolved to
T0 | The type of instance | Transform<String>
T1 | The type of the first foo function | ((String) -> ()) -> TransformB<()>
T2 | The input parameter (E) of the first transform | String
T3 | The output parameter (T) of the first transform | ()
T4 | The type of the first transform | (String) -> ()
T5 | The type of $0 in the first closure | String
T6 | The result of the first closure | ()
T7 | The result of the first foo function | TransformB<()>
T8 | The type of the second foo function | (() -> Bool) -> Bool
T9 | The input parameter (E) of the second transform | ()
T10 | The output parameter (T) of the second transform | Bool
T11 | The type of the second transform | () -> Bool
T12 | The result of the second closure | Bool
T13 | The result of the second foo | Bool

As you can see, many of the types are resolved to (), so something must be going wrong here. Now, let’s analyze the solving process in the Log.

Solution process Log analysis

Here is the solution in Log (simplified)

($T4 involves_type_vars bindings={(subtypes of) (String) -> $T3})
(attempting type variable $T4 := (String) -> $T3
  ($T5 involves_type_vars bindings={(supertypes of) String})
  ($T11 involves_type_vars bindings={(subtypes of) ($T3) -> $T10})
  (attempting type variable $T5 := String
    ($T6 involves_type_vars bindings={(supertypes of) String; (supertypes of) ()})
    ($T11 involves_type_vars bindings={(subtypes of) ($T3) -> $T10})
    Initial bindings: $T6 := String, $T6 := ()
    (attempting type variable $T6 := String
      ($T3 potentially_incomplete fully_bound involves_type_vars bindings={(supertypes of) String})
      ($T11 involves_type_vars bindings={(subtypes of) ($T3) -> $T10})
      (attempting type variable $T11 := ($T3) -> $T10
        (increasing score due to function conversion)
        ($T3 potentially_incomplete bindings={(supertypes of) String; (subtypes of) ()})
        Initial bindings: $T3 := String, $T3 := ()
        (attempting type variable $T3 := String
          (failed constraint $T3 subtype ()
        )
        (attempting type variable $T3 := ()
          (failed constraint $T6 subtype $T3
        )
      )
    )
    (attempting type variable $T6 := ()
      (increasing score due to function conversion)
      ($T3 potentially_incomplete fully_bound involves_type_vars bindings={(supertypes of) ()})
      ($T11 involves_type_vars bindings={(subtypes of) ($T3) -> $T10})
      Initial bindings: $T11 := ($T3) -> $T10
      (attempting type variable $T11 := ($T3) -> $T10
        (increasing score due to function conversion)
        ($T3 potentially_incomplete bindings={(supertypes of) ()})
        ($T17 literal=3 involves_type_vars bindings={(subtypes of) (default from ExpressibleByBooleanLiteral) Bool})
        Initial bindings: $T3 := ()
        (attempting type variable $T3 := ()
          ($T17 literal=3 involves_type_vars bindings={(subtypes of) (default from ExpressibleByBooleanLiteral) Bool})
          Initial bindings: $T17 := Bool
          (attempting type variable $T17 := Bool
            ($T12 involves_type_vars bindings={(supertypes of) Bool; (supertypes of) ()})
            Initial bindings: $T12 := Bool, $T12 := ()
            (attempting type variable $T12 := Bool
              ($T10 potentially_incomplete bindings={(supertypes of) Bool})
              Initial bindings: $T10 := Bool
              (attempting type variable $T10 := Bool
                (found solution 0 0 0 0 0 0 0 2 0 0 0 0 0 0)
              )
            )
            (attempting type variable $T12 := ()
              (increasing score due to function conversion)
              (solution is worse than the best solution)
            )
          )
        )
      )
    )
  )
)

The Log is terse, but it shows that the Solving process works by trial and backtracking. From this Log we can extract the following key information:

There is one failed branch in the whole solving process, concerning $T3, the type T of the transform function. Both attempts to bind $T3 fail, so the solver backtracks and the attempt to bind $T6 to String fails as a whole.

This leads directly to the problem: the result of the first closure, as well as the output parameter T of the transform function, is resolved to (). So let’s take a look at what made the T3 bindings fail.

(attempting type variable $T3 := String
     (failed constraint $T3 subtype () 
)

(attempting type variable $T3 := ()
     (failed constraint $T6 subtype $T3 
  )

First, the solver tries T3 as String and finds that it does not satisfy T3 subtype (), where A subtype B means that A is equal to B or A is a subtype of B. String is not a subtype of (), that is, Void, so this attempt fails. But where does the requirement “T3 subtype ()” come from? This was confusing at first, but after some guesswork it turned out that the constraint was already in the Log, not in the solving part but in the generated constraints:

Active Constraints:
  $T4 arg conv (String) -> $T3 
  $T11 arg conv ($T3) -> $T10 

Inactive Constraints:
  $T4 closure can default to ($T5) -> $T6 
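  $T11 closure can default to () -> $T12

The two relevant entries are:

  • $T11 arg conv ($T3) -> $T10
  • $T11 closure can default to () -> $T12

$T11 can default to () -> $T12, and $T11 must also be convertible to ($T3) -> $T10. Matching the two, $T3 has to line up with (), which is where T3 subtype () comes from.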

Next, the solver tries T3 as (). T6 is the return type of the closure, and T3 is the T of (E) -> T in the parameter signature, so T6 must be T3 or a subtype of T3. At this point T6 has already been bound to String, and String is not a subtype of Void, so this attempt fails as well.

Backtracking to T6: after String fails, the solver tries () for T6. The rest of the process runs into no failures, and solving completes. That produces the result we saw in the Log.

Some of you may have noticed the score in the Log, such as 0 0 0 0 0 0 0 0 0 0 0 0 0 0. The score of a Solution can be loosely understood as how much “compromise” was needed during solving. The 14 positions represent 14 kinds of compromise, and each time one occurs the corresponding position is incremented by 1. When the type checker compares solutions, the best possible score is 14 zeros. The final score in this case is actually 0 0 0 0 0 0 0 2 0 0 0 0 0 0 because of the two function conversions.
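The backtracking over $T6 and $T3 can be replayed in a few lines. This is a toy model under loud assumptions: only the two types that matter are represented, “subtype” is reduced to plain equality, and solveClosureResult is a hypothetical helper rather than anything in the compiler.

```swift
// Only the two types that matter in this part of the Log.
enum Ty: String {
    case string = "String"
    case void = "()"
}

// Tries bindings in the same order as the Log and records each outcome.
func solveClosureResult() -> (solution: (t6: Ty, t3: Ty)?, log: [String]) {
    var log: [String] = []
    for t6 in [Ty.string, Ty.void] {
        // $T3 is first tried as whatever $T6 is, then as ().
        let t3Candidates: [Ty] = (t6 == .void) ? [.void] : [t6, .void]
        for t3 in t3Candidates {
            let attempt = "$T6 := \(t6.rawValue), $T3 := \(t3.rawValue)"
            if t3 != .void {
                // Comes from the second closure defaulting to () -> $T12.
                log.append(attempt + ": failed $T3 subtype ()")
            } else if t6 != t3 {
                // The closure result must fit the T in (E) -> T.
                log.append(attempt + ": failed $T6 subtype $T3")
            } else {
                log.append(attempt + ": found solution")
                return (solution: (t6: t6, t3: t3), log: log)
            }
        }
    }
    return (solution: nil, log: log)
}

let outcome = solveClosureResult()
outcome.log.forEach { print($0) }
```

Two attempts fail with $T6 bound to String, and the search succeeds only after falling back to () for both variables, which mirrors the order of events in the Log.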
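How two scores might be compared can be sketched as follows. The lexicographic ordering, with earlier positions weighing more, is an assumption made for this sketch, and isBetter is a hypothetical helper; the position of the function-conversion entry is taken from the Log above.

```swift
// Hypothetical helper: lower scores win, earlier positions weigh more.
func isBetter(_ lhs: [Int], than rhs: [Int]) -> Bool {
    return lhs.lexicographicallyPrecedes(rhs)
}

let perfect = [Int](repeating: 0, count: 14)

// The solution found in the Log: two function conversions.
var found = perfect
found[7] = 2

// A made-up alternative that needs one more conversion.
var worse = perfect
worse[7] = 3

print(isBetter(perfect, than: found), isBetter(found, than: worse)) // true true
```

Under this model, a branch that needs an extra conversion is discarded as "worse than the best solution", as happens for $T12 := () in the Log.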

From the Log, we can draw a few conclusions:

  1. The crux of the problem is how T3 and T6 are resolved, which matches what we observed on the surface: the key is the return type of { $0 }. The solver did try to bind T3 and T6 to String, but the constraint involving the () type got in the way and the attempt failed.
  2. The final solution is not perfect: two function type conversions are required, in which the actual result of a closure is implicitly converted to the () type.

The clues all point to one question:

Where does the () type come from?

Finding the answer in the source code

Now that we have mined what we can from the Log, let’s go to the source code. Many of the constraints in the solving process push types toward Void, so we look for clues where the constraints are generated. The Swift type inference source belongs to the semantic analysis phase and lives in lib/Sema. Sure enough, the key information is where the constraints for closures are generated.

While walking the AST, the visitClosureExpr implementation calls a function named inferClosureType, whose job is to produce the closure’s type. A simplified version of its source is as follows:

FunctionType *inferClosureType(ClosureExpr *closure) {
  SmallVector<AnyFunctionType::Param, 4> closureParams;

  if (auto *paramList = closure->getParameters()) {
    for (unsigned i = 0, n = paramList->size(); i != n; ++i) {
      // ... omitted ...
      closureParams.push_back(param->toFunctionParam(externalType));
    }
  }

  auto extInfo = CS.closureEffects(closure);

  // Closure expressions always have function type. In cases where a
  // parameter or return type is omitted, a fresh type variable is used to
  // stand in for that parameter or return type, allowing it to be inferred
  // from context.
  Type resultTy = [&] {
    // If there is an explicit return type
    if (closure->hasExplicitResultType()) {
      const auto resolvedTy = ...;
      if (resolvedTy)
        return resolvedTy;
    }

    // If no return type was specified, create a fresh type
    // variable for it and mark it as possible hole.
    return Type(CS.createTypeVariable(
        CS.getConstraintLocator(closure, ConstraintLocator::ClosureResult),
        shouldTypeCheckInEnclosingExpression(closure) ? 0 : TVO_CanBindToHole));
  }();

  FunctionType *f = FunctionType::get(closureParams, resultTy, extInfo);
  return f;
}

This source tells us that when a closure’s parameter types and return type are not written out explicitly, the constraint system generates type variables to stand in for the types that are not yet known. The closure’s parameters are taken from the parameter list recorded in the AST.

In other words, if no parameter list is written, the closure is equivalent to a function with no parameters, that is, its parameter type is ()! A closure that writes _ in is treated as having a parameter even though it ignores the value, and a type variable is generated for it. Likewise, a closure that uses $0 is treated as having a parameter.

The closure’s result type is determined by the resultTy block, and the comment states explicitly that a new type variable is created when the closure omits its return type.

The following illustration shows how the different omissions affect the generated type variables:

This explains the Log entry saying that the $T11 closure can default to () -> $T12. We seem to have found the immediate cause of the problem.
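The three closure spellings discussed above can be tried out directly. The snippet below is a small sketch; the constant names are made up for illustration, and the explicit type annotations are only there to show which function type each spelling fits.

```swift
// No parameter list and no $0: the closure takes no arguments.
let noParams: () -> Bool = { true }

// `_ in` declares one parameter and ignores its value.
let ignoresParam: (String) -> Bool = { _ in true }

// `$0` implies that the closure has one parameter.
let usesAnonymous: (String) -> String = { $0 }

print(noParams(), ignoresParam("x"), usesAnonymous("x"))
```

Annotating `{ true }` as (String) -> Bool is rejected by the compiler, which is consistent with a closure without a parameter list being treated as taking no arguments.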

But by this logic, {$0} and {return $0} generate exactly the same type variables, so the two spellings should behave the same. Yet consider:

instance.foo{return $0}.foo{ true }

This one fails to compile. Why is that?

The reason is in the second key piece of code:

ConstraintSystem::TypeMatchResult
ConstraintSystem::matchTypes(Type type1, Type type2, ConstraintKind kind,
                             TypeMatchOptions flags,
                             ConstraintLocatorBuilder locator) {
  // ...
  if (auto elt = locator.last()) {
    if (kind >= ConstraintKind::Subtype &&
        (type1->isUninhabited() || type2->isVoid())) {
      // A conversion from closure body type to its signature result type.
      if (auto resultElt = elt->getAs<LocatorPathElt::ClosureBody>()) {
        // If a single statement closure has explicit `return` let's
        // forbid conversion to `Void` and report an error instead to
        // honor user's intent.
        if (type1->isUninhabited() || !resultElt->hasExplicitReturn()) {
          increaseScore(SK_FunctionConversion);
          return getTypeMatchSuccess();
        }
      }
    }
  }
  // ...
}

This code is where types are matched during solving. You can see that when the target type (type2) is () and the closure is a single-expression closure that omits the return keyword, the type of the closure body is allowed to convert to (). The score is increased for this function type conversion (SK_FunctionConversion), which is exactly what we saw in the Log.
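The same rule can be observed in everyday Swift code. Below is a minimal sketch; the Counter class and the run function are hypothetical helpers written only for this illustration.

```swift
// Hypothetical helper whose method returns a value.
final class Counter {
    private(set) var value = 0

    @discardableResult
    func increment() -> Int {
        value += 1
        return value
    }
}

// Hypothetical function that expects a closure of type () -> Void.
func run(_ body: () -> Void) {
    body()
}

let counter = Counter()

// Single expression without `return`: the Int result is dropped and the
// closure is accepted where () -> Void is expected.
run { counter.increment() }

print(counter.value) // 1
```

According to the comment in the source above, writing run { return counter.increment() } instead should be reported as an error, because an explicit return forbids the conversion to Void.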

At this point the whole question is more or less settled. We can conclude that the causes of the problem are as follows:

For let result = instance.foo{$0}.foo{true}

  1. Because the second closure neither writes out a parameter list nor uses anonymous parameters, it is judged to be of type () -> T0.
  2. When the solver determines the return type of the first closure, String does not satisfy the () constraint, so that attempt fails.
  3. Because the first closure omits the return keyword, its return type is allowed to resolve to (). At this point the compiler has already compromised, and the result is not a perfect solution.

Now let’s review the previous closures, which seem to be a lot clearer:

// Compiles, but the result is not as expected
instance.foo{$0}.foo{true}
instance.foo{e in e}.foo{return true}
instance.foo{$0}

// Fails to compile
instance.foo{e in return e}.foo{ true }
instance.foo{e in return e}.foo{ return true }
instance.foo{return $0}.foo{ true }

// Compiles, and the result is as expected
instance.foo{$0}.foo{_ in true}
instance.foo{return $0}.foo{_ in true}
instance.foo{e in return e}.foo{_ in true }

When both closures are single-expression closures, the causes of the three phenomena above can be summarized as follows:

  • If the second closure writes out its parameters, the whole expression behaves as expected.

  • If the second closure does not write out its parameters:

    • If the first closure has a return statement, compilation fails.
    • If the first closure does not have a return statement, the code compiles, but the logic that actually runs is not what was expected.
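The safe spellings from this classification can be sketched as follows. The Box type here is a hypothetical stand-in for the type behind instance: it only assumes a generic foo that maps the wrapped value, so the names and the definition are illustrative assumptions.

```swift
// Hypothetical stand-in for the article's type: foo maps the wrapped value.
struct Box<E> {
    let value: E

    func foo<T>(_ transform: (E) -> T) -> Box<T> {
        Box<T>(value: transform(value))
    }
}

let instance = Box(value: "hello")

// The second closure declares its parameter with `_ in`,
// so the first closure's result stays a String.
let ignoring = instance.foo { $0 }.foo { _ in true }

// Both closures spell out their parameter and result types.
let annotated = instance
    .foo { (text: String) -> String in text }
    .foo { (text: String) -> Bool in !text.isEmpty }

print(ignoring.value, annotated.value)
```

With either spelling the second closure has a parameter, so no () constraint is introduced for the result of the first closure.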

Let’s verify the above logic against the Log:

let result = instance.foo{$0}.foo{_ in true}

As you can see, one extra type variable, T12, is generated. It stands for the parameter that _ in declares and ignores, so no () type is produced. During solving there is then no constraint involving (), and the type resolves straight to String.

let result = instance.foo{return $0}.foo{true}

The generated type variables are the same as for the original expression, but the solving process differs slightly:

When the solver tries binding T6 to (), the first closure has an explicit return keyword, so according to the second piece of source code the closure result cannot be converted to (). The T6 binding fails, and with it the whole solve.

Summary of the problem

At this point we can answer all the questions and puzzles we ran into:

Q: Why does the way the second closure is written affect the logic of the first closure?

A: Because Swift type inference works on the whole expression: everything is solved together, so in essence there is no notion of earlier and later.

Q: Why is the return value of the {$0} closure inferred to be of type ()?

A: Because the second closure omits its parameter, its input is judged to be of type (), and during solving that constraint determines the return type of the first closure.

Q: Why does adding _ in to the second closure give the expected result?

A: _ in states that the second closure has a parameter. That parameter takes part in the overall solve as a type variable, so the expected result is found.

Q: Why does the second statement fail to compile when the two calls are split into separate statements?

A: Swift type inference works on one expression statement at a time. Unlike in the first question, the first foo call has already been resolved to a TransformB type by itself. Because the second closure omits its parameter, its input type () conflicts with the already-resolved String output of the first closure, so the compiler reports an “ambiguous” error.
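Whole-expression solving is easy to observe in a much smaller case that has nothing to do with closures. The following sketch uses made-up constant names:

```swift
// An integer literal has no fixed type of its own;
// the solver picks one that works for the whole expression.
let onlyIntegers = 1 + 2   // both literals default to Int
let mixed = 1 + 2.5        // the later 2.5 makes the earlier 1 a Double

print(type(of: onlyIntegers), type(of: mixed))
```

The literal 1 is written first in both lines, yet its type depends on what follows it, just as the first closure’s return type depends on the second closure.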

Summary and follow-up

Conclusion

Swift type inference is a very complicated process, and you have no doubt run into all kinds of compiler errors while coding. With chains of closure calls like those in RxSwift in particular, when the code does not compile we usually just apply whatever fix the compiler’s diagnostics suggest, such as the hint to insert _ in, without exploring what is going on behind it. In scenarios like the one in this article, you may then end up with unexpected logic errors. Some may say the RxSwift case in this article is uncommon. In fact there is a case that is easier to understand, which you can try yourself:

struct Data {}

let arr = [1, 2, nil, 3]

let mappedArr = arr.compactMap { $0 }.map { Data() } // result: { Data, Data, Data, Data }

Although the compiler does issue warnings here, they are easy to overlook in a large real-world project, and the result can be confusing.

So how do you avoid this kind of hidden pitfall? Apart from paying attention to every suspicious compiler warning, there seems to be no good way other than writing out all closure parameters and result types. I have also always felt that this looks like a bad case produced by the compiler’s edge-case solving strategy for supporting the various kinds of closure syntactic sugar, so I filed a bug with the Swift project to see what the compiler developers would say. So far there has not been much of a reasonable explanation.
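For the array case, writing out the closure parameters could look like the sketch below. Payload is a hypothetical stand-in for the Data struct, renamed here only to avoid confusion with Foundation’s Data, and the explicit [Int?] annotation is added for clarity.

```swift
struct Payload {}  // hypothetical stand-in for the article's Data type

let arr: [Int?] = [1, 2, nil, 3]

// Option 1: give the second closure a parameter, even if it is ignored.
let viaUnderscore = arr.compactMap { $0 }.map { _ in Payload() }

// Option 2: spell out the parameter and result types of both closures.
let viaSignatures = arr
    .compactMap { (element: Int?) -> Int? in element }
    .map { (_: Int) -> Payload in Payload() }

print(viaUnderscore.count, viaSignatures.count) // 3 3
```

In both versions the nil element is filtered out, because the second closure no longer forces the result of the first one to be ().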

Lessons from the exploration

The source code studied here covers only the few key points needed to answer the questions we ran into and to confirm our suspicions. Compared with Swift type inference as a whole, it is just the tip of the iceberg. For example: what kinds of type constraints are there? What is the Constraint Graph? What is the logic of a solver Step? How are generic types inferred? These are all worth exploring as well.

Thanks for reading this long article. Its order follows my actual exploration: I ran into the problem while writing business code, dug into the implementation of Swift type inference that I had never looked at before, debugged and ran things over and over, found the key clues, and finally filed an issue with the Swift project to ask for official help. The whole process was very rewarding. When you hit a problem in daily development and have the energy, I suggest getting to the bottom of it. The gain comes not only from the problem itself but also from the process of searching and thinking, especially with open source projects such as the Swift standard library, the compiler and related projects, which are well worth studying.

Join us

Feishu, Bytedance’s enterprise collaboration platform, is a one-stop enterprise communication and collaboration platform integrating video conferencing, online documents, mobile office and collaboration software. Feishu is growing rapidly, with R&D centers in Beijing, Shenzhen and other cities, and has plenty of open positions in front end, mobile, Rust, server, testing, product and more. We look forward to you joining us to do challenging things together (link: future.feishu.cn/recruit).

We also welcome technical exchange with the Feishu team. If you are interested, you can join the Feishu technology exchange group.