This article is a translation of the official Swift blog post “New Diagnostic Architecture Overview”, by Zhihu user “Man Man Busy”.

Diagnostics play an important role in the experience of a programming language. It is vital to developer productivity that the compiler gives proper guidance and points out problems in any situation, especially when the code is incomplete or invalid.

In this blog post, we want to share some important updates on the diagnostics improvements being made for the upcoming Swift 5.2 release. They include a new strategy for diagnosing failures in the compiler, originally introduced as part of the Swift 5.1 release, which yields some exciting new results and improved error messages.

The challenge

Swift is an expressive language with a rich type system that has many features such as class inheritance, protocol conformances, generics, and overloading. As much as we programmers try to write well-formed code, sometimes we need a little help. Fortunately, the compiler knows exactly which Swift code is valid and which is not. The challenge is how best to tell you what went wrong, where it happened, and how to fix it.

Many parts of the compiler work to ensure that programs are correct, but the focus of this work has been on improving the type checker. The Swift type checker enforces rules about how types are used in source code and tells you when you violate them.

For example:

struct S<T> {
  init(_: [T]) {}
}

var i = 42
_ = S<Int>([i!])

This used to produce the following diagnostic:

error: type of expression is ambiguous without more context

Although this diagnostic points to a genuine error, it is not helpful because it is neither specific nor actionable. The old type checker had to guess the exact location of an error. That worked in many cases, but there were still many kinds of programming mistakes it could not identify accurately. To address this, a new diagnostic architecture is being developed. Instead of guessing where an error occurred, the type checker tries to “fix” problems at the point where it encounters them and remembers the fixes it has applied. This not only lets the type checker pinpoint errors in a wider variety of programs, it also lets it surface more failures where it previously would have stopped after reporting the first error.

Overview of type inference

Since the new diagnostic framework is tightly integrated with the type checker, we need to discuss type inference first. Note that this is only a brief introduction. For more details on type checking, see the compiler’s documentation on the type checker [1].

Swift implements bidirectional type inference using a constraint-based type checker that is reminiscent of the classical Hindley-Milner [2] type inference algorithm [3]:

• The type checker converts source code into a constraint system that represents the relationships between types in the code.

• Type relationships are expressed through type constraints that either make demands on a single type (for example, it is an integer literal type) or relate two types (for example, one type can be converted to another).

• The types described in constraints can be any type in the Swift type system, including tuple types, function types, enum/struct/class types, protocol types, and generic types. In addition, a type can be a type variable, denoted as $<name>.

• A type variable can be used in any other type; for example, the type variable $Foo is used in a tuple type ($Foo, Int).

The constraint system goes through three steps:

• Constraint generation

• Constraint solving

• Solution application

For diagnostics, the relevant stages are constraint generation and solving.

Given an input expression (and sometimes other context information), the constraint solver generates the following information:

• A set of type variables representing the abstract type of each subexpression

• A set of type constraints that describe the relationships between these type variables

The most common kind of constraint is a binary constraint, which relates two types and can be written as:

type1 <constraint kind> type2

Common binary constraints are:

• $X <bind to> Y - binds the type variable $X to the fixed type Y

• X <convertible to> Y - a conversion constraint requires that the first type X be convertible to the second type Y, which includes subtyping and equality

• X <conforms to> Y - specifies that the first type X must conform to the protocol Y

• (Arg1, Arg2…) -> Result <applicable to> $Function - an “applicable function” constraint requires that both types be function types with the same input and output types
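The constraint kinds listed above can be pictured as a small data model. The following is only a sketch for illustration: the names `Ty` and `Constraint` are hypothetical and are not the types the compiler uses internally.

```swift
// Hypothetical model of the binary constraints described above.
// `Ty` and `Constraint` are illustrative names, not compiler types.
indirect enum Ty: CustomStringConvertible {
    case concrete(String)      // a fixed type such as Int or String
    case variable(String)      // a type variable such as $X
    case function([Ty], Ty)    // (Arg1, Arg2) -> Result

    var description: String {
        switch self {
        case .concrete(let name):
            return name
        case .variable(let name):
            return "$" + name
        case .function(let params, let result):
            let list = params.map { $0.description }.joined(separator: ", ")
            return "(\(list)) -> \(result.description)"
        }
    }
}

enum Constraint: CustomStringConvertible {
    case bind(Ty, Ty)          // first <bind to> second
    case convertible(Ty, Ty)   // first <convertible to> second
    case conforms(Ty, Ty)      // first <conforms to> second
    case applicable(Ty, Ty)    // function type <applicable to> callee

    var description: String {
        switch self {
        case .bind(let a, let b):
            return "\(a.description) <bind to> \(b.description)"
        case .convertible(let a, let b):
            return "\(a.description) <convertible to> \(b.description)"
        case .conforms(let a, let b):
            return "\(a.description) <conforms to> \(b.description)"
        case .applicable(let a, let b):
            return "\(a.description) <applicable to> \(b.description)"
        }
    }
}

let examples: [Constraint] = [
    .bind(.variable("X"), .concrete("Int")),
    .applicable(.function([.variable("Arg")], .variable("Result")),
                .variable("Function")),
]
let rendered = examples.map { $0.description }
```

Printing `rendered` shows each constraint in the same `type1 <constraint kind> type2` notation used in this article.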

After constraint generation is complete, the solver will attempt to assign concrete types to each type variable in the constraint system and generate solutions that satisfy all constraints.

Let’s take a look at the following example:

func foo(_ str: String) {
  str + 1
}

It is easy for a person to see that there is a problem with the expression str + 1 and where that problem lies, but the type inference engine has to rely on the constraint simplification algorithm to work out what is wrong.

As discussed earlier, the constraint solver first generates constraints for str, 1, and +. Each distinct sub-element of the input expression, such as str, is represented by one of the following:

• A concrete type (known ahead of time)

• A type variable, denoted by $<name>, that can assume any type satisfying the constraints associated with it

After the constraint generation phase is complete, the constraint system for the expression str + 1 contains a combination of type variables and constraints. Let’s take a look at them.

Type variables

• $Str represents the type of the variable str, which is the first argument in the call to +

• $One represents the type of the literal 1, which is the second argument in the call to +

• $Result represents the result type of the call to operator +

• $Plus represents the type of the operator + itself, which is a set of possible overload choices to attempt

Constraints

• $Str <bind to> String

The argument str has the fixed type String.

• $One <conforms to> ExpressibleByIntegerLiteral

In Swift, an integer literal such as 1 can assume any type that conforms to the ExpressibleByIntegerLiteral protocol (for example, Int or Double), so this is the only information the solver can rely on at the start.

• $Plus <bind to> disjunction((String, String) -> String, (Int, Int) -> Int, …)

The operator + forms a disjunction of choices, in which each element is the type of an individual overload.

• ($Str, $One) -> $Result <applicable to> $Plus

The type of $Result is not yet known; it is determined by testing each overload of $Plus against the argument tuple ($Str, $One).

Note that every constraint and type variable is associated with a specific location in the input expression.

The inference algorithm tries to find suitable types for all type variables in the constraint system and tests them against the associated constraints. In our example, $One could be Int or Double, because both types satisfy the ExpressibleByIntegerLiteral conformance requirement. However, simply enumerating every possible type for each “empty” type variable in the constraint system is very inefficient, because there are many types to try when a type variable is under-constrained. For example, $Result has no restrictions, so it could assume any type. To work around this problem, the constraint solver attempts the disjunction choices first, which lets it narrow down the set of possible types for each type variable involved. For $Result, this reduces the possible types to the result types of the overload choices of $Plus, rather than all possible types.

Now it’s time to run the inference algorithm to determine the types of $One and $Result.

A single step of the inference algorithm

• First, bind $Plus to its first disjunction choice, (String, String) -> String

• The applicable to constraint can now be tested, because $Plus is bound to a concrete type. Simplifying ($Str, $One) -> $Result <applicable to> $Plus amounts to matching the two function types ($Str, $One) -> $Result and (String, String) -> String, which proceeds as follows:

  - Add a new conversion constraint to match argument 0 with parameter 0: $Str <convertible to> String
  - Add a new conversion constraint to match argument 1 with parameter 1: $One <convertible to> String
  - Equate $Result with String, because the result types must be equal

• Some of the new constraints can be tested or simplified immediately, for example:

  - $Str <convertible to> String holds, because $Str already has the fixed type String and String is convertible to itself
  - $Result can be assigned the type String based on the equality constraint

• At this point, the only remaining constraints are:

$One <convertible to> String
$One <conforms to> ExpressibleByIntegerLiteral

• The possible types for $One are Int, Double, and String. This is interesting, because none of them satisfies all of the remaining constraints: Int and Double are not convertible to String, and String does not conform to the ExpressibleByIntegerLiteral protocol

• After trying every possible type for $One, the solver stops and considers the current set of types and overload choices a failure. It then backtracks and tries the next disjunction choice for $Plus.
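The steps above can be imitated with a toy solver. This is a deliberately simplified sketch: types are plain strings, conversion is reduced to equality, and all names (`Overload`, `solve`, `literalTypes`) are hypothetical rather than taken from the compiler.

```swift
// Toy model of one solver pass over `str + 1`. Types are plain strings and
// every name below is hypothetical, not taken from the compiler.
struct Overload {
    let params: [String]
    let result: String
}

// Types that conform to ExpressibleByIntegerLiteral in this toy world.
let literalTypes = ["Int", "Double"]

// Conversion is simplified to plain equality (no subtyping).
func isConvertible(_ from: String, to target: String) -> Bool {
    return from == target
}

/// Binds $Plus to each overload in turn. For every choice it checks
/// `$Str <convertible to> param0`, then looks for a type for $One that both
/// conforms to the literal protocol and converts to param1. Returns the
/// result type of the first consistent choice, or nil if all of them fail.
func solve(strType: String, overloads: [Overload]) -> String? {
    for choice in overloads {
        guard isConvertible(strType, to: choice.params[0]) else {
            continue // backtrack: $Str does not fit parameter 0
        }
        // Candidate types for $One: the literal types plus the parameter type.
        let candidates = literalTypes + [choice.params[1]]
        let viable = candidates.filter { candidate in
            literalTypes.contains(candidate)
                && isConvertible(candidate, to: choice.params[1])
        }
        if !viable.isEmpty {
            return choice.result
        }
        // No type satisfies both constraints on $One: backtrack.
    }
    return nil
}

let plusOverloads = [
    Overload(params: ["String", "String"], result: "String"),
    Overload(params: ["Int", "Int"], result: "Int"),
]

let forString = solve(strType: "String", overloads: plusOverloads) // str + 1
let forInt = solve(strType: "Int", overloads: plusOverloads)       // n + 1
```

With a String left operand every overload fails, so `forString` is nil, which corresponds to the dead end described above. With an Int left operand the second overload succeeds.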

As we can see, the solver determines the error location while it executes the inference algorithm. Since no possible type matches $One, it should be considered an error location (because it cannot be bound to any type). Complex expressions can have several such locations, because existing errors lead to new ones as the inference algorithm progresses. To narrow down the error locations in such cases, the solver picks only the solutions with the smallest number of them.

At this point it is more or less clear how error locations are identified, but it is not yet obvious how to help the solver make forward progress in such cases so that it can derive a complete solution.

The solution

The new diagnostic architecture uses what we call a “constraint fix” to try to resolve inconsistent situations in which the solver gets stuck with no other types to attempt. The fix for our example is to ignore the fact that String does not conform to the ExpressibleByIntegerLiteral protocol. The purpose of a fix is to capture all useful information about the error location from the solver and use it later for diagnostics. This is the main difference between the old approach and the new one. The old approach tries to guess where the error is located, whereas the new approach has a symbiotic relationship with the solver, which provides it with all of the error locations.
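To make the idea of a constraint fix concrete, here is a minimal sketch in which a failed constraint is recorded and skipped instead of ending the attempt. It is a toy model under stated assumptions: types are plain strings, conversion is equality, and `Fix`, `Overload`, and `fixes(forStrType:against:)` are hypothetical names, not compiler APIs.

```swift
// Hypothetical toy model; types are plain strings and none of these names
// come from the compiler.
struct Fix {
    let location: String   // where in the expression the failure occurred
    let message: String    // what was ignored in order to keep going
}

struct Overload {
    let params: [String]
    let result: String
}

// Types that conform to ExpressibleByIntegerLiteral in this toy world.
let literalTypes = ["Int", "Double"]

/// Checks `str + 1` against one overload of `+`. Instead of giving up on a
/// failed constraint, it records a fix and carries on.
func fixes(forStrType strType: String, against overload: Overload) -> [Fix] {
    var recorded: [Fix] = []
    if strType != overload.params[0] {
        recorded.append(Fix(
            location: "argument #1",
            message: "ignored: \(strType) is not convertible to \(overload.params[0])"))
    }
    if !literalTypes.contains(overload.params[1]) {
        recorded.append(Fix(
            location: "argument #2",
            message: "ignored: \(overload.params[1]) does not conform to ExpressibleByIntegerLiteral"))
    }
    return recorded
}

let plus = [
    Overload(params: ["String", "String"], result: "String"),
    Overload(params: ["Int", "Int"], result: "Int"),
]

// One list of fixes per overload choice.
let attempts = plus.map { fixes(forStrType: "String", against: $0) }

// Keep only the attempts that needed the fewest fixes.
let fewest = attempts.map { $0.count }.min() ?? 0
let best = attempts.filter { $0.count == fewest }
```

In this toy model both overloads can be completed with a single fix each, so both survive as partially matching choices, and each fix remembers which argument it belongs to.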

As mentioned earlier, all type variables and constraints contain information about their relationship to the subexpression from which they derive. Such relationships, combined with type information, can easily provide tailored diagnostics and fixes for all the problems diagnosed through the new diagnostics framework.

In our example, the type variable $One has been identified as the error location, so the diagnostic can examine how $One is used in the input expression: $One represents the argument at position #2 in the call to operator +, and the problem is known to be that String does not conform to the ExpressibleByIntegerLiteral protocol. From this information, it is possible to form either a tailored diagnostic:

error: binary operator '+' cannot be applied to arguments 'String' and 'Int'

with a note that the second argument does not conform to the ExpressibleByIntegerLiteral protocol, or the simpler:

error: argument type 'String' does not conform to 'ExpressibleByIntegerLiteral'

with the diagnostic referring to the second argument.

We chose the first alternative and produce a diagnostic about the operator, along with a note for each partially matching overload choice. Let’s take a closer look at the inner workings of this approach.

Diagnostic anatomy

When a constraint failure is detected, a constraint fix is created to capture some information about the failure:

• The type of failure that occurs

• Where the failure occurred in the source code

• The types and declarations involved in the failure

The constraint solver accumulates these fixes. Once it arrives at a solution, it looks at the fixes that are part of that solution and produces actionable errors or warnings. Let’s see how it all works together. Consider the following example:

func foo(_: inout Int) {}

var x: Int = 0
foo(x)

The problem here is the argument x, which cannot be passed to an inout parameter without an explicit &.

Now, let’s look at the type variables and constraints for this example.

Type variables

There are three type variables:

$X := Int
$Foo := (inout Int) -> Void
$Result

Constraints

The three type variables are related by the following constraint:

($X) -> $Result <applicable to> $Foo

The inference algorithm tries to match ($X) -> $Result with (inout Int) -> Void, which produces the following new constraints:

Int <convertible to> inout Int
$Result <equal to> Void

Int cannot be converted to inout Int, so the constraint solver records the failure as a missing & [4] and ignores the <convertible to> constraint.

With that constraint ignored, the rest of the constraint system can be solved. The type checker then looks at the recorded fixes and emits an error [5] that describes the problem (the missing &) together with a fix-it for inserting the &:

error: passing value of type 'Int' to an inout parameter requires explicit '&'
foo(x)
    ^
    &
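The path from a recorded fix to a diagnostic can be sketched as follows. This is an assumption-laden illustration: `SourceLocation`, `FixKind`, and `ConstraintFix` are hypothetical types (the compiler's real fixes are C++ classes), and the line and column values are made up for the example.

```swift
// Hypothetical sketch: these types are illustrative, not the compiler's.
struct SourceLocation {
    let line: Int
    let column: Int
}

// The kind of failure, together with the types or declarations involved.
enum FixKind {
    case missingAddressOf(argumentType: String)
    case missingArgumentLabel(String)
}

struct ConstraintFix {
    let kind: FixKind
    let location: SourceLocation

    // Turns the recorded failure into an error message plus fix-it text.
    var diagnostic: (message: String, fixIt: String) {
        switch kind {
        case .missingAddressOf(let type):
            return ("passing value of type '\(type)' to an inout parameter requires explicit '&'", "&")
        case .missingArgumentLabel(let label):
            return ("missing argument label '\(label):' in call", "\(label): ")
        }
    }
}

// Fixes accumulate while solving; diagnostics are produced afterwards.
var recorded: [ConstraintFix] = []
recorded.append(ConstraintFix(
    kind: .missingAddressOf(argumentType: "Int"),
    location: SourceLocation(line: 4, column: 5))) // assumed position of `x`

let messages = recorded.map { "error: " + $0.diagnostic.message }
```

Keeping the failure kind, the location, and the types involved in one value is what lets the message and the fix-it be produced after solving has finished.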

There is only one type error in this example, but the diagnostic architecture can also account for several distinct type errors in the same code. Consider a slightly more complicated example:

func foo(_: inout Int, bar: String) {}

var x: Int = 0
foo(x, "bar")

When solving the constraint system for this example, the type checker again records a missing & failure for the first argument to foo. In addition, it records a failure for the missing argument label bar. Once both failures are recorded, the rest of the constraint system is solved. The type checker then produces errors (with fix-its) for the two problems that need to be fixed in this code:

error: passing value of type 'Int' to an inout parameter requires explicit '&'
foo(x, "bar")
    ^
    &
error: missing argument label 'bar:' in call
foo(x, "bar")
       ^
       bar: 

Recording each specific failure and then continuing to solve the rest of the constraint system implies that addressing those failures will result in a well-typed solution. This allows the type checker to produce actionable diagnostics, often with fix-its, that lead the developer toward correct code.
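For reference, this is what the second example looks like once both reported problems are addressed. The function body is an assumption added only so that the call has an observable effect; the original example has an empty body.

```swift
// The body is hypothetical: it adds the label's length to the inout value
// so that the effect of the call can be observed.
func foo(_ value: inout Int, bar: String) {
    value += bar.count
}

var x: Int = 0
foo(&x, bar: "bar") // explicit '&' and the 'bar:' label are both present
```

After the call, `x` holds 3, the length of "bar".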

Examples of improved diagnostics

Missing label

Consider the following invalid code:

func foo(answer: Int) -> String { return "a" }
func foo(answer: String) -> String { return "b" }

let _: [String] = [42].map { foo($0) }

Previously, this produced the following diagnostic information:

error: argument labels '(_:)' do not match any available overloads

The new diagnostic information is:

error: missing argument label 'answer:' in call
let _: [String] = [42].map { foo($0) }
                                 ^
                                 answer:
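Adding the missing label makes the call type-check. The sketch below binds the result to a name (`labels`, chosen here for illustration) so the selected overload can be inspected.

```swift
func foo(answer: Int) -> String { return "a" }
func foo(answer: String) -> String { return "b" }

// With the label in place, the Int overload is selected for each element.
let labels: [String] = [42].map { foo(answer: $0) }
```

Because the array literal contains an integer, the `Int` overload is chosen and `labels` is `["a"]`.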

Argument-to-parameter conversion mismatch

Consider the following invalid code:

let x: [Int] = [1, 2, 3, 4]
let y: UInt = 4

_ = x.filter { ($0 + y)  > 42 }

Previously, this produced the following diagnostic information:

error: binary operator '+' cannot be applied to operands of type 'Int' and 'UInt'

The new diagnostic information is:

error: cannot convert value of type 'UInt' to expected argument type 'Int'
_ = x.filter { ($0 + y)  > 42 }
                     ^
                     Int( )
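One way to make this closure type-check is to convert y to Int explicitly before adding it. The sketch below keeps the values from the example; the second filter, with a lower threshold, is added here only so that the result is not empty.

```swift
let x: [Int] = [1, 2, 3, 4]
let y: UInt = 4

// Explicit conversion makes both operands of '+' an Int.
let aboveFortyTwo = x.filter { ($0 + Int(y)) > 42 }

// Same conversion with a lower, illustrative threshold.
let aboveSix = x.filter { ($0 + Int(y)) > 6 }
```

Note that `Int(y)` traps at run time if the value does not fit in an Int; `Int(exactly:)` is an alternative when that is a concern.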

Missing members

Consider the following invalid code:

class A {}
class B : A {
  override init() {}
  func foo() -> A {
    return A() 
  }
}

struct S<T> {
  init(_ a: T...) {}
}

func bar<T>(_ t: T) {
  _ = S(B(), .foo(), A())
}

Previously, this produced the following diagnostic information:

error: generic parameter 'T' could not be inferred

The new diagnostic information is:

error: type 'A' has no member 'foo'
    _ = S(B(), .foo(), A())
               ~^~~~~

Missing protocol conformance

Consider the following invalid code:

protocol P {}

func foo<T: P>(_ x: T) -> T {
  return x
}

func bar<T>(x: T) -> T {
  return foo(x)
}

Previously, this produced the following diagnostic information:

error: generic parameter 'T' could not be inferred

The new diagnostic information is:

error: argument type 'T' does not conform to expected type 'P'
    return foo(x)
               ^
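The new message points directly at the cause: bar accepts any T, while foo requires T to conform to P. A minimal sketch of one possible fix is to constrain bar in the same way. The `Int` conformance below is an assumption added only so that the functions can be called.

```swift
protocol P {}

// Hypothetical conformance, added so the example can be exercised.
extension Int: P {}

func foo<T: P>(_ x: T) -> T {
    return x
}

// Constraining T to P here satisfies the requirement of foo.
func bar<T: P>(x: T) -> T {
    return foo(x)
}

let passedThrough = bar(x: 7)
```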

Conditional conformances

Consider the following invalid code:

extension BinaryInteger {
  var foo: Self {
    return self <= 1
      ? 1
      : (2...self).reduce(1, *)
  }
}

Previously, this produced the following diagnostic information:

error: ambiguous reference to member '...'

The new diagnostic information is:

error: referencing instance method 'reduce' on 'ClosedRange' requires that 'Self.Stride' conform to 'SignedInteger'
      : (2...self).reduce(1, *)
                   ^
Swift.ClosedRange:1:11: note: requirement from conditional conformance of 'ClosedRange<Self>' to 'Sequence'
extension ClosedRange : Sequence where Bound : Strideable, Bound.Stride : SignedInteger {
          ^
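The note names the requirement that is missing, so one possible fix is to state it on the extension. The sketch below is an assumption about how one might resolve the error, not something the original post prescribes; it also uses a guard instead of the ternary expression to keep the example easy to read.

```swift
// Restricting the extension to types whose Stride is a SignedInteger
// satisfies the conditional conformance of ClosedRange to Sequence.
extension BinaryInteger where Stride: SignedInteger {
    var foo: Self {
        guard self > 1 else { return 1 }
        return (2...self).reduce(1, *)
    }
}

let fromInt = 5.foo          // Int.Stride is Int
let fromUInt8 = UInt8(4).foo // UInt8.Stride is also Int
```

The standard integer types use Int as their Stride, so the property remains available on them.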

SwiftUI examples

Argument-to-parameter conversion mismatch

Consider the following invalid SwiftUI code:

import SwiftUI

struct Foo: View {
  var body: some View {
    ForEach(1...5) {
      Circle().rotation(.degrees($0))
    }
  }
}

Previously, this produced the following diagnostic information:

error: cannot convert value of type '(Double) -> RotatedShape<Circle>' to expected argument type '() -> _'

The new diagnostic information is:

error: cannot convert value of type 'Int' to expected argument type 'Double'
        Circle().rotation(.degrees($0))
                                   ^
                                   Double( )

Missing members

Consider the following invalid SwiftUI code:

import SwiftUI

struct S: View {
  var body: some View {
    ZStack {
      Rectangle().frame(width: 220.0, height: 32.0)
                 .foregroundColor(.systemRed)

      HStack {
        Text("A")
        Spacer()
        Text("B")
      }.padding()
    }.scaledToFit()
  }
}

Previously, this was diagnosed as a completely unrelated problem:

error: 'Double' is not convertible to 'CGFloat?'
      Rectangle().frame(width: 220.0, height: 32.0)
                               ^~~~~

Now, the new diagnostic correctly points out that there is no such color as systemRed:

error: type 'Color?' has no member 'systemRed'
                   .foregroundColor(.systemRed)
                                    ~^~~~~~~~~

Missing arguments

Consider the following invalid SwiftUI code:

import SwiftUI

struct S: View {
  @State private var showDetail = false

  var body: some View {
    Button(action: {
      self.showDetail.toggle()
    }) {
      Image(systemName: "chevron.right.circle")
        .imageScale(.large)
        .rotationEffect(.degrees(showDetail ? 90 : 0))
        .scaleEffect(showDetail ? 1.5 : 1)
        .padding()
        .animation(.spring)
    }
  }
}

Previously, this produced the following diagnostic information:

error: type of expression is ambiguous without more context

The new diagnostic information is:

error: member 'spring' expects argument of type '(response: Double, dampingFraction: Double, blendDuration: Double)'
         .animation(.spring)
                     ^

Conclusion

The new diagnostic architecture is designed to overcome all the shortcomings of the old approach. It is structured to make it easy to improve or port existing diagnostics, and to let implementers of new features provide great diagnostics from the start. All of the diagnostics we have ported so far show very promising results, and we are porting more every day.

References

[1] https://github.com/apple/swift/blob/master/docs/TypeChecker.rst

[2] https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system

[3] https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system#An_inference_algorithm

[4] https://github.com/apple/swift/blob/master/lib/Sema/CSFix.h#L542-L554

[5] https://github.com/apple/swift/blob/master/lib/Sema/CSDiagnostics.cpp#L1030-L1047