Background

After Douyin adopted Swift, compilation occasionally failed with Segmentation fault: 11 or Illegal instruction: 4, both in CI/CD and locally. Recompiling usually succeeded.

Because the crash is thrown inside the compiler, the reported error code varies, and the failure does not reproduce reliably, the problem was hard to pin down for a while. Many similar errors are reported online, but Segmentation fault is a generic error for invalid memory access and offers little guidance. Teams inside and outside the company had run into similar errors, but each for a different reason, so their experience was hard to reuse.

Once the Swift libraries are binarized, the related code no longer takes part in compilation and the problem becomes much rarer locally. However, CI/CD and the repository binarization tasks still build from source. When the problem occurs there, someone has to retry manually, which is tedious and hurts efficiency. We therefore dug into the compiler to find a solution.

Crash stack

Conclusion

In short, the Swift code used an NSDictionary declared as a class property in OC as a Swift Dictionary and modified it, so an immutable value was used as a mutable one. The compiler then either failed SILInstruction verification and called abort() itself to end the process, or crashed with EXC_BAD_ACCESS.

Preparation

Compiling Swift

Because the error can be reproduced locally, check out swift-5.3.2-release, which matches the local toolchain version. VSCode is recommended for debugging and Ninja for building.

Ninja is a small build system focused on speed.

Notes

  • Reserve 50 GB of disk space in advance
  • The first build takes about an hour and keeps the CPU almost fully loaded

Download and compile the source code

brew install cmake ninja
mkdir swift-source
cd swift-source
git clone git@github.com:apple/swift.git
cd swift/utils
./update-checkout --tag swift-5.3.2-release --clone
./build-script

The main directory

Extract compilation parameters

I extracted the relevant code from the Douyin project, reproduced the compilation error locally, and then copied the compilation parameters from Xcode:

VSCode debugging

Choose a suitable LLDB plug-in and configure launch.json. The following example uses CodeLLDB.

The args array holds the compilation parameters extracted in the previous step. Wrap each parameter in double quotation marks and separate the parameters with commas.

{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "lldb",
      "request": "launch",
      "name": "Debug",
      "program": "${workspaceFolder}/build/Ninja-DebugAssert/swift-macosx-x86_64/bin/swift",
      "args": ["-frontend", "-c", "-primary-file" /* and other params */],
      "cwd": "${workspaceFolder}"
    }
  ]
}

SIL

LLVM

Before going into SIL, let's briefly introduce LLVM. The classic LLVM architecture, shown in the following figure, is divided into Frontend, Optimizer, and Backend. Supporting a new language only requires implementing a new front end, and supporting a new architecture only requires implementing a new back end. The hub connecting the two is the Intermediate Representation (IR). IR is independent of both programming language and machine architecture, so optimizations in the IR phase can be written once and shared.

Frontend

The front end builds an AST through lexical analysis, syntactic analysis, and semantic analysis, and then generates IR through intermediate code generation.

IR

Format

IR is the bridging language between the front and back ends of LLVM. It has three main formats:

  • Human-readable text format, with the .ll extension
  • Bitcode format, with the .bc extension
  • In-memory format used at run time

These three formats are completely equivalent.

SSA

LLVM IR and SIL are both in SSA (Static Single Assignment) form. In SSA form, every variable must be defined before it is used and can be assigned only once. This makes optimization more efficient and allows deeper, more specialized optimizations.

As shown in the figure below, after the code is transformed into SSA form, the variable can only be assigned once, and it is easy to judge that y1=1 is an assignment statement that can be removed by optimization.
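The idea can be sketched in a few lines of Swift. This is only an illustration under assumptions: the three-statement program (y = 1; y = 2; x = y) and every type and function name below are hypothetical, and the "optimizer" handles straight-line code only.

```swift
// One straight-line assignment: `target = <expression that reads `uses`>`.
struct Assignment {
    let target: String
    let uses: [String]
}

// Renames every assignment to a fresh version (y -> y1, y2, ...) so that each
// name is assigned exactly once, and rewrites reads to the latest version.
func toSSA(_ program: [Assignment]) -> [Assignment] {
    var version: [String: Int] = [:]
    var result: [Assignment] = []
    for stmt in program {
        // Reads refer to the most recent version of each variable.
        let uses = stmt.uses.map { name in "\(name)\(version[name] ?? 0)" }
        let next = (version[stmt.target] ?? 0) + 1
        version[stmt.target] = next
        result.append(Assignment(target: "\(stmt.target)\(next)", uses: uses))
    }
    return result
}

// In SSA form, a definition that nothing reads can be removed.
// `liveOut` names the values still needed after the program ends.
// (Single pass only: it does not chase chains of dead definitions.)
func deadDefinitions(_ ssa: [Assignment], liveOut: Set<String>) -> [String] {
    let used = Set(ssa.flatMap { $0.uses }).union(liveOut)
    return ssa.map { $0.target }.filter { !used.contains($0) }
}

// y = 1; y = 2; x = y
let program = [
    Assignment(target: "y", uses: []),
    Assignment(target: "y", uses: []),
    Assignment(target: "x", uses: ["y"]),
]
let ssa = toSSA(program)
let dead = deadDefinitions(ssa, liveOut: ["x1"])
print(ssa.map { $0.target }) // ["y1", "y2", "x1"]
print(dead)                  // ["y1"]
```

Because each name has exactly one definition, checking whether y1 is ever read is a simple set lookup, with no need to reason about which assignment to y reaches which use.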

Structure

The top-level unit is the Module, which roughly corresponds to one source file. A Module contains global variables and Functions. A Function corresponds to a function in the source, including its implementation, parameters, and return value. The most important part of a Function is its Basic Blocks.

A Basic Block (BB) corresponds to a node in the function's control flow graph. It is a sequence of Instructions and must end with a Terminator Instruction, which marks the end of the block with a branch or a function return.

An Instruction corresponds to a single operation and is the basic unit of program execution.
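To make the hierarchy concrete, here is a deliberately simplified Swift model. All type names and the three instruction kinds are illustrative assumptions, not LLVM's real C++ classes. The only rule it checks is the one stated above: a Basic Block must end with a terminator.

```swift
// A toy instruction set: one ordinary instruction and two terminators.
enum Instruction {
    case add(result: String, lhs: String, rhs: String)
    case branch(target: String)   // terminator: jump to another block
    case ret(value: String)       // terminator: return from the function

    var isTerminator: Bool {
        switch self {
        case .branch, .ret: return true
        case .add: return false
        }
    }
}

struct BasicBlock {
    let label: String
    let instructions: [Instruction]
}

struct Function {
    let name: String
    let parameters: [String]
    let blocks: [BasicBlock]
}

struct Module {
    let functions: [Function]
}

// A block is well formed when its last instruction is a terminator
// and no terminator appears before the end.
func isWellFormed(_ block: BasicBlock) -> Bool {
    guard let last = block.instructions.last, last.isTerminator else { return false }
    return !block.instructions.dropLast().contains { $0.isTerminator }
}

let entry = BasicBlock(label: "entry", instructions: [
    .add(result: "%2", lhs: "%0", rhs: "%1"),
    .ret(value: "%2"),
])
let broken = BasicBlock(label: "broken", instructions: [
    .add(result: "%2", lhs: "%0", rhs: "%1"),
])
let module = Module(functions: [
    Function(name: "sum", parameters: ["%0", "%1"], blocks: [entry, broken]),
])

// Walk Module -> Function -> BasicBlock and report each block's status.
for function in module.functions {
    for block in function.blocks {
        print(function.name, block.label, isWellFormed(block))
    }
}
```

The nested loop at the end has the same shape as a verifier pass: it visits every function, then every block, and checks a structural invariant at each level.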

Optimizer

The optimizer optimizes IR by invoking a series of Passes. A Pass is code that traverses the IR and processes it. LLVM has many built-in Passes, and developers can write custom Passes for specific purposes, such as instrumentation for counting.

Xcode Optimization Level

In Xcode, under Build Settings > Apple Clang - Code Generation > Optimization Level, you can select the optimization level. -O0 means no optimization, that is, no optimization Pass is invoked. Other optimization levels invoke the corresponding Passes.

Backend

The back end converts the IR into machine code for the target CPU architecture.

Swiftc

Unlike OC, which uses Clang as its compiler front end, Swift has its own compiler front end, Swiftc, as shown below.

This is where the benefit of LLVM's three-phase design shows: supporting a new language only requires implementing a compiler front end.

Compared with Clang, Swift adds a SIL stage. SIL is a high-level intermediate language introduced by Swift to enable higher-level optimizations.

Swift compilation process

Swift source code goes through lexical, syntactic, and semantic analysis to produce an AST. SILGen takes the AST and generates SIL, which at this stage is called Raw SIL. After analysis and optimization, it becomes Canonical SIL. Finally, IRGen converts Canonical SIL into LLVM IR, which is handed to the optimizer and the back end.

SIL instruction

SIL assumes an unlimited number of virtual registers, named with % followed by a number, such as %0, %1, and so on.

  • alloc_stack: Allocates stack memory
  • apply: Calls a function with the given arguments
  • load: Loads the value at the specified address from memory
  • function_ref: Creates a reference to a SIL function

For detailed SIL instruction parsing, refer to the official documentation.

Identifier

There are two basic types of LLVM IR identifiers:

  • Global identifier: Contains methods, global variables, etc., and starts with @
  • Local identifier: Covers register names, types, etc., and starts with %, where % followed by a number denotes an unnamed variable

In SIL, function names and attributes begin with @, while values begin with %:

  • SIL function names take the form @ followed by letters/digits and are usually mangled
  • SIL values take the form % followed by letters/digits and refer to an instruction or a Basic Block parameter
  • @convention(swift) is the calling convention of Swift functions and is used by default
  • @convention(c) and @convention(objc_method) mean the calling conventions of C and OC, respectively
  • @convention(method) denotes the implementation of a Swift instance method
  • @convention(witness_method) denotes the implementation of a Swift protocol method

SIL structure

SIL implements a structure similar to that of LLVM IR: SILModule, SILFunction, SILBasicBlock, and SILInstruction.

Debugging process

Reproducing the crash

With the compilation parameters configured as described in the preparation steps, start the compilation to reproduce the crash. Both crashes can be reproduced, as shown in the following figure. abort() and EXC_BAD_ACCESS cause the Illegal instruction: 4 and Segmentation fault: 11 errors mentioned above, respectively. Because the upper stack frames of the two are identical, the former is used as the example for analysis.

Stack analysis

The stack trace shows that after a SILFunction is generated, postEmitFunction is executed to verify its validity. SILVerifier then works down layer by layer: it iterates over and verifies each BasicBlock (visitSILBasicBlock), and then iterates over the SILInstructions inside each BasicBlock (visitSILInstruction).

Calling getKind() to get the kind of a SILInstruction returns an abnormal value, which triggers the crash.

Abnormal SIL

  • Because the SILInstruction is already invalid at this point, it is hard to tell which instruction was being verified when the exception occurred. So, while traversing the SILInstructions, print the contents of the previous instruction.
  • Execute the following commands to compile incrementally
cd build/Ninja-DebugAssert/swift-macosx-x86_64
ninja

The reproduced printed content is as shown in the figure below:

Debug tips: Many LLVM classes implement dump() to print content for debugging purposes.

// function_ref Dictionary.subscript.setter
%32 = function_ref @$sSDyq_Sgxcis : $@convention(method) <τ_0_0, τ_0_1 where τ_0_0 : Hashable> (@in Optional<τ_0_1>, @in τ_0_0, @inout Dictionary<τ_0_0, τ_0_1>) -> () // user: %33
%33 = apply %32<AnyHashable, Any>(%13, %11, %24) : $@convention(method) <τ_0_0, τ_0_1 where τ_0_0 : Hashable> (@in Optional<τ_0_1>, @in τ_0_0, @inout Dictionary<τ_0_0, τ_0_1>) -> ()
%34 = load [take] %24 : $*Dictionary<AnyHashable, Any> // users: %43, %37

Normal SIL

The swiftc -emit-silgen command generates Raw SIL. Since this class references an OC file, add the bridging header as a compiler parameter. The complete command is as follows:

swiftc -emit-silgen /Users/cs/code/ThirdParty/Swift_MVP/Swift_MVP/SwiftCrash.swift -o test.sil  -import-objc-header /Users/cs/code/ThirdParty/Swift_MVP/Swift_MVP/Swift_MVP-Bridging-Header.h

An excerpt of the generated SIL follows:

%24 = alloc_stack $Dictionary<AnyHashable, Any> // users: %44, %34, %33, %31
%25 = metatype $@objc_metatype TestObject.Type // users: %40, %39, %27, %26
%34 = load [take] %24 : $*Dictionary<AnyHashable, Any> // users: %42, %36
%35 = function_ref @$sSD10FoundationE19_bridgeToObjectiveCSo12NSDictionaryCyF : $@convention(method) <τ_0_0, τ_0_1 where τ_0_0 : Hashable> (@guaranteed Dictionary<τ_0_0, τ_0_1>) -> @owned NSDictionary // user: %37
%36 = begin_borrow %34 : $Dictionary<AnyHashable, Any> // users: %38, %37
%37 = apply %35<AnyHashable, Any>(%36) : $@convention(method) <τ_0_0, τ_0_1 where τ_0_0 : Hashable> (@guaranteed Dictionary<τ_0_0, τ_0_1>) -> @owned NSDictionary // users: %41, %40

SIL analysis

Analyze the normal SIL instruction by instruction:

  1. Allocate stack memory of type Dictionary<AnyHashable, Any> and store its address in register %24, which is used by %44, %34, %33, and %31
  2. %25 is of type TestObject.Type, i.e., the metatype of TestObject
  3. Load the value at %24 into %34, taking it out of %24 so that %24 no longer holds a valid value
  4. Create a reference to the function _bridgeToObjectiveC() -> NSDictionary and store it in %35
  • Since the function name is mangled, demangle it first, as shown in the figure below, to get the function
  • @convention(method) indicates a Swift instance method. It has two generic parameters, and the first one, τ_0_0, conforms to the Hashable protocol
  5. Borrow %34, producing a value of the same type and storing it in %36. %34 stays alive until %36 is no longer used
  6. Call the function stored in %35 with %36 as the argument. It returns an NSDictionary, which is stored in %37. Its job is to convert the Dictionary into an NSDictionary

Dawn breaks

We can see that the failure happens around the bridging method _bridgeToObjectiveC(). Looking at the source code, we found that an immutable OC NSDictionary is bridged to a Swift Dictionary, treated as a mutable type, and modified. Although this usage may cause logic errors, it should not crash the compiler, so this is a bug in the compiler itself. More interestingly, the compiler crashes only if the property is declared as a class property in OC.

class SwiftCrash: NSObject {
  func execute() {
    //compiler crash
    TestObject.cachedData[""] = ""
  }
}
TestObject is declared in OC as follows:
@interface TestObject : NSObject
@property (strong, nonatomic, class) NSDictionary *cachedData;
@end

Solution

Modifying the source code

Once the cause is found, the fix is simple: change the NSDictionary in the offending code to NSMutableDictionary.

Re-running the Swift compiler on the source code no longer produces an error.

After the Douyin source code was modified, the compiler crash did not occur again, so the problem is fixed.

Static analysis

A potential problem

Although an NSDictionary can be bridged to a Swift Dictionary and used normally, modifying the immutable object in Swift produces a new object and leaves the original object untouched. The test code is shown below.

The output shows that the contents of the variable temp have not changed, so the modification made in Swift does not affect the original object.

TestObject *t = [TestObject new];
t.cachedData = [@{@"oc":@"oc"} mutableCopy];
NSDictionary *temp = t.cachedData;
NSLog(@"before execution : temp %p: %@",temp,temp);
NSLog(@"before execution : cachedData %p: %@",t.cachedData,t.cachedData);
[[[SwiftDataMgr alloc] init] executeWithT:t];
NSLog(@"after execution : temp %p: %@",temp,temp);
NSLog(@"after execution : cachedData %p: %@",t.cachedData,t.cachedData);
SwiftDataMgr is implemented in Swift as follows:
class SwiftDataMgr: NSObject {
  @objc
  func execute(t : TestObject) {
    t.cachedData["swift"] = "swift"
  }
}
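The same effect can be observed in pure Swift, without any OC bridging, because Dictionary has value semantics. The sketch below is an illustration under assumptions: the Cache class is a hypothetical stand-in for TestObject and holds a native Swift dictionary rather than a bridged NSDictionary.

```swift
// Hypothetical stand-in for TestObject: a class holding a dictionary.
final class Cache {
    var cachedData: [String: String] = ["oc": "oc"]
}

let cache = Cache()

// `temp` is an independent value, not a shared reference.
let temp = cache.cachedData

// Conceptually, this reads the value, mutates it, and stores the result back
// into the property. `temp` still holds the earlier value.
cache.cachedData["swift"] = "swift"

print(temp.count)             // 1
print(cache.cachedData.count) // 2
```

With a bridged NSDictionary property the same read, modify, and write-back sequence takes place, which is why an OC reference obtained before the call keeps pointing at the old object.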

New rule

We added a rule to Douyin's static source analysis that detects any modification of an immutable OC class in Swift. This prevents both the compiler crash and the potential logic errors.

The classes checked are as follows:

NSDictionary/NSSet/NSData/NSArray/NSString/NSOrderedSet/NSURLRequest/
NSIndexSet/NSCharacterSet/NSParagraphStyle/NSAttributedString
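The production rule itself is not shown in this article. As a rough sketch of the idea only, the Swift code below flags subscript assignments on properties whose declared OC type is in the list above. Everything here is a hypothetical simplification: the declaredTypes table is assumed to have been extracted from OC headers by an earlier step, the function name is made up, and the matching is line-based text search, whereas a real rule would work on the AST.

```swift
import Foundation

// Immutable OC classes from the list above.
let immutableTypes: Set<String> = [
    "NSDictionary", "NSSet", "NSData", "NSArray", "NSString", "NSOrderedSet",
    "NSURLRequest", "NSIndexSet", "NSCharacterSet", "NSParagraphStyle",
    "NSAttributedString",
]

// Returns the 1-based numbers of lines that assign through a subscript on a
// property whose declared OC type is immutable, e.g. `obj.cachedData[k] = v`.
// `declaredTypes` maps property names to OC type names (assumed input).
func findImmutableMutations(in source: String,
                            declaredTypes: [String: String]) -> [Int] {
    var flagged: [Int] = []
    let lines = source.components(separatedBy: "\n")
    for (index, line) in lines.enumerated() {
        for (property, type) in declaredTypes where immutableTypes.contains(type) {
            // Everything after ".property[" is the subscript and the rest
            // of the statement.
            let parts = line.components(separatedBy: ".\(property)[")
            guard parts.count > 1 else { continue }
            // An assignment looks like `key] = value`. A comparison such as
            // `key] == value` does not contain "] = " and is skipped.
            if parts[1].contains("] = ") {
                flagged.append(index + 1)
                break
            }
        }
    }
    return flagged
}

let source = """
class SwiftCrash: NSObject {
  func execute() {
    TestObject.cachedData["key"] = "value"
    let missing = TestObject.cachedData["a"] == nil
  }
}
"""
let hits = findImmutableMutations(in: source,
                                  declaredTypes: ["cachedData": "NSDictionary"])
print(hits) // [3]
```

Text matching like this misses assignments written without spaces and mutations made through methods rather than subscripts, which is one reason an AST-based check is the better fit for a real code base.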

Afterword

At this point, the compiler crash has been resolved. Recently, while upgrading Xcode to version 12.5, I ran into another compiler crash that did not report a specific source file. I used the same approach to locate and fix it. Once I have analyzed in depth why the abnormal SILInstruction is generated, I will summarize the findings in another article.

In addition, I have submitted a bug report for the Swift compiler and attached a minimum reproducible demo, which can be downloaded here:

Bugs.swift.org/browse/SR-1…

Join us

We are the team responsible for basic capability development and new technology exploration for the Tiktok client. We focus on engineering and business architecture, R&D tools, and build systems to support rapid business iteration while ensuring the R&D efficiency and engineering quality of a large team. We continue to work on performance and stability and strive to provide the best possible foundation experience for hundreds of millions of users around the world.

If you are passionate about technology, you are welcome to join the Douyin basic technology team and help us build a global app with hundreds of millions of users. We are currently hiring in Shenzhen, Beijing, Shanghai, and Hangzhou.

Email address: [email protected], email subject: Name – Working years – Tiktok – Basic Technology -iOS/Android.