Background
After Douyin adopted Swift, compilation occasionally failed with Segmentation fault: 11 or Illegal instruction: 4, both in CI/CD and locally. Recompiling made the errors go away.
Because the crash is thrown by the compiler itself, the reported error varies, and the failure does not reproduce on every build, we had no lead for some time. Many similar errors are reported online, but Segmentation fault is a generic error for invalid memory access, so those reports are of little use as a reference. We talked to teams inside and outside the company who had hit similar errors, but their causes were all different, so there was little to learn from them.
After the Swift libraries were turned into binaries, the related code no longer takes part in local compilation and the error became much rarer locally. However, CI/CD and the repository binarization tasks still build from source. Whenever the problem occurs, someone has to retry manually, which is tedious and hurts efficiency. We therefore dug into the compiler to look for a solution.
Crash stack
Conclusion
In short, the Swift code treats an NSDictionary property, declared as a class property in Objective-C (OC), as a Swift dictionary and modifies it, so an immutable value is used as a mutable one. The compiler then either fails to verify a SILInstruction and calls abort() to terminate the process, or crashes with EXC_BAD_ACCESS.
Preparation
Compiling Swift
Because the error reproduces locally, check out swift-5.3.2-RELEASE, which matches the local toolchain version. We recommend VSCode for debugging and Ninja for building.
Ninja is a small build system focused on speed.
Notes
- Reserve 50 GB of disk space in advance
- The first build takes about an hour and keeps the CPU almost fully loaded
Download and compile the source code
brew install cmake ninja
mkdir swift-source
cd swift-source
git clone git@github.com:apple/swift.git
cd swift/utils
./update-checkout --tag swift-5.3.2-RELEASE --clone
./build-script
The main directory
Extract compilation parameters
We extracted the relevant code from the Douyin project and, after reproducing the compilation error locally, copied the compilation parameters from Xcode:
VSCode debugging
Choose an LLDB extension and configure launch.json as follows, using CodeLLDB as an example.
The args array holds the compilation parameters extracted in the previous step. Wrap each parameter in double quotation marks and separate the parameters with commas.
{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "lldb",
      "request": "launch",
      "name": "Debug",
      "program": "${workspaceFolder}/build/Ninja-DebugAssert/swift-macosx-x86_64/bin/swift",
      "args": ["-frontend", "-c", "-primary-file" /* and other params */],
      "cwd": "${workspaceFolder}"
    }
  ]
}
SIL
LLVM
Before going into SIL, here is a brief introduction to LLVM. The classic LLVM architecture, shown in the following figure, is divided into a Frontend, an Optimizer, and a Backend. Supporting a new language only requires implementing a new front end, and supporting a new architecture only requires implementing a new back end. The hub connecting the front and back ends is the Intermediate Representation (IR). IR is independent of both the programming language and the machine architecture, so optimizations in the IR phase can be written once in a general form.
Frontend
The front end performs lexical analysis, syntactic analysis, and semantic analysis to build the AST, then runs intermediate code generation and related steps to produce IR.
IR
Format
IR is the bridging language between the front and back ends of LLVM. It has three main formats:
- Human-readable text format, with the .ll extension
- Bitcode format, with the .bc extension
- In-memory format used at run time
These three formats are completely equivalent.
SSA
LLVM IR and SIL are both in SSA (Static Single Assignment) form. In SSA form every variable must be defined before it is used and is assigned exactly once. This makes data flow explicit, which allows more efficient, deeper, and more specialized optimizations.
As shown in the figure below, once the code is in SSA form each variable is assigned only once, so it is easy to see that y1 = 1 is never used and can be removed by optimization.
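The same idea can be written out in code. The following Swift sketch is illustrative only. It assumes the source program is y = 1; y = 2; x = y (a guess that is consistent with the y1 = 1 statement mentioned above) and handles only straight-line code without branches. The Assignment type and both functions are hypothetical names, not part of LLVM or the Swift compiler.

```swift
// Hypothetical straight-line program: each statement assigns `target`,
// reading the variables listed in `uses` (empty for a constant).
struct Assignment {
    let target: String
    let uses: [String]
}

// Give every assignment a fresh version (y -> y1, y2, ...) and rewrite
// each use to the latest version of the variable it reads.
func toSSA(_ program: [Assignment]) -> [Assignment] {
    var version: [String: Int] = [:]
    var result: [Assignment] = []
    for stmt in program {
        let renamedUses = stmt.uses.map { name in "\(name)\(version[name, default: 0])" }
        let next = version[stmt.target, default: 0] + 1
        version[stmt.target] = next
        result.append(Assignment(target: "\(stmt.target)\(next)", uses: renamedUses))
    }
    return result
}

// In SSA form, a definition that no statement reads and that is not
// live at the end of the program is dead. This is a single pass and
// does not chase chains of dead definitions.
func deadDefinitions(_ ssa: [Assignment], liveOut: Set<String>) -> [String] {
    var used = liveOut
    for stmt in ssa { used.formUnion(stmt.uses) }
    return ssa.map { $0.target }.filter { !used.contains($0) }
}

// y = 1; y = 2; x = y
let program = [
    Assignment(target: "y", uses: []),
    Assignment(target: "y", uses: []),
    Assignment(target: "x", uses: ["y"]),
]
let ssa = toSSA(program)
let dead = deadDefinitions(ssa, liveOut: ["x1"])
print(ssa.map { $0.target }) // ["y1", "y2", "x1"]
print(dead)                  // ["y1"]
```

Because each name has exactly one definition, finding the removable assignment reduces to checking whether the name appears in any use list.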
Structure
IR is organized into Modules, each roughly equivalent to a source file. A Module contains global variables and Functions. A Function corresponds to a function in the source code, including its declaration and implementation, parameters, and return value. The most important part of a Function is its Basic Blocks.
A Basic Block (BB) is a node in the function's control flow graph. It is a sequence of Instructions and must end with a terminator instruction, which ends the block's execution by branching or by returning from the function.
An Instruction corresponds to a single operation and is the basic unit of program execution.
Optimizer
The optimizer optimizes IR by invoking and executing various passes. A pass is code that traverses the IR and processes it. LLVM has a number of built-in passes, and developers can write custom passes for specific purposes, such as instrumentation that collects statistics.
Xcode Optimization Level
In Xcode, the optimization level is selected under Build Settings > Apple Clang - Code Generation > Optimization Level. -O0 means no optimization, that is, no optimization pass is invoked. The other optimization levels invoke the corresponding passes.
Backend
The back end converts IR into machine code for the target CPU architecture.
Swiftc
Unlike OC, which uses Clang as its compiler front end, Swift has its own compiler front end, swiftc, as shown below.
This is where LLVM's three-phase design pays off: supporting a new language only requires implementing a compiler front end.
Compared with Clang, swiftc adds a processing stage for SIL. SIL is a high-level intermediate language that Swift introduces to enable higher-level optimizations.
Swift compilation process
Swift source code goes through lexical analysis, syntactic analysis, and semantic analysis to produce the AST. SILGen takes the AST and generates SIL, which at this stage is called Raw SIL. After analysis and optimization, Raw SIL becomes Canonical SIL. Finally, IRGen converts Canonical SIL into LLVM IR, which is handed to the optimizer and the back end.
SIL instruction
SIL assumes an unlimited number of virtual registers, named % followed by a number, such as %0 and %1.
- alloc_stack: Allocates stack memory
- apply: Calls a function with the given arguments
- load: Loads the value at the specified address from memory
- function_ref: Creates a reference to a SIL function
For a detailed description of the SIL instructions, refer to the official documentation.
Identifier
LLVM IR has two basic kinds of identifiers:
- Global identifiers: functions, global variables, and so on, starting with @
- Local identifiers: register names, types, and so on, starting with %, where % followed by a number denotes an unnamed variable
In SIL, identifiers begin with @ or %:
- SIL function names are written as @ followed by letters and digits, and are usually mangled
- SIL values are written as % followed by letters and digits, and refer to an instruction or a basic block parameter
- @convention(swift): The calling convention used by Swift functions by default
- @convention(c) and @convention(objc_method): The calling conventions of C and OC, respectively
- @convention(method): The implementation of a Swift instance method
- @convention(witness_method): The implementation of a Swift protocol method
SIL structure
SIL implements a structure similar to that of LLVM IR, with its own SILModule, SILFunction, SILBasicBlock, and SILInstruction.
Debugging process
Reproducing the crash
With the compilation parameters configured as described in the preparation steps, start the build to reproduce the crash. Both crashes reproduce, as shown in the following figure: abort() produces the Illegal instruction: 4 error mentioned earlier, and EXC_BAD_ACCESS produces Segmentation fault: 11. Because the upper frames of the two stacks are identical, we take the former as the example for analysis.
Stack analysis
The stack trace shows that, after the SILFunction is generated, postEmitFunction runs to verify it. SILVerifier walks the function layer by layer: it iterates over and verifies each basic block (visitSILBasicBlock), then iterates over the SILInstructions inside each block (visitSILInstruction).
While determining the kind of a SILInstruction, the call to getKind() returns an invalid value, which triggers the crash.
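The traversal described above can be modeled in a few lines. The following Swift sketch is only a simplified model, not the compiler's actual C++ SILVerifier: Kind, Instruction, BasicBlock, Function, VerifyError, and verify are all hypothetical names, and the .invalid case stands in for an instruction whose kind cannot be read. It also remembers the last instruction that verified cleanly, which is one way to narrow down where a corrupt instruction sits.

```swift
// Hypothetical, simplified stand-ins for SILInstruction, SILBasicBlock and
// SILFunction. None of these types or names come from the Swift compiler.
enum Kind {
    case allocStack, functionRef, apply, load, returnInst
    case invalid // models an instruction whose kind cannot be read
}

struct Instruction {
    let kind: Kind
    let text: String
    var isTerminator: Bool { kind == .returnInst }
}

struct BasicBlock {
    let instructions: [Instruction]
}

struct Function {
    let name: String
    let blocks: [BasicBlock]
}

enum VerifyError: Error, Equatable {
    // Carries the text of the last instruction that verified cleanly.
    case invalidInstruction(previous: String?)
    case missingTerminator(blockIndex: Int)
}

// Walk every block, then every instruction in the block, remembering the
// previous instruction so that a failure can be located.
func verify(_ function: Function) throws {
    for (blockIndex, block) in function.blocks.enumerated() {
        var previous: String? = nil
        for instruction in block.instructions {
            if instruction.kind == .invalid {
                throw VerifyError.invalidInstruction(previous: previous)
            }
            previous = instruction.text
        }
        // A basic block must end with a terminator instruction.
        guard let last = block.instructions.last, last.isTerminator else {
            throw VerifyError.missingTerminator(blockIndex: blockIndex)
        }
    }
}

let broken = Function(name: "execute", blocks: [
    BasicBlock(instructions: [
        Instruction(kind: .apply, text: "%33 = apply %32<AnyHashable, Any>(%13, %11, %24)"),
        Instruction(kind: .invalid, text: ""),
    ])
])

var failure: VerifyError? = nil
do {
    try verify(broken)
} catch let error as VerifyError {
    failure = error
    print("verification failed: \(error)")
} catch {
    print("unexpected error: \(error)")
}
```

The real verifier aborts the process instead of throwing. The sketch throws so that the failure and the preceding instruction can be inspected.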
Abnormal SIL
- Because the SILInstruction is already invalid at this point, it is hard to tell which instruction was being verified when the failure occurred, so we changed the SILInstruction traversal to print the contents of the previous instruction.
- Run the following commands to rebuild incrementally:
cd build/Ninja-DebugAssert/swift-macosx-x86_64
ninja
The output printed when the crash reproduces is shown below:
Debugging tip: many LLVM classes implement dump(), which prints their contents for debugging.
// function_ref Dictionary.subscript.setter
%32 = function_ref @$sSDyq_Sgxcis : $@convention(method) <τ_0_0, τ_0_1 where τ_0_0 : Hashable> (@in Optional<τ_0_1>, @in τ_0_0, @inout Dictionary<τ_0_0, τ_0_1>) -> () // user: %33
%33 = apply %32<AnyHashable, Any>(%13, %11, %24) : $@convention(method) <τ_0_0, τ_0_1 where τ_0_0 : Hashable> (@in Optional<τ_0_1>, @in τ_0_0, @inout Dictionary<τ_0_0, τ_0_1>) -> ()
%34 = load [take] %24 : $*Dictionary<AnyHashable, Any> // users: %43, %37
Normal SIL
The swiftc -emit-silgen command generates Raw SIL. Because this class references an OC file, the bridging header must also be passed as a compiler parameter. The complete command is as follows:
swiftc -emit-silgen /Users/cs/code/ThirdParty/Swift_MVP/Swift_MVP/SwiftCrash.swift -o test.sil -import-objc-header /Users/cs/code/ThirdParty/Swift_MVP/Swift_MVP/Swift_MVP-Bridging-Header.h
An excerpt of the SIL is as follows:
%24 = alloc_stack $Dictionary<AnyHashable, Any> // users: %44, %34, %33, %31
%25 = metatype $@objc_metatype TestObject.Type // users: %40, %39, %27, %26
%34 = load [take] %24 : $*Dictionary<AnyHashable, Any> // users: %42, %36
%35 = function_ref @$sSD10FoundationE19_bridgeToObjectiveCSo12NSDictionaryCyF : $@convention(method) <τ_0_0, τ_0_1 where τ_0_0 : Hashable> (@guaranteed Dictionary<τ_0_0, τ_0_1>) -> @owned NSDictionary // user: %37
%36 = begin_borrow %34 : $Dictionary<AnyHashable, Any> // users: %38, %37
%37 = apply %35<AnyHashable, Any>(%36) : $@convention(method) <τ_0_0, τ_0_1 where τ_0_0 : Hashable> (@guaranteed Dictionary<τ_0_0, τ_0_1>) -> @owned NSDictionary // users: %41, %40
SIL analysis
We analyze the normal SIL instruction by instruction:
- Allocate stack memory of type Dictionary<AnyHashable, Any> and store its address in register %24, which is used by registers %44, %34, %33, and %31.
- %25 holds the type TestObject.Type, that is, the metatype of TestObject.
- Load the value at %24 into %34 and destroy the value at %24.
- Create a reference to the function _bridgeToObjectiveC() -> NSDictionary and store it in %35.
- Because the function name is mangled, first demangle it, as shown in the figure below, to identify the function. @convention(method) indicates a Swift instance method. It has two generic parameters, and the first one, τ_0_0, conforms to the Hashable protocol.
- Produce a value of the same type as %34 and store it in %36. %34 stays alive until %36 is no longer used.
- Call the function stored in %35 with %36 as the argument. It returns an NSDictionary, which is stored in %37. The call converts the Dictionary into an NSDictionary.
Dawn breaks
We can see that execution of the bridging method _bridgeToObjectiveC() failed, so we looked at the source code. It turned out that an immutable OC NSDictionary was bridged to a Swift Dictionary, treated as mutable, and had its contents modified. Although code like this may cause logic errors, it should not crash the compiler, so the crash is a bug in the compiler itself. More interestingly, the compiler crashes only if the property is declared as a class property in OC.
class SwiftCrash: NSObject {
    func execute() {
        // compiler crash
        TestObject.cachedData[""] = ""
    }
}
@interface TestObject : NSObject
@property (strong, nonatomic, class) NSDictionary *cachedData;
@end
Solution
Modifying the source code
With the cause identified, the fix is simple: change the NSDictionary in the offending code to NSMutableDictionary.
Running the Swift compiler on the source code again produces no error.
After the Douyin source code was changed in the same way, the compiler crash no longer occurred and the problem was fixed.
Static analysis
A potential problem
Although an NSDictionary can be bridged to a Swift Dictionary and used normally, modifying the immutable object from Swift creates a new object and leaves the original object untouched. The test code and its output are as follows:
We can see that the contents of the variable temp have not changed, so the modification made in Swift has no effect on the original object.
TestObject *t = [TestObject new];
t.cachedData = [@{@"oc":@"oc"} mutableCopy];
NSDictionary *temp = t.cachedData;
NSLog(@"before execution : temp %p: %@",temp,temp);
NSLog(@"before execution : cachedData %p: %@",t.cachedData,t.cachedData);
[[[SwiftDataMgr alloc] init] executeWithT:t];
NSLog(@"after execution : temp %p: %@",temp,temp);
NSLog(@"after execution : cachedData %p: %@",t.cachedData,t.cachedData);
class SwiftDataMgr: NSObject {
    @objc
    func execute(t: TestObject) {
        t.cachedData["swift"] = "swift"
    }
}
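The behaviour can be modeled without Objective-C at all. The following pure-Swift sketch is an analogy rather than the real bridging mechanism: ImmutableBox is a hypothetical stand-in for an immutable NSDictionary, the backing property is a hypothetical name for the storage that OC code would hold on to, and the computed cachedData property imitates a bridged property whose getter copies the contents into a Swift Dictionary and whose setter stores a new object.

```swift
// Hypothetical stand-in for an immutable Objective-C NSDictionary (a reference type).
final class ImmutableBox {
    let contents: [String: String]
    init(_ contents: [String: String]) { self.contents = contents }
}

final class TestObject {
    // The object that other code may already hold a reference to.
    var backing = ImmutableBox(["oc": "oc"])

    // Imitates the bridged property: reading copies the contents into a Swift
    // Dictionary (a value type), writing wraps the new value in a new object.
    var cachedData: [String: String] {
        get { backing.contents }
        set { backing = ImmutableBox(newValue) }
    }
}

let t = TestObject()
let temp = t.backing              // like `NSDictionary *temp = t.cachedData;`
t.cachedData["swift"] = "swift"   // get a copy, mutate the copy, set a new object

print(temp.contents.count)        // 1
print(t.cachedData.count)         // 2
print(temp === t.backing)         // false
```

The subscript assignment never touches the object that temp refers to. It replaces the stored object, which is why the earlier reference still sees the old contents.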
The new rule
We added a rule to the static checks on the Douyin source code that detects any modification of an immutable OC class from Swift. This prevents both the compiler crash and the potential logic errors.
The classes covered by the check are:
NSDictionary/NSSet/NSData/NSArray/NSString/NSOrderedSet/NSURLRequest/
NSIndexSet/NSCharacterSet/NSParagraphStyle/NSAttributedString
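The article does not show how the rule is implemented. As a rough illustration of the idea only, the following Swift sketch scans source text line by line for subscript writes to properties known to have an immutable OC type. Everything in it is an assumption: the immutableProperties set, the subscriptWrites function, and the text-matching approach are hypothetical, and a production rule would more likely work on the AST and collect property names from the headers.

```swift
import Foundation

// Hypothetical input: names of properties whose Objective-C declaration uses an
// immutable type such as NSDictionary or NSArray.
let immutableProperties: Set<String> = ["cachedData"]

// Returns the 1-based numbers of lines containing `.<property>[...] = ...`.
// Limitation: only the first `]` after the property is considered, so nested
// subscripts such as `cachedData[keys[0]] = v` are not detected.
func subscriptWrites(in source: String, properties: Set<String>) -> [Int] {
    var flagged: [Int] = []
    let lines = source.components(separatedBy: "\n")
    for (index, line) in lines.enumerated() {
        for name in properties {
            guard let access = line.range(of: ".\(name)[") else { continue }
            let rest = line[access.upperBound...]
            guard let close = rest.range(of: "]") else { continue }
            let afterClose = rest[close.upperBound...].trimmingCharacters(in: .whitespaces)
            // An assignment starts with "=" but not with "==".
            if afterClose.hasPrefix("=") && !afterClose.hasPrefix("==") {
                flagged.append(index + 1)
                break
            }
        }
    }
    return flagged
}

let sample = """
TestObject.cachedData["k"] = "v"
let value = TestObject.cachedData["k"]
if TestObject.cachedData["k"] == nil { }
"""
print(subscriptWrites(in: sample, properties: immutableProperties)) // [1]
```

Only the first line is flagged: the second line reads through the subscript, and the third compares with == rather than assigning.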
Afterword
At this point, the compiler crash has been resolved. More recently, when upgrading Xcode to version 12.5, we hit another compiler crash for which no specific file was reported. We used the same approach to find and fix the error. Once we have analyzed in depth why the invalid SILInstruction is generated, we will summarize the findings in another article.
In addition, we have submitted a bug report for the Swift compiler with a minimal reproducible demo attached, which can be downloaded here:
Bugs.swift.org/browse/SR-1…
Join us
We are the team responsible for basic capability development and new technology exploration for the Douyin client. We focus on engineering and business architecture, R&D tools, and build systems, supporting rapid business iteration while safeguarding the R&D efficiency and engineering quality of a large team. We keep working on performance and stability, and strive to provide the best possible foundation for hundreds of millions of users around the world.
If you are passionate about technology, you are welcome to join the Douyin basic technology team and help us build a global app with hundreds of millions of users. We are currently hiring in Shenzhen, Beijing, Shanghai, and Hangzhou.
Email address: [email protected], email subject: Name – Working years – Tiktok – Basic Technology -iOS/Android.