History of compilers
How compilers are generally constructed
A traditional compiler is usually divided into three parts: a front end, an optimizer, and a back end. During compilation, the front end performs lexical and syntax analysis and transforms the source code into an abstract syntax tree (AST). The optimizer takes the intermediate code produced by the front end and optimizes it to make it more efficient. The back end translates the optimized intermediate code into machine code for each target platform. As shown in the figure:
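The three phases can be made concrete with a toy example. The following is only a minimal sketch, not how any real compiler is written: it assumes a tiny expression language with integers, variables, `+` and `*`, tokens separated by spaces, and no error handling. The names `Expr`, `Parser`, `optimize`, and `emit` are hypothetical and chosen for illustration.

```swift
// Toy three-phase compiler for expressions such as "2 * 3 + x".
// Every name in this file is hypothetical and exists only for illustration.

// Shared intermediate form: a small expression tree (a stand-in for an AST/IR).
indirect enum Expr: Equatable {
    case number(Int)
    case variable(String)
    case add(Expr, Expr)
    case mul(Expr, Expr)
}

// Front end: lexical analysis (split into tokens) and syntax analysis (build the tree).
struct Parser {
    private var tokens: [String]
    private var pos = 0

    init(_ source: String) {
        // Assumption: tokens are separated by single spaces.
        tokens = source.split(separator: " ").map(String.init)
    }

    // expr := term ("+" term)*
    mutating func parseExpr() -> Expr {
        var lhs = parseTerm()
        while pos < tokens.count, tokens[pos] == "+" {
            pos += 1
            lhs = .add(lhs, parseTerm())
        }
        return lhs
    }

    // term := atom ("*" atom)*
    private mutating func parseTerm() -> Expr {
        var lhs = parseAtom()
        while pos < tokens.count, tokens[pos] == "*" {
            pos += 1
            lhs = .mul(lhs, parseAtom())
        }
        return lhs
    }

    // atom := integer | identifier (no validation of malformed input)
    private mutating func parseAtom() -> Expr {
        let token = tokens[pos]
        pos += 1
        if let value = Int(token) { return .number(value) }
        return .variable(token)
    }
}

// Optimizer: constant folding on the tree, independent of source syntax and target.
func optimize(_ expr: Expr) -> Expr {
    switch expr {
    case .number, .variable:
        return expr
    case let .add(l, r):
        let (a, b) = (optimize(l), optimize(r))
        if case let .number(x) = a, case let .number(y) = b { return .number(x + y) }
        return .add(a, b)
    case let .mul(l, r):
        let (a, b) = (optimize(l), optimize(r))
        if case let .number(x) = a, case let .number(y) = b { return .number(x * y) }
        return .mul(a, b)
    }
}

// Back end: emit instructions for an imaginary stack machine.
func emit(_ expr: Expr) -> [String] {
    switch expr {
    case let .number(n): return ["push \(n)"]
    case let .variable(name): return ["load \(name)"]
    case let .add(l, r): return emit(l) + emit(r) + ["add"]
    case let .mul(l, r): return emit(l) + emit(r) + ["mul"]
    }
}

var parser = Parser("2 * 3 + x")
let ast = parser.parseExpr()      // front end
let optimized = optimize(ast)     // optimizer folds 2 * 3 into 6
let code = emit(optimized)        // back end
print(code.joined(separator: "\n"))
```

Because the optimizer and the back end only see the tree, a second front end for another source syntax, or a second back end for another target, could be added without touching the other phases. That separation is the point of the three-phase design.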
GCC (Ancient Times)
GCC (GNU Compiler Collection) is a compiler suite developed by the GNU Project. It was originally called the GNU C Compiler because it was designed to handle only C. GCC quickly expanded to handle other languages such as C++, Fortran, Pascal, Objective-C, Java, and Ada.
Early Objective-C programmers all worked with the GCC compiler, so why did Apple move away from GCC?
GCC follows the three-phase design reasonably well, implements many front ends, and supports many languages. Its fatal flaw is that it is built as one monolithic executable and offers no interface through which other tools or language developers can reuse its code. Even though GCC is open source, reusing its source code is difficult. Several further problems pushed Apple away:
- GCC's Objective-C front end fell short: Apple did not maintain the GCC front end, so adding syntax hints and other features meant asking the GCC project to do it.
- GCC's support for plug-ins, tools, and IDEs is weak: features such as auto-completion, code hints, warnings, and static analysis require the IDE to call into lower-level compiler functionality and expose the results as plug-ins, and GCC does not handle this well.
- GCC's compilation speed and output quality were poor: when Apple's Clang came out, its compilation speed was reported to be about three times that of GCC, with good performance and smaller compiled files.
- Apple took back control of the toolchain (LLDB, LLD…): Apple first moved from GCC to a compiler with a GCC front end and an LLVM back end, then to the Clang + LLVM compiler, and then replaced GDB with LLDB, regaining control of the compilation toolchain step by step and laying the foundation for Swift.
LLVM (Dawn of a New Era)
LLVM adopts a standard three-phase architecture consisting of a front end, an intermediate optimizer, and a back end. The front end is responsible for parsing, validating, and diagnosing errors in the input code, and then converts the parsed code into LLVM Intermediate Representation (LLVM IR for short), which is designed to host the mid-level analyses and transformations found in the optimizer section of a compiler. As shown in the figure:
Clang/Swift + LLVM
Apple currently uses a Clang/Swift + LLVM toolchain. Clang serves as the front end for Objective-C and Objective-C++ code, while Swift uses its own independent front end (swiftc). The structure is shown as follows:
- The compilation process of Clang is shown below:
- The compilation process of swiftc is shown below:
As the figure shows, before lowering the AST to LLVM IR, swiftc generates an intermediate representation called SIL (Swift Intermediate Language), which it uses for code analysis and optimization. Reading SIL reveals much of what the language hides beneath its surface.
SwiftC
Just as Objective-C code can be rewritten into C++ with clang -rewrite-objc, we can use swiftc to generate SIL and observe Swift's underlying implementation, or to generate an executable. Here are some common commands:
- Generate executable file:
swiftc -o main.out main.swift
- Generate abstract syntax tree:
swiftc main.swift -dump-ast
- Generate Swift Intermediate Language (SIL):
swiftc main.swift -emit-sil
- Generate LLVM Intermediate Representation (LLVM IR):
swiftc main.swift -emit-ir
- Generate assembly:
swiftc main.swift -emit-assembly
Additional options are listed by running swiftc with the -h option.
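To try these commands, some input is needed. The file below is only a hypothetical main.swift, not taken from any real project: the `Shape`, `Rectangle`, and `Circle` names are invented. It mixes a protocol, a struct, and a class so that the AST, SIL, and LLVM IR dumps have a few different kinds of declarations to show.

```swift
// main.swift: hypothetical sample input for the swiftc commands.

// A protocol, so the SIL output contains protocol conformance information.
protocol Shape {
    func area() -> Double
}

// A value type conforming to the protocol.
struct Rectangle: Shape {
    let width: Double
    let height: Double
    func area() -> Double { width * height }
}

// A reference type conforming to the same protocol.
final class Circle: Shape {
    let radius: Double
    init(radius: Double) { self.radius = radius }
    func area() -> Double { 3.14159 * radius * radius }
}

// Calls through the protocol type are dispatched dynamically.
let shapes: [Shape] = [Rectangle(width: 2, height: 3), Circle(radius: 1)]
let total = shapes.reduce(0) { $0 + $1.area() }
print(total)
```

Even a file this small produces a long SIL dump, since the compiler also emits initializers, accessors, and conformance tables that never appear in the source.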
By default, swiftc writes these results to standard output. To view or save the intermediate files more easily, redirect the output to a file, like this:
swiftc -emit-sil main.swift > main.sil
When we read these intermediate files, we will see many mangled class, method, and variable names. The compiler mangles names so that every symbol is unique, which is what allows overloaded methods and polymorphism to work at the symbol level. Fortunately, the Xcode toolchain includes a swift-demangle command that turns these names back into a readable form. We can pipe the output of swiftc -emit-sil into xcrun swift-demangle like this:
swiftc -emit-sil main.swift | xcrun swift-demangle
Of course, you can also use output redirection to save the demangled result to a file:
swiftc -emit-sil main.swift | xcrun swift-demangle > main.sil
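A small example shows why mangling is needed. This is a minimal sketch with invented function names: two overloads share the base name `area`, so the compiler has to encode the module name and the parameter and return types into each symbol to tell them apart.

```swift
// Two overloads with the same base name but different signatures.
// In the emitted SIL each one gets its own mangled symbol (Swift's mangled
// names start with "$s"), and swift-demangle restores the readable signature.
func area(_ side: Int) -> Int { side * side }
func area(_ width: Double, _ height: Double) -> Double { width * height }

// The compiler picks the overload from the argument types at each call site.
let square = area(4)
let rectangle = area(2.0, 3.5)
print(square, rectangle)
```

Comparing the SIL for this file with and without the swift-demangle step makes the mapping between the two forms of each name easy to see.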
SIL files contain many unfamiliar keywords, which are explained in the official SIL documentation on GitHub. If we are working on iOS, we will inevitably reference SDK frameworks such as UIKit. In this case, we need to add some parameters so that symbols in these frameworks are found at compile time. Be sure to change the simulator version number to the version you are currently using.
swiftc -emit-sil -target x86_64-apple-ios14.2-simulator -sdk $(xcrun --show-sdk-path --sdk iphonesimulator) ViewController.swift > ViewController.sil