Preface
Language types
There are many dimensions along which programming languages can be classified. Along the compilation/execution dimension, they fall into:
- Compiled languages
  - C++, Objective-C, Swift, Kotlin
  - The compiler generates machine code ahead of time, which the CPU can execute directly
  - 👍 Higher execution efficiency
  - 👎 Slower edit/compile/debug cycle
- Interpreted languages (scripting languages)
  - JavaScript, Python
  - No ahead-of-time compilation; at execution time an interpreter translates the code into something the CPU can execute
  - 👍 Convenient to write and debug
  - 👎 Lower execution efficiency
The compilation workflows of compiled and interpreted languages are shown in the figure below.
As the figure shows, a compiled language compiles all of the code before it is run, and what finally runs is the compiled executable. This model is called AOT (ahead-of-time) compilation.
An interpreted language compiles and executes as it runs, and the file that finally "runs" is essentially the source code you wrote in the first place. This on-the-fly model is called JIT (just-in-time) compilation.
The iOS compiler
iOS used the GCC compiler until Xcode 5, in which GCC was completely dropped in favor of LLVM. LLVM consists of three modules: the compiler front end, the optimizer, and the compiler back end. Swift and Objective-C go through essentially the same pipeline and differ mainly in the compiler front end:
- Objective-C uses Clang as its compiler front end
- Swift uses its own front end, the Swift compiler
LLVM
The author: Chris Lattner
In 2000, Chris Lattner developed a suite of compiler tools called the Low Level Virtual Machine. The project has since grown to include general-purpose compilers, JIT compilers, assemblers, debuggers, static analysis tools, and more, and the abbreviation LLVM became the official name for this whole family of language-related projects. Lattner later created Clang, which allowed LLVM to challenge GCC head on.
He joined Apple in 2005 and led the migration of Apple's toolchain from GCC to LLVM.
In 2010 he began leading the development of the Swift language.
In 2017 he left Apple for Tesla, where he was in charge of self-driving software development; in the second half of the same year he joined Google's deep learning and artificial intelligence team.
Introduction to LLVM
LLVM is an open-source, modular, and reusable collection of compiler and toolchain technologies; in other words, a compiler suite.
LLVM can be used to compile Kotlin, Ruby, Python, Haskell, Java, D, PHP, Pure, Lua, and many other languages.
The LLVM core libraries also provide an optimizer and code generation support for popular CPUs.
LLVM supports both AOT (ahead-of-time) compilation and JIT (just-in-time) compilation.
In 2012, LLVM received the ACM Software System Award, joining the ranks of UNIX, WWW, TCP/IP, TeX, and Java.
LLVM IR
LLVM IR (Intermediate Representation) is the intermediate code generated during compilation, distinct from both source code and machine code. IR is a source-language-independent representation that is also independent of any particular machine architecture, which makes it the key to LLVM's optimization and code generation, and it is what sets LLVM apart from other compilers. The core functionality of LLVM is built around the IR: it is the output of the front end and the input of the back end in the LLVM compilation pipeline.
IR is similar to the JVM's Java bytecode in that both are models for expressing computation, but at different levels of abstraction. Java bytecode is higher level (more abstract) and contains many Java-like object-oriented operations; LLVM IR is lower level (closer to the machine). Because front ends for many languages can all target the IR, LLVM can provide language-independent optimizations while easily generating code for many different CPUs.
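As a first taste of what IR looks like, here is a small hand-written sketch of a function that adds two 32-bit integers (illustrative only; the exact attributes and register numbering produced by a real clang invocation will differ):

```llvm
; add two 32-bit integers and return the result
define i32 @add(i32 %a, i32 %b) {
entry:
  %sum = add nsw i32 %a, %b
  ret i32 %sum
}
```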
Why do we need IR
A compiler is divided into a front end, an optimizer, and a back end. Traditional compilers such as GCC do not completely decouple the front end from the back end, so supporting a new language or a new hardware platform requires a great deal of work. The biggest difference between LLVM and traditional compilers is that no matter which language the front end accepts, the intermediate code it generates is in the same IR format.
Traditional static compilers
The LLVM compiler
The advantage is that supporting a new programming language only requires implementing a new front end, and supporting a new hardware device only requires implementing a new back end. The optimization phase is generic: because it operates on the unified LLVM IR, it needs no changes whether we add new languages or new hardware. This is precisely the value of LLVM IR.
LLVM IR comes in three formats:
- An in-memory intermediate representation used during compilation
- A human-readable textual format stored on disk (files ending in .ll)
- A binary bitcode format stored on disk (files ending in .bc)
These three intermediate formats are completely equivalent.
Bitcode
iOS developers may not be familiar with IR, but you have surely heard of Bitcode. Bitcode is essentially the third of the three LLVM IR formats mentioned above: the binary format stored on disk (files ending in .bc).
Bitcode deserves special mention because Apple applies additional optimizations to it. Starting with Xcode 7, you can optionally submit your app's Bitcode along with the compiled product, and the submitted Bitcode is then compiled and packaged separately in Apple's cloud. In other words, even though a locally compiled IPA is submitted to the App Store, Apple ultimately repackages the app from Bitcode in the cloud, and the version end users download to their phones is the one Apple compiled in the cloud, not the one the developer compiled locally.
Here is an article on the workflow and security assessment of Xcode 7 Bitcode, which reveals how the 360 team used a trick to verify whether the app installed by Apple reviewers was compiled directly from the local build or from the cloud build produced from Bitcode, and thus uncovered a potential review-bypass loophole.
Why do we need Bitcode
One reason is that Apple can perform extra, targeted optimizations during the cloud build that are not available in the local environment. Another is that Apple can optimize the binary for the specific device the app is installed on, reducing the download size of the installation package.
For example, a locally compiled IPA contains multiple CPU architectures (armv7, arm64), and the armv7 slice is useless on an iPhone X. With Bitcode, Apple can compile per-architecture IPAs in the cloud for different devices, so the app an iPhone X downloads contains only the arm64 slice it needs.
Better still, since Bitcode is device-independent, it can be translated to any supported CPU architecture, including ones that have not been invented yet. If Apple releases a new phone with a newly designed CPU in the future, its servers can compile each app's Bitcode into an executable for the new CPU, ready for new users to download and run, without developers having to recompile, repackage, and re-upload their apps.
Clang & Swift
Clang compiler
Clang, a subproject of LLVM, is a C, C++, and Objective-C compiler aimed at replacing the traditional GCC. Clang serves as the compiler front end throughout the Objective-C compilation process, and it also participates in the Objective-C API mapping phase of Swift compilation.
Clang's main job is to produce the abstract syntax tree (AST) for the code, give the user accurate diagnostics for compilation errors, and compile the code down to LLVM IR.
Clang is known for fast compilation, modularity, simple code, readable diagnostics, a small memory footprint, and being easy to extend and reuse.
In Xcode's case, Clang builds Objective-C code about three times as fast as the GCC used before Xcode 5, and the AST it generates occupies only about one-fifth of the memory GCC's did.
Clang’s main work:
- Preprocessing: embedding macros in the appropriate locations, importing header files, removing comments, and so on (clang -E main.m)
- Lexical analysis: This is where the code is cut into tokens, such as braces, equal signs, and strings
- Syntax analysis: Verify that the syntax is correct
- Generate AST: Compose all nodes into an abstract syntax tree AST
- Static analysis: Analyze the code to see if there are problems, and give error messages and fixes
- Generate LLVM IR: CodeGen walks the syntax tree from top to bottom and translates it step by step into LLVM IR
The Swift compiler
Like Clang, the Swift compiler is mainly responsible for statically analyzing Swift source code, reporting errors, and converting it to LLVM IR; it is the front-end module of the Swift compilation pipeline. Unlike Clang, however, the Swift compiler has an extra SIL optimizer: it first lowers Swift source to an intermediate language called SIL, and then generates LLVM IR from the SIL. So the Swift compiler produces two different kinds of intermediate code during compilation. Why? This is explained in detail below.
The main work of Swift compiler:
- Parsing: The parser is responsible for generating an abstract syntax tree (AST) without any semantic or type information and for issuing warnings or errors about the syntax of the input source
- Semantic analysis: takes the parsed AST and converts it into a well-formed, fully type-checked form of the AST, emitting warnings or errors for semantic problems in the source code
- Clang importer: Imports Clang modules and maps the C or Objective-C API they export to the corresponding Swift API
- SIL generation: lowers the type-checked AST to SIL
- SIL canonicalization: performs additional dataflow diagnostics (such as detecting use of uninitialized variables)
- SIL optimizations: performs higher-level, Swift-specific optimizations on the program, including automatic reference counting optimization, devirtualization, and generic specialization
- LLVM IR generation: lowers SIL to LLVM IR
Why add the SIL layer
SIL (Swift Intermediate Language) is a high-level, Swift-specific intermediate language suitable for further analysis and optimization of Swift code. SIL sits at a higher level of abstraction than LLVM IR and is specific to the Swift language.
Because the abstraction gap between source code and LLVM IR is so large, LLVM IR is not well suited to source-level analysis and checking. Clang therefore analyzes and checks code through a CFG (control-flow graph) in its Analysis library. But the CFG itself is not precise enough, and it sits outside the main pipeline (it runs in parallel with IR generation), so CFG construction and IR generation end up partially repeating the same analysis, which is wasted work.
In the Swift compilation process, all analysis and canonicalization happen before LLVM IR is generated, and IRGen then lowers the result to LLVM IR. This avoids the repeated work and makes the whole compilation pipeline more uniform.
Also, because Swift is a strongly typed language that binds methods at compile time and calls them directly by address, method calls are not Objective-C-style message sends, so the compiler has more information to feed back-end optimization. We can therefore apply targeted Swift optimizations at the SIL level that cannot be expressed in LLVM IR.
These optimizations include:
- Critical edge splitting: splits critical edges coming from terminators that do not support arbitrary basic block arguments
- Generic specialization: analyzes calls to generic functions made with concrete types and generates new, specialized versions of those functions, then rewrites those call sites as direct calls to the specialized functions
- Devirtualization via vtables and witness tables: looks up the vtable of the relevant class or the witness table of the type for a given concrete type, and replaces indirect virtual calls with direct calls to the mapped function
- Inlining: inlines "transparent" functions
- Memory promotion: promotes alloc_box structures to alloc_stack
- Reference count optimization
- High-level domain-specific optimizations: applies specialized optimizations to basic Swift containers such as Array or String
The analyzed and checked SIL is then lowered by IRGen into LLVM IR, where it undergoes further LLVM optimizations.
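To see what devirtualization means in a language-neutral way, here is a small C sketch (my own illustration, not Swift or SIL): a call through a function-pointer table (the analogue of a vtable lookup) can be replaced by a direct call once the concrete type is known at compile time:

```c
#include <assert.h>

/* A tiny "vtable": indirect dispatch through a function pointer. */
typedef struct {
    int (*area)(int side);
} ShapeVTable;

static int square_area(int side) { return side * side; }

static const ShapeVTable square_vtable = { square_area };

/* Indirect (virtual) call: the callee is looked up at run time. */
static int area_virtual(const ShapeVTable *vt, int side) {
    return vt->area(side);
}

/* After "devirtualization": the compiler knows the concrete type,
 * so the table lookup is replaced with a direct call. */
static int area_devirtualized(int side) {
    return square_area(side);
}
```

Both functions compute the same result; the payoff is that the direct call can then be inlined and further optimized, which is impossible across an opaque function pointer.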
Hands-on
- First, let’s write a simple program with only one entry function and simple logic.
#import <Foundation/Foundation.h>
#define DEFINEEight 8
void test(int a, int b) {
int c = a + b - DEFINEEight;
}
- Once the code is written, Clang preprocesses it with the command below: embedding macros in the appropriate locations, importing headers, removing comments, and so on.
clang -E main.m
This produces the following output:
# 1 "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Foundation.framework/Headers/FoundationLegacySwiftCompatibility.h" 1 3
# 185 "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Foundation.framework/Headers/Foundation.h" 2 3
# 2 "main.m" 2
void test(int a, int b) {
    int c = a + b - 8;
}
- After preprocessing is complete, lexical analysis is carried out: the code is cut into tokens, each representing a basic element.
Token categories:
- Keywords: language keywords such as if, else, while, for
- Identifiers: variable names
- Literals: numeric values and strings
- Special symbols: operators such as +, -, *, /
The following code is used to get the lexed code.
clang -fmodules -fsyntax-only -Xclang -dump-tokens main.m
The resulting token stream:
void 'void' [StartOfLine] Loc=<main.m:4:1>
identifier 'test' [LeadingSpace] Loc=<main.m:4:6>
l_paren '(' Loc=<main.m:4:10>
int 'int' Loc=<main.m:4:11>
identifier 'a' [LeadingSpace] Loc=<main.m:4:15>
comma ',' Loc=<main.m:4:16>
int 'int' [LeadingSpace] Loc=<main.m:4:18>
identifier 'b' [LeadingSpace] Loc=<main.m:4:22>
r_paren ')' Loc=<main.m:4:23>
l_brace '{' [LeadingSpace] Loc=<main.m:4:25>
int 'int' [StartOfLine] [LeadingSpace] Loc=<main.m:5:5>
identifier 'c' [LeadingSpace] Loc=<main.m:5:9>
equal '=' [LeadingSpace] Loc=<main.m:5:11>
identifier 'a' [LeadingSpace] Loc=<main.m:5:13>
plus '+' [LeadingSpace] Loc=<main.m:5:15>
identifier 'b' [LeadingSpace] Loc=<main.m:5:17>
minus '-' [LeadingSpace] Loc=<main.m:5:19>
numeric_constant '8' [LeadingSpace] Loc=<main.m:5:21 <Spelling=main.m:2:21>>
semi ';' Loc=<main.m:5:32>
r_brace '}' [StartOfLine] Loc=<main.m:6:1>
eof '' Loc=<main.m:6:2>
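The token categories above can be sketched with a toy classifier (a deliberately simplified illustration of what a lexer does; real Clang lexing is far more involved):

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* The token categories listed in the text above. */
typedef enum { TOK_KEYWORD, TOK_IDENTIFIER, TOK_LITERAL, TOK_SYMBOL } TokenKind;

/* Classify a single, already-split token string. */
static TokenKind classify(const char *tok) {
    static const char *keywords[] = { "if", "else", "while", "for", "int", "void", "return" };
    for (size_t i = 0; i < sizeof keywords / sizeof *keywords; i++)
        if (strcmp(tok, keywords[i]) == 0)
            return TOK_KEYWORD;
    if (isdigit((unsigned char)tok[0]) || tok[0] == '"')
        return TOK_LITERAL;                 /* numbers and strings */
    if (isalpha((unsigned char)tok[0]) || tok[0] == '_')
        return TOK_IDENTIFIER;              /* variable names */
    return TOK_SYMBOL;                      /* +, -, *, /, braces, ... */
}
```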
- Next comes syntax analysis, which verifies that the syntax is correct. Once confirmed, all the nodes are composed into an abstract syntax tree (AST).
clang -fmodules -fsyntax-only -Xclang -ast-dump main.m
The resulting AST:
|-FunctionDecl 0x7f9a2108a0c0 <line:4:1, line:6:1> line:4:6 test 'void (int, int)'
| |-ParmVarDecl 0x7f9a21089f48 <col:11, col:15> col:15 used a 'int'
| |-ParmVarDecl 0x7f9a21089fc0 <col:18, col:22> col:22 used b 'int'
| `-CompoundStmt 0x7f9a2108a348 <col:25, line:6:1>
|   `-DeclStmt 0x7f9a2108a330 <line:5:5, col:32>
|     `-VarDecl 0x7f9a2108a1e0 <col:5, line:2:21> line:5:9 c 'int' cinit
|       `-BinaryOperator 0x7f9a2108a308 <col:13, line:2:21> 'int' '-'
|         |-BinaryOperator 0x7f9a2108a2c0 <line:5:13, col:17> 'int' '+'
|         | |-ImplicitCastExpr 0x7f9a2108a290 <col:13> 'int' <LValueToRValue>
|         | | `-DeclRefExpr 0x7f9a2108a240 <col:13> 'int' lvalue ParmVar 0x7f9a21089f48 'a' 'int'
|         | `-ImplicitCastExpr 0x7f9a2108a2a8 <col:17> 'int' <LValueToRValue>
|         |   `-DeclRefExpr 0x7f9a2108a268 <col:17> 'int' lvalue ParmVar 0x7f9a21089fc0 'b' 'int'
|         `-IntegerLiteral 0x7f9a2108a2e8 <line:2:21> 'int' 8
`-<undeserialized declarations>
For ease of view, we present the AST as a tree
Node categories:
- TranslationUnitDecl: the root node, representing a source file
- Decl: a declaration
- Expr: an expression
- Literal: a literal, a special kind of Expr
- Stmt: a statement
- After the AST is produced, the Clang Static Analyzer performs static analysis on the code.
The Clang Static Analyzer consists of an analyzer core (the analysis engine) and a set of checkers that implement specific code checks; every checker is built on the basic functionality provided by the analyzer core.
After the AST is generated, the Clang Static Analyzer runs the checkers over the code, looking for undeclared variables and other problems. You can also write new checkers to add custom checks. This makes it easy to extend the checking rules or the bug types covered, but the architecture has a downside: the more checkers there are, the slower analysis becomes, because the engine walks through all the checker callbacks after each statement.
Clang's static analyzer not only points out where the code has problems, but can also suggest several ways to fix it.
- After these steps, the intermediate LLVM IR is generated: CodeGen translates the AST from top to bottom into LLVM IR.
clang -S -fobjc-arc -emit-llvm main.m -o main.ll
The generated code looks like this (only the test function is shown here):
; Function Attrs: noinline nounwind optnone ssp uwtable
define void @test(i32, i32) #0 {
  %3 = alloca i32, align 4
  %4 = alloca i32, align 4
  %5 = alloca i32, align 4
  store i32 %0, i32* %3, align 4
  store i32 %1, i32* %4, align 4
  %6 = load i32, i32* %3, align 4
  %7 = load i32, i32* %4, align 4
  %8 = add nsw i32 %6, %7
  %9 = sub nsw i32 %8, 8
  store i32 %9, i32* %5, align 4
  ret void
}
Feeling dizzy just looking at it? Actually, IR is not that hard to read: with a little knowledge of IR syntax you can follow the logic.
LLVM IR syntax:
- ; a comment
- @ a global identifier
- % a local identifier
- alloca allocates memory on the stack
- i32 a 32-bit (4-byte) integer
- align memory alignment
- store writes a value to memory
- load reads a value from memory
- icmp compares two integer values and returns a boolean
- br branches: jumps to a label based on a condition, or unconditionally like a goto
- indirectbr jumps indirectly to a label taken from a computed address (typically out of an array), so the destination is variable and determined at run time
- label a code label
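Here is a small hand-written example that exercises most of the instructions above (again illustrative, not actual compiler output, which would look busier):

```llvm
; return the larger of two 32-bit integers
define i32 @max(i32 %a, i32 %b) {
entry:
  %cell = alloca i32, align 4             ; reserve a 4-byte stack slot
  %cmp = icmp sgt i32 %a, %b              ; signed compare: a > b ?
  br i1 %cmp, label %pick_a, label %pick_b

pick_a:
  store i32 %a, i32* %cell, align 4       ; write a into the stack slot
  br label %done

pick_b:
  store i32 %b, i32* %cell, align 4
  br label %done

done:
  %result = load i32, i32* %cell, align 4 ; read the chosen value back
  ret i32 %result
}
```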
For those of you who have studied machine code, IR code actually looks a lot like machine code.
- LLVM performs its optimization work here. You can also set the optimization level (-O1 / -O3 / -Os) in Xcode's Build Settings, and even write your own passes.
A pass is one node in LLVM's optimization pipeline; together, the passes make up LLVM's entire optimization and transformation stage.
We can control the optimization level by adding an -O2 / -O3 flag to the previous command:
clang -O3 -S -fobjc-arc -emit-llvm main.m -o main.ll
Here is a small example that shows a concrete optimization LLVM applies to IR. Let's write a function containing a loop (the counter starts at 0 and the function is named demo, to match the IR output shown below):
void demo() {
    int i = 0;
    while (i < 10) {
        i++;
        printf("%d", i);
    }
}
Generating IR with the -O3 flag gives:
; Function Attrs: nounwind ssp uwtable
define void @demo() local_unnamed_addr #0 {
  %1 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), i32 1), !clang.arc.no_objc_arc_exceptions !9
  %2 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), i32 2), !clang.arc.no_objc_arc_exceptions !9
  %3 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), i32 3), !clang.arc.no_objc_arc_exceptions !9
  %4 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), i32 4), !clang.arc.no_objc_arc_exceptions !9
  %5 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), i32 5), !clang.arc.no_objc_arc_exceptions !9
  %6 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), i32 6), !clang.arc.no_objc_arc_exceptions !9
  %7 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), i32 7), !clang.arc.no_objc_arc_exceptions !9
  %8 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), i32 8), !clang.arc.no_objc_arc_exceptions !9
  %9 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), i32 9), !clang.arc.no_objc_arc_exceptions !9
  %10 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), i32 10), !clang.arc.no_objc_arc_exceptions !9
  ret void
}
Now change the loop condition from (i < 10) to (i < 100):
; Function Attrs: nounwind ssp uwtable
define void @demo() local_unnamed_addr #0 {
  br label %1

; <label>:1:                              ; preds = %1, %0
  %2 = phi i32 [ 0, %0 ], [ %3, %1 ]
  %3 = add nuw nsw i32 %2, 1
  %4 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0), i32 %3), !clang.arc.no_objc_arc_exceptions !9
  %5 = icmp eq i32 %3, 100
  br i1 %5, label %6, label %1

; <label>:6:                              ; preds = %1
  ret void
}
We can see that when the loop count is small, the generated IR simply unrolls the loop completely, writing out every iteration to maximize execution efficiency; when the loop count is large, it keeps the loop structure instead. Beyond these simple optimizations, CodeGen also performs a series of Objective-C-specific lowering steps:
- Generates the metadata for classes, methods, member variables, and so on, and places it in the corresponding Mach-O sections
- Synthesizes the non-fragile ABI OBJC_IVAR_$_ ivar offset constants
- Translates each ObjCMessageExpr into a call to the appropriate flavor of objc_msgSend (objc_msgSendSuper for super calls)
- Synthesizes the setter and getter for @property declarations marked strong, weak, copy, or atomic, and handles @synthesize
- Generates the block_layout data structure, handles __block and __weak captures, and emits the block _invoke function
- Handles ARC: inserts ARC calls such as objc_storeStrong and objc_storeWeak; wraps ObjCAutoreleasePoolStmt with automatically inserted objc_autoreleasePoolPush/Pop; automatically adds [super dealloc]; and synthesizes a .cxx_destructor method for the class so that each ivar is released automatically
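As an illustration of what the ARC store routines do, here is a toy C model of objc_storeStrong's documented retain-new, assign, release-old sequence (a sketch with a fake refcount type, not the real Objective-C runtime):

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an Objective-C object with a reference count. */
typedef struct { int refcount; } Obj;

static void retain(Obj *o)  { if (o) o->refcount++; }
static void release(Obj *o) { if (o) o->refcount--; }

/* Models the documented behavior of objc_storeStrong(location, newValue):
 * retain the new value, store it, then release the old value. */
static void store_strong(Obj **location, Obj *newValue) {
    Obj *old = *location;
    retain(newValue);
    *location = newValue;
    release(old);
}
```

Note the ordering: the new value is retained before the old one is released, so that storing the same object back into the slot cannot drop its last reference mid-assignment.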
- If Bitcode is turned on, a Bitcode file is generated for Apple to optimize further:
clang -emit-llvm -c main.m -o main.bc
- Generate the assembly:
clang -S -fobjc-arc main.m -o main.s
- Generate the object file:
clang -fmodules -c main.m -o main.o
- Generate the executable:
clang main.o -o main
- Execute it:
./main
Conclusion
The LLVM compilation process:
- Preprocessing
- Lexical analysis
- Syntax analysis
- Generate the AST
- Static analysis
- Generate LLVM IR
- Compiler optimization
- Bitcode (optional)
- Generate assembly
- Generate the object file
- Generate the executable
The division of labor in LLVM
Compiler front end
In iOS compilation, Clang is the compiler front end. It contains the lexer, the parser, the static analyzer, the IR generator, and a number of other components, which together give LLVM capabilities such as preprocessing, parsing, semantic analysis, static analysis, error handling, and IR generation.
Compiler optimization
IR is the output of the compiler front end and the input of the compiler back end, linking the two halves of the LLVM pipeline. Arguably, the core functionality of LLVM is built around IR. LLVM's optimizer works directly on IR through a variety of passes, which makes the optimization process independent of both language and platform and greatly improves development efficiency.
Compiler back end
In iOS compilation, the compiler back end is the set of back ends provided by LLVM itself. It includes the machine code generators, linkers, and other tools that perform machine-dependent optimization on the IR and emit machine code.
What the compiler back end produces
The LLVM machine code generators produce different machine code for different target architectures.
Practical applications
With a deeper understanding of LLVM, Clang, and the development tools they provide, we can implement many interesting features. For example, with libclang and LibTooling we can analyze syntax trees and perform language conversion (e.g. converting Objective-C to Swift). We can develop our own Clang plugins to run personalized checks on our code. We can write passes for custom code optimization or code obfuscation. We can even develop a new language of our own: all it takes is a compiler front end that converts the new language into LLVM IR.
Understanding the internals and details of the compilation process also gives us new ideas when solving problems. OCEval is a third-party open-source iOS library for executing hotfixes dynamically. Similar to JavaScript's eval function, OCEval converts Objective-C code in string form into actually running code. One of its most important implementation details is a simple hand-rolled lexer and parser that turn the string into an AST and finally extract the method names, parameters, and so on needed for execution.
References
- Clang LLVM for iOS, by Deming
- The principles of the iOS compilation process and their applications, by Face the Wind
- Swift's high-level intermediate language: SIL, by SEA_biscute
LLVM
Clang
Swift