This article focuses on understanding the LLVM compilation process and the development of the Clang plug-in.
LLVM
LLVM is a framework for building compilers. It is written in C++ and is designed to optimize programs written in any programming language at compile time, link time, run time, and idle time, while staying open to developers and compatible with existing scripts.
Traditional compiler design
Source Code -> Frontend -> Optimizer -> Backend (Code Generator) -> Machine Code, as shown in the following figure
It is mainly divided into three parts:
1. Compiler Frontend
The task of the compiler front end is to parse the source code. It performs lexical analysis, syntax analysis, and semantic analysis, checks the source code for errors, and then builds the Abstract Syntax Tree (AST). The LLVM front end also generates the intermediate representation (IR).
2. Optimizer
The optimizer is responsible for various optimizations that improve the runtime of your code, such as eliminating redundant calculations.
3. Backend/CodeGenerator
The back end maps the code to the target instruction set. It generates machine code for the specified platform and performs machine-specific code optimizations.
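To make the optimizer's job more concrete, here is a minimal sketch of constant folding, one common way of eliminating redundant calculations. The `Expr` type and the `fold`, `makeConst`, `makeVar`, and `makeBinary` functions are hypothetical names made up for this illustration; they are not part of LLVM, whose real optimizer works on IR rather than on a tree like this one.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical expression node: a constant, a variable, or a binary + / *.
struct Expr {
    enum Kind { Const, Var, Add, Mul };
    Kind kind = Const;
    int value = 0;                    // used when kind == Const
    std::string name;                 // used when kind == Var
    std::unique_ptr<Expr> lhs, rhs;   // used when kind is Add or Mul
};

std::unique_ptr<Expr> makeConst(int v) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Const;
    e->value = v;
    return e;
}

std::unique_ptr<Expr> makeVar(const std::string &n) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Var;
    e->name = n;
    return e;
}

std::unique_ptr<Expr> makeBinary(Expr::Kind k, std::unique_ptr<Expr> l,
                                 std::unique_ptr<Expr> r) {
    auto e = std::make_unique<Expr>();
    e->kind = k;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Replace every operator whose two operands are constants with the
// constant result, working from the leaves upward.
std::unique_ptr<Expr> fold(std::unique_ptr<Expr> e) {
    if (e->kind != Expr::Add && e->kind != Expr::Mul) return e;
    e->lhs = fold(std::move(e->lhs));
    e->rhs = fold(std::move(e->rhs));
    if (e->lhs->kind == Expr::Const && e->rhs->kind == Expr::Const) {
        int l = e->lhs->value;
        int r = e->rhs->value;
        return makeConst(e->kind == Expr::Add ? l + r : l * r);
    }
    return e;
}
```

Folding `8 + 6` yields a single constant node, while `x * (1 + 2)` keeps the multiplication and only folds its right operand, because `x` is not known at compile time.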
The iOS compiler architecture
Objective-C, C, and C++ use a compiler whose front end is Clang; Swift uses the Swift front end; both use LLVM as the back end.
The design of LLVM
The most important aspect of LLVM is that it supports multiple source languages and multiple hardware architectures. The front ends and back ends are separated by a common code representation, the IR, in a way that resembles the Bridge pattern.
Clang
Clang is a sub-project of the LLVM project. It is the compiler for C, C++, and Objective-C, and within the overall LLVM architecture it is the compiler front end. Learning Clang pays off in real projects. For example, a Clang plug-in can not only check coding conventions but also perform dead-code analysis, automatic instrumentation, offline test analysis, method-name obfuscation, and so on.
A Clang plug-in is not complicated to write or use; the key is how to apply it well in your work.
The compilation process
Let’s look at the whole process with a simple example. Create a new main.m file
#import <Foundation/Foundation.h>
#define DEFINEEight 8
int main(){
    @autoreleasepool {
        int eight = DEFINEEight;
        int six = 6;
        NSString* site = [[NSString alloc] initWithUTF8String:"starming"];
        int rank = eight + six;
        NSLog(@"%@ rank %d", site, rank);
    }
    return 0;
}
Run the clang -ccc-print-phases main.m command to view the phases.
+- 0: input, "main.m", objective-c
+- 1: preprocessor, {0}, objective-c-cpp-output
+- 2: compiler, {1}, ir
+- 3: backend, {2}, assembler
+- 4: assembler, {3}, object
+- 5: linker, {4}, image
6: bind-arch, "x86_64", {5}, image
The output shows the input file followed by six main stages:
- Input: find the source file.
- Preprocessing: macro replacement, header file import, conditional compilation.
- Compilation: lexical analysis and syntax analysis, checking whether the syntax is correct, then generating IR.
- Back end: LLVM optimizes the IR pass by pass and finally generates assembly code.
- Assembly: the assembly code is turned into an object file.
- Linking: dynamic and static libraries are linked and an executable file is generated.
- Binding: the corresponding executable is generated for each architecture.
Preprocessing stage
Run
clang -E main.m
After execution, you can see the import of the header file and the macro replacement.
Compilation phase
The compilation stage mainly performs lexical and syntax analysis and checking, and then generates the intermediate code (IR).
1. Lexical analysis
Here the code is sliced into tokens, such as brackets, equals signs, and strings. You can run the following command to view the information.
clang -fmodules -fsyntax-only -Xclang -dump-tokens main.m
If the header file cannot be found, specify the SDK.
clang -isysroot (your own SDK path) -fmodules -fsyntax-only -Xclang -dump-tokens main.m
clang -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator14.1.sdk/ -fmodules -fsyntax-only -Xclang -dump-tokens main.m
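As a rough picture of what slicing code into tokens means, here is a minimal tokenizer sketch. `TokenKind`, `Token`, and `tokenize` are hypothetical names for this illustration only. Clang's real lexer is far richer: it also tracks source locations, handles string literals and multi-character operators, and distinguishes keywords from identifiers, none of which this sketch attempts.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds; keywords such as "int" are treated as identifiers.
enum class TokenKind { Identifier, Number, Punctuator };

struct Token {
    TokenKind kind;
    std::string text;
};

// Split a source string into identifiers, integer literals, and
// single-character punctuators, skipping whitespace.
std::vector<Token> tokenize(const std::string &src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
            continue;
        }
        std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_'))
                ++i;
            tokens.push_back({TokenKind::Identifier, src.substr(start, i - start)});
        } else if (std::isdigit(c)) {
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i])))
                ++i;
            tokens.push_back({TokenKind::Number, src.substr(start, i - start)});
        } else {
            ++i;
            tokens.push_back({TokenKind::Punctuator, std::string(1, src[start])});
        }
    }
    return tokens;
}
```

Calling `tokenize("int rank = eight + 6;")` produces seven tokens: `int`, `rank`, `=`, `eight`, `+`, `6`, and `;`.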
2. Syntax analysis
After lexical analysis comes syntax analysis, whose task is to verify that the syntax is correct. Building on the lexical analysis, it combines the token sequence into grammatical phrases such as programs, statements, and expressions, and then assembles all the nodes into the Abstract Syntax Tree (AST). The parser determines whether a program is structurally correct. You can run the following command to view the information.
clang -fmodules -fsyntax-only -Xclang -ast-dump main.m
This prints the generated abstract syntax tree.
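The idea of combining tokens into phrases and then into a tree can be sketched with a tiny recursive-descent parser. The grammar (integers with `+` and `*`), the `Node` and `Parser` types, and the `dump` function are all assumptions made for this illustration and have nothing to do with Clang's actual parser or its AST classes.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical AST node: an integer literal or a binary operator.
struct Node {
    char op = 0;     // '+' or '*' for operators, 0 for a literal
    int value = 0;   // literal value when op == 0
    std::unique_ptr<Node> lhs, rhs;
};

class Parser {
public:
    explicit Parser(std::string text) : src(std::move(text)) {}

    // expr := term ('+' term)*
    std::unique_ptr<Node> parseExpr() {
        auto node = parseTerm();
        while (peek() == '+') {
            ++pos;
            auto rhs = parseTerm();
            node = makeBinary('+', std::move(node), std::move(rhs));
        }
        return node;
    }

    // True once all input has been consumed.
    bool atEnd() { return peek() == '\0'; }

private:
    // term := number ('*' number)*
    std::unique_ptr<Node> parseTerm() {
        auto node = parseNumber();
        while (peek() == '*') {
            ++pos;
            auto rhs = parseNumber();
            node = makeBinary('*', std::move(node), std::move(rhs));
        }
        return node;
    }

    // A structurally invalid program is rejected here.
    std::unique_ptr<Node> parseNumber() {
        if (!std::isdigit(static_cast<unsigned char>(peek())))
            throw std::runtime_error("expected a number");
        auto node = std::make_unique<Node>();
        while (pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos])))
            node->value = node->value * 10 + (src[pos++] - '0');
        return node;
    }

    static std::unique_ptr<Node> makeBinary(char op, std::unique_ptr<Node> l,
                                            std::unique_ptr<Node> r) {
        auto node = std::make_unique<Node>();
        node->op = op;
        node->lhs = std::move(l);
        node->rhs = std::move(r);
        return node;
    }

    // Skip whitespace and return the next character, or '\0' at the end.
    char peek() {
        while (pos < src.size() && std::isspace(static_cast<unsigned char>(src[pos])))
            ++pos;
        return pos < src.size() ? src[pos] : '\0';
    }

    std::string src;
    std::size_t pos = 0;
};

// Render the tree in prefix form so its shape is visible.
std::string dump(const Node &n) {
    if (n.op == 0) return std::to_string(n.value);
    return std::string("(") + n.op + " " + dump(*n.lhs) + " " + dump(*n.rhs) + ")";
}
```

Parsing `8 + 6 * 2` and dumping the result gives `(+ 8 (* 6 2))`: the multiplication binds tighter, so it ends up deeper in the tree. An input such as `1 +` is structurally incorrect and makes `parseExpr` throw.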
3. Generate intermediate code IR (Intermediate Representation)
After the steps above are complete, the intermediate code IR is generated. Code generation translates the syntax tree from the top down into LLVM IR. The following command generates a .ll text file.
clang -S -fobjc-arc -emit-llvm main.m
In this step, Objective-C code is bridged to the runtime: property synthesis, ARC handling, and so on.
IR basic syntax
- @ global symbol
- % local symbol
- alloca allocates space
- align memory alignment
- i32 32 bits, 4 bytes
- store writes data to memory
- load reads data from memory
- call calls a function
- ret returns
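As a rough illustration of the syntax listed above, the sketch below pairs two small functions with the kind of unoptimized IR that Clang emits for them. The functions `add` and `compute` are hypothetical, and the IR in the comments is a hand-written approximation: value names, attributes, and the spelling of pointer types differ between Clang versions, so treat it as a reading aid rather than exact output.

```cpp
// extern "C" keeps the symbol names unmangled, so the global symbols
// appear in the IR as @add and @compute.
extern "C" int add(int a, int b) {
    int sum = a + b;
    return sum;
}
// Approximate unoptimized IR for add:
//
//   define i32 @add(i32 %a, i32 %b) {          ; @ marks a global symbol
//     %a.addr = alloca i32, align 4            ; % marks a local symbol
//     %b.addr = alloca i32, align 4            ; alloca reserves stack space
//     %sum = alloca i32, align 4               ; i32 is a 4-byte integer
//     store i32 %a, ptr %a.addr, align 4       ; store writes to memory
//     store i32 %b, ptr %b.addr, align 4
//     %0 = load i32, ptr %a.addr, align 4      ; load reads from memory
//     %1 = load i32, ptr %b.addr, align 4
//     %add = add nsw i32 %0, %1
//     store i32 %add, ptr %sum, align 4
//     %2 = load i32, ptr %sum, align 4
//     ret i32 %2                               ; ret returns the value
//   }

extern "C" int compute() {
    return add(8, 6);
}
// Approximate unoptimized IR for compute:
//
//   define i32 @compute() {
//     %call = call i32 @add(i32 8, i32 6)      ; call invokes a function
//     ret i32 %call
//   }
```

Without optimization, every local variable gets its own `alloca` slot and is written and read back through `store` and `load`, which is why even this tiny function produces a dozen IR lines.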
IR optimization
In Xcode, the optimization level is set under Target > Build Settings > Optimization Level. The optimization levels of LLVM are -O0, -O1, -O2, -O3, and -Os (the first character is an uppercase letter O). The following command generates the intermediate code IR with optimization:
clang -Os -S -fobjc-arc -emit-llvm main.m -o main.ll
bitcode
When bitcode is enabled (Xcode 7 and later), Apple performs further optimization and generates .bc intermediate code. We generate the .bc code from the optimized IR code:
clang -emit-llvm -c main.ll -o main.bc
Generating assembly code
We generate assembly code from the final .bc or .ll code:
clang -S -fobjc-arc main.bc -o main.s
clang -S -fobjc-arc main.ll -o main.s
Assembly generation can also be optimized:
clang -Os -S -fobjc-arc main.m -o main.s
Generate object file
To generate the object file, the assembler takes the assembly code as input, converts it into machine code, and outputs the object file.
clang -fmodules -c main.s -o main.o
You can run the nm command to view the symbols in main.o
$ xcrun nm -nm main.o
The output lists the symbols in main.o, which is in object file format. Two markers are worth noting:
- undefined: the symbol cannot be found in the current file for the time being
- external: the symbol is externally accessible
Linking
Linking mainly combines the object file with the dynamic and static libraries it needs and generates an executable file.
- Static libraries are merged into the executable file
- Dynamic libraries remain independent
clang main.o -o main
View the symbols after linking
$ xcrun nm -nm main
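The relationship between undefined symbols in an object file and the libraries that define them can be modelled with a small sketch. `ObjectFile` and `unresolvedSymbols` are hypothetical names, and the symbol names in the usage note are only examples; a real linker also handles relocation, symbol visibility, and much more.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a link input: the symbols it defines and the
// symbols it references without defining (shown as "undefined" by nm).
struct ObjectFile {
    std::string name;
    std::set<std::string> defined;
    std::set<std::string> undefined;
};

// Return the symbols that remain unresolved after all inputs are combined.
// An empty result means every reference found a definition.
std::set<std::string> unresolvedSymbols(const std::vector<ObjectFile> &inputs) {
    std::set<std::string> defined;
    for (const auto &obj : inputs)
        defined.insert(obj.defined.begin(), obj.defined.end());

    std::set<std::string> unresolved;
    for (const auto &obj : inputs)
        for (const auto &sym : obj.undefined)
            if (defined.count(sym) == 0)
                unresolved.insert(sym);
    return unresolved;
}
```

For example, a `main.o` that defines `_main` and references `_NSLog` has one unresolved symbol on its own; once an input that defines `_NSLog` is added, nothing is left unresolved, which is the state the linker needs before it can produce an executable.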
Binding
Binding mainly generates the corresponding Mach-O executable for each architecture.
Compiling LLVM
I have put the download and Xcode build steps into the following script:
#!/bin/bash
LLVMPath=`pwd`
# 1. Download LLVM project
git clone https://mirrors.tuna.tsinghua.edu.cn/git/llvm-project.git
# 2. Download Clang
cd llvm/tools/
git clone https://mirrors.tuna.tsinghua.edu.cn/git/llvm/clang.git
# 3. Download compiler-rt,libcxx,libcxxabi
cd ../projects
git clone https://mirrors.tuna.tsinghua.edu.cn/git/llvm/compiler-rt.git
git clone https://mirrors.tuna.tsinghua.edu.cn/git/llvm/libcxx.git
git clone https://mirrors.tuna.tsinghua.edu.cn/git/llvm/libcxxabi.git
# 4. Install extra
cd ../tools/clang/tools
git clone https://mirrors.tuna.tsinghua.edu.cn/git/llvm/clang-tools-extra.git
# 5. Install cmake
if cmake >/dev/null 2>&1
then
echo "Cmake has been installed"
else
echo "Cmake not installed"
echo "Cmake performs installation"
brew install cmake >> /dev/null
if test $? -eq 0
then
echo "Cmake installed successfully"
else
echo "Failed to install cmake"
fi
fi
# 6. Compile with Xcode
echo "Compile with Xcode"
cd $LLVMPath
mkdir llvm_build
cd llvm_build
cmake -G Xcode ../llvm
The llvm_build directory now contains the project, which we compile with Xcode.
Clang plug-in
Create a CLPlugin folder in this path. In it, create a CLPlugin.cpp file and a CMakeLists.txt file, and add the following to CMakeLists.txt:
add_llvm_library( CLPlugin MODULE BUILDTREE_ONLY CLPlugin.cpp)
Then run cmake again from llvm_build to regenerate the Xcode project:
cmake -G Xcode ../llvm
Write plug-in code
In the CLPlugin.cpp file in the CLPlugin directory, add the following code:
#include <iostream>
#include "clang/AST/AST.h"
#include "clang/AST/DeclObjC.h"
#include "clang/AST/ASTConsumer.h"
#include "clang/ASTMatchers/ASTMatchers.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/Frontend/FrontendPluginRegistry.h"
using namespace clang;
using namespace std;
using namespace llvm;
using namespace clang::ast_matchers;
namespace CLPlugin {
    class CLMatchCallback: public MatchFinder::MatchCallback {
    private:
        CompilerInstance &CI;
        // Check whether the file is the user's own
        bool isUserSourceCode(const string fileName) {
            if (fileName.empty()) return false;
            // Code that is not inside Xcode is assumed to belong to the user
            if (fileName.find("/Applications/Xcode.app/") == 0) return false;
            return true;
        }
        // Check whether copy should be used
        bool isShouldUseCopy(const string typeStr) {
            if (typeStr.find("NSString") != string::npos ||
                typeStr.find("NSArray") != string::npos ||
                typeStr.find("NSDictionary") != string::npos) {
                return true;
            }
            return false;
        }
    public:
        // 3. Custom callback class derived from MatchCallback; run() is called for each matched node
        CLMatchCallback(CompilerInstance &CI):CI(CI){}
        void run(const MatchFinder::MatchResult &Result) {
            // Get the node object from the result
            const ObjCPropertyDecl *propertyDecl = Result.Nodes.getNodeAs<ObjCPropertyDecl>("objcPropertyDecl");
            // Return early if there is no node, so that a null pointer is never dereferenced
            if (!propertyDecl) return;
            // Get the file name (including the path)
            string fileName = CI.getSourceManager().getFilename(propertyDecl->getSourceRange().getBegin()).str();
            if (isUserSourceCode(fileName)) { // Not a system file
                // Convert the type of the node to a string
                string typeStr = propertyDecl->getType().getAsString();
                // Get the property attributes of the node
                ObjCPropertyAttribute::Kind attrKind = propertyDecl->getPropertyAttributes();
                // copy should be used but is not
                if (isShouldUseCopy(typeStr) && !(attrKind & ObjCPropertyAttribute::kind_copy)) {
                    // Diagnostic engine
                    DiagnosticsEngine &diag = CI.getDiagnostics();
                    // Report the error
                    diag.Report(propertyDecl->getLocation(), diag.getCustomDiagID(DiagnosticsEngine::Error, "This place should use Copy."));
                }
            }
        }
    };
    // 2. Custom CLConsumer, derived from ASTConsumer, used to listen for AST node information (the filter)
    class CLConsumer: public ASTConsumer {
    private:
        // MatchFinder: the filter for AST nodes
        MatchFinder matcher;
        CLMatchCallback callback;
    public:
        CLConsumer(CompilerInstance &CI):callback(CI){
            // Add a matcher for ObjCPropertyDecl nodes;
            // the callback is invoked for every match
            matcher.addMatcher(objcPropertyDecl().bind("objcPropertyDecl"), &callback);
        }
        // Called back each time a top-level declaration has been parsed
        bool HandleTopLevelDecl(DeclGroupRef D){
            // cout << "parsing..." << endl;
            return true;
        }
        // Called back when the entire file has been parsed
        void HandleTranslationUnit(ASTContext &Ctx) {
            cout << "File parsing complete!!" << endl;
            matcher.matchAST(Ctx);
        }
    };
    // 1. Define a class derived from PluginASTAction to implement our custom action and custom AST behavior
    class CLASTAction: public PluginASTAction {
    public:
        bool ParseArgs(const CompilerInstance &CI, const vector<string> &arg) {
            return true;
        }
        std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &CI, StringRef InFile) {
            return unique_ptr<CLConsumer>(new CLConsumer(CI));
        }
    };
}
// 4. Register the plug-in
static FrontendPluginRegistry::Add<CLPlugin::CLASTAction> X("CLPlugin", "this is the description");
A brief summary of the writing process:
- Step 1: write the PluginASTAction subclass, which handles the input arguments.
- Step 2: use an ASTConsumer to visit the AST nodes and pick out the desired content.
- Step 3: write the MatchCallback callback function.
- Step 4: register the Clang plug-in so that it can be used externally.
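The two helper checks in the plug-in are plain string logic, so they can be tried without any Clang headers. Below is a minimal standalone sketch of them. It takes the strings by reference and uses `rfind(prefix, 0)` as a starts-with test instead of `find(...) == 0`; both are small substitutions for illustration, not the code the plug-in above uses.

```cpp
#include <string>

// Treat any path outside the Xcode application bundle as user code.
bool isUserSourceCode(const std::string &fileName) {
    if (fileName.empty()) return false;
    // rfind(prefix, 0) returns 0 only when fileName starts with the prefix.
    return fileName.rfind("/Applications/Xcode.app/", 0) != 0;
}

// Report whether the type string names one of the classes for which
// the plug-in expects the copy attribute.
bool isShouldUseCopy(const std::string &typeStr) {
    return typeStr.find("NSString") != std::string::npos ||
           typeStr.find("NSArray") != std::string::npos ||
           typeStr.find("NSDictionary") != std::string::npos;
}
```

Because the check is a substring match, `NSString *` is flagged, while `NSMutableArray *` is not: the text `NSArray` does not occur inside `NSMutableArray`. Whether that behavior is desirable depends on the rule you want to enforce.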
To test the plug-in, run the clang you compiled yourself in the following format:
// Command format
(path to your self-compiled clang) -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator15.1.sdk/ -Xclang -load -Xclang (plug-in .dylib path) -Xclang -add-plugin -Xclang (plug-in name) -c (source code path)
Change the SDK path to your own.
Xcode integration plug-in
To use a Clang plug-in, load the dynamic library that contains the plug-in registry with the -load command-line option.
The -load option loads all registered Clang plug-ins. Use the -plugin option to select the Clang plug-in to run. Other arguments for the plug-in are passed through -plugin-arg-<plugin-name>.
The cc1 process is the front-end process that the Clang driver launches for each compilation. cc1 and the Clang driver are two separate entities: cc1 does the front-end work, while the driver mainly schedules the compilation tasks. Each compilation task receives the cc1 arguments, which the driver adjusts.
There are two ways to get options like -load and -plugin into the cc1 process of Clang:
One is to use the -cc1 option directly, which has the disadvantage that the full system path configuration must be specified on the command line.
The other is to use -Xclang to add these options to the cc1 process. The -Xclang option passes the argument that follows it directly to the cc1 process, without affecting the clang driver.
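Since each cc1 argument needs its own -Xclang prefix, the resulting flag list is easy to get wrong by hand. The sketch below assembles it from a library path and a plug-in name. `pluginArgs` is a hypothetical helper written for this illustration, and the path in the usage note is a placeholder.

```cpp
#include <string>
#include <vector>

// Build the driver arguments that forward -load and -add-plugin to cc1.
// Every argument destined for cc1 is preceded by its own -Xclang.
std::vector<std::string> pluginArgs(const std::string &dylibPath,
                                    const std::string &pluginName) {
    return {"-Xclang", "-load",       "-Xclang", dylibPath,
            "-Xclang", "-add-plugin", "-Xclang", pluginName};
}
```

Calling `pluginArgs("/path/to/CLPlugin.dylib", "CLPlugin")` returns eight strings, alternating `-Xclang` with `-load`, the library path, `-add-plugin`, and the plug-in name, which matches the command format shown earlier.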
- In Build Settings, add two user-defined settings, CC and CXX. CC is the absolute path of the clang you compiled yourself, and CXX is the absolute path of the clang++ you compiled yourself.
- Next, search for index in Build Settings and change Enable Index-While-Building Functionality from Default to NO.
Finally, recompile the project, and the plug-in starts checking the code.