1. Introduction

As an iOS developer, it is hard not to hear about LLVM from many directions. There was the early compiler transition from GCC to LLVM-GCC, followed by the GCC license change that led Apple to abandon GCC entirely in favor of Clang, which is still in use today. There is the compilation optimization work shared by the WeChat team, and the Clang instrumentation mentioned in the binary reordering work shared by the influential ByteDance team. On the security side, LLVM Pass development is also a major force…

A working understanding of LLVM therefore broadens my knowledge considerably, so that at the very least I am not stopped at the door when reading an in-depth article.

2. What is LLVM and what is Clang

LLVM was originally an abbreviation of Low Level Virtual Machine, but the project grew so quickly that the name no longer describes it: it has become a complete compiler framework comprising front ends, an optimizer, and back ends. Today LLVM is simply the full name of the project and no longer stands for anything. It is written mainly in C++.

What is LLVM

Traditional compiler architecture

Traditional compiler architectures (such as GCC) couple the front end, optimizer, and back end together. This makes optimization difficult and support for multiple architectures awkward, and it requires a lot of repetitive work.

LLVM architecture

In the LLVM architecture, the three stages (front end, optimizer, and back end) are clearly separated.
1. The front end is source-oriented: it converts source code in each language into the same LLVM Intermediate Representation (LLVM IR).
2. The optimizer works on the IR: eliminating useless code, optimizing memory use, even code obfuscation…
3. The back end converts the IR into the corresponding machine code.
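The three stages can be observed one at a time from the command line. The following is a minimal sketch, assuming that `clang`, `opt`, and `llc` from the same LLVM build are on the PATH. The file names (`hello.c`, `hello.ll`, and so on) are made up for illustration, and `DRY_RUN` is a switch of this script only, not an LLVM feature. By default the script just prints the commands.

```shell
#!/bin/sh
# Sketch: one command per LLVM stage. File names are illustrative.
set -eu
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

# Print the command; execute it too unless this is a dry run.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != 1 ]; then
    "$@"
  fi
}

workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 'int square(int x) { return x * x; }' > hello.c

# 1. Front end: C source -> unoptimized textual LLVM IR.
#    -disable-O0-optnone keeps the functions eligible for later optimization.
run clang -S -emit-llvm -O0 -Xclang -disable-O0-optnone hello.c -o hello.ll
# 2. Optimizer: IR -> IR, here a single pass that promotes stack slots to registers.
run opt -S -passes=mem2reg hello.ll -o hello.opt.ll
# 3. Back end: IR -> assembly for the host target.
run llc hello.opt.ll -o hello.s
```

Running it with `DRY_RUN=0` executes the commands, after which `hello.ll` and `hello.opt.ll` can be compared to see what the optimizer changed.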

Comparing the two designs, the biggest advantage of LLVM is the separation of the three stages. If we want to create a new language, we only need to write the corresponding front end, and it will work on every target that already has a back end. If a new target is added later, we only need to write its back end once, and it will work with all the major languages.
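The saving can be made concrete with a little arithmetic. The counts below are hypothetical, chosen only to show the effect, and the "coupled" figure is the worst case in which nothing is shared between compilers.

```shell
#!/bin/sh
set -eu
# Hypothetical counts, chosen only to show the arithmetic.
languages=3   # number of source languages
targets=4     # number of target architectures

# Coupled design (worst case): every language/target pair needs its own compiler.
coupled=$((languages * targets))
# Three-stage design: one front end per language plus one back end per target,
# all sharing a single optimizer.
decoupled=$((languages + targets))

echo "coupled compilers needed:      $coupled"
echo "front ends + back ends needed: $decoupled"
```

Adding a fifth target raises the first number by three but the second by only one.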

What is Clang

Clang is a subproject of the LLVM project: a C/C++/Objective-C compiler front end built on the LLVM architecture (the front end for Swift is Swift’s own compiler).

When Apple first moved from GCC to LLVM, it used an LLVM front end built on GCC’s libraries. However, Apple wanted more from code optimization than the GCC maintainers were willing to provide with targeted updates, so a branch, LLVM-GCC, was derived and maintained by Apple itself. As a result, the GCC version Apple used lagged well behind the official one. Finally, because of the change in GCC’s open source license, Apple abandoned GCC completely and switched to its own Clang.

Compared with GCC, Clang has the following advantages:

  • Fast compilation: Clang compiles significantly faster than GCC on some platforms (Objective-C code compiles about 3 times faster than with GCC in Debug mode).
  • Low memory footprint: the AST generated by Clang takes up about 1/5 of the memory that GCC’s does.
  • Modular design: Clang uses a library-based modular design that is easy to reuse for IDE integration and other purposes.
  • Readable diagnostics: during compilation, Clang creates and retains a large amount of detailed metadata for debugging and error reporting.
  • Clear, simple design: easy to understand, extend, and enhance.

3. Compilation process

(1) Download

Pick a convenient directory and clone the source directly from GitHub (how long this takes depends on your network; including the Git metadata, the total size is about 3 GB):

> mkdir llvm_all && cd llvm_all   
> git clone https://github.com/llvm/llvm-project.git
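If the full commit history is not needed, a shallow clone can reduce the download. This is a sketch using Git's standard `--depth` option. The `llvm_all` directory name follows the commands above, and `DRY_RUN` is a switch of this script only; by default it just prints what it would run.

```shell
#!/bin/sh
set -eu
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

# Print the command; execute it too unless this is a dry run.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != 1 ]; then
    "$@"
  fi
}

repo_url=https://github.com/llvm/llvm-project.git

run mkdir -p llvm_all
# --depth 1 fetches only the most recent commit instead of the whole history.
run git clone --depth 1 "$repo_url" llvm_all/llvm-project
```

The trade-off is that commands such as `git log` or `git bisect` only see the truncated history.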

When you’re done, you should see a directory like this:

For now, we only need to focus on two directories, clang and llvm, which hold the source of Clang and LLVM respectively.

(2) Compilation

cmake -S llvm -B build -G <generator> [options]

Four common generators can be passed to -G:

  • Ninja: for generating Ninja build files. Most LLVM developers use Ninja.
  • Unix Makefiles: for generating make-compatible parallel makefiles.
  • Visual Studio: for generating Visual Studio projects and solutions.
  • Xcode: for generating Xcode projects.

Ninja is the officially recommended choice because it is the fastest; in my own test the build took about 20 minutes. As an iOS developer, though, I am still used to building with Xcode: the interface is friendly, and code completion works well when we write plug-ins or IR passes later. The drawback is that it is slower. The Xcode build took me about 40 minutes; depending on your machine it may take somewhat less, but more than an hour is also normal.

There are also some optional parameters:

  • -DLLVM_ENABLE_PROJECTS='...': Selects additional LLVM subprojects to build together with LLVM, such as clang, clang-tools-extra, libcxx, libcxxabi, libunwind, lldb, compiler-rt, lld, polly, or cross-project-tests. For example, to build LLVM together with Clang, libcxx, and libcxxabi, add -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi". Depending on the LLVM version, the runtimes libcxx, libcxxabi, and libunwind may instead have to be enabled through the separate -DLLVM_ENABLE_RUNTIMES option.
  • -DCMAKE_INSTALL_PREFIX=directory: Specifies the absolute path where the build results are installed (default /usr/local).
  • -DCMAKE_BUILD_TYPE=type: The build type, such as Debug, Release, RelWithDebInfo, or MinSizeRel. The default is Debug.
  • -DLLVM_ENABLE_ASSERTIONS=On: Builds with assertion checking enabled (default Yes for Debug builds and No for all other build types).
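Put together, the options above produce a configure command like the one in this sketch. It assumes the llvm-project checkout sits in the current directory. The `llvm_release` install location and the choice of a Release build with assertions are illustrative values, and `DRY_RUN` is a switch of this script only, so by default the command is printed rather than executed.

```shell
#!/bin/sh
set -eu
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the command, 0 = execute it

# Print the command; execute it too unless this is a dry run.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != 1 ]; then
    "$@"
  fi
}

src_dir=llvm-project/llvm            # LLVM's top-level CMake project in the checkout
build_dir=build                      # where the generated build files go
install_prefix="$PWD/llvm_release"   # illustrative install location

run cmake -S "$src_dir" -B "$build_dir" -G Ninja \
  -DLLVM_ENABLE_PROJECTS="clang" \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_ASSERTIONS=On \
  -DCMAKE_INSTALL_PREFIX="$install_prefix"
```

A Release build is usually much smaller and faster to link than the default Debug build, and turning assertions back on keeps LLVM's internal sanity checks, which is useful when writing passes.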

Ninja compilation

To build with Ninja, first generate the Ninja build files (about 5 minutes), then compile:

# Create the ninja directory and enter it
mkdir llvm_ninja && cd llvm_ninja
# Specify the LLVM source directory, create the build directory, generate the Ninja build files, add the clang subproject, and set the install directory to llvm_release
cmake -S ../llvm-project/llvm -B build -G Ninja -DLLVM_ENABLE_PROJECTS="clang" -DCMAKE_INSTALL_PREFIX=/Volumes/ExDisk/LLVM/llvm_all/llvm_release

As the figure above shows, the clang subproject we added is included in the build, while clang-tools-extra is not.

If you see something like this, congratulations: the Ninja build files have been generated.

Then enter the build directory and run the following commands to start compiling:

cd build
ninja && ninja install

When it finishes, all the command-line tools can be found in the specified release directory.
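A quick way to confirm the installation is to ask a couple of the installed tools for their version. This is a sketch: the default prefix is the install directory from the configure command above, `LLVM_PREFIX` is a hypothetical environment variable for overriding it, and `check_tool` is a helper defined here, not part of LLVM.

```shell
#!/bin/sh
set -eu
# Default prefix: the install directory passed to -DCMAKE_INSTALL_PREFIX above.
prefix=${LLVM_PREFIX:-/Volumes/ExDisk/LLVM/llvm_all/llvm_release}

# Print the first line of "<tool> --version", or a note if the tool is missing.
check_tool() {
  tool_path="$1/bin/$2"
  if [ -x "$tool_path" ]; then
    "$tool_path" --version | head -n 1
  else
    echo "$2: not found under $1/bin"
  fi
}

for tool in clang llvm-config; do
  check_tool "$prefix" "$tool"
done
```

If a tool is reported as missing, check that `ninja install` completed and that the prefix matches the one used at configure time.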

Xcode compilation

Likewise, generate the Xcode project in its own directory (about 10 minutes). This time no release directory is specified because, as iOS developers know, Xcode keeps build products in its own products location.

mkdir llvm_xcode && cd llvm_xcode
cmake -S ../llvm-project/llvm -B build -G Xcode -DLLVM_ENABLE_PROJECTS="clang"

Again, you can see this directory at the end

Open the LLVM project and you are back in the familiar Xcode interface.

Choose Automatically Create Schemes, then select the ALL_BUILD scheme and press Cmd + B to start compiling (about 40 minutes).
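The same build can also be started from the terminal through CMake's generic build driver, which calls the underlying Xcode tooling. This is a sketch, assuming it is run from the llvm_xcode directory created above. The Debug configuration is an illustrative choice, and `DRY_RUN` is a switch of this script only, so by default the command is just printed.

```shell
#!/bin/sh
set -eu
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the command, 0 = execute it

# Print the command; execute it too unless this is a dry run.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != 1 ]; then
    "$@"
  fi
}

# "build" is the directory that holds the generated Xcode project.
# ALL_BUILD is the aggregate target CMake creates for Xcode projects.
run cmake --build build --config Debug --target ALL_BUILD
```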

You can find the build products in the familiar place.

In the same directory, you can see the corresponding command-line tools, such as Clang, which we’ll cover in the next chapter.

4. Summary

Why compile it yourself? LLVM is like a shy, veiled woman: most people only admire it from a distance and do not dare to come closer. In truth, she too is hoping for a brave soul to lift the veil, and compiling your own LLVM is exactly that step. Once you take it, a new world opens up.

In the next chapter, we’ll start talking about fun things like how to compile your own plug-ins, how to develop your own obfuscator, and how to develop your own programming language.

References

1. GCC, LLVM, Clang compiler comparison
2. Compilation optimization shared by the WeChat team
3. In-depth analysis of the iOS compiler: Clang and LLVM
4. LLVM GitHub repository