Wechat official account: Ilulaoshi, personal website: lulaoshi.info/

In the previous article we looked at how Hello World is compiled. Even a very simple program relies on the C standard library and system libraries. Linking is the process of combining third-party libraries with the binary object files generated from your own source code. Once linked, functions defined in those libraries can be called and executed. Early operating systems used static linking, but nowadays almost all of them use dynamic linking.

Static and dynamic linking

Although both static and dynamic linking produce executables, their costs are very different. The following diagram illustrates the difference:

The man on the left is like a dynamically linked executable, and the walrus on the right is like a statically linked one. The walrus is far bulkier than the man because static linking packs every third-party library function the program relies on into the executable at link time, producing a very large file. Dynamic linking does not copy those libraries in at link time. Instead, at run time, when the program needs functions from a library, it loads them from that library.

We call compiled but not yet linked binary machine code files object files. Third-party libraries are object files that have been compiled and packaged, containing functions we can call directly instead of writing them ourselves. When you build an executable with static linking, the required static libraries are essentially packaged together with your object files. The resulting executable contains these libraries in addition to your own code, so it is bloated. By contrast, dynamic linking does not package the third-party libraries into the final executable. It only records which dynamic link libraries are used and loads them at run time. Loading means reading programs and data from disk into memory. For example, Program 1 is loaded first, and libx.so is loaded once Program 1 is found to depend on it.

So static linking is like the walrus in the GIF, carrying everything it needs. Dynamic linking carries only the essentials and fetches whatever else it needs at run time.
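The size difference can be observed directly. The following is a minimal sketch, assuming gcc is installed and a static C library is available (on some distributions it is a separate package). The file names hello.c, hello_dynamic and hello_static are made up for illustration.

```shell
# Build the same program twice, dynamically and statically linked,
# then compare the sizes of the two executables.
# hello.c, hello_dynamic and hello_static are hypothetical names.
workdir=$(mktemp -d)
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>

int main(void) {
    printf("Hello World\n");
    return 0;
}
EOF

# Default build: the C library is linked dynamically.
gcc "$workdir/hello.c" -o "$workdir/hello_dynamic"

# -static copies the needed library code into the executable.
# This step fails if no static C library is installed.
gcc -static "$workdir/hello.c" -o "$workdir/hello_static" \
    || echo "static build failed: is a static C library installed?"

# Print the size of each executable in bytes.
wc -c "$workdir"/hello_*
```

On a typical glibc-based system the statically linked file is many times larger than the dynamically linked one, although the exact numbers depend on the compiler and C library.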

The dynamic link library file format varies from operating system to operating system. On Linux such a file is called a shared object and has the .so suffix; on Windows it is called a dynamic link library (DLL) and has the .dll suffix.

Address independence

With dynamic linking, function calls into third-party libraries in the generated object files are address independent, whatever the operating system. If Program 1 calls printf() from the C standard library, the address of printf() is not fixed in the generated object file. Instead, the library is loaded at run time, and the address of printf() is determined during loading. The address here means a virtual address in the process's address space. In short, the address of a dynamically linked library function is unknown at compile time. At load time, the loader assigns the library a range of virtual addresses based on the current state of the address space.

With static linking, the addresses of library functions are fixed at link time. Take printf() again: it lives in an object file, say printf.o, and static linking packs printf.o into the executable. Within the executable, the offset of printf() from the file header is fixed, so its address is fixed once linking is done.
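One way to see that a shared library's address is chosen at load time is to ask ldd twice where the C library would be mapped. This is a small sketch, assuming a Linux system where /bin/ls is dynamically linked; the function name libc_load_address is made up for this example.

```shell
# Print the address at which the C library is mapped for one run of ldd.
# ldd resolves the libraries the way the loader would, so each call
# reports the addresses chosen for that particular load.
libc_load_address() {
    ldd "$1" | awk '/libc\./ { print $NF; exit }'
}

first=$(libc_load_address /bin/ls)
second=$(libc_load_address /bin/ls)
echo "first load:  $first"
echo "second load: $second"
```

With address space layout randomization enabled, which is the default on most current Linux systems, the two values usually differ, which is only possible because the library code does not depend on a fixed address.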

Advantages and disadvantages of dynamic linking

Compared with static linking, dynamic linking has the following main benefits:

  • Multiple executables can share one copy of a shared library installed on the system. Each executable is smaller and takes up less disk space. Static linking packages the libraries a program depends on into its executable: if printf() is used by thousands of programs, it is packaged into thousands of executables, which wastes a lot of disk space.
  • Shared libraries are isolated from the executables that use them. A shared library can receive a minor version upgrade, be recompiled and be deployed to the operating system without affecting the executables that call it. With a static library, when any function changes, not only must the library itself be rebuilt, but so must every executable that depends on that function.

Of course, shared libraries have their drawbacks:

  • If a program is moved to a new system that lacks the corresponding shared libraries, it will not run until those libraries are installed there.
  • Shared libraries must be upgraded according to agreed development and upgrade rules. You cannot suddenly rework all the interfaces and overwrite the old library file with the new one; otherwise programs that depend on it will no longer run.

Viewing dynamic link library dependencies with the ldd command

On Linux, dynamic link libraries have default installation locations, and many important libraries live in the system's /lib and /usr/lib directories. Common Linux commands such as scp, rm, cp, and mv rely on libraries under /lib and /usr/lib64. If you delete these directories, many commands and tools on the system may stop working.

We can use the ldd command to see which dynamic link libraries an executable depends on.

# on Ubuntu 16.04 x86_64
$ ldd /bin/ls
  linux-vdso.so.1 =>  (0x00007ffcd3dd9000)
	libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007f4547151000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4546d87000)
	libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f4546b17000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f4546913000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f4547373000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f45466f6000)

As you can see, the ls command we use all the time relies on a number of libraries, including the C standard library libc.so.6.

If a Linux program reports that a library is missing, you can use ldd to check which libraries the program depends on and whether each .so file can be found on disk. If one cannot be found, you may need to set the LD_LIBRARY_PATH environment variable, which is described below.
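The check described above can be scripted. The sketch below assumes the ldd shipped with glibc, which marks unresolved dependencies with the words "not found"; the helper name missing_libs is hypothetical.

```shell
# List the dependencies of an executable that the loader cannot resolve.
# glibc's ldd marks such entries with "not found".
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}

# /bin/ls serves as the example program here.
missing=$(missing_libs /bin/ls)
if [ -z "$missing" ]; then
    echo "all shared libraries were found"
else
    echo "missing libraries:"
    echo "$missing"
fi
```

Replacing /bin/ls with the path of the failing program gives the names of the .so files that still need to be installed or made visible to the loader.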

SONAME file naming rules

The names of .so files are often followed by several numbers, which indicate the version. This naming convention is known as SONAME:

libname.so.x.y.z

lib is the prefix, by established convention. x is the major version, y is the minor version, and z is the release version.

  • The major version represents a major upgrade, and libraries with different major versions are incompatible. After a major version update, programs that rely on the old major version must either update their code and recompile to run against the new one, or the operating system must keep the old major version installed so that old programs can still run.
  • The minor version indicates an incremental update. Typically new interfaces are added while existing ones stay unchanged, so within the same major version a higher minor version is compatible with lower ones.
  • The release version indicates bug fixes, performance improvements and the like, with no interfaces added or changed.

However, the .so files we saw in the ldd output carry only a major version. That is because each one is a soft link: libname.so.x links to the libname.so.x.y.z file.

$ ls -l /lib/x86_64-linux-gnu/libpcre.so.3
/lib/x86_64-linux-gnu/libpcre.so.3 -> libpcre.so

Because different major versions are incompatible while minor and release versions are backward compatible, the soft link points to the .so file with the highest minor and release version within the same major version.
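The naming scheme and the soft link can be tried out safely in a scratch directory. This is only an illustration: libdemo and the helper parse_so_name are invented names, and the "library" is an empty placeholder file.

```shell
# Split a versioned library file name into its parts.
# Sets the variables lib_name, major, minor and release.
parse_so_name() {
    local file="$1"
    local version
    lib_name=${file%%.so.*}     # text before ".so."
    version=${file#*.so.}       # text after ".so."
    IFS=. read -r major minor release <<EOF
$version
EOF
}

# Create a placeholder library file and a major-version soft link to it,
# similar to the links that exist for real libraries.
demo_dir=$(mktemp -d)
touch "$demo_dir/libdemo.so.1.2.3"
ln -s libdemo.so.1.2.3 "$demo_dir/libdemo.so.1"

parse_so_name "libdemo.so.1.2.3"
echo "name=$lib_name major=$major minor=$minor release=$release"

# readlink prints the file that the major-version link points to.
readlink "$demo_dir/libdemo.so.1"
```

The first echo prints name=libdemo major=1 minor=2 release=3, and readlink prints libdemo.so.1.2.3, the full-version file behind the link.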

Dynamic link library lookup process

As mentioned, most dynamic link libraries on Linux live in /lib and /usr/lib, and the operating system searches these two paths by default. Additional paths can be configured in /etc/ld.so.conf, which tells the operating system where else to look. These locations contain many libraries, and traversing them on every lookup would be slow. Linux therefore provides the ldconfig tool, which creates soft links for the libraries in these paths according to the SONAME rules and generates a cache file, /etc/ld.so.cache, so that the dynamic linker can find each .so file faster. Whenever a new library is installed in /lib or /usr/lib, or /etc/ld.so.conf is changed, ldconfig should be run to regenerate the soft links and the cache. Note that editing /etc/ld.so.conf and running ldconfig normally require root privileges. A non-root user can install the library files in a directory of their own, but still needs a root user to add that directory to /etc/ld.so.conf and run ldconfig.

For non-root users, an alternative is the LD_LIBRARY_PATH environment variable. LD_LIBRARY_PATH holds a list of paths in which the dynamic linker looks for libraries. A non-root user can install a library in a directory that does not require root access and add that directory to the variable.

The dynamic linker searches for libraries in the following order:

  • Paths in the LD_LIBRARY_PATH environment variable
  • The /etc/ld.so.cache cache file
  • /usr/lib and /lib
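The order listed above can be imitated with a small shell function. This is a simplified sketch for illustration only, not how the dynamic linker is implemented; the function name find_shared_lib is made up, and it relies on ldconfig -p, which prints the contents of the cache.

```shell
# Look for a library file name in roughly the order listed above.
# This only imitates the search for illustration purposes.
find_shared_lib() {
    local name="$1"
    local dir hit old_ifs

    # 1. Directories in LD_LIBRARY_PATH, separated by colons.
    old_ifs=$IFS
    IFS=:
    for dir in $LD_LIBRARY_PATH; do
        if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
            IFS=$old_ifs
            echo "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs

    # 2. The cache generated by ldconfig.
    hit=$(ldconfig -p 2>/dev/null | awk -v n="$name" '$1 == n { print $NF; exit }')
    if [ -n "$hit" ]; then
        echo "$hit"
        return 0
    fi

    # 3. The default directories.
    for dir in /usr/lib /lib; do
        if [ -e "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}

find_shared_lib libc.so.6 || echo "libc.so.6 was not found by this sketch"
```

On a system where ldconfig is not on the PATH of the current user, step 2 simply yields nothing and the function falls through to the default directories.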

For example, if CUDA is installed under /opt, the following command adds its libraries to the environment variable:

export LD_LIBRARY_PATH=/opt/cuda/cuda-toolkit/lib64:$LD_LIBRARY_PATH

If this command is run before a particular program is started, that program will use the CUDA libraries in this path. If the line is added to the .bashrc file, it is executed every time the user starts a shell, so all of that user's programs will use the CUDA libraries in this path. When several versions of the same dynamic link library exist, you can install them under different paths and use LD_LIBRARY_PATH to control which one is used. This suits shared servers where multiple versions of a library coexist, such as CUDA, which changes quickly and which deep learning programs depend on heavily.
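When LD_LIBRARY_PATH is empty, the export line shown earlier leaves a trailing colon, and the dynamic linker treats an empty entry as the current directory. A slightly more careful variant is sketched below; the helper name prepend_library_path is hypothetical.

```shell
# Prepend a directory to LD_LIBRARY_PATH without leaving an empty entry.
# The ":old value" part is only appended when the variable is already set.
prepend_library_path() {
    LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
}

prepend_library_path /opt/cuda/cuda-toolkit/lib64
echo "$LD_LIBRARY_PATH"

# To affect a single program only, the variable can instead be set
# on the command line that starts it, for example:
#   LD_LIBRARY_PATH=/opt/cuda/cuda-toolkit/lib64 ./my_program
# (my_program is a placeholder name.)
```

Setting the variable for a single command is convenient when two programs on the same machine need different versions of a library.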

Besides LD_LIBRARY_PATH, there is also the LD_PRELOAD environment variable, which takes precedence over it. LD_PRELOAD is a list of shared objects, whereas LD_LIBRARY_PATH is a list of directories.

GCC compilation options

When compiling and linking with GCC, two options deserve attention: -l (lowercase L) and -L (uppercase L). As mentioned earlier, Linux follows the convention that a library called name has the dynamic link library file name libname.so. The option -lname tells GCC which library to link against. At link time, GCC's linker ld looks for libname.so in the directories given with -L and then in its default directories such as /usr/lib and /lib. LD_LIBRARY_PATH and /etc/ld.so.cache, by contrast, are consulted by the dynamic linker when the program is run. To make ld search an additional directory, pass -L/path/to/library.

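If the dynamic link library file is in /path/to/library and the library name is name, compile and link as follows, listing the library after the source file that uses it because the linker resolves symbols from left to right:

$ gcc myfile.c -L/path/to/library -lname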
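To see both options at work from start to finish, the following sketch builds a tiny shared library and links a program against it. It assumes gcc is installed; libgreet, greet() and all file names are hypothetical and exist only for this example.

```shell
# Build a tiny shared library and link a program against it with -L and -l.
# libgreet, greet() and the file names are made up for this sketch.
build_dir=$(mktemp -d)

cat > "$build_dir/greet.c" <<'EOF'
#include <stdio.h>

void greet(void) {
    printf("hello from libgreet\n");
}
EOF

cat > "$build_dir/main.c" <<'EOF'
void greet(void);

int main(void) {
    greet();
    return 0;
}
EOF

# -fPIC produces position-independent code; -shared produces libgreet.so.
gcc -fPIC -shared "$build_dir/greet.c" -o "$build_dir/libgreet.so"

# -L adds a directory to the link-time search path; -lgreet selects libgreet.so.
gcc "$build_dir/main.c" -L"$build_dir" -lgreet -o "$build_dir/main"

# -L only affects linking. At run time the loader must still find
# libgreet.so, here by way of LD_LIBRARY_PATH.
LD_LIBRARY_PATH="$build_dir" "$build_dir/main"
```

Running the program without LD_LIBRARY_PATH would normally fail with a message that libgreet.so cannot be found, because the scratch directory is not one of the places the loader searches by default.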