  • Self-cultivation of an iOS programmer (1) Compiling and linking
  • What’s in Mach-O
  • Self-cultivation of an iOS programmer (3) Mach-O file static linking
  • Self-cultivation of an iOS programmer (4) Executable file loading
  • Self-cultivation of an iOS programmer (5) Mach-O file dynamic linking
  • Self-cultivation of an iOS programmer (6) Dynamic linking applications: the fishhook principle
  • Self-cultivation of an iOS programmer (7) Static linking applications: the principle of static library instrumentation
  • Self-cultivation of an iOS programmer (8) Memory

Loading executable files from an operating system perspective

Process establishment

The most critical feature of a process is that it has its own virtual address space, the size of which is determined by the CPU's address width. The steps from creating the process to loading the executable are as follows:

  1. Create a virtual address space.

Under Linux on i386, creating a virtual address space really just means allocating a page directory. It is like setting up an empty directory in advance: the structure exists, but it has no real content yet. The operating system loads executable files through this page-mapping mechanism, which is the equivalent of filling that directory with entries. Page mapping is an important part of the virtual memory mechanism: data in memory and on disk is divided into pages of a fixed size. On iOS, a page is 16 KB.
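The page arithmetic behind this can be sketched in a few lines of Python. This is only an illustration: the 16 KB page size comes from the text, while the helper names split_address and pages_needed are made up for this example.

```python
PAGE_SIZE = 16 * 1024  # 16 KB, the iOS page size stated in the text


def split_address(vaddr):
    """Return (virtual page number, offset inside that page)."""
    return vaddr // PAGE_SIZE, vaddr % PAGE_SIZE


def pages_needed(size_in_bytes):
    """Number of whole pages required to hold size_in_bytes, rounded up."""
    return (size_in_bytes + PAGE_SIZE - 1) // PAGE_SIZE


if __name__ == "__main__":
    print(split_address(0x4010))  # (1, 16)
    print(pages_needed(40_000))   # 3
```

Anything that is not an exact multiple of the page size still takes up a whole extra page, which is why 40,000 bytes need three pages here.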

  2. Read the executable header and establish the mapping between the virtual space and the executable.

When the program accesses a virtual page that has not been loaded into memory yet, a page fault occurs. The operating system allocates a page of physical memory, reads the missing page from disk into it, and then sets up the mapping between the virtual page and the physical page. To do this, the system must know, when it catches the page fault, where the required page lies in the executable. This is why a mapping between the virtual space and the executable is needed.

Since executables are actually mapped to virtual space when they are loaded, they are often referred to as images.

When a Segment is mapped into the virtual space, the mapped region is called a Virtual Memory Area (VMA). For example, sections with read-only permissions are grouped into a read-only Segment, and that Segment is mapped to a region of virtual memory spanning one or more pages, which as a whole is one VMA. For details, see What’s in Mach-O.
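As a rough sketch of what this mapping has to record, the Python below models a VMA as a virtual address range plus the file offset it comes from. The class, the function name, and all addresses and sizes are hypothetical and chosen only for illustration. They are not the kernel's real data structures.

```python
from dataclasses import dataclass


@dataclass
class VMA:
    name: str
    vm_start: int     # first virtual address of the area
    vm_size: int      # size of the area in bytes
    file_offset: int  # where the area's bytes start in the executable
    prot: str         # permissions, e.g. "r-x" or "rw-"

    def contains(self, vaddr):
        return self.vm_start <= vaddr < self.vm_start + self.vm_size


def file_offset_for(vmas, vaddr):
    """Translate a virtual address to an offset in the executable file.

    Returns None when no VMA covers the address.
    """
    for vma in vmas:
        if vma.contains(vaddr):
            return vma.file_offset + (vaddr - vma.vm_start)
    return None


# Hypothetical layout: a code area followed by a data area.
vmas = [
    VMA("code", 0x100000000, 0x8000, 0x0, "r-x"),
    VMA("data", 0x100008000, 0x4000, 0x8000, "rw-"),
]

if __name__ == "__main__":
    print(hex(file_offset_for(vmas, 0x100008010)))  # 0x8010
```

With such a table in place, a faulting address is enough to find out which bytes of the file need to be read.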

  3. Set the CPU instruction register to the entry point of the executable and start execution.

This step can simply be considered as the operating system executing a jump instruction to jump directly to the entry address of the executable.

Page fault

The steps above only map the executable into the virtual memory of the process using the information in the executable header. They do not load the actual instructions and data into memory. Suppose the CPU starts executing at the entry address and finds that the page there is empty. It treats this as a page fault and hands control to the operating system. The operating system looks up the mapping established in step 2 to find the VMA that the page belongs to, calculates the offset of the corresponding page in the executable file, allocates a physical page, reads the contents in from the file, and maps the virtual page of the process to that physical page. It then gives control back to the process, which resumes execution from where the page fault occurred.
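A toy simulation may make the sequence easier to follow. The Python below is a minimal sketch, not how a real kernel works: the Process class, its fields, and the base address are all invented for the example, and "physical pages" are just byte strings held in a dictionary.

```python
PAGE_SIZE = 16 * 1024  # 16 KB pages, as on iOS


class Process:
    """Toy process whose pages are loaded only on first access."""

    def __init__(self, file_bytes, vm_base):
        self.file_bytes = file_bytes  # contents of the executable on "disk"
        self.vm_base = vm_base        # page-aligned address the file is mapped at
        self.page_table = {}          # virtual page number -> loaded page bytes
        self.fault_count = 0

    def read(self, vaddr):
        vpn = vaddr // PAGE_SIZE
        if vpn not in self.page_table:   # empty page: page fault
            self._handle_page_fault(vpn)
        return self.page_table[vpn][vaddr % PAGE_SIZE]

    def _handle_page_fault(self, vpn):
        self.fault_count += 1
        # Use the mapping to find the page's offset in the executable file.
        offset = vpn * PAGE_SIZE - self.vm_base
        page = self.file_bytes[offset:offset + PAGE_SIZE]
        # "Allocate" a physical page and map the virtual page to it.
        self.page_table[vpn] = page.ljust(PAGE_SIZE, b"\0")


if __name__ == "__main__":
    image = bytes(range(256)) * 192          # 3 pages of sample data
    proc = Process(image, vm_base=0x100000000)
    proc.read(0x100000000)                   # first access: page fault
    proc.read(0x100000005)                   # same page: no fault
    proc.read(0x100000000 + PAGE_SIZE + 3)   # next page: page fault
    print(proc.fault_count)                  # 2
```

Only the pages that are actually touched get loaded, so the number of faults depends on how the accessed addresses are spread across pages.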

This is why binary reordering, which has been popular for some time, can reduce startup time. It reduces the number of pages that need to be loaded during startup and thus the number of page faults.
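The effect can be estimated with a small model. In the sketch below, the function names, symbol names, and sizes are all made up, and the model simply assumes that symbols are laid out back to back in link order. It counts the distinct pages that contain startup code before and after moving that code to the front.

```python
PAGE_SIZE = 16 * 1024  # 16 KB pages, as on iOS


def pages_touched(layout, startup):
    """Count distinct pages holding startup symbols.

    layout is a list of (name, size_in_bytes) in link order.
    """
    pages = set()
    offset = 0
    for name, size in layout:
        if name in startup and size > 0:
            first = offset // PAGE_SIZE
            last = (offset + size - 1) // PAGE_SIZE
            pages.update(range(first, last + 1))
        offset += size
    return len(pages)


def reorder(layout, startup):
    """Place startup symbols first, keeping relative order otherwise."""
    hot = [sym for sym in layout if sym[0] in startup]
    cold = [sym for sym in layout if sym[0] not in startup]
    return hot + cold


# Hypothetical symbols and sizes.
layout = [
    ("main", 4096),
    ("settings_ui", 20000),
    ("app_launch", 4096),
    ("video_player", 30000),
    ("load_config", 4096),
]
startup = {"main", "app_launch", "load_config"}

if __name__ == "__main__":
    print(pages_touched(layout, startup))                    # 3
    print(pages_touched(reorder(layout, startup), startup))  # 1
```

In this made-up layout the startup code goes from being scattered over three pages to fitting in one, so startup would trigger one page fault for this code instead of three.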

Process virtual space distribution

Link and execution views for executables

The operating system does not map an executable Section by Section (for example, the text and data sections mentioned above). Each mapped region must be an integer multiple of the page size, and any leftover portion still occupies a whole page. An executable file usually has a dozen or even dozens of Sections, so mapping each one separately would waste a lot of memory. The operating system therefore does not care about the contents of a Section, only about its permissions, such as readable, writable, and executable. Sections with the same permissions are merged into one Segment and mapped together. “Segment” and “Section” divide the same executable from different perspectives, which are called views: the Section perspective is the link view, and the Segment perspective is the execution view.
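The saving from merging can be illustrated with a short calculation. The section names, sizes, and permissions below are hypothetical sample data, and the helper functions are invented for this sketch.

```python
from collections import defaultdict

PAGE_SIZE = 16 * 1024  # 16 KB pages, as on iOS


def pages_needed(size):
    return (size + PAGE_SIZE - 1) // PAGE_SIZE


def pages_if_mapped_per_section(sections):
    """Every section is rounded up to whole pages on its own."""
    return sum(pages_needed(size) for _, size, _ in sections)


def merge_by_permission(sections):
    """Add up section sizes per permission string (one Segment each)."""
    merged = defaultdict(int)
    for _, size, prot in sections:
        merged[prot] += size
    return dict(merged)


def pages_if_merged(sections):
    merged = merge_by_permission(sections)
    return sum(pages_needed(size) for size in merged.values())


# Hypothetical sections: (name, size in bytes, permissions).
sections = [
    ("text", 5000, "r-x"),
    ("stubs", 300, "r-x"),
    ("cstring", 1200, "r-x"),
    ("data", 800, "rw-"),
    ("bss", 2000, "rw-"),
]

if __name__ == "__main__":
    print(pages_if_mapped_per_section(sections))  # 5
    print(pages_if_merged(sections))              # 2
```

Five small sections would each take up a page of their own, while the two merged Segments need only one page each.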

Stack and heap

In addition to the mapped executable, a process also needs stack and heap space to run. These also exist in the virtual space of the process in the form of VMAs: each stack and heap in a process corresponds to a VMA. A process is basically divided into the following VMA regions:

  1. Code VMA: read-only and executable; has an image file.
  2. Data VMA: readable, writable, and executable; has an image file.
  3. Heap VMA: readable, writable, and executable; no image file; can grow upward (the heap grows from low addresses to high addresses).
  4. Stack VMA: readable and writable, not executable; no image file; can grow downward (the stack grows from high addresses to low addresses).

Whether a VMA has an image file refers to whether it is mapped from the executable file.
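The four regions can be written down as a small table in code. This is only a sketch: the Region class and every address are hypothetical, and the permissions simply repeat the list above.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    start: int         # lowest address, inclusive
    end: int           # highest address, exclusive
    prot: str          # permissions as listed in the text
    file_backed: bool  # True if mapped from the executable image

    def grow_up(self, nbytes):
        """Extend toward higher addresses (heap)."""
        self.end += nbytes

    def grow_down(self, nbytes):
        """Extend toward lower addresses (stack)."""
        self.start -= nbytes


# Hypothetical addresses, chosen only to keep the regions apart.
code = Region("code", 0x10000000, 0x10008000, "r-x", True)
data = Region("data", 0x10008000, 0x1000C000, "rwx", True)
heap = Region("heap", 0x20000000, 0x20004000, "rwx", False)
stack = Region("stack", 0x70000000, 0x70008000, "rw-", False)

if __name__ == "__main__":
    heap.grow_up(0x4000)
    stack.grow_down(0x4000)
    print(hex(heap.end))     # 0x20008000
    print(hex(stack.start))  # 0x6fffc000
```

The heap and the stack have no file behind them, so they are the only two regions whose boundaries move while the process runs.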

Figure from Self-cultivation of the Programmer, p. 167.

As the figure above shows, at load time the executable is redivided into three parts. Some Sections are merged into a readable and executable Segment, which is mapped to the code VMA. Others are merged into a readable and writable Segment, which is mapped to the data VMA. The rest are not mapped when the program is loaded: these are Sections such as debugging information and string tables, which are not needed at run time. The virtual memory space also contains two extra VMAs, one for the heap and one for the stack.

Segment address alignment

Executable files are eventually loaded and run by the operating system, usually through the virtual memory page-mapping mechanism. On iOS the page size is 16 KB, so every mapping between physical memory and virtual memory must be an integer multiple of 16 KB. However, aligning every Segment this way creates a lot of fragmentation in memory. To solve this problem, the operating system lets adjacent Segments share the physical page on which they border. Seen another way, the executable file is divided from beginning to end into blocks of 16 KB, and each block is loaded into one physical page. A block that contains the border between two Segments is mapped twice, once into the virtual range of each Segment. Physical memory is then free of fragmentation, while each page in virtual memory still belongs to only one Segment.
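The difference between the two approaches can be checked with a small calculation. The segment names and sizes below are hypothetical, and the model assumes the Segments lie back to back in the file with non-zero sizes.

```python
PAGE_SIZE = 16 * 1024  # 16 KB pages, as on iOS


def pages_needed(size):
    return (size + PAGE_SIZE - 1) // PAGE_SIZE


def physical_pages_padded(segments):
    """Each Segment is padded to whole pages and gets its own physical pages."""
    return sum(pages_needed(size) for _, size in segments)


def file_pages_per_segment(segments):
    """Map each Segment to the 16 KB file blocks it overlaps."""
    result = {}
    offset = 0
    for name, size in segments:
        first = offset // PAGE_SIZE
        last = (offset + size - 1) // PAGE_SIZE
        result[name] = list(range(first, last + 1))
        offset += size
    return result


def physical_pages_shared(segments):
    """Border blocks are loaded once and mapped into both Segments."""
    used = set()
    for pages in file_pages_per_segment(segments).values():
        used.update(pages)
    return len(used)


# Hypothetical Segments: (name, size in bytes).
segments = [("SEG_A", 20000), ("SEG_B", 25000)]

if __name__ == "__main__":
    print(physical_pages_padded(segments))   # 4
    print(physical_pages_shared(segments))   # 3
    print(file_pages_per_segment(segments))  # {'SEG_A': [0, 1], 'SEG_B': [1, 2]}
```

Block 1 holds the end of SEG_A and the start of SEG_B, so it appears in both lists: it is one physical page mapped twice, and the process still sees four virtual pages.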

Reference

Self-cultivation of the Programmer