Mach-O

Mach-O (Mach Object) is the file format used to store programs and libraries on macOS, iOS, and iPadOS. It is part of the Application Binary Interface (ABI) that these systems rely on to load and run such files.

Mach-O replaced the a.out format of BSD systems. A Mach-O file stores the machine code and data produced by compilation and linking, which gives statically linked and dynamically linked code a single file format.

Common files in Mach-O format

  • Object files (.o)
  • Library files
    • .a
    • .dylib
    • Framework
  • Executable files
  • dyld (the dynamic linker)
  • .dSYM (debug symbol files)

You can check a file's type with the file command in a terminal. Here, the current demo's binary is a 64-bit Mach-O executable for the x86_64 architecture.

Mach-O file structure

View a Mach-O file in the MachOView tool

You can see that a Mach-O file consists of a header, Load Commands, the __TEXT segment, the __DATA segment, symbol tables, and some other information. dyld locates the specific code based on the information stored in the Load Commands.

Mach-O can be understood as file configuration plus binary code, that is:

Mach-O = file configuration + binary code

Apple's official diagram summarizes this:

The Mach-O structure, as shown in the figure, includes:

  • Header: contains general information about the binary file
    • Byte order, architecture type, number of load commands, etc.
    • This lets you quickly check whether the file is 32-bit or 64-bit, which processor it targets, and what its file type is
  • Load commands: a table with many entries
    • The entries describe the location of each region, the symbol table, the dynamic symbol table, etc.
  • Data: usually the largest part of the object file; contains the actual data of each segment
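
The quick 32-bit/64-bit check mentioned above comes down to reading the first four bytes of the file. The following is a minimal sketch, not a definitive implementation: the magic values are the ones defined in <mach-o/loader.h> and <mach-o/fat.h>, repeated here so the code builds on any platform, while classify_magic and macho_kind are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Magic values copied from <mach-o/loader.h> and <mach-o/fat.h> so that
   this sketch compiles without Apple's headers. */
#define MH_MAGIC    0xfeedfaceu  /* 32-bit, same byte order as the host */
#define MH_CIGAM    0xcefaedfeu  /* 32-bit, byte-swapped */
#define MH_MAGIC_64 0xfeedfacfu  /* 64-bit, same byte order as the host */
#define MH_CIGAM_64 0xcffaedfeu  /* 64-bit, byte-swapped */
#define FAT_MAGIC   0xcafebabeu  /* universal (fat) binary */
#define FAT_CIGAM   0xbebafecau  /* universal, byte-swapped */

typedef enum { MACHO_UNKNOWN, MACHO_32, MACHO_64, MACHO_FAT } macho_kind;

/* Hypothetical helper, not a system API. Reads the first four bytes in
   host byte order and sets *swapped to 1 when the file's byte order
   differs from the host's. */
static macho_kind classify_magic(const unsigned char *bytes, size_t len,
                                 int *swapped)
{
    uint32_t magic;

    assert(bytes != NULL && swapped != NULL);
    *swapped = 0;
    if (len < sizeof magic)
        return MACHO_UNKNOWN;
    memcpy(&magic, bytes, sizeof magic);

    switch (magic) {
    case MH_MAGIC:    return MACHO_32;
    case MH_CIGAM:    *swapped = 1; return MACHO_32;
    case MH_MAGIC_64: return MACHO_64;
    case MH_CIGAM_64: *swapped = 1; return MACHO_64;
    case FAT_MAGIC:   return MACHO_FAT;
    case FAT_CIGAM:   *swapped = 1; return MACHO_FAT;
    default:          return MACHO_UNKNOWN;
    }
}
```

A caller would read the first four bytes of a file into a buffer and pass them in; when swapped is set, every later multi-byte field has to be byte-swapped before use.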

Header

64-bit Header structure

struct mach_header_64 {
	uint32_t	magic;		/* Magic number; quickly identifies 32-bit vs. 64-bit */
	cpu_type_t	cputype;	/* CPU type, such as ARM */
	cpu_subtype_t	cpusubtype;	/* CPU subtype, for example arm64 or armv7 */
	uint32_t	filetype;	/* File type */
	uint32_t	ncmds;		/* Number of load commands */
	uint32_t	sizeofcmds;	/* Total size of all load commands, in bytes */
	uint32_t	flags;		/* Flags describing what the binary supports, mainly for loading and linking */
	uint32_t	reserved;	/* Reserved */
};
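
To make the header fields concrete, here is a small sketch that copies a 64-bit header out of a byte buffer and interprets two of its fields. It rests on several assumptions: header64_sketch mirrors mach_header_64 with cpu_type_t and cpu_subtype_t written as int32_t so that it builds without Apple's headers, the filetype constants are copied from <mach-o/loader.h>, and read_header64, filetype_name, and load_commands_end are hypothetical helpers. It only handles files whose byte order matches the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors mach_header_64 field for field (32 bytes in total). */
struct header64_sketch {
    uint32_t magic;
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;
    uint32_t ncmds;
    uint32_t sizeofcmds;
    uint32_t flags;
    uint32_t reserved;
};

#define MH_MAGIC_64 0xfeedfacfu

/* A few filetype values from <mach-o/loader.h>. */
#define MH_OBJECT   0x1u
#define MH_EXECUTE  0x2u
#define MH_DYLIB    0x6u
#define MH_DYLINKER 0x7u
#define MH_DSYM     0xau

/* Maps a filetype value to a short human-readable description. */
static const char *filetype_name(uint32_t filetype)
{
    switch (filetype) {
    case MH_OBJECT:   return "object file";
    case MH_EXECUTE:  return "executable";
    case MH_DYLIB:    return "dynamic library";
    case MH_DYLINKER: return "dynamic linker";
    case MH_DSYM:     return "debug symbols";
    default:          return "other";
    }
}

/* Copies a 64-bit header out of buf. Returns 0 on success, or -1 if buf
   is too short or the magic is not MH_MAGIC_64 in host byte order. */
static int read_header64(const unsigned char *buf, size_t len,
                         struct header64_sketch *out)
{
    assert(buf != NULL && out != NULL);
    if (len < sizeof *out)
        return -1;
    memcpy(out, buf, sizeof *out);
    return out->magic == MH_MAGIC_64 ? 0 : -1;
}

/* The load commands follow the header directly, so they end at this
   file offset. */
static size_t load_commands_end(const struct header64_sketch *h)
{
    return sizeof *h + h->sizeofcmds;
}
```

In practice buf would hold the first bytes of a thin (non-universal) Mach-O file; a universal binary starts with a fat header instead and needs an extra step to find the slice for one architecture.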


View the header in the MachOView tool

Running objdump --macho --private-header on the Mach-O file prints the header fields that correspond to the structure above.

LoadCommands

64-bit segment load command structure (segment_command_64)

struct segment_command_64 { /* for 64-bit architectures */
	uint32_t	cmd;		/* Command type: LC_SEGMENT_64 */
	uint32_t	cmdsize;	/* Size of this command, including its section_64 structures */
	char		segname[16];	/* Segment name */
	uint64_t	vmaddr;		/* Segment virtual memory address */
	uint64_t	vmsize;		/* Segment virtual memory size */
	uint64_t	fileoff;	/* Segment offset in the file */
	uint64_t	filesize;	/* Segment size in the file */
	vm_prot_t	maxprot;	/* Maximum virtual memory protection */
	vm_prot_t	initprot;	/* Initial virtual memory protection */
	uint32_t	nsects;		/* Number of sections */
	uint32_t	flags;		/* Flags */
};
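
Every load command begins with the same two fields, cmd and cmdsize, so the table can be walked without knowing each command's full layout: read the two fields, inspect the command if it is of interest, then advance by cmdsize. The sketch below illustrates that walk under stated assumptions: the struct definitions are local copies with vm_prot_t written as int32_t, find_segment_64 is a hypothetical helper, and the buffer is assumed to be in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LC_SEGMENT_64 0x19u  /* value from <mach-o/loader.h> */

/* Common prefix shared by all load commands. */
struct load_command_sketch {
    uint32_t cmd;
    uint32_t cmdsize;
};

/* Local copy of segment_command_64 (72 bytes). */
struct segment64_sketch {
    uint32_t cmd;
    uint32_t cmdsize;
    char     segname[16];
    uint64_t vmaddr;
    uint64_t vmsize;
    uint64_t fileoff;
    uint64_t filesize;
    int32_t  maxprot;
    int32_t  initprot;
    uint32_t nsects;
    uint32_t flags;
};

/* Hypothetical helper: scans ncmds load commands stored in cmds
   (sizeofcmds bytes long) for an LC_SEGMENT_64 named name.
   Returns 1 and fills *out if found, 0 otherwise or on malformed input. */
static int find_segment_64(const unsigned char *cmds, uint32_t ncmds,
                           uint32_t sizeofcmds, const char *name,
                           struct segment64_sketch *out)
{
    size_t off = 0;
    uint32_t i;

    assert(cmds != NULL && name != NULL && out != NULL);
    for (i = 0; i < ncmds; i++) {
        struct load_command_sketch lc;

        if (off + sizeof lc > sizeofcmds)
            return 0;                       /* truncated table */
        memcpy(&lc, cmds + off, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > sizeofcmds - off)
            return 0;                       /* bogus cmdsize */

        if (lc.cmd == LC_SEGMENT_64 && lc.cmdsize >= sizeof *out) {
            struct segment64_sketch seg;

            memcpy(&seg, cmds + off, sizeof seg);
            /* segname is not NUL-terminated when it is 16 bytes long */
            if (strncmp(seg.segname, name, sizeof seg.segname) == 0) {
                *out = seg;
                return 1;
            }
        }
        off += lc.cmdsize;                  /* step to the next command */
    }
    return 0;
}
```

The cmds pointer would normally point just past the header, with ncmds and sizeofcmds taken from the header fields of the same names.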

Some common load commands:

  • LC_SEGMENT_64: maps a 64-bit segment of the file into the process address space (LC_SEGMENT is the 32-bit counterpart)
  • LC_DYLD_INFO_ONLY: information related to dynamic linking
  • LC_SYMTAB: location of the symbol table
  • LC_DYSYMTAB: location of the dynamic symbol table
  • LC_LOAD_DYLINKER: which dynamic linker loads the program; here it is dyld
  • LC_UUID: UUID of the file
  • LC_VERSION_MIN_MACOSX: minimum supported operating system version
  • LC_SOURCE_VERSION: source code version
  • LC_MAIN: entry point address of the program's main thread and its stack size
  • LC_ENCRYPTION_INFO_64: encryption information
  • LC_LOAD_DYLIB: path of a dependent library, including third-party libraries
  • LC_FUNCTION_STARTS: table of function start addresses
  • LC_DATA_IN_CODE: list of non-instruction data ranges embedded in code sections
  • LC_CODE_SIGNATURE: code signature
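
When dumping load commands by hand, it helps to translate the numeric cmd field back into the names listed above. The table below is a sketch of such a lookup: the numeric values are the ones from <mach-o/loader.h> (commands that dyld must understand have the LC_REQ_DYLD bit set), while lc_entry, lc_name, and lc_value are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LC_REQ_DYLD 0x80000000u  /* set on commands dyld must understand */

struct lc_entry {
    uint32_t    cmd;
    const char *name;
};

/* Values copied from <mach-o/loader.h>. */
static const struct lc_entry lc_table[] = {
    { 0x2u,                "LC_SYMTAB" },
    { 0xbu,                "LC_DYSYMTAB" },
    { 0xcu,                "LC_LOAD_DYLIB" },
    { 0xeu,                "LC_LOAD_DYLINKER" },
    { 0x19u,               "LC_SEGMENT_64" },
    { 0x1bu,               "LC_UUID" },
    { 0x1du,               "LC_CODE_SIGNATURE" },
    { 0x22u | LC_REQ_DYLD, "LC_DYLD_INFO_ONLY" },
    { 0x24u,               "LC_VERSION_MIN_MACOSX" },
    { 0x26u,               "LC_FUNCTION_STARTS" },
    { 0x28u | LC_REQ_DYLD, "LC_MAIN" },
    { 0x29u,               "LC_DATA_IN_CODE" },
    { 0x2au,               "LC_SOURCE_VERSION" },
    { 0x2cu,               "LC_ENCRYPTION_INFO_64" },
};

#define LC_TABLE_LEN (sizeof lc_table / sizeof lc_table[0])

/* Returns the name for a cmd value, or "unknown" if it is not listed. */
static const char *lc_name(uint32_t cmd)
{
    size_t i;

    for (i = 0; i < LC_TABLE_LEN; i++)
        if (lc_table[i].cmd == cmd)
            return lc_table[i].name;
    return "unknown";
}

/* Returns the cmd value for a name, or 0 if the name is not listed
   (0 is not a valid load command). */
static uint32_t lc_value(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < LC_TABLE_LEN; i++)
        if (strcmp(lc_table[i].name, name) == 0)
            return lc_table[i].cmd;
    return 0;
}
```

The table only covers the commands named in this article; a real tool would need the full list from the header.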

Data

This region stores the actual content: code, string constants, classes, methods, and so on.

  • Code segment (__TEXT)

The __TEXT segment typically starts at file offset 0, so it also covers the Mach-O header and load commands. The start of the Mach-O image in memory is therefore read from the beginning of this segment.

  • Data segment (__DATA)