This is day 7 of my participation in the August More Text Challenge. For details, see: August More Text Challenge
iOS Underlying Principles + Reverse Engineering: Article Summary
This article focuses on a demonstration of the internal structure of Mach-O files
Mach-O file structure
What follows is mostly a demonstration of the Mach-O structure. For a fuller introduction to Mach-O, see iOS Underlying Principles 32: Startup Optimization (I) Basic Concepts; it is not covered in detail here.
Mach-O internal structure
Because Mach-O is itself a file format, we need to understand its internal structure, which is shown in the figure below. The Mach-O structure is divided into three parts:
- Header: contains general information about the binary
  - Byte order, architecture type, number of load commands, and so on.
  - This lets you quickly check whether the file is 32-bit or 64-bit, which processor it targets, and what type of file it is.
- Load Commands: a table with many entries
  - The entries include the location of each region, the symbol table, the dynamic symbol table, and so on.
- Data: usually the largest part of the object file
  - Contains the actual data of the segments. The data is first divided into segments, and each segment is then divided into sections.
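As a rough illustration of the "quick check" the Header allows, the sketch below classifies a file from its first four bytes. The magic values are the ones defined in `<mach-o/loader.h>` and `<mach-o/fat.h>`; the `classify` helper itself is a hypothetical name introduced only for this example.

```python
import struct

# Magic numbers from <mach-o/loader.h> (MH_MAGIC, MH_MAGIC_64)
# and <mach-o/fat.h> (FAT_MAGIC).
MAGICS = {
    0xFEEDFACE: "32-bit Mach-O",
    0xFEEDFACF: "64-bit Mach-O",
    0xCAFEBABE: "fat (universal) binary",
}


def classify(first4: bytes) -> str:
    """Hypothetical helper: describe a file from its first four bytes."""
    # Try both byte orders; the one that yields a known magic tells us
    # how the rest of the header is stored.
    for fmt, order in (("<I", "little-endian"), (">I", "big-endian")):
        (magic,) = struct.unpack(fmt, first4)
        if magic in MAGICS:
            return f"{MAGICS[magic]}, {order}"
    return "not a Mach-O file"


print(classify(bytes.fromhex("cffaedfe")))  # 64-bit Mach-O, little-endian
print(classify(bytes.fromhex("cafebabe")))  # fat (universal) binary, big-endian
```

To try it on a real file, pass `open(path, "rb").read(4)`; the path is whatever binary you want to inspect.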
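Terminal command: otool
In the terminal, we can use otool to inspect Mach-O information
- To view the header information for a Mach-O file: otool -f 12-macho (-f prints the fat/universal header that lists each architecture slice; otool -h prints the Mach header itself)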
MachOView software
- Use MachOView to view the Mach-O file
- Is there page alignment between ARM_V7 and ARM_V7s?
  Going by the values in the calculation, ARM_V7 starts at offset 16384 with size 79376, and ARM_V7s starts at offset 98304: 16384 + 79376 - 98304 = -2544, so ARM_V7 ends 2544 bytes before ARM_V7s begins.
- ARM_V7 size including that gap: 2544 + 79376 = 81920. Check whether this value is a multiple of the page size (4096): 81920 / 4096 = 20 pages (macOS). On iOS, where a page is four times larger (16 KB), that is 20 / 4 = 5 pages. Both results are whole numbers, so the slices are page-aligned.
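The arithmetic above can be replayed in a few lines. This is only a sketch: the three numbers come from the calculation in the text, and `slice_layout` is a hypothetical helper name.

```python
PAGE_4K = 4096    # page size used in the macOS check above
PAGE_16K = 16384  # page size used in the iOS check above


def slice_layout(offset: int, size: int, next_offset: int):
    """Return (padding after the slice, size of the slice including padding)."""
    end = offset + size
    padding = next_offset - end
    return padding, size + padding


# ARM_V7 offset and size, and the ARM_V7s offset, as used in the text.
padding, span = slice_layout(16384, 79376, 98304)

print(padding)           # 2544
print(span)              # 81920
print(span // PAGE_4K)   # 20
print(span // PAGE_16K)  # 5
print(span % PAGE_4K == 0 and span % PAGE_16K == 0)  # True
```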
Comparing the two
- The otool output compared with MachOView's view of the same Mach-O file
MachOView demo
In MachOView, the Mach-O structure is divided into three parts: Header, Load Commands, and Data
- 1. View the Mach-O Header under arm64
- 2. View Load Commands
  - VM Addr: the virtual memory address
  - VM Size: the virtual memory size, i.e. the size occupied in memory at run time (4 GB here). A 64-bit address is up to 16 hex digits wide; a 32-bit address is up to 8 hex digits wide, for example 0x12345678
  - File offset: the offset of the data in the file
  - File size: the size of the data in the file
- 3. Sections fall into two main categories: __TEXT (code) and __DATA (data)
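To make the four fields above concrete, here is a minimal sketch that unpacks an LC_SEGMENT_64 command and converts a virtual address into a file offset. The field layout follows `struct segment_command_64` in `<mach-o/loader.h>`; the helper names and the sample segment values are made up for illustration, and a little-endian file is assumed.

```python
import struct
from typing import NamedTuple

LC_SEGMENT_64 = 0x19
# cmd, cmdsize, segname[16], vmaddr, vmsize, fileoff, filesize,
# maxprot, initprot, nsects, flags  (72 bytes, little-endian assumed)
SEGMENT_64 = struct.Struct("<II16sQQQQiiII")


class Segment(NamedTuple):
    name: str
    vmaddr: int    # "VM Addr" in MachOView
    vmsize: int    # "VM Size"
    fileoff: int   # "File offset"
    filesize: int  # "File size"


def parse_segment(raw: bytes) -> Segment:
    cmd, _size, name, vmaddr, vmsize, fileoff, filesize, *_ = SEGMENT_64.unpack_from(raw)
    if cmd != LC_SEGMENT_64:
        raise ValueError("not an LC_SEGMENT_64 command")
    return Segment(name.rstrip(b"\0").decode("ascii"), vmaddr, vmsize, fileoff, filesize)


def vm_to_file_offset(seg: Segment, addr: int) -> int:
    """Map a link-time virtual address inside `seg` to an offset in the file."""
    if not seg.vmaddr <= addr < seg.vmaddr + seg.filesize:
        raise ValueError("address is not backed by file data in this segment")
    return addr - seg.vmaddr + seg.fileoff


# Made-up __TEXT segment: mapped at 0x100000000, starting at file offset 0.
raw = SEGMENT_64.pack(LC_SEGMENT_64, 72, b"__TEXT", 0x100000000, 0x8000, 0, 0x8000, 5, 5, 0, 0)
text = parse_segment(raw)
print(text.name, hex(text.vmaddr))                # __TEXT 0x100000000
print(hex(vm_to_file_offset(text, 0x100004A10)))  # 0x4a10
```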
1. The Header data structure
- 1. In Xcode, press CMD+Shift+O and search for loader.h to find mach_header_64 (used for the arm64 architecture). Its data structure is shown below; compared with mach_header, it only has one extra field, reserved.

      /* mach_header (32-bit) */
      struct mach_header {
          uint32_t      magic;       /* magic number: identifies 32-bit vs 64-bit and byte order */
          cpu_type_t    cputype;     /* CPU type */
          cpu_subtype_t cpusubtype;  /* CPU subtype */
          uint32_t      filetype;    /* file type */
          uint32_t      ncmds;       /* number of load commands */
          uint32_t      sizeofcmds;  /* total size of all load commands */
          uint32_t      flags;       /* flags: features the binary supports, mainly related to system loading and linking */
      };

      /* mach_header_64 */
      struct mach_header_64 {
          uint32_t      magic;       /* magic number: identifies 32-bit vs 64-bit and byte order */
          cpu_type_t    cputype;     /* CPU type */
          cpu_subtype_t cpusubtype;  /* CPU subtype */
          uint32_t      filetype;    /* file type */
          uint32_t      ncmds;       /* number of load commands */
          uint32_t      sizeofcmds;  /* total size of all load commands */
          uint32_t      flags;       /* flags: features the binary supports, mainly related to system loading and linking */
          uint32_t      reserved;    /* reserved */
      };
- 2. Check the filetype values, for example:
  - Object file: #define MH_OBJECT 0x1
  - Executable file: #define MH_EXECUTE 0x2
  - ...
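The header fields and filetype values above can be read with a short parser. This is a sketch under stated assumptions: it handles only little-endian 64-bit headers, the constants are the ones from `<mach-o/loader.h>` and `<mach/machine.h>`, and the `parse_header` helper and the sample values are hypothetical.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
# magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags, reserved
MACH_HEADER_64 = struct.Struct("<8I")  # 32 bytes

FILETYPES = {
    0x1: "MH_OBJECT",
    0x2: "MH_EXECUTE",
    0x6: "MH_DYLIB",
    0x7: "MH_DYLINKER",
    0x8: "MH_BUNDLE",
    0xA: "MH_DSYM",
}
CPU_TYPES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def parse_header(raw: bytes) -> dict:
    """Hypothetical helper: decode a little-endian mach_header_64."""
    magic, cputype, _sub, filetype, ncmds, sizeofcmds, flags, _res = MACH_HEADER_64.unpack_from(raw)
    if magic != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O header")
    return {
        "cpu": CPU_TYPES.get(cputype, hex(cputype)),
        "filetype": FILETYPES.get(filetype, hex(filetype)),
        "ncmds": ncmds,
        "sizeofcmds": sizeofcmds,
        "flags": flags,
    }


# Made-up header: an arm64 executable with 22 load commands.
sample = MACH_HEADER_64.pack(MH_MAGIC_64, 0x0100000C, 0, 0x2, 22, 2784, 0x00200085, 0)
info = parse_header(sample)
print(info["cpu"], info["filetype"], info["ncmds"])  # arm64 MH_EXECUTE 22
```

For a thin (single-architecture) file, `parse_header(open(path, "rb").read(32))` would decode the real header; a fat file needs the slice offset first.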
2. Load Commands
The meanings of the main Load Commands are shown below
Load command | Description |
---|---|
LC_SEGMENT_64 | Maps a segment of the file into the process address space (LC_SEGMENT for 32-bit, LC_SEGMENT_64 for 64-bit); the segments include __TEXT, __DATA, and __LINKEDIT |
LC_DYLD_INFO_ONLY | Dynamic linking information |
LC_SYMTAB | Symbol table address |
LC_DYSYMTAB | Dynamic symbol table address |
LC_LOAD_DYLINKER | Which dynamic linker loads the program; here it is dyld |
LC_UUID | The UUID that uniquely identifies the Mach-O file |
LC_VERSION_MIN_MACOSX | The minimum supported operating system version |
LC_SOURCE_VERSION | Source code version |
LC_MAIN | Sets the entry address and stack size for the main thread of the program |
LC_ENCRYPTION_INFO_64 | Encryption information |
LC_LOAD_DYLIB | Paths of the dependent libraries, including third-party libraries |
LC_FUNCTION_STARTS | Function start address table |
LC_CODE_SIGNATURE | Code signature |
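Every load command begins with the same two fields, `cmd` and `cmdsize`, so the table can be walked without understanding each command's payload. The sketch below does that for a little-endian 64-bit image; the command numbers are those from `<mach-o/loader.h>`, while `iter_load_commands` and the tiny hand-built image are assumptions made for the example.

```python
import struct

LC_NAMES = {
    0x2: "LC_SYMTAB",
    0xB: "LC_DYSYMTAB",
    0xC: "LC_LOAD_DYLIB",
    0xE: "LC_LOAD_DYLINKER",
    0x19: "LC_SEGMENT_64",
    0x1B: "LC_UUID",
    0x1D: "LC_CODE_SIGNATURE",
    0x24: "LC_VERSION_MIN_MACOSX",
    0x26: "LC_FUNCTION_STARTS",
    0x2A: "LC_SOURCE_VERSION",
    0x2C: "LC_ENCRYPTION_INFO_64",
    0x80000022: "LC_DYLD_INFO_ONLY",
    0x80000028: "LC_MAIN",
}


def iter_load_commands(raw: bytes, ncmds: int, offset: int = 32):
    """Yield (name, offset, cmdsize); commands start after the 32-byte mach_header_64."""
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", raw, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command")
        yield LC_NAMES.get(cmd, hex(cmd)), offset, cmdsize
        offset += cmdsize  # cmdsize covers the whole command, header included


# Hand-built image: an empty 32-byte header, then LC_UUID and LC_MAIN.
image = (
    bytes(32)
    + struct.pack("<II16s", 0x1B, 24, bytes(16))
    + struct.pack("<IIQQ", 0x80000028, 24, 0x5F30, 0)
)
for name, off, size in iter_load_commands(image, ncmds=2):
    print(name, off, size)
# LC_UUID 32 24
# LC_MAIN 56 24
```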
Demo
- 1. The __TEXT, __DATA, and __LINKEDIT segments in LC_SEGMENT_64 are shown in the following figure
- 2. View LC_DYLD_INFO_ONLY, the dynamic linking information
  Rebase means relocation. The relocation process is as follows:
  - 1) In the Mach-O file, the code segment is assigned offset addresses at compile time
  - 2) At run time, the Mach-O file is loaded into virtual memory at an address that changes randomly (assigned by the system: ASLR)
  - 3) So the addresses recorded in the Mach-O file cannot be used as they are, and need to be relocated via ASLR + Rebase Info Offset. What mainly changes is the addresses used by the assembly code
- 3. View LC_SYMTAB, the symbol table address
- 4. View LC_LOAD_DYLINKER, which dynamic linker is used; in this case, dyld
- 6. View LC_UUID, the unique identifier of the Mach-O file
- 7. View LC_VERSION_MIN_MACOSX, the minimum supported version
- 8. View LC_SOURCE_VERSION, the source code version
- 9. View LC_MAIN, the entry function
  - Use: when reversing, if the entry point cannot be found (for example, the app is protected and crashes on launch), it can be located here
- 10. View LC_ENCRYPTION_INFO_64. Here, Crypt ID is 0, which indicates that the binary is not encrypted
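Three of the steps above (rebasing under ASLR, finding the entry point from LC_MAIN, and checking Crypt ID) reduce to a little arithmetic over fixed layouts. The sketch below follows `entry_point_command` and `encryption_info_command_64` from `<mach-o/loader.h>`. It assumes a little-endian file whose __TEXT segment starts at file offset 0, and every helper name and sample value is hypothetical.

```python
import struct

LC_MAIN = 0x80000028
LC_ENCRYPTION_INFO_64 = 0x2C


def rebase(link_time_addr: int, slide: int) -> int:
    """Address after ASLR: the compile-time address plus the random slide."""
    return link_time_addr + slide


def entry_address(lc_main: bytes, text_vmaddr: int, slide: int = 0) -> int:
    """Entry point from LC_MAIN (cmd, cmdsize, entryoff, stacksize)."""
    cmd, _size, entryoff, _stacksize = struct.unpack_from("<IIQQ", lc_main)
    if cmd != LC_MAIN:
        raise ValueError("not an LC_MAIN command")
    # entryoff is relative to the start of __TEXT (assumed at file offset 0).
    return rebase(text_vmaddr + entryoff, slide)


def is_encrypted(lc_enc: bytes) -> bool:
    """Crypt ID check (cmd, cmdsize, cryptoff, cryptsize, cryptid, pad)."""
    cmd, _size, _off, _len, cryptid, _pad = struct.unpack_from("<6I", lc_enc)
    if cmd != LC_ENCRYPTION_INFO_64:
        raise ValueError("not an LC_ENCRYPTION_INFO_64 command")
    return cryptid != 0


# Made-up commands for illustration.
lc_main = struct.pack("<IIQQ", LC_MAIN, 24, 0x5F30, 0)
lc_enc = struct.pack("<6I", LC_ENCRYPTION_INFO_64, 24, 0x4000, 0x8000, 0, 0)

print(hex(entry_address(lc_main, 0x100000000)))          # 0x100005f30
print(hex(entry_address(lc_main, 0x100000000, 0x4000)))  # 0x100009f30
print(is_encrypted(lc_enc))                              # False
```

The second call shows why a debugger sees a different address on each launch: only the slide changes, while the offset stored in the file stays the same.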
3. Data
- 1. To quickly locate a piece of code, start from the VM Address in LC_SEGMENT_64(__TEXT)
- 2. View the starting position of the code segment. You can also check it with the objdump command: objdump --macho -d 12-macho. As you can see, the output corresponds exactly to the Mach-O file
- Stub, stub_helper: these are used for symbol binding. Here, 0x1000065d4 points to offset 000325D4 in every case, and the first six assembly instructions perform the symbol binding
- 4. View the external symbol tables (calls to external functions, which are only bound at run time). There are two: lazy and non-lazy. In this case, the function dedicated to binding external functions is bound first, and that function is then used to bind the others
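The lazy binding idea described above (resolve an external function on its first call, then reuse the result) can be modelled with a toy pointer table. This is only an analogy in Python, not how dyld is implemented; the class name and the `strlen` stand-in are invented for the example.

```python
class LazyPointerTable:
    """Toy model of lazy symbol pointers."""

    def __init__(self, exported):
        self._exported = exported  # stands in for symbols exported by loaded libraries
        self._slots = {}           # symbol name -> resolved function
        self.bind_count = 0

    def _bind(self, name):
        # Plays the role of the binder: look the symbol up once and
        # patch the slot so that later calls skip this step.
        self.bind_count += 1
        self._slots[name] = self._exported[name]
        return self._slots[name]

    def call(self, name, *args):
        target = self._slots.get(name)
        if target is None:  # first call: slot not bound yet
            target = self._bind(name)
        return target(*args)


table = LazyPointerTable({"strlen": len})
print(table.call("strlen", "abc"))    # 3
print(table.call("strlen", "hello"))  # 5
print(table.bind_count)               # 1
```

A non-lazy symbol would correspond to filling its slot in `__init__`, before any call is made.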
Conclusion
- Mach-O internal structure
  - Header: used to quickly determine the CPU type and file type of the file
  - Load Commands: instruct the loader how to set up and load the binary data
  - Data: stores the actual data, such as code, data, string constants, classes, methods, etc. Sections fall into two main categories: __TEXT (code) and __DATA (data)
- You can run the otool command to view Mach-O information, for example the header information: otool -f 12-macho
- The code segment can be viewed with the objdump command: objdump --macho -d 12-macho