This is the seventh day of my participation in the August More Text Challenge. For details, see: August More Text Challenge

iOS Underlying Principles + Reverse Engineering article summary

This article focuses on a demonstration of the internal structure of Mach-O files.

Mach-O file structure

What follows is mostly a demonstration of the Mach-O structure. For a fuller introduction to Mach-O, see iOS Underlying Principles 32: Startup Optimization (I) Basic Concepts; it is not covered in detail here.

Mach-O internal structure

Because Mach-O is itself a file format, we need to understand its internal structure, which is shown in the figure below. A Mach-O file is divided into three parts:

  • Header: contains general information about the binary

    • Byte order, architecture type, number of load commands, and so on.
    • This lets you quickly confirm whether the file is 32-bit or 64-bit, which processor it targets, and what type of file it is
  • Load Commands: a table with many entries

    • The entries include the location of each region, the symbol table, the dynamic symbol table, and so on.
  • Data: usually the largest part of the object file

    • Contains the actual data of each Segment; the data is first divided into segments, and each segment is then divided into sections

Terminal command: otool

In the terminal, we can use otool to inspect information in a Mach-O file

  • To view the header information of a Mach-O file: otool -f 12-macho (the -f option prints the fat headers of a universal binary)
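To make the output of otool -f easier to read, here is a minimal Python sketch of how a fat (universal) header could be decoded by hand. It assumes the 32-bit fat format, in which all fields are big-endian. The sample bytes are synthetic: the ARM_V7 offset and size are borrowed from the MachOView numbers discussed later in this article, and the ARM_V7s size is made up for illustration.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # magic of a 32-bit fat header, stored big-endian


def parse_fat_header(data: bytes):
    """Return one (cputype, cpusubtype, offset, size, align) tuple per slice."""
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a 32-bit fat header")
    archs = []
    for i in range(nfat_arch):
        # Each fat_arch entry is five big-endian 32-bit fields (20 bytes).
        archs.append(struct.unpack_from(">iiIII", data, 8 + i * 20))
    return archs


# Synthetic two-slice header (values are illustrative, not read from a real file).
sample = struct.pack(">II", FAT_MAGIC, 2)
sample += struct.pack(">iiIII", 12, 9, 16384, 79376, 14)   # CPU_TYPE_ARM, ARM_V7
sample += struct.pack(">iiIII", 12, 11, 98304, 79376, 14)  # CPU_TYPE_ARM, ARM_V7S (size assumed)

archs = parse_fat_header(sample)
for cputype, cpusubtype, offset, size, align in archs:
    print(f"cputype={cputype} cpusubtype={cpusubtype} offset={offset} size={size} align=2^{align}")
```

For a real file, the same function could be fed the first bytes of the binary, for example open("12-macho", "rb").read(4096).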

MachOView software

  • Use MachOView to view the Mach-O file

  • Is the data between ARM_V7 and ARM_V7s page-aligned?

Using the ARM_V7 and ARM_V7s values shown in MachOView: 16384 + 79376 - 98304 = -2544

    • 'ARM_V7' size including the gap: '-2544 - 79376 = -81920'
    • Check whether this value is a multiple of pageSize (4096): '81920 / 4096 = 20' pages (macOS); on iOS this is '20 / 4 = 5' pages, which shows that the slice is paged, i.e. page-aligned
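The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes that 16384 and 79376 are the offset and size of the ARM_V7 slice and that 98304 is the offset of the ARM_V7s slice, which is how the numbers read but is not stated explicitly. The page sizes are the 4096 bytes quoted above for macOS and four times that (16384) for iOS.

```python
ARMV7_OFFSET = 16384    # assumed: file offset of the ARM_V7 slice
ARMV7_SIZE = 79376      # assumed: size of the ARM_V7 slice
ARMV7S_OFFSET = 98304   # assumed: file offset of the ARM_V7s slice

PAGE_SIZE_4K = 4096     # page size used in the article for macOS
PAGE_SIZE_16K = 16384   # 4 * 4096, the iOS page size implied by "20 / 4"

end_of_armv7 = ARMV7_OFFSET + ARMV7_SIZE   # where the ARM_V7 data actually ends
padding = ARMV7S_OFFSET - end_of_armv7     # gap before the next slice starts
padded_size = ARMV7_SIZE + padding         # space reserved for ARM_V7

pages_4k = padded_size // PAGE_SIZE_4K
pages_16k = padded_size // PAGE_SIZE_16K
is_page_aligned = padded_size % PAGE_SIZE_16K == 0

print(padding, padded_size, pages_4k, pages_16k, is_page_aligned)
```

The padding of 2544 bytes is exactly what is needed to push the next slice to a page boundary.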

Comparing the two

  • The otool output compared with MachOView's view of the Mach-O file

MachOView demo

Viewed through MachOView, the Mach-O structure is divided into three parts: Header, Load Commands, Data

  • 1. View the Mach-O Header under arm64

  • 2. View Load Commands

    • 'VM Addr': virtual memory address
    • 'VM Size': virtual memory size, i.e. the size occupied in memory at run time (4G)
        • 64-bit address: 0x12345678a23456778
        • 32-bit address: 0x12345678
    • 'File offset': offset of the data in the file
    • 'File size': size of the data in the file
  • 3. Sections are grouped into two main categories: __TEXT (code) and __DATA (data)
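The four fields listed under Load Commands (VM Addr, VM Size, File offset, File size) come from the segment load command. As a rough sketch, the following Python decodes one 64-bit segment command from raw bytes and uses it to translate a file offset into a virtual address. It assumes a little-endian slice such as arm64; the sample bytes are synthetic and the helper names are made up for this example.

```python
import struct
from collections import namedtuple

LC_SEGMENT_64 = 0x19
# Layout of segment_command_64: 72 bytes when packed without padding.
SEGMENT_64_FORMAT = "<II16sQQQQiiII"

Segment64 = namedtuple(
    "Segment64",
    "cmd cmdsize segname vmaddr vmsize fileoff filesize maxprot initprot nsects flags",
)


def parse_segment_64(data: bytes, offset: int = 0) -> Segment64:
    """Decode one segment_command_64 starting at `offset`."""
    seg = Segment64(*struct.unpack_from(SEGMENT_64_FORMAT, data, offset))
    # segname is a fixed 16-byte field padded with NUL bytes.
    return seg._replace(segname=seg.segname.rstrip(b"\x00").decode("ascii"))


def file_offset_to_vmaddr(seg: Segment64, file_offset: int) -> int:
    """Map an offset inside the segment's file range to its (unslid) VM address."""
    if not seg.fileoff <= file_offset < seg.fileoff + seg.filesize:
        raise ValueError("offset is outside this segment")
    return seg.vmaddr + (file_offset - seg.fileoff)


# Synthetic __TEXT segment: VM Addr 0x100000000, File offset 0 (illustrative values).
sample = struct.pack(
    SEGMENT_64_FORMAT,
    LC_SEGMENT_64, 72, b"__TEXT",
    0x100000000, 0x8000, 0, 0x8000,
    5, 5, 0, 0,
)
text = parse_segment_64(sample)
addr = file_offset_to_vmaddr(text, 0x65D4)
print(text.segname, hex(text.vmaddr), hex(addr))
```

The address computed here is the one recorded in the file; at run time the ASLR slide is added on top of it.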

1. Header data structure

  • 1. Press CMD+Shift+O and search for loader.h to find the mach_header_64 (arm64 architecture) data structure, shown below. Compared with mach_header, it only has one extra field: reserved
<!-- mach_header -->
struct mach_header {
    uint32_t      magic;       /* magic number */
    cpu_type_t    cputype;     /* CPU type */
    cpu_subtype_t cpusubtype;  /* CPU subtype */
    uint32_t      filetype;    /* file type */
    uint32_t      ncmds;       /* number of load commands */
    uint32_t      sizeofcmds;  /* total size of the load commands */
    uint32_t      flags;       /* flag bits identifying the features the binary supports, mainly related to system loading and linking */
};
<!-- mach_header_64 -->
struct mach_header_64 {
    uint32_t      magic;       /* magic number */
    cpu_type_t    cputype;     /* CPU type */
    cpu_subtype_t cpusubtype;  /* CPU subtype */
    uint32_t      filetype;    /* file type */
    uint32_t      ncmds;       /* number of load commands */
    uint32_t      sizeofcmds;  /* total size of the load commands */
    uint32_t      flags;       /* flag bits identifying the features the binary supports, mainly related to system loading and linking */
    uint32_t      reserved;    /* reserved */
};
  • 2. Check the filetype values
    • Object file: #define MH_OBJECT 0x1

    • Executable file: #define MH_EXECUTE 0x2

    • .
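As an illustration of the header layout and the filetype values, here is a small Python sketch that unpacks a mach_header or mach_header_64 from raw bytes and names its file type. It assumes a little-endian thin slice (not a fat file). The sample header is synthetic: its ncmds, sizeofcmds, and flags values are invented for the example.

```python
import struct
from collections import namedtuple

MH_MAGIC = 0xFEEDFACE     # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF  # 64-bit Mach-O

# A subset of the filetype constants from loader.h.
FILETYPES = {
    0x1: "MH_OBJECT",
    0x2: "MH_EXECUTE",
    0x6: "MH_DYLIB",
    0x7: "MH_DYLINKER",
    0x8: "MH_BUNDLE",
    0xA: "MH_DSYM",
}

MachHeader = namedtuple(
    "MachHeader",
    "magic cputype cpusubtype filetype ncmds sizeofcmds flags reserved",
)


def parse_mach_header(data: bytes) -> MachHeader:
    """Decode the header at the start of `data`; reserved is None for 32-bit files."""
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic == MH_MAGIC_64:
        return MachHeader(*struct.unpack_from("<IiiIIIII", data, 0))
    if magic == MH_MAGIC:
        return MachHeader(*struct.unpack_from("<IiiIIII", data, 0), None)
    raise ValueError("not a little-endian Mach-O header")


# Synthetic arm64 executable header (field values are illustrative).
CPU_TYPE_ARM64 = 0x0100000C
sample = struct.pack("<IiiIIIII", MH_MAGIC_64, CPU_TYPE_ARM64, 0, 0x2, 22, 2800, 0x200085, 0)

header = parse_mach_header(sample)
kind = FILETYPES.get(header.filetype, hex(header.filetype))
print(hex(header.magic), kind, header.ncmds, header.sizeofcmds)
```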

2. Load Commands

The meanings of the main load commands are listed below

Load command             Description
LC_SEGMENT_64            Maps a segment of the file (32-bit or 64-bit) into the process address space; the segments are __TEXT, __DATA, and __LINKEDIT
LC_DYLD_INFO_ONLY        Dynamic linking information
LC_SYMTAB                Symbol table address
LC_DYSYMTAB              Dynamic symbol table address
LC_LOAD_DYLINKER         Which dynamic linker loads the program; here it is dyld
LC_UUID                  The UUID that uniquely identifies the Mach-O file
LC_VERSION_MIN_MACOSX    Minimum supported operating system version
LC_SOURCE_VERSION        Source code version
LC_MAIN                  Entry address and stack size for the program's main thread
LC_ENCRYPTION_INFO_64    Encryption information
LC_LOAD_DYLIB            Paths of the dependent libraries, including third-party libraries
LC_FUNCTION_STARTS       Function start address table
LC_CODE_SIGNATURE        Code signature
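Every load command starts with the same two 32-bit fields, cmd and cmdsize, so the table can be walked generically. The sketch below shows one way to do that in Python for a little-endian 64-bit file, where the commands start right after the 32-byte header. The blob is synthetic and only contains two simple commands; iter_load_commands is a name invented for this example.

```python
import struct

LC_REQ_DYLD = 0x80000000  # high bit set on commands dyld must understand

# Command IDs from loader.h for the load commands listed above.
LC_NAMES = {
    0x2: "LC_SYMTAB",
    0xB: "LC_DYSYMTAB",
    0xC: "LC_LOAD_DYLIB",
    0xE: "LC_LOAD_DYLINKER",
    0x19: "LC_SEGMENT_64",
    0x1B: "LC_UUID",
    0x1D: "LC_CODE_SIGNATURE",
    0x22 | LC_REQ_DYLD: "LC_DYLD_INFO_ONLY",
    0x24: "LC_VERSION_MIN_MACOSX",
    0x26: "LC_FUNCTION_STARTS",
    0x28 | LC_REQ_DYLD: "LC_MAIN",
    0x2A: "LC_SOURCE_VERSION",
    0x2C: "LC_ENCRYPTION_INFO_64",
}


def iter_load_commands(data: bytes, ncmds: int, start: int = 32):
    """Yield (offset, name, cmdsize) for each load command after a 64-bit header."""
    offset = start
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command size")
        yield offset, LC_NAMES.get(cmd, hex(cmd)), cmdsize
        offset += cmdsize  # cmdsize covers the whole command, header included


# Synthetic blob: 32 placeholder header bytes, then LC_UUID and LC_SOURCE_VERSION.
blob = bytes(32)
blob += struct.pack("<II16s", 0x1B, 24, bytes(16))  # uuid_command: 24 bytes
blob += struct.pack("<IIQ", 0x2A, 16, 0)            # source_version_command: 16 bytes

commands = list(iter_load_commands(blob, ncmds=2))
for offset, name, size in commands:
    print(offset, name, size)
```

In a real file, ncmds comes from the header, and the sum of all cmdsize values should equal sizeofcmds.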

Demo

  • 1. The __TEXT, __DATA, and __LINKEDIT segments in LC_SEGMENT_64 are shown in the following figure

  • 2. View LC_DYLD_INFO_ONLY, the dynamic linking information

Rebase means relocation. The relocation process is as follows:

    • 1) In a Mach-O file, the code segment is given offset addresses at compile time
    • 2) At run time, the Mach-O file is loaded into virtual memory, at an address that changes randomly (assigned by the system: ASLR)
    • 3) So the addresses recorded in the Mach-O file cannot be used directly and must be relocated via 'ASLR + Rebase Info Offset'; what mainly changes is the assembly code
  • 3. View LC_SYMTAB, the symbol table address

  • 4. View LC_LOAD_DYLINKER, which shows which dynamic linker is used; in this case, dyld

  • 6. View LC_UUID, the unique identifier of the Mach-O file

  • 7. View LC_VERSION_MIN_MACOSX, the minimum supported version

  • 8. View LC_SOURCE_VERSION, the source code version

  • 9. View LC_MAIN, the entry function

Purpose: when the entry point cannot be found during reverse engineering (for example, the app crashes on launch after protection has been applied), it can be found here

  • 10. View LC_ENCRYPTION_INFO_64; here Crypt ID is 0, indicating that the binary is not encrypted
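The rebase step described under LC_DYLD_INFO_ONLY comes down to adding the ASLR slide to every address recorded in the file. A minimal Python sketch of that arithmetic follows. The preferred base 0x100000000 is the usual __TEXT VM address of a 64-bit executable, and the actual load address is a made-up value standing in for whatever the system picks at launch.

```python
def compute_slide(preferred_base: int, actual_base: int) -> int:
    """ASLR slide: how far the image was moved from the address recorded in the file."""
    return actual_base - preferred_base


def rebase(address_in_file: int, slide: int) -> int:
    """Address used at run time for a pointer that was recorded at link time."""
    return address_in_file + slide


PREFERRED_BASE = 0x100000000  # __TEXT VM address recorded in the Mach-O file
ACTUAL_BASE = 0x104A3C000     # hypothetical load address chosen by the system

slide = compute_slide(PREFERRED_BASE, ACTUAL_BASE)
runtime_address = rebase(0x1000065D4, slide)
print(hex(slide), hex(runtime_address))
```

When debugging, the same subtraction is used in the other direction: a run-time address minus the slide gives the address to look up in MachOView.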

3. Data

  • 1. To quickly locate a piece of code, go through the VM Address in LC_SEGMENT_64 (__TEXT)

  • 2. View the starting position of the code segment

You can also use the objdump command: objdump --macho -d 12-macho. As you can see, the output corresponds exactly to the Mach-O file

  • Stub, stub_helper: used for symbol binding. Here 0x1000065d4 all point to the offset 000325D4, and the first six assembly instructions perform the symbol binding

  • 4. View the external symbol tables (used for calls to external functions, which are bound only at run time). There are two: lazy and non-lazy

Here, the function dedicated to binding external functions is bound first, and that function is then used to bind the other functions
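The idea of lazy binding can be modeled in a few lines of Python. This is a toy model only, not how dyld is implemented: a dictionary stands in for the lazy symbol pointer table, a method stands in for the binder function that is bound up front, and all names are invented for the example.

```python
class LazyBinder:
    """Toy model: resolve an external symbol on its first call, then reuse the pointer."""

    def __init__(self, exported):
        self._exported = exported   # stands in for symbols exported by other libraries
        self._lazy_pointers = {}    # stands in for the lazy symbol pointer table
        self.bind_calls = 0         # how many times the binder actually ran

    def _bind(self, name):
        # Plays the role of the binder function, which is itself available from the start.
        self.bind_calls += 1
        self._lazy_pointers[name] = self._exported[name]
        return self._lazy_pointers[name]

    def call(self, name, *args):
        # First call goes through the binder; later calls use the stored pointer.
        target = self._lazy_pointers.get(name) or self._bind(name)
        return target(*args)


binder = LazyBinder({"strlen": len})
first = binder.call("strlen", "mach-o")   # triggers binding
second = binder.call("strlen", "dyld")    # already bound, no binder call
print(first, second, binder.bind_calls)
```

Non-lazy symbols, by contrast, would correspond to filling the pointer table for every name before the first call.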

Conclusion

  • Mach-O internal structure

    • Header: used to quickly determine the CPU type and file type of the file

    • Load Commands: tell the loader how to set up and load the binary data

    • Data: stores the actual content, such as code, data, string constants, classes, methods, etc.

      • Sections are grouped into two main categories: __TEXT (code) and __DATA (data)
  • You can use the otool command to view Mach-O information, for example the header information: otool -f 12-macho

  • The code segment can be viewed with the objdump command: objdump --macho -d 12-macho