What is a symbol table

The debug symbol table is the .dSYM file that Xcode generates when it builds the project.

In layman's terms, the methods and variables we define all have names, and the table that holds these names is the symbol table.

Symbol binding and its process

ASLR and redirection

An image is a Mach-O binary that has been loaded into memory. A locally compiled executable becomes an image once it is loaded and run, and the binaries of dynamic libraries such as UIKit and Foundation are mapped into memory as images in the same way.

At a breakpoint, run the LLDB command image list to view the start address of the project's executable.

Each time the Mach-O file runs, its start address is different, because the system adds a random offset (ASLR) to the virtual address.

Subtracting the image's start address from the address of a symbol gives an offset that does not change no matter how many times you run the program.

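Next, what is redirection (rebasing)? It is the rule: image start address (which includes the ASLR offset) + symbol offset = address in memory.

  • Internal function addresses can be computed by redirection

  • External functions (dynamic library functions such as NSLog) cannot have their addresses computed this way, because they live in a different image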
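The arithmetic above can be sketched in C. This is only an illustration: the function names are hypothetical (they are not part of any Apple API), and the addresses used with them are made-up values standing in for two different launches.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of a symbol inside its image: symbol address minus image start.
   This value is the same on every launch. */
uint64_t symbol_offset(uint64_t image_start, uint64_t symbol_addr) {
    assert(symbol_addr >= image_start);
    return symbol_addr - image_start;
}

/* Address for the current launch: this launch's image start address
   (which already includes the random ASLR offset) plus the fixed offset. */
uint64_t runtime_address(uint64_t image_start, uint64_t offset) {
    return image_start + offset;
}
```

For example, if one launch reports an image start of 0x100a5c000 and a function at 0x100a5d234, the offset is 0x1234. On a later launch with an image start of 0x102f10000, the same function would be at 0x102f11234.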

When symbols are loaded

The dyld dynamic linker loads images. The Load Commands shown in MachOView include basic information about the libraries the project depends on. The application does not start running until its dependent libraries have been loaded.

Calls to external symbols rely on PIC (position-independent code). When the application has started but NSLog has not yet been called, NSLog and some of its information are already visible in the Indirect Symbols of the Dynamic Symbol Table, but symbol binding has not happened yet, so the real function address is not recorded there.

dyld does not perform symbol binding (telling the application the real address of the function) until the function is used for the first time. After that, the function can be called directly through the bound pointer.

This is lazy binding of symbols.

Symbol binding process

Every call to an external function first goes to that symbol's stub. The Symbol Stubs section holds, for each external symbol, a short piece of code that loads a pointer and jumps to it.

The pointer that the stub loads is the function's entry in the Lazy Symbol Pointers table.

If the function is being called for the first time, that entry does not yet hold the real address. It leads to helper code that passes the symbol's binding information to dyld_stub_binder, whose own address is stored in the Non-Lazy Symbol Pointers table.

dyld_stub_binder gets the real function address from the dynamic library, writes it into the Lazy Symbol Pointers entry, and calls the function.

Each subsequent call jumps directly to the bound address, without binding again.
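The flow above can be modeled in plain C with a function pointer that patches itself on first use. This is a simplified sketch, not dyld's implementation: real_external, first_call_helper, stub_binder and call_external are hypothetical names standing in for the library function, the first-call helper code, dyld_stub_binder and the stub.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(int);

/* Stand-in for a function that lives in a dynamic library. */
int real_external(int x) { return x * 2; }

int binder_calls = 0;          /* how many times binding has run */
int first_call_helper(int x);  /* forward declaration */

/* Models one entry of the Lazy Symbol Pointers table.
   Before binding it points at the helper, not at the real function. */
func_t lazy_pointer = first_call_helper;

/* Models dyld_stub_binder: find the real address and patch the entry. */
func_t stub_binder(void) {
    binder_calls++;
    lazy_pointer = real_external;
    return lazy_pointer;
}

/* Models the helper code that runs only on the first call. */
int first_call_helper(int x) {
    func_t target = stub_binder();
    return target(x);
}

/* Models the stub: always jump through the lazy pointer. */
int call_external(int x) {
    assert(lazy_pointer != NULL);
    return lazy_pointer(x);
}
```

Calling call_external twice runs the binder only once: the first call goes through first_call_helper, and the second call reaches real_external directly through the patched pointer.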

How to strip symbols

Why strip symbols

Stripping symbols makes the installation package smaller and makes the application more secure, because the binary is harder to reverse engineer.

There are several kinds of symbols

  • Global symbols (exported symbols) are generally provided for external use, as in a dynamic library. So a dynamic library cannot strip its symbols completely.
  • Indirect symbols (imported symbols) are functions from dynamic libraries such as Foundation. They are needed for symbol binding, so they cannot be removed.

An app's own methods generally do not need to be provided for external use, so every symbol other than the indirect symbols can be removed.

A function marked static is visible only within its own file, and its symbol is a local symbol.

A function without static can be called from other files, and its symbol is a global symbol.

A local symbol is not generated if the function is never used.

A global symbol is generated even if the function is never used.

The symbol table can be viewed with the command objdump --macho -t symboldemo1

  • Flag l = local symbol

  • Flag g = global symbol
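To try the distinction out, a small C file along the following lines could be compiled and its symbol table inspected. The function names are hypothetical, and what survives in the symbol table depends on the compiler and optimization level, so treat the comments as what the rules above predict rather than guaranteed output.

```c
#include <assert.h>

/* static: visible only inside this file, so its symbol is local. */
static int local_helper(int x) { return x + 1; }

/* No static: callable from other files, so its symbol is global. */
int global_entry(int x) {
    assert(x >= 0);
    return local_helper(x) * 2;
}

/* Global and never called in this file: a symbol is still generated. */
int unused_global(void) { return 0; }
```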

Xcode-related configuration

Xcode allows you to control which symbols are kept in the symbol table.

Target -> Build Settings -> Deployment Postprocessing: when set to NO, symbols are kept in ordinary builds and stripped only when the app is packaged (archived).

Build Settings -> Strip Style

  • All Symbols: generally used for apps. Only the indirect symbols are kept.
  • Non-Global Symbols: generally used for dynamic libraries. Only the global symbols (and the indirect symbols) are kept.
  • Debugging Symbols: only the debugging symbols are stripped. If stripping is performed during ordinary builds, breakpoints can no longer be hit when debugging.

How to recover symbols

Reverse engineering has techniques for recovering symbols. Which symbols can be recovered?

Objective-C (OC) has dynamic language features: at runtime, classes and methods can be looked up from strings…

These class and method names are stored as strings in the Mach-O file even when symbol stripping is enabled.

Local symbols for OC and Swift can be restored using the restore_symbol tool (search online to download it).

  • Swift can be restored because it is dynamically dispatched

  • OC has dynamic language features, so it can be restored

  • C functions have no dynamic properties and cannot be restored

How to change symbols

The fishhook framework

fishhook is a tool provided by Facebook for dynamically rebinding symbols in Mach-O binaries that have already been linked and loaded.

It achieves the effect of hooking a function by changing where the symbol's pointer points.

The code is simple. fishhook provides two hook functions and one hook structure.

Build a rebinding structure as follows and pass it to the rebind_symbols function:

    #import "fishhook.h"

    // Pointer that receives the original NSLog implementation.
    static void (*sys_NSLog)(NSString *format, ...);

    // The new NSLog: appends a marker, then calls the original.
    // The variadic arguments are not forwarded in this simple example.
    static void new_NSLog(NSString *format, ...) {
        format = [format stringByAppendingString:@"\n hook Succ"];
        sys_NSLog(@"%@", format);
    }

    - (void)viewDidLoad {
        [super viewDidLoad];

        // hook NSLog
        struct rebinding nslog;
        nslog.name = "NSLog";                 // name of the symbol to rebind
        nslog.replacement = new_NSLog;        // the replacement function
        nslog.replaced = (void *)&sys_NSLog;  // receives the replaced function pointer

        struct rebinding rbs[1] = {nslog};
        rebind_symbols(rbs, 1);
    }

    - (void)touchesBegan:(NSSet<UITouch *> *)touches withEvent:(UIEvent *)event {
        NSLog(@"test");
    }

How fishhook works

fishhook works by rebinding symbols. First, let's look at how to get from a symbol pointer to the name of a function.

The entries of Lazy Symbol Pointers and Indirect Symbols are recorded in the same order.

In the Indirect Symbols table, the Data column of each entry records the symbol's index in the overall Symbol Table.

Looking up the Symbol Table by that index shows the library the function belongs to and the offset of its name in the String Table.

Adding that String Table offset to the pFile address of the first row of the String Table gives the row that holds the function name.

Reversing the above process finds the symbol's location from the function name:

  • 1. In the String Table, fishhook splits the data into strings, compares each one with the function name to find the matching row, and subtracts the pFile address of the first row from that row's address to get the String Table index.

  • 2. Traverse the Symbol Table to find the entry that records this String Table index. Its position is the symbol's index in the Symbol Table.

  • 3. In the Indirect Symbols table, use the symbol index to find the position of the matching entry.

  • 4. Because the entries of Lazy Symbol Pointers and Indirect Symbols map one-to-one, that position also identifies the symbol's entry in Lazy Symbol Pointers.

  • 5. The value recorded in that Lazy Symbol Pointers entry is the final result.

That lazy symbol pointer holds the address of the function, and it is the value fishhook overwrites to install the hook.
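The table walk can be modeled with a few small arrays. This is a toy sketch, not fishhook's source and not the real Mach-O layout: the tables, their contents, and the names name_for_slot, slot_for_name and rebind are all assumptions made for illustration (the real library is organized differently, for instance it works on the structures of a loaded image).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*fn_t)(void);

int orig_nslog(void)  { return 1; }
int orig_printf(void) { return 2; }
int my_hook(void)     { return 3; }

/* String Table: offset 1 = "_NSLog", offset 8 = "_printf". */
const char string_table[] = "\0_NSLog\0_printf";

/* Symbol Table: each entry stores its name's String Table offset. */
const size_t symbol_table[] = { 8, 1 };      /* 0 = _printf, 1 = _NSLog */

/* Indirect Symbols: each entry stores a Symbol Table index. */
const size_t indirect_symbols[] = { 1, 0 };  /* slot 0 = _NSLog, slot 1 = _printf */

/* Lazy Symbol Pointers: same order as indirect_symbols. */
fn_t lazy_pointers[] = { orig_nslog, orig_printf };

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Forward: slot -> symbol index -> string offset -> name. */
const char *name_for_slot(size_t slot) {
    assert(slot < COUNT(indirect_symbols));
    return string_table + symbol_table[indirect_symbols[slot]];
}

/* Reverse, following the five steps: name -> slot, or -1 if absent. */
int slot_for_name(const char *name) {
    size_t off, sym, slot;
    /* Step 1: find the name's offset in the String Table. */
    for (off = 0; off < sizeof string_table; off += strlen(string_table + off) + 1)
        if (strcmp(string_table + off, name) == 0) break;
    if (off >= sizeof string_table) return -1;
    /* Step 2: find the Symbol Table entry that records that offset. */
    for (sym = 0; sym < COUNT(symbol_table); sym++)
        if (symbol_table[sym] == off) break;
    if (sym == COUNT(symbol_table)) return -1;
    /* Steps 3 and 4: find the slot whose Indirect Symbols entry is that index. */
    for (slot = 0; slot < COUNT(indirect_symbols); slot++)
        if (indirect_symbols[slot] == sym) return (int)slot;
    return -1;
}

/* Step 5 plus the hook: save the old pointer, then overwrite the slot. */
int rebind(const char *name, fn_t replacement, fn_t *replaced) {
    int slot = slot_for_name(name);
    if (slot < 0) return -1;
    if (replaced) *replaced = lazy_pointers[slot];
    lazy_pointers[slot] = replacement;
    return 0;
}
```

After rebind("_NSLog", my_hook, &saved), the first slot of lazy_pointers calls my_hook while saved still reaches orig_nslog, which mirrors the replacement and replaced fields of the rebinding structure.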

Resolve symbol conflicts in third-party libraries

Dynamic libraries

When two dynamic libraries have a method of the same name, a call first locates the library the method belongs to and then looks up the method's symbol inside that library.

So once library A has been found, symbols in library B are not searched. There are no symbol conflicts between dynamic libraries.
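The "library first, then symbol" lookup can be illustrated with a small model. Everything here is hypothetical (the library names, the _doWork symbol, and the resolve function), and it only sketches the idea that an import records which library it expects the symbol to come from.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct export_entry { const char *name; int value; };

struct library {
    const char *name;
    const struct export_entry *exports;
    size_t count;
};

/* An import records the library first, then the symbol name. */
struct import_entry { const char *library; const char *symbol; };

/* Both libraries export a symbol with the same name. */
const struct export_entry lib_a_exports[] = { { "_doWork", 1 } };
const struct export_entry lib_b_exports[] = { { "_doWork", 2 } };

const struct library libraries[] = {
    { "libA.dylib", lib_a_exports, 1 },
    { "libB.dylib", lib_b_exports, 1 },
};

/* Returns the exported value, or -1 if the library or symbol is missing.
   Only the named library is searched, so the same name elsewhere is ignored. */
int resolve(const struct import_entry *imp) {
    assert(imp != NULL);
    for (size_t i = 0; i < sizeof libraries / sizeof libraries[0]; i++) {
        if (strcmp(libraries[i].name, imp->library) != 0) continue;
        for (size_t j = 0; j < libraries[i].count; j++)
            if (strcmp(libraries[i].exports[j].name, imp->symbol) == 0)
                return libraries[i].exports[j].value;
        return -1;
    }
    return -1;
}
```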

Static libraries

When two static libraries have methods of the same name:

  • 1. If a static library is not used, it is not linked. The smallest unit of linking is an object file, which for OC usually corresponds to one class.
  • 2. Once the linker has linked a symbol, it does not link another symbol of the same name.

If the static library contains categories, things are different. Category methods are attached to their class dynamically at runtime, and nothing references them at link time.

When linking, the linker therefore cannot tell whether the category code in the static library is used, so it does not link the category, and at runtime the app crashes because the method cannot be found.

Adding the -ObjC flag to Other Linker Flags makes the linker load all OC code from the static libraries, including the category code.

Build Settings -> Other Linker Flags: -force_load followed by the path of a library forces that library's code to be loaded, which can be used to resolve the conflict.

When both static libraries contain code that must be linked and no source code is available, use llvm-objcopy to modify the library binaries and add a prefix to their symbols to resolve the conflict.