iOS Symbolization: Basics and Beyond

Author: Mi Guang, an iOS development enthusiast who likes to tinker. WeChat subscription account: Chop finger north. Bilibili channel: YZ06276

Reviewers:

Backgammon, editor of Old Driver Technology Weekly, focuses on MNN Workbench, a one-stop machine learning platform that can be downloaded from www.mnn.zone

Damonwong, iOS developer and editor of Old Driver Technology Weekly, works in the technology department of the Tao Department

Preface

Symbolization helps us locate bugs, crashes, and performance bottlenecks by tracing runtime logs and stacks back to the code that caused them. You are probably familiar with common symbolization tools and artifacts such as atos and dSYM files, but how do they actually work? This article covers the definition, principles, practice, and techniques of symbolization to give you a deeper understanding of it. It is based on Session 10211, Symbolication: Beyond the Basics, presented by Alejandro Lucena, an engineer on Apple’s Performance Tools team.

What is symbolization?

In short, symbolization means “mapping an app’s runtime information to its source code”. It is a mechanism that converts the memory addresses and associated instruction information of an app running on a device into specific file names, method names, and line numbers in the source. Put another way, it translates how the machine sees our app at runtime into how we developers see it (the source code). Without this layer of translation, locating a bug is difficult even in an app with only a few lines of code.

Demo

To illustrate how symbolization works, this article uses a simple demo app with only a few lines of code. Its logic is straightforward:
randomValue() generates a random number between 1 and 100.
numberChoices() generates an array of 10 such random numbers.
selectMagicNumber(choices: numbers) fetches the element at a specified index from the numbers array.
generateMagicNumber() performs the steps above in order and returns the element at index MAGIC_CHOICE, where MAGIC_CHOICE is a random value.

Symbolizing an everyday crash log

The first time I opened the crash log generated by this app, it contained nothing intuitive, just a pile of memory addresses. All I could tell was that the app crashed on the main thread.

I tried debugging the app directly, but the problem did not reproduce, so it seemed the debugger might not be able to help. After several attempts the crash finally reappeared, but the debugger stopped in assembly, which again offered nothing intuitive and is too low-level to read comfortably.

Neither the raw crash log nor the assembly stack solves the problem directly, but with symbolization we do not have to dig errors out of raw memory addresses. As you probably know, once the app’s dSYM file is loaded in the Xcode Organizer, the crash log is reprocessed. The result is a readable crash log with call information, file names, and line numbers, and it tells us directly that an array was accessed out of bounds. With this information it is easy to trace back to the code and see that the random value MAGIC_CHOICE can easily be out of bounds when used to index an array of only 10 elements.

Using the atos command-line tool, we can obtain the same information.

Symbolizing an everyday `Instruments` stack

Another example of symbolization comes from performance work. Instruments shows that the app periodically performs a large number of writes, producing alternating intervals of high and low load. By default, however, the stack shown in the lower right corner only tells us that the app is writing files, and it is the same stack for both the high-load and the low-load intervals. This is because the Instruments stack is only partially symbolized; in general, a stack without file names or line numbers is not fully symbolized. Here too we can load the dSYM file manually in Instruments. Looking at a high-load interval again, we now see clearly that it contains an extra debugging call, addDebugLog(), which is not called in the low-load intervals. A dSYM not only makes crash logs that contain only memory addresses readable, it also makes Instruments stacks useful; both help us find the code behind the problem.

How symbolization works

Since symbolization tools help us locate problems in code, some questions naturally follow. Why can a dSYM help with symbolization, and how does it do so? Can a dSYM symbolize everything? Besides crash logs and Instruments, where else can a dSYM be loaded? What are the -o, -i, and -l flags of atos for? Why doesn’t Instruments provide a fully symbolized stack directly? How do Xcode build settings affect symbolization? With these questions in mind, let’s dig into the principles of symbolization.

To that end, we break symbolization down into two steps:
Step 1: Trace the memory address back to a file address
Step 2: Analyze the debug information

Step 1 – Translating memory addresses to file addresses

Tracing a memory address back to a file address means converting a randomized runtime memory address into an address that is stable in the binary file on disk. Just as a process has an address space in memory, a binary has an address space on disk. The two address spaces do not map onto each other directly, so an address translation mechanism is needed.

Address space and binary address on disk

The addresses in the on-disk address space are assigned to the binary by the linker at build time. The linker groups the binary into segments; each segment holds related data and has attributes such as a name, a size, and an address. For example, the __TEXT segment of a binary contains its methods and functions, and the __DATA segment contains the program’s global state, such as global variables. Each segment is assigned its own starting address so that segments do not overlap.

Specifically, the linker records segment information at the start of the executable, as part of the Mach-O header. Mach-O is the file format of executables and libraries. The Mach-O header contains a number of load commands that describe the attributes of the segments, and the operating system kernel reads these load commands to load the corresponding segments into memory. If the app is built as a Universal 2 binary, each architecture has its own Mach-O header and segment information.

Now that we know a little about segments and load commands, let’s look at the load commands of our small demo app. otool -l prints the load commands, and combined with grep (a string filtering tool) we can pick out the LC_SEGMENT_64 load commands, as shown in the figure below. The output shows that the __TEXT segment starts at the address given by vmaddr and is vmsize bytes long.
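
As a rough illustration of what this filtering extracts, the Python sketch below pulls the segment name, vmaddr, and vmsize out of text laid out the way otool -l prints a segment load command. The sample text is hand-written for illustration: only the __TEXT vmaddr of 0x100000000 comes from this article, while the cmdsize and vmsize values are hypothetical. parse_segments is a hypothetical helper, not part of any tool.

```python
# Hand-written sample in the layout of `otool -l` output.
# Assumption: cmdsize and vmsize are made-up values for illustration.
SAMPLE_OTOOL_OUTPUT = """\
Load command 1
      cmd LC_SEGMENT_64
  cmdsize 472
  segname __TEXT
   vmaddr 0x0000000100000000
   vmsize 0x0000000000008000
"""


def parse_segments(text):
    """Collect vmaddr and vmsize for each segment named in the text."""
    segments = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip lines that are not simple "key value" pairs
        key, value = fields
        if key == "segname":
            current = segments.setdefault(value, {})
        elif key in ("vmaddr", "vmsize") and current is not None:
            current[key] = int(value, 16)
    return segments


for name, info in parse_segments(SAMPLE_OTOOL_OUTPUT).items():
    print(name, hex(info["vmaddr"]), hex(info["vmsize"]))
```

The vmaddr value recovered here is the linker address that the rest of step 1 relies on.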

Load the binary file into memory

From the above we learned that a load command contains the address and size of the segment to load. So why does the memory address of a segment, once the kernel has actually loaded it, differ from the address generated by the linker? In the figure below, how do the memory addresses relate to the linker’s addresses A, B, and C? (From here on, the linker-generated addresses are abbreviated as A, B, and C.)

ASLR – Address space layout randomization

Address space layout randomization (ASLR) is a computer security technique that prevents memory corruption vulnerabilities from being exploited. By randomizing where a process’s key data areas are placed in its address space, ASLR prevents an attacker from reliably jumping to a specific location in memory to exploit a particular function. Modern operating systems typically include this mechanism to defend against attacks that rely on known addresses, such as return-to-libc attacks. In short, before loading the binary’s segments the kernel initializes a random value called the ASLR slide, abbreviated S below. The kernel then adds the offset S to the addresses A, B, and C in the linker-generated load commands. As a result, the segments are not loaded at the original linker addresses A, B, and C, but at A+S, B+S, and C+S. We call the address at which a segment is actually loaded its load address, abbreviated L below.

Having understood ASLR, we can see that the difference between the linker address and the load address is exactly the ASLR slide. This gives the formula ASLR Slide = Load Address - Linker Address, or simply S = L - A.
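
The relationship between A, S, and L can be modelled in a few lines of Python. This is only a sketch of the arithmetic, not of how the kernel is implemented: the segment addresses other than __TEXT are hypothetical, and the 16 KB page size used to keep the slide page-aligned is an assumption.

```python
import random

PAGE_SIZE = 0x4000  # assumption: 16 KB pages

# Linker addresses A, B, C. Only the __TEXT value appears in this article;
# the other two are hypothetical.
linker_addresses = {
    "__TEXT": 0x100000000,
    "__DATA": 0x100008000,
    "__LINKEDIT": 0x10000C000,
}


def apply_slide(linker_addrs, slide):
    """Model loading: every segment is shifted by the same slide S."""
    return {name: addr + slide for name, addr in linker_addrs.items()}


def recover_slide(load_address, linker_address):
    """S = L - A"""
    return load_address - linker_address


slide = random.randrange(1, 0x10000) * PAGE_SIZE  # a random, page-aligned S
load_addresses = apply_slide(linker_addresses, slide)

for name, linker_address in linker_addresses.items():
    recovered = recover_slide(load_addresses[name], linker_address)
    print(f"{name}: A={linker_address:#x} L={load_addresses[name]:#x} S={recovered:#x}")
```

Because every segment is shifted by the same amount, any single segment is enough to recover S.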

How to get the actual Linker Address and Load Address

As mentioned earlier, otool lets us inspect a binary’s load commands and obtain the linker address (which can also be regarded as the file address). The runtime load address can be obtained from the Binary Images list in a crash log, from the stacks Instruments provides, or with the vmmap command-line tool. The use of vmmap is explained later.

Calculating the ASLR slide

In practice we need to know the ASLR slide before we can subtract it from a memory address in a crash log or an Instruments stack and obtain the file address. The slide is usually calculated from the load address and linker address of one specific segment, such as __TEXT. As described above, the crash log tells us that the load address of the __TEXT segment is 0x10045c000, and otool tells us that its linker address is 0x100000000. Subtracting the two gives ASLR Slide = 0x45c000.

With the ASLR slide we can convert a runtime memory address in the crash log to a file address in the on-disk address space. As shown in the figure below, the file address of the crashing frame in our demo is 0x100003b70. With the file address we can look up the source code, which is discussed later. First, let’s look at some other ways to obtain and check the ASLR slide.
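
The arithmetic for the demo crash can be checked with a short Python sketch. The three addresses are the ones quoted in this article; the function and constant names are only illustrative.

```python
# Values quoted in this article for the demo crash.
TEXT_LOAD_ADDRESS = 0x10045C000      # __TEXT load address from the crash log
TEXT_LINKER_ADDRESS = 0x100000000    # __TEXT vmaddr reported by otool -l
CRASH_RUNTIME_ADDRESS = 0x10045FB70  # runtime address of the crashing frame


def aslr_slide(load_address, linker_address):
    """S = L - A"""
    return load_address - linker_address


def file_address(runtime_address, slide):
    """Undo the slide to get the address inside the binary on disk."""
    return runtime_address - slide


slide = aslr_slide(TEXT_LOAD_ADDRESS, TEXT_LINKER_ADDRESS)
crash_file_address = file_address(CRASH_RUNTIME_ADDRESS, slide)

print(hex(slide))               # 0x45c000
print(hex(crash_file_address))  # 0x100003b70
```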

As shown in the figure below, the otool command-line tool can show which instruction was executing when the crash occurred. -tV prints the disassembly, and -arch arm64 makes otool handle a Universal 2 build correctly. In the output, the line corresponding to the file address above is a brk instruction, which in assembly generally signals an exception or problem in the app.

The atos command-line tool can also do the ASLR slide calculation for us: the -o flag specifies the binary file, from which atos reads the segments’ file addresses, and the -l flag specifies the load address.

Besides atos and otool, the vmmap command-line tool can also give us the load address, and we can use it to verify the calculation above. In another run, vmmap reports a load address of 0x104d14000 for the __TEXT segment; with the linker address 0x100000000, the earlier formula gives an ASLR slide of 0x4d14000. Subtracting that slide from the runtime address in the corresponding crash log gives the file address 0x100003b70, the same file address we calculated before.

The two crashes above came from different runs, with different crash logs and different ASLR slides, yet they yield the same file address. This is no coincidence. Because the kernel picks a different ASLR slide on every launch, the memory addresses in crash logs vary across devices and runs, but the underlying linker address stays the same. So although the memory address changes every time, we can always locate the same file address. We now have a mechanism that takes us from a randomized runtime memory address to a stable location in the app’s binary on disk, and from there we can trace runtime stack information back to the app’s source code.

Summary – Trace back from memory address to file address

This completes the first of the two steps of symbolization: tracing a memory address back to a file address. The content and tools of this step are summarized below.

The binary file format of apps and libraries is Mach-O. The Mach-O header contains segment information and load commands. Segments are created by the linker, and their information includes the segment’s linker address.
otool -l prints the address and attribute information of the segments in a Mach-O file, including the linker address.
The Binary Images list in a crash log gives the load address at the time of the crash.
vmmap can also report the load address of a running app.
ASLR Slide + Linker address = Load address

Step 2 – Analyze the debug information

With that, we can move on to the second step of symbolization: analyzing debug information. Debug information generally describes the relationship between file addresses and source code. Xcode generates it at build time and either stores it in a dSYM file or embeds it directly in the compiled binary.

There are three types of debug information, each providing a different level of detail about a file address.

Function starts
Nlist symbol table
DWARF

The following figure shows the level of detail that each of the three types provides.

Function Starts

As the figure above shows, function starts provides the least of the three: only the start addresses of functions. Specifically, it tells us that a function exists and at which address it begins, so we can tell which function an address falls in. It does not tell us which function that is.

Function starts provides this through a list of linker addresses encoded in the __LINKEDIT segment. The data is built directly into the app binary and is described by the LC_FUNCTION_STARTS load command of the Mach-O file.

In practice, the symbols -onlyFuncStartsData command prints the function starts data. As shown in the figure below, each entry is printed with null, because function starts does not provide function names and null serves as a placeholder for the name.

With function starts we can begin to process the unsymbolized crash log. Subtracting the ASLR slide 0x45c000 we calculated from the crash log’s memory address 0x10045fb70 gives the file address 0x100003b70. Comparing that with the function starts output, only the first start address, 0x100003a68, is less than our file address 0x100003b70, so only the function beginning there can contain the crashing address. The two addresses differ by 0x108, or 264 in decimal: the crash occurred 264 bytes into that function.
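
This lookup amounts to finding the largest start address that does not exceed the file address. The Python sketch below does it with a binary search over a sorted list. Only 0x100003a68 and the file address come from this article; the other start addresses are hypothetical neighbours added for illustration, and containing_function is an illustrative name.

```python
from bisect import bisect_right

# Sorted function start addresses. Only the first entry is from the article;
# the other two are hypothetical.
function_starts = [0x100003A68, 0x100003C00, 0x100003D48]


def containing_function(starts, address):
    """Return (start, offset) for the function containing address, or None."""
    index = bisect_right(starts, address) - 1
    if index < 0:
        return None  # the address lies before the first known function
    start = starts[index]
    return start, address - start


start, offset = containing_function(function_starts, 0x100003B70)
print(hex(start), hex(offset), offset)  # 0x100003a68 0x108 264
```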

Knowing where the function begins, we can examine how the function in the crash log is set up and which registers it modifies. But because function starts provides no function names, we can only analyze the crash at the level of machine code. That is useful for debugging and developing an app, but we need other sources of debug information to go further.

Nlist symbol table

nlist is a struct whose layout is shown in the figure below. Like function starts, the nlist symbol table is encoded in the __LINKEDIT segment, and it has its own load command. Unlike function starts, nlist encodes more than addresses: as the figure shows, the structure also contains a name and several other attributes. The kind of an nlist entry is determined by its n_type field.

Three kinds of n_type are of interest for symbolization; here we focus on two of them. The first is the direct symbol. Direct symbols correspond to methods and functions that are fully defined in the app or library binary itself. For a direct symbol, the nlist_64 structure stores the function’s name and its file address.

Nlist direct symbol

Specific bits of n_type determine the kind of an nlist entry. When the second, third, and fourth least-significant bits of n_type are all 1, the entry is a direct symbol; this combination of three bits is also called N_SECT.
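
To make the bit test concrete, here is a small Python sketch that classifies an n_type value. The mask values are the ones defined in the <mach-o/nlist.h> header; the describe function and its labels are illustrative only and not part of any tool.

```python
# Bit masks for nlist_64.n_type, as defined in <mach-o/nlist.h>.
N_STAB = 0xE0  # if any of these bits is set, the entry is a debugging symbol
N_PEXT = 0x10  # private external symbol bit
N_TYPE = 0x0E  # mask for the type bits (bits 2, 3 and 4, counting from 1)
N_EXT = 0x01   # external symbol bit (bit 1)

# Values of the type bits.
N_UNDF = 0x0   # undefined: the symbol is defined in another image
N_SECT = 0xE   # defined in a section of this image (a direct symbol)


def describe(n_type):
    """Give a rough, human-readable classification of an n_type value."""
    if n_type & N_STAB:
        return "debugging symbol"
    kind = n_type & N_TYPE
    external = bool(n_type & N_EXT)
    if kind == N_SECT:
        return "direct symbol (external)" if external else "direct symbol (local)"
    if kind == N_UNDF and external:
        return "undefined external symbol"
    return "other"


for value in (0x0E, 0x0F, 0x01, 0x64):
    print(f"{value:#04x}: {describe(value)}")
```

An entry with only the N_EXT bit set is defined in another image, which is the kind of entry listed by nm with -undefined-only.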

We can view N_SECT symbols with nm -defined-only -numeric-sort. Here nm walks through the symbols defined in the MagicNumbers app and lists them in address order, as in the output in the figure below. Note that we also pipe the output through xcrun swift-demangle to turn the mangled Swift names back into readable function names.

As the figure above shows, the output already gives us names such as numberChoices(), MagicNumbers, and main, because they are defined directly in the app. The symbols command-line tool shows direct symbols in much the same way as nm: it also provides a way to view the nlist data and demangles names automatically, as shown in the following figure. Both approaches let us associate a memory address in the crash log with a specific function name in the source, so the symbolized crash log becomes richer. We can now match the function start address from function starts against a named entry among the direct symbols. Putting this together, the crash occurred 264 bytes into main.

However, main is not the only function involved in the crash, and there is more to discover. For example, we still do not have line numbers, and in the results above only some frames are symbolized while others in the stack and crash log remain raw addresses.

We see something similar in Instruments stacks: some function names are symbolized and readable, while others are still memory addresses. The reason is that the direct symbol table only contains functions that take part in linking; functions in binaries loaded at runtime, such as dynamic libraries, are not included. The unsymbolized frames are calls across module boundaries into dynamic libraries, and we need other means to symbolize them.

Limiting the direct symbol table this way helps keep the binary small. It would not make sense to store every function in the symbol table when packaging. For frameworks and libraries, we need to keep the functions that clients call and can strip the ones that are not used. And if the main executable’s functions are stripped from the direct symbol table, there is nothing left in its symbol table.

Effects of Xcode build settings on nlist direct symbols

In Xcode’s build settings, three options control stripping: Strip Linked Product, Strip Style, and Strip Swift Symbols. They control how redundant symbol table entries are stripped when the app is built and linked. Specifically, when Strip Linked Product is YES, the binary’s symbol table is stripped according to the Strip Style value. With All Symbols, the most aggressive strategy is applied and the final symbol table contains only the most essential symbols. Non-Global Symbols strips the direct symbols that are shared between modules inside the app but keeps the symbols exported for use by other apps. Debugging Symbols removes the third kind of nlist entry, the debugging symbols, which are discussed later together with DWARF; this style preserves the direct symbols.

For example, consider a framework that defines two public interfaces and one internal shared implementation function. Since all of these functions are used at link time, they all have direct symbol entries.

If we strip with Non-Global Symbols, only the two interfaces remain. The shared implementation function is used only inside the framework, so it is not global and is not kept in the symbol table. Similarly, with All Symbols, the two interfaces are still kept because they are called from outside the framework.
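
The framework example can be modelled as a toy filter in Python. This is a deliberately simplified sketch of the behaviour described above, not of how the strip tool is implemented: it assumes a framework whose two interfaces are used externally, so that the non-global and all-symbols styles end up keeping the same entries. The symbol names and the strip function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    is_global: bool             # visible to code outside the framework
    is_debugging: bool = False  # a debugging nlist entry, not a direct symbol


def strip(symbols, style):
    """Return the symbols that survive stripping under the given style."""
    if style == "debugging":
        return [s for s in symbols if not s.is_debugging]
    if style in ("non-global", "all"):
        # Simplification: for a framework whose interfaces are called from
        # outside, both styles keep only the global direct symbols.
        return [s for s in symbols if s.is_global and not s.is_debugging]
    raise ValueError(f"unknown strip style: {style}")


framework_symbols = [
    Symbol("interfaceA", is_global=True),
    Symbol("interfaceB", is_global=True),
    Symbol("sharedImplementation", is_global=False),
]

for style in ("debugging", "non-global", "all"):
    print(style, [s.name for s in strip(framework_symbols, style)])
```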

symbols -onlyNListData prints the function starts entries interleaved with the direct symbols, and these entries also indicate whether a function exists in the direct symbol table or has been stripped. You can use the strip settings to get the symbol table visibility you need. With this information we can also tell when the direct symbol table was the source of a symbolized result: in practice, the result has function names but no line numbers or file names, or it mixes function names with bare function start addresses, as in the symbols output for the framework here.

Indirect symbols

Indirect symbols are identified in a similar way to direct symbols: the first (least-significant) bit of n_type is 1, which is also called N_EXT.

nm -m -arch arm64 -undefined-only -numeric-sort MagicNumbers prints the indirect symbols. Here -undefined-only replaces -defined-only in order to list indirect symbols, and -m shows which framework or library each function comes from. The output in the figure below shows that the MagicNumbers app depends on a number of basic Swift functions from libswiftCore, such as print().

Summary – Function starts and nlist symbol table

We set out to discuss three sources of debug information: function starts, nlist, and DWARF. The first two have now been covered, so let’s review them.

Function starts provides a list of addresses without function names; it helps calculate the offset of the crashing file address within its function.
The nlist symbol table stores details associated with an address in a structure. nlist symbols provide function names and describe both direct symbols defined in the app and indirect symbols provided by other libraries. The direct symbol table usually keeps the functions involved in linking, and the strip settings in the Xcode project affect its contents.
Both are embedded directly in the app’s Mach-O binary, in the __LINKEDIT segment.

DWARF

So far we have not seen information such as file names or the line numbers of functions and crashes. That information is available in DWARF, which we discuss in detail here. Whereas the nlist symbol table holds only part of the information about a function, DWARF records almost all of its context. To recap: function starts provides one dimension, the address; nlist, through the nlist_64 structure, raises that to two dimensions, the address and the function name; DWARF adds a third dimension, relationships. In real projects functions do not exist in isolation: they are called by other functions and call others in turn, and they take parameters and return values. By recording this context, DWARF unlocks the most complete form of symbolization.

When we talk about analyzing DWARF, we usually mean analyzing a dSYM bundle. Besides metadata in a plist, the bundle contains a DWARF binary. In that binary, the DWARF data is stored in the __DWARF segment, which holds three data streams we need to focus on: debug_info, debug_abbrev, and debug_line. debug_info contains the raw data, debug_abbrev gives structure to the raw data, and debug_line contains file names and line numbers. DWARF also defines two vocabulary terms we need to discuss: the compile unit and the subprogram. A third term, the inlined subroutine, is covered later.

Compile unit

A compile unit represents a single source file compiled in the project; each Swift file in the project has a corresponding compile unit. DWARF assigns attributes to each compile unit, such as the file name, the module name, and the part of the __TEXT segment that its functions occupy. The compile unit for main.swift stores these attribute values in the debug_info stream, as shown on the left. Correspondingly, the debug_abbrev stream contains an associated entry that tells us what the values represent, as shown on the right: the file name, the language, and a low/high pair that states the range within the __TEXT segment.

Subprogram

A subprogram represents a defined function. We already found defined functions in the nlist symbol table, but subprograms can also describe static and local functions. A subprogram also has a name and its own address range within the __TEXT segment.

The DWARF relationship tree

The basic relationship between a compile unit and a subprogram is that the subprogram is defined in the compile unit. DWARF expresses this relationship as a tree: the compile unit is the root node and the subprograms are its children, and the children can be searched by their address ranges.

We can verify this with the dwarfdump command-line tool. First we see a compile unit carrying the attributes just described (file name, language, and so on). dwarfdump combines the contents of debug_info and debug_abbrev to show the structure and content of the data in the dSYM.

The output is long; further down we find a subprogram. Its address range lies within the address range of the compile unit, and its name is visible. As mentioned earlier, DWARF describes symbols and their relationships in great detail. We will not go deeper into DWARF here, but knowing these details helps in understanding the logic behind symbolization.

Further down, the output also contains parameter information. DWARF has its own vocabulary for describing the names and types of parameters, and a parameter is a child node of its subprogram. The figure below shows the information for the choices parameter of the numberOfChoice function.

File name and line number information

In addition, the debug_line stream stores the file names and line numbers associated with functions. Unlike debug_info, debug_line is not a tree. Instead it defines a line table program, which maps file addresses in the linked binary to specific lines of specific source files. Running this line table program lets us find the source file and line number for a given file address.

To sum up, combining the debug_info tree with the debug_line line table program gives the following structure. By walking the tree we can find the entry for the file address we want: start at the compile unit, iterate over its children, narrow down to the child node whose range contains the address, and then consult debug_line for the file and line.
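
The walk described above can be sketched in Python as a tree search by address range followed by a line table lookup. This is a simplified model, not a DWARF parser: the line table here stands for the already-decoded output of the line table program, and every address range and line number in the sample data is hypothetical. Only the names main.swift, main, and randomValue() and the file address 0x100003b70 come from this article.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str       # "compile_unit" or "subprogram"
    name: str
    low_pc: int    # first address covered by this node
    high_pc: int   # one past the last address covered by this node
    children: list = field(default_factory=list)

    def contains(self, address):
        return self.low_pc <= address < self.high_pc


def find_path(node, address):
    """Return the nodes, root first, whose ranges contain address."""
    if not node.contains(address):
        return []
    for child in node.children:
        path = find_path(child, address)
        if path:
            return [node] + path
    return [node]


def lookup_line(line_table, address):
    """line_table holds sorted (address, file, line) rows; use the last row at or before address."""
    index = bisect_right([row[0] for row in line_table], address) - 1
    return None if index < 0 else line_table[index][1:]


# Hypothetical ranges and line numbers, for illustration only.
compile_unit = Node("compile_unit", "main.swift", 0x100003A68, 0x100003D00, [
    Node("subprogram", "main", 0x100003A68, 0x100003BC0),
    Node("subprogram", "randomValue()", 0x100003BC0, 0x100003D00),
])
line_table = [
    (0x100003A68, "main.swift", 30),
    (0x100003B60, "main.swift", 36),
    (0x100003B80, "main.swift", 37),
]

path = find_path(compile_unit, 0x100003B70)
print([node.name for node in path], lookup_line(line_table, 0x100003B70))
```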

DWARF and compile-time function inlining

We can try this with the atos command-line tool. First we omit the -i flag: the output is short, containing only a function name, a file name, and a line number. Because the result includes a line number, we can conclude that DWARF was used for symbolization, but apart from the file name and line number it is not much different from what the nlist symbol table gave us. Then we add the -i flag; the output is the second figure below, and you can compare the two. The two commands differ only in -i:

atos -o MagicNumbers.dSYM/Contents/Resources/DWARF/MagicNumbers -arch arm64 -l 0x10045c000 0x10045fb70
atos -o MagicNumbers.dSYM/Contents/Resources/DWARF/MagicNumbers -arch arm64 -l 0x10045c000 -i 0x10045fb70

You might wonder what -i means. In atos, -i stands for inlined functions. Inlining is a common optimization performed by the compiler: during compilation, the body of a function directly replaces the call to it, so the call itself “disappears”. In our demo, the implementation of numberOfChoice() replaces the call to it, and the call to numberOfChoice() is gone.

Inlined subroutines

DWARF uses inlined subroutines to express this compile-time optimization. This is the third vocabulary term we set out to discuss. An inlined subroutine is a kind of subprogram, so it is also a function: one that has been inlined into another subprogram. In the DWARF relationship tree, an inlined subroutine is therefore a child node of a subprogram. The definition is recursive: an inlined subroutine can have other inlined subroutines as children.

Using dwarfdump again, we can find the inlined subroutines in the DWARF. They are listed as children of other nodes and have attributes similar to those of subprograms, such as a name and an address range. In DWARF, however, these attributes are usually reached through a shared node called the abstract origin. If a function has many inlined copies, its common properties are stored once in the abstract origin rather than being duplicated in every copy. Inlined subroutines also have a unique property, the call site, which records where in the source code the function was actually called before the optimizer replaced the call. For example, generateANumber() is called on line 36 of main.swift, so a child node is added to the tree to record that call.
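
The recursive structure and the call site property can be sketched in Python. The idea is that one machine address expands into a chain of logical frames: the innermost inlined function reports the line from the line table, and each enclosing frame reports the call site of the scope it contains. This is a simplified model in the spirit of what the -i flag asks for, not a description of how atos is implemented. Apart from line 36 of main.swift and the file address 0x100003b70, every range and line number below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scope:
    name: str
    low_pc: int
    high_pc: int
    call_line: Optional[int] = None  # call site; set only for inlined subroutines
    children: list = field(default_factory=list)


def scope_chain(scope, address):
    """Return the scopes containing address, outermost first."""
    if not (scope.low_pc <= address < scope.high_pc):
        return []
    for child in scope.children:
        chain = scope_chain(child, address)
        if chain:
            return [scope] + chain
    return [scope]


def expand_frames(root, address, innermost_line):
    """Expand one address into (function, line) frames, innermost first."""
    frames = []
    line = innermost_line
    for scope in reversed(scope_chain(root, address)):
        frames.append((scope.name, line))
        line = scope.call_line  # the enclosing frame reports this call site
    return frames


# Hypothetical ranges; only call site line 36 is taken from the article.
main = Scope("main", 0x100003A68, 0x100003BC0, children=[
    Scope("generateANumber()", 0x100003B00, 0x100003B90, call_line=36, children=[
        Scope("selectMagicNumber(choices:)", 0x100003B40, 0x100003B80, call_line=31),
    ]),
])

for name, line in expand_frames(main, 0x100003B70, innermost_line=27):
    print(f"{name} (main.swift:{line})")
```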

With this we have a more complete understanding of DWARF-based symbolization and, as shown in the figure below, a broader view of the app’s call logic. Understanding how functions are inlined is the key to fully symbolized crash logs. The -i flag tells atos to take inlined functions into account during symbolization. Information about inlined functions was also what was missing from the Instruments stack. This is why we used dSYM files for both the crash log and the Instruments stack: a dSYM contains exactly the information described above, namely compile units, subprograms, and the DWARF relationship tree.

Getting DWARF from libraries and object files

Besides dSYM files, DWARF can also be found in static libraries and object files. This means that even without a dSYM, you can still get DWARF for functions linked in from static libraries or object files. In this case you will find debugging symbol nlist entries, which are among the symbol types that strip may remove. These nlist entries do not contain DWARF themselves; instead they associate functions with the files they came from. If the library was built with debug information, these nlist entries can lead us to the DWARF.

These nlist entries can be printed and inspected in detail with dsymutil -dump-debug-map. The output lists the functions and the files they came from, and these locations can then be scanned and processed to obtain the DWARF information we need.

Summary – DWARF

DWARF is an important source of in-depth symbolication data.
DWARF describes the important relationships between functions and files.
DWARF properly handles compile-time inlining optimizations.
Both dSYM files and static libraries can contain DWARF.
In practice, getting DWARF from a dSYM is recommended, because DWARF from a dSYM can easily be used by other tools, and many of Xcode's built-in tools support it.

Development tools and symbolication practices

Xcode build setting: Debug Information Format

For local development configurations, building plain DWARF is recommended.
For release configurations, make sure the setting is DWARF with dSYM File.
For Apps submitted to App Store Connect, you can download the dSYM files from there.
Even if the App uses bitcode, you can still download the dSYM files from App Store Connect.

Find and verify the `dSYM` file

As shown below, on your local Mac you can use the `mdfind` command-line tool to check whether a dSYM file exists for a given UUID. This alphanumeric string is the UUID of the built binary, a unique identifier that is also recorded in one of the binary's load commands. You can also use `symbols -uuid` to view the UUID of a dSYM file.

In rare cases, the build process produces invalid DWARF. You can check the validity of the DWARF with `dwarfdump --verify`; if the check reports any errors, please file a bug report against Developer Tools at feedbackassistant.apple.com.

The maximum size of a DWARF binary is 4GB. If the verification above reports errors about exceeding 4GB, consider splitting the project into components so that each component has a smaller dSYM.

In practice, you match a dSYM to a crash log by checking that the dSYM's UUID is the same as the UUID of the binary image in the crash log. Besides reading the App's binary image UUID from the crash log, you can also obtain a UUID with the `symbols` command-line tool, as shown in the figure below. For symbolication to work, the UUIDs of the dSYM and the crash log must match.
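
The comparison itself is simple, but the two UUIDs are often printed differently, for example uppercase with dashes by one tool and lowercase without dashes, sometimes wrapped in angle brackets, in a crash log. The following Python sketch normalizes both forms before comparing them. The helper names and the UUID values are made up for illustration, and the exact textual forms you encounter depend on the tool and the crash log format.

```python
import uuid


def normalize_uuid(text: str) -> uuid.UUID:
    """Parse a UUID regardless of case, dashes, or surrounding angle brackets."""
    return uuid.UUID(text.strip().strip("<>"))


def same_binary(dsym_uuid: str, crash_log_uuid: str) -> bool:
    """True when the dSYM and the crash log's binary image share a UUID."""
    return normalize_uuid(dsym_uuid) == normalize_uuid(crash_log_uuid)


# Hypothetical values, not taken from a real build.
dsym_uuid = "A1B2C3D4-E5F6-3A7B-8C9D-0E1F2A3B4C5D"
crash_log_uuid = "<a1b2c3d4e5f63a7b8c9d0e1f2a3b4c5d>"

print(same_binary(dsym_uuid, crash_log_uuid))
```

If the function returns False, the dSYM belongs to a different build and symbolicating with it would give missing or misleading results.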

Other symbolication details

The `symbols` command-line tool can also help you check what debug information is available in your App's build products. The tags in square brackets in its output indicate the source of the debug information. Run it when you are not sure what debug information is available.

If you are sure the dSYM file is available but Instruments still does not symbolicate the stacks, check the project's entitlements and code signing configuration. In particular, you can use the `codesign` command-line tool to verify that the code signing configuration is correct.

You should also check that the local development entitlements include get-task-allow, which grants tools such as Instruments permission to symbolicate the App being debugged. Xcode normally sets get-task-allow automatically, but if Instruments cannot symbolicate, check this entry. If the entitlements do not contain get-task-allow, make sure that Build Settings -> Code Signing -> Code Signing Inject Base Entitlements is enabled.
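
As a rough illustration of that check, the Python sketch below parses an entitlements property list and reports whether get-task-allow is set to true. It assumes the entitlements have already been exported as XML (for example from the entitlements dump that `codesign` can produce); the sample plist and the function name are made up for the example.

```python
import plistlib

# Hypothetical entitlements of a development build, already exported as XML.
SAMPLE_ENTITLEMENTS = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>get-task-allow</key>
    <true/>
</dict>
</plist>
"""


def allows_task_access(entitlements_xml: bytes) -> bool:
    """True when the entitlements grant get-task-allow."""
    entitlements = plistlib.loads(entitlements_xml)
    # A missing key and an explicit <false/> both mean the right is not granted.
    return entitlements.get("get-task-allow") is True


print(allows_task_access(SAMPLE_ENTITLEMENTS))
```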

Finally, for Apps built as Universal 2 binaries, you can specify the architecture when using the command-line tools mentioned in this article: `symbols`, `otool`, and `dwarfdump` all accept an `-arch` option so that they operate only on the specified architecture.

Conclusion

As the "Beyond the Basics" in its title suggests, this Session can be summed up in the following key points:

Symbolication

UUIDs and file addresses are a consistent, reliable way to identify problems in an App at runtime because they are not affected by the ASLR slide; obtaining them is the key first step in symbolicating runtime information.

In practice, use a dSYM for symbolication whenever possible; a dSYM records the most detailed debug information in DWARF form and is well supported by Xcode and Instruments.
Several command-line tools help with symbolication, such as `otool`, `vmmap`, `nm`, `symbols`, `dwarfdump`, and `atos`. They are included with the Xcode command-line tools, provide powerful diagnostics, and let you inspect the symbolication process in detail. If needed, you can integrate them into your own workflow.
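
To recap the first key point with numbers: a runtime address is the file address plus the ASLR slide, so subtracting the slide gives a value that is stable across launches. The Python sketch below shows that arithmetic. All addresses are made up, and the linked address of `0x100000000` is only an assumed, typical value for a 64-bit executable's `__TEXT` segment.

```python
def aslr_slide(runtime_load_address: int, linked_address: int) -> int:
    """The slide is how far the segment moved from where it was linked."""
    return runtime_load_address - linked_address


def file_address(runtime_address: int, slide: int) -> int:
    """Undo the slide to get the address as it appears in the file on disk."""
    return runtime_address - slide


# Hypothetical values for one launch of the App.
linked_text_address = 0x100000000   # assumed linked address of __TEXT
runtime_text_address = 0x104A3C000  # where __TEXT was loaded in this launch
crashing_address = 0x104A3EE40      # an address taken from a backtrace

slide = aslr_slide(runtime_text_address, linked_text_address)
print(hex(slide))
print(hex(file_address(crashing_address, slide)))
```

The resulting file address, together with the binary's UUID, is what tools can look up in the symbol table or in DWARF.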

If you are interested in learning more about linking and symbolication, I recommend two WWDC18 sessions: Optimizing App Startup Time, and App Startup Time: Past, Present, and Future.

Follow us

We are the Veteran Driver Tech Weekly, a tech newsletter that continues to pursue premium iOS content. Welcome to follow.

Followers get a gift: follow [Old Driver Technology Weekly] and reply "2021" to receive the WWDC Insider collections for 2017/2018/2019/2020.

Support the author

I recommend the WWDC21 Insider column, which contains 102 articles about WWDC21 and is the source of this article. If you are interested in the rest of the content, please click the link to read more

The WWDC Insider series is a series of high-quality original content organized by Old Driver. We have been doing it for several years, and the response has been good. Each year we make a selection from that year's WWDC content and invite a group of front-line iOS developers to write derivative articles that combine their practical development experience with Apple's documentation and video content.