Verifying the correctness of code is one of the hardest problems in computer science. Because no general algorithm can decide whether arbitrary code is correct, verification in practice relies on digital signatures. A digital signature does two things:
- It verifies that the source of the code is legitimate.
- It verifies that the code has not been modified.
Code signing is not unique to Apple; Java and Android's Dalvik use it as well, although Apple was the first to adopt it. The rest of this article looks at how verification is implemented so that the source of the code can be shown to be legitimate and the code can be shown to be unmodified.
This article is based on the code signing chapter of Jonathan Levin's The Best iOS and macOS Security Guide.
- Test environment: macOS 11.2.3.
- Test subject: /bin/ls, the x86_64 slice of the MachO file. Files on iOS are not very different.
Code signature format
Before looking at how code signing works, it helps to know what a code signature contains. The signature is appended to the end of the MachO file and is located through the LC_CODE_SIGNATURE load command. That command points to a super binary block (SuperBlob), the Code Signature, which in turn contains several sub-binary blocks (blobs). A previous article explained how to parse this signature block by hand: Understanding the MachO data parsing rules in depth.
Here is the hierarchy of the binary blocks:
The super binary block acts as a directory that records the location of each sub-binary block; the sub-binary blocks carry the actual content of the code signature.
Sub-binary block types
Sub-binary block types are identified by different values:
Value | Binary block type |
---|---|
0x0000 | Code Directory |
0x0002 | Requirements |
0x0005 | Entitlements (authorization) |
0x10000 | CMS signature blob |
0x10001 | Identification (not used) |
This article mainly analyzes what these sub-binary blocks do and, in part, how they are implemented.
Binary block extraction
Jtool is an efficient tool developed by Jonathan Levin for MachO analysis and can be installed using Homebrew.
$ brew install jtool
We can use Jtool to extract the code signing part separately:
$ jtool -arch x86_64 -e signature /bin/ls
Extracting Code Signature (5728 bytes) into ls.signature
$ od -t x1 -A x ls.signature # Raw byte content
0000000 fa de 0c c0 00 00 14 86 00 00 00 03 00 00 00 00
0000010 00 00 00 24 00 00 00 02 00 00 02 61 00 01 00 00
0000020 00 00 02 9d fa de 0c 02 00 00 02 3d 00 02 01 00
0000030 00 00 00 00 00 00 00 7d 00 00 00 30 00 00 00 02
0000040 00 00 00 0e 00 00 d2 30 20 02 0b 0c 00 00 00 00
0000050 00 00 00 00 63 6f 6d 2e 61 70 70 6c 65 2e 6c 73
#...
You can also open the file in MachOView and inspect the Code Signature block there.
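The first bytes of the od dump above can also be decoded by hand. The following is a minimal sketch, assuming the usual SuperBlob layout (a big-endian magic, total length and blob count, followed by one type/offset pair per blob). The helper name parse_superblob_index and the DUMP_HEX constant are made up for this illustration; the bytes themselves are copied from the dump.

```python
import struct

# First 96 bytes of ls.signature, copied from the od dump above.
DUMP_HEX = """
fa de 0c c0 00 00 14 86 00 00 00 03 00 00 00 00
00 00 00 24 00 00 00 02 00 00 02 61 00 01 00 00
00 00 02 9d fa de 0c 02 00 00 02 3d 00 02 01 00
00 00 00 00 00 00 00 7d 00 00 00 30 00 00 00 02
00 00 00 0e 00 00 d2 30 20 02 0b 0c 00 00 00 00
00 00 00 00 63 6f 6d 2e 61 70 70 6c 65 2e 6c 73
"""
raw = bytes.fromhex("".join(DUMP_HEX.split()))

CSMAGIC_EMBEDDED_SIGNATURE = 0xFADE0CC0  # magic of the super binary block


def parse_superblob_index(data):
    """Return (total_length, [(blob_type, offset), ...]) for a SuperBlob.

    Hypothetical helper: all fields are read as big-endian uint32 values.
    """
    magic, length, count = struct.unpack_from(">III", data, 0)
    if magic != CSMAGIC_EMBEDDED_SIGNATURE:
        raise ValueError("not an embedded-signature SuperBlob")
    # The index entries start right after the 12-byte header, 8 bytes each.
    index = [struct.unpack_from(">II", data, 12 + 8 * i) for i in range(count)]
    return length, index


length, index = parse_superblob_index(raw)
print("signature length:", length)
for blob_type, offset in index:
    print(f"type {blob_type:#x} at offset {offset}")
```

The decoded length and the three offsets can be compared with the "embedded signature of 5254 bytes" line and the @36, @609 and @669 markers in jtool's output.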
Subbinary block of code signature
We can use jtool to print a parsed view of the code signature:
$ jtool -arch x86_64 --sig -v /bin/ls
Blob at offset: 53808 (5728 bytes) is an embedded signature of 5254 bytes, and 3 blobs
Blob 0: Type: 0 @36: Code Directory (573 bytes)
Version: 20100
Flags: none (0x0)
Platform Binary
CodeLimit: 0xd230
Identifier: com.apple.ls (0x30)
CDHash: 46cc1da7c874a5853984a286ffecb48daf2f65f023d10258a31118acfc8a3697 (computed)
# of Hashes: 14 code + 2 special
Hashes @125 size: 32 Type: SHA-256
Requirements blob: a8ccc60c2a5bff15805beb8687c6a899db386d964a5eb3cf3c895753f6879cea (OK)
Bound Info.plist: Not Bound
Slot 0 (File page @0x0000): e4a537939e00f4974e02b03d36e4dab75f7dc095d2214ba66bc53c73c145ceff (OK)
Slot 1 (File page @0x1000): ad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 (OK)
Slot 2 (File page @0x2000): ad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 (OK)
Slot 3 (File page @0x3000): 4a7cb3e6c1b3a6ac82e3575239ee53d4f0d3bed260fed63438fd21ce0d00392e (OK)
Slot 4 (File page @0x4000): 9ec9e4e02292dfda34ef3caa8317e8bfbcc41a46b18d994dba45febe31b8c660 (OK)
Slot 5 (File page @0x5000): 037285f744f366210cde48821261d4a5f5b739dcf0b82f94144613e92c4b7c07 (OK)
Slot 6 (File page @0x6000): be89c764e52382702918f2db62ff24d9df40410fe894b11d505a4abc1f854340 (OK)
Slot 7 (File page @0x7000): a6b322014743965656e796155c1e0bf22e19a3e8770a43f1111cfbc961037d26 (OK)
Slot 8 (File page @0x8000): a643fc9485d941019cbdeead1d5c47add9382417ebe4d15768221f3763553b84 (OK)
Slot 9 (File page @0x9000): ad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 (OK)
Slot 10 (File page @0xa000): ad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 (OK)
Slot 11 (File page @0xb000): ad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 (OK)
Slot 12 (File page @0xc000): 23304ae11c1ade4411cb63a0955eb644574b8af416e4e3818e382421272ae1b4 (OK)
Slot 13 (File page @0xd000): e0ca7b7000d04057e71c49365b1937711b3557f6b91e0fa144791c66de2a7a4d (OK)
Blob 1: Type: 2 @609: Requirement Set (60 bytes) with 1 requirement:
0: Designated Requirement (@20, 28 bytes): SIZE: 28
Ident: (com.apple.ls) AND Apple Anchor
Blob 2: Type: 10000 @669: Blob Wrapper (4585 bytes) (0x10000 is CMS (RFC3852) signature)
CA: Apple Certification Authority CN: Apple Root CA
CA: Apple Certification Authority CN: Apple Code Signing Certification Authority
CA: Apple Certification Authority CN: Apple Root CA
CA: Apple Certification Authority CN: Apple Root CA
CA: Apple Certification Authority CN: Apple Code Signing Certification Authority
CA: Apple Software CN: Software Signing
Time: 201222002625Zi
The signature contains three blobs, that is, three sub-binary blocks: Blob 0 is the code directory, Blob 1 holds the requirements, and Blob 2 is the CMS signature. Each of them is analyzed below.
Code Directory
The code directory is the core of the signature block: it provides the hash values of the signed resources. Code signing does not hash the file as a single unit, because binaries can be large and hashing the whole content at once is expensive, and because binaries are loaded on demand rather than mapped into memory all at once. Instead, when a file is signed, the MachO file is divided into pages and each page is hashed separately.
The code directory describes this signature information, including the hash of each page, the hash algorithm and the page size. Its data structure is as follows:
/*
 * C form of a CodeDirectory.
 */
typedef struct __CodeDirectory {
    uint32_t magic;         /* magic number (CSMAGIC_CODEDIRECTORY) */
    uint32_t length;        /* total length of CodeDirectory blob */
    uint32_t version;       /* compatibility version */
    uint32_t flags;         /* setup and mode flags */
    uint32_t hashOffset;    /* offset of hash slot element at index zero */
    uint32_t identOffset;   /* offset of identifier string */
    uint32_t nSpecialSlots; /* number of special hash slots */
    uint32_t nCodeSlots;    /* number of ordinary (code) hash slots */
    uint32_t codeLimit;     /* limit to main image signature range */
    uint8_t hashSize;       /* size of each hash in bytes */
    uint8_t hashType;       /* type of hash (cdHashType* constants) */
    uint8_t platform;       /* platform identifier; zero if not platform binary */
    uint8_t pageSize;       /* log2(page size in bytes); 0 => infinite */
    uint32_t spare2;        /* unused (must be zero) */
    /* Version 0x20100 */
    uint32_t scatterOffset; /* offset of optional scatter vector */
    /* Version 0x20200 */
    uint32_t teamOffset;    /* offset of optional team identifier */
    /* followed by dynamic content as located by offset fields above */
} CS_CodeDirectory;
MachOView can display this data at the offset of the CodeDirectory:
To relate the data to the structure, we focus on the values of three of the uint8_t fields:
Parameter | Value | Meaning |
---|---|---|
hashSize | 0x20 | Each hash value is 0x20 (32) bytes long. |
hashType | 0x02 | The hash algorithm: 0x01 is SHA-1, 0x02 is SHA-256. Starting with macOS 10.12 and iOS 11, Apple switched to SHA-256. |
pageSize | 0x0C | Stored as a base-2 logarithm: log2(page size) = 0x0C. |
Page size = 2^0x0C = 4096 = 0x1000 bytes (4 KB), which is consistent with the system's page size.
From this, you can see that the entire MachO file is paged by 0x1000 bytes, and the hashes are computed using SHA-256. These computed hashes are recorded in Code Slots.
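The same fields can be read straight out of the raw bytes. Below is a rough sketch that decodes the fixed part of the CodeDirectory header from the od dump shown earlier, assuming the big-endian layout of the CS_CodeDirectory structure above. The names parse_code_directory, FIELDS and CD_OFFSET are illustrative only; the offset 36 is the one jtool reports for Blob 0.

```python
import struct

# First 96 bytes of ls.signature, copied from the od dump shown earlier.
DUMP_HEX = """
fa de 0c c0 00 00 14 86 00 00 00 03 00 00 00 00
00 00 00 24 00 00 00 02 00 00 02 61 00 01 00 00
00 00 02 9d fa de 0c 02 00 00 02 3d 00 02 01 00
00 00 00 00 00 00 00 7d 00 00 00 30 00 00 00 02
00 00 00 0e 00 00 d2 30 20 02 0b 0c 00 00 00 00
00 00 00 00 63 6f 6d 2e 61 70 70 6c 65 2e 6c 73
"""
raw = bytes.fromhex("".join(DUMP_HEX.split()))

CD_OFFSET = 36  # offset of the type-0 blob (Code Directory) per jtool

FIELDS = ("magic", "length", "version", "flags", "hashOffset", "identOffset",
          "nSpecialSlots", "nCodeSlots", "codeLimit",
          "hashSize", "hashType", "platform", "pageSize",
          "spare2", "scatterOffset")


def parse_code_directory(data, base):
    """Decode the header fields up to scatterOffset (version 0x20100)."""
    # 9 x uint32, 4 x uint8, 2 x uint32, all big-endian: 48 bytes in total.
    values = struct.unpack_from(">9I4B2I", data, base)
    cd = dict(zip(FIELDS, values))
    # The identifier is a NUL-terminated string at identOffset; the dump
    # excerpt is cut off right after it, so stop at NUL or end of data.
    start = base + cd["identOffset"]
    cd["identifier"] = data[start:].split(b"\x00", 1)[0].decode("ascii")
    return cd


cd = parse_code_directory(raw, CD_OFFSET)
page_size = 1 << cd["pageSize"]                    # pageSize is log2(bytes)
expected_slots = -(-cd["codeLimit"] // page_size)  # ceiling division

print(cd["identifier"], hex(cd["version"]), "page size:", page_size)
print("code slots:", cd["nCodeSlots"], "derived from codeLimit:", expected_slots)
```

Dividing codeLimit by the page size and rounding up gives the number of code slots, which is a quick sanity check on a parsed header.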
Code slot validation
The slots marked 0 to 13 above correspond to code slots.
Given these rules, we can also verify the code signature by hand. Take the first three code slots, each covering 0x1000 bytes, as an example and compute their hash values manually.
$ lipo /bin/ls -thin x86_64 -output /tmp/ls_x86_64
$ dd bs=0x1000 skip=0 count=1 if=/tmp/ls_x86_64 2>/dev/null | openssl sha256
SHA256(stdin)= e4a537939e00f4974e02b03d36e4dab75f7dc095d2214ba66bc53c73c145ceff
$ dd bs=0x1000 skip=1 count=1 if=/tmp/ls_x86_64 2>/dev/null | openssl sha256
SHA256(stdin)= ad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7
$ dd bs=0x1000 skip=2 count=1 if=/tmp/ls_x86_64 2>/dev/null | openssl sha256
SHA256(stdin)= ad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7
Notice that the last two results are identical: those two pages consist entirely of zero padding.
Compare the values of the previous three code slots:
Slot 0 (File page @0x0000): e4a537939e00f4974e02b03d36e4dab75f7dc095d2214ba66bc53c73c145ceff (OK)
Slot 1 (File page @0x1000): ad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 (OK)
Slot 2 (File page @0x2000): ad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7 (OK)
The (OK) marker is the result of jtool's own validation.
The jtool output lists 14 code slots, and the File page value is the starting offset of each page within the file. Also notice this line in the output:
# of Hashes: 14 code + 2 special
It represents 14 code slots and 2 special slots.
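The dd and openssl steps used for the manual check generalize to a short loop. The sketch below is an approximation of how the code slots could be recomputed, not jtool's actual implementation: it assumes SHA-256, a 0x1000-byte page, and that the final page is hashed only up to codeLimit. The function name code_slot_hashes and the synthetic image are invented for the example.

```python
import hashlib


def code_slot_hashes(image, code_limit, page_size=0x1000):
    """Hash image[0:code_limit] page by page; the last page may be short."""
    hashes = []
    for start in range(0, code_limit, page_size):
        page = image[start:min(start + page_size, code_limit)]
        hashes.append(hashlib.sha256(page).hexdigest())
    return hashes


# Synthetic stand-in for a thin MachO file: one page of data, two all-zero
# padding pages, and a short final page.
image = b"\x01" * 0x1000 + b"\x00" * 0x2000 + b"\x02" * 0x230
slots = code_slot_hashes(image, code_limit=len(image))
for i, digest in enumerate(slots):
    print(f"Slot {i} (File page @0x{i * 0x1000:04x}): {digest}")
```

To try it on the real binary, one would read /tmp/ls_x86_64 into image and pass code_limit=0xd230. If the zero-padding observation holds, the digest printed for the all-zero pages here should match the value that repeats in slots 1 and 2.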
Special slots
Special slots exist because an app consists of more than just its binary. To protect the integrity of these non-binary parts, they are hashed as well, and their hash values are stored in the special slots. Because code slot indexes start at 0 and the number of code slots varies, special slots are given negative indexes so that both kinds can coexist. The special slots are defined as follows:
# | Slot purpose |
---|---|
-1 | Bound Info.plist |
-2 | Requirements: the requirements blob embedded in the code signature |
-3 | Resource directory: hash of the _CodeSignature/CodeResources file |
-4 | Application-specific: not actually used |
-5 | Entitlements: the entitlements (authorization) embedded in the code signature |
We can find the contents of the special slot in the jtool output above:
Requirements blob: a8ccc60c2a5bff15805beb8687c6a899db386d964a5eb3cf3c895753f6879cea (OK)
Bound Info.plist: Not Bound
Because each special slot has a fixed role, jtool labels them by name instead of by index.
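The negative indexes map directly onto the layout of the hash array. As a small sketch, assuming each slot sits at hashOffset + index * hashSize inside the CodeDirectory (so that special slots are stored just before hashOffset), the offsets for this binary work out as follows. slot_offset is a hypothetical helper; the numbers come from the jtool output above.

```python
def slot_offset(hash_offset, hash_size, index):
    """Offset of a hash slot inside the CodeDirectory blob.

    index >= 0 selects a code slot, index < 0 selects a special slot.
    """
    return hash_offset + index * hash_size


HASH_OFFSET, HASH_SIZE = 125, 32  # "Hashes @125 size: 32"
N_SPECIAL, N_CODE = 2, 14         # "14 code + 2 special"

for i in range(-N_SPECIAL, N_CODE):
    kind = "special" if i < 0 else "code"
    print(f"slot {i:3d} ({kind}) at offset {slot_offset(HASH_OFFSET, HASH_SIZE, i)}")

# One past the last code slot: the end of the hash array.
end_of_hashes = slot_offset(HASH_OFFSET, HASH_SIZE, N_CODE)
print("hash array ends at", end_of_hashes)
```

The end of the hash array can be compared with the 573-byte size jtool reports for the Code Directory.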
Code signing Requirements
So far, code signing amounts to hashing the file page by page and storing the results, which does not seem powerful enough on its own. Apple therefore added another mechanism to code signing: requirements. Requirements let custom rules impose specific restrictions, such as which dynamic libraries are allowed to load.
Requirements have their own syntax, in which expressions are built from operands and opcodes. The rich opcode set makes it possible to construct arbitrarily complex logical conditions. The available opcodes are listed in the requirements.h file:
enum ExprOp {
    opFalse,              // unconditionally false
    opTrue,               // unconditionally true
    opIdent,              // match canonical code [string]
    opAppleAnchor,        // signed by Apple as Apple's product
    opAnchorHash,         // match anchor [cert hash]
    opInfoKeyValue,       // *legacy* match Info.plist field [key; value]
    opAnd,                // binary prefix expr AND expr
    opOr,                 // binary prefix expr OR expr
    opCDHash,             // match hash of CodeDirectory directly
    opNot,                // logical inverse
    opInfoKeyField,       // Info.plist key field [string; match suffix]
    opCertField,          // Certificate field [cert index; field name; match suffix]
    opTrustedCert,        // require trust settings to approve one particular cert [cert index]
    opTrustedCerts,       // require trust settings to approve the cert chain
    opCertGeneric,        // Certificate component by OID [cert index; oid; match suffix]
    opAppleGenericAnchor, // signed by Apple in any capacity
    opEntitlementField,   // entitlement dictionary field [string; match suffix]
    exprOpCount           // (total opcode count in use)
};
Requirements are compiled with csreq, and they can be verified with codesign -v.
Let's try to interpret some existing requirements.
For most binaries the requirement simply verifies the signing identity, that is, whether the certificate was issued by Apple. Apps in the App Store use a stricter rule set. We can look at Xcode's code signing requirements:
$ codesign -d -r- /Applications/Xcode.app/Contents/MacOS/Xcode
Executable=/Applications/Xcode.app/Contents/MacOS/Xcode
designated => (anchor apple generic and certificate leaf[field.1.2.840.113635.100.6.1.9] /* exists */ or anchor apple generic and certificate 1[field.1.2.840.113635.100.6.2.6] /* exists */ and certificate leaf[field.1.2.840.113635.100.6.1.13] /* exists */ and certificate leaf[subject.OU] = APPLECOMPUTER) and identifier "com.apple.dt.Xcode"
Note the many identifiers beginning with 1.2.840.113635: this is Apple's branch of the OID tree (iso.member-body.us.appleOID) as used in certificates. The 100 that follows is appleDataSecurity, which holds security-related definitions; these can be looked up at oidref.com.
Looking at Xcode's signature requirements, we can roughly infer what these rules mean:
- It is signed by Apple and the leaf certificate contains field 6.1.9, the Mac App Store application.
- Or it is signed by Apple and certificate 1 contains field 6.2.6, "dev_program" (presumably the development version of the application).
- The leaf certificate contains field 6.1.13, Developer ID Application.
- The team identifier (OU) of the leaf certificate is APPLECOMPUTER, and the bundle ID is com.apple.dt.Xcode.
Note the last item: it pins the team identifier and the bundle ID, so that the application cannot simply be re-signed under another identity.
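To make the grouping of the and/or terms concrete, here is a toy model of the Xcode requirement as a plain boolean function. It is only a reading aid under several assumptions: real requirements are compiled to binary opcodes and evaluated by the system, whereas the code dictionaries, the helper name xcode_requirement and the team identifier ABCDE12345 below are all invented for this example.

```python
APPLE_OID = "1.2.840.113635.100.6"


def xcode_requirement(code):
    """Toy model of the designated requirement shown above.

    `code` is a hypothetical dict describing a signed binary; it is not a
    real Security framework object.
    """
    leaf, cert1 = code["leaf"], code["cert1"]
    # anchor apple generic and certificate leaf[field ...6.1.9] exists
    app_store = (code["anchor_apple_generic"]
                 and f"{APPLE_OID}.1.9" in leaf["fields"])
    # anchor apple generic and certificate 1[...6.2.6] and
    # leaf[...6.1.13] and leaf[subject.OU] = APPLECOMPUTER
    developer_id = (code["anchor_apple_generic"]
                    and f"{APPLE_OID}.2.6" in cert1["fields"]
                    and f"{APPLE_OID}.1.13" in leaf["fields"]
                    and leaf["subject_OU"] == "APPLECOMPUTER")
    # (... or ...) and identifier "com.apple.dt.Xcode"
    return (app_store or developer_id) and code["identifier"] == "com.apple.dt.Xcode"


genuine = {
    "anchor_apple_generic": True,
    "identifier": "com.apple.dt.Xcode",
    "leaf": {"fields": {f"{APPLE_OID}.1.9"}, "subject_OU": "APPLECOMPUTER"},
    "cert1": {"fields": set()},
}
# Same bundle ID, but re-signed with a different (made-up) Developer ID team.
resigned = {
    "anchor_apple_generic": True,
    "identifier": "com.apple.dt.Xcode",
    "leaf": {"fields": {f"{APPLE_OID}.1.13"}, "subject_OU": "ABCDE12345"},
    "cert1": {"fields": {f"{APPLE_OID}.2.6"}},
}

print("genuine satisfies requirement:", xcode_requirement(genuine))
print("re-signed satisfies requirement:", xcode_requirement(resigned))
```

In this model a copy re-signed by another team fails the check even though it keeps the same bundle ID, because neither branch of the or-expression is satisfied.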
CMS
CMS stands for Cryptographic Message Syntax, a standard signature format defined in RFC 3852. The book does not cover it, but I think it is the most important part of code signing.
In addition to the certificates, a CMS signature carries other information, such as the signed attributes, signedAttrs.
The hash values of the MachO pages are stored in the CodeDirectory, so the pages are protected as long as the CodeDirectory itself is not modified. It is therefore enough to hash the code directory to obtain the CDHash, and then sign the CDHash.
Note that this step is the actual signature, where cryptography comes in; the code slots described earlier only provide digest information.
Jtool signature output:
CDHash: 46cc1da7c874a5853984a286ffecb48daf2f65f023d10258a31118acfc8a3697 (computed)
This is the CDHash computed externally from the file; it is compared against the value carried in signedAttrs. Even more critical is the cryptographic verification of signedAttrs itself. The actual verification process is more complex; interested readers can find the details in the article iOS code signature (three). Based on that article, I drew the verification flow:
There are two hash comparisons. One decrypts the signature over signedAttrs to make sure signedAttrs can be trusted. The other compares the CDHash to make sure the code has not been modified.
SignerInfo contains signedAttrs, the hash algorithm used for the signature, the encryption algorithm and the signature data. Combined with the public key in the certificate, this lets us verify the validity of signedAttrs.
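The "computed" CDHash can be approximated in a few lines. This sketch assumes that the CDHash is the digest of the entire CodeDirectory blob, taken with the algorithm named by its hashType field, and it covers only that step: it does not parse or verify the CMS data. The function cd_hash and the lookup table are illustrative names, not part of any Apple API.

```python
import hashlib
import struct

CSMAGIC_EMBEDDED_SIGNATURE = 0xFADE0CC0
CSSLOT_CODEDIRECTORY = 0
HASH_BY_TYPE = {1: hashlib.sha1, 2: hashlib.sha256}  # hashType -> algorithm


def cd_hash(signature):
    """Return the hex digest of the CodeDirectory blob in a signature."""
    magic, _length, count = struct.unpack_from(">III", signature, 0)
    if magic != CSMAGIC_EMBEDDED_SIGNATURE:
        raise ValueError("not an embedded-signature SuperBlob")
    for i in range(count):
        blob_type, offset = struct.unpack_from(">II", signature, 12 + 8 * i)
        if blob_type == CSSLOT_CODEDIRECTORY:
            _cd_magic, cd_length = struct.unpack_from(">II", signature, offset)
            hash_type = signature[offset + 37]  # hashType follows hashSize
            blob = signature[offset:offset + cd_length]
            return HASH_BY_TYPE[hash_type](blob).hexdigest()
    raise ValueError("no CodeDirectory blob found")


# Tiny synthetic signature: a SuperBlob holding one 50-byte CodeDirectory.
fake_cd = struct.pack(">9I4B", 0xFADE0C02, 50, 0x20100, 0, 0, 0, 0, 0, 0,
                      32, 2, 0, 12) + b"x" * 10
fake_sig = (struct.pack(">III", CSMAGIC_EMBEDDED_SIGNATURE, 20 + len(fake_cd), 1)
            + struct.pack(">II", CSSLOT_CODEDIRECTORY, 20)
            + fake_cd)
print(cd_hash(fake_sig))
```

Passing the bytes of the extracted ls.signature file to cd_hash() should, under the assumption above, give a value to compare with the CDHash line printed by jtool.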
Authorization (entitlements)
In addition to ensuring the authenticity and integrity of code, code signing underpins entitlements (authorization), another of Apple's powerful security mechanisms. The entitlements file is also covered by the signature: its hash is stored in the special slot with index -5. The entitlements file is an XML file, and jtool --ent displays its content. Because ls has no entitlements, we take WeChat for Mac as an example:
$ jtool -arch x86_64 --ent /Applications/WeChat.app/Contents/MacOS/WeChat
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>com.apple.security.app-sandbox</key>
<true/>
<key>com.apple.security.application-groups</key>
<array>
<string>5A4RE8SF68.com.tencent.xinWeChat</string>
</array>
<key>com.apple.security.device.audio-input</key>
<true/>
<key>com.apple.security.device.camera</key>
<true/>
<key>com.apple.security.device.microphone</key>
<true/>
<key>com.apple.security.files.downloads.read-write</key>
<true/>
<key>com.apple.security.files.user-selected.read-write</key>
<true/>
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.network.server</key>
<true/>
<key>com.apple.security.personal-information.location</key>
<true/>
</dict>
</plist>
Here we can see a number of permissions: the sandbox, application groups, audio input, camera and more. When an application calls a particular API, the system can use these entitlements to decide whether the action is allowed. Because Apple is the final signer of applications, it can also easily modify the entitlements during signing; for example, the sandbox entitlement com.apple.security.sandbox.container-required is forcibly written into the entitlements file.
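Since the entitlements are an ordinary XML property list, they are easy to inspect programmatically. The sketch below parses a shortened copy of the WeChat output with Python's standard plistlib module; the variable names are arbitrary, and only three of the keys are reproduced to keep the example small.

```python
import plistlib

# Shortened copy of the entitlements printed by jtool --ent.
ENTITLEMENTS_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>com.apple.security.app-sandbox</key>
    <true/>
    <key>com.apple.security.application-groups</key>
    <array>
        <string>5A4RE8SF68.com.tencent.xinWeChat</string>
    </array>
    <key>com.apple.security.device.camera</key>
    <true/>
</dict>
</plist>
"""

entitlements = plistlib.loads(ENTITLEMENTS_XML)

# Boolean entitlements that are switched on.
granted = sorted(key for key, value in entitlements.items() if value is True)
for key in granted:
    print("granted:", key)
print("groups:", entitlements["com.apple.security.application-groups"])
```

A check such as "is the app sandboxed?" then becomes a simple dictionary lookup on the parsed result.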
Enforce validation of code signatures
For code signing to be truly effective, validation must be enforced reliably and without gaps. Signature validation currently takes place in kernel mode, not user mode. It occurs in two phases: when the executable is loaded, and when a page of the binary is actually accessed (page fault). The split is for performance reasons: binaries are loaded lazily, so a part that has not been loaded yet is validated only when it is brought into memory, that is, when a page fault occurs.
Loading of executable files
An executable is loaded when the execve()/mac_execve() or posix_spawn() system call is invoked. For a MachO file, exec_mach_imgact() is called; while parsing the file it finds the location of LC_CODE_SIGNATURE, and the code signature blob is loaded into the kernel's unified buffer cache (UBC).
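The lookup of LC_CODE_SIGNATURE can be imitated in user space. The following is a rough sketch, not the kernel's code: it assumes a thin, little-endian 64-bit MachO image, walks the load commands after the 32-byte mach_header_64, and returns the dataoff and datasize fields of the LC_CODE_SIGNATURE command (value 0x1d). The function name find_code_signature and the synthetic test image are made up for the illustration.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
LC_CODE_SIGNATURE = 0x1D


def find_code_signature(macho):
    """Return (dataoff, datasize) of LC_CODE_SIGNATURE, or None if absent."""
    magic, _cpu, _sub, _ftype, ncmds, _size, _flags, _rsv = struct.unpack_from(
        "<8I", macho, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit MachO image")
    offset = 32  # sizeof(struct mach_header_64)
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", macho, offset)
        if cmd == LC_CODE_SIGNATURE:
            # linkedit_data_command: cmd, cmdsize, dataoff, datasize
            return struct.unpack_from("<II", macho, offset + 8)
        offset += cmdsize
    return None


# Synthetic image: a header plus two load commands (LC_UUID, LC_CODE_SIGNATURE).
lc_uuid = struct.pack("<II", 0x1B, 24) + bytes(16)
lc_sig = struct.pack("<4I", LC_CODE_SIGNATURE, 16, 53808, 5728)
header = struct.pack("<8I", MH_MAGIC_64, 0x01000007, 3, 2, 2,
                     len(lc_uuid) + len(lc_sig), 0, 0)
print(find_code_signature(header + lc_uuid + lc_sig))
```

The offset and size used in the synthetic command mirror the "Blob at offset: 53808 (5728 bytes)" line from the jtool output; on a real thin binary the function would read them from the file itself.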
Page Fault handling
osfmk/vm/vm_fault.c
/*
 * CODE SIGNING:
 * When soft faulting a page, we have to validate the page if:
 * 1. the page is being mapped in user space
 * 2. the page hasn't already been found to be "tainted"
 * 3. the page belongs to a code-signed object
 * 4. the page has not been validated yet or has been mapped for write.
 */
#define VM_FAULT_NEED_CS_VALIDATION(pmap, page, page_obj)       \
        ((pmap) != kernel_pmap /*1*/ &&                         \
         !(page)->cs_tainted /*2*/ &&                           \
         (page_obj)->code_signed /*3*/ &&                       \
         (!(page)->cs_validated || (page)->wpmapped /*4*/))
If the Page Fault meets the following conditions, the signature verification process is triggered:
1. The page is being mapped in user space
2. This page has not yet been found to be tainted
3. The page belongs to a code signing object
4. The page has not been validated yet, or it has been mapped for writing
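The four conditions combine into a single predicate. The snippet below restates the macro as a plain function so the individual cases can be tried out; it is a model for reasoning only, and the Page class and needs_cs_validation name are invented stand-ins for the kernel's structures.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for the flags the macro reads from a VM page."""
    cs_tainted: bool = False
    cs_validated: bool = False
    wpmapped: bool = False


def needs_cs_validation(is_kernel_pmap, page, object_code_signed):
    """Mirror of VM_FAULT_NEED_CS_VALIDATION as a boolean function."""
    return (
        not is_kernel_pmap                             # 1. mapped in user space
        and not page.cs_tainted                        # 2. not already tainted
        and object_code_signed                         # 3. code-signed object
        and (not page.cs_validated or page.wpmapped)   # 4. unvalidated or writable
    )


print(needs_cs_validation(False, Page(), True))
print(needs_cs_validation(False, Page(cs_validated=True), True))
print(needs_cs_validation(False, Page(cs_validated=True, wpmapped=True), True))
```

The third call shows why condition 4 matters: a page that was validated earlier is checked again once it has been mapped for writing.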
Vulnerabilities in code signing
The code signing mechanism does a great deal to protect applications, but it has been bypassed in the past. Here are a few examples.
JIT (Just-in-time code generation)
This happens during page fault handling: if the page belongs to a JIT region, it is specially marked so that arbitrary code can be generated and executed there without a code signature.
Starting with iOS 10, Apple began hardening JIT on 64-bit devices. A specialized memcpy() is used with a JIT region mapped as executable but unreadable memory; the executable mapping of the JIT is made non-writable, and the writable mapping non-executable.
Jekyll application
Jekyll apps appear harmless when submitted to the App Store but contain dormant malicious functionality. After passing review, the app cooperates with a server: it voluntarily exposes its address space and symbols, and the preset malicious behavior is triggered through code injection or Return-Oriented Programming (ROP).
There is no reliable defense against ROP, but the sandbox keeps the reach of the malicious code manageable.
Apple's plan to have apps submitted to the App Store as LLVM bitcode also makes it harder for malicious apps to know their address space in advance.
Memory locking
As shown above, signature verification is triggered by a page fault, so without a page fault there is no verification. By calling mmap() -> mlock() -> memcpy() -> mprotect() in that order, an application could modify executable memory and patch it however it liked. XNU normally prevents memory that was once writable from being set to r-x, but this check was bypassed when the memory was locked.
Apple fixed the bug in iOS 9.3.
Conclusion
Let's return to the two questions raised at the beginning:
1. How do we verify that the source of the code is legitimate?
The source is verified through certificates. All developer certificates are issued by Apple and chain up to the Root CA. Requirements can extend this with additional checks.
2. How do we confirm that the code has not been changed?
Through the code slots and the CDHash: the code slots hold the per-page hashes, the CDHash covers the CodeDirectory that stores them, and the CDHash is what gets signed. Note that the actual verification involves two key hash comparisons; the flow chart above illustrates them.