
After I published the previous summary of the OCPack technical scheme, several readers sent private messages asking about the details of its ARC implementation. I had never written that part up properly, so this article fills the gap.

If you have not read the previous article, I recommend reading it first:

Self-developed iOS hot update mechanism — OCPack technical scheme summary

The main text begins below.

An important feature of the OCPack technical scheme is that the business side develops directly in Objective-C. Any use of Objective-C inevitably involves ARC, so how does OCPack support it?

I. ARC background knowledge

As is well known, ARC is a compiler-supported mechanism that automatically inserts memory management calls such as retain/release into the code, which greatly improves development efficiency by removing the burden of manual memory management. So how does the compiler implement ARC? Does the Clang syntax tree contain nodes that correspond directly to retain/release operations, which our AST parser could use?

To answer these questions, we need to look closely at how Clang handles ARC, examining both its syntax tree nodes and its code implementation.

Investigation shows that the Clang syntax tree has no nodes that correspond directly to retain/release or similar calls. A syntax tree generated with ARC enabled only contains some special kinds of cast expression nodes (plus some ownership type information attached to other AST nodes), and their positions are not directly related to where retain/release is actually generated. These cast expression nodes and type annotations merely hint at when retain/release should be generated; the logic that actually generates them is scattered throughout the vast body of Clang's CodeGen code.

To understand the relevant code in Clang's CodeGen, we first need a more detailed understanding of the ARC rules, so let us look at the ARC language specification in the official Clang documentation.

II. ARC language specification

The ARC specification covers ARC usage in two parts. The key points are as follows:

1. Retainable object pointers:

  • Operations that perform no retain/release
    • Loading a retainable pointer from a non-weak object
    • Passing a retainable pointer as an argument to a function or method
    • Receiving a retainable pointer as the result of a function or method call
  • Consumed parameters
    • The parameter declaration carries the ns_consumed attribute
    • The callee expects to receive an object whose reference count has already been incremented (+1) and to take ownership of it
    • When passing such an argument, ARC retains it before making the call
    • When receiving such an argument, ARC releases it at the end of the callee
    • Example: the self parameter of an init method is consumed
  • Retained return values
    • The function or method is declared with the ns_returns_retained attribute
    • The caller expects to receive a +1 object and to take ownership of it
    • In the callee, ARC retains the result when the return statement is evaluated, before leaving the local scopes
    • In the caller, ARC releases the returned object at the end of the enclosing full-expression
    • Example: methods in the alloc, copy, init, mutableCopy, and new families
    • To suppress this behavior, add the __attribute__((ns_returns_not_retained)) attribute
  • Unretained return values
    • ARC retains the result when the return statement is evaluated, then leaves the local scopes, and then balances that retain with a release once the value is guaranteed to survive across the call boundary
      • In the worst case this results in an autorelease
    • The ns_returns_autoreleased attribute indicates that the returned object stays valid at least until the innermost autorelease pool is drained
  • Bridged casts
    • (__bridge T)op
      • No retain/release (no transfer of ownership)
    • (__bridge_retained T)op
      • op must be a retainable object pointer
      • T must be a non-retainable pointer type
      • ARC retains the object, and the recipient is responsible for balancing that +1
    • (__bridge_transfer T)op
      • op must be a non-retainable pointer
      • T must be a retainable object pointer type
      • ARC releases the value at the end of the enclosing full-expression
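
The ownership handoff for return values can be illustrated with a small simulation. The following Python sketch is not OCPack or Clang code: the Obj and AutoreleasePool classes and the two factory functions are hypothetical names invented for this illustration, and retain counts are modeled as plain integers.

```python
class Obj:
    """Toy stand-in for an Objective-C object with a manual retain count."""

    def __init__(self, name):
        self.name = name
        self.retain_count = 1      # a new object starts at +1, owned by its creator
        self.deallocated = False

    def retain(self):
        self.retain_count += 1
        return self

    def release(self):
        self.retain_count -= 1
        if self.retain_count == 0:
            self.deallocated = True


class AutoreleasePool:
    """Collects objects whose release is deferred until the pool drains."""

    def __init__(self):
        self.pending = []

    def autorelease(self, obj):
        self.pending.append(obj)
        return obj

    def drain(self):
        for obj in self.pending:
            obj.release()
        self.pending.clear()


def make_retained():
    # Models an ns_returns_retained method (alloc/new/copy family):
    # the caller receives the object at +1 and must release it.
    return Obj("retained")


def make_unretained(pool):
    # Models an ordinary method returning an unretained value.
    local = Obj("unretained")        # the local strong variable holds +1
    result = local.retain()          # retain at the return statement: +2
    local.release()                  # leaving the local scope: back to +1
    return pool.autorelease(result)  # balance the retain; worst case is autorelease


pool = AutoreleasePool()
owned = make_retained()
borrowed = make_unretained(pool)
owned.release()   # the caller balances the +1 it was handed
pool.drain()      # the pool balances the autoreleased value
print(owned.deallocated, borrowed.deallocated)  # prints "True True"
```

In both cases the object ends up balanced; the difference is only who performs the final release, the caller or the pool.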

2. Ownership qualification (object lifetime)

2.1 Keywords in the program

  • Ownership qualifiers on variables:
    • __autoreleasing
    • __strong
    • __unsafe_unretained
    • __weak
  • Mapping from property attributes to ownership qualifiers
    • assign: __unsafe_unretained
    • copy: __strong
    • retain: __strong
    • strong: __strong
    • unsafe_unretained: __unsafe_unretained
    • weak: __weak

2.2 Operations on objects

  • Reading (lvalue-to-rvalue conversion)
    • __weak: the current pointee is retained, then released at the end of the enclosing full-expression
    • Others: no special handling
  • Assignment (= operator)
    • __strong
      • Retain the new pointee
      • Load the old pointee from the lvalue
      • Store the new pointee into the lvalue
      • Release the old pointee
    • __weak
      • The lvalue is updated to point to the new pointee
      • If the new pointee is currently being deallocated, the lvalue is set to a null pointer instead
    • __unsafe_unretained
      • No special handling
    • __autoreleasing
      • The new pointee is retained, autoreleased, and then stored into the lvalue
  • Initialization
    • Store a null pointer into the lvalue
      • This step is skipped for __unsafe_unretained objects
    • If the object has an initializer, evaluate that expression and assign its value to the variable using the normal assignment semantics
  • Destruction
    • Equivalent to assigning a null pointer to the variable
  • Moving (C++ only)
    • Omitted here
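
The four-step order of a __strong assignment matters most when the old and new pointees are the same object. The sketch below models it in Python under stated assumptions: Obj and StrongSlot are hypothetical names made up for this example, not part of Clang or OCPack, and only __strong initialization, assignment, and destruction are modeled.

```python
class Obj:
    """Toy object with a manual retain count."""

    def __init__(self, name):
        self.name = name
        self.retain_count = 1      # +1 held by whoever created the object
        self.deallocated = False

    def retain(self):
        self.retain_count += 1
        return self

    def release(self):
        self.retain_count -= 1
        if self.retain_count == 0:
            self.deallocated = True


class StrongSlot:
    """Models a __strong variable: the slot owns whatever it points to."""

    def __init__(self, initializer=None):
        self.value = None              # initialization step 1: store a null pointer
        if initializer is not None:
            self.assign(initializer)   # step 2: assign the initializer normally

    def assign(self, new):
        if new is not None:
            new.retain()               # 1. retain the new pointee
        old = self.value               # 2. load the old pointee
        self.value = new               # 3. store the new pointee
        if old is not None:
            old.release()              # 4. release the old pointee

    def destroy(self):
        self.assign(None)              # destruction behaves like assigning nil


a = Obj("a")
slot = StrongSlot(a)   # the slot retains a: count 2
a.release()            # the creator gives up its reference: count 1
slot.assign(a)         # self-assignment: retain first, so a never reaches 0
print(a.retain_count, a.deallocated)  # prints "1 False"
```

If the release came before the retain, the self-assignment above would briefly drop the count to zero and destroy the object that is about to be stored.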

III. ARC-related implementation in OCPack

Following the ARC specification, OCPack simplifies and adapts the complex ARC logic in the Clang source code (CGObjC.cpp) while remaining basically compliant with the specification.

The ARC implementation in OCPack is introduced below in two parts.

3.1 Processing of ARC-related AST nodes in OCPack

OCPack performs special processing on the following AST nodes that contain ARC information when traversing the syntax tree:

CastExpr

  • CK_ARCConsumeObject
    • Generated instruction: autorelease
  • CK_ARCProduceObject
    • Generated instructions: retain + autorelease, or
    • retain_block + autorelease
    • Note: the semantics differ from Clang here. Clang's implementation uses EmitARCRetainScalarExpr, but because we process only a single file, we cannot rely on the full-expression to control external callers, so we use autorelease instead. This breaks __bridge_retained, because the result is an autoreleased object rather than a +1 object. Therefore, this scheme does not support the __bridge_retained keyword.
  • CK_ARCReclaimReturnedObject
    • Generated instructions: retain + autorelease
  • CK_ARCExtendBlockObject
    • Generated instructions: retain_block + autorelease
  • CK_CopyAndAutoreleaseBlockObject
    • Generated instructions: copy + autorelease
  • CK_LValueToRValue
    • The ARC ownership type of the lvalue is attached to the instruction's parameters. If the lvalue is weak, the runtime calls loadWeak
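
The mapping above can be written down as a lookup table. This is a minimal Python sketch, assuming the operand names from section 3.2 and a hypothetical helper instructions_for_cast that does not come from OCPack. Two simplifications are assumptions of the sketch: "retain + autorelease" is emitted as two separate operands (OCPack may instead use the combined ARC_RETAIN_AUTORELEASE), and the weak load is resolved at emit time, whereas the text describes the decision as made by the runtime.

```python
# Instruction sequences for cast kinds that always map to the same operands.
CAST_KIND_TO_INSTRUCTIONS = {
    "CK_ARCConsumeObject": ["ARC_AUTORELEASE"],
    "CK_ARCReclaimReturnedObject": ["ARC_RETAIN", "ARC_AUTORELEASE"],
    "CK_ARCExtendBlockObject": ["ARC_RETAIN_BLOCK", "ARC_AUTORELEASE"],
    "CK_CopyAndAutoreleaseBlockObject": ["ARC_COPY", "ARC_AUTORELEASE"],
}


def instructions_for_cast(cast_kind, operand_is_block=False, lvalue_ownership=None):
    """Return the ARC operands to emit for one cast expression node."""
    if cast_kind == "CK_ARCProduceObject":
        # Blocks need retain_block; everything else gets a plain retain.
        first = "ARC_RETAIN_BLOCK" if operand_is_block else "ARC_RETAIN"
        return [first, "ARC_AUTORELEASE"]
    if cast_kind == "CK_LValueToRValue":
        # Only reads from weak lvalues need special handling.
        return ["ARC_LOAD_WEAK"] if lvalue_ownership == "weak" else []
    # Cast kinds without ARC meaning produce no extra instructions.
    return list(CAST_KIND_TO_INSTRUCTIONS.get(cast_kind, []))


print(instructions_for_cast("CK_ARCProduceObject", operand_is_block=True))
# prints "['ARC_RETAIN_BLOCK', 'ARC_AUTORELEASE']"
```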

ObjCMethodDecl (ParmVarDecl and ImplicitParamDecl)

  • Retain on method entry
    • strong: retain the parameter unless it is marked ns_consumed
  • Release on method exit
    • strong: call release
    • weak: call destroyWeak
  • Destroy-operation stack
    • Each method starts with its own destroy stack, and a record is pushed onto it each time a variable that needs to be released is encountered.
    • When the method exits (for example, at a return), all records in the current method's destroy stack are retrieved and the corresponding instructions are generated.
    • A RunCleanupsScope is also created at the start of each CompoundStmt and destroyed at its end, to manage the lifetimes of local variables in the code.
      • A RunCleanupsScope is likewise created for @autoreleasepool {}
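
The destroy stack and nested cleanup scopes can be sketched as follows. This is an illustrative Python model, not OCPack's code: the CleanupEmitter class and its method names are hypothetical, and emitting cleanups in reverse declaration order is an assumption borrowed from how scoped cleanups usually work.

```python
class CleanupEmitter:
    """Tracks nested cleanup scopes and emits destroy instructions for them."""

    def __init__(self):
        self.scopes = []        # stack of scopes; each scope is a list of records
        self.instructions = []  # emitted (operand, variable_name) pairs

    def push_scope(self):
        # Called at the start of a method body or compound statement.
        self.scopes.append([])

    def declare(self, name, ownership):
        # Only strong and weak variables need a cleanup record.
        if ownership in ("strong", "weak"):
            self.scopes[-1].append((name, ownership))

    def _emit_cleanups(self, scope):
        for name, ownership in reversed(scope):
            operand = "ARC_RELEASE" if ownership == "strong" else "ARC_DESTROY_WEAK"
            self.instructions.append((operand, name))

    def pop_scope(self):
        # Normal exit from a compound statement: clean up this scope only.
        self._emit_cleanups(self.scopes.pop())

    def emit_return(self):
        # Early exit: clean up every live scope, innermost first, without
        # popping, because code after the return still belongs to these scopes.
        for scope in reversed(self.scopes):
            self._emit_cleanups(scope)


emitter = CleanupEmitter()
emitter.push_scope()                   # method body
emitter.declare("param", "strong")
emitter.declare("count", "unsafe_unretained")
emitter.push_scope()                   # nested compound statement
emitter.declare("tmp", "weak")
emitter.emit_return()                  # return inside the nested block
print(emitter.instructions)
# prints "[('ARC_DESTROY_WEAK', 'tmp'), ('ARC_RELEASE', 'param')]"
```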

VarDecl (local variable)

  • Initializer expression handling
    • Generate an assign instruction
  • Release at the end of the variable's lifetime
    • strong: call release
    • weak: call destroyWeak
    • For the timing, see the destroy-operation stack described under ObjCMethodDecl above

BinaryOperator: Assign

  • The ARC ownership type of the lvalue is attached to the instruction, and the runtime handles the assignment according to that type:
    • weak: storeWeak + loadWeak
    • strong: retain new + assign + release old
    • autoreleasing: retain + autorelease
    • unsafe_unretained: N/A
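
The weak case relies on zeroing references. The Python sketch below shows one way storeWeak and loadWeak can behave; it is a toy model under stated assumptions, not the Objective-C runtime or OCPack's virtual machine. The Obj and WeakSlot classes and the store_weak and load_weak functions are hypothetical, and each object tracks its own weak slots here instead of using a global side table.

```python
class Obj:
    """Toy object that remembers which weak slots point at it."""

    def __init__(self, name):
        self.name = name
        self.retain_count = 1
        self.deallocating = False
        self.weak_slots = []

    def retain(self):
        self.retain_count += 1
        return self

    def release(self):
        self.retain_count -= 1
        if self.retain_count == 0:
            self.deallocating = True
            for slot in self.weak_slots:
                slot.target = None      # zero every weak reference
            self.weak_slots.clear()


class WeakSlot:
    """Models a __weak variable; it never owns its target."""

    def __init__(self):
        self.target = None


def store_weak(slot, new):
    # Unregister the slot from the object it pointed to before.
    if slot.target is not None:
        slot.target.weak_slots.remove(slot)
    # An object that is being deallocated is stored as a null pointer.
    if new is not None and new.deallocating:
        new = None
    slot.target = new
    if new is not None:
        new.weak_slots.append(slot)
    return new


def load_weak(slot):
    # Return the target retained, so it stays alive while the caller uses it;
    # the caller releases it at the end of the full-expression.
    obj = slot.target
    return obj.retain() if obj is not None else None


obj = Obj("o")
weak = WeakSlot()
store_weak(weak, obj)        # does not change the retain count
value = load_weak(weak)      # temporarily +1
value.release()
obj.release()                # last strong reference goes away
print(load_weak(weak))       # prints "None"
```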

ObjCAutoreleasePoolStmt

  • Generated instructions: ARC_AUTORELEASEPOOL_PUSH and ARC_AUTORELEASEPOOL_POP

3.2 Design of the ARC-specific instructions in OCPack

The ARC instructions such as retain/release that are generated for the code scenarios in the previous section are designed as follows:

  • Instruction: arc_cmd
  • Operands:
    • ARC_RETAIN
    • ARC_RELEASE
    • ARC_AUTORELEASE
    • ARC_INIT_WEAK
    • ARC_DESTROY_WEAK
    • ARC_LOAD_WEAK
    • ARC_RETAIN_AUTORELEASE
    • ARC_RETAIN_BLOCK
    • ARC_AUTORELEASEPOOL_PUSH
    • ARC_AUTORELEASEPOOL_POP
    • ARC_COPY

Each operand does exactly what its name says, so the details are not repeated here.

Note: The runtime virtual machine performs the corresponding operation when it interprets each instruction (for convenience, this part of the implementation is written in MRC).
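
To make the interpretation step concrete, here is a rough Python sketch of how a virtual machine might dispatch on arc_cmd operands. It is a toy model and not OCPack's MRC implementation: the ToyVM and Obj classes are hypothetical, only a subset of the operands is handled, and the autorelease pools are modeled as a simple stack of lists.

```python
class Obj:
    """Toy object with a manual retain count."""

    def __init__(self):
        self.retain_count = 1
        self.deallocated = False


class ToyVM:
    """Interprets a subset of arc_cmd operands against toy objects."""

    def __init__(self):
        self.pools = [[]]   # stack of autorelease pools; the outermost always exists

    def _release(self, obj):
        obj.retain_count -= 1
        if obj.retain_count == 0:
            obj.deallocated = True

    def execute(self, operand, obj=None):
        if operand == "ARC_RETAIN":
            obj.retain_count += 1
        elif operand == "ARC_RELEASE":
            self._release(obj)
        elif operand == "ARC_AUTORELEASE":
            self.pools[-1].append(obj)          # defer the release to the pool
        elif operand == "ARC_RETAIN_AUTORELEASE":
            obj.retain_count += 1
            self.pools[-1].append(obj)
        elif operand == "ARC_AUTORELEASEPOOL_PUSH":
            self.pools.append([])
        elif operand == "ARC_AUTORELEASEPOOL_POP":
            for pending in self.pools.pop():    # drain the innermost pool
                self._release(pending)
        else:
            raise ValueError(f"unsupported operand in this sketch: {operand}")
        return obj


vm = ToyVM()
obj = Obj()
vm.execute("ARC_AUTORELEASEPOOL_PUSH")
vm.execute("ARC_RETAIN_AUTORELEASE", obj)   # count 2, one release pending
vm.execute("ARC_RELEASE", obj)              # count 1
vm.execute("ARC_AUTORELEASEPOOL_POP")       # pending release runs: count 0
print(obj.deallocated)  # prints "True"
```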