64. Difference between Extension and Category.

  1. An extension can add member variables, but a category cannot. Categories are attached by the runtime only after the class has been loaded into memory. At that point the memory layout of the class has already been determined (the compiler also optimizes the order of the member variables so that the class uses the least memory consistent with alignment rules), and adding member variables would destroy that layout. The offset of each member variable is fixed at compile time (a hard-coded memory offset relative to the object's starting address).
  2. An extension is resolved at compile time (so it is part of the class), whereas a category is resolved at run time. An extension is compiled together with the @interface in the header file and the @implementation in the implementation file to form one complete class; it is born with the class and dies with it. Methods declared in a category can compile and run without an implementation, whereas methods declared in an extension are checked at compile time and reported by the compiler if they have no implementation.
  3. An extension is usually used to hide the private information of a class. You cannot write an extension for a system class directly, but you can subclass the system class and add an extension to the subclass.
  4. A category can add methods to system-provided classes.
  5. Both an extension and a category can declare properties, but a property in a category does not generate the corresponding member variable or the getter and setter implementations.
  6. An extension cannot have its own implementation section (@implementation) the way a category can; methods declared in an extension must be implemented in the @implementation of the class itself.

65. Some functions and characteristics of categories.

Categories are a language feature added after Objective-C 2.0 that allows you to dynamically add methods to a class without changing or inheriting the original class. There are a few other application scenarios:

  1. You can split the implementation of a class across several files. This has several obvious benefits:
  • It reduces the size of individual files.
  • Different functionality can be grouped into different categories.
  • Multiple developers can work on the same class.
  • The categories you need can be loaded on demand.
  • Private methods can be declared.
  2. Several other scenarios derive from categories:
  • Emulating multiple inheritance (protocols are another option).
  • Exposing a framework’s private methods.

Some characteristics of categories:

  1. A category can only add methods to an existing class, not member variables.
  2. You can also declare properties in a category, but @property only generates the declarations of the setter and getter; it generates neither their implementations nor a member variable.
  3. If a method in a category has the same name as a method in the original class, the runtime calls the category's method first; in effect the category method "overrides" the original one. So avoid giving category methods the same names as methods of the original class. A common way to avoid collisions is to give category method names a uniform prefix, such as category_.
  4. If several categories define a method with the same name, which one the runtime calls is determined by compile order: the method from the category compiled last is called. You can test this by dragging the categories into a different order under Compile Sources.
  5. Call priority: category > the class itself > parent class. That is, category methods are found first, then the methods of the class, and finally those of the parent class. Note: categories are attached at run time, not at compile time.

Note:

  • A category method does not “completely replace” an existing method of the original class. If both the category and the original class define methodA, the method list of the class contains two methodA entries after the category is attached.
  • The category's methods are placed at the front of the new method list and the original class's methods behind them. This is what we usually call “overriding” the original method of the same name: the runtime searches the method list in order and stops as soon as it finds a method with a matching name, even though another method with the same name may sit further back.
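
The "first match wins" lookup described in these notes can be sketched in a few lines of C++. This is only an illustrative model, not the runtime's code: Method, MethodList, lookUpMethod and makeListsWithCategory are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a runtime method entry: a selector name plus a
// label saying where the implementation came from.
struct Method {
    std::string name;
    std::string owner;
};

using MethodList = std::vector<Method>;

// Search the lists front to back and stop at the first matching name.
const Method *lookUpMethod(const std::vector<MethodList> &lists,
                           const std::string &name) {
    for (const MethodList &list : lists) {
        for (const Method &m : list) {
            if (m.name == name) return &m;
        }
    }
    return nullptr;
}

// Build the lists of a class whose category also defines methodA.
// The category's list is placed in front of the class's own list.
std::vector<MethodList> makeListsWithCategory() {
    MethodList classMethods = {{"methodA", "class"}, {"methodB", "class"}};
    MethodList categoryMethods = {{"methodA", "category"}};
    return {categoryMethods, classMethods};
}
```

Looking up methodA in the result of makeListsWithCategory() returns the category's entry, while the class's own methodA is still present further back in the lists and is simply never reached.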

66. Can I add properties to a Category?

A category can’t add instance variables, so can it add @property? Let's start with the definition of category_t:

// classref_t is unremapped class_t*
typedef struct classref * classref_t;

struct category_t {
    const char *name; // The name of the class
    classref_t cls; // The class to which it belongs
    struct method_list_t *instanceMethods; // List of instance methods
    struct method_list_t *classMethods; // List of class methods
    struct protocol_list_t *protocols; // Protocol list
    struct property_list_t *instanceProperties; // Instance property list
    
    // Fields below this point are not always present on disk.
    struct property_list_t* _classProperties; // Class attribute list
    
    // Returns a list of class/metaclass methods
    method_list_t *methodsForMeta(bool isMeta) {
        if (isMeta) return classMethods;
        else return instanceMethods;
    }

    property_list_t *propertiesForMeta(bool isMeta, struct header_info *hi);
    
    // Protocol list; the metaclass has no protocol list
    protocol_list_t *protocolsForMeta(bool isMeta) {
        // nullptr is returned for the metaclass. load_categories_nolock appears to add
        // protocols to the metaclass too, but because protocolsForMeta returns nullptr
        // in that case, nothing is actually added.
        if (isMeta) return nullptr;
        else return protocols;
    }
};

/*
 * category_t::propertiesForMeta
 * Returns the category's instance or class properties.
 * hi is the image containing the category.
 */
property_list_t *
category_t::propertiesForMeta(bool isMeta, struct header_info *hi)
{
    if (!isMeta) return instanceProperties; // Return the instance properties
    else if (hi->info()->hasCategoryClassProperties()) return _classProperties; // Return the class properties
    else return nil; // Otherwise return nil
}

From the definition of category_t you can see that a category can add instance methods and class methods, adopt protocols, and add properties, but cannot add member variables. So why is it often said that a category cannot add properties? In fact a category does let you declare properties with @property, but being able to write @property does not mean you get a “full” property. By adding a property we usually mean that the compiler generates the backing member variable together with the setter and getter that access it. In a category, @property generates neither the _-prefixed member variable nor the getter and setter implementations, so although dot syntax compiles, calling the accessors crashes at run time with: unrecognized selector sent to instance…. The usual workaround is to implement the setter and getter by hand using associated objects.
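
To make the associated-object idea concrete, here is a conceptual C++ model: the extra value lives in a side table keyed by the object's address, so the object's memory layout never changes. This is a sketch of the idea only. It is not the runtime's AssociationsManager and not the Objective-C association API, and gAssociations, Person, setNickname, nickname and removeAssociations are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical side table: object address -> (key address -> value).
using AssociationMap = std::unordered_map<const void *, std::string>;
std::unordered_map<const void *, AssociationMap> gAssociations;

struct Person { int age = 0; };  // has no "nickname" member

const char kNicknameKey = 0;     // only its address is used, as a unique key

// Hand-written "setter": store the value outside the object.
void setNickname(Person *p, const std::string &value) {
    gAssociations[p][&kNicknameKey] = value;
}

// Hand-written "getter": look the value up by object address and key.
std::string nickname(const Person *p) {
    auto obj = gAssociations.find(p);
    if (obj == gAssociations.end()) return "";
    auto entry = obj->second.find(&kNicknameKey);
    return entry == obj->second.end() ? "" : entry->second;
}

// Must run when the object is destroyed, or stale entries are left behind.
void removeAssociations(const Person *p) { gAssociations.erase(p); }
```

The point of the design is that sizeof(Person) is unaffected by the added "property", which is exactly why this approach works for categories while a real member variable does not.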


67. attachLists: how category data is appended to the original class.

When does the _category_t data get attached to the class? Or is it stored somewhere in memory, waiting until we call one of its instance or class methods? In fact all category data is appended to the class itself. This differs from the weak mechanism and the associated-object mechanism, which keep their data in a separate hash table and look it up by object address.

Let’s look at how the data of a category is appended to the class.

Loading a category is part of the runtime's initialization and loading process, which is too extensive to cover fully, so this is only a cursory overview of the category loading that happens during the runtime's initial load. Objective-C depends on the runtime, which, like other system libraries, is loaded dynamically on macOS and iOS by dyld (the dynamic link editor).

map_images_nolock does four things:

  1. Takes the mach_header from dyld and wraps it as a header_info.
  2. Initializes the selectors.
  3. Calls arr_init(), which initializes the data structures used by AutoreleasePoolPage, SideTablesMap, and AssociationsManager:
void arr_init(void) {
    AutoreleasePoolPage::init(); // Initialize the autorelease pool
    SideTablesMap.init(); // Initialize SideTablesMap
    _objc_associations_init(); // Calls AssociationsManager::init()
}
  4. Reads the images.

The protocols in a category are added to both the class and the metaclass. objc::unattachedCategories.addForClass(lc, cls) can be understood as a hash-table operation with cls as the key and a category_list as the value; it is used when cls has not yet been realized. So when do the contents of these categories get attached to the class? When we looked at the UnattachedCategories data structure in the previous section, we saw that attachToClass does exactly that: it attaches the previously saved category data to cls. A global search shows that attachToClass is called only in methodizeClass, and that methodizeClass in turn is called only in realizeClassWithoutSwift.

void attachLists(List* const * addedLists, uint32_t addedCount) {
    if (addedCount == 0) return;

    if (hasArray()) {
        // many lists -> many lists
        
        // Record the previous length
        uint32_t oldCount = array()->count;
        uint32_t newCount = oldCount + addedCount;
        
        // realloc prototype: extern void *realloc(void *mem_address, unsigned int newsize);
        // Pointer name = (datatype *) realloc (pointer name to change memory size, new size)
        // Return value: a pointer to the allocated memory is returned if the reallocation succeeds, otherwise NULL is returned
        
        // Check whether the current pointer has enough contiguous space. If so, expand the address pointed by mem_address.
        // and return mem_address. If there is not enough space, allocate the space specified by newsize.
        // Copy the original data from beginning to end to the newly allocated memory area,
        // Then free the memory area where mem_address refers to.
        // Return the first address of the newly allocated memory area, that is, the address of the reallocated memory block.
        
        // The new size can be large or small (if the new size is larger than the old memory size, the newly allocated portion will not be initialized)
        // If the new size is smaller than the old memory size, data may be lost
        // Note: after realloc the old pointer may be a dangling pointer.
        // When the memory is no longer needed, release the block with free().
        
        // Expand the space
        setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
        // Update the array length
        array()->count = newCount;
        
        // Prototype: void *memmove(void *dest, const void *src, size_t count);
        // Copies count bytes from src to dest.
        // If the destination and source regions overlap, memmove guarantees that the
        // overlapping bytes of the source are copied to the destination before they are overwritten,
        // although the source contents may be changed by the copy.
        // When the regions do not overlap, it behaves the same as memcpy.
        
        // Move the list of methods backward to make room for addedLists addedCount long
        memmove(array()->lists + addedCount, array()->lists, 
                oldCount * sizeof(array()->lists[0]));
        
        // Prototype: void *memcpy(void *destin, void *source, unsigned n);
        // Copy n bytes from the start of the memory address indicated by source to the start of the memory address indicated by destination destin
        
        // Copy addedLists to array()->lists initial memory space
        memcpy(array()->lists, addedLists, 
               addedCount * sizeof(array()->lists[0]));
    }
    else if (!list && addedCount == 1) {
        // 0 lists -> 1 list
        // Currently empty, so simply assign the single list
        list = addedLists[0];
    } 
    else {
        // 1 list -> many lists
        List* oldList = list;
        uint32_t oldCount = oldList ? 1 : 0;
        uint32_t newCount = oldCount + addedCount;
        
        // Allocate the space
        setArray((array_t *)malloc(array_t::byteSize(newCount)));
        // Update the count
        array()->count = newCount;
        // Put oldList at the end of lists
        if (oldList) array()->lists[addedCount] = oldList;
        // Copy addedLists to array()->lists initial memory space
        memcpy(array()->lists, addedLists, 
               addedCount * sizeof(array()->lists[0]));
    }
}
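
The "many lists" branch of attachLists (grow, shift the old entries back with memmove, copy the new entries to the front with memcpy) can be reproduced on a plain std::vector. The sketch below is an assumption-laden model rather than the runtime's code: attachFront and attachDemo are hypothetical names, and integers stand in for the list pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: prepend `added` to `lists` using the same
// grow / memmove / memcpy sequence. T must be trivially copyable
// (the runtime stores raw list pointers).
template <typename T>
void attachFront(std::vector<T> &lists, const T *added, std::size_t addedCount) {
    if (addedCount == 0) return;
    std::size_t oldCount = lists.size();
    lists.resize(oldCount + addedCount);                       // expand the space
    std::memmove(lists.data() + addedCount, lists.data(),
                 oldCount * sizeof(T));                        // shift old entries back
    std::memcpy(lists.data(), added, addedCount * sizeof(T));  // new entries go first
}

// Two existing "lists" (1, 2) plus two category "lists" (10, 20).
std::vector<int> attachDemo() {
    std::vector<int> lists = {1, 2};
    const int categories[] = {10, 20};
    attachFront(lists, categories, 2);
    return lists;  // category entries now precede the original ones
}
```

memmove is needed for the shift because source and destination overlap inside the same buffer; the final copy comes from a separate buffer, so memcpy is sufficient there.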

68. Analysis of the +load function.

  • Classes and categories that implement +load are non-lazy classes and non-lazy categories; classes and categories that do not implement +load are lazy. A lazy class is realized only when we first use it.
  • +load is called directly through its function address, not through objc_msgSend, so there is no “override” behavior among the +load implementations of a class and its categories. They all execute, in this order: a class always runs before its categories, a parent class always runs before its subclasses, and among categories, whichever was compiled first runs first. (This is the opposite of same-named methods in different categories, where the method from the category compiled later “overrides” the others.)
  • Normally we should not call +load manually; we leave it to the system, which calls it only once.

void load_images(const char *path __unused, const struct mach_header *mh) handles the +load functions of the classes and categories in the given image mapped by dyld. extern bool hasLoadMethods(const headerType *mhdr) determines whether the mach_header contains non-lazy classes or categories (that is, classes or categories that implement +load). extern void prepare_load_methods(const headerType *mhdr) collects the +load functions to be called: it loops over the classes and adds their +load functions to the global array loadable_classes, recursing so that a parent class's +load is added first, and then adds the +load functions of the categories to their own global array (loadable_categories). From this we can already see that +load is called on classes before categories, and on a parent class before its subclasses.
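
The ordering rule (parent before subclass, all classes before any category, otherwise compile order) can be sketched as a small scheduling function. This is a simplified model that assumes every class and category implements +load; ImageInfo, scheduleClass and loadOrder are hypothetical names, not runtime functions.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical description of an image: classes in compile order, each with
// an optional superclass, plus categories in compile order.
struct ImageInfo {
    std::vector<std::string> classes;
    std::map<std::string, std::string> superclassOf;
    std::vector<std::string> categories;
};

// Schedule a class after its superclass, and at most once.
void scheduleClass(const ImageInfo &img, const std::string &cls,
                   std::set<std::string> &done, std::vector<std::string> &order) {
    if (done.count(cls)) return;
    auto super = img.superclassOf.find(cls);
    if (super != img.superclassOf.end()) {
        scheduleClass(img, super->second, done, order);  // parent first
    }
    done.insert(cls);
    order.push_back(cls);
}

// Classes first (parents before children), then categories in compile order.
std::vector<std::string> loadOrder(const ImageInfo &img) {
    std::vector<std::string> order;
    std::set<std::string> done;
    for (const auto &cls : img.classes) scheduleClass(img, cls, done, order);
    for (const auto &cat : img.categories) order.push_back(cat);
    return order;
}
```

Even when a subclass is compiled before its parent, the recursion places the parent first, and no category runs until every class has been scheduled.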

For the detailed execution process, such as how the +load functions are read and how they are executed, see: iOS Category basic implementation principle (3): additional +load function hyperdetailed parsing


69. List of data structures used to store weak reference variables.

The template class DisguisedPtr is a utility class defined in Project Headers/objc-private.h. Its main job is to convert a T pointer (the address of a variable of type T) into an unsigned long, mapping between pointers and integers so that the pointer is disguised and hidden from system tools such as leaks. A global search for DisguisedPtr in objc4-781 shows that the template parameter T is only ever objc_object or objc_object *. When T is objc_object *, it is used to hide the address of a __weak variable. DisguisedPtr<T> behaves like the pointer type T *, except that the stored value is disguised so that tools such as leaks cannot see it. nil is disguised as itself, so zero-filled memory works as expected and a nil pointer can go through the same operations as a non-nil pointer without crashing the program.
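
A minimal sketch of pointer disguising follows, assuming the negation-based scheme used by the objc4 source (the stored value is the negated address). DisguisedPointer and raw() are hypothetical names chosen to avoid implying this is the runtime's actual class.

```cpp
#include <cassert>
#include <cstdint>

// Simplified sketch: the address is stored negated, so the raw bytes no
// longer look like a pointer to the object. Zero negates to zero, which is
// why nil disguises as itself.
template <typename T>
class DisguisedPointer {
    std::uintptr_t value;

    static std::uintptr_t disguise(T *ptr) {
        return std::uintptr_t(0) - reinterpret_cast<std::uintptr_t>(ptr);
    }
    static T *undisguise(std::uintptr_t val) {
        return reinterpret_cast<T *>(std::uintptr_t(0) - val);
    }

public:
    DisguisedPointer() : value(0) {}
    DisguisedPointer(T *ptr) : value(disguise(ptr)) {}

    T *get() const { return undisguise(value); }   // recover the real pointer
    std::uintptr_t raw() const { return value; }   // what a memory scanner would see
};
```

Because negation is its own inverse in unsigned arithmetic, the same operation both disguises and recovers the pointer, and no lookup table is needed.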

From a data-structure perspective, the template class StripedMap can be viewed as a hash table whose key is void * and whose value is T. A global search for StripedMap in objc4-781 shows that T is only ever SideTable or spinlock_t.

The type of SideTables is StripedMap<SideTable>. Typical use: SideTable *table = &SideTables()[obj]; which fetches the SideTable that obj maps to.
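
The striping idea can be sketched as a fixed array of slots indexed by a hash of the key's address, with one lock per slot. This is a hedged model: StripedTable, Stripe and bumpCount are hypothetical names, std::mutex stands in for the runtime's lock, and the shift-and-xor hash is written from my recollection of objc4's StripedMap, so treat it as an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical striped table: a fixed number of slots chosen by hashing the
// key's address. Many objects share one slot, and therefore one lock.
template <typename T, unsigned StripeCount = 64>
class StripedTable {
    T slots[StripeCount];

public:
    static unsigned indexForPointer(const void *p) {
        std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
        // Assumed hash: mix two shifted copies of the address, then reduce.
        return static_cast<unsigned>(((addr >> 4) ^ (addr >> 9)) % StripeCount);
    }
    T &operator[](const void *p) { return slots[indexForPointer(p)]; }
};

// Stand-in for SideTable: a lock plus some per-stripe data.
struct Stripe {
    std::mutex lock;
    int counter = 0;
};

// Lock only the stripe that `obj` hashes to; other stripes stay available.
int bumpCount(StripedTable<Stripe> &table, const void *obj) {
    Stripe &s = table[obj];
    std::lock_guard<std::mutex> guard(s.lock);
    return ++s.counter;
}
```

Two threads touching objects that hash to different stripes never contend, which is the benefit of striping over a single global lock.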

typedef DisguisedPtr<objc_object *> weak_referrer_t; Here T is objc_object *, so T * inside DisguisedPtr is objc_object **, a pointer to a pointer. It is used to disguise the address of a __weak variable, that is, the address of an objc_object *.

The out_of_line_ness field overlaps the low two bits of inline_referrers[1]. inline_referrers[1] holds the DisguisedPtr of a pointer-aligned address. The low two bits of a pointer-aligned DisguisedPtr are always either 0b00 (a disguised nil or 0x80..00; the last bits of an 8-byte-aligned address are always 0 in binary) or 0b11 (any other address). Therefore out_of_line_ness == 0b10 can be used to mark the out-of-line state, in which struct weak_entry_t stores its weak_referrer_t values in a hash table instead of the weak_referrer_t array of length 4.

weak_entry_t holds the addresses of all weak-reference variables that point to one object. Each item stored in its hash array is a weak_referrer_t (typedef DisguisedPtr<objc_object *> weak_referrer_t), which is essentially the address of a weak-reference variable, namely objc_object **new_referrer. Through this pointer to a pointer, the weak-reference variable can be set to nil after the object is destroyed; this is why the address of the weak variable itself has to be saved.

In weak_entry_t, while the object has no more than 4 weak references, they are stored in weak_referrer_t inline_referrers[WEAK_INLINE_COUNT], an array of fixed length 4. Once there are more than 4, the hash array weak_referrer_t *referrers is used instead.

The initial length of the hash array of weak_table_t is 64. When occupancy exceeds 3/4, the array is expanded to twice its total capacity and the existing entries are rehashed into the new space. When entries are removed, the array may be shrunk to improve lookup efficiency: if the total capacity exceeds 1024 and the used part is less than 1/16 of the capacity, the array is reduced to 1/8 of its capacity, and the existing entries are rehashed into the new space. (Both growing and shrinking allocate the new space with calloc and carry the old data over, whereas cache_t simply discards the old data; this contrast is worth remembering.) Keep in mind that these rules apply only to the hash array of weak_table_t.
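
The growth and shrink thresholds can be written down as two small pure functions. This is a sketch with hypothetical names (capacityAfterInsertCheck, capacityAfterRemoveCheck); the exact boundary comparisons (>=) follow my reading of objc4 and should be treated as an assumption, since the text above only gives the ratios.

```cpp
#include <cassert>
#include <cstddef>

// Capacity the table should have when an entry is about to be inserted.
// A capacity of 0 means "not allocated yet".
std::size_t capacityAfterInsertCheck(std::size_t capacity, std::size_t numEntries) {
    // Grow once 3/4 of the slots are used: start at 64, then double.
    if (numEntries >= capacity * 3 / 4) return capacity ? capacity * 2 : 64;
    return capacity;
}

// Capacity the table should have after an entry has been removed.
std::size_t capacityAfterRemoveCheck(std::size_t capacity, std::size_t numEntries) {
    // Shrink only large, mostly empty tables: down to 1/8 of the capacity.
    if (capacity >= 1024 && capacity / 16 >= numEntries) return capacity / 8;
    return capacity;
}
```

For example, a table of capacity 64 grows to 128 when it holds 48 entries, and a table of capacity 1024 holding 64 entries shrinks to 128, which leaves it half full rather than immediately triggering another resize.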

weak_entry_t starts out with the fixed-length array of 4. When a new weak reference arrives, it first checks whether the entry currently uses the fixed-length array or the hash array. If the fixed-length array is in use, it looks for a free slot. If there is none, it allocates a hash array of length 4 and copies the contents of the fixed-length array into it in a loop; this looks like storage by index, but the entries are rehashed afterwards. It then checks whether the hash array needs to grow: if usage has reached 3/4 of the total capacity, the capacity is doubled, so the hash array of weak_entry_t has length 8 after its first expansion. The hash array of weak_entry_t has no shrinking mechanism. Removing a weak reference simply sets its slot to nil. After a removal, if the fixed-length array or the hash array is empty, the weak_entry_t is removed from the hash array of weak_table_t, and weak_table_t may then shrink.

weak_entry_t never shrinks and starts with a fixed-length array as an optimization, because an object usually does not have many weak references.

struct weak_entry_t {
    // Referent holds the address of the objc_Object instance represented as an integer. The following weak reference variables refer to this object instance
    DisguisedPtr<objc_object> referent;
    
    // When the number of weak references to the referent is less than or equal to 4, the inline_referrers array is used to store the addresses of these weak references.
    // If the value is greater than 4, use the referrers hash array.
    
    // The union occupies 32 bytes of memory
    union {
        struct {
            weak_referrer_t *referrers; // Save the hash array of weak_referrer_t
            
            // out_of_line_ness and num_refs form a bit field with a total of 64 bits
            uintptr_t        out_of_line_ness : 2; // Flags whether the hash array or inline_referrers stores the weak_referrer_t values
            uintptr_t        num_refs : PTR_MINUS_2; // The number of weak_referrer_t values currently saved in referrers
            uintptr_t        mask; // The total length of the referrers hash array minus 1
            
            // The maximum number of hash collisions that can occur to determine if a logical error has occurred. (The number of hash collisions should never exceed this value.)
            // This value is updated when creating a new Weak_entry_t and inserting a new Weak_referrer_t, which always records the maximum offset value
            uintptr_t        max_hash_displacement;
        };
        struct {
            // The low two bits of inline_referrers[1] share memory with out_of_line_ness
            // Array of weak_referrer_t (DisguisedPtr<objc_object *>) of length 4
            weak_referrer_t  inline_referrers[WEAK_INLINE_COUNT];
        };
    };
    
    // Returns true if the referrers hash array is in use, false if the inline_referrers array stores the weak_referrer_t values
    bool out_of_line() {
        return (out_of_line_ness == REFERRERS_OUT_OF_LINE);
    }
    
    // Weak_entry_t assignment operation, directly use memcpy function copy other memory contents into this,
    // Instead of using the copy constructor, it should also be considered for efficiency.
    weak_entry_t& operator= (const weak_entry_t& other) {
        memcpy(this, &other, sizeof(other));
        return *this;
    }

    // Constructor of weak_entry_t
    
    // newReferent is a pointer to the original object,
    // newReferrer is a pointer to the weakly referenced variable of newReferent.
    
    // referent(newReferent) invokes the constructor DisguisedPtr(T *ptr) : value(disguise(ptr)) {},
    // which calls disguise() to convert newReferent into an integer stored in value.
    weak_entry_t(objc_object *newReferent, objc_object **newReferrer)
        : referent(newReferent)
    {
        // Putting the newReferrer in bit 0 of the array also calls the DisguisedPtr constructor, which transforms the newReferrer into an integer and saves it
        inline_referrers[0] = newReferrer;
        // Set the remaining 3 slots of the inline_referrers array to nil
        for (int i = 1; i < WEAK_INLINE_COUNT; i++) {
            inline_referrers[i] = nil;
        }
    }
};

weak_entry_t switches between a fixed-length array and a hash array because an instance usually has only a few weak-reference variables. In that case the fixed-length array needs no dynamic allocation (the two structs in the union share the same 32 bytes of memory); it uses the contiguous memory allocated once when weak_entry_t is initialized, which improves efficiency.

weak_table_t is a global hash table that holds weak references, with the object's id as the key and weak_entry_t as the value.

struct SideTable is defined in NSObject.mm. It manages two things that matter a great deal to us: RefcountMap refcnts, which stores the reference counts of objects, and weak_table_t weak_table, which stores the weak-reference variables of objects.

The members of struct SideTable:

  1. spinlock_t slock: a spin lock that keeps operations on SideTable thread-safe. Looking back at the two large sections on weak_table_t and weak_entry_t, none of their functions mention locking or unlocking; if you look closely, their names end in no_lock to remind us that they are not thread-safe. They leave thread safety to SideTable, and the functions SideTable provides are all thread-safe, which is achieved with slock.
  2. RefcountMap refcnts: a hash table with DisguisedPtr<objc_object> as key and size_t as value, used to store the reference count of an object (used only when isa optimization is not in use, or when the reference count stored in isa_t overflows under isa optimization; this involves the two isa_t fields uintptr_t has_sidetable_rc and uintptr_t extra_rc). As a hash table it uses quadratic probing, whereas weak_table_t uses linear probing (open addressing).
  3. weak_table_t weak_table: the hash table that stores the weak references of objects; it is the core data structure behind the weak feature.

spinlock_t was originally an unfair spin lock of type uint32_t; because of safety problems it has since been replaced by the mutex os_unfair_lock. Unfair means that the order in which threads acquire the lock is unrelated to the order in which they requested it. In other words, the first thread to request the lock may be the last to acquire it, or a thread that has just held the lock may immediately acquire it again, leaving other threads to busy-wait.

The member variable _os_unfair_lock_opaque of os_unfair_lock records information about the thread that holds the lock. Only the thread that acquired the lock can unlock it.

SideTables is a static global hash table of type StripedMap<SideTable>. As we saw with StripedMap above, it is a hash array of fixed length 8 on iPhone and of fixed length 64 on Mac. It has a simple hash function that computes an index from the void * argument and then fetches the T at that index in the hash array. For SideTables, T is SideTable.

Around SideTables(), a number of lock-related global functions are defined. Internally they call the interfaces of StripedMap that depend on the template type T. Since T for SideTables is SideTable, SideTable in turn forwards to the corresponding function of its spinlock_t slock member. Striped locking is used here to reduce blocking when multiple objects are processed in parallel.


70. List of functions used by weak reference variables.

static weak_entry_t *weak_entry_for_referent(weak_table_t *weak_table, objc_object *referent); Given a referent (our object) and a weak_table_t hash table, finds the corresponding weak_entry_t (which holds the addresses of all weak-reference variables pointing to the referent) and returns it, or returns NULL if not found.

id weak_register_no_lock(weak_table_t *weak_table, id referent, id *referrer, bool crashIfDeallocating); Adds an (object, weak pointer) pair to the weak table. (That is, when the first weak variable comes to point at an object, the object is registered in the weak_table_t hash table and the address of that weak variable is saved in the object's weak_entry_t. If the weak variable is not the first, the object already exists in the weak_table_t hash table, and we only need to save the address of the new weak variable in the object's weak_entry_t.)

void weak_unregister_no_lock(weak_table_t *weak_table, id referent, id *referrer); Removes an (object, weak pointer) pair from the weak table. (It removes the address of a weak variable from the object's weak_entry_t.)

void weak_clear_no_lock(weak_table_t *weak_table, id referent); Called when the object is destroyed. It sets all remaining __weak variables to nil, which is the familiar mechanism by which a __weak variable becomes nil after the object it points to is destroyed.

Adjusting the length of the weak_table_t hash array: weak_grow_maybe and weak_compact_maybe both take a weak_table_t pointer and adjust the length of the hash array when it becomes too full or too empty, which keeps memory use efficient and hash lookups fast. Both functions do the actual work by calling weak_resize.

static void weak_grow_maybe(weak_table_t *weak_table); Expands weak_entry_t *weak_entries of weak_table_t when num_entries exceeds 3/4 of mask + 1. The initial length of weak_entries is 64, and each expansion doubles mask + 1. After expansion, the weak_entry_t values in the original hash array are rehashed into the new space and the member variables of weak_table_t are updated. The memory occupied is (mask + 1) * sizeof(weak_entry_t) bytes, and mask + 1 is always a power of two, 2^N (initially N = 6, 2^6 = 64; thereafter N >= 6).

static void weak_compact_maybe(weak_table_t *weak_table); Called from weak_entry_remove after a weak_entry_t has been removed from the hash array of weak_table_t. If the occupancy of the hash array is low, it shortens weak_entry_t *weak_entries to save memory and speed up hash lookups. The condition is that the current total length exceeds 1024 and the occupancy is below 1/16; weak_entries is then reduced to 1/8 of its current size.

static void weak_resize(weak_table_t *weak_table, size_t new_size); Both growing and shrinking go through this common function, which takes a weak_table_t pointer and the new length. It first allocates memory for new_size entries, with the new_entries pointer pointing at the new memory, and sets the mask of weak_table to new_size - 1. The mask records the boundary of the table's total capacity and is also used in the hash function to keep indexes into the hash array within bounds. The old entries are then reinserted with weak_entry_insert. The hash array of weak_table_t can have collisions, which weak_table_t handles with open addressing: on a collision, the next adjacent slot is tried (wrapping around to the beginning after the end) until an empty one is found. max_hash_displacement records the largest probe offset that has occurred in the current weak_table. It is used elsewhere, for example in weak_entry_for_referent: when looking up the entry for a given referent, if hash_displacement exceeds weak_table->max_hash_displacement during the probe, the weak_entry_t being sought does not exist.
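
Open addressing with a mask and a recorded maximum displacement can be sketched as follows. This is an illustrative model under stated assumptions, not the runtime's table: ProbeTable is a hypothetical type, keys are non-zero integers with 0 marking an empty slot, and an identity hash replaces the runtime's pointer hash so that collisions are easy to see.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical open-addressing table. Capacity must be a power of two so
// that `& mask` keeps every index inside the array.
struct ProbeTable {
    std::vector<std::uintptr_t> slots;
    std::size_t mask;                 // capacity - 1
    std::size_t maxDisplacement = 0;  // longest probe any insert has needed

    explicit ProbeTable(std::size_t capacity)
        : slots(capacity, 0), mask(capacity - 1) {}

    // Placeholder identity hash (an assumption made for readability).
    static std::size_t hash(std::uintptr_t key) {
        return static_cast<std::size_t>(key);
    }

    // Assumes the table is never completely full.
    void insert(std::uintptr_t key) {
        std::size_t index = hash(key) & mask;
        std::size_t displacement = 0;
        while (slots[index] != 0) {      // collision: try the next slot
            index = (index + 1) & mask;  // wraps to 0 after the last slot
            ++displacement;
        }
        slots[index] = key;
        if (displacement > maxDisplacement) maxDisplacement = displacement;
    }

    bool contains(std::uintptr_t key) const {
        std::size_t index = hash(key) & mask;
        std::size_t displacement = 0;
        while (slots[index] != key) {
            index = (index + 1) & mask;
            // No key was ever stored further than maxDisplacement from home.
            if (++displacement > maxDisplacement) return false;
        }
        return true;
    }
};
```

The recorded maximum lets a failed lookup stop after a bounded number of probes instead of scanning until it wraps around the whole table.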

That covers the length adjustment mechanism of the weak_table_t->weak_entries hash array.

static void append_referrer(weak_entry_t *entry, objc_object **new_referrer); Adds the given referrer to the hash array of weak_entry_t (or to its internal fixed-length array of 4).

static void remove_referrer(weak_entry_t *entry, objc_object **old_referrer); Removes the address of a weak reference from the hash array of weak_entry_t (or from its internal fixed-length array of 4).

static void weak_entry_insert(weak_table_t *weak_table, weak_entry_t *new_entry); Adds a new weak_entry_t to the hash array of the given weak_table_t.

static void weak_entry_remove(weak_table_t *weak_table, weak_entry_t *entry); Removes the specified weak_entry_t from the hash array of weak_table_t.

static weak_entry_t *weak_entry_for_referent(weak_table_t *weak_table, objc_object *referent); Looks up the weak_entry_t of referent in the hash array of weak_table_t and returns NULL if it is not found.

void weak_clear_no_lock(weak_table_t *weak_table, id referent_id); This function is called when the object's dealloc executes. Its main job is to set all weak reference pointers to nil when the object is deallocated. It first finds the corresponding weak_entry_t in weak_table from the referent_id argument, then walks the weak_entry_t's hash array or its inline_referrers fixed-length array, using the stored addresses of the weak variables to set each weak variable to nil, and finally removes the weak_entry_t from weak_table.


The process of initializing, assigning, and setting a weak variable to nil.

The functions related to weak variables are objc_initWeak, objc_storeWeak, and objc_destroyWeak. They initialize a weak variable, assign to a weak variable (change what it points to), and destroy a weak variable, respectively.

#import <Foundation/Foundation.h>
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        id obj = [NSObject new];
        id obj2 = [NSObject new];
        printf("Start tag\n");
        {
            __weak id weakPtr = obj; // objc_initWeak is called to initialize the weak variable
            weakPtr = obj2; // objc_storeWeak is called to change what the weak variable points to
        }
        // At the closing brace, objc_destroyWeak is called to destroy the weak variable.
        // Only the weak variable is destroyed; the object it points to is not.
        
        printf("End tag\n"); // ⬅️ set a breakpoint here
    }
    return 0;
}

id objc_initWeak(id *location, id newObj); The function takes two arguments:

  1. id *location: the address of the weak variable, i.e. &weakPtr in the example code. It is a pointer to a pointer. The address of the pointer is stored because the weakPtr variable must be set to nil after the object it points to is released; storing only the pointer value (the address that weakPtr points to) would not make that possible.

This is reminiscent of linked-list operations, where the function parameter is a pointer to the list's head pointer.

If you are not familiar with pointers, you may wonder why a pointer to a pointer is needed. Doesn't changing a pointer parameter inside a function also change the caller's pointer? It does not. When a function parameter is a pointer, the caller passes an address, and the function gets a temporary pointer variable that holds that address. Changing where the temporary points affects only the temporary; the caller's pointer variable is unrelated to it, except that both initially hold the same address. Whatever we do to the contents at that address is visible through every pointer to it. If const does not forbid it, we can do anything to the memory at that address, even zero it out, but those operations act on the pointed-to memory: the memory stays at its address, and the original pointer still points to it. What we want is to set the pointer itself to nil, so we need the address of the pointer itself. Storing 0x0 at that address is what sets our pointer to nil.

  2. id newObj: the object assigned to weakPtr, which is obj in the example code.
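The difference between passing a pointer and passing the address of a pointer can be shown with a small plain C sketch. The function names clear_copy and clear_through_address are hypothetical; the second one plays the role that the id *location parameter plays for a weak variable.

```c
#include <assert.h>
#include <stddef.h>

/* Receives a copy of the caller's pointer: only the local copy is changed. */
void clear_copy(int *ptr)
{
    ptr = NULL;
    (void)ptr;  /* silence "set but unused" warnings */
}

/* Receives the address of the caller's pointer: the caller's variable is changed. */
void clear_through_address(int **location)
{
    assert(location != NULL);
    *location = NULL;
}
```

Calling clear_copy(p) leaves the caller's p untouched, while clear_through_address(&p) sets it to NULL without touching the memory p used to point to.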

This function returns the return value of storeWeak: obj, with the weakly_referenced bit of its isa (isa_t) set to 1 to indicate that the object has weak references. When the object is destroyed, the weak references to it must be processed; this is where the mechanism that sets weak variables to nil is implemented.

Looking at the implementation of objc_initWeak, it calls storeWeak internally with the template parameter DontHaveOld (no old value), meaning that weakPtr did not point to any object before: weakPtr has only just been initialized, so of course there is no old value. The old value matters when a weak variable changes its target, because the address of the weak variable must then be removed from the weak_entry_t hash table of the object it previously pointed to. DoHaveNew indicates that there is a new value.

The storeWeak function implements the following core functions:

  • It stores the address of the weak variable (location) in the hash array (or the fixed-length array of 4) of the weak_entry_t corresponding to obj. When obj is destroyed, this array is used to find the addresses of all weak variables and set what each one points to (*location) to nil.
  • If isa optimization is enabled, the weakly_referenced bit of obj's isa_t is set to 1 to mark that obj has weak references. When the object is deallocated, the runtime uses the weakly_referenced flag to decide whether it needs to look up the weak_entry_t of obj and set all of its weak references to nil.

A complete plain-language reading of __weak id weakPtr = obj is: take the address of weakPtr and obj, call objc_initWeak, add the address of weakPtr to the weak_entry_t hash array in the weak_table of the SideTable where obj resides, assign the address of obj to *location (*location = (id)newObj), set the weakly_referenced field of obj's isa to 1, and finally return obj.

An object is marked as having weak references in one of two ways:

  1. When the object's isa is an optimized isa, the weakly_referenced bit of newObj's isa is updated.
  2. When the object's isa is a raw class pointer, its reference count and its weak-reference flag are stored together in the reference count value kept in the refcount table, where different bits carry different information. The object's reference count value (of type size_t) is looked up in that table, and its first (lowest) bit indicates whether the object has weak references (SIDE_TABLE_WEAKLY_REFERENCED).

storeWeak updates a weak variable. If HaveOld is true, the weak variable has an existing value that needs to be cleaned up; this value may be nil. If HaveNew is true, a new value needs to be assigned to the weak variable; this value may be nil. If CrashIfDeallocating is true, the program crashes when newObj's isa has been marked deallocating or when the class of newObj does not support weak references. If CrashIfDeallocating is false, nil is stored in the weak variable in those cases instead.

By now it is clear that objc_initWeak initializes the weak variable and internally only needs to call weak_register_no_lock. When a new value is later assigned to the weak variable, the old target is handled first (weak_unregister_no_lock) and then the new target (weak_register_no_lock).

objc_storeWeak: in the example code, assigning a new value to the weak variable calls objc_storeWeak, which calls storeWeak directly with both DoHaveOld and DoHaveNew, so the variable is first unregistered from its old target with weak_unregister_no_lock.

objc_destroyWeak: in the example code, the local weak variable goes out of scope at the closing brace and must be destroyed. In the assembly code we see a call to objc_destroyWeak, which, judging by its name, is the function called when a weak variable is destroyed. If the weak variable is destroyed before the object it points to, what happens to the weak variable's address stored in the weak_entry_t hash array of that object? The objc_destroyWeak function gives the answer. It destroys the relationship between a weak pointer and the weak reference table of the object it points to. (The hash array of an object's weak_entry_t holds the addresses of all weak references to the object; here the address of the given weak variable is removed from that array.) If the weak pointer does not point to anything, the weak_entry_t hash array does not need to be edited. This function is not thread-safe with respect to concurrent modifications of the weak variable. (Concurrent weak clear is safe.) objc_destroyWeak calls storeWeak directly with DontHaveNew (there is no new value) and DontCrashIfDeallocating (no crash is needed); newObj is nil, and location is the address of the weak variable being destroyed. Inside storeWeak, the work is done by weak_unregister_no_lock(&oldTable->weak_table, oldObj, location);.

When an object is released and destroyed, all weak references to it are set to nil. We have heard this countless times, but where is the entry point? It is dealloc: the weak references are set to nil along the call chain dealloc -> _objc_rootDealloc -> rootDealloc -> object_dispose -> objc_destructInstance -> clearDeallocating -> clearDeallocating_slow.

When the first weak reference to an object is created, a weak_entry_t is created from the object's pointer and the address of the weak reference and placed in the weak_table_t of the SideTable where the object resides. From then on, the addresses of all weak references to the object are saved in the hash array of its weak_entry_t. When the object is destructed, the addresses saved in the weak_entry_t are traversed and each weak reference is set to nil. Finally, the weak_entry_t is removed from weak_table.
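The lifecycle summarized above can be modeled with a toy entry in plain C. Everything here is a simplified assumption for illustration: toy_weak_entry, toy_register, toy_unregister and toy_clear are hypothetical names, only the fixed-length array of 4 is modeled, and there is no SideTable, hashing, or locking.

```c
#include <assert.h>
#include <stddef.h>

#define INLINE_REFERRER_COUNT 4   /* mirrors the fixed-length array of 4 */

typedef struct {
    void  *referent;                            /* the weakly referenced object */
    void **referrers[INLINE_REFERRER_COUNT];    /* addresses of the weak variables */
} toy_weak_entry;

/* Record the address of a weak variable and point the variable at the referent.
 * Returns 0 on success, -1 when the inline array is full. */
int toy_register(toy_weak_entry *entry, void **location)
{
    assert(entry != NULL && location != NULL);
    for (size_t i = 0; i < INLINE_REFERRER_COUNT; i++) {
        if (entry->referrers[i] == NULL) {
            entry->referrers[i] = location;
            *location = entry->referent;
            return 0;
        }
    }
    return -1;  /* a fuller model would switch to a hash array here */
}

/* Forget one weak variable, as happens when it is destroyed before the object. */
void toy_unregister(toy_weak_entry *entry, void **location)
{
    for (size_t i = 0; i < INLINE_REFERRER_COUNT; i++) {
        if (entry->referrers[i] == location) {
            entry->referrers[i] = NULL;
            return;
        }
    }
}

/* The object is going away: set every registered weak variable to NULL. */
void toy_clear(toy_weak_entry *entry)
{
    for (size_t i = 0; i < INLINE_REFERRER_COUNT; i++) {
        void **location = entry->referrers[i];
        if (location != NULL && *location == entry->referent) {
            *location = NULL;
        }
        entry->referrers[i] = NULL;
    }
    entry->referent = NULL;
}
```

Because the entry stores the addresses of the weak variables rather than their values, toy_clear can reach each variable and overwrite it, which is the reason for the pointer-to-pointer parameter.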


72. ARC and MRC read the weak variable.

Under ARC, reading a weak variable goes through objc_loadWeakRetained followed by objc_release; under MRC, it goes through objc_loadWeak. This differs from reading an object's contents directly from its memory through a pointer. Analyzing the function implementations shows that once an object is marked deallocating, accessing it through a weak reference yields nil, even though the weak reference still points to the object's memory and that memory has not been completely freed.

  1. In ARC mode, objc_loadWeakRetained is called when the weak pointer is read, and objc_release is called once before leaving the current scope. Inside objc_loadWeakRetained, objc_object::rootRetain is called on the object the weak pointer points to, increasing its reference count by 1. objc_object::release is called just before going out of scope, decreasing the reference count by 1. This +1 and -1 pair ensures that reading through the weak variable does not leave a strong reference to the object it points to.

  2. In MRC mode, objc_loadWeak is called when the weak pointer is read. Internally it calls objc_autorelease(objc_loadWeakRetained(location)); that is, objc_autorelease offsets the +1 added to the reference count while reading the weak variable, so the object can still be released normally.

objc_loadWeakRetained can therefore be summarized as follows: under ARC, the compiler inserts an objc_release call after it; under MRC, the returned object is placed in the autorelease pool. Both ensure that the object that was read is released properly. (When verifying this, you can see that for every read of a weak variable under ARC, objc_loadWeakRetained and objc_release are paired one-to-one.)
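The read path described above can be modeled with a toy object in plain C. This is a sketch under stated assumptions, not the runtime implementation: toy_object, toy_load_weak_retained and toy_release are hypothetical, the reference count is a plain int, and there is no locking or side table.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int retain_count;
    int deallocating;   /* non-zero once deallocation has started */
} toy_object;

/* Read a weak variable: returns NULL for a deallocating object,
 * otherwise returns the object with its count raised by 1. */
toy_object *toy_load_weak_retained(toy_object **location)
{
    assert(location != NULL);
    toy_object *obj = *location;
    if (obj == NULL || obj->deallocating) {
        return NULL;    /* the weak variable itself is left unchanged */
    }
    obj->retain_count++;
    return obj;
}

/* Balances the +1 taken by toy_load_weak_retained. */
void toy_release(toy_object *obj)
{
    if (obj != NULL) {
        obj->retain_count--;
    }
}
```

In this model, every successful toy_load_weak_retained must be followed by one toy_release, mirroring the one-to-one pairing observed under ARC.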

The objc_copyWeak function is called when assigning a weak variable to another weak variable.

void
objc_copyWeak(id *dst, id *src)
{
    // First load the object from the src weak variable, incrementing its reference count by 1
    id obj = objc_loadWeakRetained(src);
    
    // Initialize the dst weak variable
    objc_initWeak(dst, obj);
    
    // Decrement obj's reference count by 1 to balance the +1 from the load above, so the object can be released normally
    objc_release(obj);
}

73. Block definition.

Block is an extension of the C language. The extension can be summed up in one sentence: anonymous functions that carry automatic (local) variables, where carrying automatic variables shows up in a block as capturing the values of external variables. (For programmers, naming is the essence of the job.)

The block definition syntax is: ^ return-value-type parameter-list expression. The "return value type" is the same as the return type of a C function, the "parameter list" is the same as the parameter list of a C function, and the "expression" is whatever is allowed in the body of a C function.

Block variables can be used exactly like ordinary C variables, so pointers to block variables can be used as well.

typedef int (^blk_t)(int);
blk_t blk = ^(int count) { return count + 1; };

// Take a pointer to the block variable
blk_t* blkPtr = &blk;

// Execute the block through the pointer
(*blkPtr)(10);

No matter where a block is defined or when it is executed, the values it uses at execution are those captured at definition: either the value of a basic-type variable or a memory address. If an address was captured, the block reads whatever that memory holds at execution time, regardless of whether it was modified between definition and execution. If a basic-type variable was captured, the value used is the snapshot taken when the block struct was initialized. (Definition can be understood as creating the block struct instance; capture can be understood as initializing that instance's member variables from the external variables.)


74. The nature of blocks.

Blocks are anonymous functions with automatic variables, but what exactly is a block? The syntax looks special, but it is actually handled as fairly ordinary C source code. A block-enabled compiler converts source code containing block syntax into source code that an ordinary C compiler can process, and then compiles it as plain C. Running clang -rewrite-objc followed by the source file name converts the following source code into the pieces listed after it:

int main() {
    void (^blk)(void) = ^ {printf("Block\n"); };
    blk();
    return 0;
}
  • __block_impl
struct __block_impl {
  void *isa;
  int Flags;
  int Reserved;
  void *FuncPtr;
};
  • __main_block_impl_0
 struct __main_block_impl_0 {
   struct __block_impl impl;
   struct __main_block_desc_0* Desc;
   
   // Struct constructor
   __main_block_impl_0(void *fp, struct __main_block_desc_0 *desc, int flags=0) {
     impl.isa = &_NSConcreteStackBlock;
     impl.Flags = flags;
     impl.FuncPtr = fp;
     Desc = desc;
   }
 };
  • __main_block_func_0
 static void __main_block_func_0(struct __main_block_impl_0 *__cself) {
     printf("Block\n");
 }
  • __main_block_desc_0
 static struct __main_block_desc_0 {
   size_t reserved;
   size_t Block_size;
 } __main_block_desc_0_DATA = { 0, sizeof(struct __main_block_impl_0)};

As the transformed source code shows, the anonymous function used by the block is actually handled as a simple C function: __main_block_func_0. The functions produced by clang are named after the function containing the block syntax (here, main) and the order in which the block syntax appears in that function (here, 0). The function's __cself argument is a pointer to the instance of the block struct, equivalent to the this pointer in a C++ instance method or the self variable in an Objective-C instance method.


75. The essence of how a block captures external variable values.

The previous section looked at the simplest form of a block, one that captures no external variables. Now let's look at the conversion result when a block does capture external variables. clang -rewrite-objc converts the following block definition:

int dmy = 256; // Used for comparison: variables the block does not use are not captured
int val = 10;
int* valPtr = &val;
const char* fmt = "val = %d\n";

void (^blk)(void) = ^{ // The block captures val (int), valPtr (int *) and fmt (const char *)
    printf(fmt, val);
    printf("valPtr = %d\n", *valPtr);
};

In the converted code, the __block_impl struct is unchanged, while __main_block_impl_0 gains member variables. The external variables used in the block syntax expression (they look like the same variables, but are merely variables with the same names) are appended to the __main_block_impl_0 struct as member variables, with exactly the same types as the external variables (local static variables are converted to the corresponding pointer type). The constructor of __main_block_impl_0 fills in the impl members; impl.isa = &_NSConcreteStackBlock identifies the storage domain of the block and the type of the current block. The constructor also changes: fmt, val, and valPtr are assigned in the initializer list, so the values of the captured external variables are stored in the struct's member variables. Using an external variable in a block expression therefore actually uses the same-named member variable of the block struct instance, which is why we cannot assign to it directly.

struct __main_block_impl_0 {
  struct __block_impl impl;
  struct __main_block_desc_0* Desc;
  
  // Member variables added to __main_block_impl_0.
  // Their types are exactly those of the external automatic variables (local static variables are the exception).
  const char *fmt;
  int val;
  int *valPtr;
  
  // Initializer list: fmt(_fmt), val(_val), valPtr(_valPtr)
  // The constructor initializes the instance with the values of the captured external variables; the parameter types match the external variables
  __main_block_impl_0(void *fp,
                      struct __main_block_desc_0 *desc,
                      const char *_fmt,
                      int _val,
                      int *_valPtr,
                      int flags=0) : fmt(_fmt), val(_val), valPtr(_valPtr) {
    impl.isa = &_NSConcreteStackBlock;
    impl.Flags = flags;
    impl.FuncPtr = fp;
    Desc = desc;
  }
};

The __main_block_func_0 function also uses the __cself argument:

static void __main_block_func_0(struct __main_block_impl_0 *__cself) {

    // The captured external variable values are read through the __main_block_impl_0 instance passed to the function
    const char *fmt = __cself->fmt; // bound by copy
    int val = __cself->val; // bound by copy
    int *valPtr = __cself->valPtr; // bound by copy

    printf(fmt, val);
    printf("valPtr = %d\n", *valPtr);
}

Inside main, the __main_block_impl_0 instance is constructed, and the __main_block_func_0 function is invoked in the same way as before:

int main(int argc, const char * argv[]) {
    /* @autoreleasepool */ { __AtAutoreleasePool __autoreleasepool; 

        NSLog((NSString *)&__NSConstantStringImpl__var_folders_24_5w9yv8jx63bgfg69gvgclmm40000gn_T_main_4ea116_mi_0);

        int dmy = 256;
        int val = 10;
        int* valPtr = &val;
        const char* fmt = "val = %d\n";
        
        // Initializes the member variables appended by automatic variables in struct __main_block_impl_0 according to the arguments passed to the constructor
        void (*blk)(void) = ((void (*)())&__main_block_impl_0((void *)__main_block_func_0,
                                                              &__main_block_desc_0_DATA,
                                                              fmt,
                                                              val,
                                                              valPtr));

        val = 2;
        fmt = "These values were changed. val = %d\n";
        
        // Call the FuncPtr function stored in __block_impl, passing blk (the __main_block_impl_0 instance) as the argument
        ((void (*)(__block_impl *))((__block_impl *)blk)->FuncPtr)((__block_impl *)blk);
    }

    return 0;
}

In general, "capturing external variable values" means that when a block executes, the variables in the block expression that share names with external variables are actually member variables of the block struct instance (that is, the block itself), initialized from the values of the captured external variables. A block cannot directly use automatic variables of C array type. As mentioned earlier, captured values are passed to the struct's constructor to be saved. If a C array, say a[10], were passed, the assignment inside the constructor would amount to int b[10] = a, which the C specification does not allow (arrays cannot be assigned directly; a pointer such as char * can be used instead). Blocks follow C syntax here.
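Capture by value can be imitated in plain C without the blocks extension. The sketch below is a model, not clang's output: toy_block, toy_block_invoke and toy_block_make are hypothetical names, with invoke standing in for FuncPtr and the val and val_ptr members standing in for the captured variables.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_block {
    int (*invoke)(struct toy_block *self);  /* plays the role of FuncPtr */
    int  val;                               /* copy of the captured int */
    int *val_ptr;                           /* copy of the captured pointer */
} toy_block;

/* The "anonymous function": reads only the members of its own struct. */
static int toy_block_invoke(toy_block *self)
{
    return self->val + *self->val_ptr;
}

/* "Defining the block": member variables are initialized from the
 * current values of the external variables. */
toy_block toy_block_make(int val, int *val_ptr)
{
    assert(val_ptr != NULL);
    toy_block b = { toy_block_invoke, val, val_ptr };
    return b;
}
```

If the external int is changed after toy_block_make returns, the val member keeps the old value, while a read through val_ptr sees the new one, since only the address was copied.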

A block captures the value of an external variable at the moment the block syntax is defined. Once saved, the value cannot be overwritten; this is a syntax rule of blocks. If a pointer variable is captured, the pointer can still be used to modify the memory it points to. For example, with a captured NSMutableArray variable you can add objects to the array, but you cannot assign to the variable itself. Likewise, with a captured int *val you can write *val = 20 to change the value in the memory that val points to, and reading that value outside the block reflects the change made inside the block, since both operate on the same memory.

A possible reason for the rule is that a value changed only inside the block could not be passed out, which would make the change pointless. For example, capturing the variable val in a block definition looks like using val itself, but in fact only the value of val is assigned to the val member of the block struct. Changing val inside the block would not change val outside the block. If a pointer variable is captured, what is captured is the address it holds; reassigning it inside the block would only repoint the struct's member variable. Such changes have nothing to do with the external variables.

A compilation error occurs when you try to assign to a member variable with the same name as an external variable inside a block expression. Because the implementation cannot overwrite the captured external variable, the compiler reports an error when it detects such an assignment. In theory the block's member variables are fully decoupled from the external variables and could be changed, but the change would affect only the struct's own members while the names are identical to the external ones, so developers might wrongly assume the external variables change too. A compilation error is the better choice (this is a guess). The __block specifier exists so that external variables can be modified inside block expressions.


76. What are the ways to modify external variables in block expressions?

(This ignores the many earlier examples of passing pointers directly to change the value of a variable.)

  1. Some kinds of C variables allow a block to overwrite their values: static local variables, static global variables, and global variables.

The anonymous function part of the block syntax is simply converted to a C function, and access to static global variables and global variables from that function is unchanged: they are directly accessible. Static local variables are different: the converted function is placed outside the function containing the block syntax, so the static local variable is out of scope and cannot be accessed directly. In the clang -rewrite-objc output we can clearly see that the static local variable is defined inside main, while static void __main_block_func_0(struct __main_block_impl_0 *__cself){… } is a static function defined entirely outside it.

Access to a static local variable has to account for scope: although the code is written together, after conversion it is no longer in the same scope, so the variable can only be reached across scopes through a pointer. (This is a guess.)

A static local variable captured by a block is declared in the block struct as a member variable of pointer-to-original type, and the address of the static local variable is used as its initial value when the block struct is initialized. global_val and static_global_val are accessed and used in __main_block_func_0 exactly as they are outside the block. static_val is accessed through a pointer: in the initializer list of the __main_block_impl_0 constructor, &static_val is assigned to the int *static_val member of struct __main_block_impl_0. In this way the variable is read and modified from outside its scope through its address.

This approach for static local variables looks as if it would also work for automatic variables, so why is it not used for them?

Because a block can outlive the scope of the automatic variables it captures. If the block merely stored a pointer without keeping the variable alive, as with weak and unsafe_unretained variables, then once the variable's scope ends the automatic variable is released and destroyed. Accessing it afterwards yields nil for a weak variable (it has been set to nil) and a crash from a dangling pointer for an unsafe_unretained variable. Static local variables do not have this problem: they live in the static storage area and exist until the program ends. "Local" only means that outside their scope they cannot be accessed directly by name (unlike global variables, which are directly accessible anywhere in the module); it does not mean the data is gone. As long as we hold a pointer to the static variable, it can still be accessed out of scope. Automatic variables therefore cannot be handled the same way as static local variables.
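Access to a static local variable through a stored address can be sketched in plain C. The names toy_static_block, toy_static_block_invoke and run_with_static_local are hypothetical; the struct member plays the role of the int *static_val member described above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int *static_val;    /* address of the static local, not a copy of its value */
} toy_static_block;

/* Defined outside the function that owns the static local, so the variable
 * cannot be named here; it is reached through the stored address instead. */
static void toy_static_block_invoke(toy_static_block *self)
{
    assert(self != NULL && self->static_val != NULL);
    *self->static_val *= 2;
}

int run_with_static_local(void)
{
    static int static_val = 3;              /* lives until the program ends */
    toy_static_block blk = { &static_val }; /* "capture" by address */
    toy_static_block_invoke(&blk);
    return static_val;
}
```

The pointer stays valid after run_with_static_local returns because the variable has static storage duration; the same trick with an automatic variable would leave a dangling pointer.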

  2. The second is to use the __block specifier, more precisely the "__block storage-class specifier". A struct instance is generated for every variable declared with __block, regardless of whether it is used in a block.

According to the clang -rewrite-objc conversion, __block val is converted to struct __Block_byref_val_0. (The name consists of the prefix __Block, byref for a variable qualified with __block, val for the original variable name, and 0 for the index of the current __block variable.)

struct __Block_byref_val_0 {
  void *__isa;
  __Block_byref_val_0 *__forwarding; // Pointer to the instance itself
  int __flags;
  int __size;
  int val;
};

__Block_byref_val_0 is a separate definition that can be reused in multiple blocks.

For a __block variable of object type, the struct __Block_byref_val_0 additionally contains __Block_byref_id_object_copy and __Block_byref_id_object_dispose, which are used when the __block variable is copied to the heap and when it is disposed of.

struct __main_block_impl_0 {
  struct __block_impl impl;
  struct __main_block_desc_0* Desc;
  
  // fmt is converted the same way as before
  const char *fmt;
  
  // val is a pointer to a __Block_byref_val_0 struct
  __Block_byref_val_0 *val; // by ref
  
  // The constructor takes a __Block_byref_val_0 *_val argument, but the initializer list uses val(_val->__forwarding)
  
  __main_block_impl_0(void *fp,
                      struct __main_block_desc_0 *desc,
                      const char *_fmt,
                      __Block_byref_val_0 *_val,
                      int flags=0) : fmt(_fmt), val(_val->__forwarding) {
    impl.isa = &_NSConcreteStackBlock;
    impl.Flags = flags;
    impl.FuncPtr = fp;
    Desc = desc;
  }
};

Assigning to a static variable in a block simply goes through a pointer to that static variable. Assigning to a __block variable is more involved: the __main_block_impl_0 struct instance holds a pointer to the __Block_byref_val_0 struct instance of the __block variable; the __forwarding member of that __Block_byref_val_0 instance holds a pointer to the instance itself, and the member variable val is accessed through __forwarding. (The member variable val is held by the instance itself and is equivalent to the original external variable.)

The copy and dispose function pair is generated when a block expression uses an external object variable, an external __block variable, or an external block.


77. Block storage domain.

A block is converted to an instance of a block struct, and a __block variable is converted to an instance of a __block variable struct.

Blocks can also be regarded as OC objects. Seen as an OC object, a block belongs to one of three classes: _NSConcreteStackBlock, _NSConcreteGlobalBlock, and _NSConcreteMallocBlock. As the names suggest, a block of class _NSConcreteMallocBlock is placed on the heap allocated by the malloc function.

Class                     Storage domain of the object
_NSConcreteStackBlock     The stack
_NSConcreteGlobalBlock    The data area of the program (.data area)
_NSConcreteMallocBlock    The heap

When block syntax is written where global variables are declared, the generated block is of type _NSConcreteGlobalBlock. The specific type of a block cannot be read reliably from the clang-converted code; the isa a block actually points to is determined dynamically at runtime.

void (^blk)(void) = ^ {printf("Global _NSConcreteGlobalBlock Block! \n"); };

// Prints: _NSConcreteGlobalBlock Block! ❄️❄️❄️ block isa: <__NSGlobalBlock__: 0x100002068>

For a block defined at global scope, the block struct instance is stored in the data area of the program. Automatic variables cannot be used where global variables are declared, so there is nothing to capture. The contents of the block's struct instance therefore do not depend on the state at execution, and a single instance is enough for the entire program. The struct instance can simply be placed in the same data area as global variables.

The captured values in a block struct instance change with the state at execution only when automatic variables are captured. Even if a block is written inside a function rather than at global scope, as long as it does not capture automatic variables, its struct instance can be placed in the data area of the program. This is also a global block.

A block that does not capture automatic variables does not depend on any run-time state (it has no captured variables), so no copy of the block is involved and it is placed in the data area.

Also note that in the second case the isa in the Clang-rewritten code shows up as _NSConcreteStackBlock. Objective-C is a dynamic language and the true class is determined at run time, which can be checked with the LLDB debugger. Although the Clang-rewritten source usually shows _NSConcreteStackBlock, the class at run time can be different. The summary is as follows:

  • Block syntax is written where global variables are declared
  • The block expression does not capture external automatic variables

In these cases, the block is of class _NSConcreteGlobalBlock, that is, the block is configured in the program's data area. Any other block syntax generates a block of class _NSConcreteStackBlock, which is stored on the stack.
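The rule summarized above can be written as a small decision function. This is a sketch of the rule as stated in this section, not of how the compiler or runtime is implemented; BlockClass and class_at_creation are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* The three block classes discussed in this section. */
typedef enum { STACK_BLOCK, GLOBAL_BLOCK, MALLOC_BLOCK } BlockClass;

/* Returns the class a block has when it is created, following the two
 * conditions listed above. MALLOC_BLOCK never appears here because, in
 * this model, a block only becomes a malloc block after being copied. */
BlockClass class_at_creation(bool at_global_scope, bool captures_automatic) {
    if (at_global_scope || !captures_automatic) {
        return GLOBAL_BLOCK; /* placed in the data area */
    }
    return STACK_BLOCK;      /* placed on the stack */
}
```

For example, class_at_creation(false, false) models a block written inside a function that captures nothing, and yields GLOBAL_BLOCK.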

Blocks configured in the global data area can be used safely through pointers from anywhere, but a block on the stack is discarded when the scope of the variable it belongs to ends. Because __block variables are also configured on the stack, a __block variable is likewise discarded when the scope of the variable it belongs to ends. Blocks solve this problem by providing a way to copy block and __block variable struct instances from the stack to the heap. Once a block configured on the stack has been copied to the heap, the block on the heap continues to exist even after the variable scope in which the block syntax was written ends.

Here is a question to consider: when a block exists both on the stack and on the heap, how are assignment, modification, and discard operations managed? The isa of a block copied to the heap points to _NSConcreteMallocBlock, that is, impl.isa = &_NSConcreteMallocBlock;. For a __block variable, the __forwarding member of the struct instance on the stack points to the struct instance on the heap, and the __forwarding member of the instance on the heap points to itself. The same __block variable content can therefore be accessed correctly from either the __block variable on the stack or the __block variable on the heap.
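The __forwarding arrangement after a copy can be modeled in plain C. The following is a minimal sketch under stated assumptions, not the real runtime code: ByrefInt, byref_copy_to_heap, and demo_forwarding are hypothetical names, and the real runtime also handles reference counts and flags that are left out here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a __block variable struct. */
typedef struct ByrefInt {
    struct ByrefInt *forwarding; /* always points at the live copy */
    int val;
} ByrefInt;

/* Copies a stack instance to the heap and repoints the stack instance's
 * forwarding member at the heap copy. Returns NULL if allocation fails. */
ByrefInt *byref_copy_to_heap(ByrefInt *on_stack) {
    ByrefInt *on_heap = malloc(sizeof *on_heap);
    if (on_heap == NULL) {
        return NULL;
    }
    on_heap->val = on_stack->val;
    on_heap->forwarding = on_heap;  /* heap copy points at itself */
    on_stack->forwarding = on_heap; /* stack copy points at the heap copy */
    return on_heap;
}

/* Increments the variable once through the stack instance and once through
 * the heap instance, then returns the value if both views agree, else -1. */
int demo_forwarding(void) {
    ByrefInt stack_var = { NULL, 10 };
    stack_var.forwarding = &stack_var;

    ByrefInt *heap_var = byref_copy_to_heap(&stack_var);
    if (heap_var == NULL) {
        return -1;
    }
    stack_var.forwarding->val += 1; /* access via the stack variable */
    heap_var->forwarding->val += 1; /* access via the heap variable */

    int seen_from_stack = stack_var.forwarding->val;
    int seen_from_heap = heap_var->forwarding->val;
    free(heap_var);
    return (seen_from_stack == seen_from_heap) ? seen_from_stack : -1;
}
```

Because every access goes through forwarding, both increments land on the heap instance, so the stack-side and heap-side views stay in agreement.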

So what exactly is the copy mechanism that blocks provide? In fact, in most cases under ARC, the compiler makes the appropriate judgment and automatically generates code that copies the block from the stack to the heap.

There are two scenarios in which blocks are automatically copied from stack to heap when assigned:

//
// clang -rewrite-objc converts the following code successfully
typedef int(^BLK)(int);

BLK func(int rate) {
    // The stack block on the right-hand side is copied to the heap and held by temp
    BLK temp = ^(int count){ return rate * count; };
    return temp;
}

// The following code fails to convert with clang -rewrite-objc, although it compiles and runs normally:
typedef int(^BLK)(int);

BLK func(int rate) {
    // Return the stack block directly
    return^ (int count){ return rate * count; };
}

// About the failure: the clang conversion fails, but compiling and running the function directly works.
// The clang error says that a block living on the stack is being returned.
// A stack block is released once execution leaves the enclosing braces, so it cannot be returned,
// and the clang rewriter cannot copy the stack block to the heap on its own.
// In the first version, the block to the right of the equals sign has already been copied to the heap and assigned to temp.

// In a normal build, the compiler generates code that copies the block from the stack to the heap.

returning block that lives on the local stack
    return^ (int count){ return rate * count; };
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
64 warnings and 1 error generated.

//
BLK __weak blk;
{
    NSObject *object = [[NSObject alloc] init];
    
    // NSObject * __weak object2 = object;
    
    void (^strongBlk)(void) = ^{
        NSLog(@"object = %@", object);
    };
    
    // blk is a weak variable; assigning a strong variable to it
    // does not make blk hold (retain) the block
    blk = strongBlk;
}

// blk();
printf("blk = %p\n", blk);

// Prints as expected: once execution leaves the braces, the block instance has been released:
blk = 0x0

BLK __weak blk;
{
    NSObject *object = [[NSObject alloc] init];
    // NSObject * __weak object2 = object;
    // void (^strongBlk)(void) = ^{
    // NSLog(@"object = %@", object);
    // };

    // Here is a warning:
    // Assigning block literal to a weak variable; object will be released after assignment
    blk = ^{
        NSLog(@"object = %@", object);
    };
    
    printf("Internal blk = %p\n", blk);
}

// blk();
printf("blk = %p\n", blk);

// Prints the address of a stack block (blk is not 0x0),
// that is, the address of the stack block on the right-hand side of the assignment.

// The reason is that assigning a stack block struct instance to a weak variable
// does not go through the real weak process (the one tied to dealloc of Objective-C objects).
Internal blk = 0x7ffeefbff538
blk = 0x7ffeefbff538

Under ARC, when a block is returned as a function's return value, the compiler automatically generates code that copies it to the heap.

In most cases the compiler makes the appropriate judgment, but in the remaining cases you need to copy the block from the stack to the heap yourself by calling the copy instance method (as the comment on the _Block_copy function notes, it creates a heap-based copy of a block).

In what situations can the compiler not make this judgment?

  • When a block is passed as an argument to a method or function. However, if the method or function copies the argument appropriately, there is no need to copy manually before the call.

A manual copy is not needed in the following cases, because the block is copied automatically:

  • Cocoa framework methods with usingBlock in their names.
  • Grand Central Dispatch API.
  • When assigning a block to a member variable of type id or of a block type with the __strong modifier.

For example, NSArray's enumerateObjectsUsingBlock: method and the dispatch_async function do not require a manual copy. A manual copy is required when passing blocks to NSArray's initWithObjects: method.

Here is an example that explicitly copies the blocks added to an array:

id obj = [Son getBlockArray];
void (^blk)(void) = [obj objectAtIndex:0];
blk();

// copy is called on each block
+ (id)getBlockArray {
    int val = 10;
    return [[NSArray alloc] initWithObjects:[^{ NSLog(@"blk0: %d", val); } copy],
                                            [^{ NSLog(@"blk1: %d", val); } copy],
                                            nil];
}

// If copy is not added, the program crashes at run time
+ (id)getBlockArray {
    int val = 10;
    return [[NSArray alloc] initWithObjects:^{ NSLog(@"blk0: %d", val); },
                                            ^{ NSLog(@"blk1: %d", val); },
                                            nil];
}

// Cause of the crash: when copy is not called explicitly, the blocks on the stack are discarded after getBlockArray returns, and the compiler cannot determine whether a copy is needed.
// Copying a block from the stack to the heap costs CPU time, and when a block can be used on the stack, copying it to the heap wastes CPU resources.
// In this case, copy the blocks manually.
Class of the block        Storage domain of the copy source    Effect of copy
_NSConcreteStackBlock     The stack                            Copied from the stack to the heap
_NSConcreteGlobalBlock    The data area of the program         Nothing happens
_NSConcreteMallocBlock    The heap                             Reference count is incremented

No matter where a block is configured, copying it with the copy method does not cause any problems, so when you are not sure whether a copy is needed, you can simply call copy.
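The three copy outcomes listed above can be modeled with a small dispatch function in plain C. This is a simplified sketch, not the actual _Block_copy implementation: BlockKind, BlockModel, and block_copy_model are hypothetical names, and retain_count is a stand-in for the runtime's real reference counting.

```c
#include <assert.h>
#include <stdlib.h>

/* The three block classes discussed in this section. */
typedef enum { STACK_BLOCK, GLOBAL_BLOCK, MALLOC_BLOCK } BlockKind;

/* Hypothetical, heavily simplified block struct. */
typedef struct {
    BlockKind isa;
    int retain_count; /* only meaningful for heap blocks in this model */
    int captured;     /* one captured value, for illustration */
} BlockModel;

/* Models the effect of copy for each class of block.
 * Returns NULL only if the heap allocation fails. */
BlockModel *block_copy_model(BlockModel *src) {
    switch (src->isa) {
    case GLOBAL_BLOCK:
        return src;             /* nothing happens */
    case MALLOC_BLOCK:
        src->retain_count += 1; /* reference count is incremented */
        return src;
    case STACK_BLOCK: {
        BlockModel *dst = malloc(sizeof *dst);
        if (dst == NULL) {
            return NULL;
        }
        *dst = *src;            /* copied from the stack to the heap */
        dst->isa = MALLOC_BLOCK;
        dst->retain_count = 1;
        return dst;
    }
    }
    return NULL;
}
```

In this model, calling block_copy_model on any kind of block is harmless, which mirrors the advice above: copying a stack block yields a new heap block, while copying a global or heap block returns the same instance.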

🎉🎉🎉 To be continued…