Article summary: This article analyzes how RxSwift operators are implemented, then introduces Swift’s reflection mechanism, function dispatch mechanism, and namespace mechanism. Building on these, we design a scheme for hooking Swift’s dynamic and static methods, which we hope will be helpful to iOS developers.


1. Background: RxSwift pain points

RxSwift is a functional reactive programming framework maintained by the ReactiveX community on GitHub. Its main idea is to wrap events in observable streams and to deliver them through the observer pattern.
When you use RxSwift for simple things such as sending a network request or listening for button taps, the code looks straightforward and neat. But when an asynchronous event stream is passed and transformed from one class to another, the code becomes much less readable. It can even be difficult to debug, because you cannot capture the call stack that produced an asynchronous event.



To solve the debugging problem of RxSwift, we read the source code to understand how RxSwift operators are implemented, and then used Swift’s reflection mechanism to dump the “Observable Link”. Finally, based on Swift’s function dispatch mechanism and namespace mechanism, we designed a safe and efficient scheme for hooking Swift’s dynamic and static methods. With this hook scheme, we intercept and process the key functions on the chain that carries stream events, and so achieve precise locating and debugging of asynchronous events in RxSwift.



2. Dump Observable Link

2.1 A brief analysis of how RxSwift operators are implemented

An Observable can be converted into a new Observable using operators, and after a series of operator transformations the source Observable and its descendants form an Observable Link. Tracing the source of an asynchronous event starts with finding the head node of the entire Observable Link.
After reading the RxSwift source code, I found that the basic principle of RxSwift operators is this: when you transform an Observable A, the operator creates a new Observable B, and B holds the original Observable A internally. When someone subscribes to Observable B, B in turn subscribes to Observable A internally, which produces the “linkage” effect across the entire Observable Link. At this point you may have an idea. Since each operator holds the previous Observable inside it, can we follow this rule from any operator all the way up to the root Observable and dump the entire Observable Link? In principle, yes. In practice, however, every property that an Observable uses to hold its original Observable is private, which means you cannot read it directly. Fortunately, Swift’s reflection mechanism can be used to achieve this goal.
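The holding relationship described above can be sketched without RxSwift itself. The following is a minimal illustration, not RxSwift’s actual code: MiniObservable, MiniJust, and MiniMap are hypothetical names, and subscription is reduced to a plain closure with no disposal or error handling.

```swift
// Hypothetical stand-in for an Observable: subscribing delivers elements to a closure.
class MiniObservable<Element> {
    func subscribe(_ observer: @escaping (Element) -> Void) {
        fatalError("subclasses must override subscribe(_:)")
    }

    // The operator does no work itself: it returns a new node that holds `self`.
    func map<Output>(_ transform: @escaping (Element) -> Output) -> MiniObservable<Output> {
        MiniMap(source: self, transform: transform)
    }
}

// Root node of a link: emits one stored element on subscription.
final class MiniJust<Element>: MiniObservable<Element> {
    private let element: Element

    init(_ element: Element) {
        self.element = element
    }

    override func subscribe(_ observer: @escaping (Element) -> Void) {
        observer(element)
    }
}

// Operator node: privately holds the upstream Observable ("Observable A").
final class MiniMap<Source, Output>: MiniObservable<Output> {
    private let source: MiniObservable<Source>
    private let transform: (Source) -> Output

    init(source: MiniObservable<Source>, transform: @escaping (Source) -> Output) {
        self.source = source
        self.transform = transform
    }

    // Subscribing to this node ("Observable B") subscribes to the source internally.
    override func subscribe(_ observer: @escaping (Output) -> Void) {
        let transform = self.transform
        source.subscribe { element in
            observer(transform(element))
        }
    }
}

var received: [String] = []
let doubled = MiniJust(21).map { $0 * 2 }
let labelled = doubled.map { "value: \($0)" }
labelled.subscribe { received.append($0) }
```

Subscribing to the last node walks back through every private source reference, which is the same chain a dump has to follow.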

2.2 Swift Reflection mechanism

Despite Swift’s emphasis on strong typing, compile-time safety, and static dispatch, its standard library provides a struct named Mirror for reflection. Simply put, if you have a class A and create an instance a of it, you can call Mirror(reflecting: a) to produce a Mirror object m, and then iterate over m.children to get all the stored properties of a, including private ones.
With this, you now know how to dump an Observable Link.
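As a rough sketch of the dump, the loop below follows a private property from node to node using Mirror. The node classes are hypothetical stand-ins for operators, and the property label "source" is an assumption made for this example; a real dump would have to check whichever labels the RxSwift operators in question actually use.

```swift
// Hypothetical operator nodes; each non-root node privately holds its upstream node.
class LinkNode {}

final class RootNode: LinkNode {}

final class MapNode: LinkNode {
    private let source: LinkNode
    init(source: LinkNode) { self.source = source }
}

final class FilterNode: LinkNode {
    private let source: LinkNode
    init(source: LinkNode) { self.source = source }
}

// Walks from the tail of a link to its head and records the type name of each node.
// "source" is the assumed label of the property that holds the upstream node.
func dumpLink(from tail: Any) -> [String] {
    var names: [String] = []
    var current: Any? = tail
    while let node = current {
        names.append(String(describing: type(of: node)))
        // Mirror exposes stored properties regardless of their access level.
        let children = Mirror(reflecting: node).children
        current = children.first(where: { $0.label == "source" })?.value
    }
    return names
}

let link = dumpLink(from: FilterNode(source: MapNode(source: RootNode())))
```

The last entry of the result is the head node of the link, because it is the first node without a "source" child.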

2.3 Dynamically adding stored properties to existing classes

Every Observable on the dumped Observable Link is an object we need to watch at run time. How can we tell these Observables apart from all the others? We can add a tag property to Observable and, at run time, monitor the events produced on an Observable only if its tag is not empty. However, there is a problem with associated types. A value of type Any can be cast to an ordinary protocol type, but not to a protocol that has an associated type, because the concrete associated type is unknown. To solve this, we designed a protocol without associated types, RxEventTrackType, added the eventTrackerTag property in an extension of this protocol, and made Observable conform to it. To give a protocol extension a stored property, I chose an approach that was common in the Objective-C era: objc_setAssociatedObject.
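A minimal sketch of this idea follows, assuming an Apple platform where the Objective-C runtime is available. RxEventTrackType and eventTrackerTag are the names used in the text; the String tag type, the key variable, the FakeObservable class, and the trackerTag(of:) helper are assumptions made for illustration.

```swift
import Foundation

// A protocol without associated types, so a value of type Any can be cast to it.
protocol RxEventTrackType: AnyObject {}

// The address of this variable serves as the unique key for the associated object.
private var eventTrackerTagKey: UInt8 = 0

extension RxEventTrackType {
    // Behaves like a stored property, backed by an Objective-C associated object.
    var eventTrackerTag: String? {
        get {
            objc_getAssociatedObject(self, &eventTrackerTagKey) as? String
        }
        nonmutating set {
            objc_setAssociatedObject(self, &eventTrackerTagKey, newValue,
                                     .OBJC_ASSOCIATION_RETAIN_NONATOMIC)
        }
    }
}

// Hypothetical generic class standing in for Observable<Element>.
final class FakeObservable<Element>: RxEventTrackType {}

// Reads the tag from a value whose generic parameter is unknown at the call site.
func trackerTag(of candidate: Any) -> String? {
    (candidate as? RxEventTrackType)?.eventTrackerTag
}

let tagged = FakeObservable<Int>()
tagged.eventTrackerTag = "login-flow"
let untagged = FakeObservable<String>()
```

The cast inside trackerTag(of:) succeeds for any FakeObservable regardless of its Element type, which is the reason for keeping the protocol free of associated types.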


3. Hook Swift dynamic and static methods

3.1 Swift’s function dispatch mechanism

Function dispatch is about how a program decides which function to call. Compiled languages commonly use three kinds of function dispatch: Direct Dispatch, Table Dispatch, and Message Dispatch. Swift supports all three.
Direct Dispatch is the fastest, not only because fewer instructions are needed for the call, but also because it leaves the compiler a lot of room for optimizations such as function inlining. Direct dispatch, however, cannot support inheritance, because it lacks dynamism.
Table Dispatch is the most common way to implement dynamic behavior in compiled languages. A function table is an array of pointers, one for each function the class declares. Most languages call this a “virtual table”; Swift uses a virtual table for classes and a “witness table” for protocol conformances. Each class maintains its own function table, which records every function that must be dispatched through it. If a subclass overrides a parent function, the table stores only the overriding function. Any new function that a subclass adds in its declaration body is appended to the end of the table. At run time, this table is used to determine which function is actually called.
Message Dispatch is the most dynamic way to invoke functions, and this mechanism underlies features such as KVO, UIAppearance, and Core Data. The key point is that developers can change a function’s behavior at run time, not only through method swizzling but even through isa-swizzling, which changes an object’s class, so that custom dispatch can be implemented on a per-object basis.
Summary of Swift function dispatch rules:
  • Functions in a value type’s declaration scope always use direct dispatch
  • Functions in a class’s declaration scope are dispatched through the function table (in some special cases the compiler optimizes them to direct dispatch)
  • Functions in extensions of both protocols and classes use direct dispatch
  • Functions declared in a protocol’s body, including those with a default implementation, are dispatched through the function table
  • Functions marked dynamic (together with @objc) are dispatched through the message mechanism at run time
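The difference between table dispatch and direct dispatch in the rules above can be observed with a small experiment. This is only an illustrative sketch; all type and method names in it are hypothetical.

```swift
protocol Greeter {
    // Declared in the protocol body: dispatched through the witness table.
    func requirement() -> String
}

extension Greeter {
    func requirement() -> String { "default requirement" }
    // Exists only in the extension: dispatched directly on the static type.
    func extensionOnly() -> String { "default extensionOnly" }
}

struct CustomGreeter: Greeter {
    func requirement() -> String { "custom requirement" }
    func extensionOnly() -> String { "custom extensionOnly" }
}

class Animal {
    // Declared in the class body: dispatched through the class function table.
    func sound() -> String { "..." }
}

final class Dog: Animal {
    override func sound() -> String { "woof" }
}

let greeter: Greeter = CustomGreeter()
let viaWitnessTable = greeter.requirement()       // picks CustomGreeter's version
let viaDirectDispatch = greeter.extensionOnly()   // picks the extension's version
let onConcreteType = CustomGreeter().extensionOnly()

let animal: Animal = Dog()
let viaClassTable = animal.sound()                // picks Dog's override
```

The call to extensionOnly() through the protocol type ignores CustomGreeter’s own method, because the compiler binds it to the extension’s implementation at compile time.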

3.2 Why hooking is difficult in a static language like Swift

Compared with the dynamic language Objective-C, hooking methods in the static language Swift is extremely difficult. The main reasons are as follows:
1. It is difficult to find the target function
In Objective-C we can look up a method by its selector (you can think of it simply as a string), and the IMP field inside the method stores the pointer to the function. Dynamic methods in Swift are found through a class function table or a protocol witness table, where the function pointer is located by offset addressing, and static methods in Swift have their addresses determined at compile time.
2. It is dangerous to forcibly replace function pointers directly
If we need to hook dynamic methods in Swift, we can use Xcode’s LLDB debugger to disassemble the code and record a function’s offset in the function table at run time, then find the class’s metadata and use these offsets to locate the function pointer to hook. However, this is a very dangerous approach: if Swift ever changes the memory layout of its class objects, hooks that rely on fixed offsets will break.

3.3 Graft: use namespaces wisely

In Swift, each module is a separate namespace, and different modules can define the same type name or method name. For example, the Swift standard library defines a lowercased method on String, a basic data type that it provides. If we use an extension to add our own lowercased method to String in our module, the two lowercased methods can coexist, and when code in our module calls String’s lowercased method, the one defined in our module takes precedence over the standard library’s by default.
Hooking methods in Swift now looks feasible, but a more important problem remains: how do we invoke the native lowercased method from inside our own lowercased method? The answer, again, is namespaces. We can create a separate module, B, that adds an originalLowercased method to String. Its implementation is simple: it just calls String’s native lowercased method, which inside module B still resolves to the standard library version. Our module’s lowercased method can then call originalLowercased to reach String’s native lowercased method indirectly.
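A single source file cannot contain two modules, so the sketch below illustrates the same namespace idea with a free function instead of a String method. This is a swapped-in variant, not the scheme described above: here the module-qualified name Swift.abs reaches the original, which is the role that module B’s originalLowercased plays for methods, since methods cannot be qualified by module name. CallRecorder and recorder are hypothetical helpers.

```swift
// Hypothetical helper that records which calls were intercepted.
final class CallRecorder {
    var calls: [String] = []
}

let recorder = CallRecorder()

// Declared in our own module with the same name as the standard library's abs(_:).
// Unqualified calls to abs with an Int argument in this module resolve to this function.
func abs(_ value: Int) -> Int {
    recorder.calls.append("abs(\(value))")
    // The module-qualified name bypasses our function and reaches the original.
    return Swift.abs(value)
}

let result = abs(-7)
```

The call site is unchanged, yet it now passes through our function before reaching the standard library, which is the effect the hook scheme relies on.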




Unfortunately, a hook built on namespaces as described above only takes effect for calls made inside our own module, but that is sufficient for most hooking needs.


4. Hook RxSwift methods

The discussion of hooking above gives us a sufficient theoretical basis, and we can now put the theory into practice.
To trace the source of a stream event, the key is to intercept the onNext, onError, and onCompleted methods of ObserverType, and the accept method of BehaviorRelay. Then, when onNext is called on an ObserverType object, if the object is found to carry an observerTypeTrackerTag, it is treated as an object that needs to be monitored. We can also set a conditional breakpoint here to make debugging easier. The original article shows the code as a screenshot at this point.
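The following is not the code from the original screenshot. It is a simplified, hypothetical sketch of the interception logic only: TrackedObserver and originalOnNext are invented names, and the real scheme would place the replacement onNext in an extension of ObserverType and reach the original through a separate module.

```swift
// Hypothetical observer that carries an optional tracker tag.
final class TrackedObserver<Element> {
    var observerTypeTrackerTag: String?
    private(set) var trace: [String] = []
    private let handler: (Element) -> Void

    init(handler: @escaping (Element) -> Void) {
        self.handler = handler
    }

    // Stand-in for the forwarding method that reaches the original implementation.
    private func originalOnNext(_ element: Element) {
        handler(element)
    }

    // Stand-in for the replacement onNext that call sites resolve to.
    func onNext(_ element: Element) {
        if let tag = observerTypeTrackerTag {
            // A breakpoint on the next line stops only for tagged observers.
            trace.append("[\(tag)] onNext(\(element))")
        }
        originalOnNext(element)
    }
}

var delivered: [Int] = []
let taggedObserver = TrackedObserver<Int> { delivered.append($0) }
taggedObserver.observerTypeTrackerTag = "login-flow"
let plainObserver = TrackedObserver<Int> { delivered.append($0) }

taggedObserver.onNext(1)
plainObserver.onNext(2)
```

Both observers still deliver their elements, but only the tagged one leaves a trace entry, so untracked streams are left undisturbed.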

The original article includes a GIF at this point that shows the tool tracing and locating an event source through single-step debugging.


5. Summary

In developing the RxSwift asynchronous event tracking and locating tool, the most critical and difficult point was how to hook Swift’s dynamic and static methods. After trying two or three approaches, we settled on a scheme that uses Swift’s function dispatch mechanism and namespace mechanism to hook dynamic and static methods safely and efficiently. We believe this hook scheme can also offer ideas and inspiration when you face similar problems in future development.