Sometimes you want to change the behavior of existing functionality in your iOS or macOS applications, or gain finer-grained control over how a function behaves. Two use cases come to mind: logging and tracing. As a software engineer at Product Science, I am particularly interested in tracing, so this article focuses on that.
Let's explore how we can change the behavior of functions written in Swift using a small library called fishhook by Meta. This library allows us to hook into existing functions and modify their behavior at runtime. SwiftTrace also uses fishhook, but with additional features that some developers may not need or want, such as trampolines written in assembly language.
Here, we will skip over many technical details about how fishhook works that can easily be found in references. We are not interested in trivial cases, such as swizzling Objective-C code or dynamic Swift methods. If you want to learn how to interpose C functions using fishhook, please follow the examples here.
Instead, in this article, we will focus on the practical aspects of using fishhook to interpose Swift code.
The problem statement
When tracing the execution of asynchronous functions in iOS or macOS applications, a common challenge is tracing both the async call itself and the closure passed to it. To illustrate, consider a typical call such as queue.async { ... }, where queue is a DispatchQueue.
To make things more challenging, let's assume that the closure passed to async captures some outer variables and uses self (a reference to DispatchQueue) within the block. So, how can we achieve tracing in this scenario?
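Hooking C functions (e.g., open and close) with fishhook is straightforward: we pass the symbol name, a replacement function, and a location to store the original implementation to rebind_symbols.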
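The mechanism behind this can be illustrated without the library. The following is a toy model, not fishhook's API: it assumes callers reach a function only through a pointer slot, which is the property fishhook exploits when it rewrites symbol pointers. The names `toy_rebind`, `real_open`, `traced_open`, and `call_open` are hypothetical and exist only for this sketch. With fishhook itself, the corresponding call is `rebind_symbols`, which takes an array of `struct rebinding` entries (name, replacement, and a location for the original).

```c
#include <assert.h>
#include <string.h>

/* Signature shared by the original and its replacement (hypothetical). */
typedef int (*open_like_fn)(const char *path);

/* Stand-in for a library function such as open(); returns a fake descriptor. */
static int real_open(const char *path) { return (int)strlen(path); }

/* Toy "symbol pointer table": callers reach functions only through a slot. */
struct slot { const char *name; open_like_fn target; };
static struct slot table[] = { { "open", real_open } };

/* Toy counterpart of rebinding: overwrite the slot and hand back the
   previous target so the replacement can still call it. */
static int toy_rebind(const char *name, open_like_fn replacement,
                      open_like_fn *replaced) {
    assert(replaced != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].name, name) == 0) {
            *replaced = table[i].target;
            table[i].target = replacement;
            return 0;
        }
    }
    return -1; /* no such symbol in the table */
}

static open_like_fn orig_open; /* filled in by toy_rebind */
static int traced_calls;       /* what our "tracing" records */

/* Replacement: do the extra work, then forward to the original. */
static int traced_open(const char *path) {
    traced_calls++;
    return orig_open(path);
}

/* How client code reaches "open": always through the slot. */
static int call_open(const char *path) { return table[0].target(path); }
```

After `toy_rebind("open", traced_open, &orig_open)`, every `call_open` goes through `traced_open` first and still returns the original result.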
However, when it comes to interposing Swift methods, things get more complicated.
The first thing to consider is mangling. Mangling encodes a Swift declaration's module, type, name, and signature into a single symbol name, which is how Swift symbols appear in libraries.
Mangled symbols look like this:
$sSo17OS_dispatch_queueC8DispatchE5async5group3qos5flags7executeySo0a1_b1_F0CSg_AC0D3QoSVAC0D13WorkItemFlagsVyyXBtF
Demangled, this symbol is the async extension method of the DispatchQueue class:
(extension in Dispatch):__C.OS_dispatch_queue.async(group: Swift.Optional<__C.OS_dispatch_group>, qos: Dispatch.DispatchQoS, flags: Dispatch.DispatchWorkItemFlags, execute: @convention(block) () -> ()) -> ()
To replace the method with our augmented implementation, we need to pass the mangled symbol name into the function rebind_symbols.
The book by the author of SwiftTrace library provides some useful code for us to borrow:
While we won't use this function in our current implementation, it is still beneficial. If we are dealing with more complex prototypes, it becomes much simpler to search for interesting symbols through the mangled symbol table in a dynamic library. Additionally, the annotation @_silgen_name(<function_name>) is significant since it allows access to hidden C functions from Swift code and vice versa, calling Swift functions from C. The latter ability is essential for our usage.
Note: It's important to know how to obtain a list of mangled symbols for a particular library. As mentioned earlier, you can write some code or copy one from a library, such as SwiftTrace, to list symbols. But it's easier to start with built-in macOS commands like nm <filename> and otool -l <filename>; piping the output of nm through xcrun swift-demangle shows the readable names.
Our approach to interposing Swift methods
Why do we need to call a Swift function from C anyway? The naïve approach is to replace one Swift symbol with another one directly, but this approach has some problems.
First, C functions and Swift methods use different calling conventions. @convention(c) describes a plain C function pointer, whereas a Swift function value uses the "thick" convention, which carries a context along with the code pointer. This means that a Swift method cannot be cast to a proper C function pointer using unsafeBitCast. As a result, you cannot use a pure Swift approach to interpose Swift symbols directly.
The second issue concerns the type of extension method. Swift uses a “curried” form for such types, which means that the function type
DispatchQueue.async(execute: DispatchWorkItem)
appears as
(DispatchQueue) -> (DispatchWorkItem) -> Void
This can be misleading, as one might think that it represents the actual signature of the symbol, when in reality, the actual signature is (DispatchWorkItem) -> Void.
This leads to another critical question: how can we access self? After examining the generated assembly code (skipping over LLVM IR and SIL for simplicity), we discover (with some effort) that self is passed in a register. This is logical, since self is heavily used in class methods and should be easy to access. Furthermore, it is always the same register on a given architecture, so we can quickly determine which one it is. To access that specific CPU register, we need only one line of assembly code embedded in C:
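A minimal sketch of such a helper follows. It assumes the register assignments of the Swift calling convention as commonly documented, r13 on x86_64 and x20 on arm64; the helper name `read_swift_self` is hypothetical, and the registers should be verified against the disassembly of your own target.

```c
#include <stddef.h>

/* Reads the register assumed to carry `self` in the Swift calling convention.
   Assumption: r13 on x86_64, x20 on arm64. On other architectures the helper
   returns NULL. */
static inline void *read_swift_self(void) {
    void *self = NULL;
#if defined(__x86_64__)
    __asm__ volatile("movq %%r13, %0" : "=r"(self));
#elif defined(__aarch64__)
    __asm__ volatile("mov %0, x20" : "=r"(self));
#endif
    return self;
}
```

The read is fragile by nature: it should be the first thing the replacement function does, before the compiler has any reason to reuse that register for its own values.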
To summarize, our approach to interposing Swift methods involves the following steps:
- Create a C replacement of a specific Swift method with a compatible signature.
- Store the pointer to the original implementation.
- Use inline assembly to obtain access to self.
- Call Swift code from our C function to transform the arguments we care about.
- Finally, pass the transformed arguments into the original function; this is how we augment the closure block in the async call.
Let's now proceed to implement this plan step by step.
Here's an example of a compatible C signature to store the original method:
The new implementation of the async method will look like this:
We are only interested in modifying the block argument. Therefore, our transforming function takes the block and a pointer to self as arguments to augment the closure.
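The flow described so far can be sketched in plain C. This is an illustration under stated assumptions, not the article's implementation: the lowered signature `async_fn`, the placeholder `read_self`, the C stand-in `transform_block` (the real transforming function is written in Swift), and `fake_original_async` are all hypothetical names introduced here so the control flow can be exercised without Swift or fishhook.

```c
#include <stddef.h>

/* Assumed C-compatible shape of the hooked method: the block is the only
   explicit argument; self arrives in a register, not in the argument list. */
typedef void (*async_fn)(void *block);

/* Pointer to the original implementation, filled in when the symbol is
   rebound (for example by fishhook). */
static async_fn original_async;

/* Placeholder: a real hook reads the self register with inline assembly. */
static void *read_self(void) { return NULL; }

/* Stand-in for the Swift transforming function: it records the block and
   self it was given and returns the wrapper in place of the block. */
struct traced_block { void *inner; void *self; };
static struct traced_block wrapper;

static void *transform_block(void *block, void *self) {
    wrapper.inner = block;
    wrapper.self = self;
    return &wrapper;
}

/* Replacement installed in place of the Swift method: capture self,
   transform the block, then forward to the original implementation. */
static void replacement_async(void *block) {
    void *self = read_self();
    original_async(transform_block(block, self));
}

/* Stand-in for the original implementation, used only to exercise the flow. */
static void *last_submitted;
static void fake_original_async(void *block) { last_submitted = block; }
```

Setting `original_async = fake_original_async` and calling `replacement_async` with any pointer shows that the original receives the wrapper rather than the caller's block. In a real hook, the self register must still hold self when the original is invoked, which is worth confirming in the disassembly.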
To make the Swift function transformBlock visible from C code, we define it as follows:
The final Swift implementation is defined using the @_silgen_name attribute:
Conclusion
Fishhook is a powerful library that enables developers to interpose Swift functions at runtime. By tracing the execution of asynchronous functions, we can gain insights into the behavior of our applications and find ways to optimize their performance.
However, interposing Swift methods using fishhook is not a trivial task. We need to consider issues such as mangling, calling conventions, and curried function types. Some symbols can be easily interposed by fishhook on macOS but not on iOS (e.g., DispatchQueue.async(execute: DispatchWorkItem)).
Despite these challenges, interposing Swift functions is a valuable technique. In this article, we have provided a general approach to interposing Swift functions that can be useful in many scenarios.
You can find the source code for our macOS implementation here.
About the author: Vitaly Khudobakhshov leads the development of core technologies at Product Science that identify patterns in mobile device function executions, which is essential for optimizing performance. Previously, he was a team lead at JetBrains and authored a Kotlin shell.
Acknowledgments: Thanks to David Liberman, Ryan Peterson, Gleb Morgachev, and others.