Preface

This article on cooperative task cancellation in Swift builds on the earlier discussion of Swift structured concurrency. If you are not yet familiar with structured concurrency, read up on it first; it is a very powerful model.

Cooperative cancellation

Cancelling tasks is very difficult in a callback-based, unstructured concurrency model. Concurrent tasks may outlive the current scope, and nothing connects them to one another. We have to maintain the relationships between individual tasks ourselves, hold on to the tasks that might be cancelled, and stop them at the right moment. This is feasible in theory, but given the complexity involved, in practice it invites all kinds of bugs.

In structured concurrency, tasks form the parent-child tree hierarchy introduced earlier. When a parent task is cancelled, the cancellation is easily passed down this hierarchy to its child tasks, so a child task can respond to cancellation (for example by cleaning up resources) without holding any reference to its parent.

However, the propagation of cancellation in structured concurrency does not mean that resources which must be released manually are “automatically” reclaimed when a task is cancelled, nor does the task itself stop automatically. Task cancellation in Swift concurrency is cooperative: the parts that make up the task hierarchy, parent and child tasks alike, usually need to work together to achieve the desired result. Propagating the cancellation through the hierarchy is only one part of cooperative cancellation.

A preliminary study on structured task cancellation

Let’s forget about structured concurrency for a moment and look at the simplest top-level task example. For example, in the following task, we append a character to the result every second:

func work() async -> String {
    var s = ""
    for c in "Hello" {
        // Simulate a time-consuming operation
        await Task.sleep(NSEC_PER_SEC)
        print("Append: \(c)")
        s.append(c)
    }
    return s
}

As in the earlier examples, Task.sleep is used to simulate a time-consuming operation. We create a task, run work in it, and cancel the task after some time:

let t = Task {
    let value = await work()
    print(value)
}

// Cancel the task after 2.5 s
await Task.sleep(UInt64(2.5 * Double(NSEC_PER_SEC)))
t.cancel()

At 2.5 s, we call t.cancel() to cancel the task. But when we look at the console output, we can see that t actually runs to the end:

// Output:
// Append: H
// Append: e
// Append: l
// Append: l
// Append: o
// Hello

It did not stop during the third sleep, as we “expected.” So what does cancel actually do? Task.isCancelled reports the cancellation status of the current task, so let’s add it to the output:

// print("Append: \(c)")
print("Append: \(c), cancelled: \(Task.isCancelled)")

// Output:
// Append: H, cancelled: false
// Append: e, cancelled: false
// Append: l, cancelled: true
// Append: l, cancelled: true
// Append: o, cancelled: true
// Hello

When the third sleep ended, the task’s isCancelled was already true, which shows that the cancellation did take effect. The task nevertheless did not stop and still ran to the end.

In fact, calling cancel() on a task in Swift concurrency does only two things:

  1. Set the isCancelled flag for your task to true.

  2. In structured concurrency, if the task has subtasks, cancel the subtasks.

Each child task does the same two things when it is cancelled. In structured concurrency, cancellation is therefore passed to every node below the current task in the task tree:

  1. SubTask 1 and SubTask 2 are child tasks of the Root task. If cancel() is called on SubTask 1, the isCancelled flag of SubTask 1 is set to true.

  2. The cancellation is then passed to all child tasks of SubTask 1, and their isCancelled flags are set to true as well.

  3. The cancellation keeps travelling down the structured task tree until it reaches the last leaf node.

The cancel() call only maintains a Boolean flag, and that is all. Nothing else happens: a task does not stop because it was cancelled, and it does not return early. This is why cancellation in Swift concurrency is called “cooperative cancellation”: tasks need to cooperate in order to actually stop executing. All a parent task has to do on cancellation is set its own isCancelled to true and pass the cancellation on to its child tasks. Once that is done, it is up to each task’s implementation to check isCancelled and respond appropriately. In other words, if nobody checks isCancelled, cooperative cancellation has no effect and the whole task hierarchy appears to have no cancellation support at all. This is why the task ran to the end in our example.
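The propagation described above can be observed directly. The following is a minimal sketch, not code from the original article: the function name observeChildCancellation and the timings are assumptions chosen for illustration. A parent task starts two child tasks in a task group, the parent is cancelled, and each child reports the value of Task.isCancelled that it sees.

```swift
// Hypothetical helper: cancels a parent task and returns the
// isCancelled flag that each of its two child tasks observed.
func observeChildCancellation() async -> [Bool] {
    let parent = Task { () -> [Bool] in
        await withTaskGroup(of: Bool.self, returning: [Bool].self) { group in
            for _ in 0..<2 {
                group.addTask {
                    // Wait long enough for the cancellation to arrive.
                    // The thrown CancellationError is deliberately ignored.
                    try? await Task.sleep(nanoseconds: 2_000_000_000)
                    // Report what this child sees; nobody held a reference to it.
                    return Task.isCancelled
                }
            }
            var flags: [Bool] = []
            for await flag in group {
                flags.append(flag)
            }
            return flags
        }
    }

    // Give the children a moment to start, then cancel only the parent.
    try? await Task.sleep(nanoseconds: 50_000_000)
    parent.cancel()
    return await parent.value
}
```

Calling cancel() on the parent alone is enough: both children report true, even though the calling code never touched them.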

Structured task cancellation scheme

  1. In our implementation of work, we use try Task.checkCancellation() to detect that the task has been cancelled and throw a CancellationError. This code is triggered in Task 1.1 or Task 1.2, and the error is thrown to SubTask 1.
func work(_ text: String) async throws -> String {
    var s = ""
    for c in text {
        if Task.isCancelled {
            print("Cancelled: \(text)")
        }
        try Task.checkCancellation()
        await Task.sleep(NSEC_PER_SEC)
        print("Append: \(c)")
        s.append(c)
    }
    print("Done: \(s)")
    return s
}
  2. This error is not handled in the addTask closure of SubTask 1, so it is thrown further up to the next level, Root.

  3. As the parent task, the outer Root actively cancels all remaining child tasks in the task tree after receiving the error from SubTask 1, and waits for all of them to complete (whether they return normally or throw) before handling the error. In this case Root has only one other child task besides SubTask 1, namely SubTask 2. The isCancelled flag of SubTask 2 is set to true, which triggers the check in work and makes it throw a cancellation error.
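The flow in the steps above can be sketched in a reduced form with a single task group. This is only an illustration under stated assumptions: DownloadFailed, EventLog and runGroup are hypothetical names, and the failing child throws its own error instead of a CancellationError to make the two roles easier to tell apart.

```swift
// Hypothetical error thrown by the failing child task.
struct DownloadFailed: Error {}

// Hypothetical actor that records what happened in the sibling task.
actor EventLog {
    private(set) var entries: [String] = []
    func add(_ entry: String) { entries.append(entry) }
}

func runGroup(log: EventLog) async -> String {
    do {
        try await withThrowingTaskGroup(of: Void.self) { group in
            // Child 1: fails shortly after starting.
            group.addTask {
                try await Task.sleep(nanoseconds: 50_000_000)
                throw DownloadFailed()
            }
            // Child 2: would run for two seconds if left alone.
            group.addTask {
                do {
                    try await Task.sleep(nanoseconds: 2_000_000_000)
                    await log.add("sibling finished")
                } catch {
                    await log.add("sibling cancelled")
                    throw error
                }
            }
            // Rethrows the error of the first failing child. Leaving the
            // closure by throwing cancels the remaining children, and the
            // group waits for them before the error reaches the caller.
            for try await _ in group {}
        }
        return "no error"
    } catch is DownloadFailed {
        return "download failed"
    } catch {
        return "other error"
    }
}
```

With these timings, runGroup returns "download failed" and the log contains only "sibling cancelled": the sibling never reaches its second line because its sleep is interrupted as soon as the group is cancelled.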

Handling task cancellations

Now let’s look at how isCancelled can actually be used to stop the asynchronous work inside a task. Structured concurrency requires asynchronous functions to execute within the scope of a task, so when a task is cancelled there are roughly two options for ending it early:

  1. Return an empty value or a partially computed value early, so that the current task ends normally.

  2. Abort the current task by throwing an error and reporting it to the parent task.

Let’s take a look at each of these.

Return an empty or partial value

When the cancellation of a task does not affect the rest of the process, we can finish the current task by returning an empty or partial value early. We do this by checking Task.isCancelled. For example, rewrite work as:

func work() async -> String {
    var s = ""
    for c in "Hello" {
        // Check the cancellation state and return the partial result
        guard !Task.isCancelled else { return s }
        await Task.sleep(NSEC_PER_SEC)
        print("Append: \(c)")
        s.append(c)
    }
    return s
}

func start() async {
    let t = Task {
        let value = await work()
        print(value)
    }
    await Task.sleep(UInt64(2.5 * Double(NSEC_PER_SEC)))
    t.cancel()
}

Throw an error

If a task’s completion (or its return value) is critical to the concurrent operation, and other tasks can only proceed once it has completed, returning an empty or partial value is no longer a viable option.

For example, we are implementing a framework for image downloading and caching, which basically has three steps:

First we download image data from the Internet

This data is then cached to disk

Finally, the image itself is provided to the caller of the framework

These three tasks (downloading data, caching data, and providing the image) are not equally important. Caching and providing the image both depend on the download task: they only make sense if the data has actually been downloaded in full. Providing the image, however, does not depend on the cache task: even if caching fails, the image can still be created from the downloaded data. So, when designing these tasks, we can choose to return a partial result or nil when the cache task is cancelled, but when the download task is cancelled we can only throw an error to tell the caller of the framework that the task cannot be completed.
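The difference between the two strategies can be made concrete with a small sketch. Everything here is hypothetical and heavily simplified: ImageData, download, cache and loadImage are illustrative names, and the network and disk work is replaced by placeholders.

```swift
// Hypothetical stand-in for downloaded image data.
struct ImageData {
    let bytes: [UInt8]
}

// Critical step: nothing downstream makes sense without the data,
// so a cancelled download throws instead of returning something partial.
func download() async throws -> ImageData {
    try Task.checkCancellation()
    return ImageData(bytes: [1, 2, 3])  // placeholder for real network data
}

// Optional step: a skipped cache write is harmless, so a cancelled
// cache task simply returns early and reports that nothing was stored.
func cache(_ data: ImageData) async -> Bool {
    guard !Task.isCancelled else { return false }
    // A real implementation would write the data to disk here.
    return true
}

// The caller gets the image if the download succeeded, whether or not
// the cache step did its work.
func loadImage() async throws -> ImageData {
    let data = try await download()
    _ = await cache(data)
    return data
}
```

In a cancelled task, cache returns false without raising anything, while download (and therefore loadImage) throws a CancellationError that the caller of the framework has to handle.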

Cancellation support in built-in APIs

This may feel a bit cumbersome: when designing a concurrent system, if we want to respond to cancellation as quickly as possible, we need to add a try Task.checkCancellation() before and after every await. This is not difficult, but it is clearly repetitive boilerplate.

In the examples above, Task.sleep(_:) itself does not support cancellation: it faithfully waits for the set time and then returns control. However, Swift also provides a cancellable version of sleep in the Task API. It takes a labelled parameter, nanoseconds, and is marked throws to set it apart:

extension Task where Success == Never, Failure == Never {
    static func sleep(nanoseconds duration: UInt64) async throws
}

When the task is cancelled, sleep(nanoseconds:) is interrupted immediately and throws a CancellationError. If we rewrite work with this version of sleep, we no longer need to call checkCancellation manually:

func work(_ text: String) async throws -> String {
    var s = ""
    for c in text {
        if Task.isCancelled {
            print("Cancelled: \(text)")
        }
        // try Task.checkCancellation()
        // await Task.sleep(NSEC_PER_SEC)
        try await Task.sleep(nanoseconds: NSEC_PER_SEC)
        s.append(c)
        // ...
    }
    // ...
}
// CancellationError() CancellationError() CancellationError() CancellationError() CancellationError()

Compared with calling sleep(_:) and checking for cancellation before each await, sleep(nanoseconds:) throws its error more promptly: it does not need to wait until the current await has finished. This makes sleep(nanoseconds:) a better way of handling cancellation than the original approach.
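A small sketch can confirm this behaviour. The function name cancellableNap is an assumption made for this example; it starts a task that asks to sleep for five seconds, cancels it right away, and reports whether the sleep was interrupted.

```swift
// Hypothetical helper: returns true if the sleep was interrupted by
// cancellation, false if it ran for the full five seconds.
func cancellableNap() async -> Bool {
    let task = Task { () -> Bool in
        do {
            try await Task.sleep(nanoseconds: 5_000_000_000)
            return false
        } catch {
            // sleep(nanoseconds:) throws as soon as the task is cancelled.
            return true
        }
    }
    task.cancel()
    return await task.value
}
```

The call returns true almost immediately instead of after five seconds, with no explicit checkCancellation anywhere in the task body.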

Cleanup on cancellation

Defer: Ensure that the code is called after leaving scope

func load(url: URL) async throws {
    let started = url.startAccessingSecurityScopedResource()
    if started {
        try Task.checkCancellation()
        await doSomething(url)
        try Task.checkCancellation()
        await doAnotherThing(url)
        // This call may never be executed
        url.stopAccessingSecurityScopedResource()
    }
}

In the synchronous world, to avoid repeating cleanup code on every exit path, we tend to use defer to make sure the code is called when the scope is left. The same technique applies to asynchronous operations. In the code above, we simply add a defer inside the if started block to handle resource cleanup on cancellation:

func load(url: URL) async throws {
    let started = url.startAccessingSecurityScopedResource()
    if started {
        defer {
            url.stopAccessingSecurityScopedResource()
        }
        await doSomething(url)
        try Task.checkCancellation()
        await doAnotherThing(url)
        try Task.checkCancellation()
    }
}

Cancellation Handler

A defer block is triggered only when the asynchronous function returns or throws. If we rely on checkCancellation around each await to detect cancellation, the error is actually thrown later than the moment the task is cancelled: a cancellation that arrives while the asynchronous function is suspended does not cause an immediate throw. The throw happens, and defer runs its cleanup, only the next time checkCancellation is called. In most cases this delay is not a problem, but there are two cases where we might want a more “real-time” way of handling cancellation.

func asyncObserve() async throws -> String {
    let observer = Observer()
    return try await withTaskCancellationHandler {
        observer.start()
        return try await withUnsafeThrowingContinuation { continuation in
            observer.waitForNextValue { value, error in
                if let value = value {
                    continuation.resume(returning: value)
                } else {
                    continuation.resume(throwing: error!)
                }
            }
        }
    } onCancel: {
        // Clean up resources
        observer.stop()
    }
}

Without withTaskCancellationHandler, adding “cancellation” to a wrapped asynchronous operation like this would have to be done by polling: checking Task.isCancelled each time before calling continuation.resume. That makes cancellation untimely, and if no new event ever occurs, the held resource may never be released. In contrast, onCancel gives us a more correct and elegant solution.
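The “no new event ever occurs” case is worth a sketch of its own. The class SilentSource below is hypothetical: it stands in for a callback-based source that never delivers a value, so a polling approach would hang forever. The lock-protected stopped flag is an assumption made here to cover the situation where the task is cancelled before the continuation has been stored.

```swift
import Foundation

// Hypothetical callback-style source that never emits a value on its own.
final class SilentSource: @unchecked Sendable {
    private let lock = NSLock()
    private var continuation: CheckedContinuation<String, Error>?
    private var stopped = false

    // Stores the continuation, or fails it at once if stop() already ran.
    func wait(_ newContinuation: CheckedContinuation<String, Error>) {
        lock.lock()
        let alreadyStopped = stopped
        if !alreadyStopped { continuation = newContinuation }
        lock.unlock()
        if alreadyStopped {
            newContinuation.resume(throwing: CancellationError())
        }
    }

    // Releases the waiter, if any, by resuming it with a cancellation error.
    func stop() {
        lock.lock()
        stopped = true
        let pending = continuation
        continuation = nil
        lock.unlock()
        pending?.resume(throwing: CancellationError())
    }
}

func waitForValue(from source: SilentSource) async throws -> String {
    try await withTaskCancellationHandler {
        try await withCheckedThrowingContinuation { continuation in
            source.wait(continuation)
        }
    } onCancel: {
        // Runs as soon as the task is cancelled, even while it is suspended.
        source.stop()
    }
}
```

A task awaiting waitForValue(from:) is released with a CancellationError as soon as it is cancelled, although the source itself never produces an event.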

Cancellation of asynchronous sequences

The most important part of the asynchronous sequence protocol is the next() async throws function defined by the sequence’s AsyncIterator. This function is already marked throws, so, as with other asynchronous operations, we can check whether the task has been cancelled when implementing next() and throw the appropriate error:

mutating func next() async throws -> Int? {
    try Task.checkCancellation()
    return try await getNextNumber()
}

Thus, when the task that is iterating an asynchronous sequence with for try await is cancelled, an error is thrown and the iteration terminates the next time an element of the sequence is requested.
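Put together, a complete sequence built this way might look like the following sketch. Counter and iterateUntilCancelled are hypothetical names, and the short sleep merely stands in for the work of producing the next element.

```swift
// Hypothetical endless sequence of integers with a cancellation-aware iterator.
struct Counter: AsyncSequence {
    typealias Element = Int

    struct AsyncIterator: AsyncIteratorProtocol {
        var current = 0

        mutating func next() async throws -> Int? {
            // Stop producing elements once the surrounding task is cancelled.
            try Task.checkCancellation()
            try await Task.sleep(nanoseconds: 10_000_000)
            current += 1
            return current
        }
    }

    func makeAsyncIterator() -> AsyncIterator {
        AsyncIterator()
    }
}

// Iterates the endless sequence in a task, cancels it, and reports how
// the loop ended.
func iterateUntilCancelled() async -> String {
    let task = Task { () -> String in
        do {
            for try await _ in Counter() {}
            return "finished"
        } catch is CancellationError {
            return "cancelled"
        } catch {
            return "failed"
        }
    }
    try? await Task.sleep(nanoseconds: 100_000_000)
    task.cancel()
    return await task.value
}
```

Because the sequence never ends on its own, the only way out of the loop is the CancellationError thrown from next(), so the function returns "cancelled".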

Implicit waits and task suspension

Potential suspension points in structured concurrency

When introducing asynchronous functions, we mentioned that await marks a potential suspension point. We need to pay special attention to the fact that the execution context of an asynchronous function, including the cancellation state of the task, may change across an await. Therefore, if we use isCancelled or checkCancellation to detect cancellation, await is a good signpost: checking the cancellation status before and after each await is a simple and reliable habit.

In structured concurrency, however, there are also implicit awaits. As we said earlier, if we do not explicitly wait for the tasks in a task group, they are awaited implicitly before the group’s scope is left:

let t = Task {
    do {
        try await withThrowingTaskGroup(of: Int.self) { group in
            group.addTask { try await work() }
            group.addTask { try await work() }
        }
    } catch {
        print("Error: \(error)")
    }
}
await Task.sleep(NSEC_PER_SEC)
t.cancel()
func work() async throws -> Int {
    try await Task.sleep(nanoseconds: 3 * NSEC_PER_SEC)
    print("Done")
    return 1
}

Running the code above, you will see neither the “Done” output from work nor the “Error” output from the catch block. This is because we never explicitly try await the group. The try await work() calls live only inside addTask, and the errors they throw are passed up to the group, but since we do not explicitly try await the group, the error is not passed on to the outside of withThrowingTaskGroup. It is far from obvious that the implicit wait on leaving scope swallows the error instead of throwing it. Code that appears to cover every branch yet executes none of them is very hard to debug and understand.

A group that is not fully awaited involves far more assumptions and edge cases than one where the await is written out. So I recommend keeping the good habit of explicitly writing out the wait on the group, whether or not we ultimately need the return values of the child tasks. For example, adding a try await before leaving scope makes the catch block work correctly when a cancellation is received:

do {
    try await withThrowingTaskGroup(of: Int.self) { group in
        group.addTask { try await work() }
        group.addTask { try await work() }
        try await group.waitForAll()
    }
} catch {
    print("Error: \(error)")
}
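The difference between the implicit and the explicit wait can be checked side by side. In this sketch the child task throws a hypothetical error, Boom, directly instead of being cancelled from outside, and both function names are assumptions made for illustration. Each function reports whether the error reached its catch block.

```swift
// Hypothetical error thrown by the child task.
struct Boom: Error {}

// Leaves the group scope without awaiting the child explicitly.
func implicitWaitSurfacesError() async -> Bool {
    do {
        try await withThrowingTaskGroup(of: Void.self) { group in
            group.addTask { throw Boom() }
            // No explicit wait: the child is still awaited on exit,
            // but the error it throws is discarded.
        }
        return false
    } catch {
        return true
    }
}

// Awaits the group explicitly before leaving its scope.
func explicitWaitSurfacesError() async -> Bool {
    do {
        try await withThrowingTaskGroup(of: Void.self) { group in
            group.addTask { throw Boom() }
            // The explicit wait rethrows the child's error.
            try await group.waitForAll()
        }
        return false
    } catch {
        return true
    }
}
```

The first function returns false because the error is swallowed by the implicit wait, while the second returns true because waitForAll() rethrows it into the catch block.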

Summary

Cooperative task cancellation is an important part of Swift concurrency. Compared with the traditional concurrency model, the advantages of structured concurrency may not be so obvious on the “normal” path. When it comes to handling errors or cancellation, however, propagating and checking the cancellation flag through the task hierarchy helps us write correct, stable, and efficient complex concurrent code with ease. This would have been unthinkable in the days of traditional concurrency.

However, the new tool of cooperative cancellation also places more responsibility on us. If we implement our own concurrent system, we need to make sure that its asynchronous tasks handle cooperative cancellation correctly. Only then does our system meet the conventions and requirements of Swift concurrency. This is critical when others use the system we created and integrate it into their own concurrent systems.

Throwing an error or handling cancellation in a task means leaving the task context early. This brings us to another important topic: resource cleanup. Structured concurrency guarantees that the lifetime of concurrent operations does not exceed the scope of the function, which makes resource cleanup much more convenient: when we leave a task, we can be sure that no child tasks are still running and still need those resources. Thanks to the nature of asynchronous functions and the way the structured concurrency model defines the lifecycle of asynchronous operations, cleanup code that would otherwise be scattered around can be consolidated in much the same way as in synchronous functions (for example with defer). This further lowers the difficulty of building concurrent systems and reduces the chance of accidentally introducing bugs.

Final note

Most of the content of this article is quoted from miaoshen’s “Swift asynchronous and concurrent programming”. Copyright belongs to the original author; if there is any infringement, please let us know.