By Mike Ash, translator: Grey S; Proofreading: Numbbbbb, Forelax; Finalized: Forelax

Swift 1.2 has now been released as part of Xcode 6.3, and there is a new API that allows us to build efficient data structures using value types, such as Array types in the Swift standard library. Today, we’re going to relive the core functionality of Array.

Value type and reference type

Before we begin, a quick review of value types and reference types. In OBJC and most other object-oriented languages, Pointers or references to objects we use are of reference type. You can assign a reference to an object to a variable:

MyClass *a = ... ;Copy the code

Now you can copy the value of this variable to another variable:

    MyClass *b = a;
Copy the code

Now a and B both refer to the same object. If the object is mutable, a change to one variable will also occur to the other.

A value type is something like an int in Objective-C. With value types, variables contain real values, not references to values. When you assign a value to another variable with =, you assign a copy of the value to another variable. Such as:

    int a = 42;
    int b = a;
    b++;
Copy the code

Now, b is 43, but A is still 42.

In Swift, class is a reference type and struct is a value type. If you assign a reference to a class instance to another variable using =, you will get a new reference to that instance. Changes to this instance are visible to each reference. If you assign a reference to a struct instance to another variable using =, you get a copy of that instance, independent of the original data.

Unlike most languages, arrays and dictionaries in the Swift standard library are value types. Such as:

    var a = [1.2.3]
    var b = a
    b.append(4)
Copy the code

In most languages, when this code (or its equivalent) runs, both a and B will be a reference to the array [1, 2, 3, 4]. But in Swift, A is a reference to the array [1, 2, 3] and B is a reference to the array [1, 2, 3, 4].

The implementation of a value type

If your object has fixed attributes, declaring it as a value type in Swift is easy: just put all the attributes into a struct. For example, if you need a 2D Point value type, you can simply declare a struct containing x and y:

    struct Point {
        var x: Int
        var y: Int
    }
Copy the code

Pretty soon, you declare a value type. But how do you implement a value type like Array? You can’t put all the data in the array into the declaration of the struct, because you can’t predict how much data you’re going to put in the array when you’re writing code. You can create a pointer to all data:

    struct Array<T> {
        var ptr: UnsafeMutablePointer<T>}Copy the code

Also, you need to do something special with that struct each time it is allocated and destroyed.

  • In the process of allocation, you need to make a copy of the contained data into a new memory location, which is newstructYou don’t share the same data with the original data.
  • In the process of destruction,ptrPointers also need to be destroyed normally.

Customizing the allocation and destruction process of structs is not allowed in Swift.

Destruction can be implemented using a class that provides deinit. Pointers can also be destroyed here. Class is not a value type, but we can provide a class as an internal property to a struct and expose an array as a struct to an external interface. It looks something like this:

    class ArrayImpl<T> {
        var ptr: UnsafeMutablePointer<T>

        deinit{ ptr.destroy(...) ptr.dealloc(...) }}struct Array<T> {
        var impl: ArrayImpl<T>}Copy the code

The methods declared in Array are actually performed on the ArrayImpl.

Is this the end of it? Even though we’re using structs, we’re still using reference types. If we make a copy of this struct, we get a new struct that holds the same ArrayImpl as before. Since we cannot customize the struct allocation process, there is no way to copy the ArrayImpl as well.

The solution to this problem is to abandon copying during allocation and instead copy when the ArrayImpl changes. The point is that even if a copy shares a reference with the original data, the semantics of the value type still hold as long as the reference data remains unchanged. Only when the value of this shared data changes does the value type and reference type become distinct.

For example, when implementing the Append method you can copy the ArrayImpl first (assuming the ArrayImpl implementation has a copy method, then change the IMPL reference to the original copy) :

    mutating func append(value: T) {
        impl = impl.copy()
        impl.append(value)
    }
Copy the code

So Array is a value type. Although a and B still share the same IMPL reference just after assignment, any method that changes the IMPL copies it once, thus preserving the illusion of not sharing data.

It works fine now, but it’s very inefficient. Such as:

    var a: [Int] = []
    for i in 0..<1000 {
        a.append(i)
    }
Copy the code

Although the consumer can’t see it, it copies the data in memory for each iteration of the loop, then immediately destroys the previous data memory. How can you optimize it?

isUniquelyReferenced

This is a new API introduced in Swift 1.2. It does what it says on the face of it beautifully. Give it a reference to an object and it will tell you if that reference is independent. Specifically, true is returned when the object has one and only one strong reference.

Our guess is that the API checks the object’s reference count and returns true if the reference count is 1. Why doesn’t Swift just provide an interface to query reference counts? Perhaps implementationally difficult, and reference counting is one of the more easily abused information, Swift provides this encapsulated, more secure interface.

Using this API, the previous implementation of the append method can be changed to copy data in memory only when needed:

    mutating func append(value: T) {
        if! isUniquelyReferencedNonObjc(&impl) { impl = impl.copy() } impl.append(value) }Copy the code

This API is actually one of a set of three methods. Exists in Xcode’s own Swift library:

    func isUniquelyReferenced<T : NonObjectiveCBase>(inout object: T) -> Bool
    func isUniquelyReferencedNonObjC<T>(inout object: T?) -> Bool
    func isUniquelyReferencedNonObjC<T>(inout object: T) -> Bool
Copy the code

These methods work only in pure Swift classes and do not support the @objc type. The first method must ensure that T is a subclass of NonObjectiveCBase. The other two methods do not require the type of the argument, but simply return false if the type is @objc.

I can’t let my code to compile NonObjectiveCBase type, so I use isUniquelyReferencedNonObjC instead. Functionally, there is no difference.

The translator’s note: Article 3.1 isUniquelyReferencedNonObjC API have been elaborated by Swift when is replaced with isKnownUniquelyReferenced details you can refer to Swift – the evolution of this advice

ArrayImpl

Let’s start implementing swift.array, starting with ArrayImpl and then Array.

I’m not going to re-implement the full Array API here, just the basic functionality needed to make it work and show the principles involved.

The ArrayImpl has three properties: a pointer, the total number of array elements, and the remaining capacity in the requested memory space. Only the total number of Pointers and elements is required, but rather than requesting more memory space, monitoring the remaining capacity in real time and requesting it as needed, we can avoid a large and expensive memory reallocation. Here is the beginning of the class:

    class ArrayImpl<T> {
        var space: Int
        var count: Int
        var ptr: UnsafeMutablePointer<T>
Copy the code

You need a count and a pointer in the init method, and then copy what the pointer points to to the new object. Method provides default values 0 and nil, so init can be used to create an instance of an empty array without passing in any arguments:

        init(count: Int = 0, ptr: UnsafeMutablePointer<T> = nil) {
            self.count = count
            self.space = count
            
            self.ptr = UnsafeMutablePointer<T>.alloc(count)
            self.ptr.initializeFrom(ptr, count: count)}Copy the code

The initializeFrom method copies the data to the new pointer. It is important to distinguish between UnsafeMutablePointer being in different assignment modes to ensure that they work properly and avoid crashes. The difference is whether the data memory is treated as initialized or uninitialized. When alloc is called, the resulting pointer is in an uninitialized state and may be filled with junk data. A simple assignment, such as ptr.memory =… , is illegal because the assignment operation will destruct the existing value before copying the new value. If it’s an underlying data type like int it’s fine, but if you’re working on a complex data type it’ll crash. Here initializeFrom treats the target pointer as uninitialized memory, which is exactly what it is.

Next is a modified append method. The first thing it does is check whether Pointers need to be reassigned. If there is no free space available, we need a new block of memory:

        func append(obj: T) {
            if space == count {
                // In the new memory allocation, we will request twice the capacity and the minimum is 16:
                let newSpace = max(space * 2.16)
                let newPtr = UnsafeMutablePointer<T>.alloc(newSpace)
                // Copy data from the old memory to the new address
                newPtr.moveInitializeFrom(ptr, count: count)
                /* This is another assignment that not only treats the destination pointer as uninitialized, but also destroys the data source. It saves us writing separate code to destroy old memory, and is potentially more efficient. As the data move completes, the old pointer can be freed and the new data will be assigned to the class property: */
                ptr.dealloc(space)
                ptr = newPtr
                space = newSpace
            }
            // Now we are sure we have enough space, so we can put the new value at the back of memory and increment the count property:
            (ptr + count).initialize(obj)
            count++
        }
Copy the code

The changed remove method will be more concise because there is no need to reallocate memory. First, he will destroy a value before removing it:

Note: Forget the # sign, which in earlier Versions of Swift denoted the internal and external parameters with the same name

        func remove(# index: Int) {
            (ptr + index).destroy()
            // The moveInitializeFrom method is responsible for moving all elements one position forward after the removed element
            (ptr + index).moveInitializeFrom(ptr + index + 1.count: count - index - 1)
            // Decrement the value of the count attribute to reflect the deletion
            count-}Copy the code

We also need a copy method to ensure that a copy can be made from the data memory when needed. The actual code for copying is in the init method, so we only need to create one instance of the copy:

        func copy(a) -> ArrayImpl<T> {
            return ArrayImpl<T> (count: count, ptr: ptr)
        }
Copy the code

So, we’ve pretty much done everything. We just need to make sure to destroy all the elements in the array and release the pointer after calling the deinit method when it is about to be destroyed itself:

        deinit {
            ptr.destroy(count)
            ptr.dealloc(space)
        }
    }
Copy the code

Let’s move it to Array struct. Its only property is an ArrayImpl.

    struct Array<T> :SequenceType {
        var impl: ArrayImpl<T> = ArrayImpl<T> ()Copy the code

All mutating methods will start by checking whether the IMPL is a separate reference and copying if it is not. Will encapsulate it as a function for other methods to use:

        mutating func ensureUnique(a) {
            if !isUniquelyReferencedNonObjC(&impl) {
                impl = impl.copy()
            }
        }
Copy the code

The append method now calls only ensureUnique and then ArrayImpl’s Append method:

        mutating func append(value: T) {
            ensureUnique()
            impl.append(value)
        }
Copy the code

The remove method is the same:

        mutating func remove(# index: Int) {
            ensureUnique()
            impl.remove(index: index)
        }
Copy the code

The count property is returned directly by ArrayImpl’s:

        var count: Int {
            return impl.count
        }
Copy the code

Subscript operations are accessed directly through the underlying pointer. If we were writing real code, we would need to do a range check here (also in the remove method), but we omit it in this example:

        subscript(index: Int) - >T {
            get {
                return impl.ptr[index]
            }
            mutating set {
                ensureUnique()
                impl.ptr[index] = newValue
            }
        }
Copy the code

Finally, Array follows the SequenceType protocol to support for in loops. It must implement Generator TypeAlias and generate methods. The built-in GeneratorOf type makes it easy to implement. GeneratorOf uses a code block to ensure that it returns the next element in the collection every time it is accessed, or nil when the end is reached, and creates a GeneratorType to encapsulate the code block:

        typealias Generator = GeneratorOf<T>
Copy the code

The generate method increments from 0 until it runs to the end, and then starts returning nil:

        func generate(a) -> Generator {
            var index = 0
            return GeneratorOf<T> ({if index < self.count {
                    return self[index++]
                } else {
                    return nil}}}})Copy the code

That’s all there is to it!

Array

Our Array is a generic struct that complies with the CollectionType protocol:

    struct Array<T> :CollectionType {
Copy the code

The only property it has is a reference to the underlying ArrayImpl:

        private var impl: ArrayImpl<T> = ArrayImpl<T> ()Copy the code

Any method that changes the array must first check that the IMPL is a separate reference and copy it if it is not. This function is encapsulated as a private method for other methods to use:

        private mutating func ensureUnique(a) {
            if !isUniquelyReferencedNonObjC(&impl) {
                impl = impl.copy()
            }
        }
Copy the code

The append method uses ensureUnique and calls append in the IMPL:

        mutating func append(value: T) {
            ensureUnique()
            impl.append(value)
        }
Copy the code

The implementation of remove is basically the same:

        mutating func remove(# index: Int) {
            ensureUnique()
            impl.remove(index: index)
        }
Copy the code

The count property is a computational property that will be called directly from the IMPL:

        var count: Int {
            return impl.count
        }
Copy the code

Subscript operations modify the underlying data store directly through the IMPL. Normally this approach of accessing directly from outside the class is a bad idea, but Array and ArrayImpl are so closely linked that it doesn’t seem so bad. The set part of subscript alters the array, so ensureUnique is needed to preserve the value semantics:

        subscript(index: Int) - >T {
            get {
                return impl.ptr[index]
            }
            mutating set {
                ensureUnique()
                impl.ptr[index] = newValue
            }
        }
Copy the code

The CollectionType protocol requires an Index TypeAlias. For Array, the index type is Int:

    typealias Index = Int
Copy the code

It also needs attributes to provide a starting and ending index. For Array, the starting index is 0, and the ending index is the number of elements in the Array:

        var startIndex: Index {
            return 0
        }
    
        var endIndex: Index {
            return count
        }
Copy the code

The CollectionType protocol contains the SequenceType protocol, which makes objects available for for/in loops. It works by having a sequence provide a generator, which is an object that returns contiguous elements of the sequence. The type of generator is defined by the type of protocol used. GeneratorOf is used in Array, which is a simple wrapper to support the creation of generators using a closure:

        typealias Generator = GeneratorOf<T>
Copy the code

The generate method returns a generator. It uses GeneratorOf and provides a closure to increment subscripts until they reach the end of the array. By declaring an index outside the closure, it is captured in the call and its value persists:

        func generate(a) -> Generator {
            var index = 0
            return GeneratorOf<T> {if index < self.count {
                    return self[index++]
                } else {
                    return nil}}}}Copy the code

This completes the Array implementation.

Complete implementation and test code

Here’s the full implementation, with some additional tests to make sure all of this works, which I posted on GitHub:

Gist.github.com/mikeash/63a…

conclusion

IsUniquelyReferenced, added in Swift 1.2, is a well-received change that allows us to implement a lot of really interesting value types, including copying collections of value types from the standard library.

That’s all for today. Next time we’ll find fun, features, and fun features. If you have a topic you are interested in, please send it to us! Email address: [email protected].

This article is translated by SwiftGG translation team and has been authorized to be translated by the authors. Please visit swift.gg for the latest articles.