The type of Array
Create a new project and write the simplest possible Demo:
var num: Array<Int> = [1, 2, 3]
Let’s look at the Array definition:
@frozen public struct Array<Element> {
  ...
}
Obviously, Array is, by definition, a struct type, which is a value type.
Since Array is a struct, let's print the address of num and look at what is stored there:
var num: Array<Int> = [1, 2, 3]
withUnsafePointer(to: &num) {
    print($0)
}
print("end")
Set a breakpoint on the print call and inspect the memory at the printed address.
The memory contains nothing about the Array's values 1, 2, and 3, only 0x000000010076f400, which looks like an address on the heap. That raises some questions:
- What is the address that Array stores?
- Where did the Array's data go?
- How is copy-on-write implemented for Array?
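One quick way to see that an Array value itself holds nothing but a single reference is to ask MemoryLayout for its size. The following is a minimal sketch; the variable names are illustrative, and the comparison with a pointer's size is an observation about current Swift toolchains rather than a documented guarantee.

```swift
// The Array struct is one pointer wide, no matter how many elements it holds.
let small: [Int] = [1, 2, 3]
let large = [Int](repeating: 0, count: 10_000)

let inlineSize = MemoryLayout<[Int]>.size              // size of the struct itself
let pointerSize = MemoryLayout<UnsafeRawPointer>.size  // size of one pointer

print(inlineSize == pointerSize)
print(MemoryLayout.size(ofValue: small) == MemoryLayout.size(ofValue: large))
```

Both comparisons being true is consistent with the elements living somewhere other than inside the struct.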
Generating the SIL file for Array
We cut the code down to just the declaration of num (the simpler, the clearer) and generate the SIL file to look at:
sil @main : $@convention(c) (Int32, UnsafeMutablePointer<Optional<UnsafeMutablePointer<Int8>>>) -> Int32 {
bb0(%0 : $Int32, %1 : $UnsafeMutablePointer<Optional<UnsafeMutablePointer<Int8>>>):
  alloc_global @main.num : [Swift.Int] // id: %2
  %3 = global_addr @main.num : [Swift.Int] : $*Array<Int> // user: %23
  // Array has 3 elements
  %4 = integer_literal $Builtin.Word, 3 // user: %6
  // Array generation method
  // function_ref _allocateUninitializedArray<A>(_:)
  %5 = function_ref @Swift._allocateUninitializedArray<A>(Builtin.Word) -> ([A], Builtin.RawPointer) : $@convention(thin) <τ_0_0> (Builtin.Word) -> (@owned Array<τ_0_0>, Builtin.RawPointer) // user: %6
  %6 = apply %5<Int>(%4) : $@convention(thin) <τ_0_0> (Builtin.Word) -> (@owned Array<τ_0_0>, Builtin.RawPointer) // users: %8, %7
  %7 = tuple_extract %6 : $(Array<Int>, Builtin.RawPointer), 0 // user: %23
  %8 = tuple_extract %6 : $(Array<Int>, Builtin.RawPointer), 1 // user: %9
  %9 = pointer_to_address %8 : $Builtin.RawPointer to [strict] $*Int // users: %12, %19, %14
  // Literal 1
  %10 = integer_literal $Builtin.Int64, 1 // user: %11
  %11 = struct $Int (%10 : $Builtin.Int64) // user: %12
  // Store 1 to %9
  store %11 to %9 : $*Int // id: %12
  %13 = integer_literal $Builtin.Word, 1 // user: %14
  // %9 offset by 1 step
  %14 = index_addr %9 : $*Int, %13 : $Builtin.Word // user: %17
  // Literal 2
  %15 = integer_literal $Builtin.Int64, 2 // user: %16
  %16 = struct $Int (%15 : $Builtin.Int64) // user: %17
  // Store 2 to %14
  store %16 to %14 : $*Int // id: %17
  %18 = integer_literal $Builtin.Word, 2 // user: %19
  // %9 offset by 2 steps
  %19 = index_addr %9 : $*Int, %18 : $Builtin.Word // user: %22
  // Literal 3
  %20 = integer_literal $Builtin.Int64, 3 // user: %21
  %21 = struct $Int (%20 : $Builtin.Int64) // user: %22
  // Store 3 to %19
  store %21 to %19 : $*Int // id: %22
  store %7 to %3 : $*Array<Int> // id: %23
  %24 = integer_literal $Builtin.Int32, 0 // user: %25
  %25 = struct $Int32 (%24 : $Builtin.Int32) // user: %26
  return %25 : $Int32 // id: %26
} // end sil function 'main'
From the SIL above, num is created by calling _allocateUninitializedArray<A>(_:). That method returns a tuple, %6, and %7 and %8 extract its two elements. %7 is stored to %3, which is the location of num, so the 0x000000010076f400 we saw earlier is the value of %7. %9 is the address that %8 points to, and the values 1, 2, and 3 are stored starting from it. So what exactly are %7 and %8?
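The same allocate-then-store pattern can be written by hand with the public initializer Array(unsafeUninitializedCapacity:initializingWith:). This is only a sketch of the idea, not what the compiler emits: base plays the role of %9, and base + 1 and base + 2 correspond to the index_addr results %14 and %19.

```swift
// Allocate room for 3 Ints, then initialize each slot through a pointer,
// mirroring the store / index_addr sequence in the SIL.
let num = [Int](unsafeUninitializedCapacity: 3) { buffer, initializedCount in
    // `base` points at the first uninitialized element, like %9 in the SIL.
    let base = buffer.baseAddress!
    base.initialize(to: 1)        // store 1 to %9
    (base + 1).initialize(to: 2)  // %9 offset by 1 step
    (base + 2).initialize(to: 3)  // %9 offset by 2 steps
    initializedCount = 3          // tell the array how many slots are valid
}

print(num)  // [1, 2, 3]
```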
Definition of Array in the source code
The SIL file cannot answer that, so we have to look at the source code.
@frozen
public struct Array<Element>: _DestructorSafeContainer {
  #if _runtime(_ObjC)
  @usableFromInline
  internal typealias _Buffer = _ArrayBuffer<Element>
  #else
  @usableFromInline
  internal typealias _Buffer = _ContiguousArrayBuffer<Element>
  #endif

  @usableFromInline
  internal var _buffer: _Buffer

  /// Initialization from an existing buffer does not have "array.init"
  /// semantics because the caller may retain an alias to buffer.
  @inlinable
  internal init(_buffer: _Buffer) {
    self._buffer = _buffer
  }
}
Array really has only one property, _buffer. Under _runtime(_ObjC) its type is _ArrayBuffer; otherwise it is _ContiguousArrayBuffer. Apple platforms are all ObjC-compatible, so here it is _ArrayBuffer.
Next, let's find out what value _buffer is assigned.
_allocateUninitializedArray
Searching the source code for _allocateUninitializedArray, the initialization method we saw in the SIL file, we find the following definition:
@inlinable // FIXME(inline-always)
@inline(__always)
@_semantics("array.uninitialized_intrinsic")
public // COMPILER_INTRINSIC
func _allocateUninitializedArray<Element>(_ builtinCount: Builtin.Word)
    -> (Array<Element>, Builtin.RawPointer) {
  let count = Int(builtinCount)
  if count > 0 {
    // Doing the actual buffer allocation outside of the array.uninitialized
    // semantics function enables stack propagation of the buffer.
    let bufferObject = Builtin.allocWithTailElems_1(
      _ContiguousArrayStorage<Element>.self, builtinCount, Element.self)

    let (array, ptr) = Array<Element>._adoptStorage(bufferObject, count: count)
    return (array, ptr._rawValue)
  }
  // For an empty array no buffer allocation is needed.
  let (array, ptr) = Array<Element>._allocateUninitialized(count)
  return (array, ptr._rawValue)
}
The function takes a different path depending on whether count is greater than 0, but the return type is the same either way. We only want to figure out what the data structure looks like, so one branch is enough. count in my example is 3, so we look at the body of the if statement.
allocWithTailElems_1 is called first. It belongs to Builtin, so its implementation is not easy to read, but we can debug it with a breakpoint.
Stepping in, we land in swift_allocObject:
HeapObject *swift::swift_allocObject(HeapMetadata const *metadata,
                                     size_t requiredSize,
                                     size_t requiredAlignmentMask) {
  CALL_IMPL(swift_allocObject, (metadata, requiredSize, requiredAlignmentMask));
}
At the breakpoint, requiredSize is 56, and running po on the metadata pointer prints _TtGCs23_ContiguousArrayStorageSi_$. So allocWithTailElems_1 allocates a block of heap memory, and the object being allocated is of type _ContiguousArrayStorage.
After space is allocated, the _adoptStorage method is called:
/// Returns an Array of `count` uninitialized elements using the
/// given `storage`, and a pointer to uninitialized memory for the
/// first element.
///
/// - Precondition: `storage is _ContiguousArrayStorage`.
@inlinable
@_semantics("array.uninitialized")
internal static func _adoptStorage(
  _ storage: __owned _ContiguousArrayStorage<Element>, count: Int
) -> (Array, UnsafeMutablePointer<Element>) {

  let innerBuffer = _ContiguousArrayBuffer<Element>(
    count: count,
    storage: storage)

  return (
    Array(
      _buffer: _Buffer(_buffer: innerBuffer, shiftedToStartIndex: 0)),
      innerBuffer.firstElementAddress)
}
The _adoptStorage method returns a tuple whose first element, the Array, wraps innerBuffer. So what is innerBuffer?
The innerBuffer is generated using the _ContiguousArrayBuffer initialization method, so let’s look at the definition of _ContiguousArrayBuffer:
internal struct _ContiguousArrayBuffer<Element>: _ArrayBufferProtocol {

  @inlinable
  internal init(count: Int, storage: _ContiguousArrayStorage<Element>) {
    _storage = storage
    _initStorageHeader(count: count, capacity: count)
  }

  @inlinable
  internal func _initStorageHeader(count: Int, capacity: Int) {
#if _runtime(_ObjC)
    let verbatim = _isBridgedVerbatimToObjectiveC(Element.self)
#else
    let verbatim = false
#endif

    // We can initialize by assignment because _ArrayBody is a trivial type,
    // i.e. contains no references.
    _storage.countAndCapacity = _ArrayBody(
      count: count,
      capacity: capacity,
      elementTypeIsBridgedVerbatim: verbatim)
  }

  @usableFromInline
  internal var _storage: __ContiguousArrayStorageBase
}
_ContiguousArrayBuffer has only one property, _storage. The storage passed to the initializer init(count: Int, storage: _ContiguousArrayStorage<Element>) is a _ContiguousArrayStorage, and __ContiguousArrayStorageBase is the parent class of _ContiguousArrayStorage.
_ContiguousArrayStorage
_ContiguousArrayStorage is a class. Going through its entire inheritance chain, only __ContiguousArrayStorageBase declares a property:
final var countAndCapacity: _ArrayBody
So what is _ArrayBody?
@frozen
@usableFromInline
internal struct _ArrayBody {
  @usableFromInline
  internal var _storage: _SwiftArrayBodyStorage
  ...
}
_ArrayBody is a struct with only one property, _storage.
What is _SwiftArrayBodyStorage?
struct _SwiftArrayBodyStorage {
  __swift_intptr_t count;
  __swift_uintptr_t _capacityAndFlags;
};
count and _capacityAndFlags are both pointer-sized integers, 8 bytes each on a 64-bit platform.
Now we can put together the memory layout of _ContiguousArrayStorage. It is a class, so an instance starts with a metadata pointer. Beyond that, the class has only one property:
final var countAndCapacity: _ArrayBody
And _ArrayBody has only one property:
@usableFromInline
internal var _storage: _SwiftArrayBodyStorage
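Laid out in memory, an instance therefore looks like this: the metadata pointer (8 bytes), the reference counts (8 bytes), count (8 bytes), and _capacityAndFlags (8 bytes).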
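The layout can also be written down as a plain Swift struct and checked with MemoryLayout. StorageHeaderSketch and its field names are hypothetical stand-ins, not standard library types, and the numbers assume a 64-bit platform. The total additionally assumes that the three Int elements sit directly after the header, which the later sections confirm.

```swift
// Hypothetical mirror of the storage header; the field names are illustrative.
struct StorageHeaderSketch {
    var metadata: UInt = 0          // pointer to the class metadata
    var refCounts: UInt = 0         // reference counts of the heap object
    var count: Int = 0              // _SwiftArrayBodyStorage.count
    var capacityAndFlags: UInt = 0  // _SwiftArrayBodyStorage._capacityAndFlags
}

let headerSize = MemoryLayout<StorageHeaderSketch>.size
let countOffset = MemoryLayout<StorageHeaderSketch>.offset(of: \StorageHeaderSketch.count)!
let elementCount = 3
let totalSize = headerSize + elementCount * MemoryLayout<Int>.stride

// On a 64-bit platform this prints: 32 16 56
print(headerSize, countOffset, totalSize)
```

The total of 56 bytes matches the requiredSize observed at the swift_allocObject breakpoint for a three-element Int array.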
_initStorageHeader
Having covered the structure of _ContiguousArrayStorage, we go back to the initializer of _ContiguousArrayBuffer. After assigning _storage, it calls:
_initStorageHeader(count: count, capacity: count)
Let's see what _initStorageHeader does. The key part is:
_storage.countAndCapacity = _ArrayBody(
  count: count,
  capacity: capacity,
  elementTypeIsBridgedVerbatim: verbatim)
The _initStorageHeader method simply assigns the countAndCapacity property of _storage.
Now let’s see how _ArrayBody is initialized:
@inlinable
internal init(
  count: Int, capacity: Int, elementTypeIsBridgedVerbatim: Bool = false
) {
  _internalInvariant(count >= 0)
  _internalInvariant(capacity >= 0)

  _storage = _SwiftArrayBodyStorage(
    count: count,
    _capacityAndFlags:
      (UInt(truncatingIfNeeded: capacity) &<< 1) |
      (elementTypeIsBridgedVerbatim ? 1 : 0))
}
capacity is not stored in memory as is. It is shifted left by 1 bit, and the freed-up lowest bit records the elementTypeIsBridgedVerbatim flag. So when reading capacity from memory we have to shift it back, which is exactly what the _ArrayBody source does:
/// The number of elements that can be stored in this Array without
/// reallocation.
@inlinable
internal var capacity: Int {
  return Int(_capacityAndFlags &>> 1)
}
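To make the encoding concrete, here is a small standalone sketch of the same shift-and-flag scheme. The helper names packCapacityAndFlags, unpackCapacity, and unpackFlag are hypothetical and exist only for this illustration.

```swift
// Hypothetical helpers mirroring the encoding used by _ArrayBody.
func packCapacityAndFlags(capacity: Int, isBridgedVerbatim: Bool) -> UInt {
    // Capacity occupies the upper bits; the lowest bit carries the flag.
    return (UInt(truncatingIfNeeded: capacity) &<< 1) | (isBridgedVerbatim ? 1 : 0)
}

func unpackCapacity(_ raw: UInt) -> Int {
    return Int(raw &>> 1)  // shift back to drop the flag bit
}

func unpackFlag(_ raw: UInt) -> Bool {
    return (raw & 1) == 1
}

let raw = packCapacityAndFlags(capacity: 3, isBridgedVerbatim: true)
print(raw, unpackCapacity(raw), unpackFlag(raw))  // 7 3 true
```

With capacity 3 and the flag clear, the stored word would be 6, which is the kind of value to expect when reading _capacityAndFlags for an Int array in a memory dump.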
_ArrayBuffer
Initialization of the innerBuffer is done, so back to generating the return value:
return (
  Array(
    _buffer: _Buffer(_buffer: innerBuffer, shiftedToStartIndex: 0)),
    innerBuffer.firstElementAddress)
Array(_buffer:) is the plain memberwise initializer shown earlier, so what matters is the initializer of _Buffer, which here is _ArrayBuffer.
I’ll paste the related initialization methods of _ArrayBuffer together for a better look:
@usableFromInline
@frozen
internal struct _ArrayBuffer<Element>: _ArrayBufferProtocol {
  ...
  @usableFromInline
  internal var _storage: _ArrayBridgeStorage
}

extension _ArrayBuffer {
  /// Adopt the storage of `source`.
  @inlinable
  internal init(_buffer source: NativeBuffer, shiftedToStartIndex: Int) {
    _internalInvariant(shiftedToStartIndex == 0, "shiftedToStartIndex must be 0")
    _storage = _ArrayBridgeStorage(native: source._storage)
  }
  ...
}

@usableFromInline
internal typealias _ArrayBridgeStorage
  = _BridgeStorage<__ContiguousArrayStorageBase>

@frozen
@usableFromInline
internal struct _BridgeStorage<NativeClass: AnyObject> {
  @inlinable
  @inline(__always)
  internal init(native: Native) {
    _internalInvariant(_usesNativeSwiftReferenceCounting(NativeClass.self))
    rawValue = Builtin.reinterpretCast(native)
  }
}
In the end there is nothing fancy here: each property is a struct (a value type) that itself has only one property, and each initializer is just an assignment.
The 0 passed as shiftedToStartIndex does not matter for our purposes; it is only checked by an assertion.
To sum up, %7 in the SIL file is an Array whose only property is an _ArrayBuffer struct, and that struct's only property holds the reference to the _ContiguousArrayStorage class instance.
firstElementAddress
Now for %8 in the SIL file, namely innerBuffer.firstElementAddress:
/// A pointer to the first element.
@inlinable
internal var firstElementAddress: UnsafeMutablePointer<Element> {
  return UnsafeMutablePointer(Builtin.projectTailElems(_storage,
                                                       Element.self))
}
This is another Builtin call, so let's look at how projectTailElems is declared among the builtins:
/// projectTailElems : <C,E> (C) -> Builtin.RawPointer
///
/// Projects the first tail-allocated element of type E from a class C.
BUILTIN_SIL_OPERATION(ProjectTailElems, "projectTailElems", Special)
So this operation returns the address of the first tail-allocated element in the memory allocated for _storage. Presumably, then, the array elements are stored right after the contents of _ContiguousArrayStorage.
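Tail allocation is also available through public API: ManagedBuffer allocates one heap object that holds a header followed by its elements. The sketch below uses ManagedBuffer as a stand-in for _ContiguousArrayStorage, which is an assumption made for illustration; the header here simply stores an element count.

```swift
// One heap object: object header, then an Int header value, then the elements.
let buffer = ManagedBuffer<Int, Int>.create(minimumCapacity: 3) { _ in 3 }

// Initialize the tail-allocated elements through the projected pointer.
buffer.withUnsafeMutablePointerToElements { elements in
    elements.initialize(to: 1)
    (elements + 1).initialize(to: 2)
    (elements + 2).initialize(to: 3)
}

// Compare the object's own address with the address of its first element.
let objectAddress = UInt(bitPattern: Unmanaged.passUnretained(buffer).toOpaque())
let elementsAddress = buffer.withUnsafeMutablePointerToElements { UInt(bitPattern: $0) }
let sum = buffer.withUnsafeMutablePointerToElements { $0[0] + $0[1] + $0[2] }

print(elementsAddress > objectAddress, sum)  // true 6
```

The element pointer lands inside the same allocation, after the object's header fields, which is the arrangement projectTailElems is described as exposing.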
Verifying the underlying structure of Array
Same code as above:
var num: Array<Int> = [1, 2, 3]
withUnsafePointer(to: &num) {
    print($0)
}
print("end")
Set a breakpoint at print and output the memory of num: 0x0000000100604b30. This should be the address of the _ContiguousArrayStorage instance, so we go on to output the memory at 0x0000000100604b30.
It matches the layout derived above perfectly. Nice.
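The same check can be attempted in code instead of the debugger. Treat this strictly as an experiment: it assumes a 64-bit platform, assumes the Array struct is bit-for-bit a pointer to its storage object, and hard-codes the offsets 16, 24, and 32 derived above. None of this is public API, and it may break with other toolchains.

```swift
let num: [Int] = [1, 2, 3]

// ASSUMPTION: the Array struct is a single pointer to its storage object.
let storage = unsafeBitCast(num, to: UnsafeRawPointer.self)

// ASSUMPTION: count at +16, _capacityAndFlags at +24, elements from +32.
let count = storage.load(fromByteOffset: 16, as: Int.self)
let capacityAndFlags = storage.load(fromByteOffset: 24, as: UInt.self)
let capacity = Int(capacityAndFlags >> 1)
let elements = (0..<count).map { index in
    storage.load(fromByteOffset: 32 + index * MemoryLayout<Int>.stride, as: Int.self)
}

// Keep the array alive until all raw reads are done.
withExtendedLifetime(num) {}

print(count, capacity, elements)
```

If the assumptions hold, count is 3 and the elements read back as 1, 2, 3, which mirrors what the memory dump shows.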
Copy-on-write in Array
Let's start with what copy-on-write means: a value is really copied only when it needs to be changed, and as long as nothing changes, all copies share a single block of memory. In the Swift standard library, collection types such as Array, Dictionary, and Set are implemented with copy-on-write.
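The sharing can be observed from the outside by comparing the address of the element storage before and after a mutation. This is a minimal sketch; elementAddress is a hypothetical helper written for this example.

```swift
// Hypothetical helper: address of the first element, as an integer.
func elementAddress(_ array: [Int]) -> UInt {
    return array.withUnsafeBufferPointer { UInt(bitPattern: $0.baseAddress) }
}

var num = [1, 2, 3]
let copyNum = num  // no element copy happens here

let sharedBefore = elementAddress(num) == elementAddress(copyNum)

num.append(4)      // the mutation triggers the real copy

let sharedAfter = elementAddress(num) == elementAddress(copyNum)

print(sharedBefore, sharedAfter, copyNum)  // true false [1, 2, 3]
```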
Let's see how the source code does this. Run the following code against the standard library source:
var num: Array<Int> = [1, 2, 3]
var copyNum = num
num.append(4)
Then put a breakpoint on the source’s append method:
@inlinable
@_semantics("array.append_element")
public mutating func append(_ newElement: __owned Element) {
  // Separating uniqueness check and capacity check allows hoisting the
  // uniqueness check out of a loop.
  _makeUniqueAndReserveCapacityIfNotUnique()
  let oldCount = _getCount()
  _reserveCapacityAssumingUniqueBuffer(oldCount: oldCount)
  _appendElementAssumeUniqueAndCapacity(oldCount, newElement: newElement)
}
There are three methods here. Look at the first one, _makeUniqueAndReserveCapacityIfNotUnique. As the name suggests, if the array is not unique, it is made unique and capacity is reserved. So what does unique refer to here?
Let's debug with a breakpoint. The call stack is fairly deep, so I'll copy only the most critical code:
return !getUseSlowRC() && !getIsDeiniting() && getStrongExtraRefCount() == 0;
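These are all reference-count checks, the most important being whether getStrongExtraRefCount, the strong reference count, is 0. The stored value counts the extra references beyond the first, so 0 means there is exactly one reference. If it is not 0, the reference is not unique. Unique here therefore means being the only reference to this memory.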
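The public function isKnownUniquelyReferenced(_:) lets us observe the same kind of uniqueness check on any class instance. In this small sketch, Storage is a hypothetical empty class standing in for the array's heap storage.

```swift
// Hypothetical stand-in for the array's heap storage.
final class Storage {}

var first = Storage()
let uniqueBefore = isKnownUniquelyReferenced(&first)  // only one reference exists

let second = first                                    // a second strong reference
let uniqueAfter = isKnownUniquelyReferenced(&first)

// Keep `second` alive until after the check above.
withExtendedLifetime(second) {}

print(uniqueBefore, uniqueAfter)  // true false
```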
What happens if the reference is not unique?
@inlinable
@_semantics("array.make_mutable")
internal mutating func _makeUniqueAndReserveCapacityIfNotUnique() {
  if _slowPath(!_buffer.isMutableAndUniquelyReferenced()) {
    _createNewBuffer(bufferIsUnique: false,
                     minimumCapacity: count + 1,
                     growForAppend: true)
  }
}

@_alwaysEmitIntoClient
@inline(never)
internal mutating func _createNewBuffer(
  bufferIsUnique: Bool, minimumCapacity: Int, growForAppend: Bool
) {
  let newCapacity = _growArrayCapacity(oldCapacity: _getCapacity(),
                                       minimumCapacity: minimumCapacity,
                                       growForAppend: growForAppend)
  let count = _getCount()
  _internalInvariant(newCapacity >= count)

  let newBuffer = _ContiguousArrayBuffer<Element>(
    _uninitializedCount: count, minimumCapacity: newCapacity)

  if bufferIsUnique {
    _internalInvariant(_buffer.isUniquelyReferenced())

    // As an optimization, if the original buffer is unique, we can just move
    // the elements instead of copying.
    let dest = newBuffer.firstElementAddress
    dest.moveInitialize(from: _buffer.firstElementAddress,
                        count: count)
    _buffer.count = 0
  } else {
    _buffer._copyContents(
      subRange: 0..<count,
      initializing: newBuffer.firstElementAddress)
  }
  _buffer = _Buffer(_buffer: newBuffer, shiftedToStartIndex: 0)
}
We see that the _createNewBuffer method is called and a new buffer is generated in the _createNewBuffer method:
let newBuffer = _ContiguousArrayBuffer<Element>(
  _uninitializedCount: count, minimumCapacity: newCapacity)
This initializer is similar to the one we walked through earlier, so I won't expand on it. In short, it allocates new storage for the array that is being modified.
So the essence of copy-on-write is checking the strong reference count of the _ContiguousArrayStorage:
- Create a new array num. The strong reference count of its _ContiguousArrayStorage is 0.
- Append an element to num at this point. The strong reference count of the _ContiguousArrayStorage is found to be 0, so the reference is unique and the element is added in place.
- Assign num to copyNum. Only the reference to num's _ContiguousArrayStorage is copied to copyNum, so copyNum and num share the same _ContiguousArrayStorage, whose strong reference count is now 1. No new memory is allocated here, which saves a lot of work.
- Append an element to num again. The strong reference count of the _ContiguousArrayStorage is found to be 1, so the reference is not unique. A new _ContiguousArrayStorage is allocated and the contents of the original array are copied into it.
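The steps above can be sketched as a tiny copy-on-write value type. Box and CowInts are hypothetical names, and this is not how the standard library implements Array: the box's payload is itself a ContiguousArray purely to keep the sketch short. The point is the check with isKnownUniquelyReferenced before every mutation.

```swift
// Hypothetical heap storage, playing the role of _ContiguousArrayStorage.
final class Box {
    var values: ContiguousArray<Int>
    init(_ values: ContiguousArray<Int>) { self.values = values }
}

// Hypothetical value type with copy-on-write behavior.
struct CowInts {
    private var box: Box

    init(_ values: [Int]) { box = Box(ContiguousArray(values)) }

    var values: [Int] { return Array(box.values) }
    var storageID: ObjectIdentifier { return ObjectIdentifier(box) }

    mutating func append(_ value: Int) {
        // Not the only reference: allocate new storage and copy first.
        if !isKnownUniquelyReferenced(&box) {
            box = Box(box.values)
        }
        box.values.append(value)
    }
}

var num = CowInts([1, 2, 3])
let idBeforeAppend = num.storageID
num.append(4)                                   // unique: mutated in place
let keptStorage = num.storageID == idBeforeAppend

let copyNum = num                               // shares the same Box
num.append(5)                                   // not unique: copied, then mutated
let split = num.storageID != copyNum.storageID

print(keptStorage, split, num.values, copyNum.values)
// true true [1, 2, 3, 4, 5] [1, 2, 3, 4]
```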
You can check the memory yourself in Xcode to verify this.
Conclusion
Swift's Array is a struct, but its internal implementation still relies on a reference type, and the array's contents are stored on the heap.
Copy-on-write for Swift arrays uses the reference count of that heap storage to decide whether the reference is unique. Only when the array is mutated and the reference is found not to be unique does the real copy take place.
Finally, I also reimplemented the Swift array structure in Swift code; it can be downloaded from GitHub.