Mixing up the length and capacity of a slice

This is a translation of 100 Go Mistakes and How to Avoid Them. Because the translator’s skill is limited, some inaccuracies in the translation are inevitable. For more content, follow the public account "Go School".

Go developers often confuse the length and the capacity of a slice. A solid understanding of these two concepts is essential for handling core slice operations such as initialization, appending elements with append, copying, and slicing. Without it, we may use slices inefficiently or even cause memory leaks.

In Go, a slice is backed by an array: the slice’s data is actually stored in an array. When the backing array is full, the built-in append function takes care of growing it.

A slice has three fields:

A pointer to the backing array.
A length field: the number of elements in the slice.
A capacity field: the number of elements the backing array can hold.
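The three fields above can be pictured as a small struct. This is only a sketch for illustration: the type `sliceHeader`, its field names, and the helper `headerOf` are hypothetical and are not something ordinary Go code declares. Real code reads the length and capacity through the built-in `len` and `cap` functions.

```go
package main

import "fmt"

// sliceHeader is a hypothetical, illustration-only picture of the three
// fields listed above. Ordinary Go code never declares this type.
type sliceHeader struct {
	array    *[6]int // pointer to the backing array
	length   int     // number of elements in the slice
	capacity int     // number of elements the backing array can hold
}

// headerOf builds the illustrative header for a slice over backing[:n].
func headerOf(backing *[6]int, n int) sliceHeader {
	s := backing[:n] // slice over the first n elements of the array
	return sliceHeader{array: backing, length: len(s), capacity: cap(s)}
}

func main() {
	var backing [6]int
	h := headerOf(&backing, 3)
	fmt.Println(h.length, h.capacity) // 3 6
}
```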

Let’s use two examples to illustrate the structure of a slice.

First, we initialize a slice with a given length and capacity:

s := make([]int, 3, 6) ①

① The second argument, 3, is the length; the third argument, 6, is the capacity.

This call creates a backing array that can hold six elements (the capacity). Because the length is set to 3, the slice exposes only the first three elements. Since the slice is of type []int, these three elements hold the zero value of int, which is 0. The remaining three slots are allocated but not yet part of the slice.

If you print this slice, you get the following result: [0 0 0].

If we set s[1] = 1, the second element of the slice is updated without any effect on the slice’s length or capacity.
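The state of the slice before and after this assignment can be checked with `len` and `cap`. The following is a minimal sketch; `describe` is a hypothetical helper introduced only for this illustration.

```go
package main

import "fmt"

// describe formats a slice together with its length and capacity.
func describe(s []int) string {
	return fmt.Sprintf("%v len=%d cap=%d", s, len(s), cap(s))
}

func main() {
	s := make([]int, 3, 6)
	fmt.Println(describe(s)) // [0 0 0] len=3 cap=6

	s[1] = 1 // updates an existing element; length and capacity are unchanged
	fmt.Println(describe(s)) // [0 1 0] len=3 cap=6
}
```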

However, accessing an element beyond the length of the slice is not allowed, even though the memory for it has already been allocated. For example, s[4] = 0 causes a panic:

panic: runtime error: index out of range [4] with length 3
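To observe this panic without crashing the program, one option is to recover from it. This is a sketch for demonstration only; `setAt` is a hypothetical helper, and production code would normally check the index against `len(s)` instead of recovering.

```go
package main

import "fmt"

// setAt writes v at index i and converts an out-of-range panic into an error.
// It is a hypothetical helper used only for this illustration.
func setAt(s []int, i, v int) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	s[i] = v
	return nil
}

func main() {
	s := make([]int, 3, 6)
	fmt.Println(setAt(s, 1, 1)) // <nil>
	fmt.Println(setAt(s, 4, 0)) // recovered: runtime error: index out of range [4] with length 3
}
```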

So how do we use the remaining space in the slice? With the built-in append function:

s = append(s, 2)

This operation appends a new element to the slice s. The element 2 is stored in the first slot of the backing array that was allocated but not yet used.

At this point, the slice length has changed from 3 to 4, meaning that the slice now has 4 elements.
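A small sketch can confirm that this append reuses the existing backing array: the length grows while the capacity and the address of the first element stay the same. The function name `appendWithinCapacity` is hypothetical.

```go
package main

import "fmt"

// appendWithinCapacity appends one element to a slice that still has spare
// capacity and reports whether the backing array stayed the same.
func appendWithinCapacity() (s []int, sameArray bool) {
	s = make([]int, 3, 6)
	s[1] = 1
	first := &s[0] // address of the first element of the backing array
	s = append(s, 2)
	return s, first == &s[0]
}

func main() {
	s, sameArray := appendWithinCapacity()
	fmt.Println(s, len(s), cap(s)) // [0 1 0 2] 4 6
	fmt.Println(sameArray)         // true
}
```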

So what happens if we append three more elements, so that the backing array runs out of space?

s = append(s, 3)
s = append(s, 4)
s = append(s, 5)
fmt.Println(s)

If we run this code, we see that the slice still accommodates all the elements:

[0 1 0 2 3 4 5]

Because an array has a fixed size, the backing array can accept new elements only up to element 4. When we try to append element 5, the array is already full. Go then allocates a new array, here with twice the capacity of the original, copies all the existing elements into it, and inserts element 5. The exact growth factor is an implementation detail and is smaller for large slices.

The slice’s pointer field now points to the new array. What about the original array? If nothing references it anymore, the garbage collector (GC) eventually reclaims it.
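The reallocation can be observed by comparing the address of the first element before and after the append that exceeds the capacity. This is a sketch with a hypothetical function name, `appendPastCapacity`; it checks only that the capacity grew, because the exact new capacity depends on the Go version.

```go
package main

import "fmt"

// appendPastCapacity fills the six-element backing array and then appends
// one more element, which forces append to allocate a new array.
func appendPastCapacity() (s []int, oldCap int, moved bool) {
	s = make([]int, 3, 6)
	s[1] = 1
	s = append(s, 2, 3, 4) // length 6: the backing array is now full
	oldCap = cap(s)
	first := &s[0]
	s = append(s, 5) // no room left, so a larger array is allocated
	return s, oldCap, first != &s[0]
}

func main() {
	s, oldCap, moved := appendPastCapacity()
	fmt.Println(s)               // [0 1 0 2 3 4 5]
	fmt.Println(len(s), oldCap)  // 7 6
	fmt.Println(cap(s) > oldCap) // true
	fmt.Println(moved)           // true
}
```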

Let’s look at the effect of slicing a slice:

s1 := make([]int, 3, 6) ①
s2 := s1[1:3] ②

① A slice of length 3 and capacity 6
② A new slice taken from index 1 (inclusive) to index 3 (exclusive)

First, s1 is created as a slice of length 3 and capacity 6. When s2 is created by slicing s1, the pointer fields of s1 and s2 refer to the same backing array. However, s2 starts at index 1 of that array, so its length and capacity differ from those of s1: its length is 2 and its capacity is 5.
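These numbers are easy to verify. In the sketch below, `sliceOneToThree` is a hypothetical helper that performs the same slicing operation as the text.

```go
package main

import "fmt"

// sliceOneToThree returns s[1:3], the same slicing operation as in the text.
func sliceOneToThree(s []int) []int {
	return s[1:3]
}

func main() {
	s1 := make([]int, 3, 6)
	s2 := sliceOneToThree(s1)

	fmt.Println(len(s1), cap(s1)) // 3 6
	fmt.Println(len(s2), cap(s2)) // 2 5
	fmt.Println(&s1[1] == &s2[0]) // true: both refer to the same array element
}
```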

If we update s1[1] or s2[0], the change is made to the same element of the backing array, so it is visible through both slices.
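A short sketch of this sharing, writing through each slice in turn. The function name `sharedUpdates` and the value 9 are arbitrary choices for the illustration.

```go
package main

import "fmt"

// sharedUpdates writes through both slices and returns their final contents.
func sharedUpdates() (s1, s2 []int) {
	s1 = make([]int, 3, 6)
	s2 = s1[1:3]
	s1[1] = 1 // also visible as s2[0]
	s2[1] = 9 // also visible as s1[2]
	return s1, s2
}

func main() {
	s1, s2 := sharedUpdates()
	fmt.Println(s1) // [0 1 9]
	fmt.Println(s2) // [1 9]
}
```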

So what happens if we now append an element to s2? Does it affect s1?

s2 = append(s2, 2)

This modifies the shared backing array, but only the length of s2 changes.

s1 still has length 3 and capacity 6. So if we print s1 and s2, the added element is visible only through s2:

s1 = [0 1 0], s2 = [1 0 2]
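The sketch below reproduces this result and also reslices s1 up to index 4 to show that the appended value really was written into the shared backing array. The function name `appendToSubslice` is hypothetical.

```go
package main

import "fmt"

// appendToSubslice appends to s2 while it still has spare capacity, so the
// new element is written into the array shared with s1.
func appendToSubslice() (s1, s2 []int) {
	s1 = make([]int, 3, 6)
	s1[1] = 1
	s2 = s1[1:3]
	s2 = append(s2, 2) // fits within s2's capacity of 5
	return s1, s2
}

func main() {
	s1, s2 := appendToSubslice()
	fmt.Println(s1, s2)           // [0 1 0] [1 0 2]
	fmt.Println(len(s1), cap(s1)) // 3 6
	fmt.Println(s1[:4])           // [0 1 0 2]: the element is in the shared array
}
```

Reslicing s1 beyond its length is allowed as long as the upper bound does not exceed its capacity, which is why s1[:4] is valid here.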

Understanding this behavior reduces the likelihood of mistakes when using append.

Finally, what happens if we keep appending to s2 until the backing array is full? Let’s add three more elements so that no space is left:

s2 = append(s2, 3)
s2 = append(s2, 4)
s2 = append(s2, 5) ①

① At this stage, the backing array is already full.

This code leads to the creation of another backing array.

Notice that s1 and s2 now refer to two different arrays. s1 is still a slice of length 3 and capacity 6 with some spare space, so it still references the original array. The newly created array holds a copy of the elements starting at the first element of s2, which is why its first element is 1 rather than 0.
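The split can be demonstrated by writing through s2 after the final append and checking that s1 no longer sees the change. This is a sketch; the function name `appendUntilSplit` and the value 100 are arbitrary.

```go
package main

import "fmt"

// appendUntilSplit appends to s2 until the shared array is full, then once
// more, after which s1 and s2 no longer share a backing array.
func appendUntilSplit() (s1, s2 []int) {
	s1 = make([]int, 3, 6)
	s1[1] = 1
	s2 = s1[1:3]
	s2 = append(s2, 2)
	s2 = append(s2, 3)
	s2 = append(s2, 4) // len(s2) == cap(s2) == 5: the shared array is full
	s2 = append(s2, 5) // allocates a new array for s2
	s2[0] = 100        // no longer visible through s1
	return s1, s2
}

func main() {
	s1, s2 := appendUntilSplit()
	fmt.Println(s1)           // [0 1 0]
	fmt.Println(s2)           // [100 0 2 3 4 5]
	fmt.Println(s1[:cap(s1)]) // [0 1 0 2 3 4]: the original array, untouched by the last append
}
```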

In short, the length of a slice is the number of elements currently available in it, and the capacity is the number of elements in the backing array, counted from the slice’s first element. Appending an element to a full slice (length = capacity) triggers the creation of a new backing array with a larger capacity (double the original for small slices), copies all the elements from the original array into it, and updates the slice’s pointer to refer to the new array.
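To see how the capacity actually grows on a given Go installation, one can record each capacity change while appending. This is a minimal sketch with a hypothetical helper, `capacitySteps`; no expected output is shown because the values depend on the Go version and the element type.

```go
package main

import "fmt"

// capacitySteps appends n elements one at a time and records each distinct
// capacity observed along the way.
func capacitySteps(n int) []int {
	var s []int
	var steps []int
	for i := 0; i < n; i++ {
		s = append(s, i)
		if len(steps) == 0 || steps[len(steps)-1] != cap(s) {
			steps = append(steps, cap(s))
		}
	}
	return steps
}

func main() {
	// The printed capacities depend on the Go version and element type.
	fmt.Println(capacitySteps(2000))
}
```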
