Slicing is the core data structure of the Go language, yet programmers new to Go often stumble over how slicing works and behaves. For example, changes made to a slice in a function that explicitly says it is a reference type sometimes don’t survive, and sometimes they do. The reason for this is that many of us try to guess the behavior of slicing in Go using the thinking of other languages. Slicing is a built-in type that has its own type definition at the bottom of Go, rather than the notion of arrays that we usually understand in other languages.

Translated from Rob Peck’s Go Blog post, this article details how slicing was designed and how it relates and differentiates from arrays, as well as the implementation details of the built-in Append function. Although the length is very long, it is recommended to read the certification, especially about the design of slice and append function implementation of the part, after understanding the “slice head” a lot of slice behavior will naturally be able to understand.

Rob Pike September 26, 2013 original address: blog.golang.org/slices

introduce

One of the most common features of procedural programming languages is the concept of arrays. Arrays may seem simple, but adding an array to a language requires answering a number of questions, such as:

  • Is the array fixed size or variable size?
  • Is the size part of the array type?
  • What does a multidimensional array look like?
  • Does an empty array make sense?

The answers to these questions will affect whether arrays are just a common feature of the language or a core part of its design.

In the early development of Go, it took us about a year to decide on answers to these questions before we felt the design was right. A key step was the introduction of slicing, which is built on fixed-size arrays to provide flexible, extensible data structures. To this day, however, programmers new to Go often stumble over how slicing works, perhaps because experience with other languages has solidified their thinking.

In this article, we will attempt to clear up confusion. We’ll build snippets of knowledge to explain how the Append built-in function works and why it works the way it does.

An array of

Arrays are important building blocks in Go, but like the foundation of a building, they are often hidden under visible components. Before moving on to the more interesting, powerful, and important concepts of slicing, we must talk briefly about arrays.

Arrays are not often seen in Go programs because their size is part of the array type, which limits their expressiveness.

Declare the array as follows

var buffer [256]byte
Copy the code

Declares the array variable buffer, which contains 256 bytes. The type of buffer includes its size, [256] bytes. An array of 512 bytes will have different types of [512] bytes.

The data associated with an array is an array of elements. In principle, our buffer looks something like this in memory,

buffer: byte byte byte ... 256A... byte byte byteCopy the code

That is, the variable holds 256 bytes of data, and that’s it. We can access its elements using the familiar index syntax buffer [0], buffer [1], buffer [255], etc. (The index ranges 0 to 255 cover 256 elements.) Trying to index the array buffer with values outside this range crashes the program.

The number of elements back to the array or slice of the built-in function Len and some other data types. For arrays, it’s pretty obvious what Len will return. In our example, len(buffer) returns a fixed value of 256.

Arrays have their place (they represent transformation matrices nicely, for example), but their most common use in Go is to preserve storage space for slices.

Slices: Slices

Slicing is the place to do things, but to take full advantage of them, developers must understand exactly what they mean and what they do.

Slicing is a data structure that describes a contiguous segment of an array stored separately from the slicing variable itself. Slicing is not an array. Slice describes an array.

Using the array variable buffer given in the section, we can create a slice that describes the 100th through 150th elements of the array (or, to be exact, the 100th through 149th elements) :

var slice []byte = buffer[100:150]
Copy the code

In this code snippet, we use full variable declarations. The variable slice has a “byte slice” of type [] byte and is initialized by slicing the 100th (inclusive) to the 150th (exclusive) element from an array called buffer. The more common syntax is to ignore the type, which is set by the initialization expression:

var slice = buffer[100:150]
Copy the code

Inside functions, we can use short declaration form,

slice := buffer[100:150]
Copy the code

What exactly is a slice variable? Now think of Slice as a small data structure with two elements: a length and a pointer to an array element. You can think of it as being built at the bottom like this:

type sliceHeader struct {
    Length        int
    ZerothElement *byte
}

slice := sliceHeader{
    Length:        50,
    ZerothElement: &buffer[100],
}
Copy the code

Of course, this is just an example for illustrative purposes. Although this snippet illustrates that the sliceHeader structure is invisible to the programmer, and that the type of the element pointer depends on the type of the element, it gives a general idea of the slicing mechanism.

So far, we’ve used slicing on arrays, but we can also slice slices as follows:

slice2 := slice[5:10]
Copy the code

As before, this creates a new slice, in this case using elements 5 through 9 of the original slice, that is, elements 105 through 109 of the original array. The sliceHeader structure underlying the slice2 variable looks like this:

slice2 := sliceHeader{
    Length:        5,
    ZerothElement: &buffer[105],
}
Copy the code

Note that this header still points to the same underlying array stored in the buffer variable.

We can also re-slice, that is, slice the slices and store the results back into the original slice structure. After performing the following slicing operation

slice = slice[5:10]
Copy the code

The sliceHeader structure of the Slice variable looks the same as the slice2 structure. In using Go you will see that re-slicing is often used, such as truncating slices. The following statement deletes the first and last elements of the slice:

slice = slice[1:len(slice)-1]
Copy the code

[Exercise: Write what the sliceHeader structure looks like after the assignment above.]

You’ll often hear seasoned Go programmers talk about “slice headers,” because that’s actually what’s stored in a slice variable. For example, when you call a function that takes a slice as an argument, such as bytes.indexrune, this header is what is passed to the function. In this call

slashPos := bytes.IndexRune(slice, '/')
Copy the code

The slice argument passed to the IndexRune function is actually a “slice header.”

There is another data item in the slicing header that we’ll discuss below, but first let’s look at what the presence of the slicing header means when programming with slicing.

Passing a slice to a function is important to understand that even if a slice contains a pointer, it is a value in itself. Behind the scenes, it is a structure value that contains a pointer and a length. It is not a pointer to a structure.

That’s important.

In the previous example, when we called IndexRune, it passed a copy of the cut header. This behavior has important implications.

Consider the following simple function

func AddOneToEachElement(slice []byte) {
    for i := range slice {
        slice[i]++
    }
}
Copy the code

It does exactly what its name implies, iterating over the index of the slice (using the for range loop), incrementing each element.

Try it:

func main() {
    slice := buffer[10:20]
    for i := 0; i < len(slice); i++ {
        slice[i] = byte(i)
    }
    fmt.Println("before", slice)
    AddOneToEachElement(slice)
    fmt.Println("after", slice)
}
Copy the code

(You can edit and re-execute these runnable snippets if you want to explore.)

Although the slice header is passed by value, the header contains Pointers to the elements of the array, so both the original slice header and the copy of the header passed to the function describe the same array. So, when the function returns, the modified elements are visible through the original slice variable.

The argument to this function is actually a header copy of a slice, as shown in the following example:

func SubtractOneFromLength(slice []byte) []byte {
    slice = slice[0 : len(slice)-1]
    return slice
}

func main() {
    fmt.Println("Before: len(slice) =", len(slice))
    newSlice := SubtractOneFromLength(slice)
    fmt.Println("After: len(slice) =", len(slice))
    fmt.Println("After: len(newSlice) =", len(newSlice))
}
Copy the code

Here we see that the contents of the slice argument can be modified by the function, but its slice header cannot. Calling this function does not modify the length stored in the slice variable because the function is passed a copy of the sliced header (not the original header). Therefore, if we were to write a function that modifies the header, we would have to return it as a result parameter, as we did here. Slice variable stays the same, but returns a value with a new length, which is then stored in newSlice,

Pointer to slice: method receiver

Another way to let a function modify the slice header is to pass a pointer to the slice to the function. Here is a variation of our previous example:

func PtrSubtractOneFromLength(slicePtr *[]byte) {
    slice := *slicePtr
    *slicePtr = slice[0 : len(slice)-1]
}

func main() {
    fmt.Println("Before: len(slice) =", len(slice))
    PtrSubtractOneFromLength(&slice)
    fmt.Println("After: len(slice) =", len(slice))
}
Copy the code

This example may seem awkward, especially with the extra indirection to deal with (implemented using temporary variables), but there is one situation where we often see Pointers to slices: the idiomatic pattern for a method that modifiers slices is to use the pointer to the slice as the recipient of the method.

Suppose we want to have a method on the slice that cuts it off at the last slash. We can write this:

type path []byte

func (p *path) TruncateAtFinalSlash() {
    i := bytes.LastIndex(*p, []byte("/"))
    if i >= 0 {
        *p = (*p)[0:i]
    }
}

func main() {
    pathName := path("/usr/bin/tso") / / converts a string to a path type pathName TruncateAtFinalSlash (FMT). The Printf ("%s.", pathName)
}
Copy the code

If you run this example, you can see that it works and updates slices in the function called.

[Exercise: Change the type of the receiver to a value instead of a pointer, and run it again. Explain what happened.]

On the other hand, if we wanted to write a method for type path that uppercases the ASCII characters in the path, the interface to the method could be a sliced value, since the value receiver would still point to the same underlying array.

type path []byte

func (p path) ToUpper() {
    for i, b := range p {
        if 'a' <= b && b <= 'z' {
            p[i] = b + 'A' - 'a'
        }
    }
}

func main() {
    pathName := path("/usr/bin/tso")
    pathName.ToUpper()
    fmt.Printf("%s.", pathName)
}
Copy the code

Here, the ToUpper method uses two variables for the range loop in to capture index and slice elements. This form of circulation avoids writing p[I] multiple times in the body.

Exercise: Convert the ToUpper method to use the pointer receiver and see if its behavior changes.

Advanced exercise: Convert the ToUpper method to handle Unicode letters, not just ASCII.

capacity

The following function extends its integer slice argument by one element:

func Extend(slice []int, element int) []int {
    n := len(slice)
    slice = slice[0 : n+1]
    slice[n] = element
    return slice
}
Copy the code

(Why does it need to return the modified slice?) Now use Extend and run the following program:

func main() {
    var iBuffer [10]int
    slice := iBuffer[0:0]
    for i := 0; i < 20; i++ {
        slice = Extend(slice, i)
        fmt.Println(slice)
    }
}
Copy the code

See how the slices grow until… It doesn’t grow.

Now it’s time to discuss the third component of the slice header: capacity. In addition to the array pointer and length, the slice header stores its slice capacity:

type sliceHeader struct {
    Length        int
    Capacity      int
    ZerothElement *byte
}
Copy the code

The Capacity field records how much space the underlying array actually has; It’s the maximum Length you can get. Trying to make a slice exceed its capacity will exceed the limit of the underlying array of slices, which can cause panic.

After our sample slice is created with the following statement,

slice := iBuffer[0:0]
Copy the code

Its cutting head would look like this:

slice := sliceHeader{
    Length:        0,
    Capacity:      10,
    ZerothElement: &iBuffer[0],
}
Copy the code

The Capacity field is equal to the length of the underlying array minus the index of the array element to which the first element of the slice points (in this case, the index of the array element corresponding to the first element of the slice is 0). To query the size of a slice, use the built-in cap function:

if cap(slice) == len(slice) {
    fmt.Println("slice is full!")}Copy the code

Make function

What if we want to expand the slice beyond its capacity? Actually you can’t! By definition, capacity is the limit of slice growth. However, you can get the equivalent result by allocating a new array, copying data into the new array, and modifying slices to describe the new array.

Let’s start with distribution. We can use the new built-in function to allocate a larger array and then slice the result, but using the make built-in function is simpler. It allocates a new array and creates a cut header to describe it. The make function takes three arguments: the type of the slice, the initial length, and the capacity, which is the length of the array that make allocates to hold the slice data. The following call to make creates a slice of length 10 with a margin of 5 (15-10) for the underlying array, which can be seen by running it:

slice := make([]int, 10, 15)
fmt.Printf("len: %d, cap: %d.", len(slice), cap(slice))
Copy the code

The following code snippet doubles the size of our int slice, but stays the same length:

slice := make([]int, 10, 15)
fmt.Printf("len: %d, cap: %d.", len(slice), cap(slice))
newSlice := make([]int, len(slice), 2*cap(slice))
for i := range slice {
    newSlice[i] = slice[i]
}
slice = newSlice
fmt.Printf("len: %d, cap: %d.", len(slice), cap(slice))
Copy the code

After running the above code, Slice has more space to expand before it needs to allocate the new underlying array again.

When slices are created, the length and capacity are usually the same. The built-in make supports a shorthand for this common case. The length parameter value defaults to capacity, so you can omit capacity and set them to the same value when using the make function. Something like this:

gophers := make([]Gopher, 10)
Copy the code

Both the length and capacity of gophers slices are set to 10.

The Copy function

When we doubled the size of the slice in the previous section, we wrote a loop to copy the old data to the new slice. Go has a built-in function copy to simplify this operation. It takes two slices that copy data from the right parameter to the left parameter. Let’s rewrite the example from the previous section using the copy function:

newSlice := make([]int, len(slice), 2*cap(slice))
copy(newSlice, slice)
Copy the code

Copy is smart. It only copies what it can copy, focusing on the length of the two arguments. In other words, the number of elements it copies is the minimum of the two slice lengths. This saves some recording operations. Similarly, copy returns an integer value, the number of elements it copies, although this return value is not always worth checking in the program.

The copy function also works correctly when source slices and target slices overlap, which means it can be used to move elements within a single slice. Here is how to insert values into the middle of a slice using copy.

// The Insert function inserts the value into the index position specified by the slice. // The Insert position must be within the range. // Slices must leave space for new elements. Func Insert(slice []int, index, value int) []int {// Add an element to the slice. Slice = slice[0: len(slice)+1] // Use copy to move the upper part of the slice away, leaving a space. Copy (slice[index+1:], slice[index:]) // Insert a new value. Slice [index] = value // Returns the result.return slice
}
Copy the code

There are two things to notice in this function. First, it must return the updated slice because its length has changed. Second, it uses a shorthand slice expression

slice[i:]
Copy the code

The effect is exactly the same as the following expression

slice[i:len(slice)]
Copy the code

Also, although we haven’t used this technique yet, we can omit the first element of the slice expression, which defaults to zero.

The expression above slice[:] represents the slice itself, which is useful when slicing (verb) arrays. The following expression is the quickest way to describe a slice of all the elements of an array:

array[:]
Copy the code

Now, let’s run Insert.

Slice := make([]int, 10, 20) // Note that size > length represents the space to add elements. for i := range slice { slice[i] = i } fmt.Println(slice) slice = Insert(slice, 5, 99) fmt.Println(slice)Copy the code

Append: An example

In the previous sections, we wrote the Extend function, which extends the slice by one element. However, this function is problematic because it will crash if the size of the slice is too small. (Our Insert sample function has the same problem.) Now that we’ve solved this problem, let’s write a solid implementation of Extend for integer slices.

func Extend(slice []int, element int) []int {
    n := len(slice)
    if n == cap(slice) {// The slice is full and must be expanded // We doubled its size and increased it by 1, so that if the original size was zero, the slice can still be expanded. newSlice := make([]int, len(slice), 2*len(slice)+1) copy(newSlice, slice) slice = newSlice } slice = slice[0 : n+1] slice[n] = elementreturn slice
}
Copy the code

In this function, returning the slice at the end is particularly important, because when it is redistributed, the resulting slice describes a completely different array. The following code snippet demonstrates what happens when the slice fills up:

slice := make([]int, 0, 5)
for i := 0; i < 10; i++ {
    slice = Extend(slice, i)
    fmt.Printf("len=%d cap=%d slice=%v.", len(slice), cap(slice), slice)
    fmt.Println("address of 0th element:", &slice[0])
}
Copy the code

Notice that array reallocation occurs when an array of initial size 5 fills up. When a new array is allocated, the size of the slice changes, as does the address of the 0th element.

With the powerful Extend function as a guide, we can write a better function that allows us to Extend slices over multiple elements. To do this, we use Go’s ability to turn the function argument list into a slice when a function is called. That is, we use Go’s variable function argument function.

Let’s name our new function Append. For the first version, we could call Extend repeatedly so that the mechanics of the mutable function were clear. Append’s function signature looks like this:

func Append(slice []int, items ... int) []intCopy the code

Append takes a slice argument, followed by zero or more int arguments. As far as the Append implementation is concerned, these parameters are just an int slice, as you can see:

// Append appends items to slices // First version: just loop Extend. func Append(slice []int, items ... int) []int {for _, item := range items {
        slice = Extend(slice, item)
    }
    return slice
}
Copy the code

Notice that the for range loop iterates over the elements of the items parameter, which has an implicit type []int. Also note the use of the blank identifier _ to discard indexes in the loop, because we don’t need indexes in this example.

Try it:

slice := []int{0, 1, 2, 3, 4}
fmt.Println(slice)
slice = Append(slice, 5, 6, 7, 8)
fmt.Println(slice)
Copy the code

Another new technique in this example is that we initialize the slice by writing a compound literal that consists of the type of the slice and the elements in parentheses:

slice := []int{0, 1, 2, 3, 4}
Copy the code

Append is also interesting because not only can we Append elements like source slices, but we can also call Append with… Syntax splits slices into arguments to functions. This way we can Append the whole second slice to the source slice using the Append function.

slice1 := []int{0, 1, 2, 3, 4} slice2 := []int{55, 66, 77} fmt.Println(slice1) slice1 = Append(slice1, slice2...) / / '... 'Absolutely! fmt.Println(slice1)Copy the code

Of course, we can increase Append efficiency by assigning no more than one assignment on an internal basis of Extend:

// Append the element to the slice // efficient version. func Append(slice []int, elements ... int) []int { n := len(slice) total := len(slice) + len(elements)if total > cap(slice) {// reallocate. To 1.5 times the new size, so we can still grow. newSize := total*3/2 + 1 newSlice := make([]int, total, newSize) copy(newSlice, slice) slice = newSlice } slice = slice[:total] copy(slice[n:], elements)return slice
}
Copy the code

Notice how we use copy twice here, once to move the sliced data into newly allocated memory, and then copy the add-ons to the end of the old data.

Give it a try; The new snippet behaves the same as before:

slice1 := []int{0, 1, 2, 3, 4} slice2 := []int{55, 66, 77} fmt.Println(slice1) slice1 = Append(slice1, slice2...) / / '... 'Is essential! fmt.Println(slice1)Copy the code

Append: built-in function

Thus, we derive our motivation for designing the built-in functions of Append. It is exactly as efficient as our Append example, but it can be applied to any slice type.

One disadvantage of Go is that any operation of the generic type must be provided by the runtime. That may change one day, but for now, Go comes with a built-in generic function append for easier handling of slicing. It works the same way as our int slice version, but works for any slice type

Remember that since the slice header is always updated by calling Append, you need to save the returned slice after the call. In fact, the compiler won’t let you call Append without saving the result.

Here are some linear programs mixed with print statements. Try it, edit it, and explore the results

// Create two initial slices := []int{1, 2, 3} slice2 := []int{55, 66, 77} FMT.Println("Start slice: Println("Start slice2:", slice2) // Add an element to slice = append(slice, 4) fmt.Println("Add one item:", Slice) // Add one slice to another. slice = append(slice, slice2...) FMT.Println("Add one slice:", slice) // Copy (int) slices. slice3 := append([]int(nil), slice...) FMT.Println("Copy a slice:", slice3) // Copies the slice to the end of itself. fmt.Println("Before append to self:", slice) slice = append(slice, slice...) fmt.Println("After append to self:", slice)Copy the code

It’s worth taking a moment to carefully consider the last code in the example to understand how the design of slicing makes it possible for this simple call to work correctly.

Slice Tricks Wiki page built in the community. There are more append, Copy, and other examples using slicing.

Nil

And by the way, with our new knowledge, we can see what the representation of nil slices is. Naturally, it is the zero value of the slice header:

sliceHeader{
    Length:        0,
    Capacity:      0,
    ZerothElement: nil,
}
Copy the code

Or I could write it this way

sliceHeader{}
Copy the code

The key detail is that the element pointer in the slice header is also nil, whereas the slice created by the following statement

array[0:0]
Copy the code

It has zero length (or even zero capacity), but its pointer is not nil, so it is not a nil slice.

To be clear, empty slices can grow (assuming their capacity is non-zero), but nil slices have no array to put values into, or even grow to hold an element.

That is, a nil slice is functionally equivalent to a zero-length slice, even if it doesn’t point to anything. It has zero length and can be appended to by the append function by allocating a new array. For example, look at the single line program above, which copies slices by attaching to nil slices.

Here’s the program

// Copy (int) slices. slice3 := append([]int(nil), slice...) fmt.Println("Copy a slice:", slice3)
Copy the code

string

Now take a quick look at the strings in Go in the slicing context.

Strings are actually quite simple: they’re just read-only byte slices, and slicing provides some additional syntactic support at the language level.

Because they are read-only, no capacity is required (you can’t increase them), but for the most part, you can treat them like slices of read-only bytes.

First, we can index strings for them to access individual bytes:

slash := "/usr/ken"[0] // Generates byte values'/'
Copy the code

We can slice a string to get substrings:

usr := "/usr/ken"[0:4] // Generates a string"/usr"
Copy the code

Now, it should be easy to understand what’s going on behind the scenes when we cut into strings.

We can also use a normal byte slice to create a string from it with a simple conversion:

str := string(slice)
Copy the code

And vice versa:

slice := []byte(usr)
Copy the code

The array at the bottom of the string is hidden from view; Its contents cannot be accessed except through a string. This means that when we perform any of these transformations, we must copy the array. Of course, Go takes care of this, so you don’t have to. After any of these transformations, changes to the array below the byte slice do not affect the corresponding string.

An important consequence of this slice-like string design is that it is very efficient to create substrings. All you need to do is create a two-word string header. Because strings are read-only, the raw string and the string produced by the slicing operation can safely share the same array.

History: The earliest string implementations were always allocated, but when slicing was added to the language, they provided a valid string handling model. As a result, some benchmarks have been dramatically accelerated.

There’s more to strings, of course, and individual blog posts can delve deeper into them.

conclusion

Understanding how slicing works helps to understand how slicing is implemented. Slicing has a small data structure, the slice header, which is the item associated with the Slice variable and describes a portion of the separately allocated array. When we pass the slice value, the header will be copied, but will always point to the array it points to.

Once you understand how they work, slicing not only becomes easy to use, but also powerful and expressive, especially with the help of the copy and Append built-in functions.

To read more

There are many things to be found in Go between the tubes for slicing. As mentioned earlier, the “Slice Tricks” Wiki page has many examples.

There are many resources available, but the best way to learn slicing is to use slicing.