Slice’s underlying data structure
- Slice, as described in Effective Go, is a wrapper around arrays that provides more general, convenient, and powerful operations on arrays. It is of type struct at runtime and contains three properties:
// SliceHeader is the runtime representation of a slice.
// It cannot be used safely or portably and its representation may
// change in a later release.
// Moreover, the Data field is not sufficient to guarantee the data
// it references will not be garbage collected, so programs must keep
// a separate, correctly typed pointer to the underlying data.
type SliceHeader struct {
Data uintptr
Len int
Cap int
}
Copy the code
Slice Data structure source code where Data is a contiguous memory space used to store elements in a slice, Len refers to the length of the slice at initialization, Cap refers to the capacity of the entire slice
Slice the capacity of
Slice Because the underlying storage is a contiguous memory space, when the slice capacity is insufficient, the underlying capacity will be expanded. Slice expansion is usually triggered by an operation on slice’s append, such as:
var a []int
append(a, 1, 2, 3, 4)
Copy the code
So what’s the underlying logic after Append?
// append converts an OAPPEND node to SSA.
// If inplace is false, it converts the OAPPEND expression n to an ssa.Value,
// adds it to s, and returns the Value.
// If inplace is true, it writes the result of the OAPPEND expression n
// back to the slice being appended to, and returns nil.
// inplace MUST be set to false if the slice can be SSA'd.
func (s *state) append(n *Node, inplace bool) *ssa.Value {
// If inplace is false, process as expression "append(s, e1, e2, e3)":
//
// ptr, len, cap := s
// newlen := len + 3
// if newlen > cap {
// ptr, len, cap = growslice(s, newlen)
// newlen = len + 3 // recalculate to avoid a spill
// }
// // with write barriers, if needed:
// *(ptr+len) = e1
// *(ptr+len+1) = e2
// *(ptr+len+2) = e3
// return makeslice(ptr, newlen, cap)
//
//
// If inplace is true, process as statement "s = append(s, e1, e2, e3)":
//
// a := &s
// ptr, len, cap := s
// newlen := len + 3
// if uint(newlen) > uint(cap) {
// newptr, len, newcap = growslice(ptr, len, cap, newlen)
// vardef(a) // if necessary, advise liveness we are writing a new a
// *a.cap = newcap // write before ptr to avoid a spill
// *a.ptr = newptr // with write barrier
// }
// newlen = len + 3 // recalculate to avoid a spill
// *a.len = newlen
// // with write barriers, if needed:
// *(ptr+len) = e1
// *(ptr+len+1) = e2
// *(ptr+len+2) = e3
et := n.Type.Elem()
pt := types.NewPtr(et)
......
}
Copy the code
As you can see from the pseudo-code commented in the source code, append contains two logic:
- If the memory space is insufficient, expand the capacity
- After capacity expansion, add new data elements to slices
As you can see from the append parameter inplace, the append operation itself can have two cases: the first case is when the old slice is not reassigned after append
var a []int
fmt.Println(append(a, 1, 2, 4))
Copy the code
In the second case, after append, assign back to the original slice, for example:
var a []int
a = append(a, 1, 2, 4)
Copy the code
These two different cases, the underlying implementation is also different. The essential difference between them is that if append is reassigned to the original variable A, the underlying implementation logic of GO will change the pointer of the original A to the new address after the expansion, and the new slice will be directly overwritten, thus avoiding the performance problems caused by copying the old element back to the new element.
Expansion policy for Slice
Slice Different expansion policies are used to expand the slice capacity:
- If the requested capacity is more than twice the current capacity, the requested capacity will be used
- If the current slice length is less than 1024, the current capacity of double is expanded
- If the current slice length is greater than 1024, the
Current capacity * 25%
Incremental capacity expansion until the capacity added is greater than the requested capacity
Slice expansion source code
Concerns with slice usage
Fixa := make([]int, 4, 6) FMT.Printf("fixa: %+v, addr: %p \n", fixa, fixa) // Use subscript to initialize slice to get fixb fixb: = fixa[0:2] fmt.printf ("fixb: %+v, addr: %p \n", fixb, fixb) It is just a slice structure created by Go to point to the original array. // Changing fixb also changes the data in fixa, because they all point to the same memory address. %p \n", fixb, fixb) // Fixb: [111 0], addr: 0xC0000b6000 FMT.Printf(" fixa: % v, addr: %p \n", fixa, fixa) %p \n", fixa, fixa) 0xC0000B6000 // AppEnd data to fixB, does not exceed the capacity size, fixA, FixB still points to the same underlying array fixb = append(fixb, 1, 2) fmt.Printf(" Append after fixb: Printf(" appEnd fixa: %+v, addr: %p \n", fixb, fixb) // Append fixb: // Fixa: [111 0 1 2], addr: 0xC0000B6000 // Fixa: [111 0 1 2], addr: 0xC0000B6000 0xC0000B6000 // Continue to appEnd data in fixB, exceeds capacity fixb = AppEnd (fixb, 3, 4, 5) fmt.Printf(" Append after capacity exceeds Capacity fixb: %+v, addr: %p \n", fixb, fixb) FMT.Printf("fixa: %+v, addr: %p \n", fixa, fixa) 0xc00008c060 // fixa: [111 0 1 2], addr: 0xC0000B6000 // AppEnd exceeds capacity at this point fixb is a new array slice that has been overwritten to allocate memory after growslice, but fixa's address is the same as the original fixb[0] = 222 fmt.printf (" Modified fixb: %+v, fixa: %+v \n", fixb, fixa) // Change the underlying array pointed to by Fixb no longer shares the same memory address with Fixa, so change fixb's value to 111Copy the code
Conclusion:
- Slice := a[0:4] when we slice a new slice B, it still shares the same memory address with A. If we modify the elements of B, we should pay special attention to the elements of A
- For slicing operations, if the cap capacity is not exceeded, no expansion occurs. If expansion occurs after the operation, be aware of the impact between them