What the heart wants, the body wants.

1. What is memory alignment?

A. Imaginary memory

First, consider how we programmers usually picture memory: as one flat array of bytes (char *byte[]), as the picture below shows.

However, the processor does not actually read and write memory one byte at a time. Variables of different data types are read and written in blocks of 2, 4, 8, 16, or even 32 bytes. The size of the block in which a processor accesses memory is called its memory access granularity.

B. Memory in real life

As the figure above shows, a variable of a given type cannot be placed at an arbitrary address. Values of each data type must be laid out in memory according to certain rules so that the processor can access them efficiently. This is called memory alignment.

💡 Note:

1. Memory alignment is the compiler’s responsibility

2. Memory alignment refers to the alignment of the start address of the variable, rather than the size of the memory used by each variable
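
The second point can be checked directly in Go. The following is a minimal sketch, not part of the original article: isAligned is a hypothetical helper, and the check relies on unsafe.Alignof, which is introduced in section 3.

```go
package main

import (
	"fmt"
	"unsafe"
)

// isAligned reports whether addr is a multiple of align.
// It is a hypothetical helper used only for this illustration.
func isAligned(addr, align uintptr) bool {
	return addr%align == 0
}

func main() {
	var x int64
	// Alignment is a property of the variable's start address,
	// not of how many bytes the variable occupies.
	addr := uintptr(unsafe.Pointer(&x))
	fmt.Println(isAligned(addr, unsafe.Alignof(x)))
}
```

Because the compiler places x for us, the check is expected to succeed without any effort from the programmer.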

2. Why memory alignment?

Why do we need memory alignment instead of simply packing values byte by byte? Let's first look at how the CPU handles an unaligned access. When we run a program, its code is loaded from disk into main memory. The CPU then addresses memory over the bus, fetches and decodes the instructions stored there, and passes the decoded operations to the control unit, which runs the program and processes the data.

2.1 Unaligned memory access

As shown below:

Take a processor that accesses memory 4 bytes at a time as an example. If a 4-byte value does not start on a 4-byte boundary, the first access (bytes 0 ~ 3) returns only part of it, half of it in the figure's case. The processor must then make a second access at the next address, discard the bytes it does not need, and splice the two pieces together before the whole value is available.
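
The cost described above can be counted with a little arithmetic. This is a simplified model rather than a description of any real processor: blocksTouched is a hypothetical helper that assumes memory is fetched in fixed blocks of word bytes.

```go
package main

import "fmt"

// blocksTouched returns how many word-sized blocks must be fetched to read
// size bytes starting at addr, assuming memory is accessed in blocks of
// word bytes (a simplified model; size and word must be positive).
func blocksTouched(addr, size, word uint64) uint64 {
	first := addr / word
	last := (addr + size - 1) / word
	return last - first + 1
}

func main() {
	fmt.Println(blocksTouched(0, 4, 4)) // aligned 4-byte value: 1 access
	fmt.Println(blocksTouched(2, 4, 4)) // value straddling two blocks: 2 accesses
}
```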

2.2 Aligned memory access

When data is aligned, a value of a given type can be read in a single access, and there are no unneeded bytes to discard. Alignment trades space for time and improves CPU throughput. As shown below:

2.3 Advantages of memory alignment

1. Atomicity

Almost all modern processors provide atomic instructions, but these generally only work correctly on aligned data. A 32-bit processor, for example, accesses 4 bytes at a time. If a value is not aligned, reading it takes at least two accesses, so the operation can no longer be atomic. Worse, an unaligned value may straddle two pages of virtual memory: the first page may be resident while the second is not, so a page fault can occur in the middle of the instruction and the atomic operation fails.

2. Portability

Not every hardware platform can access arbitrary data at arbitrary addresses. Some platforms only allow certain types of data to be accessed at certain memory locations and raise a hardware exception otherwise.

3. Performance

The CPU reads memory in units of its word size, and the word size depends on the CPU: a 32-bit CPU accesses memory 4 bytes at a time, while a 64-bit CPU accesses it 8 bytes at a time. When a value is unaligned, the CPU may need two accesses to read it, which costs extra clock cycles. When values are aligned, a value no larger than a word can be read in a single access. Alignment therefore reduces the number of memory accesses, trading space for time and increasing throughput.

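3. Memory alignment in Go

That was a lot of theory. As the saying goes, "what is learned on paper always feels shallow; to truly understand something, you must practice it yourself." The listings below show the source code that computes the results of the unsafe functions (the StdSizes type in the go/types package).

3.1 unsafe.Sizeof(x)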

func (s *StdSizes) Sizeof(T Type) int64 {
	switch t := optype(T).(type) {
	// Basic types
	case *Basic:
		assert(isTyped(T))
		k := t.kind
		if int(k) < len(basicSizes) {
			if s := basicSizes[k]; s > 0 {
				return int64(s)
			}
		}
		if k == String {
			// A string is two words: a data pointer and a length.
			return s.WordSize * 2
		}
	// Arrays
	case *Array:
		n := t.len
		if n <= 0 {
			return 0
		}
		// n > 0
		a := s.Alignof(t.elem)
		z := s.Sizeof(t.elem)
		return align(z, a)*(n-1) + z
	// Slices
	case *Slice:
		return s.WordSize * 3
	// Structs
	case *Struct:
		n := t.NumFields()
		if n == 0 {
			return 0
		}
		offsets := s.Offsetsof(t.fields)
		return offsets[n-1] + s.Sizeof(t.fields[n-1].typ)
	case *_Sum:
		panic("Sizeof unimplemented for type sum")
	case *Interface:
		return s.WordSize * 2
	}
	return s.WordSize // catch-all
}

// align returns the smallest y >= x such that y % a == 0.
func align(x, a int64) int64 {
	y := x + a - 1
	return y - y%a
}
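
To see what the align helper in the listing above does, here is a small standalone sketch. The body of align is taken from the listing. arraySize is a hypothetical wrapper that mirrors the array case, and the sample values are only illustrative.

```go
package main

import "fmt"

// align returns the smallest y >= x such that y % a == 0.
func align(x, a int64) int64 {
	y := x + a - 1
	return y - y%a
}

// arraySize mirrors the array case of Sizeof: n-1 elements padded to the
// element alignment a, plus one final element of size z.
func arraySize(z, a, n int64) int64 {
	if n <= 0 {
		return 0
	}
	return align(z, a)*(n-1) + z
}

func main() {
	fmt.Println(align(2, 8))        // 8: offset 2 rounded up to a multiple of 8
	fmt.Println(align(16, 8))       // 16: already a multiple of 8
	fmt.Println(align(18, 8))       // 24
	fmt.Println(arraySize(2, 2, 3)) // 6: the size of a [3]int16
}
```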

An example:

package main

import (
	"fmt"
	"unsafe"
)

func main() {
	fmt.Println(unsafe.Sizeof(bool(true)))         // 1
	fmt.Println(unsafe.Sizeof(float32(1.12)))      // 4
	fmt.Println(unsafe.Sizeof(int8(123)))          // 1
	fmt.Println(unsafe.Sizeof(int16(123)))         // 2
	fmt.Println(unsafe.Sizeof(int32(123)))         // 4
	fmt.Println(unsafe.Sizeof(int64(123)))         // 8
	fmt.Println(unsafe.Sizeof(uint(123)))          // 8
	fmt.Println(unsafe.Sizeof(uintptr(123)))       // 8
	fmt.Println(unsafe.Sizeof(int(1)))             // 8
	fmt.Println(unsafe.Sizeof(printSize()))        // 8 (the size of the returned int)
	fmt.Println(unsafe.Sizeof(string("sakura")))   // 16
	fmt.Println(unsafe.Sizeof([]string{"sakura"})) // 24
}

func printSize() int {
	return 1
}

✅ From the above example, we can summarize the number of bytes for each type

Type                 Size in bytes (64-bit)
bool/uint8/int8      1
int16/uint16         2
int32/uint32         4
float32              4
int64/uint64         8
float64/complex64    8
uint/uintptr/int     8 (4 on 32-bit)
func                 8
string               16
[]T                  24
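
The last rows of the table follow from the word size: Sizeof returns two words for a string and three words for a slice. The sketch below expresses those sizes as multiples of the platform word instead of hard-coding 16 and 24. The helper names are hypothetical, and the size of uintptr is used here as the word size.

```go
package main

import (
	"fmt"
	"unsafe"
)

// wordSize is the size of a uintptr on the current platform.
const wordSize = unsafe.Sizeof(uintptr(0))

// stringWords returns the size of a string header in words.
func stringWords() uintptr {
	var s string
	return unsafe.Sizeof(s) / wordSize
}

// sliceWords returns the size of a slice header in words.
func sliceWords() uintptr {
	var s []int
	return unsafe.Sizeof(s) / wordSize
}

func main() {
	fmt.Println(stringWords()) // 2
	fmt.Println(sliceWords())  // 3
}
```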

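3.2 unsafe.Alignof(x)

func (s *StdSizes) Alignof(T Type) int64 {
	// For arrays and structs, alignment is defined in terms
	// of alignment of the elements and fields, respectively.
	switch t := optype(T).(type) {
	case *Array:
		// unsafe.Alignof(x) is the same as unsafe.Alignof(x[0]).
		return s.Alignof(t.elem)
	case *Struct:
		// unsafe.Alignof(x) is the largest of the values
		// unsafe.Alignof(x.f) for each field f of x, but at least 1.
		max := int64(1)
		for _, f := range t.fields {
			if a := s.Alignof(f.typ); a > max {
				max = a
			}
		}
		return max
	case *Slice, *Interface:
		// Multiword data structures are aligned to the word size.
		return s.WordSize
	case *Basic:
		// Strings are like slices and interfaces.
		if t.Info()&IsString != 0 {
			return s.WordSize
		}
	}
	a := s.Sizeof(T) // may be 0
	// unsafe.Alignof(x) is at least 1.
	if a < 1 {
		return 1
	}
	// complex{64,128} are aligned like [2]float{32,64}.
	if isComplex(T) {
		a /= 2
	}
	if a > s.MaxAlign {
		return s.MaxAlign
	}
	return a
}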

An example:

package main

import (
	"fmt"
	"unsafe"
)

func main() {
	s := []string{"123"}
	s1 := "123"
	s2 := []string{"1", "2", "3"}
	fmt.Println(unsafe.Alignof(s))  // 8
	fmt.Println(unsafe.Alignof(s1)) // 8
	fmt.Println(unsafe.Alignof(s2)) // 8
	fmt.Println("-------------------------")
	fmt.Println(unsafe.Alignof(int8(1)))  // 1
	fmt.Println(unsafe.Alignof(int16(1))) // 2
	fmt.Println(unsafe.Alignof(int32(1))) // 4
	fmt.Println(unsafe.Alignof(int64(1))) // 8
	fmt.Println(unsafe.Alignof(int(1)))   // 8
	fmt.Println("-------------------------")
	fmt.Println(unsafe.Alignof(uint8(1)))  // 1
	fmt.Println(unsafe.Alignof(uint16(1))) // 2
	fmt.Println(unsafe.Alignof(uint32(1))) // 4
	fmt.Println(unsafe.Alignof(uint64(1))) // 8
	fmt.Println(unsafe.Alignof(uint(1)))   // 8
}

3.3 Memory alignment coefficient

From the sync/atomic package documentation:

On x86-32, the 64-bit functions use instructions unavailable before the Pentium MMX.

On non-Linux ARM, the 64-bit functions use instructions unavailable before the ARMv6k core.

On both ARM and x86-32, it is the caller’s responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically. The first word in a variable or in an allocated struct, array, or slice can be relied upon to be 64-bit aligned.

Atomic operations on a 64-bit word require that the address of the 64-bit word be 8-byte aligned. This is not a problem with the 64-bit architectures currently supported by standard compilers, which guarantee that the address of any 64-bit word is 8-byte aligned on 64-bit architectures.

However, on 32-bit architectures, the standard compiler only guarantees 4-byte alignment for 64-bit words. A 64-bit atomic operation on a 64-bit word that is not 8-byte aligned will panic at run time. To make matters worse, some very old architectures do not support the basic instructions required for 64-bit atomic operations.

3.4 Alignment Rules

3.4.1 Member Alignment Rules

For the data members of a struct: the first member is placed at offset 0, and the offset of every following member must be a multiple of the smaller of its own size and the alignment parameter.

A qualified Go compiler must ensure that:

  1. For a variable x of any type, unsafe.Alignof(x) is at least 1.
  2. For a variable x of struct type, unsafe.Alignof(x) is the largest of the values unsafe.Alignof(x.f) for each field f of x, but at least 1.
  3. For a variable x of array type, unsafe.Alignof(x) is the same as the alignment of a variable of the array's element type.

unsafe.Alignof() is the function that returns the alignment coefficient.
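
The three guarantees above can be checked with a few lines of Go. This is only an illustrative sketch: the pair type and the helper function names are hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

// pair is a hypothetical struct with fields of different alignment.
type pair struct {
	a int8
	b int32
}

// minRuleHolds checks rule 1 on an empty struct, whose size is 0.
func minRuleHolds() bool {
	var e struct{}
	return unsafe.Alignof(e) >= 1
}

// structRuleHolds checks rule 2: the struct's alignment equals the
// largest alignment among its fields.
func structRuleHolds() bool {
	var p pair
	largest := unsafe.Alignof(p.a)
	if b := unsafe.Alignof(p.b); b > largest {
		largest = b
	}
	return unsafe.Alignof(p) == largest
}

// arrayRuleHolds checks rule 3: an array is aligned like its element type.
func arrayRuleHolds() bool {
	var arr [3]int16
	var elem int16
	return unsafe.Alignof(arr) == unsafe.Alignof(elem)
}

func main() {
	fmt.Println(minRuleHolds(), structRuleHolds(), arrayRuleHolds()) // true true true
}
```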

3.4.2 Overall alignment rules

After each member has been aligned individually, the struct as a whole is aligned too: its total size must be a multiple of the smaller of the alignment parameter and the largest alignment coefficient among its members.

An example:

package main

import (
	"fmt"
	"unsafe"
)

func main() {
    a := Animal{123, 123, 18}
    
    fmt.Println("the first offset:", unsafe.Offsetof(a.id))
    fmt.Println("the first size:", unsafe.Sizeof(a.id))
    
    fmt.Println("the second offset:", unsafe.Offsetof(a.num))
    fmt.Println("the second size:", unsafe.Sizeof(a.num))
    
    fmt.Println("the third offset:", unsafe.Offsetof(a.age))
    fmt.Println("the third size:", unsafe.Sizeof(a.age))
    fmt.Println("the total size:", unsafe.Sizeof(a))
}

type Animal struct {
	id uint16
	num uint64
	age uint16
}

console:
the first offset: 0
the first size: 2
the second offset: 8
the second size: 8
the third offset: 16
the third size: 2
the total size: 24

Using unsafe.Sizeof(x) and unsafe.Offsetof(x), we can see that the three fields have sizes 2, 8, and 2, in declaration order. The second field, num, has offset 8 rather than 2: its alignment coefficient is 8 and 2 is not divisible by 8, so 6 padding bytes are left in front of it. The third field, age, therefore starts at offset 16, and the fields end at byte 18. Since the total size must be an integer multiple of the struct's largest alignment coefficient (8), another 6 padding bytes are added at the end, so the struct occupies 24 bytes.
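
The calculation just described can be written down as a small function. The sketch below is a simplified model of the two alignment rules, not the compiler's actual implementation: the field type, alignUp, and layout are hypothetical names, and zero-size fields are not handled.

```go
package main

import "fmt"

// field describes a struct field by its size and alignment coefficient.
type field struct {
	size, align int64
}

// alignUp rounds x up to the next multiple of a.
func alignUp(x, a int64) int64 {
	return (x + a - 1) / a * a
}

// layout applies the member rule to every field in order and then the
// overall rule to the total size.
func layout(fields []field) (offsets []int64, total int64) {
	var off int64
	maxAlign := int64(1)
	for _, f := range fields {
		off = alignUp(off, f.align) // member rule: pad up to the field's alignment
		offsets = append(offsets, off)
		off += f.size
		if f.align > maxAlign {
			maxAlign = f.align
		}
	}
	// overall rule: round the size up to the largest alignment
	return offsets, alignUp(off, maxAlign)
}

func main() {
	// Animal on a 64-bit platform: id uint16, num uint64, age uint16
	offsets, total := layout([]field{{2, 2}, {8, 8}, {2, 2}})
	fmt.Println(offsets, total) // [0 8 16] 24
}
```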

3.5 Struct Memory Alignment

Here we can use an example to see how structs are memory-aligned.

package main

import (
	"fmt"
	"unsafe"
)

func main() {
	a := Animal{123, 123, 18}
	fmt.Println("the total size:", unsafe.Sizeof(a))
	fmt.Println("----------- the optimized order ---------")
	d := Dog{123, 123, 18}
	fmt.Println(unsafe.Sizeof(d))
	fmt.Println("id's size:", unsafe.Sizeof(d.id))
	fmt.Println("age's offset:", unsafe.Offsetof(d.age))
	fmt.Println("age size:", unsafe.Sizeof(d.age))
	fmt.Println("num's offset:", unsafe.Offsetof(d.num))
	fmt.Println("num's size:", unsafe.Sizeof(d.num))
}

type Animal struct {
	id uint16
	num uint64
	age uint16
}

type Dog struct {
	id uint16
	age uint16
	num uint64
}

// console:
the total size: 24
----------- the optimized order ---------
16
id's size: 2
age's offset: 2
age size: 2
num's offset: 8
num's size: 8

Compared with the example in 3.4.2, we can see that struct layout follows the member and overall alignment rules described above. This example also shows that simply changing the order of the fields in a struct can reduce its memory usage.

Let's walk through the optimized example (all output here comes from a 64-bit build):

1. The id field is a uint16 and occupies 2 bytes, starting at offset 0.

2. As unsafe.Offsetof(d.age) shows, the age field starts at offset 2 and occupies 2 bytes.

3. The offset of num is printed as 8 because its alignment coefficient is 8: its start address must be divisible by 8, so 4 padding bytes (offsets 4 to 7) are left first and num occupies the 8 bytes starting at offset 8.

4. By the overall alignment rule, the total size must be an integer multiple of the largest alignment coefficient in the struct. The current total is 16, which is divisible by 8, so no trailing padding is needed.

Each field has its own alignment coefficient, and whenever the next free offset is not a multiple of that coefficient, padding bytes are left to satisfy alignment. A sensible field order is therefore important in memory-constrained programs.
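
One common heuristic for choosing a field order is to sort the fields by descending alignment. The Go compiler does not reorder fields for you, so this has to be done by hand in the type declaration. The sketch below only models the effect on the size. All names in it are hypothetical, and it is one possible approach rather than the only valid ordering.

```go
package main

import (
	"fmt"
	"sort"
)

// field describes a struct field by name, size, and alignment coefficient.
type field struct {
	name        string
	size, align int64
}

// alignUp rounds x up to the next multiple of a.
func alignUp(x, a int64) int64 {
	return (x + a - 1) / a * a
}

// structSize applies the member and overall alignment rules in field order.
func structSize(fields []field) int64 {
	var off int64
	maxAlign := int64(1)
	for _, f := range fields {
		off = alignUp(off, f.align) + f.size
		if f.align > maxAlign {
			maxAlign = f.align
		}
	}
	return alignUp(off, maxAlign)
}

// byAlignDesc returns a copy of fields sorted by descending alignment.
func byAlignDesc(fields []field) []field {
	out := append([]field(nil), fields...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].align > out[j].align })
	return out
}

func main() {
	animal := []field{{"id", 2, 2}, {"num", 8, 8}, {"age", 2, 2}}
	fmt.Println(structSize(animal))              // 24
	fmt.Println(structSize(byAlignDesc(animal))) // 16
}
```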

3.6 Memory alignment of empty structs

3.6.1 Does an empty struct occupy memory space?

package main

import (
	"fmt"
	"unsafe"
)

func main() {
	fmt.Println("size:", unsafe.Sizeof(name{}))
}

type name struct {
}

console:
size: 0

It turns out that an empty struct{} value does not take up any memory. So what is an empty struct good for?
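
The same holds for arrays of empty structs, which follows from the array case of Sizeof shown in 3.1 (the element size is 0). A minimal sketch, with hypothetical helper names:

```go
package main

import (
	"fmt"
	"unsafe"
)

// emptySize returns the size of a single empty struct value.
func emptySize() uintptr {
	return unsafe.Sizeof(struct{}{})
}

// emptyArraySize returns the size of an array of 1000 empty structs.
func emptyArraySize() uintptr {
	return unsafe.Sizeof([1000]struct{}{})
}

func main() {
	fmt.Println(emptySize())      // 0
	fmt.Println(emptyArraySize()) // 0
}
```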

3.6.2 Uses of the empty struct

Because it occupies no memory, the empty struct has two main uses: first, saving memory; second, serving as a placeholder whose meaning is clear from the type itself. Here is an example of a type that carries only methods and no data:

type User struct {}

func (u User) walk() {
    fmt.Println("walk on the park.")
}
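
Another widely used placeholder idiom, which the original example does not show, is a chan struct{} used purely as a signal: the element carries no data, so only the act of closing the channel matters. The sketch below assumes a hypothetical runWorker helper.

```go
package main

import "fmt"

// runWorker runs work in a goroutine and returns a channel that is closed
// when the work has finished. The channel's element type is struct{}
// because the signal carries no data.
func runWorker(work func()) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		work()
	}()
	return done
}

func main() {
	result := 0
	done := runWorker(func() { result = 42 })
	<-done              // wait for the signal
	fmt.Println(result) // 42
}
```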

Using the empty struct{} to emulate a Set:

package main

import (
	"fmt"
)

type Set map[string]struct{}

// Contains reports whether the set contains key.
func (s Set) Contains(key string) bool {
	_, ok := s[key]
	return ok
}

func (s Set) Add(key string) {
	s[key] = struct{}{}
}

func (s Set) Size() int {
	return len(s)
}

func (s Set) Delete(key string) {
	delete(s, key)
}

func main() {
	s := make(Set)
	s.Add("aaaa")
	s.Add("bbbb")
	fmt.Println(s.Size())
	fmt.Println(s.Contains("aaaa"))
	// Empty the set by replacing it with a new map.
	s = make(Set)
	fmt.Println(s.Contains("bbbb"))
	fmt.Println(s.Size())
}

3.6.3 Alignment rules for empty struct fields

package main

import (
	"fmt"
	"unsafe"
)

type Dog struct {
	feature struct{}
	age int16
	name string
}

type Cat struct {
	age int16
	feature struct{}
	name string
}

type Pig struct {
	age int16
	name string
	feature struct{}
}

type Duck struct {
	name string
	age int16
	feature struct{}
}

func main() {
	fmt.Println(unsafe.Sizeof(Dog{}))  // 24 = (0 + 2 + 6) + 16
	fmt.Println(unsafe.Sizeof(Cat{}))  // 24 = (2 + 0 + 6) + 16
	fmt.Println(unsafe.Sizeof(Pig{}))  // 32 = (2 + 6) + 16 + 8
	fmt.Println(unsafe.Sizeof(Duck{})) // 24 = 16 + (2 + 6)
}

When an empty struct is the last field of a struct that also has other fields, the compiler adds padding after it. The total size is then rounded up to an integer multiple of the struct's alignment coefficient, following the same overall alignment rule as before. If the padding fits into space that would have been padding anyway (as in Duck), the size does not grow; otherwise (as in Pig) it grows by one alignment unit.
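
A smaller pair of structs makes the effect easier to see. The sketch below assumes the standard gc toolchain, and the results may differ with other compilers. The tail and head types and the helper functions are hypothetical names for this illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// tail ends with an empty struct field, so padding is added after n.
type tail struct {
	n int32
	e struct{}
}

// head starts with the empty struct field, so no padding is needed.
type head struct {
	e struct{}
	n int32
}

func tailSize() uintptr { return unsafe.Sizeof(tail{}) }
func headSize() uintptr { return unsafe.Sizeof(head{}) }

func main() {
	fmt.Println(tailSize()) // 8 with the gc toolchain: 4 + padding, rounded up to a multiple of 4
	fmt.Println(headSize()) // 4
}
```

Placing zero-size fields first (or anywhere except last) therefore avoids the extra padding.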