Is your memory aligned?

Memory alignment has been a familiar topic since the early days of Java. Java 8 even provides the @Contended annotation to help avoid false sharing caused by data that is not aligned to cache lines. Go currently has similar issues: for example, the memory alignment required by atomic operations has to be handled manually, as Russ Cox stated:

On both ARM and x86-32, it is the caller’s responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically.

Some languages handle alignment issues for the developer automatically, such as Rust, which has recently become popular; Microsoft has also announced that it will gradually move from C/C++ to Rust to build its infrastructure software. Well… learning it is a goal I am setting for this year.

What is memory alignment?

Memory alignment, as I understand it, falls into three general categories:

Alignment of basic types, that is, memory address alignment. The alignment coefficient of a basic type may differ between CPU architectures.
Alignment of cache lines between the CPU and memory, which is the false sharing problem. If you have not come across it, you probably have not read the previous article, “Hands on Go: An In-depth Look at Sync.pool.”
Alignment to page sizes for network or disk operations at the operating system level.

Feel free to push back if you disagree. Today we focus on the first category of memory alignment, starting with Wikipedia’s definition:

A memory address a is said to be n-byte aligned when a is a multiple of n bytes (where n is a power of 2). In this context, a byte is the smallest unit of memory access, i.e. each memory address specifies a different byte. An n-byte aligned address would have a minimum of log2(n) least-significant zeros when expressed in binary. From Wikipedia

In other words, memory alignment as defined by Wikipedia describes how data is laid out and accessed in memory after compilation. When a memory address a is a multiple of n bytes (where n is a power of 2), a is said to be n-byte aligned. Here a byte is the smallest unit of memory access, so each memory address identifies a different byte. Written in binary, an n-byte aligned address has at least log2(n) trailing zeros.
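The definition can be checked with a few lines of Go. This is only a sketch: isAligned is a helper name made up for this illustration, and it assumes n is a power of two, as the definition requires.

```go
package main

import (
	"fmt"
	"math/bits"
)

// isAligned reports whether addr is a multiple of n.
// n is assumed to be a power of two, as in the definition above.
func isAligned(addr, n uintptr) bool {
	return addr&(n-1) == 0
}

func main() {
	for _, addr := range []uintptr{0x1000, 0x1004, 0x1006} {
		// An n-byte aligned address has at least log2(n) trailing zero bits.
		fmt.Printf("%#x: 8-byte aligned=%v, trailing zero bits=%d\n",
			addr, isAligned(addr, 8), bits.TrailingZeros64(uint64(addr)))
	}
}
```

Masking with n-1 works because a multiple of a power of two has all of its low log2(n) bits cleared.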

Why memory alignment?

Modern CPUs place some restrictions on the addresses that are legal for basic types. Instead of reading and writing memory byte by byte, they access it in word-sized chunks, typically 2, 4, or 8 bytes. For example, a 64-bit CPU with an 8-byte word length accesses memory 8 bytes at a time. Because the CPU always accesses memory in units of its word length, data that is not aligned is likely to increase the number of memory accesses.
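The extra accesses can be counted with a simplified model. The sketch below assumes the CPU only reads whole, word-aligned chunks; wordsTouched is a hypothetical helper, not a description of any particular CPU.

```go
package main

import "fmt"

// wordsTouched returns how many word-sized, word-aligned chunks must be
// accessed to read size bytes starting at addr (size and word must be > 0).
func wordsTouched(addr, size, word uintptr) uintptr {
	first := addr / word
	last := (addr + size - 1) / word
	return last - first + 1
}

func main() {
	const word = 8
	fmt.Println(wordsTouched(8, 8, word))  // 8-byte value at an aligned address
	fmt.Println(wordsTouched(12, 8, word)) // the same value at a misaligned address
}
```

In this model the aligned value fits in one word, while the misaligned one straddles two words and needs two accesses.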

Besides the performance issue described above, many online sources give another reason: “certain hardware platforms only allow certain types of data at certain addresses, and otherwise raise an exception.” I have not run into this situation myself, so I will set it aside for now.

The Go compiler also uses memory alignment for these reasons.

So what’s the advantage of knowing the principles of memory alignment? Take the 64-bit system as an example

type Type1 struct {
	a int8
	b int64
	c int32
}

a := align.Type1{}
fmt.Printf("size of align.Type1 is %d",unsafe.Sizeof(a))

What do you think the output would be on a 64-bit platform? One might say that is easy: 1+8+4=13 bytes. But the actual output is:

size of align.Type1 is 24

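Why? Let’s look at its memory layout. Field a sits at offset 0 and is followed by 7 bytes of padding so that b can start at offset 8. Field c starts at offset 16 and is followed by 4 bytes of padding so that the total size is a multiple of 8.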

But once you’ve mastered memory alignment, you’ll write code like this

type Type1 struct {
	a int8
	c int32
	b int64
}
a := align.Type1{}
fmt.Printf("size of align.Type1 is %d",unsafe.Sizeof(a))

The output

size of align.Type1 is 16

Clearly, reordering the fields saves 33% of the space (16 bytes instead of 24).

Memory alignment is also beneficial for atomic operations. If a piece of data is no larger than the word length the CPU uses to access memory and is properly aligned, it can be read in a single access, and that access is atomic.

Memory alignment tips

Go Memory address alignment

As described in the Go programming language specification, computer architectures may require memory addresses to be aligned; that is, the address of a variable must be a multiple of a factor, the alignment of the variable’s type. The function Alignof takes an expression denoting a variable of any type and returns the alignment of the variable’s type in bytes. (golang.org/ref/spec#Si…).

Go alignment and size guarantees

Go’s unsafe package provides two useful functions.

// The number of bytes taken up by the instance of the type
func Sizeof(x ArbitraryType) uintptr
// Returns the alignment coefficient of the specified type
func Alignof(x ArbitraryType) uintptr

unsafe.Sizeof, which returns the number of bytes occupied by an instance of a type, is easy to understand. unsafe.Alignof returns the alignment coefficient of a type; the specific rules are officially defined by Go as follows:

For a variable x of any type: unsafe.Alignof(x) is at least 1.

That is, for a variable x of any type, unsafe.Alignof(x) is at least 1.

For a variable x of struct type: unsafe.Alignof(x) is the largest of all the values unsafe.Alignof(x.f) for each field f of x, but at least 1.

That is, for a variable x of struct type, unsafe.Alignof(x) equals the largest of the alignment coefficients unsafe.Alignof(x.f) of all its fields f, but is at least 1.

For a variable x of array type: unsafe.Alignof(x) is the same as the alignment of a variable of the array’s element type.

That is, for a variable x of array type, unsafe.Alignof(x) equals the alignment coefficient of a variable of the array’s element type.

This translates into the table below:

type	alignment guarantee (bytes)
bool, byte, uint8, int8	1
uint16, int16	2
uint32, int32	4
float32, complex64	4
array	the alignment coefficient of the array’s element type
struct	the largest alignment coefficient among all of its fields
other types	one machine word: 4 bytes on 32-bit systems, 8 bytes on 64-bit systems

Go also guarantees the size of some basic types:

type	size in bytes
byte, uint8, int8	1
uint16, int16	2
uint32, int32, float32	4
uint64, int64, float64, complex64	8
complex128	16

However, the official Go programming language specification does not describe Go’s memory alignment rules in much detail. In my testing, the alignment rules of Go and C are largely consistent:

Field alignment rule: the first field of a struct is placed at offset 0, and the offset of every subsequent field must be a multiple of the smaller of the default alignment coefficient and the size of that field’s type.
Struct alignment rule: the total size of the struct must be a multiple of the smaller of the default alignment coefficient and the largest alignment coefficient among its fields.

If you disagree, comments and corrections are welcome.
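The two rules can be turned into a small layout calculator. This is a sketch under stated assumptions, not the compiler’s actual algorithm: field, alignUp, and structSize are names made up here, each field’s alignment is assumed to be already capped by the platform default, and zero-size fields are ignored.

```go
package main

import "fmt"

// field describes one struct member by its size and alignment in bytes.
type field struct{ size, align uintptr }

// alignUp rounds n up to the next multiple of align (a power of two).
func alignUp(n, align uintptr) uintptr {
	return (n + align - 1) &^ (align - 1)
}

// structSize places each field at the next offset that is a multiple of its
// alignment, then rounds the total up to the largest field alignment.
func structSize(fields []field) uintptr {
	var offset, maxAlign uintptr = 0, 1
	for _, f := range fields {
		offset = alignUp(offset, f.align) + f.size
		if f.align > maxAlign {
			maxAlign = f.align
		}
	}
	return alignUp(offset, maxAlign)
}

func main() {
	// Sizes and alignments as on a 64-bit platform.
	int8f, int32f, int64f := field{1, 1}, field{4, 4}, field{8, 8}
	fmt.Println(structSize([]field{int8f, int64f, int32f})) // order a, b, c
	fmt.Println(structSize([]field{int8f, int32f, int64f})) // order a, c, b
}
```

With 64-bit sizes this gives 24 and 16 bytes for the two field orders of Type1, matching the unsafe.Sizeof results shown earlier.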

Ensure portability of atomic operations

We can see this code in sync.WaitGroup.

type WaitGroup struct {
	noCopy noCopy
	state1 [3]uint32
}

// state returns pointers to the 64-bit state and the 32-bit semaphore stored in wg.state1.
func (wg *WaitGroup) state() (statep *uint64, semap *uint32) {
	if uintptr(unsafe.Pointer(&wg.state1))%8 == 0 {
		return (*uint64)(unsafe.Pointer(&wg.state1)), &wg.state1[2]
	} else {
		return (*uint64)(unsafe.Pointer(&wg.state1[1])), &wg.state1[0]
	}
}

uintptr(unsafe.Pointer(&wg.state1))%8 == 0 checks whether the current address is 8-byte aligned, since on current platforms state1 is either 4-byte or 8-byte aligned.

If the address is already 8-byte aligned, the first 8 bytes of state1 are used directly for the 64-bit value.
If it is not 8-byte aligned, the first 4 bytes act as padding (they hold the semaphore), which guarantees that the following 8 bytes start at an 8-byte aligned address.

Seeing is believing.

Suppose sync.WaitGroup is embedded in another struct as its first field:

type M struct {
	wg sync.WaitGroup
}

If, unfortunately, sync.WaitGroup is embedded in a struct but is not the first field:

type N struct {
	n  int8
	wg sync.WaitGroup
}

Don’t worry.

So sync.WaitGroup keeps its atomic operations portable across architectures by choosing at run time where within state1 the 64-bit data is stored.

Zero size field alignment

If a struct or array type contains no fields or elements with a size greater than zero, its size is zero.

A struct or array type has size zero if it contains no fields (or elements, respectively) that have a size greater than zero. Two distinct zero-size variables may have the same address in memory.

Examples are x [0]int8 and the empty struct{}. A zero-size field normally needs no extra space or padding, but padding is added when it is the last field of a struct. Let’s take an empty struct as an example:

package main

import (
	"fmt"
	"unsafe"
)

type M struct {
	m int64
	x struct{}
}

type N struct {
	x struct{}
	n int64
}

func main() {
	m := M{}
	n := N{}
	fmt.Printf("as final field size:%d\nnot as final field size:%d\n", unsafe.Sizeof(m), unsafe.Sizeof(n))
}

The output

as final field size:16
not as final field size:8

The output above may look a little confusing, so a diagram helps.

Looking at the diagram, it is clear that an empty struct of size 0 takes up no space when it is a field of another struct and is not the last one. But if an empty struct is the last field of another struct, it is treated as if its size were 1, and on a 64-bit system 7 bytes of padding follow it. Why?

If no memory were reserved for a trailing zero-size field, taking the address of that field would yield a pointer just past the end of the struct, which could point into unrelated memory, much like a wild pointer in C/C++. Go adds the padding to avoid this situation.

When reading code in the sync package, we often see that, to prevent objects from being copied after first use, Go embeds noCopy in the struct. For example, sync.Cond:

type Cond struct {
	noCopy noCopy

	// L is held while observing or changing the condition
	L Locker

	notify  notifyList
	checker copyChecker
}
type noCopy struct{}

This noCopy is an empty struct of size zero, and we find that it is usually placed as the first field of the struct. I think the padding rule above is the reason.

Atomic operation problem

Take uint64 as an example of 64-bit data. On a 64-bit system, the word length is 8 bytes and matches the 8-byte alignment, so the CPU can operate on the value atomically in a single access. On a 32-bit system with 4-byte alignment and a 4-byte word length, however, a uint64 may be split across two words, so an operation needs two accesses. If another operation touches the data between those two accesses, atomicity cannot be guaranteed, and such access is unsafe. See issue 6404 (github.com/golang/go/i…).

The documentation of the atomic package includes the following notes:

On 386, the 64-bit functions use instructions unavailable before the Pentium MMX.

On non-Linux ARM, the 64-bit functions use instructions unavailable before the ARMv6k core.

On ARM, 386, and 32-bit MIPS, it is the caller’s responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically. The first word in a variable or in an allocated struct, array, or slice can be relied upon to be 64-bit aligned.

That is, on ARM, 386, and 32-bit MIPS, the caller is responsible for making sure that 64-bit words accessed atomically are 8-byte aligned; otherwise the program will panic.

How can this be guaranteed?

The first word in a variable or in an allocated struct, array, or slice can be relied upon to be 64-bit aligned.

This means that the first (64-bit) word of an allocated struct, array, or slice value can be assumed to be 8-byte aligned. An allocated value can be understood as a declared variable, a value returned by the built-in make function, or the value referenced by a pointer returned by the built-in new function. A slice that is derived from an allocated array and shares its first element with that array can also be treated as an allocated value.

Alignment detection tool

fieldalignment

// install
export GOPROXY=https://mirrors.aliyun.com/goproxy/
go get -u golang.org/x/tools/...
// usage: fieldalignment {package}, for example
fieldalignment align
~/workspace/workspace_github/go-snippets/align/align.go:8:8: struct of size 16 could be 8

structlayout

go get -u honnef.co/go/tools
// show the struct layout
go install honnef.co/go/tools/cmd/structlayout@latest
// reorder struct fields to reduce the amount of padding
go install honnef.co/go/tools/cmd/structlayout-optimize@latest
// print the layout in ASCII format
go install honnef.co/go/tools/cmd/structlayout-pretty@latest
// third-party visualization
go install github.com/ajstarks/svgo/structlayout-svg@latest

structlayout and structlayout-svg were used to create the diagrams in the zero-size field alignment section above:

//struct M
structlayout -json file=~/mywork/workspace/workspace_github/go-snippets/align/align.go M | structlayout-svg -t "align.M" > m.svg

//struct N
structlayout -json file=~/mywork/workspace/workspace_github/go-snippets/align/align.go N | structlayout-svg -t "align.N" > n.svg

Conclusion

Memory alignment lets the CPU access data in memory more efficiently.
Arranging struct fields carefully makes memory usage more compact.
Go guarantees that if the alignment coefficient of type T is n, the address of any variable of type T is a multiple of n, where n is a power of 2.
Avoid placing zero-size fields at the end of a struct, so that no memory is wasted on padding.
On 32-bit systems, take care to ensure 8-byte alignment for 64-bit words that are accessed atomically. If you do not want to worry about memory alignment, I suggest protecting the data with sync.Mutex instead.
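For the last point, a mutex-based counter could look like the sketch below. safeCounter and its methods are hypothetical names; because the mutex guards total, the position and alignment of the 64-bit field inside the struct no longer matter.

```go
package main

import (
	"fmt"
	"sync"
)

// safeCounter is a hypothetical type: total is only touched while mu is held,
// so it does not need to be the first field or 8-byte aligned.
type safeCounter struct {
	ready bool
	mu    sync.Mutex
	total uint64
}

// Add increases the counter by n.
func (c *safeCounter) Add(n uint64) {
	c.mu.Lock()
	c.total += n
	c.mu.Unlock()
}

// Load returns the current counter value.
func (c *safeCounter) Load() uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.total
}

func main() {
	var c safeCounter
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Add(1)
		}()
	}
	wg.Wait()
	fmt.Println(c.Load())
}
```

The trade-off is that a mutex is usually heavier than a single atomic instruction, so this fits best where simplicity matters more than raw throughput.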

If you have questions or spot mistakes in this article, please leave a comment on the official account. If you found it helpful, a like is appreciated 😁