Preface
unsafe.Pointer is used extensively in the Go standard library. So what is unsafe.Pointer? This article takes a look at the unsafe package.
What is unsafe
As we all know, Go is designed as a strongly, statically typed language. Strong typing means that the type of a value cannot be changed once it is defined; static typing means that type checking is done before the program runs. For safety reasons, Go therefore does not allow conversion between two different pointer types. Readers who have used C will know that C permits this. Go forbids it because forced conversions cause all sorts of trouble: sometimes the problem is easy to spot, and sometimes it is buried deep and hard to detect. Some readers may not see why such casts are unsafe, so here is a simple example in C:
int main(void) {
    double pi = 3.1415926;
    double *pv = &pi;
    void *temp = pv;
    int *p = temp; /* p now treats the 8-byte double as a 4-byte int */
    return 0;
}
In standard C, a pointer to any non-void type can be assigned to and from a void pointer, so a void pointer can serve as an intermediary for indirect conversion between pointers of different types. In the example above, pv points to an 8-byte double, but after the conversion p points to a 4-byte int. Reading through p sees only part of the value (memory truncation), and accessing memory through the converted pointer is a security risk. I think this is one of the reasons Go was designed to be strongly typed.
Although such casts are unsafe, in some special scenarios they make it possible to bypass Go's type and memory safety mechanisms, avoiding overhead imposed by the type system and improving efficiency. That is why the Go standard library provides the unsafe package. The name says it all: its use is not recommended, but that does not mean it can never be used. If you understand it thoroughly, you can put it into practice.
unsafe
How it works
Before looking at the unsafe source, note that at the time of writing the standard library's unsafe package provides only three functions (newer Go releases add a few more, such as unsafe.Add and unsafe.Slice):
func Sizeof(x ArbitraryType) uintptr
func Offsetof(x ArbitraryType) uintptr
func Alignof(x ArbitraryType) uintptr
- Sizeof(x ArbitraryType) returns the number of bytes occupied by the type of x, not including the size of anything x points to. It serves the same purpose as the sizeof operator in C; for example, on a 32-bit machine a pointer has a size of 4 bytes.
- Offsetof(x ArbitraryType) returns the number of bytes between the start of a struct and the position of one of its fields. The argument must be a struct field, written as structValue.field, and the return value is a constant.
- Alignof(x ArbitraryType) returns the alignment value of a type, also called the alignment coefficient or alignment multiple. The alignment value is related to memory alignment, and proper alignment improves the performance of memory reads and writes. The alignment value is generally a power of two (2^n) and will not exceed 8 (it is constrained by memory alignment). You can also obtain it through the reflect package: unsafe.Alignof(x) is equivalent to reflect.TypeOf(x).Align(). For a variable x of any type, unsafe.Alignof(x) is at least 1. For a variable x of struct type, compute unsafe.Alignof(x.f) for each field f; unsafe.Alignof(x) is the largest of these values. For a variable x of array type, unsafe.Alignof(x) equals the alignment of the array's element type. A struct{} with no fields and an array with no elements occupy 0 bytes of memory, and distinct zero-size variables may share the same address.
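To make these rules concrete, here is a minimal sketch that checks a few of them. The sample struct and the helper function names are made up for this illustration and are not part of the unsafe package.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// sample is a hypothetical struct used only for this illustration.
type sample struct {
	flag  bool
	count int64
	tag   int16
}

// sizeOfStringValue reports the size of a string value itself,
// which does not grow with the length of the text it refers to.
func sizeOfStringValue(s string) uintptr {
	return unsafe.Sizeof(s)
}

// alignMatchesReflect checks that unsafe.Alignof and reflect agree for sample.
func alignMatchesReflect() bool {
	var v sample
	return unsafe.Alignof(v) == uintptr(reflect.TypeOf(v).Align())
}

// structAlignIsMaxField checks that the alignment of sample equals the
// largest alignment among its fields.
func structAlignIsMaxField() bool {
	var v sample
	largest := unsafe.Alignof(v.flag)
	for _, a := range []uintptr{unsafe.Alignof(v.count), unsafe.Alignof(v.tag)} {
		if a > largest {
			largest = a
		}
	}
	return unsafe.Alignof(v) == largest
}

func main() {
	// Sizeof ignores what the value points to.
	fmt.Println(sizeOfStringValue("a") == sizeOfStringValue("a much longer string")) // true
	fmt.Println(alignMatchesReflect())                                               // true
	fmt.Println(structAlignIsMaxField())                                             // true
	// Zero-size types occupy no memory.
	fmt.Println(unsafe.Sizeof(struct{}{}), unsafe.Sizeof([0]int{})) // 0 0
	// A field's offset is a multiple of that field's alignment.
	var v sample
	fmt.Println(unsafe.Offsetof(v.count)%unsafe.Alignof(v.count) == 0) // true
}
```

The checks are written as comparisons instead of fixed numbers, so they should hold on both 32-bit and 64-bit platforms.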
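Attentive readers will notice that all three functions return uintptr. The reason is that a uintptr can be converted to and from unsafe.Pointer: a *T cannot be used in offset calculations, but a uintptr is an integer and can. With uintptr arithmetic, a specific memory location can be reached, which makes it possible to read and write different parts of memory. All three functions take an argument of type ArbitraryType, which stands for any type. The package also provides a Pointer type, a generic pointer similar to void * in C.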
type ArbitraryType int
type Pointer *ArbitraryType
// uintptr is an integer type that is large enough to hold the bit pattern of any pointer.
type uintptr uintptr
This may seem a bit confusing, so here is a summary of the three pointer types:
- *T: an ordinary typed pointer, used to pass the address of an object. It does not support pointer arithmetic.
- unsafe.Pointer: a generic pointer, used to convert between pointers of different types. It does not support pointer arithmetic, and the value it points to cannot be read directly (convert it to an ordinary pointer type first).
- uintptr: an integer used for pointer arithmetic. The GC does not treat a uintptr as a pointer, so a uintptr cannot keep an object alive, and an object referenced only through a uintptr may be reclaimed.
unsafe.Pointer is a bridge: a pointer of any type can be converted to unsafe.Pointer and back, and unsafe.Pointer can be converted to uintptr for pointer arithmetic. In other words, uintptr is used together with unsafe.Pointer to do pointer arithmetic. The conversions form a chain: *T <-> unsafe.Pointer <-> uintptr.
So much for the basic principles. Let's take a look at how to use them.
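The bridge role of unsafe.Pointer can be sketched with a small example that walks an array by hand. The function name elementAt is made up for this illustration; it does no bounds checking, so the caller has to pass a valid index.

```go
package main

import (
	"fmt"
	"unsafe"
)

// elementAt returns the i-th element of arr using pointer arithmetic:
// *T -> unsafe.Pointer -> uintptr (add offset) -> unsafe.Pointer -> *T.
// The whole conversion is kept in a single expression so that no uintptr
// is stored in a variable between the steps.
func elementAt(arr *[4]int32, i uintptr) int32 {
	return *(*int32)(unsafe.Pointer(uintptr(unsafe.Pointer(arr)) + i*unsafe.Sizeof(arr[0])))
}

func main() {
	nums := [4]int32{10, 20, 30, 40}
	fmt.Println(elementAt(&nums, 2)) // 30
}
```

In real code nums[2] is of course the right way to do this; the sketch only shows each link of the conversion chain.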
unsafe.Pointer
Basic usage
In sync/atomic/value.go, an ifaceWords struct is defined whose typ and data fields are both of type unsafe.Pointer. unsafe.Pointer is used here because the value passed in is an interface{}, which is cast to *ifaceWords through unsafe.Pointer. This exposes both the type and the value, so that the type can be checked on later writes. Part of the code is excerpted below:
// ifaceWords is interface{} internal representation.
type ifaceWords struct {
	typ  unsafe.Pointer
	data unsafe.Pointer
}

// Load returns the value set by the most recent Store.
// It returns nil if there has been no call to Store for this Value.
func (v *Value) Load() (x interface{}) {
	vp := (*ifaceWords)(unsafe.Pointer(v))
	for {
		typ := LoadPointer(&vp.typ) // read the type of the existing value
		/* ... omitted; the excerpt is abridged, and the check below comes
		   from the Store path, where xp wraps the value being stored ... */
		// First store completed. Check type and overwrite data.
		if typ != xp.typ { // compare the current type with the type being stored
			panic("sync/atomic: store of inconsistently typed value into Value")
		}
	}
}
This is one example of unsafe.Pointer in the Go source. When you start reading the source code, you will find it everywhere. Now let's write a simple example to see how unsafe.Pointer is used.
package main

import (
	"fmt"
	"unsafe"
)

func main() {
	number := 5
	pointer := &number
	fmt.Printf("number:addr:%p, value:%d\n", pointer, *pointer)
	// Reinterpret the same address as a *float32.
	float32Number := (*float32)(unsafe.Pointer(pointer))
	*float32Number = *float32Number + 3
	fmt.Printf("float32:addr:%p, value:%f\n", float32Number, *float32Number)
}
Running results:
number:addr:0xc000018090, value:5
float32:addr:0xc000018090, value:3.000000
The unsafe.Pointer conversion does not change the address the pointer refers to, only its type. This example is not meaningful in itself, and you would not write it in a normal project.
To summarize the basic usage: first convert the *T to unsafe.Pointer, then convert that to the pointer type you need.
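A slightly more meaningful variant of the same two-step cast is reinterpreting the bits of a uint32 as a float32. The sketch below uses a made-up helper name, float32FromBits, and compares it with math.Float32frombits from the standard library, which is the safe way to get the same result.

```go
package main

import (
	"fmt"
	"math"
	"unsafe"
)

// float32FromBits reinterprets the 4 bytes of a uint32 as a float32 by
// converting *uint32 -> unsafe.Pointer -> *float32 and dereferencing.
// Both types are 4 bytes wide, so no memory outside bits is touched.
func float32FromBits(bits uint32) float32 {
	return *(*float32)(unsafe.Pointer(&bits))
}

func main() {
	const bits = 0x40490fdb // IEEE 754 single-precision pattern closest to pi
	viaUnsafe := float32FromBits(bits)
	viaMath := math.Float32frombits(bits)
	fmt.Println(viaUnsafe == viaMath) // true
}
```

Because source and target types have the same size here, the cast does not read or write beyond the original variable, unlike the int to float32 example above.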
Sizeof, Alignof, Offsetof
Basic use of three functions
Let’s start with an example:
type User struct {
	Name   string
	Age    uint32
	Gender bool // true: male, false: female (for example)
}

func funcExample() {
	// Sizeof (the values in the comments are for a 64-bit platform)
	fmt.Println(unsafe.Sizeof(true))            // 1
	fmt.Println(unsafe.Sizeof(int8(0)))         // 1
	fmt.Println(unsafe.Sizeof(int16(10)))       // 2
	fmt.Println(unsafe.Sizeof(int(10)))         // 8
	fmt.Println(unsafe.Sizeof(int32(190)))      // 4
	fmt.Println(unsafe.Sizeof("asong"))         // 16
	fmt.Println(unsafe.Sizeof([]int{1, 3, 4}))  // 24

	// Offsetof
	user := User{Name: "Asong", Age: 23, Gender: true}
	userNamePointer := unsafe.Pointer(&user)
	// The first field starts at the address of the struct itself.
	nNamePointer := (*string)(unsafe.Pointer(userNamePointer))
	*nNamePointer = "Golang Dream Factory"
	// Later fields are reached by adding their offset to the struct address.
	nAgePointer := (*uint32)(unsafe.Pointer(uintptr(userNamePointer) + unsafe.Offsetof(user.Age)))
	*nAgePointer = 25
	nGender := (*bool)(unsafe.Pointer(uintptr(userNamePointer) + unsafe.Offsetof(user.Gender)))
	*nGender = false
	fmt.Printf("u.Name: %s, u.Age: %d, u.Gender: %v\n", user.Name, user.Age, user.Gender)

	// Alignof (the values in the comments are for a 64-bit platform)
	var b bool
	var i8 int8
	var i16 int16
	var i64 int64
	var f32 float32
	var s string
	var m map[string]string
	var p *int32
	fmt.Println(unsafe.Alignof(b))   // 1
	fmt.Println(unsafe.Alignof(i8))  // 1
	fmt.Println(unsafe.Alignof(i16)) // 2
	fmt.Println(unsafe.Alignof(i64)) // 8
	fmt.Println(unsafe.Alignof(f32)) // 4
	fmt.Println(unsafe.Alignof(s))   // 8
	fmt.Println(unsafe.Alignof(m))   // 8
	fmt.Println(unsafe.Alignof(p))   // 8
}
Start with Sizeof. The size of int depends on the word size of the CPU: on a 32-bit CPU an int is 4 bytes, and on a 64-bit CPU it is 8 bytes. My computer is 64-bit, so the result here is 8 bytes.
Next, look at Offsetof. Here I modify the struct's fields through pointers. First, take the address of the struct as an unsafe.Pointer; the first field sits at the start of the struct, so its pointer can be obtained by a direct conversion. For the other fields, convert the unsafe.Pointer to uintptr, add the field's offset, and convert the result back. Offsetof returns the offset of a field within the struct, that is, the number of bytes between the start of the struct and the field. Note that the uintptr must not be stored in a temporary variable. As mentioned above, uintptr exists for pointer arithmetic: the GC does not treat a uintptr as a pointer, so a uintptr cannot keep an object alive. The object it refers to may be reclaimed at any time, and there is no telling what errors later memory operations will then cause. For example:
// Do not use it this way
p1 := uintptr(userNamePointer)
nAgePointer := (*uint32)(unsafe.Pointer(p1 + unsafe.Offsetof(user.Age)))
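By contrast, here is a minimal sketch of two forms that avoid a stored uintptr. The account struct and the function names are made up for this illustration; the second form assumes Go 1.17 or later, where unsafe.Add is available.

```go
package main

import (
	"fmt"
	"unsafe"
)

// account is a hypothetical struct used only for this illustration.
type account struct {
	Name string
	Age  uint32
}

// setAgeArithmetic writes Age through uintptr arithmetic, with both
// conversions in one expression so the uintptr is never stored.
func setAgeArithmetic(a *account, age uint32) {
	*(*uint32)(unsafe.Pointer(uintptr(unsafe.Pointer(a)) + unsafe.Offsetof(a.Age))) = age
}

// setAgeAdd does the same with unsafe.Add (Go 1.17 or later), which
// takes and returns unsafe.Pointer and so avoids uintptr values entirely.
func setAgeAdd(a *account, age uint32) {
	*(*uint32)(unsafe.Add(unsafe.Pointer(a), unsafe.Offsetof(a.Age))) = age
}

func main() {
	acct := account{Name: "Asong", Age: 23}
	setAgeArithmetic(&acct, 25)
	fmt.Println(acct.Age) // 25
	setAgeAdd(&acct, 30)
	fmt.Println(acct.Age) // 30
}
```

In both versions the pointer stays visible to the GC for the whole expression, which is the property the temporary-variable form loses.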
Finally, look at Alignof, which returns the alignment value of a variable. Except for platform-dependent types such as int and uintptr, the alignment values of the basic types are generally fixed. The alignment value of a struct is the largest alignment value among its fields.
Classic use: string and []byte conversion
Now consider conversion between string and []byte. Normally, we would write the standard conversions like this:
// string to []byte
str1 := "Golang Dream Factory"
by := []byte(str1)
// []byte to string
str2 := string(by)
The standard conversions allocate new memory and copy the data. For better performance, the conversion can be done through unsafe.Pointer without copying. The reflect package defines the runtime representations of string and slice:
type StringHeader struct {
Data uintptr
Len int
}
type SliceHeader struct {
Data uintptr
Len int
Cap int
}
StringHeader is the runtime representation of a string, and SliceHeader is that of a slice. Comparing the two shows that they differ only in the Cap field, so their leading fields share the same memory layout. That makes unsafe.Pointer a natural way to convert between them, and we can write the following code:
// stringToByte returns a []byte that shares memory with s.
// The result must not be modified.
func stringToByte(s string) []byte {
	header := (*reflect.StringHeader)(unsafe.Pointer(&s))
	newHeader := reflect.SliceHeader{
		Data: header.Data,
		Len:  header.Len,
		Cap:  header.Len,
	}
	return *(*[]byte)(unsafe.Pointer(&newHeader))
}

// bytesToString returns a string that shares memory with b.
func bytesToString(b []byte) string {
	header := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	newHeader := reflect.StringHeader{
		Data: header.Data,
		Len:  header.Len,
	}
	return *(*string)(unsafe.Pointer(&newHeader))
}
The code above performs the conversion by building a new slice header or string header by hand. For the []byte to string direction, this step can be skipped: the first two fields of a slice header match the layout of a string header, so a direct cast is enough. The shortened code is:
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}
Although this approach is more efficient, it is not recommended for general use. As stressed earlier, it is unsafe: used improperly, it hides serious dangers, and some of the resulting failures cannot be caught by recover.
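For readers on newer toolchains, the same zero-copy conversions can be sketched without the reflect headers, which are marked deprecated from Go 1.20 on. The sketch below assumes Go 1.20 or later and uses unsafe.String, unsafe.StringData, unsafe.Slice, and unsafe.SliceData; the function names stringToBytes and bytesToString are just illustrative.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringToBytes returns a []byte that shares memory with s (no copy).
// The result must never be modified: string data may be read-only.
func stringToBytes(s string) []byte {
	if s == "" {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

// bytesToString returns a string that shares memory with b (no copy).
// b must not be modified while the string is still in use.
func bytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	b := stringToBytes("Golang Dream Factory")
	fmt.Println(len(b), cap(b))                  // 20 20
	fmt.Println(bytesToString([]byte{'g', 'o'})) // go
}
```

The same caveats apply as before: both results alias the original memory, so the immutability rule for strings is now the caller's responsibility.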
Memory alignment
Memory is divided into bytes, so in theory a variable of any type could be accessed starting at any address. In practice, variables of a given type are usually accessed at particular addresses, which requires data to be laid out in memory according to certain rules instead of being packed one after another. This is alignment.
The CPU does not access memory byte by byte; it accesses memory in units of the word size. For example, a 32-bit CPU has a word size of 4 bytes, so it accesses memory 4 bytes at a time. This design reduces the number of memory accesses and increases memory throughput: to read 8 bytes of data at 4 bytes per access, only 2 reads are needed. Memory alignment also helps with atomic operations on variables. Each memory access is atomic, so if a variable is no larger than one word and is properly aligned, access to it is atomic, which is critical in concurrent scenarios.
Let’s look at an example:
// 64-bit platform, alignment parameter is 8
type User1 struct {
	A int32   // 4
	B []int32 // 24
	C string  // 16
	D bool    // 1
}

type User2 struct {
	B []int32
	A int32
	D bool
	C string
}

type User3 struct {
	D bool
	B []int32
	A int32
	C string
}

func main() {
	var u1 User1
	var u2 User2
	var u3 User3
	fmt.Println("u1 size is", unsafe.Sizeof(u1))
	fmt.Println("u2 size is", unsafe.Sizeof(u2))
	fmt.Println("u3 size is", unsafe.Sizeof(u3))
}

// Run result (Mac, 64-bit):
u1 size is 56
u2 size is 48
u3 size is 56
As the results show, the order of the fields changes how much memory the struct occupies. This is because memory alignment affects struct size, so a sensible field order can reduce memory overhead. C's alignment rules are the same as Go's, so they apply here as well:
- Rule 1, for each member of the struct: the first member is placed at offset 0. The offset of every later member, relative to the start of the struct, must be an integer multiple of the smaller of that member's size and the effective alignment value. The compiler inserts padding bytes between members where necessary.
- Rule 2, for the struct as a whole: besides its members, the struct itself must be aligned. Its total size must be an integer multiple of the smaller of the compiler's default alignment value and the size of its largest member type.
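The two rules can be sketched as a small calculator. This is a simplified model, not how the Go compiler is actually implemented: it takes each field's alignment value directly as input, and it ignores special cases such as a zero-size field at the end of a struct. The field type and the function names are made up for this illustration, and the sizes and alignment values passed in main assume a 64-bit platform.

```go
package main

import "fmt"

// field describes one struct member by its size and alignment, in bytes.
type field struct {
	size, align uintptr
}

// alignUp rounds n up to the next multiple of align (align must be > 0).
func alignUp(n, align uintptr) uintptr {
	return (n + align - 1) / align * align
}

// structSize applies the two rules: place each field at the next offset
// that is a multiple of its alignment, then pad the total size up to a
// multiple of the largest field alignment.
func structSize(fields []field) uintptr {
	var offset, maxAlign uintptr = 0, 1
	for _, f := range fields {
		offset = alignUp(offset, f.align) + f.size
		if f.align > maxAlign {
			maxAlign = f.align
		}
	}
	return alignUp(offset, maxAlign)
}

func main() {
	// Sizes and alignments on a 64-bit platform.
	i32 := field{4, 4}  // int32
	sl := field{24, 8}  // []int32
	str := field{16, 8} // string
	bl := field{1, 1}   // bool

	fmt.Println(structSize([]field{i32, sl, str, bl})) // User1 order: 56
	fmt.Println(structSize([]field{sl, i32, bl, str})) // User2 order: 48
	fmt.Println(structSize([]field{bl, sl, i32, str})) // User3 order: 56
}
```

The three results match the sizes printed by unsafe.Sizeof for User1, User2, and User3 above.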
Now that we know the rules, let's analyze the example above. My Mac has a 64-bit CPU, so the analysis uses an alignment parameter of 8. The alignment values of int32, []int32, string, and bool are 4, 8, 8, and 1, and their sizes are 4, 24, 16, and 1 bytes. Let's first analyze User1 according to the first rule:
- The first field is an int32, with alignment value 4 and size 4, so it is placed first, at bytes 0 to 3.
- The second field is a []int32, with alignment value 8 and size 24, so its offset must be a multiple of 8. In User1 it therefore cannot start at byte 4; it has to start at byte 8, that is, at offset 8. Bytes 4 to 7 are padding inserted by the compiler, usually zero-filled, also known as a hole. Bytes 8 to 31 hold the second field, B.
- The third field is a string, with alignment value 8 and size 16, so its offset must be a multiple of 8. The first two fields of User1 end at byte 31, so the next offset is 32, which is already a multiple of the alignment value of field C. No padding is needed, and bytes 32 to 47 hold the third field, C.
- The fourth field is a bool, with alignment value 1 and size 1, so its offset must be a multiple of 1. The first three fields of User1 end at byte 47, so the next offset is 48, which is a multiple of the alignment value of field D. No padding is needed, and byte 48 holds the fourth field, D.
- After applying the first rule, the length is 49 bytes. Now apply the second rule. The default alignment value is 8 and the largest type size among the fields is 24; taking the smaller gives a struct alignment value of 8. The current length, 49, is not a multiple of 8, so it must be padded: 7 bytes are added, and the final size is 56.
Having said all that, here is the resulting layout of User1: A at bytes 0 to 3, padding at bytes 4 to 7, B at bytes 8 to 31, C at bytes 32 to 47, D at byte 48, and padding at bytes 49 to 55.
Now you get the idea. The other two structs can be analyzed the same way, so I won't go through them here.
One last thing to note about memory alignment: an empty struct{} takes up no storage space, and when it is used as a field of another struct it generally needs no alignment padding. There is one exception: when struct{} is the last field of a struct, padding is required. If a pointer to that field were taken, the address would point just past the end of the struct; if that pointer stayed alive, it could keep an unrelated block of memory from being freed, causing a memory leak. Here's an example:
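If you want to check a hand analysis like this, unsafe.Offsetof can report the actual offsets. The sketch below repeats the User1 definition so that it is self-contained; the helper name user1Layout is made up for this illustration, and the expected values in the comment assume a 64-bit platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

type User1 struct {
	A int32
	B []int32
	C string
	D bool
}

// user1Layout returns the offsets of A, B, C, and D, followed by the
// total size of User1.
func user1Layout() [5]uintptr {
	var u User1
	return [5]uintptr{
		unsafe.Offsetof(u.A),
		unsafe.Offsetof(u.B),
		unsafe.Offsetof(u.C),
		unsafe.Offsetof(u.D),
		unsafe.Sizeof(u),
	}
}

func main() {
	fmt.Println(user1Layout()) // on a 64-bit platform: [0 8 32 48 56]
}
```

Swapping in User2 or User3 is a quick way to verify your own analysis of the other two structs.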
func main() {
	fmt.Println(unsafe.Sizeof(test1{})) // 8
	fmt.Println(unsafe.Sizeof(test2{})) // 4
}

type test1 struct {
	a int32
	b struct{}
}

type test2 struct {
	a struct{}
	b int32
}
Simply put, when a type that occupies 0 bytes, such as struct{} or [0]byte, appears at the end of a struct, treat it as occupying 1 byte. So the test1 struct effectively looks like this:
type test1 struct {
a int32
// b struct{}
b [1]byte
}
Therefore, after memory alignment, the struct occupies 8 bytes.
Important note: do not put a zero-size type at the end of a struct definition.
Conclusion
To conclude, the unsafe package bypasses Go's type system in order to manipulate memory directly, so using it carries risk. In some cases, however, the functions in the unsafe package make code more efficient, and the Go source code itself uses the unsafe package extensively.
The unsafe package defines Pointer and three functions:
type ArbitraryType int
type Pointer *ArbitraryType
func Sizeof(x ArbitraryType) uintptr
func Offsetof(x ArbitraryType) uintptr
func Alignof(x ArbitraryType) uintptr
A uintptr can be converted to and from unsafe.Pointer, and a uintptr supports arithmetic. Combining uintptr with unsafe.Pointer therefore works around the restriction that Go pointers do not support arithmetic. With the unsafe functions, you can obtain the addresses of a struct's private fields and read or write them, breaking through Go's type-safety restrictions.
Finally, we learned about memory alignment. This design can reduce the number of times the CPU accesses memory and increase the throughput of the CPU accesses memory. Therefore, the structure can save more memory by sorting the fields properly.
Well, that's all for this article.
I am Asong, an ordinary programmer. Let's get stronger together. See you next time.