Converting between []byte, string, and []rune in Golang: underlying principles and analysis

Conversions between []byte and string come up constantly in Go code, especially around json.Marshal and json.Unmarshal.

This article covers:

  • Ways to convert between these types, with a performance analysis
  • The underlying storage of these types
  • Code gist

Mutual conversion

Conversion between []byte and string

string -> []byte

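// genString is a helper not shown in this article; it is assumed to return
// a string of the given length. Imports needed: reflect, testing, unsafe.
func BenchmarkStringToByteSlice(b *testing.B) {
	s := genString(10000)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		bs := []byte(s) // allocates a new backing array and copies s into it
		if len(bs) != len(s) {
			b.Error("error")
		}
	}
}

func BenchmarkStringToByteSliceUnsafe(b *testing.B) {
	s := genString(10000)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		l := len(s)
		// Build a slice header that points at the string's bytes: no copy.
		// Note: the unsafe package documentation advises against allocating
		// reflect.SliceHeader values directly, since Data is only a uintptr.
		bs := *(*[]byte)(unsafe.Pointer(&reflect.SliceHeader{
			Data: (*(*reflect.StringHeader)(unsafe.Pointer(&s))).Data,
			Len:  l,
			Cap:  l,
		}))
		if len(bs) != len(s) {
			b.Error("error")
		}
	}
}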

The first benchmark uses the built-in []byte(s) conversion, which is the usual way to convert. The second uses unsafe. The difference between the two is that the first allocates new memory and copies the data into it, while the second reuses the string's existing memory.

The benchmark results confirm this:

go test -run=BenchmarkStringToByteSlice -bench=StringToByteSlice
# go-demo.test
goos: darwin
goarch: amd64
pkg: go-demo
BenchmarkStringToByteSlice-12          1164224      964 ns/op      10285 B/op    1 allocs/op
BenchmarkStringToByteSliceUnsafe-12    1000000000   0.380 ns/op    0 B/op        0 allocs/op
PASS
ok      go-demo 2.089s
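
The zero-copy conversion can also be written without reflect.StringHeader and reflect.SliceHeader, which current Go documentation marks as deprecated. The following is a minimal sketch, assuming Go 1.20 or later, where unsafe.StringData and unsafe.Slice are available; the function name stringToBytesUnsafe is made up for this illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringToBytesUnsafe (hypothetical name) returns a []byte that shares
// memory with s. The result must be treated as read-only: string data may
// live in read-only memory, and writing through the slice is not allowed.
func stringToBytesUnsafe(s string) []byte {
	if s == "" {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	s := "hello, gopher"
	copied := []byte(s)              // new allocation plus copy
	shared := stringToBytesUnsafe(s) // same memory as s

	// Both slices hold the same bytes.
	fmt.Println(string(copied) == string(shared))
	// Only the unsafe version points at the string's own data.
	fmt.Println(unsafe.SliceData(copied) == unsafe.StringData(s))
	fmt.Println(unsafe.SliceData(shared) == unsafe.StringData(s))
}
```

The trade-off is the one the benchmark shows: no allocation, at the price of a slice that must never be written to.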

[]byte -> string

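// genSliceByte is a helper not shown in this article; it is assumed to
// return a []byte of the given length. Imports needed: testing, unsafe.
func BenchmarkSliceByteToString(b *testing.B) {
	bs := genSliceByte(100)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		s := string(bs) // allocates and copies
		if len(s) != len(bs) {
			b.Error("error")
		}
	}
}

func BenchmarkSliceByteToStringUnsafe(b *testing.B) {
	bs := genSliceByte(100)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		// Reinterpret the slice header as a string header: no copy.
		s := *(*string)(unsafe.Pointer(&bs))
		if len(s) != len(bs) {
			b.Log("slice: ", len(bs), " string: ", len(s))
			b.Error("error: ")
		}
	}
}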

The benchmark results

go test -run=BenchmarkSliceByteToString -bench=SliceByteToString
# go-demo.test
goos: darwin
goarch: amd64
pkg: go-demo
BenchmarkSliceByteToString-12          35913873     32.4 ns/op     112 B/op    1 allocs/op
BenchmarkSliceByteToStringUnsafe-12    1000000000   0.253 ns/op    0 B/op      0 allocs/op
PASS
ok      go-demo 3.796s
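
The speed of the unsafe conversion comes with an aliasing hazard: the string and the slice share memory, so a later write to the slice shows through in the "immutable" string. The sketch below illustrates this, assuming Go 1.20 or later for unsafe.String and unsafe.SliceData; bytesToStringUnsafe is a hypothetical name chosen for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToStringUnsafe (hypothetical name) returns a string that shares
// memory with bs. The caller must not modify bs while the string is in use.
func bytesToStringUnsafe(bs []byte) string {
	if len(bs) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(bs), len(bs))
}

func main() {
	bs := []byte("abc")
	copied := string(bs)              // independent copy
	shared := bytesToStringUnsafe(bs) // aliases bs

	bs[0] = 'X' // mutate the slice after both conversions

	fmt.Println(copied) // the copy still holds the original bytes
	fmt.Println(shared) // the shared string typically shows the write
}
```

For this reason the unsafe form is usually reserved for cases where the byte slice is known not to change afterwards, for example a buffer that is discarded right after the conversion.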

Conversion of string and []rune

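Converting between string and []rune is similar to the []byte case, with one important difference: a rune is a 4-byte int32, while a string stores UTF-8 bytes. The unsafe variant below therefore has to compute the UTF-8 length by hand, and the benchmark checks only that length. Reinterpreting the memory of the rune array does not produce the same bytes as string(bs).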

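// genSliceRune is a helper not shown in this article; it is assumed to
// return a []rune of the given length.
// Imports needed: reflect, testing, unicode/utf8, unsafe.
func BenchmarkSliceRuneToStringUnsafe(b *testing.B) {
	bs := genSliceRune(100)
	s1 := string(bs)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		// Compute the UTF-8 byte length that string(bs) would have.
		var l int
		for _, r := range bs {
			l += utf8.RuneLen(r)
		}
		// Caution: this only gives s the right length. The bytes it points at
		// are the raw 4-byte runes, not the UTF-8 encoding held by s1.
		s := *(*string)(unsafe.Pointer(&reflect.StringHeader{
			Data: (*(*reflect.SliceHeader)(unsafe.Pointer(&bs))).Data,
			Len:  l,
		}))
		if len(s1) != len(s) {
			b.Error("error")
		}
	}
}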
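
To make the size difference between the two representations concrete, here is a small sketch using only the safe conversion. The helper name utf8Len is an assumption made for this example; it mirrors the length loop in the benchmark and additionally handles invalid runes, which string() encodes as the 3-byte replacement character U+FFFD.

```go
package main

import (
	"fmt"
	"unicode/utf8"
	"unsafe"
)

// utf8Len (hypothetical helper) returns the number of bytes that
// string(rs) occupies.
func utf8Len(rs []rune) int {
	n := 0
	for _, r := range rs {
		if l := utf8.RuneLen(r); l > 0 {
			n += l
		} else {
			n += 3 // invalid runes become U+FFFD, which is 3 bytes in UTF-8
		}
	}
	return n
}

func main() {
	rs := []rune("héllo, 世界")
	s := string(rs)

	// 9 runes, 14 UTF-8 bytes, and utf8Len agrees with len(s).
	fmt.Println(len(rs), len(s), utf8Len(rs)) // 9 14 14

	// Each rune occupies 4 bytes in the slice, regardless of its UTF-8 size.
	fmt.Println(unsafe.Sizeof(rs[0])) // 4
}
```

Because every rune takes 4 bytes in the slice but 1 to 4 bytes in the string, the two layouts only happen to have related lengths, never identical contents.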

Analysis of the underlying storage of string and slice

reflect.SliceHeader and reflect.StringHeader

type StringHeader struct {
	Data uintptr
	Len  int
}
type SliceHeader struct {
	Data uintptr
	Len  int
	Cap  int
}

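The two types are nearly identical; SliceHeader just adds a Cap field at the end. Because StringHeader matches the first two fields of SliceHeader, a pointer to a []byte can be reinterpreted directly as a pointer to a string, but not the other way around: a string has no Cap field, so the reinterpreted slice would read its capacity from whatever memory happens to follow the string header.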
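
The size relationship between the two headers can be checked with unsafe.Sizeof. This is a small sketch; headerWords is a hypothetical helper introduced for illustration, and it counts sizes in machine words so the result does not depend on the platform's pointer width.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerWords (hypothetical helper) reports how many machine words a
// string value and a []byte value occupy.
func headerWords() (str, slice int) {
	var s string
	var b []byte
	word := unsafe.Sizeof(uintptr(0))
	return int(unsafe.Sizeof(s) / word), int(unsafe.Sizeof(b) / word)
}

func main() {
	str, slice := headerWords()
	fmt.Println(str)   // 2: data pointer + len
	fmt.Println(slice) // 3: data pointer + len + cap
}
```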

Slice’s underlying storage

type slice struct {
	array unsafe.Pointer
	len   int
	cap   int
}

Take a look at a slice's underlying structure in the compiler's assembly output:

package pkg

// var data = make([]int, 0, 10)
var data = []int{1, 2}

go tool compile -S pkg.go
go.cuinfo.packagename. SDWARFINFO dupok size=0
	0x0000 70 6b 67                                         pkg
"".data SDATA size=24
	0x0000 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00  ................
	0x0010 02 00 00 00 00 00 00 00                          ........
	rel 0+8 t=1 ""..stmp_0+0
""..stmp_0 SNOPTRDATA size=16
	0x0000 01 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00  ................
...

The data symbol has size 24: an 8-byte pointer plus 8 bytes each for len and cap. The slice's elements, the two ints, are stored separately in ""..stmp_0.

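Looking more closely at the bytes of data:

  • The 8 bytes at data+0 are zero in the dump; the relocation entry (rel 0+8) fills them in with the address of ""..stmp_0
  • The 8 bytes at data+8 (02 00 ...) correspond to len
  • The 8 bytes at data+16 (02 00 ...) correspond to cap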

The entire slice struct is laid out contiguously in memory, so we can reinterpret it through a pointer cast, similar to a C++ reinterpret_cast.
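
Such a cast can be sketched with a local struct that mirrors the runtime layout. The type sliceHeader and the function inspect below are illustrative names, not part of any library, and the approach relies on an implementation detail of the standard Go toolchain rather than on anything the language specification guarantees.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sliceHeader is a local, illustrative mirror of the runtime's slice struct.
type sliceHeader struct {
	array unsafe.Pointer
	len   int
	cap   int
}

// inspect reinterprets a *[]int as a *sliceHeader and returns a copy of it.
func inspect(s *[]int) sliceHeader {
	return *(*sliceHeader)(unsafe.Pointer(s))
}

func main() {
	data := make([]int, 2, 10)
	h := inspect(&data)

	fmt.Println(h.len, h.cap)                        // 2 10
	fmt.Println(h.array == unsafe.Pointer(&data[0])) // true
}
```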

The underlying structure of string

package pkg

var testStr = "abc"

go.cuinfo.packagename. SDWARFINFO dupok size=0
	0x0000 70 6b 67                                         pkg
go.string."abc" SRODATA dupok size=3
	0x0000 61 62 63                                         abc
"".testStr SDATA size=16
	0x0000 00 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00  ................
	rel 0+8 t=1 go.string."abc"+0

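This is similar to the slice above, except that the size is now 16: an 8-byte pointer and an 8-byte len, with no cap. The string's bytes live separately in the read-only symbol go.string."abc".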
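
Since a string is just a pointer plus a length, taking a substring does not copy any bytes; it produces a new header pointing into the same data. The sketch below shows this under the assumption of Go 1.20 or later (for unsafe.StringData); dataOffset is a hypothetical helper written for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// dataOffset (hypothetical helper) returns how far sub's data pointer lies
// past s's data pointer. It is only meaningful when sub was sliced from s.
func dataOffset(s, sub string) uintptr {
	base := uintptr(unsafe.Pointer(unsafe.StringData(s)))
	start := uintptr(unsafe.Pointer(unsafe.StringData(sub)))
	return start - base
}

func main() {
	// Build the string at run time so it is an ordinary heap or stack value.
	s := string([]byte("hello, world"))
	sub := s[7:]

	fmt.Println(sub, len(sub))      // world 5
	fmt.Println(dataOffset(s, sub)) // offset of sub's data inside s
}
```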

Fat Pointer

Structures like slice are often called fat pointers in C. For those interested, see "Go Slices are Fat Pointers".

Conclusion

  • This article introduced the conversions between string, []byte, and []rune in Golang, with a simple performance analysis
  • It also examined the underlying storage of slice and string in Golang