One day, while taking a walk after lunch, a colleague asked a question related to Go. It was a question asked by the interviewer in his previous interview, but he still hasn’t found the answer. This quiz is the title of this blog post: How to invert strings with Zero memory allocation in Go?

Inversion strings are easy to handle in Go. We simply convert a string to a byte slice using []byte(string), swap the first byte with the last byte, swap the second byte with the penultimate byte, and so on. After reversing the entire byte slice, we convert the byte slice to a string. The time complexity of the entire inversion operation is O(n). Note that strings containing multi-byte text such as Chinese need to be converted to []rune. To reduce the complexity of processing, this post will only consider inversion of English strings. The relevant codes are as follows:

func main(a) {
	str := "hello,world"
	bytes := reverse([]byte(str))
	println(string(bytes))
}

func reverse(s []byte) []byte {
	for i, j := 0.len(s)- 1; i < j; i, j = i+1, j- 1 {
		s[i], s[j] = s[j], s[i]
	}
	return s
}
Copy the code

This completes the goal of reversing strings, but memory allocation occurs when a string is interconverted with []byte or []rune. Please refer to []byte(string) and String ([]byte) in [] Byte ([]byte) to see why memory is allocated. This is the section. I won’t go into details in this post.

Since the []byte(string) and String ([]byte) methods used in the above processing require memory allocation, is there a conversion method that does not allocate memory?

The answer is yes. Because the underlying type of string and []byte are roughly the same, Pointer type conversions can be performed via the unsafe.Pointer, which is a common way to optimize the interchange of strings and byte slices. Specific implementation can refer to the following:

func bytes2string(b []byte) string{
    return* (*string)(unsafe.Pointer(&b))
}

func string2bytes(s string) []byte {
	return* (* []byte)(unsafe.Pointer(
		&struct {
			string
			Cap int
		}{s, len(s)},
	))
}
Copy the code

Following the reverse string processing above, let’s try using no memory allocation and hit Run online:

func main(a) {
	str := "hello,world"
	fmt.Println("Original string:", str)
	bytes := reverse(string2bytes(str))
	fmt.Println("Reverse string:", bytes2string(reverse(bytes)))
}

func reverse(s []byte) []byte {
	for i, j := 0.len(s)- 1; i < j; i, j = i+1, j- 1 {
		s[i], s[j] = s[j], s[i]
	}
	return s
}

func bytes2string(b []byte) string {
	return* (*string)(unsafe.Pointer(&b))
}

func string2bytes(s string) []byte {
	return* (* []byte)(unsafe.Pointer(
		&struct {
			string
			Cap int
		}{s, len(s)},
	))
}
Copy the code

Running the above code we can see a SEGV memory error similar to the following:

unexpected fault address 0x461f48
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x2 addr=0x461f48 pc=0x454f97]
Copy the code

This is because we operate directly on the character level, whereas the string level is stored in the.rodata section of the process memory layout, which is read-only. When we reverse a character, we write it, so the above segment error is reported at runtime, which is why the string in Go is read-only. The underlying structure of a string is as follows:

It seems impossible to invert strings with zero memory allocation all the way down. All you need to do is change the permissions on the memory in which the underlying content of the above character resides to make it writable. The MProtect system call is provided in Linux and can be used to change read and write permissions on a process memory page. Note that the smallest unit of mProtect operation is the memory page, and the address parameters passed in need to be aligned with page boundaries. The final code is as follows. Click Run Online.

func main(a) {
	str := "hello,world"

	sh := *(*reflect.StringHeader)(unsafe.Pointer(&str))
	page := getPage(uintptr(unsafe.Pointer(sh.Data)))
	syscall.Mprotect(page, syscall.PROT_READ|syscall.PROT_WRITE) // Change the read-only permission of the memory page

	fmt.Println("Original string:", str)
	bytes := (*(*[0xFF]byte)(unsafe.Pointer(sh.Data)))[:len(str)]
	fmt.Println("Reverse string:", bytes2string(reverse(bytes)))
}

func reverse(s []byte) []byte {
	for i, j := 0.len(s)- 1; i < j; i, j = i+1, j- 1 {
		s[i], s[j] = s[j], s[i]
	}
	return s
}

func getPage(p uintptr) []byte {
	return(* (* [0xFFFFFF]byte)(unsafe.Pointer(p & ^uintptr(syscall.Getpagesize()- 1))))[:syscall.Getpagesize()]
}

func bytes2string(b []byte) string {
	return* (*string)(unsafe.Pointer(&b))
}

func string2bytes(s string) []byte {
	return* (* []byte)(unsafe.Pointer(
		&struct {
			string
			Cap int
		}{s, len(s)},
	))
}
Copy the code

Mission accomplished. Note that the use of fixed-size arrays in the code above is not a perfect solution.