The problem
While coding yesterday, I discovered a feature of the Go language that I hadn’t noticed before.
Take a look at the code below:
import "strings"

func someFunc() {
	s := "some words"
	for _, c := range s {
		// Does not compile: c is a rune, but Contains expects a string.
		strings.Contains("another string", c)
	}
}
This code does not compile. The signature of Contains is func Contains(s, substr string) bool, but the type of c is rune.
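One quick way to confirm the mismatch is to print the types of the loop variables. The sketch below is only an illustration; rangeTypes is a hypothetical helper name, not something from the standard library.

```go
package main

import "fmt"

// rangeTypes is a hypothetical helper: it reports the types of the index and
// the value produced by ranging over a string.
func rangeTypes(s string) (indexType, valueType string) {
	for i, c := range s {
		// %T formats the Go type of its argument.
		indexType = fmt.Sprintf("%T", i)
		valueType = fmt.Sprintf("%T", c)
	}
	return indexType, valueType
}

func main() {
	indexType, valueType := rangeTypes("some words")
	fmt.Println(indexType, valueType) // int int32
}
```

The value type shows up as int32 because rune is an alias for int32, which is why it cannot be passed where a string is expected.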
Why
A rune, as anyone who plays fantasy RPGs will know, is a character from an ancient alphabet. In Go, a rune represents a Unicode code point. When you range over a string, Go decodes it as UTF-8: i is the byte index at which each code point starts, and c is the code point itself, as a rune.
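To see that the index counts bytes rather than characters, here is a minimal sketch. The helper name runeOffsets and the sample string are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeOffsets is a hypothetical helper: it collects the byte offset at which
// each code point of s starts, as reported by a range loop.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "h\u00e9llo" // U+00E9 (e with acute accent) takes two bytes in UTF-8

	fmt.Println(len(s))                    // 6 (bytes)
	fmt.Println(utf8.RuneCountInString(s)) // 5 (code points)
	fmt.Println(runeOffsets(s))            // [0 1 3 4 5]
}
```

The offsets jump from 1 to 3 because the second character occupies two bytes.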
Take this code, for example (from the official documentation: golang.org/doc/effecti…):
for pos, char := range "日本\x80語" { // \x80 is an illegal UTF-8 encoding
	fmt.Printf("character %#U starts at byte position %d\n", char, pos)
}
prints
character U+65E5 '日' starts at byte position 0
character U+672C '本' starts at byte position 3
character U+FFFD '�' starts at byte position 6
character U+8A9E '語' starts at byte position 7
The answer
So there are two ways to improve my code.
Method 1
import "strings"

func someFunc() {
	s := "some words"
	for _, c := range s {
		// string(c) yields the UTF-8 encoding of the code point.
		strings.Contains("another string", string(c))
	}
}
Convert c back to a string.
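As a runnable illustration of this conversion approach, the sketch below counts how many characters of one string occur in another. The helper countShared is hypothetical and exists only for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// countShared is a hypothetical helper: it counts how many code points of s
// also occur in other, converting each rune to a string before the lookup.
func countShared(s, other string) int {
	n := 0
	for _, c := range s {
		if strings.Contains(other, string(c)) {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countShared("some words", "another string")) // 7
}
```

Each conversion builds a short new string, which is fine for small inputs like these.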
Method 2
import "strings"

func someFunc() {
	s := "some words"
	for _, c := range s {
		// ContainsRune takes the rune directly, so no conversion is needed.
		strings.ContainsRune("another string", c)
	}
}
Use the ContainsRune function, which accepts a rune directly.
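Here is a small runnable sketch of the ContainsRune approach. The helper firstShared is a hypothetical name used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// firstShared is a hypothetical helper: it returns the first code point of s
// that also occurs in other, or false if there is none.
func firstShared(s, other string) (rune, bool) {
	for _, c := range s {
		if strings.ContainsRune(other, c) {
			return c, true
		}
	}
	return 0, false
}

func main() {
	if c, ok := firstShared("some words", "another string"); ok {
		fmt.Printf("%c\n", c) // s
	}
}
```

If all you need to know is whether the two strings share any character at all, strings.ContainsAny(other, s) answers that in a single call without an explicit loop.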
Conclusion
Programmers working in non-English-speaking countries tend to have more experience with Unicode; I rarely deal with it myself. Still, I like Go's design for []byte, string, and rune, which makes character processing very easy. As a former Python 2 programmer, I feel qualified to say so.
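To make the relationship between the three types concrete, here is a minimal sketch of converting between them. The helper lengths and the sample string are assumptions for illustration.

```go
package main

import "fmt"

// lengths is a hypothetical helper: it returns the number of bytes and the
// number of code points in s.
func lengths(s string) (byteLen, runeLen int) {
	return len([]byte(s)), len([]rune(s))
}

func main() {
	s := "日本語"

	byteLen, runeLen := lengths(s)
	fmt.Println(byteLen, runeLen) // 9 3

	r := []rune(s) // one element per code point
	fmt.Println(string(r[1])) // 本

	b := []byte(s) // the raw UTF-8 bytes
	fmt.Println(string(b[:3])) // 日
}
```

A string and its []byte form always have the same length, while the []rune form has one element per code point.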
space.bilibili.com/16696495
Feel free to follow my WeChat official account and my Bilibili channel!