While writing unit tests, I ran into test failures caused by function inlining, a topic I had never studied carefully. Since the project makes unit tests mandatory, I took the opportunity to study Go's inlining mechanism while improving my tests. This is a summary of what I learned.
What is inlining
Simply put, when a very simple function is called, the compiler expands the body of the called function directly at the call site, which avoids the cost of the function call.
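As a rough illustration, the two loops below compute the same value. The second is written by hand to show what the first effectively becomes once the call has been expanded in place. The names square, sumOfSquares and sumOfSquaresInlined are made up for this sketch, and whether the compiler really inlines square depends on its own heuristics.

```go
package main

import "fmt"

// square is a tiny leaf function, a typical candidate for inlining.
func square(x int) int {
	return x * x
}

// sumOfSquares calls square once per element.
func sumOfSquares(nums []int) int {
	total := 0
	for _, n := range nums {
		total += square(n)
	}
	return total
}

// sumOfSquaresInlined is the hand-written equivalent after inlining:
// the body of square replaces the call, so no call remains in the loop.
func sumOfSquaresInlined(nums []int) int {
	total := 0
	for _, n := range nums {
		total += n * n
	}
	return total
}

func main() {
	nums := []int{1, 2, 3, 4}
	fmt.Println(sumOfSquares(nums), sumOfSquaresInlined(nums)) // prints "30 30"
}
```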
Why inline
Function calls are not free. A call involves three steps: a new stack frame is created and the details of the caller are recorded; the contents of any registers that the called function will use are saved to the stack; and the address of the called function is computed and a jump instruction to that address is executed.
Inlining matters for two reasons. First, it eliminates the overhead of the function call itself. Second, it lets the compiler apply other optimizations more effectively.
In any language, calling a function has a cost. Parameters have to be marshalled into registers or onto the stack (depending on the ABI), and the results have to be retrieved in reverse on return. A call makes the program counter jump from one point in the instruction stream to another, which can stall the pipeline. Functions usually start with a preamble that prepares a new stack frame for execution, and end with a matching epilogue that releases the frame before control returns to the caller.
Function calls in Go carry additional overhead to support dynamic stack growth. On entry, the amount of stack space available to the goroutine is compared with the amount the function requires. If there is not enough, the preamble jumps into runtime logic that grows the stack by copying it to a new, larger location. When the copy is complete, the runtime jumps back to the start of the original function, the stack check is performed again, and the call continues. This is how a goroutine can start with a very small stack and claim more space only when it needs it.
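The effect of on-demand stack growth can be observed with a small sketch. The function depthSum is hypothetical and the depth of 100000 is an arbitrary choice; the point is only that a freshly started goroutine can recurse far deeper than a small initial stack would allow, because the runtime enlarges the stack as the checks described above fail.

```go
package main

import "fmt"

// depthSum recurses n levels deep and returns n+1. The local buffer is
// meant to give every frame some stack usage, so that a deep recursion
// needs much more space than a goroutine starts with.
func depthSum(n int) int {
	var pad [64]byte
	pad[n%len(pad)] = 1
	if n == 0 {
		return int(pad[0])
	}
	return int(pad[n%len(pad)]) + depthSum(n-1)
}

func main() {
	done := make(chan int)
	// Run the recursion on a new goroutine, which starts with a small stack.
	go func() { done <- depthSum(100000) }()
	fmt.Println(<-done) // prints 100001
}
```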
The check is cheap, only a few instructions, and because goroutine stacks grow geometrically, it rarely fails. The branch prediction unit of a modern processor can therefore hide the cost of the check by assuming it will succeed. When the processor does mispredict the stack check and has to discard speculatively executed work, the cost of the pipeline stall is small compared with the work the runtime must do to grow the goroutine's stack.
Although modern processors use speculative execution to reduce the cost of both the generic and the Go-specific parts of each function call, that overhead cannot be eliminated entirely, so every call pays a performance cost on top of the useful work it performs. Because the overhead of a call is fixed, small functions pay proportionally more than large ones: they do less useful work per call.
The only way to remove this overhead is to remove the function call itself, which is what the Go compiler does in some cases by replacing the call with the body of the called function. This is called inlining, because the function body is expanded in line at the call site.
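A rough way to get a feel for the per-call overhead is to time the same loop with an inlinable helper and with one that is marked so the compiler leaves it as a real call. This is only a sketch: the function names and the iteration count are arbitrary, the `//go:noinline` directive used here is covered later in the article, and the measured times depend heavily on the machine and Go version, so no numbers are claimed.

```go
package main

import (
	"fmt"
	"time"
)

// addInline is a candidate for inlining.
func addInline(a, b int) int { return a + b }

// addNoInline is identical, but the directive asks the compiler to keep
// it as a real function call.
//
//go:noinline
func addNoInline(a, b int) int { return a + b }

// loopInline sums 0..n-1 through the inlinable helper.
func loopInline(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum = addInline(sum, i)
	}
	return sum
}

// loopNoInline sums 0..n-1 through the helper that is never inlined.
func loopNoInline(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum = addNoInline(sum, i)
	}
	return sum
}

func main() {
	const n = 50000000

	start := time.Now()
	a := loopInline(n)
	inlineTime := time.Since(start)

	start = time.Now()
	b := loopNoInline(n)
	noInlineTime := time.Since(start)

	fmt.Println("same result:", a == b)
	fmt.Println("inlinable:", inlineTime, "noinline:", noInlineTime)
}
```

For serious measurements, a benchmark written with the testing package is a better tool than wall-clock timing in main.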
Improved optimization opportunities
Dr. Cliff Click describes inlining as a fundamental optimization in modern compilers, because it forms the basis for optimizations such as constant propagation and dead code elimination. Inlining lets the compiler see further: it can examine the body of a function in the context of a particular call and find logic that can be simplified or removed altogether. Because inlining can be applied recursively, these optimizations are available not only within each individual function but across an entire chain of calls.
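A small sketch of how this can play out, using made-up names: once debugEnabled is inlined into scale, the condition is the constant false, so the whole if block is dead code and scale can shrink to the form shown in scaleAfterInlining. Whether a given compiler version performs exactly these steps is an assumption here; the hand-written second function only shows the intended end result.

```go
package main

import "fmt"

// debugEnabled is a hypothetical helper that always returns false.
func debugEnabled() bool { return false }

// scale doubles x and logs first when debugging is enabled.
func scale(x int) int {
	if debugEnabled() {
		fmt.Println("scaling", x)
	}
	return x * 2
}

// scaleAfterInlining is what scale can reduce to: inlining debugEnabled
// turns the condition into the constant false, and dead code elimination
// then removes the unreachable logging branch.
func scaleAfterInlining(x int) int {
	return x * 2
}

func main() {
	fmt.Println(scale(21), scaleAfterInlining(21)) // prints "42 42"
}
```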
One downside of inlining is that the resulting executable becomes larger, so the amount of code being inlined, the number of call sites, and the overhead of maintaining the inlining information all have to be taken into account.
How to disable inlining
You may want to disable inlining when writing unit tests. For an individual function, add the //go:noinline directive above it; note that there must be no space between // and go. To disable inlining globally, add the build option -gcflags=-l. Be aware that passing the flag twice, as in -gcflags="-l -l", does not disable inlining; it turns inlining back on with a more aggressive strategy.
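The spelling of the directive is easy to get wrong, so here is a minimal sketch with two made-up functions. Only double carries a valid directive; the line above triple contains a space after the slashes, which makes it an ordinary comment that the compiler ignores.

```go
package main

import "fmt"

// double is never inlined: the directive has no space after the slashes
// and sits directly above the function declaration.
//
//go:noinline
func double(x int) int { return x * 2 }

// triple is NOT protected: with a space after the slashes, the line below
// is a plain comment, so the compiler remains free to inline triple.
//
// go:noinline
func triple(x int) int { return x * 3 }

func main() {
	fmt.Println(double(2), triple(2)) // prints "4 6"
}
```

Building this file with the -m flag described in the next section is one way to check which of the two functions the compiler still considers inlinable.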
How do I view compiler optimizations?
The compiler decides automatically whether a function is inlined. Running go build -gcflags="-m -m" ./... shows the inlining decisions for the whole project.
Consider this code:
package main

func add(a, b int) int {
    return a + b
}

func iter(num int) int {
    res := 1
    for i := 1; i <= num; i++ {
        res = add(res, i)
    }
    return res
}

func main() {
    n := 100
    _ = iter(n)
}
Running go build -gcflags="-m -m" main.go prints the following information, where cost is the number of nodes, as described below.
./main.go:3:6: can inline add with cost 4 as: func(int, int) int { return a + b }
./main.go:7:6: cannot inline iter: unhandled op FOR
./main.go:10:12: inlining call to add func(int, int) int { return a + b }
./main.go:15:6: can inline main with cost 67 as: func() { n := 100; _ = iter(n) }
Which functions are not inlined?
In the compiler version examined here, functions that contain closure calls, defer, go, select, for, and similar constructs are not inlined. Beyond these restrictions, the compiler counts the nodes of the function's AST; if the count exceeds 80, calls to the function are not inlined.
Each node consumes one unit of the budget. For example, the statement a = a + 1 contains five nodes: AS, NAME, ADD, NAME, LITERAL.
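The five-node count can be reproduced by analogy with the standard go/ast package. This is only an approximation: go/ast is not the compiler's internal representation, and its node types (AssignStmt, Ident, BinaryExpr, Ident, BasicLit) merely happen to line up with AS, NAME, ADD, NAME, LITERAL for this statement. The helper names countNodes and assignNodeCount are made up for the sketch.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// countNodes returns the number of go/ast nodes in the subtree rooted at root.
func countNodes(root ast.Node) int {
	count := 0
	ast.Inspect(root, func(n ast.Node) bool {
		if n != nil { // Inspect also calls the callback with nil after each subtree
			count++
		}
		return true
	})
	return count
}

// assignNodeCount parses src and counts the nodes of its first assignment
// statement. It returns -1 if src contains no assignment.
func assignNodeCount(src string) (int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return 0, err
	}
	total := -1
	ast.Inspect(file, func(n ast.Node) bool {
		if assign, ok := n.(*ast.AssignStmt); ok && total < 0 {
			total = countNodes(assign)
			return false
		}
		return true
	})
	return total, nil
}

func main() {
	src := "package p\n\nfunc f(a int) int {\n\ta = a + 1\n\treturn a\n}\n"
	n, err := assignNodeCount(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(n) // prints 5
}
```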
The file src/cmd/compile/internal/gc/inl.go lists the operations that currently prevent a function from being inlined:
case OCLOSURE,
    OCALLPART,
    ORANGE,
    OFOR,
    OFORUNTIL,
    OSELECT,
    OTYPESW,
    OGO,
    ODEFER,
    ODCLTYPE, // can't print yet
    OBREAK,
    ORETJMP:
    v.reason = "unhandled op " + n.Op.String()
    return true
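To experiment with these rules, a small file such as the sketch below can be built with -gcflags="-m -m". All function names are made up. The comments map each function to an entry in the list above, but which of them a particular Go release actually refuses to inline is an assumption that has to be checked against that release's own output.

```go
package main

import "fmt"

// small is a plain leaf function and a typical inlining candidate.
func small(a, b int) int { return a + b }

// withDefer contains a defer statement (ODEFER in the list above).
// The deferred closure increments the named result after it has been set.
func withDefer(a, b int) (sum int) {
	defer func() { sum++ }()
	return a + b
}

// withGo starts a goroutine (OGO in the list above).
func withGo(a, b int) int {
	ch := make(chan int, 1)
	go func() { ch <- a + b }()
	return <-ch
}

// withSelect contains a select statement (OSELECT in the list above).
func withSelect(ch chan int) int {
	select {
	case v := <-ch:
		return v
	default:
		return -1
	}
}

func main() {
	ch := make(chan int, 1)
	ch <- 7
	fmt.Println(small(1, 2), withDefer(1, 2), withGo(1, 2), withSelect(ch)) // prints "3 4 3 7"
}
```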
Is the call stack still printed correctly after a panic?
For example:
package main

func sub(a, b int) {
    a = a - b
    panic("i am a panic information")
}

func max(a, b int) int {
    if a < b {
        sub(a, b)
    }
    return a
}

func main() {
    x, y := 1, 2
    _ = max(x, y)
}
Running the code still prints a stack trace that pinpoints where the error occurred, even though the calls were inlined:
panic: i am a panic information

goroutine 1 [running]:
main.sub(...)
    /Users/slp/go/src/workspace/example/main.go:5
main.max(...)
    /Users/slp/go/src/workspace/example/main.go:10
main.main()
    /Users/slp/go/src/workspace/example/main.go:17 +0x3a
This works because Go maintains an inlining tree for every function that has calls inlined into it. The tree can be viewed by running go build -gcflags="-d pctab=pctoinline" main.go.
funcpctab "".sub [valfunc=pctoinline]
...
wrote 3 bytes to 0xc000082668
00 42 00
funcpctab "".max [valfunc=pctoinline]
...
wrote 7 bytes to 0xc000082f68
00 3c 02 1d 01 09 00
-- inlining tree for "".max:
0 | -1 | "".sub (/Users/slp/go/src/workspace/example/main.go:10:6) pc=59
--
funcpctab "".main [valfunc=pctoinline]
...
wrote 11 bytes to 0xc0004807e8
00 1d 02 01 01 07 04 16 03 0c 00
-- inlining tree for "".main:
0 | -1 | "".max (/Users/slp/go/src/workspace/example/main.go:17:9) pc=30
1 | 0 | "".sub (/Users/slp/go/src/workspace/example/main.go:10:6) pc=29
--
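The same information is available to programs at run time. The sketch below, with made-up function names, collects the current call stack through runtime.Callers and runtime.CallersFrames; the latter is documented to account for inlined functions, so leaf and middle are reported even if the compiler has folded them into their callers. Whether they are in fact inlined is an assumption that depends on the Go version.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// stackNames returns the function names on the calling goroutine's stack,
// starting with stackNames itself.
func stackNames() []string {
	pcs := make([]uintptr, 32)
	n := runtime.Callers(1, pcs) // skip=1 omits the frame of runtime.Callers
	frames := runtime.CallersFrames(pcs[:n])
	var names []string
	for {
		frame, more := frames.Next()
		names = append(names, frame.Function)
		if !more {
			break
		}
	}
	return names
}

// leaf and middle are tiny wrappers that the compiler is likely to inline.
func leaf() []string   { return stackNames() }
func middle() []string { return leaf() }

// hasFunc reports whether any entry in names ends with "." followed by fn.
func hasFunc(names []string, fn string) bool {
	for _, name := range names {
		if strings.HasSuffix(name, "."+fn) {
			return true
		}
	}
	return false
}

func main() {
	names := middle()
	fmt.Println(hasFunc(names, "leaf"), hasFunc(names, "middle")) // prints "true true"
}
```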
References
- Inline optimization in Go
- Dissecting 5 features that make the Go language efficient (2/5): Function calls are not free
- Go inline optimization in detail