Since its release, Go has been known for high performance and strong concurrency support. Because the standard library ships with HTTP packages, even novice programmers can easily write HTTP services.
However, every coin has two sides. A language with strengths to be proud of also hides plenty of pitfalls, and beginners who don't know about them can easily fall in. This series of blog posts starts with panic and recover in the Go language and introduces the various pitfalls the author has stepped into, and how to fill them.
Getting to know panic and recover
panic
In English, the word panic means sudden fear. In Go, it stands for a very serious problem, the kind programmers fear most: once it occurs, the program terminates and exits. The built-in panic function is mainly used to actively throw an exception, similar to the throw keyword in Java and other languages.
recover
In English, the word recover means to restore. In Go, it stands for bringing a program back from a serious error to a normal state. The built-in recover function is mainly used to catch an exception and return the program to a normal state, similar to try…catch in Java and other languages.
The author has 6 years of experience developing in C on Linux. C has no concept of exception catching: there is no try…catch, and there is no panic or recover. However, the difference between exceptions and the "if error then return" approach lies mainly in how far up the function call stack control travels. The diagram below illustrates this:
In normal logic, the call stack unwinds one frame at a time, while exception catching can be understood as a long jump across the call stack. In C this is done with the setjmp and longjmp functions. For example:
#include <setjmp.h>
#include <stdio.h>

static jmp_buf env;

double divide(double to, double by)
{
    if (by == 0)
    {
        longjmp(env, 1); /* jump back to the setjmp call in main */
    }
    return to / by;
}

void test_divide(void)
{
    divide(2.0, 0.0);
    printf("done\n");
}

int main(void)
{
    if (setjmp(env) == 0)
    {
        test_divide();
    }
    else
    {
        printf("Cannot / 0\n");
        return -1;
    }
    return 0;
}
Because of the long jump, the normal execution flow is interrupted and control jumps directly from divide back to main. Compiled and run, the code above prints Cannot / 0 instead of done. Isn't that amazing?
Mechanisms such as try…catch, recover and setjmp save the current state of the program (mainly the CPU's stack pointer register SP and program counter PC; Go's recover relies on defer to maintain the SP and PC) in memory shared with throw, panic and longjmp. When an exception occurs, the saved SP and PC values are read back from that memory, the function stack is moved directly back to the position SP points to, and execution continues at the instruction PC points to, which restores the program from the abnormal state to a normal one.
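The same control flow can be sketched in Go with panic and recover. This is only an illustrative translation of the C example: the names errDivideByZero, testDivide and run are made up for this sketch, and run plays the role that setjmp plays in main above.

```go
package main

import (
	"errors"
	"fmt"
)

// errDivideByZero is a hypothetical sentinel used only in this sketch.
var errDivideByZero = errors.New("cannot divide by 0")

// divide panics instead of returning an error, mirroring longjmp in the C version.
func divide(to, by float64) float64 {
	if by == 0 {
		panic(errDivideByZero)
	}
	return to / by
}

// testDivide never reaches its Println when divide panics.
func testDivide() {
	divide(2.0, 0.0)
	fmt.Println("done")
}

// run is where control lands after the panic: its deferred function
// calls recover and turns the panic value into an ordinary error.
func run() (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	testDivide()
	return nil
}

func main() {
	fmt.Println(run())
}
```

As in the C version, "done" is never printed: control leaves divide and testDivide without running the rest of their bodies.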
A deeper look at panic and recover
The source code
The source code for panic and recover is in src/runtime/panic.go in the Go source tree, under the names gopanic and gorecover.
// gopanic code, src/runtime/panic.go line 454
// The implementation of the predeclared function panic.
func gopanic(e interface{}) {
	gp := getg()
	if gp.m.curg != gp {
		print("panic: ")
		printany(e)
		print("\n")
		throw("panic on system stack")
	}
	if gp.m.mallocing != 0 {
		print("panic: ")
		printany(e)
		print("\n")
		throw("panic during malloc")
	}
	if gp.m.preemptoff != "" {
		print("panic: ")
		printany(e)
		print("\n")
		print("preempt off reason: ")
		print(gp.m.preemptoff)
		print("\n")
		throw("panic during preemptoff")
	}
	if gp.m.locks != 0 {
		print("panic: ")
		printany(e)
		print("\n")
		throw("panic holding locks")
	}
	var p _panic
	p.arg = e
	p.link = gp._panic
	gp._panic = (*_panic)(noescape(unsafe.Pointer(&p)))
	atomic.Xadd(&runningPanicDefers, 1)
	for {
		d := gp._defer
		if d == nil {
			break
		}
		// If this defer was started by an earlier panic or Goexit (and a new panic was triggered while it ran), remove it from the list. The earlier panic or Goexit will not continue.
		if d.started {
			if d._panic != nil {
				d._panic.aborted = true
			}
			d._panic = nil
			d.fn = nil
			gp._defer = d.link
			freedefer(d)
			continue
		}
		// Mark the defer as started, but keep it on the list, so that traceback can find and update the defer's argument frame if stack growth or garbage collection happens before reflectcall starts executing d.fn.
		d.started = true
		// Record the panic that is running the defer. If a new panic is triggered inside the deferred function, it will find d in the list and mark d._panic as aborted.
		d._panic = (*_panic)(noescape(unsafe.Pointer(&p)))
		p.argp = unsafe.Pointer(getargp(0))
		reflectcall(nil, unsafe.Pointer(d.fn), deferArgs(d), uint32(d.siz), uint32(d.siz))
		p.argp = nil
		// reflectcall did not panic. Remove d.
		if gp._defer != d {
			throw("bad defer entry in panic")
		}
		d._panic = nil
		d.fn = nil
		gp._defer = d.link
		// GC() here triggers stack shrinking to test stack copying. It is test code, so it is commented out. See stack_test.go:TestStackPanic.
		//GC()
		pc := d.pc
		sp := unsafe.Pointer(d.sp) // must be a pointer so it is adjusted during stack copy
		// The defer record is allocated dynamically and must be freed after it runs. So if defers never get executed (for example, defers keep being created in an infinite loop), memory leaks.
		freedefer(d)
		if p.recovered {
			atomic.Xadd(&runningPanicDefers, -1)
			gp._panic = p.link
			// Aborted panics are marked but remain on the g._panic list. Remove them from the list.
			for gp._panic != nil && gp._panic.aborted {
				gp._panic = gp._panic.link
			}
			if gp._panic == nil { // must be done with signal
				gp.sig = 0
			}
			// Pass information about the recovering frame to recovery.
			gp.sigcode0 = uintptr(sp)
			gp.sigcode1 = pc
			mcall(recovery)
			throw("recovery failed") // mcall should not return
		}
	}
	// Reaching this point means every defer has been run and none of them recovered (as noted above, mcall(recovery) does not return), so continue with the rest of the panic process, such as printing the call stack and the error message.
	// Because it is unsafe to call arbitrary user code after freezing the world, we call preprintpanics to invoke all necessary Error and String methods to prepare the panic strings before startpanic.
	preprintpanics(gp._panic)
	fatalpanic(gp._panic) // should not return
	*(*int)(nil) = 0      // fatalpanic should not return, so this is normally not reached; if it is, this line crashes the program
}
// gorecover code, src/runtime/panic.go line 585
// The implementation of the predeclared function recover.
// Cannot split the stack because it needs to reliably find its caller's stack segment.
//
// TODO(rsc): Once we commit to CopyStackAlways,
// this doesn't need to be nosplit.
//go:nosplit
func gorecover(argp uintptr) interface{} {
	// While a panic is being handled, the call to recover must be made from the topmost function of a deferred call.
	// p.argp is the argument pointer of that topmost deferred function call. It is compared with the argp passed by the caller; if they match, the caller is allowed to recover.
	gp := getg()
	p := gp._panic
	if p != nil && !p.recovered && argp == uintptr(p.argp) {
		p.recovered = true
		return p.arg
	}
	return nil
}
From the function code, we can see that the main internal flow of panic looks like this:
- Get the g, that is, the goroutine, that the current caller is running on.
- Traverse and execute the defer functions in g.
- If a defer function calls recover and finds that a panic has occurred, the panic is marked as recovered.
- While traversing the defers, if the panic is found to have been marked as recovered, extract the sp and pc of the defer and save them in the two status-code fields of g.
- Call runtime.mcall to switch to m->g0 and jump to the recovery function, passing the previously obtained g as the argument to recovery.
The code for runtime.mcall is in src/runtime/asm_xxx.s in the Go source, where xxx is the platform type, such as amd64. The code is as follows:
// src/runtime/asm_amd64.s line 274
// func mcall(fn func(*g))
// Switch to m->g0's stack, call fn(g).
// Fn must never return. It should gogo(&g->sched)
// to keep running g.
TEXT runtime·mcall(SB), NOSPLIT, $0-8
	MOVQ	fn+0(FP), DI
	get_tls(CX)
	MOVQ	g(CX), AX	// save state in g->sched
	MOVQ	0(SP), BX	// caller's PC
	MOVQ	BX, (g_sched+gobuf_pc)(AX)
	LEAQ	fn+0(FP), BX	// caller's SP
	MOVQ	BX, (g_sched+gobuf_sp)(AX)
	MOVQ	AX, (g_sched+gobuf_g)(AX)
	MOVQ	BP, (g_sched+gobuf_bp)(AX)
	// switch to m->g0 & its stack, call fn
	MOVQ	g(CX), BX
	MOVQ	g_m(BX), BX
	MOVQ	m_g0(BX), SI
	CMPQ	SI, AX	// if g == m->g0 call badmcall
	JNE	3(PC)
	MOVQ	$runtime·badmcall(SB), AX
	JMP	AX
	MOVQ	SI, g(CX)	// g = m->g0
	MOVQ	(g_sched+gobuf_sp)(SI), SP	// sp = m->g0->sched.sp
	PUSHQ	AX
	MOVQ	DI, DX
	MOVQ	0(DI), DI
	CALL	DI
	POPQ	AX
	MOVQ	$runtime·badmcall2(SB), AX
	JMP	AX
	RET
mcall saves the state of the current g in g->sched, switches to m->g0 and its stack, and calls the function it was given there.
In the recovery function, the two status codes stored in g are used to roll back the stack pointer sp and restore the program counter pc into the scheduler state, and gogo is called to reschedule g. This returns g to the position of the defer whose function called recover, and the goroutine continues execution. The code is as follows:
// recovery code, src/runtime/panic.go line 637
// After a panic, when a deferred function calls recover, unwind the stack and continue running as if the caller of the deferred function had returned normally.
func recovery(gp *g) {
	// Info about defer passed in G struct.
	sp := gp.sigcode0
	pc := gp.sigcode1
	// The arguments of the deferred function must already be on the stack.
	if sp != 0 && (sp < gp.stack.lo || gp.stack.hi < sp) {
		print("recover: ", hex(sp), " not in [", hex(gp.stack.lo), ",", hex(gp.stack.hi), "]\n")
		throw("bad recovery")
	}
	// Make the deferproc of this deferred function return again, this time returning 1. The calling function then jumps to the standard return epilogue.
	gp.sched.sp = sp
	gp.sched.pc = pc
	gp.sched.lr = 0
	gp.sched.ret = 1
	gogo(&gp.sched)
}
// src/runtime/asm_amd64.s
// func gogo(buf *gobuf)
// Restore state from gobuf; longjmp
TEXT runtime·gogo(SB), NOSPLIT, $16-8
	MOVQ	buf+0(FP), BX	// gobuf
	MOVQ	gobuf_g(BX), DX
	MOVQ	0(DX), CX	// make sure g != nil
	get_tls(CX)
	MOVQ	DX, g(CX)
	MOVQ	gobuf_sp(BX), SP	// restore SP from gobuf for the jump below
	MOVQ	gobuf_ret(BX), AX
	MOVQ	gobuf_ctxt(BX), DX
	MOVQ	gobuf_bp(BX), BP
	MOVQ	$0, gobuf_sp(BX)	// clear gobuf here to help the garbage collector
	MOVQ	$0, gobuf_ret(BX)
	MOVQ	$0, gobuf_ctxt(BX)
	MOVQ	$0, gobuf_bp(BX)
	MOVQ	gobuf_pc(BX), BX	// restore PC from gobuf for the jump
	JMP	BX
The above is Go's low-level exception handling flow, which can be simplified into three steps:
- A defer function calls recover.
- A panic is triggered, execution switches to the runtime environment, and the runtime obtains the sp and pc of the g whose defer called recover.
- Execution returns to the processing logic that follows recover in the defer.
What are the pitfalls
As mentioned earlier, the panic function is mainly used to actively trigger an exception. In business code, if resource initialization fails during program startup, we can call panic ourselves to end the program immediately. For beginners, this is fine and easy to do.
However, reality can be harsh: Go's runtime code itself calls panic in many places, which leaves plenty of pits for newcomers who don't know Go's underlying implementation. It is impossible to write robust Go code without being familiar with these pitfalls.
Next, the author goes through these pitfalls one by one.
- Slice index out of bounds
This one is easy to understand. In a statically typed language, indexing an array out of bounds is a fatal error. The following code verifies it:
package main

import (
	"fmt"
)

func foo() {
	defer func() {
		if err := recover(); err != nil {
			fmt.Println(err)
		}
	}()
	var bar = []int{1}
	fmt.Println(bar[1])
}

func main() {
	foo()
	fmt.Println("exit")
}
Output:
runtime error: index out of range
exit
Because recover is used in the code, the program is restored and exit is printed.
If you comment out the recover lines, the following log will be printed:
panic: runtime error: index out of range
goroutine 1 [running]:
main.foo()
/home/letian/work/go/src/test/test.go:14 +0x3e
main.main()
/home/letian/work/go/src/test/test.go:18 +0x22
exit status 2
- Accessing an uninitialized or nil pointer
This should make sense to anyone with C/C++ development experience, but it is the most common type of error for beginners who have never used pointers before. The following code verifies it:
package main

import (
	"fmt"
)

func foo() {
	defer func() {
		if err := recover(); err != nil {
			fmt.Println(err)
		}
	}()
	var bar *int
	fmt.Println(*bar)
}

func main() {
	foo()
	fmt.Println("exit")
}
Output:
runtime error: invalid memory address or nil pointer dereference
exit
If you comment out the recover lines, it will print:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4869ff]
goroutine 1 [running]:
main.foo()
/home/letian/work/go/src/test/test.go:14 +0x3f
main.main()
/home/letian/work/go/src/test/test.go:18 +0x22
exit status 2
- Sending data to a chan that has already been closed
This is a mistake made by beginners who are just learning to use chan. The following code verifies it:
package main

import (
	"fmt"
)

func foo() {
	defer func() {
		if err := recover(); err != nil {
			fmt.Println(err)
		}
	}()
	var bar = make(chan int, 1)
	close(bar)
	bar <- 1
}

func main() {
	foo()
	fmt.Println("exit")
}
Output:
send on closed channel
exit
If recover is commented out, it prints:
panic: send on closed channel
goroutine 1 [running]:
main.foo()
/home/letian/work/go/src/test/test.go:15 +0x83
main.main()
/home/letian/work/go/src/test/test.go:19 +0x22
exit status 2
The corresponding logic can be seen in the chansend function in src/runtime/chan.go:
// src/runtime/chan.go line 269
// If block is not nil, the protocol will not sleep but returns if it cannot complete.
// Sleep can wake up with g.param == nil when a channel involved in the sleep has been closed.
// It is easiest to loop and re-run the operation; we'll see that it is now closed.
func chansend(c *hchan, ep unsafe.Pointer, block bool, callerpc uintptr) bool {
	if c == nil {
		if !block {
			return false
		}
		gopark(nil, nil, waitReasonChanSendNilChan, traceEvGoStop, 2)
		throw("unreachable")
	}
	if debugChan {
		print("chansend: chan=", c, "\n")
	}
	if raceenabled {
		racereadpc(c.raceaddr(), callerpc, funcPC(chansend))
	}
	// Fast path: check for failed non-blocking operation without acquiring the lock.
	//
	// After observing that the channel is not closed, we observe that the channel is
	// not ready for sending. Each of these observations is a single word-sized read
	// (first c.closed and second c.recvq.first or c.qcount depending on kind of channel).
	// Because a closed channel cannot transition from 'ready for sending' to
	// 'not ready for sending', even if the channel is closed between the two observations,
	// they imply a moment between the two when the channel was both not yet closed
	// and not ready for sending. We behave as if we observed the channel at that moment,
	// and report that the send cannot proceed.
	//
	// It is okay if the reads are reordered here: if we observe that the channel is not
	// ready for sending and then observe that it is not closed, that implies that the
	// channel wasn't closed during the first observation.
	if !block && c.closed == 0 && ((c.dataqsiz == 0 && c.recvq.first == nil) ||
		(c.dataqsiz > 0 && c.qcount == c.dataqsiz)) {
		return false
	}
	var t0 int64
	if blockprofilerate > 0 {
		t0 = cputicks()
	}
	lock(&c.lock)
	if c.closed != 0 {
		unlock(&c.lock)
		panic(plainError("send on closed channel"))
	}
	if sg := c.recvq.dequeue(); sg != nil {
		// Found a waiting receiver. We pass the value we want to send
		// directly to the receiver, bypassing the channel buffer (if any).
		send(c, sg, ep, func() { unlock(&c.lock) }, 3)
		return true
	}
	if c.qcount < c.dataqsiz {
		// Space is available in the channel buffer. Enqueue the element to send.
		qp := chanbuf(c, c.sendx)
		if raceenabled {
			raceacquire(qp)
			racerelease(qp)
		}
		typedmemmove(c.elemtype, qp, ep)
		c.sendx++
		if c.sendx == c.dataqsiz {
			c.sendx = 0
		}
		c.qcount++
		unlock(&c.lock)
		return true
	}
	if !block {
		unlock(&c.lock)
		return false
	}
	// Block on the channel. Some receiver will complete our operation for us.
	gp := getg()
	mysg := acquireSudog()
	mysg.releasetime = 0
	if t0 != 0 {
		mysg.releasetime = -1
	}
	// No stack splits between assigning elem and enqueuing mysg
	// on gp.waiting where copystack can find it.
	mysg.elem = ep
	mysg.waitlink = nil
	mysg.g = gp
	mysg.isSelect = false
	mysg.c = c
	gp.waiting = mysg
	gp.param = nil
	c.sendq.enqueue(mysg)
	goparkunlock(&c.lock, waitReasonChanSend, traceEvGoBlockSend, 3)
	// Ensure the value being sent is kept alive until the
	// receiver copies it out. The sudog has a pointer to the
	// stack object, but sudogs aren't considered as roots of the
	// stack tracer.
	KeepAlive(ep)
	// someone woke us up.
	if mysg != gp.waiting {
		throw("G waiting list is corrupted")
	}
	gp.waiting = nil
	if gp.param == nil {
		if c.closed == 0 {
			throw("chansend: spurious wakeup")
		}
		panic(plainError("send on closed channel"))
	}
	gp.param = nil
	if mysg.releasetime > 0 {
		blockevent(mysg.releasetime-t0, 2)
	}
	mysg.c = nil
	releaseSudog(mysg)
	return true
}
- Reading and writing the same map concurrently
Those who have just learned concurrent programming also easily run into the problem of reading and writing a map concurrently. The following code verifies it:
package main

import (
	"fmt"
)

func foo() {
	defer func() {
		if err := recover(); err != nil {
			fmt.Println(err)
		}
	}()
	var bar = make(map[int]int)
	go func() {
		defer func() {
			if err := recover(); err != nil {
				fmt.Println(err)
			}
		}()
		for {
			_ = bar[1]
		}
	}()
	for {
		bar[1] = 1
	}
}

func main() {
	foo()
	fmt.Println("exit")
}
Output:
fatal error: concurrent map read and map write
goroutine 5 [running]:
runtime.throw(0x4bd8b0, 0x21)
/home/letian/.gvm/gos/go1.12/src/runtime/panic.go:617 +0x72 fp=0xc00004c780 sp=0xc00004c750 pc=0x427f22
runtime.mapaccess1_fast64(0x49eaa0, 0xc000088180, 0x1, 0xc0000260d8)
/home/letian/.gvm/gos/go1.12/src/runtime/map_fast64.go:21 +0x1a8 fp=0xc00004c7a8 sp=0xc00004c780 pc=0x40eb58
main.foo.func2(0xc000088180)
/home/letian/work/go/src/test/test.go:21 +0x5c fp=0xc00004c7d8 sp=0xc00004c7a8 pc=0x48708c
runtime.goexit()
/home/letian/.gvm/gos/go1.12/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc00004c7e0 sp=0xc00004c7d8 pc=0x450e51
created by main.foo
/home/letian/work/go/src/test/test.go:14 +0x68
goroutine 1 [runnable]:
main.foo()
/home/letian/work/go/src/test/test.go:25 +0x8b
main.main()
/home/letian/work/go/src/test/test.go:30 +0x22
exit status 2
If you look carefully, you will notice that the exit we print at the end of the program does not appear in the output; the call stack is printed directly instead. Look at the code in src/runtime/map.go and you will find these lines:
if h.flags&hashWriting != 0 {
	throw("concurrent map read and map write")
}
Unlike the previous cases, a fatal error raised by the runtime's throw function cannot be caught by recover in business code, which makes it the most deadly of all. So wherever a map is read and written concurrently, the map should be protected by a lock.
- Type assertions
Using a type assertion to convert an interface is another place where it is easy to step into a pit by accident, and even people who have used interface for a while tend to overlook it. The following code verifies it:
package main

import (
	"fmt"
)

func foo() {
	defer func() {
		if err := recover(); err != nil {
			fmt.Println(err)
		}
	}()
	var i interface{} = "abc"
	_ = i.([]string)
}

func main() {
	foo()
	fmt.Println("exit")
}
Output:
interface conversion: interface {} is string, not []string
exit
The corresponding code is in src/runtime/iface.go:
// panicdottypeE is called when doing an e.(T) conversion and the conversion fails.
// have = the dynamic type we have.
// want = the static type we're trying to convert to.
// iface = the static type we're converting from.
func panicdottypeE(have, want, iface *_type) {
	panic(&TypeAssertionError{iface, have, want, ""})
}

// panicdottypeI is called when doing an i.(T) conversion and the conversion fails.
// Same args as panicdottypeE, but "have" is the dynamic itab we have.
func panicdottypeI(have *itab, want, iface *_type) {
	var t *_type
	if have != nil {
		t = have._type
	}
	panicdottypeE(t, want, iface)
}
Even more panics
There are many more places in the Go standard library that use panic. You can search for panic in the source code to find them.
Due to limited space, this article does not cover how to fill these pits. Thanks for reading!
Coming up next
Channel and Goroutine
Recommended reading
How to build a flash-sale system that handles tens of millions of requests with Go