Search [brain into fried fish] on WeChat to follow this fried fish. This article is archived on GitHub at github.com/eddycjy/blo…, where you will also find my series of articles, materials, and open-source Go books.

Hello, I’m fried fish.

A few days ago we talked about what happens when you start two Goroutines on a single-core CPU and one of them loops forever. In that piece we touched on one detail in passing.

Quite a few new friends then asked: how exactly is P preempted in Go, and how does that work?

In today’s article we are going to decipher P preemption.

History of the scheduler

In the early days of Go, Goroutines were not designed to be preemptible: a scheduling switch was only triggered by read/write, an active yield, locking, and so on.

The serious problem with this is that if a Goroutine keeps running without yielding while the garbage collector needs to stop the world (STW), the garbage collector can only wait for it.

Preemptive scheduling is needed to solve this: if a Goroutine runs for too long, it must be preempted.

The Go language began introducing a preemptive scheduler in Go1.2 and has kept evolving it to this day:

  • Go0.x: single-threaded scheduler.
  • Go1.0: multi-threaded scheduler.
  • Go1.1: work-stealing scheduler.
  • Go1.2-Go1.13: cooperative preemptive scheduler.
  • Go1.14: signal-based preemptive scheduler.

Dmitry Vyukov also proposed a new scheduler design for Go, the NUMA-aware (Non-Uniform Memory Access) scheduler, but that proposal was shelved because the implementation was too complex and its priority was not high enough.

Why preempt P

Why preempt P? To put it bluntly, if P is never preempted, some Goroutines will never get a chance to run, or resources end up distributed unevenly among them. Either outcome is clearly unreasonable in a scheduler design.

As in this example:

package main

import (
	"fmt"
	"runtime"
	"time"
)

// Main Goroutine
func main() {
	// Simulate a single-core CPU
	runtime.GOMAXPROCS(1)

	// Simulate a Goroutine stuck in an endless loop
	go func() {
		for {
		}
	}()

	time.Sleep(time.Millisecond)
	fmt.Println("Brain goes into fried fish.")
}

In older versions of Go, this example would block forever and the final print would never see the light of day; this is exactly the kind of scenario preemption addresses. (With Go1.14's signal-based preemption, the program exits normally.)
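Before Go1.14 landed, the common workaround was to hand the cooperative scheduler a preemption point inside the loop yourself. A minimal sketch of that pattern (the workaround here is my illustration, not part of the original example; runtime.Gosched is the real standard-library call):

	// The same loop as above, but with an explicit yield, giving the
	// pre-Go1.14 cooperative scheduler a chance to switch Goroutines.
	go func() {
		for {
			runtime.Gosched()
		}
	}()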

However, some friends may ask whether preemption creates a new problem: the M that was using P is left out in the cold (M is bound to P), and it cannot continue without a P.

This is fine, because that Goroutine is blocked in a system call anyway, and no new code will be executed on it for the time being.

But what if the code can run again after a long wait (the business logic allows long waits), and the Goroutine returns from the blocking call expecting to continue running, yet has no P?

In this case, the returning Goroutine checks whether its M is still bound to a P:

  • If there is a P, it can adjust its state and continue running.
  • If there is no P, it can grab a P again, then bind it for its own use.

That is, preemption of P is in itself a two-way behavior: you take my P, and I can take someone else's P to keep going.
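To make this two-way behavior concrete, here is a toy model; it is purely my own sketch and not the runtime's code. A buffered channel stands in for the idle-P list, one goroutine plays an M blocked in a "system call", and the main goroutine plays sysmon handing the P off:

package main

import (
	"fmt"
	"time"
)

// Toy model of the two-way P handoff. NOT the runtime's implementation;
// it only illustrates that an M which lost its P while blocked in a
// "system call" must reacquire a P before it can run Go code again.
type P struct{ id int }

func main() {
	// A buffered channel stands in for the idle-P list (two P's in total).
	idleP := make(chan *P, 2)
	idleP <- &P{id: 0}
	idleP <- &P{id: 1}

	// "M1" takes a P and blocks in a long "system call".
	p := <-idleP
	go func() {
		time.Sleep(20 * time.Millisecond) // the blocking "system call"
		// Back from the call: check for an available P, as described above.
		select {
		case p2 := <-idleP:
			fmt.Printf("M1 reacquired P%d and continues running\n", p2.id)
		default:
			fmt.Println("M1 found no P; it must park and wait")
		}
	}()

	// "sysmon" decides the call has taken too long and hands the P off.
	time.Sleep(5 * time.Millisecond)
	idleP <- p
	fmt.Printf("sysmon retook P%d from the blocked M\n", p.id)

	time.Sleep(30 * time.Millisecond)
}

Run it and you will see sysmon take P0 away first, then M1 come back later and pick up a different P from the idle list, which is the two-way exchange described above.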

How to preempt P

Having explained why P is preempted, let's dig a little deeper: how exactly is P preempted?

This brings us to the runtime.retake method mentioned earlier, which handles the following two scenarios:

  • Preempting a P that is blocked in a system call.
  • Preempting a G that has been running for too long.

The analysis for the P-preemption scenario is as follows:

func retake(now int64) uint32 {
	n := 0
	// Lock allp to prevent it from changing
	lock(&allpLock)
	// Main logic: loop over all P's
	for i := 0; i < len(allp); i++ {
		_p_ := allp[i]
		pd := &_p_.sysmontick
		s := _p_.status
		sysretake := false
		if s == _Psyscall {
			// Check whether one sysmon tick period has passed
			t := int64(_p_.syscalltick)
			if !sysretake && int64(pd.syscalltick) != t {
				pd.syscalltick = uint32(t)
				pd.syscallwhen = now
				continue
			}
			...
		}
	}
	unlock(&allpLock)
	return uint32(n)
}

This method first locks allpLock, which does exactly what the name says: it prevents the allp array from changing.

It protects the lock-free reads and size changes of allp, idlepMask, and timerpMask, as well as all writes to allp, so that the operations that follow are not disturbed.

Scenario 1

Once the pre-processing is complete, the main logic walks through allp with the trusty for loop, handling each P one by one:

			t := int64(_p_.syscalltick)
			if !sysretake && int64(pd.syscalltick) != t {
				pd.syscalltick = uint32(t)
				pd.syscallwhen = now
				continue
			}

In the first scenario, syscalltick is checked: if the P has spent more than one sysmon tick period (at least 20us) in the system call, it is preempted from the system call; otherwise it is skipped for this round.
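The "at least 20us" comes from sysmon's adaptive tick: the sysmon thread polls with a delay that starts at 20us and, while nothing needs retaking, doubles up to a cap of 10ms. Here is a runnable toy of that back-off shape; it is my simplification, not the actual runtime source:

package main

import (
	"fmt"
	"time"
)

// Toy reproduction of sysmon's adaptive tick (a simplification, not the
// runtime's code): the delay starts at 20us and doubles up to a 10ms cap
// while there is nothing to retake, so one tick is anywhere in 20us..10ms.
func main() {
	delay := 20 * time.Microsecond
	for cycle := 0; cycle < 12; cycle++ {
		fmt.Printf("cycle %2d: sleep %v, then run a retake pass\n", cycle, delay)
		time.Sleep(delay)
		// Nothing was retaken this cycle, so back off. (The real sysmon
		// only starts doubling after a run of idle cycles.)
		delay *= 2
		if delay > 10*time.Millisecond {
			delay = 10 * time.Millisecond
		}
	}
}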

Scenario 2

If not, the logic continues as follows:

func retake(now int64) uint32 {
	for i := 0; i < len(allp); i++ {
		...
		if s == _Psyscall {
			// Start at this point
			if runqempty(_p_) &&
				atomic.Load(&sched.nmspinning)+atomic.Load(&sched.npidle) > 0 &&
				pd.syscallwhen+10*1000*1000 > now {
				continue
			}
			...
		}
	}
	unlock(&allpLock)
	return uint32(n)
}

The second scenario focuses on this long chain of conditions (restated as a standalone predicate right after this list):

  • runqempty(_p_) == true: checks whether P's run queue is empty, that is, whether it has any other tasks waiting to be executed.
  • atomic.Load(&sched.nmspinning)+atomic.Load(&sched.npidle) > 0: checks whether there is an idle P or an M spinning to steal G.
  • pd.syscallwhen+10*1000*1000 > now: checks whether the system call has lasted for less than 10ms. If all three conditions hold at once, preemption of this P is skipped.
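Restating that chain as a standalone predicate makes the direction of the last check easier to see. This is a hypothetical helper of my own, not a runtime API; note that 10*1000*1000 is simply 10ms in nanoseconds:

package main

import (
	"fmt"
	"time"
)

// shouldSkipRetake is a hypothetical restatement of the condition above:
// a P in a system call keeps its P only while all three conditions hold.
func shouldSkipRetake(runqEmpty bool, nmspinning, npidle int64, syscallwhen, now int64) bool {
	return runqEmpty && // the P has no waiting G's of its own
		nmspinning+npidle > 0 && // someone is already idle or stealing
		syscallwhen+10*1000*1000 > now // under 10ms in the system call
}

func main() {
	now := time.Now().UnixNano()
	ms := int64(time.Millisecond)

	// Entered the system call 2ms ago: skipped, it keeps its P this round.
	fmt.Println(shouldSkipRetake(true, 0, 1, now-2*ms, now)) // true

	// Entered the system call 15ms ago: no longer skipped, P is retaken.
	fmt.Println(shouldSkipRetake(true, 0, 1, now-15*ms, now)) // false
}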

One thing seems odd here: runqempty has already determined that P has no other tasks, which suggests there is nothing left to execute and therefore no need to take P away.

But in reality the runtime still wants to retake P eventually, because a P sitting in a system call can prevent the sysmon thread from going into deep sleep.

Once the above checks are passed, we enter the stage of actually snatching P:

func retake(now int64) uint32 {
	for i := 0; i < len(allp); i++ {
		...
		if s == _Psyscall {
			// Continuing from the first half
			unlock(&allpLock)
			incidlelocked(-1)
			if atomic.Cas(&_p_.status, s, _Pidle) {
				if trace.enabled {
					traceGoSysBlock(_p_)
					traceProcStop(_p_)
				}
				n++
				_p_.syscalltick++
				handoffp(_p_)
			}
			incidlelocked(1)
			lock(&allpLock)
		}
	}
	unlock(&allpLock)
	return uint32(n)
}
  • Unlock related locks: call the unlock method to release allpLock, so that sched.lock can be acquired for the next step.
  • Decrease the idle M count: the number of idle M's must be decremented before the atomic operation (CAS), assuming one M is running; otherwise, when an M exits the system call it would increment nmidle, and a false deadlock could be reported while P is being snatched.
  • Change P's state: call the atomic.Cas method to set the preempted P's status to idle, so that it can be used by another M (see the sketch after this list).
  • Snatch P and hand it to an M: call the handoffp method to take P away from the M that is in the system call (or locked), and let a new M take over this P.
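The CAS in the third step is what makes the snatch safe: sysmon and the M returning from the system call may race for the same P, and whoever flips the status first wins. A minimal sketch with sync/atomic (the status constants mirror the runtime's _Psyscall/_Pidle in spirit but are toy values here):

package main

import (
	"fmt"
	"sync/atomic"
)

// Toy status values standing in for the runtime's P states.
const (
	_Pidle    uint32 = 0
	_Psyscall uint32 = 2
)

func main() {
	var status uint32 = _Psyscall

	// sysmon's attempt succeeds: the P is still in _Psyscall.
	fmt.Println(atomic.CompareAndSwapUint32(&status, _Psyscall, _Pidle)) // true

	// A racing attempt (e.g. the M returning from the system call) fails:
	// the status is no longer _Psyscall, so that side must take another path.
	fmt.Println(atomic.CompareAndSwapUint32(&status, _Psyscall, _Pidle)) // false
}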

Conclusion

So far, the basic flow of P preemption is complete, and we can summarize the following conditions:

  1. A system call times out: if a P stays in a system call for more than one sysmon tick period (at least 20us), it is preempted from the system call.
  2. There is no idle P: all P's are already bound to M's. The P currently stuck in a system call, which the system call itself does not actually need, should be preempted and handed to another M so that other G's can be scheduled.
  3. There are G's waiting in P's local run queue: to make sure they are scheduled in time, while P's own M is too busy with the system call to look after them, another M is found to take over P and keep scheduling the G's.

If you have any questions, feedback and discussion in the comments section are welcome. The best relationship is mutual achievement; your likes are the biggest motivation for fried fish to keep creating. Thank you for your support.

This article is continuously updated. You can search for "Brain into fried fish" on WeChat to read it; reply [000] to get the interview-algorithm answers and materials from first-tier big companies that I prepared. This article is archived on GitHub at github.com/eddycjy/blo…, where a Star and a nudge for more are always welcome.

References

  • NUMA-aware scheduler for Go
  • go-under-the-hood
  • In-depth analysis of Go-preemptive scheduling
  • Go language scheduler source code scenario analysis