2021-03-05: In Go, for I/O-intensive applications (heavy file I/O, disk I/O, or network I/O), will increasing GOMAXPROCS improve performance? Why? Answer 2021-03-05:
This is the kind of question you get asked in an interview. There is no definitive answer yet.
Answer 1: Adjusting this parameter changes the number of Ps, which in turn changes the number of Ms (threads) that can be running Go code at the same time. In other words, you get more threads of execution.

Take network I/O as an example. Network I/O in Go is asynchronous and uses epoll for I/O multiplexing. Each network call is effectively non-blocking: once the data has been handed to a memory buffer, control can pass to another goroutine. So if a single thread can keep up, performance is already fine and adding more Ps brings little improvement. The gain is only noticeable when a single thread cannot keep up with the network I/O (each operation is slow).

Disk I/O is a special case. Disk I/O is not asynchronous, because there is no AIO. A disk I/O call therefore blocks its thread, and sysmon waits until the system call has exceeded a time threshold before taking the P away from the blocked M, which costs time. In this case you can get some performance improvement from more Ps, because more Ms can then work in parallel.

In either case, it is not recommended to set the number of Ps higher than the number of CPUs on the machine. Only the physical CPUs truly execute in parallel; anything layered above them is simulated by scheduling switches.
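One way to check the disk I/O claim on your own machine is to time the same blocking workload under different GOMAXPROCS values. The following is a minimal sketch, not a rigorous benchmark: the helper name `runBlockingWrites`, the 64 KiB file size, and the worker count are assumptions made for illustration, and the timings depend heavily on the disk, the file system, and the Go version.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"sync"
	"time"
)

// runBlockingWrites (hypothetical helper) sets GOMAXPROCS to procs, starts
// `workers` goroutines that each write and fsync a small temporary file,
// and returns the elapsed wall-clock time. The previous GOMAXPROCS value
// is restored before returning.
func runBlockingWrites(procs, workers int) (time.Duration, error) {
	prev := runtime.GOMAXPROCS(procs) // returns the previous setting
	defer runtime.GOMAXPROCS(prev)

	var wg sync.WaitGroup
	errs := make(chan error, workers) // each worker sends at most one error
	start := time.Now()
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			f, err := os.CreateTemp("", "gomaxprocs-demo-*")
			if err != nil {
				errs <- err
				return
			}
			defer os.Remove(f.Name()) // runs after Close (deferred calls are LIFO)
			defer f.Close()
			if _, err := f.Write(make([]byte, 64<<10)); err != nil {
				errs <- err
				return
			}
			// Sync forces a blocking system call on the calling thread.
			if err := f.Sync(); err != nil {
				errs <- err
			}
		}()
	}
	wg.Wait()
	elapsed := time.Since(start)
	close(errs)
	if err := <-errs; err != nil { // nil if no worker reported an error
		return 0, err
	}
	return elapsed, nil
}

func main() {
	n := runtime.NumCPU()
	for _, procs := range []int{1, n, 4 * n} {
		d, err := runBlockingWrites(procs, 64)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("GOMAXPROCS=%d: %v\n", procs, d)
	}
}
```

Run it several times and compare the trend rather than a single number, since file system caching and background load can easily dominate one run.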
Answer 2: GOMAXPROCS defaults to the number of hardware threads on the CPU, which is not appropriate for most I/O-intensive applications. It should be set to at least 5 times the number of hardware threads, up to a maximum of 256.

The reason is that the Go scheduler is sluggish: it will most likely do nothing until an M blocks. Only after a P/M pair has been blocked in a syscall for a fairly long time does the scheduler forcibly take the P and hand it to an idle M.

Note: the sluggishness is in the Go scheduler, not in M. M is an operating system thread and is handled promptly: the operating system deschedules it as soon as it blocks (except in rare cases of spinning). The Go scheduler, however, waits for an interval before acting, in order to reduce the number of times it has to intervene. In other words, if an API called by M blocks the operating system thread, the operating system immediately suspends M until the block clears, but the Go scheduler does not immediately take the P away from M. During that interval the P is wasted. This is why Go programs that are I/O intensive (or syscall heavy) run slowly when GOMAXPROCS, that is, the number of Ps, is too small.

So if GOMAXPROCS is set large, say more than 8 times the number of hardware threads, is there any overhead? There is, but it is far smaller than the cost of the CPU underutilization caused by the Go runtime's sluggishness in letting another M take over the P.
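The rule of thumb above (at least 5 times the number of hardware threads, capped at 256) could be applied at program startup roughly as follows. This is a sketch under stated assumptions: `ioHeavyProcs` is a hypothetical helper, and the factor and limit are simply the numbers quoted in the text, not values recommended by the Go project.

```go
package main

import (
	"fmt"
	"runtime"
)

// ioHeavyProcs (hypothetical helper) returns numCPU*factor, limited to at
// most `limit` and at least 1.
func ioHeavyProcs(numCPU, factor, limit int) int {
	n := numCPU * factor
	if n > limit {
		n = limit
	}
	if n < 1 {
		n = 1
	}
	return n
}

func main() {
	// Factor 5 and limit 256 are taken from the text above (assumptions).
	n := ioHeavyProcs(runtime.NumCPU(), 5, 256)

	// GOMAXPROCS sets the new value and returns the previous one;
	// calling it with 0 only queries the current value.
	prev := runtime.GOMAXPROCS(n)
	fmt.Printf("NumCPU=%d GOMAXPROCS: %d -> %d\n",
		runtime.NumCPU(), prev, runtime.GOMAXPROCS(0))
}
```

The same value can also be supplied without code changes through the GOMAXPROCS environment variable, which makes it easier to compare several settings for one workload before committing to a number.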
[GO] Configure GOMAXPROCS to double the performance
Have you set GOMAXPROCS correctly?
Go coroutine details
Go, IO intensive applications, such as lots of file IO, disk IO, network I… How to solve it?