Some time ago, a colleague reported that a service's container kept restarting because it exceeded its memory limit, and asked us whether there was a memory leak that needed to be found and fixed quickly. Alarmed, we checked the monitoring and alerting system and ran performance analysis right away, but the application metrics were not high at all, and nothing looked like a leak.
So what was the problem? We went into the container and checked the system metrics with top. The results were as follows:
PID VSZ RSS ... COMMAND
67459 2007m 136m ... ./eddycjy-server
Judging from the output, nothing stands out; it is mainly one Go process. A colleague pointed out that VSZ was very high, and that the container memory metric shown on a cloud platform happened to be close to the VSZ value, so they suspected that VSZ was the cause and that the two were correlated.
As the final conclusion will show, that statement is not entirely correct. This article therefore focuses on the VSZ of a Go process and analyzes why it is so “high”. Before the formal analysis, the first section covers some background knowledge; the sections are meant to be read in order.
Basic knowledge
What is VSZ
VSZ is the total amount of virtual memory that the process can use. It includes all memory that the process can access, including memory that was swapped out (Swap), memory allocated but not used, and memory from shared libraries.
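On Linux, the two numbers that top and ps report can also be read from inside the process. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper name statusKB is made up for this example, and on systems without /proc (such as macOS) the program just prints an error.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// statusKB extracts a "<key>: <n> kB" field from text in the
// /proc/<pid>/status format. It is a hypothetical helper for this sketch.
func statusKB(status, key string) (int64, bool) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Expected shape: ["VmSize:", "2055168", "kB"].
		if len(fields) >= 2 && fields[0] == key+":" {
			n, err := strconv.ParseInt(fields[1], 10, 64)
			if err != nil {
				return 0, false
			}
			return n, true
		}
	}
	return 0, false
}

func main() {
	data, err := os.ReadFile("/proc/self/status") // Linux only
	if err != nil {
		fmt.Println("cannot read /proc/self/status:", err)
		return
	}
	vsz, _ := statusKB(string(data), "VmSize") // virtual size (VSZ)
	rss, _ := statusKB(string(data), "VmRSS")  // resident set size (RSS)
	fmt.Printf("VSZ=%d kB RSS=%d kB\n", vsz, rss)
}
```

Running this in a fresh Go process on Linux should show the same pattern as the top output above: a virtual size far larger than the resident size.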
Why virtual memory
We just saw that VSZ is the total virtual memory size of a process, so to understand VSZ we first need to understand “why virtual memory?”.
Essentially, every process in a system shares the CPU and main memory with other processes, and running many processes at once is the norm in modern operating systems. Without virtual memory, if too many processes needed too much memory, physical memory could easily run short and some tasks would be unable to run. Stranger failures are also possible, such as “one process accidentally writes to the memory used by another process”, which corrupts memory. That is why virtual memory is such an important intermediary.
What does virtual memory contain
The virtual memory is divided into kernel virtual memory and process virtual memory. The virtual memory of each process is independent, as shown in the figure above.
One more point: kernel virtual memory contains the kernel’s code and data structures, and some regions of it are mapped to physical pages shared by all processes. So “kernel virtual memory” also includes mappings to physical memory; the two coexist. In practice, every process shares the kernel’s code and global data structures, which are mapped into each process from the same physical pages.
Important capabilities of virtual memory
To manage memory more effectively and with fewer errors, modern systems provide an abstraction of main memory called virtual memory (VM), which is the star of this article. Virtual memory is an interaction between hardware exceptions, hardware address translation, main memory, disk files, and kernel software, and it gives each process a large, consistent, and private address space. Virtual memory provides three important capabilities:
- It uses main memory efficiently by treating it like a cache of address space stored on disk, keeping only active areas in main memory, and passing data back and forth between disk and main memory as needed.
- It simplifies memory management by providing a consistent address space for each process.
- It protects the address space of each process from being corrupted by other processes.
Summary
The discussion above may seem to wander a bit. Put simply, these are the points that matter for this article:
- Virtual memory is where all kinds of memory interactions take place, and it covers more than just “itself”. In this article we focus on VSZ, that is, process virtual memory, which contains your code, data, heap, stack segments, and shared libraries.
- Virtual memory also serves as a memory protection tool: it keeps the address space of each process independent of the others. The VSZ of each process is therefore its own and does not affect other processes.
- With virtual memory, the total memory allocated to processes can exceed the physical memory actually available, which is why the physical memory used by a process is usually much lower than its virtual memory.
Troubleshoot problems
Now that we know the basics, we can start troubleshooting. The first step is to write a test program and see what VSZ looks like for a bare Go program with no business logic.
Test
Application code:
package main

import "github.com/gin-gonic/gin"

func main() {
	r := gin.Default()
	// A single health-check route; there is no business logic.
	r.GET("/ping", func(c *gin.Context) {
		c.JSON(200, gin.H{
			"message": "pong",
		})
	})
	r.Run(":8001")
}
Check the process status:
$ ps aux 67459
USER PID %CPU %MEM VSZ RSS ...
eddycjy 67459 0.0 0.0 4297048 960 ...
VSZ comes out at 4297048K, or roughly 4 GB. At first glance that is alarming: there is no business logic at all, so why is it so high?
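To make the gap between the two columns concrete, here is a small conversion sketch. It assumes that ps reports VSZ and RSS in units of 1024 bytes; kbToGiB is a name invented for this example.

```go
package main

import "fmt"

// kbToGiB converts a size in 1024-byte units (as printed by ps) to GiB.
func kbToGiB(kb uint64) float64 {
	return float64(kb) / (1024 * 1024)
}

func main() {
	// The two values from the ps output above.
	fmt.Printf("VSZ: %.2f GiB\n", kbToGiB(4297048))
	fmt.Printf("RSS: %.4f GiB\n", kbToGiB(960))
}
```

The virtual size is a little over 4 GiB, while the resident size is under 1 MiB, a difference of more than three orders of magnitude.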
Make sure there are no leaks
When the cause is unknown, we can first look at runtime.MemStats and pprof to determine whether the application is leaking. This is a demo application with no business logic, though, so we can be confident that the application code is not directly responsible.
# runtime.MemStats
# Alloc = 1298568
# TotalAlloc = 1298568
# Sys = 71893240
# Lookups = 0
# Mallocs = 10013
# Frees = 834
# HeapAlloc = 1298568
# HeapSys = 66551808
# HeapIdle = 64012288
# HeapInuse = 2539520
# HeapReleased = 64012288
# HeapObjects = 9179
...
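The same fields can be printed from any Go program. The following is a minimal sketch using runtime.ReadMemStats; the snapshot helper is a name chosen for this example, and the numbers will differ from the ones listed above.

```go
package main

import (
	"fmt"
	"runtime"
)

// snapshot returns the runtime's current memory statistics.
func snapshot() runtime.MemStats {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m
}

func main() {
	m := snapshot()
	fmt.Printf("Sys          = %d\n", m.Sys)
	fmt.Printf("HeapAlloc    = %d\n", m.HeapAlloc)
	fmt.Printf("HeapSys      = %d\n", m.HeapSys)
	fmt.Printf("HeapIdle     = %d\n", m.HeapIdle)
	fmt.Printf("HeapInuse    = %d\n", m.HeapInuse)
	fmt.Printf("HeapReleased = %d\n", m.HeapReleased)

	// Idle heap memory that has not been returned to the OS.
	if m.HeapIdle >= m.HeapReleased {
		fmt.Printf("idle but retained = %d\n", m.HeapIdle-m.HeapReleased)
	}
}
```

In the output listed above, HeapIdle and HeapReleased are equal, so nearly all of the idle heap has already been handed back to the OS, and Sys is only about 68 MB. Neither figure comes anywhere near the 4 GB of VSZ.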
Go FAQ
My first reaction was then to check the Go FAQ (I had read it before and remembered it). The relevant question is “Why does my Go process use so much virtual memory?”, and the answer is as follows:
The Go memory allocator reserves a large region of virtual memory as an arena for allocations. This virtual memory is local to the specific Go process; the reservation does not deprive other processes of memory.
To find the amount of actual memory allocated to a Go process, use the Unix top command and consult the RES (Linux) or RSIZE (macOS) columns.
This FAQ entry was added in October 2012 and has not been expanded in all the years since. Searching the issues and forums turned up some closed issues that simply point back to the FAQ. That did not satisfy my curiosity, so I kept digging to see what was really going on.
Viewing a Memory Map
As mentioned earlier, process virtual memory contains your code, data, heap, stack segments, and shared libraries. It is possible that the process set up some memory mappings that reserved a large amount of virtual memory. To check, we ran the following command:
$ vmmap --wide 67459
==== Non-writable regions for process 67459
REGION TYPE START - END [ VSIZE RSDNT DIRTY SWAP] PRT/MAX SHRMOD PURGE REGION DETAIL
__TEXT 00000001065ff000-000000010667b000 [ 496K 492K 0K 0K] r-x/rwx SM=COW /bin/zsh
__LINKEDIT 0000000106687000-0000000106699000 [ 72K 44K 0K 0K] r--/rwx SM=COW /bin/zsh
MALLOC metadata 000000010669b000-000000010669c000 [ 4K 4K 4K 0K] r--/rwx SM=COW DefaultMallocZone_0x10669b000 zone structure
...
__TEXT 00007fff76c31000-00007fff76c5f000 [ 184K 168K 0K 0K] r-x/r-x SM=COW /usr/lib/system/libxpc.dylib
__LINKEDIT 00007fffe7232000-00007ffff32cb000 [192.6M 17.4M 0K 0K] r--/r-- SM=COW dyld shared cache combined __LINKEDIT
...
==== Writable regions for process 67459
REGION TYPE START - END [ VSIZE RSDNT DIRTY SWAP] PRT/MAX SHRMOD PURGE REGION DETAIL
__DATA 000000010667b000-0000000106682000 [ 28K 28K 28K 0K] rw-/rwx SM=COW /bin/zsh
...
__DATA 0000000106716000-000000010671e000 [ 32K 28K 28K 4K] rw-/rwx SM=COW /usr/lib/zsh/5.3/zsh/zle.so
__DATA 000000010671e000-000000010671f000 [ 4K 4K 4K 0K] rw-/rwx SM=COW /usr/lib/zsh/5.3/zsh/zle.so
__DATA 0000000106745000-0000000106747000 [ 8K 8K 8K 0K] rw-/rwx SM=COW /usr/lib/zsh/5.3/zsh/complete.so
__DATA 000000010675a000-000000010675b000 [ 4K 4K 4K 0K] rw-
...
Here we used the macOS vmmap command to inspect the memory map of the process. The output shows that the associated shared libraries and binaries do not take up much space, so they are not the root cause of the high VSZ. However, we also do not see any obvious large memory reservation, which leaves the question open.
Note: For Linux, you can run cat /proc/pid/maps or cat /proc/pid/smaps to view maps.
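For the Linux route, a rough way to see where the virtual size goes is to total the mappings by permission. The sketch below assumes the usual /proc/<pid>/maps line layout ("start-end perms offset dev inode path"); sizeByPerm is a hypothetical helper, and on systems without /proc the program just reports an error.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// sizeByPerm sums mapping sizes in bytes per permission string
// (for example "rw-p" or "---p"), given text in /proc/<pid>/maps format.
func sizeByPerm(maps string) map[string]uint64 {
	totals := make(map[string]uint64)
	sc := bufio.NewScanner(strings.NewReader(maps))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		bounds := strings.SplitN(fields[0], "-", 2)
		if len(bounds) != 2 {
			continue
		}
		start, err1 := strconv.ParseUint(bounds[0], 16, 64)
		end, err2 := strconv.ParseUint(bounds[1], 16, 64)
		if err1 != nil || err2 != nil || end < start {
			continue // skip lines that do not look like a mapping
		}
		totals[fields[1]] += end - start
	}
	return totals
}

func main() {
	data, err := os.ReadFile("/proc/self/maps") // Linux only
	if err != nil {
		fmt.Println("cannot read /proc/self/maps:", err)
		return
	}
	for perm, size := range sizeByPerm(string(data)) {
		fmt.Printf("%s %8d KiB\n", perm, size/1024)
	}
}
```

Mappings with the permission string ---p are reserved address space with no access rights, which is the kind of region this investigation is looking for.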
Viewing system calls
Since the memory map shows no obvious memory reservation, let’s look at the system calls the process makes and see whether any of them operate on memory:
$ sudo dtruss -a ./awesomeProject
...
4374/0x206a2: 15620 6 3 mprotect(0x1BC4000, 0x1000, 0x0) = 0 0
...
4374/0x206a2: 15781 9 4 sysctl([CTL_HW, 3, 0, 0, 0, 0] (2), 0x7FFEEFBFFA64, 0x7FFEEFBFFA68, 0x0, 0x0) = 0 0
4374/0x206a2: 15783 3 1 sysctl([CTL_HW, 7, 0, 0, 0, 0] (2), 0x7FFEEFBFFA64, 0x7FFEEFBFFA68, 0x0, 0x0) = 0 0
4374/0x206a2: 15899 7 2 mmap(0x0, 0x40000, 0x3, 0x1002, 0xFFFFFFFFFFFFFFFF, 0x0) = 0x4000000 0
4374/0x206a2: 15930 3 1 mmap(0xC000000000, 0x4000000, 0x0, 0x1002, 0xFFFFFFFFFFFFFFFF, 0x0) = 0xC000000000 0
4374/0x206a2: 15934 4 2 mmap(0xC000000000, 0x4000000, 0x3, 0x1012, 0xFFFFFFFFFFFFFFFF, 0x0) = 0xC000000000 0
4374/0x206a2: 15936 2 0 mmap(0x0, 0x2000000, 0x3, 0x1002, 0xFFFFFFFFFFFFFFFF, 0x0) = 0x59B7000 0
4374/0x206a2: 15942 2 0 mmap(0x0, 0x210800, 0x3, 0x1002, 0xFFFFFFFFFFFFFFFF, 0x0) = 0x4040000 0
4374/0x206a2: 15947 2 0 mmap(0x0, 0x10000, 0x3, 0x1002, 0xFFFFFFFFFFFFFFFF, 0x0) = 0x1BD0000 0
4374/0x206a2: 15993 3 0 madvise(0xC000000000, 0x2000, 0x8) = 0 0
4374/0x206a2: 16004 2 0 mmap(0x0, 0x10000, 0x3, 0x1002, 0xFFFFFFFFFFFFFFFF, 0x0) = 0x1BE0000 0
...
Here we used the macOS dtruss command to trace all the system calls made while the program runs, and found the following calls related to memory management:
- mmap: creates a new virtual memory area. Note that calling mmap only allocates a range of virtual memory; it does not allocate or map actual physical memory. Physical memory is allocated only when the area is first accessed. If the space is never actually used, physical memory does not grow.
- madvise: gives the kernel advice about how memory will be used, such as MADV_NORMAL, MADV_RANDOM, MADV_SEQUENTIAL, MADV_WILLNEED, MADV_DONTNEED, and so on.
- mprotect: sets the protection of a memory area, such as PROT_NONE, PROT_READ, PROT_WRITE, PROT_EXEC, PROT_SEM, PROT_SAO, PROT_GROWSUP, PROT_GROWSDOWN, and so on.
- sysctl: reads or modifies kernel parameters while the kernel is running.
The suspicious one is mmap, which appears more than 10 times in the final dtruss statistics. It is reasonable to believe that the Go runtime requests a large amount of virtual memory during startup, so next we look further down to find out at exactly which stage those requests are made.
Note: On Linux, use the strace command.
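The raw hexadecimal arguments in the trace are easier to read once decoded. The sketch below hardcodes the protection and flag values used on macOS (an assumption stated in the code; the values differ on other systems) and decodes the two mmap calls made on 0xC000000000. The function names are invented for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// Constant values as defined on macOS (darwin), hardcoded so that the
// sketch runs anywhere. Other operating systems use different flag values.
const (
	protRead  = 0x1
	protWrite = 0x2
	protExec  = 0x4

	mapPrivate = 0x0002
	mapFixed   = 0x0010
	mapAnon    = 0x1000
)

// protString names the protection bits of an mmap or mprotect call.
func protString(prot int) string {
	if prot == 0 {
		return "PROT_NONE"
	}
	var parts []string
	if prot&protRead != 0 {
		parts = append(parts, "PROT_READ")
	}
	if prot&protWrite != 0 {
		parts = append(parts, "PROT_WRITE")
	}
	if prot&protExec != 0 {
		parts = append(parts, "PROT_EXEC")
	}
	return strings.Join(parts, "|")
}

// flagString names the mmap flags this article cares about and
// ignores any other bits.
func flagString(flags int) string {
	var parts []string
	if flags&mapAnon != 0 {
		parts = append(parts, "MAP_ANON")
	}
	if flags&mapFixed != 0 {
		parts = append(parts, "MAP_FIXED")
	}
	if flags&mapPrivate != 0 {
		parts = append(parts, "MAP_PRIVATE")
	}
	return strings.Join(parts, "|")
}

func main() {
	// The two traced calls on 0xC000000000, each 0x4000000 bytes long.
	fmt.Println(protString(0x0), flagString(0x1002), 0x4000000>>20, "MiB")
	fmt.Println(protString(0x3), flagString(0x1012), 0x4000000>>20, "MiB")
}
```

Read this way, the first call maps 64 MiB with no access rights, and the second maps the same range again at a fixed address as readable and writable.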
Look at the Go Runtime
Startup process
From the analysis above, we know that VSZ is already high when a Go program starts, that shared libraries are not the cause, and that the program does call mmap and related system calls at startup. We can therefore strongly suspect that Go reserves this address space during initialization. The first step is to walk through the Go startup process and see where the reservation happens. The startup process is as follows:
graph TD
A(rt0_darwin_amd64.s:8<br/>_rt0_amd64_darwin) -->|JMP| B(asm_amd64.s:15<br/>_rt0_amd64)
B --> |JMP|C(asm_amd64.s:87<br/>runtime-rt0_go)
C --> D(runtime1.go:60<br/>runtime-args)
D --> E(os_darwin.go:50<br/>runtime-osinit)
E --> F(proc.go:472<br/>runtime-schedinit)
F --> G(proc.go:3236<br/>runtime-newproc)
G --> H(proc.go:1170<br/>runtime-mstart)
H --> I(run runtime-main on newly created p and m)
- runtime-osinit: obtains the number of CPU cores.
- runtime-schedinit: initializes the environment the program runs in (including stacks, the memory allocator, garbage collection, P, etc.).
- runtime-newproc: creates a new G and binds it to runtime.main.
- runtime-mstart: starts thread M.
Note: this flow comes from “Go program startup process” by @Cao Da and “How Go programs run” by @Quan Cheng; both are recommended reading.
Initialize the operating environment
Clearly, the function to look at is schedinit in runtime:
func schedinit() {
...
stackinit()
mallocinit()
mcommoninit(_g_.m)
cpuinit() // must run before alginit
alginit() // maps must not be used before this call
modulesinit() // provides activeModules
typelinksinit() // uses maps, activeModules
itabsinit() // uses activeModules
msigsave(_g_.m)
initSigmask = _g_.m.sigmask
goargs()
goenvs()
parsedebugvars()
gcinit()
...
}
Judging by its name, mallocinit is clearly the function that initializes the memory allocator, so let’s keep going.
Initialize the memory allocator
mallocinit
During startup, mallocinit is responsible for initializing the Go program’s memory allocator. The part that matters today is how it sets up the virtual memory address ranges:
func mallocinit() {
	...
	if sys.PtrSize == 8 {
		for i := 0x7f; i >= 0; i-- {
			var p uintptr
			switch {
			case GOARCH == "arm64" && GOOS == "darwin":
				p = uintptr(i)<<40 | uintptrMask&(0x0013<<28)
			case GOARCH == "arm64":
				p = uintptr(i)<<40 | uintptrMask&(0x0040<<32)
			case GOOS == "aix":
				if i == 0 {
					continue
				}
				p = uintptr(i)<<40 | uintptrMask&(0xa0<<52)
			case raceenabled:
				...
			default:
				p = uintptr(i)<<40 | uintptrMask&(0x00c0<<32)
			}
			hint := (*arenaHint)(mheap_.arenaHintAlloc.alloc())
			hint.addr = p
			hint.next, mheap_.arenaHints = mheap_.arenaHints, hint
		}
	} else {
		...
	}
}
- Determine whether the current system is 64-bit or 32-bit.
- On 64-bit systems, the address hints in the default case run from 0x7fc000000000 down to 0x00c000000000, since the loop ends at i = 0.
- Check the current GOARCH and GOOS and whether race detection is enabled, and compute a different start address for each case. Here p is the start address of the contiguous address range to request.
- Save the arena information just calculated into an arenaHint.
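The address arithmetic in the default case can be checked with a small standalone sketch. It is not runtime code: the hint type and function names are made up for illustration, uint64 is used instead of uintptr so it runs on any platform, and the uintptrMask term is left out on the assumption that it is all ones on 64-bit targets.

```go
package main

import "fmt"

// hint mirrors the shape of the runtime's arenaHint, for illustration only.
type hint struct {
	addr uint64
	next *hint
}

// defaultHintAddr reproduces the default-case formula from mallocinit.
func defaultHintAddr(i int) uint64 {
	return uint64(i)<<40 | 0x00c0<<32
}

// buildHints prepends one hint per loop iteration, as mallocinit does,
// so the hint computed last (i = 0) ends up at the head of the list.
func buildHints() *hint {
	var head *hint
	for i := 0x7f; i >= 0; i-- {
		h := &hint{addr: defaultHintAddr(i)}
		h.next, head = head, h
	}
	return head
}

func main() {
	head := buildHints()
	count := 0
	var last *hint
	for h := head; h != nil; h = h.next {
		count++
		last = h
	}
	fmt.Printf("hints: %d\n", count)
	fmt.Printf("first: %#x\n", head.addr)
	fmt.Printf("last:  %#x\n", last.addr)
}
```

The head of the list is 0xc000000000, which is the address passed to mmap in the dtruss trace earlier.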
Virtual memory addressing ranges differ between 32-bit and 64-bit systems, so the two cases must be handled separately; otherwise mapping at high virtual addresses would run into problems. The reserved space is described by the arenaHint structure, a node in the arenaHints linked list, defined as follows:
type arenaHint struct {
addr uintptr
down bool
next *arenaHint
}
- addr: the start address of the arena.
- down: whether the arena grows downward from addr (when it is set, the allocation code subtracts the requested size from addr).
- next: pointer to the next arenaHint.
The Go runtime divides the virtual memory it requests into three chunks:
- spans: records the mapping from arena page numbers to mspan.
- bitmap: tracks how the arena is used; functionally, it marks which addresses in the arena already hold objects.
- arena: the Go heap area managed by mheap, with a MaxMem of 512GB-1. Functionally, Go requests a contiguous range of virtual address space for the arena during initialization and keeps it in reserve. When heap space is actually needed, parts of it are turned into physical memory.
To follow the rest of this article, you need to understand the role the arena plays in Go memory management.
mmap
We now know what mallocinit does, but you may be wondering whether the mmap system calls we saw earlier are related to it. Let’s look at the lower-level code:
func sysAlloc(n uintptr, sysStat *uint64) unsafe.Pointer {
p, err := mmap(nil, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_PRIVATE, -1, 0)
...
mSysStatInc(sysStat, n)
return p
}
func sysReserve(v unsafe.Pointer, n uintptr) unsafe.Pointer {
p, err := mmap(v, n, _PROT_NONE, _MAP_ANON|_MAP_PRIVATE, -1, 0)
...
}
func sysMap(v unsafe.Pointer, n uintptr, sysStat *uint64) {
...
munmap(v, n)
p, err := mmap(v, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_FIXED|_MAP_PRIVATE, -1, 0)
...
}
The Go runtime has a number of functions for system-level memory calls. The main ones covered in this article are:
- sysAlloc: requests zeroed memory from the OS, with the call parameters _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_PRIVATE. The result must be memory aligned.
- sysReserve: reserves address space from the OS without allocating physical memory, with the call parameters _PROT_NONE, _MAP_ANON|_MAP_PRIVATE. The result must be memory aligned.
- sysMap: tells the OS that we want to use the reserved address space, with the call parameters _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_FIXED|_MAP_PRIVATE.
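The reserve-then-use pattern can be tried from ordinary Go code on a Unix-like system. The following is a sketch of the idea rather than what the runtime does: the runtime's sysMap re-maps the range with MAP_FIXED, whereas this example swaps in syscall.Mprotect because syscall.Mmap does not accept a target address. The names reserve, commit, and release are invented here.

```go
package main

import (
	"fmt"
	"syscall"
)

// reserve maps n bytes of anonymous memory with no access rights.
// This claims address space (it counts toward VSZ) without backing
// it with physical pages.
func reserve(n int) ([]byte, error) {
	return syscall.Mmap(-1, 0, n, syscall.PROT_NONE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
}

// commit makes a previously reserved range readable and writable.
// Assumption: mprotect stands in for the runtime's MAP_FIXED re-mapping.
func commit(region []byte) error {
	return syscall.Mprotect(region, syscall.PROT_READ|syscall.PROT_WRITE)
}

// release unmaps a region returned by reserve.
func release(region []byte) error {
	return syscall.Munmap(region)
}

func main() {
	const size = 64 << 20 // 64 MiB, the size of the 0x4000000 mapping in the trace

	region, err := reserve(size)
	if err != nil {
		fmt.Println("reserve failed:", err)
		return
	}
	defer release(region)

	page := syscall.Getpagesize()
	if err := commit(region[:page]); err != nil {
		fmt.Println("commit failed:", err)
		return
	}
	region[0] = 1 // first touch: only now does the kernel back this page

	fmt.Printf("reserved %d MiB, committed %d bytes\n", size>>20, page)
}
```

Checking the process with ps or top before and after the reserve call should show VSZ growing by the reserved size while RSS barely moves.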
That all seems reasonable, but it is still not obvious where mallocinit reaches mmap during initialization. Look at this part again:
for i := 0x7f; i >= 0; i-- {
...
hint := (*arenaHint)(mheap_.arenaHintAlloc.alloc())
hint.addr = p
hint.next, mheap_.arenaHints = mheap_.arenaHints, hint
}
In fact, the hints set up with mheap_.arenaHintAlloc.alloc() are walked by the sysAlloc method on mheap. That method is tied to mmap, and it is not the same as the regular sysAlloc:
var mheap_ mheap
...
func (h *mheap) sysAlloc(n uintptr) (v unsafe.Pointer, size uintptr) {
	...
	for h.arenaHints != nil {
		hint := h.arenaHints
		p := hint.addr
		if hint.down {
			p -= n
		}
		if p+n < p {
			v = nil
		} else if arenaIndex(p+n-1) >= 1<<arenaBits {
			v = nil
		} else {
			v = sysReserve(unsafe.Pointer(p), n)
		}
		...
	}
	...
}
You may be surprised to find that mheap.sysAlloc actually calls sysReserve, which is exactly the function that reserves address space from the OS.
Summary
In this section we first wrote a test program and then tracked down the suspect step by step, following a less conventional line of investigation. The overall process was:
- Use top or ps to view the process status and analyze the basic metrics.
- Use pprof or runtime.MemStats to check the application’s running state and analyze whether there is a leak or high usage at the application level.
- Use the vmmap command to view the memory map of the process and check whether some region of its virtual address space, such as shared libraries, is unusually large.
- Use the dtruss command to view the program’s system calls and look for unusual behavior. In our analysis mmap was called frequently, which gave good reason to suspect that Go reserves a lot of memory at startup.
- Read the Go runtime source for further analysis. In front of the source code there are no secrets, and no need to guess.
In conclusion, the high VSZ has little to do with shared libraries and the like. It comes directly from the Go runtime, namely the runtime heap (malloc) shown in the earlier figure: during initialization, the mallocinit step of the memory allocator reserves a certain amount of virtual address space.
What determines how much virtual address space is reserved is a more philosophical question. Judging from the source, the main factors are:
- The OS and architecture (GOOS/GOARCH) and the word size (32/64 bits).
- Memory alignment: the computed size has to be aligned before the space is reserved.
Conclusion
Step by step, we have explained where Go reserves so much virtual address space, what factors affect it, and which functions it calls to do so. You may still be worried about whether a high process virtual memory (VSZ) is a problem. My analysis is as follows:
- VSZ does not mean you are actually using that much physical memory, so there is nothing to worry about.
- VSZ does not put pressure on the GC. The GC manages the physical memory the process actually uses, and VSZ costs very little until it is actually used.
- Most of VSZ is memory mapped without access rights, meaning it cannot be read, written, or executed until the runtime maps it for use.
Thinking
It is a relief to get this far: a high VSZ for a Go process poses no substantial problem for us. But on second thought, why does Go request so much virtual memory in the first place?
My overall thinking is as follows:
- Go is designed so that the whole address space for the arena and bitmap is reserved in advance for later use.
- As the Go runtime and the application run, they will certainly start to request and use memory for real. At that point the memory allocator can turn address space already reserved for the arena and bitmap directly into usable physical memory, which can greatly improve performance.
My official account
I share posts on the Go language, microservice architecture, and unusual system designs. You are welcome to follow my official account and get in touch.
The best relationships are mutually rewarding. Your likes are the biggest motivation for Fried Fish to keep writing. Thank you for your support.