Some time ago, a colleague reported that a service’s container kept restarting because it exceeded its memory limit, and asked us whether there was a memory leak that should be found and fixed quickly. Alarmed, we checked the monitoring and alerting system and ran a performance analysis, but the application metrics were not high at all, and nothing looked like a leak.

So what was the problem? We went into the container and checked the system metrics with top. The result was as follows:

PID       VSZ    RSS   ... COMMAND
67459     2007m  136m  ... ./eddycjy-server

Judging from the result, nothing is expensive; it is essentially just one Go process. A colleague pointed out that VSZ is very high, and that the container memory metric reported by a certain cloud platform happened to be close to the VSZ value, so he suspected that VSZ was the cause and that the two were correlated.

As the final conclusion shows, that statement is not entirely correct. This article therefore analyzes the VSZ of a Go process to see why it is so “high”. The first section covers background knowledge needed for the analysis, so the article is best read in order.

Basic knowledge

What is VSZ

VSZ (virtual memory size) is the total amount of virtual memory that a process can access. It includes memory that has been swapped out (swap), memory that has been allocated but not used, and memory from shared libraries.
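To make the VSZ/RSS distinction concrete, here is a minimal sketch that reads both values for the current process. It assumes a Linux system, where /proc/<pid>/status exposes them as the VmSize and VmRSS fields; the helper name parseStatusKB is made up for this illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseStatusKB extracts a "<key>: <n> kB" field (for example VmSize or
// VmRSS) from the text of /proc/<pid>/status and returns n in kB.
// It is a hypothetical helper written for this article.
func parseStatusKB(status, key string) (uint64, bool) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// A matching line looks like: "VmSize:   2055168 kB".
		if len(fields) >= 2 && fields[0] == key+":" {
			n, err := strconv.ParseUint(fields[1], 10, 64)
			return n, err == nil
		}
	}
	return 0, false
}

func main() {
	data, err := os.ReadFile("/proc/self/status") // Linux only
	if err != nil {
		fmt.Println("no /proc on this system:", err)
		return
	}
	vsz, _ := parseStatusKB(string(data), "VmSize")
	rss, _ := parseStatusKB(string(data), "VmRSS")
	fmt.Printf("VSZ = %d kB, RSS = %d kB\n", vsz, rss)
}
```

On a Go process the first number is normally far larger than the second, which is the same gap that top showed above.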

Why virtual memory

We have just seen that VSZ is the total virtual memory size of a process, so to understand VSZ we first need to answer the question “why virtual memory?”

Every process in a system shares the CPU and main memory with other processes, and modern operating systems routinely run many processes at once. Without virtual memory, if too many processes need too much memory, physical memory is likely to run short and some tasks will be unable to run. Stranger things can happen too, such as one process accidentally writing to memory used by another process and corrupting it. Virtual memory is therefore a very important intermediary.

What does virtual memory contain

The virtual memory is divided into kernel virtual memory and process virtual memory. The virtual memory of each process is independent, as shown in the figure above.

In addition, kernel virtual memory contains the kernel’s code and data structures, and some regions of it are mapped to physical pages shared by all processes. So “kernel virtual memory” actually includes mappings to physical memory as well. In practice, every process shares the kernel’s code and global data structures, which are therefore mapped to physical pages shared by all processes.

Important capabilities of virtual memory

To manage memory more effectively and with fewer errors, modern systems provide an abstraction of main memory called virtual memory (VM), the star of today’s article. Virtual memory is an interaction of hardware exceptions, hardware address translation, main memory, disk files, and kernel software that provides each process with a large, uniform, and private address space. It offers three important capabilities:

  1. It uses main memory efficiently by treating it like a cache of address space stored on disk, keeping only active areas in main memory, and passing data back and forth between disk and main memory as needed.
  2. It simplifies memory management by providing a consistent address space for each process.
  3. It protects the address space of each process from being corrupted by other processes.

Summary

The discussion above may have wandered a bit. Put simply, these are the points that matter for this article:

  • Virtual memory is where all kinds of memory interactions take place, and it covers more than just the process itself. In this article we focus on VSZ, that is, process virtual memory, which contains your code, data, heap, stack segments, and shared libraries.
  • As a memory protection tool, virtual memory keeps the address space of each process independent of the others. Each process therefore has its own VSZ, and they do not affect one another.
  • With virtual memory, the total memory allocated to processes can exceed the physical memory actually available, which is why the physical memory used by a process is usually much lower than its virtual memory.

Troubleshooting

Now that we know the basics, we can start troubleshooting. The first step is to write a test program and see what VSZ looks like for a bare Go program without any business logic.

Test

Application code:

package main

import "github.com/gin-gonic/gin"

func main() {
	r := gin.Default()
	r.GET("/ping", func(c *gin.Context) {
		c.JSON(200, gin.H{
			"message": "pong",
		})
	})
	r.Run(":8001") // listen on port 8001
}

Check the process status:

$ ps aux 67459
USER      PID    %CPU %MEM  VSZ      RSS  ...
eddycjy   67459  0.0  0.0   4297048  960  ...

So VSZ is 4297048K, roughly 4 GB. At first glance that is rather alarming: there is no business logic at all, so why is it so high?

Confirm there is no leak

When the cause is unknown, we can first look at runtime.MemStats and pprof to determine whether the application is leaking. This is a demo application without business logic, though, so we can be sure the application code is not directly responsible.
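For reference, the unit conversion behind that estimate can be sketched as follows. It assumes that ps reports VSZ and RSS in units of 1024 bytes; kibToGiB is a name invented for this example.

```go
package main

import "fmt"

// kibToGiB converts a size in KiB (the unit assumed for the ps VSZ and RSS
// columns) to GiB.
func kibToGiB(kib uint64) float64 {
	return float64(kib) / (1024 * 1024)
}

func main() {
	// Values taken from the ps output above.
	fmt.Printf("VSZ: %.2f GiB\n", kibToGiB(4297048))
	fmt.Printf("RSS: %.2f MiB\n", float64(960)/1024)
}
```

The resident set is under 1 MiB while the virtual size is a little over 4 GiB, which already hints that most of the address space is not backed by physical memory.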

# runtime.MemStats
# Alloc = 1298568
# TotalAlloc = 1298568
# Sys = 71893240
# Lookups = 0
# Mallocs = 10013
# Frees = 834
# HeapAlloc = 1298568
# HeapSys = 66551808
# HeapIdle = 64012288
# HeapInuse = 2539520
# HeapReleased = 64012288
# HeapObjects = 9179
...
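If you want to collect the same numbers from inside your own program, a minimal sketch using the standard runtime.ReadMemStats call looks like this (the snapshot helper is just a convenience wrapper for this example):

```go
package main

import (
	"fmt"
	"runtime"
)

// snapshot returns the current runtime memory statistics.
func snapshot() runtime.MemStats {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m
}

func main() {
	m := snapshot()
	fmt.Printf("Sys          = %d\n", m.Sys)          // bytes obtained from the OS
	fmt.Printf("HeapSys      = %d\n", m.HeapSys)      // heap address space obtained from the OS
	fmt.Printf("HeapInuse    = %d\n", m.HeapInuse)    // bytes in in-use spans
	fmt.Printf("HeapIdle     = %d\n", m.HeapIdle)     // bytes in idle spans
	fmt.Printf("HeapReleased = %d\n", m.HeapReleased) // bytes returned to the OS
	fmt.Printf("HeapAlloc    = %d\n", m.HeapAlloc)    // bytes of live heap objects
}
```

In the dump above, HeapReleased equals HeapIdle (64012288), so the idle part of the heap is not holding physical memory, and HeapInuse is only about 2.5 MB. None of these values comes anywhere near the 4 GB VSZ.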

Go FAQ

My first reaction was to check the Go FAQ (I remembered having seen this there). The question is “Why does my Go process use so much virtual memory?”, and the answer is as follows:

The Go memory allocator reserves a large region of virtual memory as an arena for allocations. This virtual memory is local to the specific Go process; the reservation does not deprive other processes of memory.

To find the amount of actual memory allocated to a Go process, use the Unix top command and consult the RES (Linux) or RSIZE (macOS) columns.

This FAQ entry was submitted in October 2012 and has not been expanded in all the years since. Searching the issues and forums, I found that some closed issues simply point back to the FAQ. That clearly could not satisfy my curiosity, so I kept digging to see what is really going on.

Viewing a Memory Map

As mentioned with the preceding figure, process virtual memory contains your code, data, heap, stack segments, and shared libraries. Perhaps the process performed some memory mapping that reserved a large amount of memory. To check, we ran the following command:

$ vmmap --wide 67459
...
==== Non-writable regions for process 67459
REGION TYPE                      START - END             [ VSIZE  RSDNT  DIRTY   SWAP] PRT/MAX SHRMOD PURGE    REGION DETAIL
__TEXT                 00000001065ff000-000000010667b000 [  496K   492K     0K     0K] r-x/rwx SM=COW          /bin/zsh
__LINKEDIT             0000000106687000-0000000106699000 [   72K    44K     0K     0K] r--/rwx SM=COW          /bin/zsh
MALLOC metadata        000000010669b000-000000010669c000 [    4K     4K     4K     0K] r--/rwx SM=COW          DefaultMallocZone_0x10669b000 zone structure
...
__TEXT                 00007fff76c31000-00007fff76c5f000 [  184K   168K     0K     0K] r-x/r-x SM=COW          /usr/lib/system/libxpc.dylib
__LINKEDIT             00007fffe7232000-00007ffff32cb000 [192.6M  17.4M     0K     0K] r--/r-- SM=COW          dyld shared cache combined __LINKEDIT
...        

==== Writable regions for process 67459
REGION TYPE                      START - END             [ VSIZE  RSDNT  DIRTY   SWAP] PRT/MAX SHRMOD PURGE    REGION DETAIL
__DATA                 000000010667b000-0000000106682000 [   28K    28K    28K     0K] rw-/rwx SM=COW          /bin/zsh
...   
__DATA                 0000000106716000-000000010671e000 [   32K    28K    28K     4K] rw-/rwx SM=COW          /usr/lib/zsh/5.3/zsh/zle.so
__DATA                 000000010671e000-000000010671f000 [    4K     4K     4K     0K] rw-/rwx SM=COW          /usr/lib/zsh/5.3/zsh/zle.so
__DATA                 0000000106745000-0000000106747000 [    8K     8K     8K     0K] rw-/rwx SM=COW          /usr/lib/zsh/5.3/zsh/complete.so
__DATA                 000000010675a000-000000010675b000 [    4K     4K     4K     0K] rw-
...

This section uses the macOS vmmap command to inspect the memory mappings of the process. The output shows that the associated shared libraries do not take up much space, so shared libraries and binaries are not the root cause of the high VSZ. However, we do not see any large memory reservation either, which is puzzling.

Note: On Linux, you can run cat /proc/<pid>/maps or cat /proc/<pid>/smaps to view the memory mappings.
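On Linux, the maps file can also be summarized programmatically. The sketch below assumes the usual /proc/<pid>/maps line layout (an address range such as 00400000-00452000 followed by a permission field such as r-xp); mappedBytes is a hypothetical helper written for this article, not part of any library.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// mappedBytes sums the size of every region listed in /proc/<pid>/maps text
// and, separately, the size of regions without any access permission
// (permission field starting with "---").
func mappedBytes(maps string) (total, noAccess uint64) {
	sc := bufio.NewScanner(strings.NewReader(maps))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		lo, hi, ok := strings.Cut(fields[0], "-")
		if !ok {
			continue
		}
		start, err1 := strconv.ParseUint(lo, 16, 64)
		end, err2 := strconv.ParseUint(hi, 16, 64)
		if err1 != nil || err2 != nil || end < start {
			continue
		}
		size := end - start
		total += size
		if strings.HasPrefix(fields[1], "---") {
			noAccess += size
		}
	}
	return total, noAccess
}

func main() {
	data, err := os.ReadFile("/proc/self/maps") // Linux only
	if err != nil {
		fmt.Println("no /proc on this system:", err)
		return
	}
	total, noAccess := mappedBytes(string(data))
	fmt.Printf("mapped: %d KiB, of which inaccessible: %d KiB\n", total/1024, noAccess/1024)
}
```

Comparing the two totals gives a rough idea of how much of the mapped address space is merely reserved rather than usable.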

Viewing system calls

Since the memory map shows no obvious memory reservation, let’s look at the system calls the process makes to see whether it performs any memory operations:

$ sudo dtruss -a ./awesomeProject
...
 4374/0x206a2:     15620       6      3 mprotect(0x1BC4000, 0x1000, 0x0)		 = 0 0
...
 4374/0x206a2:     15781       9      4 sysctl([CTL_HW, 3, 0, 0, 0, 0] (2), 0x7FFEEFBFFA64, 0x7FFEEFBFFA68, 0x0, 0x0)		 = 0 0
 4374/0x206a2:     15783       3      1 sysctl([CTL_HW, 7, 0, 0, 0, 0] (2), 0x7FFEEFBFFA64, 0x7FFEEFBFFA68, 0x0, 0x0)		 = 0 0
 4374/0x206a2:     15899       7      2 mmap(0x0, 0x40000, 0x3, 0x1002, 0xFFFFFFFFFFFFFFFF, 0x0)		 = 0x4000000 0
 4374/0x206a2:     15930       3      1 mmap(0xC000000000, 0x4000000, 0x0, 0x1002, 0xFFFFFFFFFFFFFFFF, 0x0)		 = 0xC000000000 0
 4374/0x206a2:     15934       4      2 mmap(0xC000000000, 0x4000000, 0x3, 0x1012, 0xFFFFFFFFFFFFFFFF, 0x0)		 = 0xC000000000 0
 4374/0x206a2:     15936       2      0 mmap(0x0, 0x2000000, 0x3, 0x1002, 0xFFFFFFFFFFFFFFFF, 0x0)		 = 0x59B7000 0
 4374/0x206a2:     15942       2      0 mmap(0x0, 0x210800, 0x3, 0x1002, 0xFFFFFFFFFFFFFFFF, 0x0)		 = 0x4040000 0
 4374/0x206a2:     15947       2      0 mmap(0x0, 0x10000, 0x3, 0x1002, 0xFFFFFFFFFFFFFFFF, 0x0)		 = 0x1BD0000 0
 4374/0x206a2:     15993       3      0 madvise(0xC000000000, 0x2000, 0x8)		 = 0 0
 4374/0x206a2:     16004       2      0 mmap(0x0, 0x10000, 0x3, 0x1002, 0xFFFFFFFFFFFFFFFF, 0x0)		 = 0x1BE0000 0
...

Here we use the macOS dtruss command to trace all the system calls made while the program runs, and find the following calls related to memory management:

  • mmap: creates a new virtual memory area. Note that calling mmap only allocates a range of virtual memory; it does not allocate or map physical memory. Physical memory is allocated only when the area is first accessed, so as long as the range is not actually used, physical memory usage does not grow.
  • madvise: gives the kernel advice about how memory will be used, such as MADV_NORMAL, MADV_RANDOM, MADV_SEQUENTIAL, MADV_WILLNEED, MADV_DONTNEED, and so on.
  • mprotect: sets the protection of a memory area, such as PROT_NONE, PROT_READ, PROT_WRITE, PROT_EXEC, PROT_SEM, PROT_SAO, PROT_GROWSUP, PROT_GROWSDOWN, and so on.
  • sysctl: reads or modifies kernel parameters while the kernel is running.
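The numeric prot and flags arguments in the mmap lines of the trace can be decoded by hand. The sketch below does this with the constant values from the macOS <sys/mman.h> header, copied into local constants so that the program runs anywhere; the decode helpers are invented for this example and only cover the bits that appear in the trace.

```go
package main

import (
	"fmt"
	"strings"
)

// Values as defined on macOS; they differ on other systems (for example,
// MAP_ANON is 0x20 on Linux x86-64).
const (
	protRead  = 0x1
	protWrite = 0x2

	mapPrivate = 0x0002
	mapFixed   = 0x0010
	mapAnon    = 0x1000
)

// decodeProt turns the third mmap argument into symbolic names.
func decodeProt(prot int) string {
	if prot == 0 {
		return "PROT_NONE"
	}
	var names []string
	if prot&protRead != 0 {
		names = append(names, "PROT_READ")
	}
	if prot&protWrite != 0 {
		names = append(names, "PROT_WRITE")
	}
	return strings.Join(names, "|")
}

// decodeFlags turns the fourth mmap argument into symbolic names.
func decodeFlags(flags int) string {
	var names []string
	if flags&mapPrivate != 0 {
		names = append(names, "MAP_PRIVATE")
	}
	if flags&mapFixed != 0 {
		names = append(names, "MAP_FIXED")
	}
	if flags&mapAnon != 0 {
		names = append(names, "MAP_ANON")
	}
	return strings.Join(names, "|")
}

func main() {
	// mmap(0xC000000000, 0x4000000, 0x0, 0x1002, ...)
	fmt.Println(decodeProt(0x0), decodeFlags(0x1002))
	// mmap(0xC000000000, 0x4000000, 0x3, 0x1012, ...)
	fmt.Println(decodeProt(0x3), decodeFlags(0x1012))
}
```

Read this way, the first call at 0xC000000000 maps 0x4000000 bytes (64 MiB) with no access rights, and the second call remaps the same range at a fixed address as readable and writable.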

The suspicious one is mmap, which dtruss counted more than 10 times in its final statistics. It is reasonable to believe that the Go runtime requests a large amount of virtual memory, so next we look at exactly which stage makes those requests.

Note: On Linux, use the strace command.

Looking at the Go runtime

Startup process

From the analysis above, we know that VSZ is already high when a Go program starts, that shared libraries are not the cause, and that the program does call mmap and related methods at startup. We can therefore reasonably suspect that Go reserves this address space during initialization. The first step is to walk through the Go bootstrap process and see where the reservation happens. The boot process is as follows:

graph TD
A(rt0_darwin_amd64.s:8<br/>_rt0_amd64_darwin) -->|JMP| B(asm_amd64.s:15<br/>_rt0_amd64)
B -->|JMP| C(asm_amd64.s:87<br/>runtime-rt0_go)
C --> D(runtime1.go:60<br/>runtime-args)
D --> E(os_darwin.go:50<br/>runtime-osinit)
E --> F(proc.go:472<br/>runtime-schedinit)
F --> G(proc.go:3236<br/>runtime-newproc)
G --> H(proc.go:1170<br/>runtime-mstart)
H --> I(run runtime-main on newly created p and m)

  • runtime-osinit: obtains the number of CPU cores.
  • runtime-schedinit: initializes the environment the program runs in (stacks, memory allocator, garbage collector, P, and so on).
  • runtime-newproc: creates a new G and binds it to runtime.main.
  • runtime-mstart: starts thread M.

Note: The diagram comes from @Cao Da’s “Go program startup process” and @Quan Cheng’s “Go program is how to run”; both are recommended reading.

Initializing the runtime environment

The method we want to look at is schedinit in the runtime:

func schedinit() {
	...
	stackinit()
	mallocinit()
	mcommoninit(_g_.m)
	cpuinit()       // must run before alginit
	alginit()       // maps must not be used before this call
	modulesinit()   // provides activeModules
	typelinksinit() // uses maps, activeModules
	itabsinit()     // uses activeModules

	msigsave(_g_.m)
	initSigmask = _g_.m.sigmask

	goargs()
	goenvs()
	parsedebugvars()
	gcinit()
  ...
}

Judging by its name, mallocinit is clearly the method that initializes the memory allocator, so let’s continue there.

Initializing the memory allocator

mallocinit

During startup, mallocinit is responsible for initializing the Go program’s memory allocator. Here we focus on the part that deals with virtual memory address ranges:

func mallocinit() {
	...
	if sys.PtrSize == 8 {
		for i := 0x7f; i >= 0; i-- {
			var p uintptr
			switch {
			case GOARCH == "arm64" && GOOS == "darwin":
				p = uintptr(i)<<40 | uintptrMask&(0x0013<<28)
			case GOARCH == "arm64":
				p = uintptr(i)<<40 | uintptrMask&(0x0040<<32)
			case GOOS == "aix":
				if i == 0 {
					continue
				}
				p = uintptr(i)<<40 | uintptrMask&(0xa0<<52)
			case raceenabled:
				...
			default:
				p = uintptr(i)<<40 | uintptrMask&(0x00c0<<32)
			}
			hint := (*arenaHint)(mheap_.arenaHintAlloc.alloc())
			hint.addr = p
			hint.next, mheap_.arenaHints = mheap_.arenaHints, hint
		}
	} else {
		...
	}
}

  • Determine whether the current system is 64-bit or 32-bit.
  • Set the hint addresses. In the default case they run from 0x7fc000000000 (i = 0x7f) down to 0xc000000000 (i = 0).
  • Check the current GOARCH and GOOS, and whether the race detector is enabled, and request contiguous address ranges accordingly. Here p is the start address of the contiguous address range to request.
  • Save the arena information just calculated in an arenaHint.
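The address arithmetic in the default branch is easy to check by hand. The following sketch reproduces only that one expression; it uses uint64 instead of uintptr and drops uintptrMask so that it behaves the same on any platform, and arenaHintAddr is a name invented for this example.

```go
package main

import "fmt"

// arenaHintAddr mirrors the default case of the loop in mallocinit:
//
//	p = uintptr(i)<<40 | uintptrMask&(0x00c0<<32)
//
// On a 64-bit system the mask has no effect, so it is omitted here.
func arenaHintAddr(i uint64) uint64 {
	return i<<40 | 0x00c0<<32
}

func main() {
	for _, i := range []uint64{0x7f, 0x01, 0x00} {
		fmt.Printf("i = %#04x -> p = %#x\n", i, arenaHintAddr(i))
	}
}
```

The value for i = 0 is 0xc000000000, which is exactly the address passed to mmap in the dtruss trace earlier.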

The addressable range of virtual memory depends on whether the system is 32-bit or 64-bit, so the two cases must be distinguished; otherwise mapping at high virtual addresses would run into problems. When reserving space, the runtime relies on the arenaHint structure, which is a node in the arenaHints linked list:

type arenaHint struct {
	addr uintptr
	down bool
	next *arenaHint
}

  • addr: the start address of the arena.
  • down: whether the arena grows downward from addr (when it is set, mheap.sysAlloc below subtracts the requested size from addr).
  • next: pointer to the next arenaHint.

Go Runtime divides the requested virtual memory into three chunks, as follows:

  • spans: records the mapping from arena page numbers to mspan.
  • bitmap: describes how the arena is used. Functionally, it marks which addresses in the arena already hold objects.
  • arena: the Go heap area managed by mheap, with a MaxMem of 512GB-1. Functionally, Go requests a contiguous range of virtual addresses for the arena during initialization and keeps it reserved; only when the program really needs heap space is part of it turned into physical memory.

So to understand this problem, you need to understand the role the arena plays in Go’s memory management.

mmap

We now know what mallocinit does, but you may be wondering whether the mmap system calls we saw earlier are related to it. Let’s look at the lower-level code:

func sysAlloc(n uintptr, sysStat *uint64) unsafe.Pointer {
	p, err := mmap(nil, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_PRIVATE, -1, 0)
	...
	mSysStatInc(sysStat, n)
	return p
}

func sysReserve(v unsafe.Pointer, n uintptr) unsafe.Pointer {
	p, err := mmap(v, n, _PROT_NONE, _MAP_ANON|_MAP_PRIVATE, -1, 0)
	...
}

func sysMap(v unsafe.Pointer, n uintptr, sysStat *uint64) {
	...
	munmap(v, n)
	p, err := mmap(v, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_FIXED|_MAP_PRIVATE, -1, 0)
  ...
}

The Go runtime has a number of low-level methods for requesting memory from the system. The ones that matter for this article are:

  • sysAlloc: requests zeroed memory from the OS. The call parameters are _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_PRIVATE, and the result must be memory aligned.
  • sysReserve: reserves address space from the OS without allocating any physical memory yet. The call parameters are _PROT_NONE, _MAP_ANON|_MAP_PRIVATE, and the result must be memory aligned.
  • sysMap: tells the OS that we want to start using reserved address space. The call parameters are _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_FIXED|_MAP_PRIVATE.

At this point you may be puzzled: it is not obvious where mallocinit calls mmap during initialization.
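The reserve-then-use pattern described above can be tried from ordinary Go code on a Unix-like system. This is only a sketch under stated assumptions, not the runtime’s implementation: it uses the standard syscall package, and because syscall.Mmap does not accept a target address, the “use” step is done with syscall.Mprotect instead of the munmap plus MAP_FIXED remap that sysMap performs. The function names reserve, commit, and reserveAndTouch are invented for this example.

```go
package main

import (
	"fmt"
	"syscall"
)

// reserve asks the OS for n bytes of address space with no access rights,
// in the spirit of sysReserve. No physical memory is committed.
func reserve(n int) ([]byte, error) {
	return syscall.Mmap(-1, 0, n, syscall.PROT_NONE, syscall.MAP_ANON|syscall.MAP_PRIVATE)
}

// commit makes part of a reserved region readable and writable.
// Assumption: mprotect stands in for the runtime's MAP_FIXED remap.
func commit(region []byte) error {
	return syscall.Mprotect(region, syscall.PROT_READ|syscall.PROT_WRITE)
}

// reserveAndTouch reserves size bytes, commits only the first page, writes
// to it, and returns the value read back.
func reserveAndTouch(size int) (byte, error) {
	mem, err := reserve(size)
	if err != nil {
		return 0, err
	}
	defer syscall.Munmap(mem)

	page := mem[:syscall.Getpagesize()]
	if err := commit(page); err != nil {
		return 0, err
	}
	page[0] = 42 // first touch: only now does the OS back this page
	return page[0], nil
}

func main() {
	// 64 MiB, the same size as the 0x4000000 mapping in the trace.
	v, err := reserveAndTouch(64 << 20)
	if err != nil {
		fmt.Println("mmap demo failed:", err)
		return
	}
	fmt.Println("read back from committed page:", v)
}
```

While the region is only reserved, it adds 64 MiB to VSZ but essentially nothing to RSS; touching the rest of the region before committing it would crash the program with a fault.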

for i := 0x7f; i >= 0; i-- {
	...
	hint := (*arenaHint)(mheap_.arenaHintAlloc.alloc())
	hint.addr = p
	hint.next, mheap_.arenaHints = mheap_.arenaHints, hint
}

In fact, calling mheap_.arenaHintAlloc.alloc() leads to the sysAlloc method of mheap, which in turn is related to mmap. Note that this method is not the same as the plain sysAlloc function shown above:

var mheap_ mheap
...
func (h *mheap) sysAlloc(n uintptr) (v unsafe.Pointer, size uintptr) {
	...
	for h.arenaHints != nil {
		hint := h.arenaHints
		p := hint.addr
		if hint.down {
			p -= n
		}
		if p+n < p {
			v = nil
		} else if arenaIndex(p+n-1) >= 1<<arenaBits {
			v = nil
		} else {
			v = sysReserve(unsafe.Pointer(p), n)
		}
		...
}

Perhaps surprisingly, mheap.sysAlloc actually calls sysReserve, the very method that reserves address space from the OS.
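To see how the hint list and the loop in mheap.sysAlloc fit together, here is a simplified simulation. It is a sketch, not the runtime code: the hint type is a stand-in for arenaHint, the arenaIndex bound check is left out, and tryReserve is a hypothetical callback that plays the role of sysReserve.

```go
package main

import "fmt"

// hint is a simplified stand-in for the runtime's arenaHint.
type hint struct {
	addr uint64
	down bool
	next *hint
}

// buildHints pushes hints the way mallocinit does (default case only):
// i runs from 0x7f down to 0 and each hint is prepended, so the hint for
// i == 0 ends up at the head of the list.
func buildHints() *hint {
	var head *hint
	for i := 0x7f; i >= 0; i-- {
		head = &hint{addr: uint64(i)<<40 | 0x00c0<<32, next: head}
	}
	return head
}

// firstUsable walks the list like the loop in mheap.sysAlloc: it adjusts
// the address for downward-growing hints, skips candidates whose end
// address would wrap around, and returns the first address that
// tryReserve accepts.
func firstUsable(head *hint, n uint64, tryReserve func(p, n uint64) bool) (uint64, bool) {
	for h := head; h != nil; h = h.next {
		p := h.addr
		if h.down {
			p -= n
		}
		if p+n < p { // address arithmetic overflowed
			continue
		}
		if tryReserve(p, n) {
			return p, true
		}
	}
	return 0, false
}

func main() {
	head := buildHints()
	const size = 64 << 20

	// Pretend the OS accepts the first address it is offered.
	p, _ := firstUsable(head, size, func(p, n uint64) bool { return true })
	fmt.Printf("first reservation attempt: %#x\n", p)

	// Pretend the first address is taken, so the next hint is used.
	p, _ = firstUsable(head, size, func(p, n uint64) bool { return p != 0xc000000000 })
	fmt.Printf("fallback reservation:      %#x\n", p)
}
```

Because the hints are prepended while i counts down, the lowest address is tried first, which matches the 0xC000000000 seen in the system call trace.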

Summary

In this section, we first wrote a test program and then tracked down the suspect step by step, following a somewhat unconventional line of investigation. The overall process was as follows:

  • Use top and ps to view the running status of the process and analyze the basic metrics.
  • Use pprof and runtime.MemStats to check the running status of the application and determine whether there is a leak or unusually high usage at the application level.
  • Use the vmmap command to view the memory mappings of the process and check whether some region of its virtual address space, such as the shared libraries, is unusually large.
  • Use the dtruss command to view the program’s system calls and look for unusual behavior. In our analysis, mmap calls made up a large share, which gave us good reason to suspect that Go reserves a lot of memory at startup.
  • Finally, analyze the Go runtime source code. In front of the source code there are no secrets, and there is no need to guess.

In conclusion, the high VSZ has little to do with shared libraries and the like. It is directly related to the Go runtime, namely the runtime heap (malloc) region shown in the earlier figure. In Go runtime terms, a certain amount of virtual address space is reserved while mallocinit initializes the memory allocator.

What determines how much virtual memory is reserved is a more philosophical question. Judging from the source code, the main factors are:

  • The OS and architecture (GOARCH/GOOS) and the word size (32-bit or 64-bit).
  • Memory alignment: the calculated size of the address space must be aligned before it is reserved.

Conclusion

Step by step, we have worked out where Go reserves so much virtual memory, which factors affect it, and which methods it calls to do so. You may still be worried about whether a high process VSZ is a problem. My analysis is as follows:

  • A high VSZ does not mean that much physical memory is actually in use, so there is nothing to worry about.
  • VSZ puts no pressure on the GC. The GC manages the physical memory the process actually uses, and reserved address space costs very little until it is actually used.
  • Most of the VSZ consists of inaccessible memory mappings, that is, mappings with no read, write, or execute permission.

Further thoughts

It is a relief to get this far: a high VSZ in Go causes no real problem for us. But on second thought, why does Go request so much virtual memory in the first place?

My overall understanding is as follows:

  • Go’s design takes the later use of the arena and bitmap into account, so it reserves the whole address space in advance.
  • As the Go runtime and the application keep running, they will certainly start to request and use memory for real. At that point the memory allocator, using the arena and bitmap, can turn the reserved address space into usable physical memory, which greatly improves performance.

My official account

I share articles on the Go language, microservice architecture, and unusual system designs. You are welcome to follow my official account and get in touch.

The best relationships are mutually rewarding. Your likes are the biggest motivation for Fried Fish to keep writing. Thank you for your support.