Memory allocation for general programs

Before we talk about the memory allocation for Golang, let’s look at the memory distribution for general programs:

This is the logical classification of program memory.

Let’s look at a real (real logic) picture of memory for a typical program:

The core idea of memory allocation in Go

Go is a built-in runtime programming language (Runtime), and programming languages like the built-in runtime often dispense with traditional memory allocation and manage it themselves. In this way, operations such as pre-allocation and memory pooling can be performed to avoid performance problems caused by system calls and prevent the need for system calls every time memory is allocated.

The core ideas of Go memory allocation can be divided into the following points:

  • Each time you apply for a large chunk of memory from the operating system, Go allocates this chunk of memory to reduce system calls

  • The memory allocation algorithm adopts Google’s TCMalloc algorithm. The algorithm is more complex, and its principle can be consulted. The core idea is to split the memory into very small, multi-level management, to reduce the granularity of the lock.

  • When object memory is reclaimed, it is not actually freed; it is simply put back into a pre-allocated chunk of memory for reuse. Only when there is too much idle memory, it tries to return some memory to the operating system to reduce the overall overhead

Go memory structure

Go allocates a contiguous chunk of memory (virtual memory) when the program starts. As a whole:

The size of the span and bitmap in the figure changes as the heap changes

arena

The arena area is what is commonly called the heap. There are two kinds of “things” in the heap, in terms of management and use:

One is a large block of memory consisting of multiple contiguous pages from an administrative allocation perspective:There are many “objects” in the heap:

spans

The SPANS region, you can think of as the spans region for managing the assigned Arena (aka heap) described above. This area is storedmspanThe pointer,mspanWe’ll talk about what that is. The SPANS region is used to indicate which page of an Arena belongs tomspan.

Mspan is arguably the most basic unit of GO memory management, but memory usage ultimately comes down to “objects.” How does mSPAN relate to objects? In fact, “objects” must also be placed in the page, after all, page is the basic unit of memory storage.

Let’s take a look at how objects and memory are allocated in general

If you reallocate “p4”, is there not enough memory to allocate? Is there a lot of debris?

In this general allocation case, memory fragmentation occurs. How does GO solve this problem?

It can be summed up in two words: To each according to his needs. Go divides the memory blocks into 67 kinds of different sizes, and then divides the 67 kinds of large memory blocks into small ones one by one (which can be approximately understood as the equivalent of different sizespage) is calledspan(continuouspage) in gomspan.When allocating objects, select objects of similar size according to their sizespanIn this way, the fragmentation problem is solved.

The span code with different sizes in 67 is commented as follows (current version 1.11) :


     

    // class bytes/obj bytes/span objects tail waste max waste

    // 1 8 8192 1024 0 87.50%

    // 2 16 8192 512 0 43.75%

    // 3 32 8192 256 0 46.88%

    // 4 48 8192 170 32 31.52%

    // 5 64 8192 128 0 23.44%

    // 6 80 8192 102 32 19.07%

    // 7 96 8192 85 32 15.95%

    // 8 112 8192 73 16 13.56%

    // 9 128 8192 64 0 11.72%

    // 10 144 8192 56 128 11.82%

    // 11 160 8192 51 32 9.73%

    // 12 176 8192 46 96 9.59%

    // 13 192 8192 42 128 9.25%

    // 14 208 8192 39 80 8.12%

    // 15 224 8192 36 128 8.15%

    // 16 240 8192 34 32 6.62%

    // 17 256 8192 32 0 5.86%

    // 18 288 8192 28 128 12.16%

    // 19 320 8192 25 192 11.80%

    // 20 352 8192 23 96 9.88%

    // 21 384 8192 21 128 9.51%

    // 22 416 8192 19 288 10.71%

    // 23 448 8192 18 128 8.37%

    // 24 480 8192 17 32 6.82%

    // 25 512 8192 16 0 6.05%

    // 26 576 8192 14 128 12.33%

    // 27 640 8192 12 512 15.48%

    // 28 704 8192 11 448 13.93%

    // 29 768 8192 10 512 13.94%

    // 30 896 8192 9 128 15.52%

    // 31 1024 8192 8 0 12.40%

    // 32 1152 8192 7 128 12.41%

    // 33 1280 8192 6 512 15.55%

    // 34 1408 16384 11 896 14.00%

    // 35 1536 8192 5 512 14.00%

    // 36 1792 16384 9 256 15.57%

    // 37 2048 8192 4 0 12.45%

    // 38 2304 16384 7 256 12.46%

    // 39 2688 8192 3 128 15.59%

    // 40 3072 24576 8 0 12.47%

    // 41 3200 16384 5 384 6.22%

    // 42 3456 24576 7 384 8.83%

    // 43 4096 8192 2 0 15.60%

    // 44 4864 24576 5 256 16.65%

    // 45 5376 16384 3 256 10.92%

    // 46 6144 24576 4 0 12.48%

    // 47 6528 32768 5 128 6.23%

    // 48 6784 40960 6 256 4.36%

    // 49 6912 49152 7 768 3.37%

    // 50 8192 8192 1 0 15.61%

    // 51 9472 57344 6 512 14.28%

    // 52 9728 49152 5 512 3.64%

    // 53 10240 40960 40 4.99%

    // 54 10880 32768 3 128 6.24%

    // 55 12288 24576 2 0 11.45%

    // 56 13568 40960 3 256 9.99%

    // 57 14336 57344 4 0 5.35%

    // 58 16384 16384 1 0 12.49%

    // 59 18432 73728 4 0 11.11%

    // 60 19072 57344 3 128 3.57%

    // 61 20480 40960 20 6.87%

    // 62 21760 65536 3 256 6.25%

    // 63 24576 24576 1 0 11.45%

    // 64 27264 81920 3 128 10.00%

    // 65 28672 57344 2 0 4.91%

    // 66 32768 32768 1 0 12.50%

Copy the code

Say what each column represents:

  • Class: Class ID. Each SPAN structure has a class ID that represents the types of objects that can be processed by the span

  • Bytes /obj: This class represents the number of bytes of the object

  • Bytes/SPAN: Indicates the number of bytes occupied by each span. That is, the number of pages x the size of pages

  • Objects: Number of objects per span, aka (bytes/spans)/(bytes/obj)

  • Waste bytes: Memory fragmentation per span, aka (bytes/ SPANS) % (bytes/obj)

The reading method is as follows: Take the span of type (class) 1 as an example. The size of elements in the SPAN is 8 bytes. The span itself occupies a page, that is, 8 KB.

If you are careful, you may notice that there are 66 types in the code, as well as a special span: for objects larger than 32K, a special span is allocated directly from the heap. This special span is of type (class) 0 and contains only one large object. The span size is determined by the size of the object.

bitmap

There are several types of bitmaps :Stack, data, and BSS bitmaps, and heap bitmaps. The purpose of this bitmap is to tag objects in the arena(or heap). The first is to mark whether an object exists in the corresponding address, and the second is to mark whether the object has been marked by GC. One bit for each function, so heap bitmaps use two bits. A byte in the bitmap region corresponds to the four pointer size memory of the Arena region with the following structure:

Bitmap addresses grow from high to low.

Macro picture:

The main purpose of bitmaps is to serve GC.

arenaContains the basic snap-in and the objects or entities generated by the program runtime, respectivelyspansandbitmapThe two non-heap areas of memory correspond to. The logical diagram is as follows:Spans and Bitmap both dynamically size depending on arena dynamics.

Memory management component

The main memory management components of GO are MSPAN, McAche, McEntral, and MHeap

  • Mspan is the basic unit of memory management, where data is stored directly.

  • McAche: Each run-time Goroutine is bound to a McAche (specifically, the P in the bound GMP concurrency model, so mSPAN can be allocated without a lock, more on that later), and the McAche allocates the memory space required by the Goroutine to run (i.e., mspan).

  • McEntral shards backup Mspans for all McAche

  • Mheap represents all heap space held by the Go program. It also manages unused spans and requests new memory from the operating system as needed.

mspan

One might ask: Where is the MSPAN structure stored? In fact, the MEMORY of the MSPAN structure itself is allocated from the system and won’t be discussed here.mspanIn the above,spansIs convenient according to the size of the object to allocate the use of memory blocks, a total of 67 types; The main solution is memory fragmentation, which reduces memory fragmentation and improves memory usage.mspanIs a bidirectional linked list, where the main attributes are shown below:

Spans are the basic unit of memory management in Go, and YOU’ve been covered in detail on SPANS above, so I don’t want to go over them here.

mcache

In order to prevent multiple threads from locking memory, goroutine allocates a cache of span memory blocks for each thread. This cache is a McAche bound to each Goroutine. There is no lock contention between goroutines when applying for memory.

How did you do that?

Before we Go, please review the concurrent scheduling model of Go. If you are not familiar with it, please read my article on the Concurrent scheduling principle of Go

Then look at the picture below:

That’s basically what it looks like. Notice where our McAche is? Right on the P! You can see why there is no lock contention, because at runtime a Goroutine can only be associated with a P, and McAche is on P, so there can be no lock contention.

Let’s look at the structure of a McAche:

The SPAN list in McAche is divided into two groups, one for objects that contain pointer types and the other for objects that do not. Why split?

The main purpose of this is to facilitate GC, and there is no need to further scan the list of objects that do not contain Pointers for references to other active objects during garbage collection (if you are not familiar with GO’s GC, see my article to illustrate Golang’s GC algorithm).

Objects <= 32K are allocated directly with McAche.

Here, I think it is necessary to talk about the classification of objects in GO according to the size dimension. Divided into three categories:

  • Tinny allocations (size < 16 bytes, no Pointers)

  • small allocations (16 bytes < size <= 32k)

  • large allocations (size > 32k)

The first two categories: Tiny Allocations and Small Allocations are allocated directly through McAche.

For Tiny allocations, there is a tiny allocator that allocations objects that do not contain Pointers, such as small strings, independent escape variables that do not contain Pointers, and so on.

Small allocations is that McAche looks for its own size to match MSPAN based on object sizes. When McAch runs out of free space, a new MSPAN of the desired size is obtained from McEntral’s MSPANS list.

mcentral

Provides a shard Mspan for all McAche. Each McEntral keeps a list of global Mspans of a specific type, both allocated and unallocated.

Remember the 67 types of MSPAN? There are as many McEntral as there are types of Mspans.

Each McEntral contains a list of two Mspans:

  • There are no free objects or mspan lists that have been cached by McAche.

  • Empty MSPAN List with free objects

Because MSPAN is global and is accessed by all McAche, concurrency issues arise and McEntral has a lock.

A singlemcentralThe structure is as follows:

If McEntral has no free MSPAN list when it needs to allocate memory, it needs to fetch it from mheap.

mheap

Mheap can be considered the entire heap space held by the Go program, and MHeap is globally unique and can be considered a global variable. Its structure is as follows:

The Mheap contains everything except the McAche, which exists in P of the GMP scheduling model of Go, as mentioned above. You can refer to my article at https://mp.weixin.qq.com/s/74hbRTQ2TjdH5G9F2of4_g. On closer inspection, you can see that there is also a lock lock in the Mheap. What does this lock do?

As we know, objects larger than 32K are defined as large objects that are allocated directly from the Mheap. These large object requests are made by McAche, and McAche is on P. There are often multiple Ps when the program is running. Therefore, the memory requests are concurrent. So in order to be thread-safe, there must be a global lock.

If the allocated memory is not available in the MHEAP when it is needed, a new set of pages (minimum 1MB) is requested from the operating system.

Go Summary of memory allocation process

There are three types of objects:

  • Tiny object, size < 16B

  • Small object. 16 bytes < size <= 32K

  • Large object size > 32K

There are three ways of distribution:

  • Tinny allocations (size < 16 bytes, no Pointers) miniature allocator allocations.

  • Small allocations (size <= 32K) was normal; First by calculating the size specification used, then by using the block allocation of the corresponding size specification in McAche

  • Large allocations (size > 32K) Large object allocations; Direct allocation via Mheap. These large objects are requested at the expense of a global lock, so only one P can be requested at any given point in time.

Object allocation:

  • Size range in (size < 16B), does not contain Pointers to the object. Micro distributor distribution on McAche

  • Size range in (0 < size < 16B), including pointer object: normal allocation

  • Size range: (16B < size <= 32KB) : normal allocation

  • Size range: (size > 32KB) : large object allocation

Distribution order:

  • First by calculating the size specification used.

  • Then use the block allocation of the corresponding size specification in McAche.

  • If there are no blocks available in McEntral, they are applied to the Mheap and the algorithm finds the most appropriate MSPAN.

  • If the requested MSPAN exceeds the requested size, it will be shard based on demand to return the number of pages required by the user. The remaining pages form a new MSPAN and are put back into the mheap’s free list.

  • If there are no spans available in the Mheap, a new set of pages (minimum 1MB) is requested from the operating system.

The memory management of Go is very complicated, and each version has slight changes. Here, I only talk about the things that are easiest to master in a macro way. I hope you can give me your comments as much as possible.

Please follow my wechat official account:

    Internet Technology Nest

You can leave a message on the public account directly

References:

  • The distribution of program in memory at https://www.cnblogs.com/Lynn-Zhang/p/5449199.html

  • Starting from the memory allocation at https://mp.weixin.qq.com/s/EyWKFRu1xryoHY386QUcuA

  • Translation: Go memory allocator visual guide to https://www.linuxzen.com/go-memory-allocator-visual-guide.html

  • Diagram to Go language memory allocation at https://juejin.cn/post/6844903795739082760

  • Golang source exploration (3) the GC implementation principle of https://www.cnblogs.com/zkweb/p/7880099.html

  • Straightforward interpretation Go memory allocation principle of https://yq.aliyun.com/articles/652551

  • The rain mark < >

  • Go memory allocation (English) https://andrestc.com/post/go-memory-allocation-pt1/