Go's built-in runtime abandons the traditional malloc-style memory allocation in favor of managing memory itself. This lets it implement better memory usage patterns, such as memory pooling and pre-allocation, so that a system call is not required for every memory allocation.

The memory allocation algorithm of the Go runtime is mainly derived from TCMalloc (Thread-Caching Malloc), an allocator developed at Google for C/C++. Its core idea is to manage memory in multiple levels to reduce lock granularity. The available heap memory is managed in a two-level allocation scheme: each thread maintains an independent memory pool and allocates from that pool first; only when the pool runs short does it ask the global memory pool, which avoids frequent contention between threads for the global pool.
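
The idea can be sketched in a few lines of Go. The types below (globalPool, localCache) are made up purely for illustration (they are not the runtime's types) and only show the pattern of "allocate from a per-thread cache first, fall back to a locked global pool":

    import "sync"

    // globalPool is a toy stand-in for a global memory pool protected by a lock.
    type globalPool struct {
        mu   sync.Mutex
        free [][]byte
    }

    func (g *globalPool) get() []byte {
        g.mu.Lock()
        defer g.mu.Unlock()
        if n := len(g.free); n > 0 {
            b := g.free[n-1]
            g.free = g.free[:n-1]
            return b
        }
        return make([]byte, 8192) // stand-in for asking the OS for a new page
    }

    // localCache is a toy per-thread cache: the fast path takes no lock and
    // only falls back to the locked global pool when the cache runs dry.
    type localCache struct {
        global *globalPool
        free   [][]byte
    }

    func (l *localCache) alloc() []byte {
        if n := len(l.free); n > 0 {
            b := l.free[n-1]
            l.free = l.free[:n-1]
            return b
        }
        return l.global.get()
    }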


Basic concept

When a Go program starts, it requests a chunk of memory from the operating system (note that this is only virtual address space; no physical memory is actually allocated yet), then cuts it up and manages it itself.

The requested memory block is divided into three regions, which on x64 are 512MB (spans), 16GB (bitmap), and 512GB (arena) respectively.

The arena region is what we usually call the heap, where Go dynamically allocates memory. It is divided into 8KB pages, and groups of these pages are combined into mspans.

The bitmap region records which addresses in the arena hold objects, using two bits per pointer-sized word to mark whether the word contains a pointer and to hold GC mark information. One byte of the bitmap therefore covers four pointer-sized words (a pointer is 8B) in the arena, so the bitmap region is 512GB / (4 × 8B) = 16GB. The diagram below:

In the figure above you can also see that the high-address end of the bitmap corresponds to the low-address end of the arena region; in other words, bitmap addresses grow from high addresses toward low addresses.

The spans region holds pointers to mspans (an mspan is a group of arena pages, more on that later), one pointer per page, so the spans region is 512GB / 8KB × 8B = 512MB: dividing by 8KB gives the number of pages in the arena, and multiplying by 8B gives the total size of the span pointers. When an mspan is created, the corresponding entries in the spans region are filled in page by page, so when an object is reclaimed its mspan can easily be found from its address.
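
As a quick sanity check of this arithmetic, here is a throwaway snippet (not runtime code) that recomputes the sizes of the spans and bitmap regions from the 512GB arena:

    package main

    import "fmt"

    func main() {
        const (
            pageSize = 8 << 10   // 8KB arena pages
            ptrSize  = 8         // pointer size on x64
            arena    = 512 << 30 // 512GB arena on x64
        )
        spans := arena / pageSize * ptrSize // one *mspan pointer per page
        bitmap := arena / (4 * ptrSize)     // one bitmap byte covers 4 words
        fmt.Println(spans>>20, "MB")  // 512 MB
        fmt.Println(bitmap>>30, "GB") // 16 GB
    }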

Memory management unit

mspan: the basic unit of memory management in Go, a large block of memory made up of contiguous 8KB pages. Note that a page here is not the same thing as an operating system page; it is typically several times the size of an OS page. In a nutshell, an mspan is a node of a doubly linked list that records the start address, the span size, the number of pages, and so on.

Each mspan is divided into several slots according to its sizeClass attribute; each slot can hold one object, and a bitmap marks which slots are still unused. The sizeClass determines the slot size, and an mspan only serves objects that are close to, but not larger than, that slot size. There is a related concept, spanClass, which is similar to sizeClass:


     

    Size_Class = Span_Class / 2


This is because each sizeClass corresponds to two mspans, and therefore two spanClasses: one is assigned to objects that contain pointers, the other to objects that do not. This benefits the garbage collector, which will be discussed in a later article.
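
The relation between the two can be written down directly. The sketch below mirrors the helpers in runtime/mheap.go (slightly simplified): the low bit records whether the span holds pointer-free (noscan) objects, and the remaining bits are the size class, which is why Size_Class = Span_Class / 2.

    type spanClass uint8

    // Each size class maps to two span classes: an even one for objects that
    // may contain pointers (scan) and an odd one for objects that do not (noscan).
    func makeSpanClass(sizeclass uint8, noscan bool) spanClass {
        sc := spanClass(sizeclass << 1)
        if noscan {
            sc |= 1
        }
        return sc
    }

    func (sc spanClass) sizeclass() int8 { return int8(sc >> 1) }
    func (sc spanClass) noscan() bool    { return sc&1 != 0 }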

In the figure below, an mspan consists of a set of contiguous pages divided into slots of a certain size.

There are 67 mspan size classes in Go 1.9.2. Each size class defines an object size that is a multiple of 8 bytes, as written in the code:


     

    // path: /usr/local/go/src/runtime/sizeclasses.go

    const _NumSizeClasses = 67

    var class_to_size = [_NumSizeClasses]uint16{0, 8, 16, 32, 48, 64, 80, 96, 112, 128, 144, 160, 176, 192, 208, 224, 240, 256, 288, 320, 352, 384, 416, 448, 480, 512, 576, 640, 704, 768, 896, 1024, 1152, 1280, 1408, 1536, 1792, 2048, 2304, 2688, 3072, 3200, 3456, 4096, 4864, 5376, 6144, 6528, 6784, 6912, 8192, 9472, 9728, 10240, 10880, 12288, 13568, 14336, 16384, 18432, 19072, 20480, 21760, 24576, 27264, 28672, 32768}


Given an mspan's size class, the object size can be looked up directly in this array. For example, if the size class is 3, the object size is 32B, and a 32B slot can hold any object from 17B to 32B. For tiny objects (smaller than 16B), the allocator packs several of them into a single slot to save space.

The largest number in the array is 32768, i.e. 32KB. Anything larger than that is a large object, which is treated very differently, as we'll see later. Incidentally, size class 0 is reserved for large objects, which are allocated directly from heap memory (the mheap), while small objects are allocated from mspans.
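
A linear search over class_to_size is enough to illustrate this rounding; the real runtime uses precomputed lookup tables (size_to_class8 and size_to_class128 in sizeclasses.go) rather than a loop, so treat this only as a sketch:

    // sizeToClass returns the smallest size class whose object size can hold
    // `size` bytes, or 0 for large objects (bigger than 32KB).
    func sizeToClass(size uint16) int {
        for c := 1; c < len(class_to_size); c++ {
            if class_to_size[c] >= size {
                return c
            }
        }
        return 0
    }

    // sizeToClass(17) == 3 and sizeToClass(32) == 3: everything from 17B to
    // 32B ends up in a 32B slot.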

For an mspan, its size class also determines how many pages it gets, which again is written in the code:


     

    // path: /usr/local/go/src/runtime/sizeclasses.go

    const _NumSizeClasses = 67

    var class_to_allocnpages = [_NumSizeClasses]uint8{0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 2, 1, 3, 2, 3, 1, 3, 2, 3, 4, 5, 6, 1, 7, 6, 5, 4, 3, 5, 7, 2, 9, 7, 5, 8, 3, 10, 7, 4}


For example, when requesting an mspan for 32B objects, the index into class_to_size is 3, and class_to_allocnpages[3] is 1, so that mspan consists of a single page.
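
Putting the two arrays together, the object size, page count, objects per span and wasted tail bytes all fall out of simple arithmetic. A quick helper (not runtime code; class must be between 1 and 66):

    // spanLayout recomputes the layout of one size class from the arrays above.
    func spanLayout(class int) (elemsize, npages, nelems, waste int) {
        const pageSize = 8192
        elemsize = int(class_to_size[class])
        npages = int(class_to_allocnpages[class])
        nelems = npages * pageSize / elemsize
        waste = npages*pageSize - nelems*elemsize
        return
    }

    // spanLayout(3)  -> 32, 1, 256, 0    (8KB / 32B  = 256 objects, no waste)
    // spanLayout(10) -> 144, 1, 56, 128  (8KB / 144B = 56 objects, 128B wasted)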

mspan structure definition:


     

    // path: /usr/local/go/src/runtime/mheap.go

    type mspan struct {
        // next span in the linked list
        next *mspan
        // previous span in the linked list
        prev *mspan
        // start address, i.e. the address of the first managed page
        startAddr uintptr
        // number of pages managed by this span
        npages uintptr
        // number of blocks (slots) available for allocation
        nelems uintptr
        // allocation bitmap, each bit marks whether a block is allocated
        allocBits *gcBits
        // number of allocated blocks
        allocCount uint16
        // span class, which encodes the size class (see the formula above)
        spanclass spanClass
        // object size in this span, i.e. the block size
        elemsize uintptr
    }


Let's put mspan into a larger perspective:

In the figure above, two spans pointers point to the same mspan, because the pages they correspond to belong to the same mspan. So, from an address in the arena you can quickly find the spans entry for its page, and from that entry the mspan. Recall from earlier that each pointer in the spans region corresponds to one page.

Suppose the size class of the first mspan on the left equals 10. According to the class_to_size array above, its object size is 144B, and the number of objects is 8KB / 144B ≈ 56, so some memory is wasted (the Go source records the wasted bytes for every size class). According to the class_to_allocnpages array, this mspan consists of only 1 page. Assuming this mspan holds objects that contain pointers, its spanClass is 20 (the pointer-free variant would be 21).

startAddr points directly into the arena region and marks the starting address of the mspan; allocBits points to a bitmap in which each bit indicates whether a block has been allocated; allocCount is the number of blocks already allocated.
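
A much simplified version of the allocBits bookkeeping looks like this. The real runtime stores the bitmap in a *gcBits and caches a 64-bit window of it (allocCache) plus a freeindex so it never rescans from the start; the sketch below just uses a plain byte slice:

    // nextFree finds the first unallocated slot in a byte-slice bitmap and
    // marks it as allocated, returning its index (or -1 if the span is full,
    // in which case a fresh mspan has to be fetched from mcentral).
    func nextFree(allocBits []byte, nelems int) int {
        for i := 0; i < nelems; i++ {
            idx, mask := i/8, byte(1)<<uint(i%8)
            if allocBits[idx]&mask == 0 {
                allocBits[idx] |= mask
                return i // the object's address is startAddr + i*elemsize
            }
        }
        return -1
    }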

Thus, the field values of the first mspan from the left are shown below:

Memory management components

Memory allocation is done by the memory allocator, which consists of three components: mcache, mcentral, and mheap.

mcache

mcache: each worker thread is bound to an mcache, which locally caches available mspans. A goroutine can allocate from it directly; since no other goroutine competes for it, no lock is needed.

mcache structure definition:


     

    //path: /usr/local/go/src/runtime/mcache.go

    type mcache struct {

       alloc [numSpanClasses]*mspan

    }

    numSpanClasses = _NumSizeClasses << 1


mcache uses spanClass as the index into alloc, which holds one mspan of every specification, so its length is twice _NumSizeClasses (67 × 2 = 134). To speed up garbage collection, half of the mspans hold objects without pointers and the other half hold objects with pointers.

An mspan holding pointer-free objects does not need to be scanned for references to other live objects during garbage collection. I'll cover this in a later article on garbage collection, so I'll stop here.
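
With makeSpanClass from earlier, picking the right cached mspan for an allocation is just an array lookup. This helper is illustrative only (the runtime does the same indexing inline in mallocgc):

    // spanFor returns the cached mspan for a given size class, choosing the
    // noscan or scan variant depending on whether the object holds pointers.
    func (c *mcache) spanFor(sizeclass uint8, noscan bool) *mspan {
        return c.alloc[makeSpanClass(sizeclass, noscan)]
    }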

When an mcache is initialized, it holds no mspans; they are requested dynamically from mcentral during use and then cached. Objects smaller than or equal to 32KB are allocated from the mcache's mspans.

mcentral

mcentral: provides pre-cut mspans for all mcaches. Each mcentral keeps a global list of mspans of one specific size class, both those that still have free slots and those that do not. Each mcentral corresponds to one kind of mspan, and that kind determines the size of the objects the mspan is divided into. When a worker thread's mcache has no mspan of the appropriate size, it fetches one from the mcentral.

mcentral is shared by all worker threads, so multiple goroutines compete for it and locks are required. Structure definition:


     

    //path: /usr/local/go/src/runtime/mcentral.go

    type mcentral struct {
        // mutex
        lock mutex
        // size class served by this mcentral
        sizeclass int32
        // list of mspans that still have free objects
        nonempty mSpanList
        // list of mspans with no free objects, or mspans already taken by an mcache
        empty mSpanList
        // cumulative count of objects allocated from this mcentral
        nmalloc uint64
    }


As a diagram:

empty means the mspans in that list have no free objects or have been taken by an mcache and are in use by a worker thread; nonempty is the list of mspans that still have free objects. Every mcentral structure is maintained inside the mheap.

Here's how an mcache gets an mspan from mcentral and returns it (sketched in code after the list):

  • Get: acquire the lock; find a usable mspan in the nonempty list; remove it from the nonempty list; add it to the empty list; return the mspan to the worker thread; release the lock.

  • Return: acquire the lock; remove the mspan from the empty list; add it to the nonempty list; release the lock.
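
A deliberately simplified stand-in for these two flows, using a mutex and slices instead of the runtime's lock and mSpanList (the real functions are mcentral.cacheSpan and mcentral.uncacheSpan, which additionally sweep spans and update statistics):

    import "sync"

    // toyCentral mimics the get/return choreography of an mcentral.
    type toyCentral struct {
        mu       sync.Mutex
        nonempty []*mspan // spans that still have free objects
        empty    []*mspan // spans with no free objects, or handed out to an mcache
    }

    // get: lock, take a span off nonempty, park it on empty, unlock.
    func (c *toyCentral) get() *mspan {
        c.mu.Lock()
        defer c.mu.Unlock()
        if len(c.nonempty) == 0 {
            return nil // nothing cached: the caller would now grow via mheap
        }
        s := c.nonempty[len(c.nonempty)-1]
        c.nonempty = c.nonempty[:len(c.nonempty)-1]
        c.empty = append(c.empty, s)
        return s
    }

    // put: lock, remove the span from empty, put it back on nonempty, unlock.
    func (c *toyCentral) put(s *mspan) {
        c.mu.Lock()
        defer c.mu.Unlock()
        for i, v := range c.empty {
            if v == s {
                c.empty = append(c.empty[:i], c.empty[i+1:]...)
                break
            }
        }
        c.nonempty = append(c.nonempty, s)
    }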

mheap

mheap: represents all the heap space held by the Go program. The runtime manages heap memory through a global mheap object, mheap_.

When mcentral has no free mspans, it asks the mheap for more; when the mheap has no resources left, it requests new memory from the operating system. The mheap is also used for allocating large objects and for managing uncut pages, which it slices into small-object mspans for mcentral.

We can also see that the mheap contains an mcentral for every span class, so when an mcache requests an mspan from an mcentral, only that individual mcentral's lock is taken, and requests for mspans of other sizes are not affected (see the sketch after the struct below).

mheap structure definition:


     

    //path: /usr/local/go/src/runtime/mheap.go

    type mheap struct {
        lock mutex
        // points to the spans region, mapping mspans to pages
        spans []*mspan
        // points to the high end of the bitmap region, which grows from high to low addresses
        bitmap uintptr
        // start address of the arena region
        arena_start uintptr
        // highest address of the arena currently in use
        arena_used uintptr
        // end address of the arena region
        arena_end uintptr
        central [67*2]struct {
            mcentral mcentral
            pad      [sys.CacheLineSize - unsafe.Sizeof(mcentral{})%sys.CacheLineSize]byte
        }
    }

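
The central array is why per-size locking works: each span class has its own mcentral, with its own lock, inside the single global mheap. An illustrative helper (not a real runtime function) for picking it out:

    // centralFor returns the mcentral responsible for one span class, so
    // requests for different span classes take different locks.
    func (h *mheap) centralFor(spc spanClass) *mcentral {
        return &h.central[spc].mcentral
    }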

As a diagram:

In the figure above, bitmap and arena_start point to the same address. This is because the bitmap grows downward from high addresses toward low addresses, so the end of the bitmap region meets the start of the arena.

Memory Allocation Process

As mentioned in the previous article, "Where Golang's Variables Go", the result of escape analysis determines whether a variable is allocated on the stack or on the heap. In general, the compiler prefers to allocate variables on the stack because of the low overhead; the extreme case is "zero garbage", where all variables are allocated on the stack so there is no memory fragmentation and no garbage collection.

According to object size, Go's memory allocator divides objects into three categories: tiny objects (smaller than 16B), small objects (from 16B up to 32KB), and large objects (larger than 32KB).

The general allocation flow (a sketch follows the list):

  • Objects larger than 32KB are allocated directly from the mheap;

  • Objects smaller than 16B are allocated using the mcache's tiny allocator;

  • Objects in (16B, 32KB]: compute the size class needed, then allocate from the mspan of the corresponding size in the mcache;

    • If the mcache has no suitable mspan, request one from mcentral;

    • If mcentral has no mspan of the appropriate size, request one from the mheap;

    • If the mheap has no suitable mspan either, request memory from the operating system.
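
Sketching that routing in code (a simplification of the decision at the top of runtime.mallocgc, using the runtime's constant names maxTinySize and maxSmallSize; the real code also checks whether the type contains pointers before taking the tiny path):

    const (
        maxTinySize  = 16    // tiny-allocator threshold
        maxSmallSize = 32768 // anything bigger is a "large" object
    )

    // allocRoute reports which allocation path an object of `size` bytes takes.
    func allocRoute(size uintptr) string {
        switch {
        case size > maxSmallSize:
            return "large: allocate pages directly from mheap"
        case size < maxTinySize:
            return "tiny: pack into the mcache's tiny slot"
        default:
            return "small: round up to a size class, use mcache.alloc[spanClass]"
        }
    }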

Conclusion

Go's memory allocation is very complex, and one of its guiding principles is to reuse whatever can be reused. The source code is hard to follow, and there may be a later article that reads through the memory allocation source. Let's briefly summarize this one.

This article takes a rough look at Go's memory allocation without going into every detail. At this level of depth, you know roughly how it works:

  • When the program starts, Go requests a chunk of memory from the operating system and manages it itself.

  • The basic unit of Go memory management is the mspan, which consists of several pages; each kind of mspan allocates objects of one specific size.

  • mcache, mcentral, and mheap are the three components of Go memory management, layered one on top of another: mcache manages the mspans cached locally by a thread; mcentral manages the global mspans for all threads; mheap manages all of Go's dynamically allocated memory.

  • Tiny objects are packed together into a single slot by the tiny allocator to save space; small objects are allocated from mspans; large objects are allocated directly by the mheap.


References

[Easy to understand, very clear] https://yq.aliyun.com/articles/652551

[Allocator initialization process; very detailed allocation flow charts] https://www.jianshu.com/p/47691d870756

[Good overall diagram] https://swanspouse.github.io/2018/08/22/golang-memory-model/

[Yu Hen's Go 1.5 source code reading notes] https://github.com/qyuhen/book

[Good figures] https://www.jianshu.com/p/47691d870756

[Comprehensive] https://juejin.cn/post/6844903506948669447

[Source code reading] http://legendtkl.com/2017/04/02/golang-alloc/

[Highly recommended; a visual guide with excellent figures] https://www.linuxzen.com/go-memory-allocator-visual-guide.html

[Overall description of the object allocation process] http://gocode.cc/project/4/article/103

[Hands-on with Linux commands] https://mikespook.com/2014/12/%E7%90%86%E8%A7%A3-go-%E8%AF%AD%E8%A8%80%E7%9A%84%E5%86%85%E5%AD%98%E4%BD%BF%E7%94%A8/

[Flow chart of the object allocation call chain] http://blog.newbmiao.com/2018/08/20/go-source-analysis-of-memory-alloc.html

[Very detailed source code walkthrough] https://www.cnblogs.com/zkweb/p/7880099.html

[Source code reading] https://zhuanlan.zhihu.com/p/34930748