General program memory allocation

Before we talk about Golang’s memory allocation, let’s look at the memory distribution of a general program:

This is the logical classification of program memory.

Now let’s look at a diagram of how the memory of a typical program is actually laid out:

Go’s core memory allocation idea

Go is a language with a built-in runtime. Languages like Go usually abandon the traditional approach of allocating memory directly from the operating system and manage memory themselves. This enables techniques such as pre-allocation and memory pooling, and avoids the performance cost of making a system call on every allocation.

The core ideas of Go memory allocation can be divided into the following points:

  • Request a large block of memory from the operating system at a time and let Go itself parcel it out, which reduces system calls.
  • The allocation algorithm adopts Google’s TCMalloc algorithm. The algorithm is complex; you can look into the details yourself. Its core idea is to split memory into many small pieces managed at multiple levels, so that lock granularity is reduced.
  • When an object’s memory is reclaimed it is not actually freed, but simply put back into the pre-allocated block of memory for reuse. Only when memory sits idle for too long does the runtime try to return some of it to the operating system, reducing overall overhead. (A toy sketch of this reuse idea follows.)
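To make the “put back for reuse” idea concrete, here is a minimal, hypothetical free-list pool (my own illustration, not the runtime’s code): released blocks go onto a free list and are handed out again instead of being freed.

package main

import "fmt"

// blockPool is a toy illustration of the "reuse instead of free" idea:
// released blocks are kept on a free list and handed out again,
// so most requests never reach the operating system.
type blockPool struct {
	blockSize int
	free      [][]byte // blocks returned by callers, ready for reuse
}

func (p *blockPool) get() []byte {
	if n := len(p.free); n > 0 {
		b := p.free[n-1]
		p.free = p.free[:n-1]
		return b
	}
	// Free list is empty: fall back to a real allocation.
	return make([]byte, p.blockSize)
}

func (p *blockPool) put(b []byte) {
	// Do not free the block; keep it for the next caller.
	p.free = append(p.free, b)
}

func main() {
	pool := &blockPool{blockSize: 4096}
	b := pool.get()             // free list is empty, so this allocates
	pool.put(b)                 // the block goes back to the pool
	c := pool.get()             // reuses the same block, no new allocation
	fmt.Println(&b[0] == &c[0]) // true: same backing memory
}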

Go memory structure

Go reserves a contiguous block of (virtual) memory when the program starts. The overall layout is as follows:

The sizes of the spans and bitmap areas in the diagram grow along with the heap.

arena

The arena area is what we commonly call the heap. From the perspective of management and use, there are two kinds of “things” in the heap:

One, from the management/allocation perspective, is the large memory blocks made up of multiple contiguous pages; the other, from the usage perspective, is the objects stored in those pages:

spans

You can think of the spans area as the area used to administer the arena (heap) described above. It holds pointers to mspans (we will get to what an mspan is shortly): for each page in the arena area, the spans area records which mspan that page belongs to.

mspan is arguably the most basic unit of Go memory management, but memory is ultimately consumed by “objects”. So what is the relationship between mspans and objects? Objects necessarily live inside pages, since the page is the basic storage unit.

Let’s take a look at how objects and memory are typically allocated:

If “p4” is now allocated again, is there enough contiguous memory left? Doesn’t this leave a lot of fragmentation?

How does Go solve the memory fragmentation problem that this naive allocation scheme causes?

It can be summed up in one phrase: allocate by size class. Go divides memory blocks into 67 different size classes, and cuts the large blocks into small blocks of those sizes one by one (you can roughly think of them as pages of different sizes). A run of contiguous pages is called a span, which is the mspan mentioned above.

span

The 67 size classes are documented in the runtime source with the following comment (as of Go 1.11):

// class  bytes/obj  bytes/span  objects  tail waste  max waste
//     1          8        8192     1024           0     87.50%
//     2         16        8192      512           0     43.75%
//     3         32        8192      256           0     46.88%
//     4         48        8192      170          32     31.52%
//     5         64        8192      128           0     23.44%
//     6         80        8192      102          32     19.07%
//     7         96        8192       85          32     15.95%
//     8        112        8192       73          16     13.56%
//     9        128        8192       64           0     11.72%
//    10        144        8192       56         128     11.82%
//    11        160        8192       51          32      9.73%
//    12        176        8192       46          96      9.59%
//    13        192        8192       42         128      9.25%
//    14        208        8192       39          80      8.12%
//    15        224        8192       36         128      8.15%
//    16        240        8192       34          32      6.62%
//    17        256        8192       32           0      5.86%
//    18        288        8192       28         128     12.16%
//    19        320        8192       25         192     11.80%
//    20        352        8192       23          96      9.88%
//    21        384        8192       21         128      9.51%
//    22        416        8192       19         288     10.71%
//    23        448        8192       18         128      8.37%
//    24        480        8192       17          32      6.82%
//    25        512        8192       16           0      6.05%
//    26        576        8192       14         128     12.33%
//    27        640        8192       12         512     15.48%
//    28        704        8192       11         448     13.93%
//    29        768        8192       10         512     13.94%
//    30        896        8192        9         128     15.52%
//    31       1024        8192        8           0     12.40%
//    32       1152        8192        7         128     12.41%
//    33       1280        8192        6         512     15.55%
//    34       1408       16384       11         896     14.00%
//    35       1536        8192        5         512     14.00%
//    36       1792       16384        9         256     15.57%
//    37       2048        8192        4           0     12.45%
//    38       2304       16384        7         256     12.46%
//    39       2688        8192        3         128     15.59%
//    40       3072       24576        8           0     12.47%
//    41       3200       16384        5         384      6.22%
//    42       3456       24576        7         384      8.83%
//    43       4096        8192        2           0     15.60%
//    44       4864       24576        5         256     16.65%
//    45       5376       16384        3         256     10.92%
//    46       6144       24576        4           0     12.48%
//    47       6528       32768        5         128      6.23%
//    48       6784       40960        6         256      4.36%
//    49       6912       49152        7         768      3.37%
//    50       8192        8192        1           0     15.61%
//    51       9472       57344        6         512     14.28%
//    52       9728       49152        5         512      3.64%
//    53      10240       40960        4           0      4.99%
//    54      10880       32768        3         128      6.24%
//    55      12288       24576        2           0     11.45%
//    56      13568       40960        3         256      9.99%
//    57      14336       57344        4           0      5.35%
//    58      16384       16384        1           0     12.49%
//    59      18432       73728        4           0     11.11%
//    60      19072       57344        3         128      3.57%
//    61      20480       40960        2           0      6.87%
//    62      21760       65536        3         256      6.25%
//    63      24576       24576        1           0     11.45%
//    64      27264       81920        3         128     10.00%
//    65      28672       57344        2           0      4.91%
//    66      32768       32768        1           0     12.50%

What does each column mean?

  • class: the size class ID. Each mspan carries a class ID indicating which kind of object that span serves
  • bytes/obj: the size of objects of this class, in bytes
  • bytes/span: the number of bytes per span, i.e. number of pages × page size
  • objects: the number of objects per span, i.e. (bytes/span) / (bytes/obj), rounded down
  • tail waste: the leftover bytes at the tail of each span, i.e. (bytes/span) % (bytes/obj)
  • max waste: the worst-case percentage of the span that is wasted, when every object stored in it is of the smallest size that maps to this class

For example, a span of class 1 holds 8-byte objects; the span itself occupies one page, i.e. 8 KB, so it can store 1024 objects in total.

The careful reader will notice that only 66 classes appear in the comment; the 67th is a special one. When an object larger than 32 KB is requested, a special span is allocated directly from the heap. This special span has class 0 and contains exactly one large object; its size is determined by the size of that object.
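As a sanity check of the table, here is a small, self-contained Go snippet (my own illustration, not runtime code) that recomputes the objects and tail waste columns for a few classes from bytes/obj and bytes/span:

package main

import "fmt"

func main() {
	// bytes/obj and bytes/span are taken from the size-class table above.
	classes := []struct {
		class, bytesObj, bytesSpan int
	}{
		{1, 8, 8192},
		{5, 64, 8192},
		{10, 144, 8192},
		{48, 6784, 40960},
	}
	for _, c := range classes {
		objects := c.bytesSpan / c.bytesObj   // "objects" column
		tailWaste := c.bytesSpan % c.bytesObj // "tail waste" column
		fmt.Printf("class %2d: %d objects, %d bytes tail waste\n",
			c.class, objects, tailWaste)
	}
}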

bitmap

There are several kinds of bitmaps: the stack, data, and BSS bitmaps, and the heap bitmap. What the heap bitmap does is describe the objects in the arena (heap): one use is to mark whether the corresponding address holds an object, and the other is to mark whether that object has been marked by the GC. Each use takes one bit, so the heap bitmap uses two bits per word, and one byte in the bitmap area covers four pointer-sized words in the arena area.

Bitmap addresses grow from higher addresses to lower addresses.
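As a rough back-of-the-envelope check, the sizes of the two metadata areas follow from the arena size. The figures below (512 GB arena, 8 KB pages, 64-bit pointers) are the commonly quoted ones for the older contiguous amd64 heap layout; treat them as illustrative assumptions rather than guarantees for every Go version or platform:

package main

import "fmt"

func main() {
	// Assumed figures for the contiguous 64-bit heap layout; illustrative only.
	const (
		arenaSize   = 512 << 30 // 512 GB of reservable heap (arena)
		pointerSize = 8         // bytes per word on 64-bit
		pageSize    = 8 << 10   // 8 KB pages
	)
	// Heap bitmap: 2 bits per pointer-sized word => arena/32 bytes.
	bitmapSize := arenaSize / (pointerSize * 8 / 2)
	// Spans area: one 8-byte *mspan pointer per 8 KB page.
	spansSize := arenaSize / pageSize * pointerSize

	fmt.Printf("bitmap: %d GB\n", bitmapSize>>30) // 16 GB
	fmt.Printf("spans:  %d MB\n", spansSize>>20)  // 512 MB
}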

Macro picture:

The arena contains the basic management units (pages and spans) and the objects or entities created while your program runs, and it is mirrored by the two non-heap areas, namely the bitmap and spans areas. The logical diagram is as follows:

Memory management components

Go’s memory management components include mspan, mcache, mcentral, and mheap:

  • mspan: the basic unit of memory management; this is where data is actually stored.
  • mcache: bound to each running goroutine (strictly speaking, to the P in the GMP concurrency model, which is why it can hand out mspans without locking; more on that later). It supplies the memory, i.e. mspans, that the goroutine needs while running.
  • mcentral: the backing source of mspans for all mcaches, sliced up by size class.
  • mheap: represents all heap space held by the Go program. It also manages idle spans and asks the operating system for new memory when needed.
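To make the relationships concrete, here is a heavily simplified sketch of how these components relate. The field names and layouts are made up for explanation; the real runtime structs differ:

package main

import "sync"

// Simplified, illustrative view of the allocator components; the real
// runtime structs have many more fields and different names.

type mspan struct {
	startAddr uintptr // address of the first page this span covers
	npages    uintptr // number of 8 KB pages in the span
	sizeclass uint8   // which of the 67 size classes it serves (0 = large object)
	nextFree  uintptr // next free object slot, conceptually
}

type mcache struct {
	// One span per size class, owned by a single P, so no lock is needed.
	// The doubling reflects the pointer / no-pointer (scan/noscan) split.
	alloc [67 * 2]*mspan
}

type mcentral struct {
	lock     sync.Mutex // shared by all mcaches, hence the lock
	nonempty []*mspan   // spans that still have free objects
	empty    []*mspan   // spans with no free objects, or handed out to an mcache
}

type mheap struct {
	lock    sync.Mutex
	central [67 * 2]mcentral // one mcentral per span class
	free    []*mspan         // idle spans waiting to be reused or returned to the OS
}

func main() { _ = mheap{} }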

mspan

mspan is the basic unit of memory management in Go. I covered it in detail in the spans section above, so I won’t repeat it here.

mcache

To avoid taking a lock every time multiple threads request memory, Go gives each execution context its own cache of span memory blocks. This cache is the mcache; a running goroutine is effectively bound to one mcache, so there is no lock contention when it allocates memory.

How is that achieved?

Before going further, please review Go’s concurrent scheduling (GMP) model; if you are not familiar with it, see my article mp.weixin.qq.com/s/74hbRTQ2T…

Then look at the picture below:

The figure above shows the basic picture. Where is our mcache? On P! The reason there is no lock contention is that at any moment a goroutine runs on only one P, and the mcache lives on that P, so allocation needs no lock.
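The same per-P trick is visible in the standard library: sync.Pool internally keeps per-P local pools, so Get and Put usually contend on no lock. It is only a user-level analogue of hanging the mcache off P, not the allocator itself:

package main

import (
	"fmt"
	"sync"
)

func main() {
	// sync.Pool keeps per-P local caches internally, so in the common case
	// Get and Put do not contend on a lock -- the same idea as the runtime
	// hanging an mcache off each P.
	bufPool := sync.Pool{
		New: func() interface{} { return make([]byte, 4096) },
	}

	buf := bufPool.Get().([]byte) // usually served from the local P's cache
	// ... use buf ...
	bufPool.Put(buf) // returned to the local P's cache, not freed
	fmt.Println("buffer length:", len(buf))
}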

Let’s take a look at the structure of mcache:

The span lists in mcache are divided into two groups: one for objects that contain pointers, and one for objects that do not. Why separate them?

For the GC’s convenience: during garbage collection there is no need to scan the pointer-free objects for references to other live objects. (If you are not familiar with Go’s GC, see my article mp.weixin.qq.com/s/_h0-8hma5…)

Objects of 32 KB or less are allocated directly through the mcache.

At this point it is worth explaining how Go classifies objects by size. There are three categories:

  • Tiny allocations (size < 16 bytes, no pointers)
  • Small allocations (16 bytes < size <= 32 KB)
  • Large allocations (size > 32 KB)

The first two categories, tiny allocations and small allocations, are served directly from the mcache.

Tiny allocations are handled by a dedicated tiny allocator and apply only to objects that contain no pointers, such as small strings and escaped variables without pointers. (A small sketch of the packing idea follows.)
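The tiny allocator packs several of these tiny, pointer-free objects into one shared 16-byte block. Below is a minimal, hypothetical sketch of that bump-pointer packing, not the runtime’s implementation (which also aligns sizes):

package main

import "fmt"

// tinyAlloc mimics the idea of the tiny allocator: several small,
// pointer-free allocations share one 16-byte block until it is full.
type tinyAlloc struct {
	block  [16]byte // the current shared block
	offset int      // next free byte inside the block
}

// alloc returns the offset of a size-byte slot inside the shared block,
// starting a fresh block when the current one cannot fit the request.
func (t *tinyAlloc) alloc(size int) (newBlock bool, off int) {
	if t.offset+size > len(t.block) {
		t.block = [16]byte{} // pretend we grabbed a new 16-byte block
		t.offset = 0
		newBlock = true
	}
	off = t.offset
	t.offset += size
	return newBlock, off
}

func main() {
	var t tinyAlloc
	for _, size := range []int{5, 4, 6, 3} { // e.g. tiny strings and scalars
		fresh, off := t.alloc(size)
		fmt.Printf("size %d -> offset %2d (new block: %v)\n", size, off, fresh)
	}
}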

For small allocations, the mcache picks the mspan whose size class matches the object’s size. When the mcache’s span for that class runs out of room, it fetches a new mspan of the needed size class from mcentral’s mspan list.

mcentral

mcentral is the backing source of mspans for all mcaches. Each mcentral holds the global list of mspans of one particular size class, both in use and free.

Remember the 67 mspan size classes? There are as many mcentrals as there are size classes.

Each mcentral contains two mspan lists:

  • mspans with no free objects, or mspans already cached by an mcache (the empty mspan list)
  • mspans that still have free objects (the nonempty mspan list)

Because these mspans are global and are accessed by all mcaches, there is a concurrency issue, so mcentral carries a lock.

A single mcentral’s structure is as follows:

If mcentral has no free mspan to hand out, it obtains one from the mheap.

mheap

mheap can be thought of as the entire heap space held by the Go program. It is globally unique and can be regarded as a global variable. Its structure is as follows:

mheap contains everything discussed above except the mcache, which, as mentioned, lives on the P of the GMP scheduling model (for background on GMP, see my article mp.weixin.qq.com/s/74hbRTQ2T…). If you look closely, you will see that mheap also has a lock. What is this lock for?

We know that objects larger than 32 KB are defined as large objects and are allocated directly from the mheap. These requests come from code running on some P, and a running program usually has multiple Ps, so the requests arrive concurrently. To stay thread-safe, the mheap must have a global lock.

If the mheap runs out of memory, it asks the operating system for a new batch of pages (at least 1 MB at a time).

Summary of Go memory allocation process

There are three types of objects:

  • Tiny objects: size < 16 B
  • Small objects: 16 B <= size <= 32 KB
  • Large objects: size > 32 KB

There are three allocation paths:

  • Tiny allocations (size < 16 bytes, no pointers): micro allocations.
  • Small allocations (size <= 32 KB): normal allocations; first the size class to use is computed, then the block in the mcache corresponding to that class is used.
  • Large allocations (size > 32 KB): large-object allocations, made directly from the mheap. These come at the cost of a global lock, so only one P can be making such a request at any point in time. (A small illustration of the three paths follows.)
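As a hypothetical illustration (the thresholds simply follow the list above; the runtime does not expose which path it actually took), here is how typical allocations in user code would map onto these three paths:

package main

import "fmt"

func main() {
	// Tiny path: < 16 bytes and pointer-free, e.g. a small byte slice.
	tiny := make([]byte, 8)

	// Small path: between 16 bytes and 32 KB, served from the mcache's span
	// of the matching size class (50 bytes rounds up to class 5, 64 B objects).
	small := make([]byte, 50)

	// Large path: > 32 KB, allocated directly from the mheap under its lock.
	large := make([]byte, 64<<10)

	fmt.Println(len(tiny), len(small), len(large))
}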

Object allocation:

  • Objects in the range (size < 16 B) that contain no pointers: allocated by the tiny allocator on the mcache
  • Objects in the range (0 < size < 16 B) that contain pointers: normal allocation
  • Objects in the range (16 B <= size <= 32 KB): normal allocation
  • Objects in the range (size > 32 KB): large-object allocation

Allocation order:

  • First compute the size class to use.
  • Then allocate from the mcache block of that size class; if the mcache has no free block, it requests one from the corresponding mcentral.
  • If mcentral has no available block either, it requests one from the mheap, which searches for a best-fit mspan.
  • If the mspan obtained is larger than requested, it is split on demand: the number of pages the caller needs is returned, and the remaining pages form a new mspan that is put back onto the mheap’s free list. (A toy illustration of this split follows the list.)
  • If no span is available in the mheap, a new batch of pages (at least 1 MB) is requested from the operating system.
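The “split on demand” step can be illustrated with a toy model, where a span is just a range of pages. Nothing below is the runtime’s real data structure or search algorithm; it only shows the idea of carving off the needed pages and returning the remainder to the free list:

package main

import "fmt"

// span is a toy stand-in for mspan: just a start page and a page count.
type span struct {
	startPage, npages int
}

// splitSpan models the split step: carve `need` pages off the front of a
// large-enough free span and put the remainder back on the free list.
func splitSpan(free []span, need int) (got span, rest []span) {
	for i, s := range free {
		if s.npages >= need { // simple first-fit; the runtime is smarter
			got = span{startPage: s.startPage, npages: need}
			leftover := span{startPage: s.startPage + need, npages: s.npages - need}
			rest = append(append(rest, free[:i]...), free[i+1:]...)
			if leftover.npages > 0 {
				rest = append(rest, leftover) // remainder goes back to the free list
			}
			return got, rest
		}
	}
	// No span is big enough: here the runtime would grow the heap from the OS.
	return span{}, free
}

func main() {
	free := []span{{startPage: 0, npages: 16}}
	got, free := splitSpan(free, 4)
	fmt.Printf("got %+v, free list now %+v\n", got, free)
}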

Go’s memory management is very complicated, and it changes subtly from version to version. Here I have only covered the parts that are easy to grasp at a macro level. Comments and corrections are welcome.

Please follow my WeChat official account, Internet Technology Nest.

You can leave questions directly in the official account.

References:

  • The distribution of a program in memory: www.cnblogs.com/Lynn-Zhang/…
  • Starting from memory allocation: mp.weixin.qq.com/s/EyWKFRu1x…
  • Go memory allocator visual guide: www.linuxzen.com/go-memory-a…
  • juejin.im/post/684490…
  • Golang source exploration (3): the implementation principle of GC: www.cnblogs.com/zkweb/p/788…
  • A simple explanation of Go’s memory allocation principles: yq.aliyun.com/articles/65…
  • Go memory allocation: andrestc.com/post/go-mem…