1. TCMalloc
Go's memory management is based on the TCMalloc design, so it helps to understand how TCMalloc works before studying Go's memory management.
TCMalloc (Thread-Caching Malloc) is a memory allocator that manages memory at the thread level: each thread gets its own cache.
Advantages of TCMalloc:
1. It is fast.
2. It reduces lock contention. For small objects, a lock is needed only when the thread's own cache has no free block of the required size; for large objects, TCMalloc uses efficient spinlocks.
In summary: maximize memory utilization and minimize allocation time.
This image comes from: wallenwang.com/2018/11/tcm…
This article draws on that post, which is very detailed; if you are interested, it is well worth reading.
Once you understand this diagram, you essentially understand TCMalloc.
Explanation of the terms used in the diagram:
1. Page
A page is the basic unit of memory managed by TCMalloc. The default size is 8KB.
2. Span
One or more contiguous pages form a span, and TCMalloc requests memory in units of spans. Spans are managed by PageHeap. A span can be split into multiple objects of the same size for small-object allocation, or used as a whole for a large object.
When memory is allocated, spans are split; when memory is reclaimed, spans are merged.
3. Size class
Each size class corresponds to a different free block size, such as 8 bytes, 16 bytes, and so on. There are 85 classes (1B~256KB).
4. ThreadCache
A cache owned by a single thread, so allocating from it and releasing to it requires no lock. As the diagram shows, ThreadCache is organized by size class, with a separate free list for each class.
5. CentralCache
When ThreadCache has no free objects, it requests them from CentralCache. CentralCache is the cache shared by all threads, so requests to it must take a spinlock. As the diagram shows, CentralCache is also organized by size class.
6. PageHeap
PageHeap maintains the mapping between pages and spans. When CentralCache runs out of memory, it requests more from PageHeap. The basic unit of PageHeap is the span; CentralCache cuts each span into objects of the relevant size class.
PageHeap manages spans in two ways: spans of 128 pages or fewer are cached in linked lists; spans of more than 128 pages are stored in an ordered set.
7. VirtualMemory
Virtual memory: what the process requests from the operating system is virtual memory, not physical memory.
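To make the page-to-span mapping more concrete, here is a minimal Python sketch. The Span and PageMap classes and the dictionary lookup are assumptions made purely for illustration and are not TCMalloc's real data structures; only the 8KB page size comes from the text above.

```python
PAGE_SHIFT = 13              # 8KB pages: 2**13 = 8192 bytes
PAGE_SIZE = 1 << PAGE_SHIFT


class Span:
    """A run of contiguous pages (hypothetical minimal structure)."""

    def __init__(self, start_page, num_pages):
        self.start_page = start_page
        self.num_pages = num_pages


class PageMap:
    """Maps every page number to the span that owns it."""

    def __init__(self):
        self._page_to_span = {}

    def register(self, span):
        # Record the owning span for each page the span covers.
        for page in range(span.start_page, span.start_page + span.num_pages):
            self._page_to_span[page] = span

    def span_for_address(self, addr):
        page = addr >> PAGE_SHIFT          # address -> page number
        return self._page_to_span.get(page)


pagemap = PageMap()
span = Span(start_page=4, num_pages=2)     # covers pages 4 and 5
pagemap.register(span)
```

Any address inside pages 4 and 5 resolves to the same span, which is the lookup the reclaim steps later in this article rely on.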
2. Memory allocation and reclamation
This section is easier to follow alongside the first figure in this article.
1) Small object allocation and reclamation (small object size: (0, 256KB])
Allocate: When a thread requests memory, the requested size is mapped to the corresponding size class. The free list for that size class in ThreadCache is checked first (no lock required). If that free list has a free object, the object is returned and the allocation ends. If ThreadCache has no free object, it fetches objects of that size class from CentralCache. CentralCache is shared by all threads, so a spinlock is required. The fetched objects are placed in the ThreadCache free list, one object is returned, and the allocation ends. If CentralCache has no objects available either, it requests a span from PageHeap, cuts the span into objects of that size class, and places them in the CentralCache free list.
Reclaim: The page number is calculated from the address being freed, the span is found from the page number, and the size class is known from the span. If the memory held by the ThreadCache does not exceed its threshold (2MB), the object goes back into the ThreadCache free list; otherwise the garbage collection mechanism is triggered and unused objects are moved from ThreadCache to CentralCache.
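The small-object path described above can be sketched roughly as follows. This is a simplified model and not TCMalloc's implementation: every class and method name is hypothetical, addresses are plain integers, each span is a single page (so only size classes up to 8KB work here), BATCH is an invented refill count, and a threading.Lock stands in for the spinlock.

```python
import threading

PAGE_SIZE = 8192   # 8KB page, as in the text
BATCH = 4          # hypothetical number of objects moved per refill


class PageHeap:
    """Hands out one-page spans; a span is represented by its base address."""

    def __init__(self):
        self.next_addr = 0

    def new_span(self):
        base = self.next_addr
        self.next_addr += PAGE_SIZE
        return base


class CentralCache:
    """Shared by all threads: one free list per size class, behind a lock."""

    def __init__(self, page_heap):
        self.page_heap = page_heap
        self.free_lists = {}          # size class -> free object addresses
        self.page_to_class = {}       # page number -> size class of its span
        self.lock = threading.Lock()  # stands in for the spinlock

    def fetch(self, size_class, count):
        with self.lock:
            free = self.free_lists.setdefault(size_class, [])
            if not free:
                # Nothing cached: take a span from PageHeap and cut it
                # into objects of this size class.
                base = self.page_heap.new_span()
                self.page_to_class[base // PAGE_SIZE] = size_class
                last = base + PAGE_SIZE - size_class
                free.extend(range(base, last + 1, size_class))
            batch = free[:count]
            del free[:count]
            return batch

    def release(self, size_class, objects):
        with self.lock:
            self.free_lists.setdefault(size_class, []).extend(objects)


class ThreadCache:
    """Per-thread cache: its free lists are used without any lock."""

    def __init__(self, central, max_bytes=2 * 1024 * 1024):
        self.central = central
        self.max_bytes = max_bytes    # 2MB threshold from the text
        self.free_lists = {}
        self.cached_bytes = 0

    def allocate(self, size_class):
        free = self.free_lists.setdefault(size_class, [])
        if not free:
            # Refill from CentralCache, then serve from the local list.
            batch = self.central.fetch(size_class, BATCH)
            free.extend(batch)
            self.cached_bytes += size_class * len(batch)
        self.cached_bytes -= size_class
        return free.pop()

    def free(self, addr):
        # address -> page number -> span -> size class
        size_class = self.central.page_to_class[addr // PAGE_SIZE]
        free = self.free_lists.setdefault(size_class, [])
        free.append(addr)
        self.cached_bytes += size_class
        if self.cached_bytes > self.max_bytes:
            # Simplified "garbage collection": return this whole free
            # list to CentralCache.
            self.cached_bytes -= size_class * len(free)
            self.central.release(size_class, list(free))
            free.clear()
```

A real allocator chooses how many objects to move in each direction far more carefully; the sketch only shows the order in which the three layers are consulted.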
2) Medium object allocation and reclamation (medium object size: (256KB, 1MB])
Allocate: Suppose k pages are needed. The span lists in PageHeap are searched in order of page count for a non-empty list, and a span M of n pages is taken from it. M is then split into two parts: a span of k pages that satisfies the request, which is returned and ends the allocation, and the remaining n-k pages, which are placed in the span list for n-k pages. If the span lists contain no suitable free block, the request is handled in the same way as a large object allocation.
Reclaim: The page number is calculated from the address being freed, the span is found from the page number, the span's size is determined, and its pages are reclaimed.
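The span-list search and split for medium objects might look like the following sketch. The Span and SpanLists names are hypothetical, and the structure is deliberately minimal: one Python list per page count from 1 to 128.

```python
MAX_LIST_PAGES = 128   # span lists cover spans of 1..128 pages


class Span:
    """A run of contiguous pages (hypothetical minimal structure)."""

    def __init__(self, start_page, num_pages):
        self.start_page = start_page
        self.num_pages = num_pages


class SpanLists:
    """lists[n] holds the free spans that are exactly n pages long."""

    def __init__(self):
        self.lists = {n: [] for n in range(1, MAX_LIST_PAGES + 1)}

    def insert(self, span):
        self.lists[span.num_pages].append(span)

    def allocate(self, k):
        """Return a span of k pages, or None if no list can satisfy it."""
        for n in range(k, MAX_LIST_PAGES + 1):
            if self.lists[n]:
                span = self.lists[n].pop()
                if n > k:
                    # Split: the first k pages are returned and the
                    # remaining n-k pages go to the list for n-k pages.
                    self.insert(Span(span.start_page + k, n - k))
                    span.num_pages = k
                return span
        return None   # caller would fall back to the large-object path


span_lists = SpanLists()
span_lists.insert(Span(start_page=0, num_pages=50))
got = span_lists.allocate(32)   # 32 pages * 8KB = 256KB
```

After the call, the 50-page span has been split into a 32-page span that is returned and an 18-page remainder that sits in the list for 18-page spans.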
3) Large object allocation and reclamation (large object size: (1MB, +∞))
Allocate: Suppose k pages are needed. From the span set in PageHeap, the smallest span that is large enough (n pages) is selected and split into two parts: a span of k pages that satisfies the request, which is returned and ends the allocation, and the remaining n-k pages. If n-k > 128, the remainder is put back into the span set; otherwise it is placed in the span list for n-k pages.
Reclaim: The page number is calculated from the address being freed, the span is found from the page number, the span's size is determined, and its pages are reclaimed into the span list of that size. If there is no span list of that size, the span is stored in the span set.
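A rough sketch of the large-object path follows, assuming the ordered span set can be modeled as a sorted Python list searched with bisect. The LargeSpanHeap class is hypothetical, and spans are represented as plain (num_pages, start_page) tuples.

```python
import bisect

MAX_LIST_PAGES = 128   # spans above this size live in the ordered set


class LargeSpanHeap:
    """Models the PageHeap span set plus span lists (illustrative only)."""

    def __init__(self):
        self.span_set = []   # sorted list of (num_pages, start_page)
        self.span_lists = {n: [] for n in range(1, MAX_LIST_PAGES + 1)}

    def insert(self, start_page, num_pages):
        # Also serves as the reclaim step: the span goes to a span list
        # if one exists for its size, otherwise to the span set.
        if num_pages > MAX_LIST_PAGES:
            bisect.insort(self.span_set, (num_pages, start_page))
        else:
            self.span_lists[num_pages].append(start_page)

    def allocate_large(self, k):
        """Return (start_page, k) from the smallest span of >= k pages."""
        i = bisect.bisect_left(self.span_set, (k, -1))
        if i == len(self.span_set):
            return None   # a real allocator would ask the OS for memory
        n, start = self.span_set.pop(i)
        if n > k:
            # The remaining n-k pages go back to the set or to a list.
            self.insert(start + k, n - k)
        return (start, k)


heap = LargeSpanHeap()
heap.insert(start_page=0, num_pages=500)
heap.insert(start_page=1000, num_pages=200)
first = heap.allocate_large(150)    # taken from the 200-page span
second = heap.allocate_large(150)   # taken from the 500-page span
```

The first request leaves a 50-page remainder, which lands in a span list; the second leaves 350 pages, which is more than 128 and therefore returns to the span set.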
3. Memory fragmentation processing
Memory fragmentation is memory that cannot be handed out to applications again. It comes in two kinds: internal fragmentation and external fragmentation. Internal fragmentation arises when the allocator gives the program more memory than it requested; the unused excess inside the block is wasted. External fragmentation is free memory in blocks that are too small to satisfy a request.
How does TCMalloc deal with internal and external fragmentation?
Internal fragmentation:
TCMalloc predefines a range of size classes: 8, 16, 32, 48, 64, 80, 96, 112, 128, 144, 160, 176…
TCMalloc's goal is to keep memory fragmentation at or below 12.5%. Note that the sizes above are not powers of two, because fragmentation would be larger if they were. For example, for a 65-byte request, power-of-two sizing would allocate 128 bytes, while TCMalloc allocates only 80 bytes, which reduces fragmentation.
Up to 16 bytes, there is one size class every 8 bytes: 8, 16
From 16 to 128 bytes, there is one size class every 16 bytes: 32, 48, 64…
From 128 to 256 bytes, the size increases by x/8 each time, where x is the lower bound of the range: 128+128/8=144, and so on
Size classes greater than or equal to 1024 bytes are aligned to 128 bytes.
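The rounding rules above can be checked with a small Python sketch. It covers only requests of 1 to 256 bytes, the range for which this article spells out the step sizes. The function names are hypothetical, and the real allocator uses a precomputed lookup table and not this arithmetic.

```python
def size_class_for(size):
    """Round a request of 1..256 bytes up to its size class."""
    if not 1 <= size <= 256:
        raise ValueError("this sketch only covers 1..256 bytes")
    if size <= 16:
        step = 8           # 8, 16
    elif size <= 128:
        step = 16          # 32, 48, 64, ...
    else:
        step = 128 // 8    # 144, 160, 176, ...
    return (size + step - 1) // step * step


def next_power_of_two(size):
    """What a power-of-two allocator would hand out for this request."""
    p = 1
    while p < size:
        p *= 2
    return p


request = 65
tcmalloc_bytes = size_class_for(request)    # 80
pow2_bytes = next_power_of_two(request)     # 128
```

For the 65-byte request, the size-class scheme wastes 15 bytes where power-of-two sizing would waste 63.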
External fragmentation:
TCMalloc's CentralCache requests memory from PageHeap in units of pages. For the 1024-byte size class, one page gives 8192%1024=0, so there is no fragmentation. For the 1152-byte size class, 8192%1152=128, so 128 bytes of external fragmentation are left over. To keep the fragmentation rate at or below 12.5%, more pages can be requested for the span when a single page leaves too much unused. That is, combining adjacent pages reduces external fragmentation.
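As an illustration of "request more pages until the leftover is small enough", the sketch below grows a span one page at a time until the unused tail is at most 1/8 (12.5%) of the span. The function names are hypothetical, and the stopping rule is an assumption modeled on the 12.5% target stated in this article.

```python
PAGE_SIZE = 8192   # 8KB page


def leftover_bytes(size, pages):
    """Bytes left unused when a span of `pages` pages is cut into objects."""
    return (pages * PAGE_SIZE) % size


def pages_for_size_class(size):
    """Smallest page count whose leftover is at most 1/8 of the span."""
    pages = 1
    # Integer comparison: leftover / span_bytes > 1/8
    while leftover_bytes(size, pages) * 8 > pages * PAGE_SIZE:
        pages += 1
    return pages


one_kb = pages_for_size_class(1024)    # 8192 % 1024 == 0
odd = pages_for_size_class(1152)       # 128 bytes left, well under 12.5%
six_kb = pages_for_size_class(6144)    # one page would waste 2048 bytes
```

With a 6144-byte object, one page wastes 25% and two pages still waste 25%, but three pages (24576 bytes) hold exactly four objects with nothing left over.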
TCMalloc also considers merging size classes that are the same. Here, "the same" means that the same number of objects is allocated from a span; in that case the size that produces less fragmentation is used.
Now that we have covered how TCMalloc manages memory, the next article will look at how Go manages memory.
References
blog.itpub.net/15480802/vi…
blog.csdn.net/junlon2006/…
wallenwang.com/2018/11/tcm…
www.360doc.com/content/13/…
blog.csdn.net/kelvin_yin/…