Analysis of the Go memory allocation source code
After studying the previous two articles on Go memory management (TCMalloc principle: juejin.cn/post/691900… and Go memory management principle: juejin.cn/post/692048…), following the source code flow becomes much easier.
Let's start with mcache, mcentral and mheap, picking the most frequently used fields for analysis. Note: everything below is based on the Go 1.14 source code.
// mcache
type mcache struct {
	tiny             uintptr // start address of the current tiny block
	tinyoffset       uintptr // offset of the next free byte in the tiny block
	local_tinyallocs uintptr // number of tiny allocations performed
	alloc [numSpanClasses]*mspan // cached spans, one per span class (134 = 67 * 2)
	// stack cache free lists
	stackcache [_NumStackOrders]stackfreelist
	// Local allocator stats, flushed during GC.
	local_largefree  uintptr
	local_nlargefree uintptr
	local_nsmallfree [_NumSizeClasses]uintptr
	flushGen uint32
}
// mcentral
type mcentral struct {
	lock      mutex     // mutex protecting the span lists
	spanclass spanClass // span class served by this mcentral (67 size classes * 2)
	nonempty  mSpanList // list of spans with at least one free object
	empty     mSpanList // list of spans with no free objects
	nmalloc   uint64    // number of objects allocated from this mcentral
}
// mheap
type mheap struct {
	lock       mutex         // mutex protecting the heap
	pages      pageAlloc     // page allocation data structure
	sweepgen   uint32        // sweep generation (GC)
	sweepdone  uint32        // all spans are swept (GC)
	sweepers   uint32        // number of active sweep calls (GC)
	allspans   []*mspan      // all spans ever created
	sweepSpans [2]gcSweepBuf // swept and unswept in-use span buffers (GC)
	pagesInUse uint64        // pages of spans in stats mSpanInUse; updated atomically
	arenas [1 << arenaL1Bits]*[1 << arenaL2Bits]*heapArena
	// arenaHints is a list of addresses at which to attempt to
	// add more heap arenas. This is initially populated with a
	// set of general hint addresses, and grown with the bounds of
	// actual heap arena ranges.
	arenaHints *arenaHint
	// arena is a pre-reserved space for allocating heap arenas
	// (the actual arenas). This is only used on 32-bit.
	arena linearAlloc
	// allArenas is the arenaIndex of every mapped arena. This can
	// be used to iterate through the address space.
	//
	// Access is protected by mheap_.lock. However, since this is
	// append-only and old backing arrays are never freed, it is
	// safe to acquire mheap_.lock, copy the slice header, and
	// then release mheap_.lock.
	allArenas []arenaIdx
	// sweepArenas is a snapshot of allArenas taken at the
	// beginning of the sweep cycle. This can be read safely by
	// simply blocking GC (by disabling preemption).
	sweepArenas []arenaIdx
	// curArena is the arena that the heap is currently growing
	// into. This should always be physPageSize-aligned.
	curArena struct {
		base, end uintptr
	}
	// When an mcache does not have enough memory it allocates from mcentral;
	// the mcentrals themselves are managed by mheap.
	central [numSpanClasses]struct {
		mcentral mcentral
		pad      [cpu.CacheLinePadSize - unsafe.Sizeof(mcentral{})%cpu.CacheLinePadSize]byte
	}
}
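Both the alloc array in mcache and the central array in mheap are indexed by span class, which packs a size class together with a noscan flag, giving 67 * 2 = 134 entries. Below is a minimal, self-contained sketch of that encoding, assuming it mirrors makeSpanClass in the Go 1.14 runtime:

package main

import "fmt"

// spanClass packs the size class and the "noscan" flag (object contains no
// pointers) into one byte: low bit = noscan, remaining bits = size class.
type spanClass uint8

func makeSpanClass(sizeclass uint8, noscan bool) spanClass {
	sc := spanClass(sizeclass) << 1
	if noscan {
		sc |= 1
	}
	return sc
}

func (sc spanClass) sizeclass() uint8 { return uint8(sc >> 1) }
func (sc spanClass) noscan() bool     { return sc&1 != 0 }

func main() {
	// Size class 2 (16B objects) with noscan set gives span class 5,
	// the tinySpanClass that shows up in the tiny allocation path below.
	sc := makeSpanClass(2, true)
	fmt.Println(sc, sc.sizeclass(), sc.noscan()) // 5 2 true
}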
Next, we walk through the allocation process for three categories of objects: tiny objects, small objects and large objects.
0. Objects are allocated through the newobject() interface. Memory management covers both allocation and reclamation; this article focuses on the allocation source code.
func newobject(typ *_type) unsafe.Pointer {
	return mallocgc(typ.size, typ, true)
}

func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
	// Check whether the type contains pointers (noscan)
	...
	// size <= 32KB
	if size <= maxSmallSize {
		if noscan && size < maxTinySize {
			// Tiny object allocation (< 16B)
		} else {
			// Small object allocation
		}
	} else {
		// Large object allocation
	}
	...
	// GC bookkeeping
	return x
}
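Before walking through each branch, here is a small self-contained sketch of the thresholds mallocgc branches on, using the constants named in the snippet above (maxTinySize = 16B, maxSmallSize = 32KB; noscan means the type contains no pointers):

package main

import "fmt"

// allocPath classifies a request the same way mallocgc does.
func allocPath(size uintptr, noscan bool) string {
	const (
		maxTinySize  = 16
		maxSmallSize = 32 << 10
	)
	switch {
	case size <= maxSmallSize && noscan && size < maxTinySize:
		return "tiny allocator (packed into a 16B block)"
	case size <= maxSmallSize:
		return "small object (size class span from mcache/mcentral)"
	default:
		return "large object (pages directly from mheap)"
	}
}

func main() {
	fmt.Println(allocPath(8, true))       // tiny
	fmt.Println(allocPath(700, false))    // small
	fmt.Println(allocPath(64<<10, false)) // large
}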
1. Tiny object memory allocation (< 16B)
func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
	...
	// The tiny allocator is used when the object has no pointers and size < 16B
	if noscan && size < maxTinySize {
		// Tiny allocator.
		off := c.tinyoffset
		// Align the offset according to the requested size
		if size&7 == 0 { // multiples of 8 are aligned to 8
			off = alignUp(off, 8)
		} else if size&3 == 0 { // multiples of 4 are aligned to 4
			off = alignUp(off, 4)
		} else if size&1 == 0 { // multiples of 2 are aligned to 2
			off = alignUp(off, 2)
		}
		// If offset+size still fits in the 16B tiny block, allocate from it directly
		if off+size <= maxTinySize && c.tiny != 0 {
			// The object fits into existing tiny block.
			x = unsafe.Pointer(c.tiny + off)
			c.tinyoffset = off + size
			c.local_tinyallocs++ // number of tiny allocations +1
			mp.mallocing = 0
			releasem(mp)
			return x
		}
		// The current tiny block does not have enough space; take a span from mcache
		// tinySpanClass = 5 (size class 2 with the noscan bit set, binary 0101)
		span := c.alloc[tinySpanClass]
		// Try to get a free slot from the current span; returns 0 on failure
		v := nextFreeFast(span)
		if v == 0 {
			// No slot in allocCache; nextFree tries mcache again and, if needed,
			// refills a fresh span from mcentral to replace the exhausted one
			v, _, shouldhelpgc = c.nextFree(tinySpanClass) // explained below
		}
		x = unsafe.Pointer(v)
		(*[2]uint64)(x)[0] = 0
		(*[2]uint64)(x)[1] = 0
		// See if we need to replace the existing tiny block with the new one
		// based on amount of remaining free space.
		if size < c.tinyoffset || c.tiny == 0 {
			c.tiny = uintptr(x)
			c.tinyoffset = size
		}
		size = maxTinySize
	}
	...
}
When the local tiny block cannot satisfy the request, a free slot is requested from mcache through nextFree:
// Get the next free object slot from mcache
func (c *mcache) nextFree(spc spanClass) (v gclinkptr, s *mspan, shouldhelpgc bool) {
	s = c.alloc[spc]
	shouldhelpgc = false
	freeIndex := s.nextFreeIndex()
	if freeIndex == s.nelems {
		// The span is full.
		if uintptr(s.allocCount) != s.nelems {
			println("runtime: s.allocCount=", s.allocCount, "s.nelems=", s.nelems)
			throw("s.allocCount != s.nelems && freeIndex == s.nelems")
		}
		c.refill(spc) // here mcache requests a new span from mcentral
		shouldhelpgc = true
		s = c.alloc[spc]
		// After mcache has refilled from mcentral, look for a free slot again
		freeIndex = s.nextFreeIndex()
	}
	if freeIndex >= s.nelems {
		throw("freeIndex is not valid")
	}
	v = gclinkptr(freeIndex*s.elemsize + s.base())
	s.allocCount++
	if uintptr(s.allocCount) > s.nelems {
		println("s.allocCount=", s.allocCount, "s.nelems=", s.nelems)
		throw("s.allocCount > s.nelems")
	}
	return
}
To summarize: a tiny object (< 16B) is served from the tiny allocator's local 16B block if it has enough room; otherwise a span with span class 5 (size class 2) is taken from mcache, and if mcache has no usable span it refills one from mcentral.
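As a toy illustration of that packing behaviour (a simplified model of the tiny path above; the real allocator also decides whether to keep the old block or switch to the new one based on the remaining free space):

package main

import "fmt"

const maxTinySize = 16 // size of a tiny block

// alignUp rounds n up to a multiple of a (a must be a power of two).
func alignUp(n, a uintptr) uintptr { return (n + a - 1) &^ (a - 1) }

func main() {
	var off uintptr // tinyoffset: next free byte in the current block
	for _, size := range []uintptr{1, 2, 8, 4, 12} {
		// Same alignment rules as the tiny path in mallocgc.
		switch {
		case size&7 == 0:
			off = alignUp(off, 8)
		case size&3 == 0:
			off = alignUp(off, 4)
		case size&1 == 0:
			off = alignUp(off, 2)
		}
		if off+size > maxTinySize {
			// Simplification: always start a fresh 16B block when full.
			fmt.Println("block full, grabbing a new 16B block")
			off = 0
		}
		fmt.Printf("object of size %2d placed at offset %2d\n", size, off)
		off += size
	}
}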
2. Small object memory allocation (<= 32KB)
func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
	...
	// Memory allocation for small objects (<= 32KB)
	var sizeclass uint8
	// The size classes are looked up in two tables
	// 1) size <= 1024-8
	if size <= smallSizeMax-8 {
		// eg: size = 700: (size+smallSizeDiv-1)/smallSizeDiv = (700+8-1)/8 = 88,
		// size_to_class8[88] = 28, and class_to_size[28] rounds 700 up to 704
		sizeclass = size_to_class8[(size+smallSizeDiv-1)/smallSizeDiv]
	} else { // 2) size > 1024-8
		sizeclass = size_to_class128[(size-smallSizeMax+largeSizeDiv-1)/largeSizeDiv]
	}
	// class_to_size maps the size class back to the span's object size
	size = uintptr(class_to_size[sizeclass])
	spc := makeSpanClass(sizeclass, noscan)
	// find the cached span for this span class
	span := c.alloc[spc]
	// Try to grab a slot from the span's allocCache
	v := nextFreeFast(span) // explained below
	if v == 0 {
		v, span, shouldhelpgc = c.nextFree(spc)
	}
	x = unsafe.Pointer(v)
	if needzero && span.needzero != 0 {
		memclrNoHeapPointers(unsafe.Pointer(v), size)
	}
	...
}
nextFreeFast(span), already seen in the tiny object path, tries to grab a free slot straight from the span's allocCache bitmap; when the cached span has nothing left, nextFree (shown above) takes over and may refill a new span from mcentral.
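The source of nextFreeFast is not shown in this article, so here is a simplified, self-contained model of what it does (field names follow mspan; the refill-boundary check of the real function is omitted): allocCache is a 64-bit window over the span's allocation bitmap in which a 1 bit means the slot is free, so finding the next free object is a single count-trailing-zeros operation with no locking.

package main

import (
	"fmt"
	"math/bits"
)

// toySpan models the mspan fields nextFreeFast relies on.
type toySpan struct {
	base       uintptr // start address of the span
	elemsize   uintptr // object size for this span class
	nelems     uintptr // number of object slots in the span
	freeindex  uintptr // slots below freeindex are known to be taken
	allocCache uint64  // free/used bits starting at freeindex (1 = free)
	allocCount uintptr // number of allocated objects
}

// nextFreeFast returns the address of the next free slot, or 0 if the
// cached bitmap window has no free slot (the caller then uses nextFree).
func nextFreeFast(s *toySpan) uintptr {
	theBit := uintptr(bits.TrailingZeros64(s.allocCache)) // first free bit
	if theBit < 64 {
		result := s.freeindex + theBit
		if result < s.nelems {
			s.allocCache >>= theBit + 1 // consume bits up to and including the hit
			s.freeindex = result + 1
			s.allocCount++
			return s.base + result*s.elemsize
		}
	}
	return 0
}

func main() {
	s := &toySpan{base: 0x1000, elemsize: 16, nelems: 512, allocCache: ^uint64(0)}
	fmt.Printf("%#x\n", nextFreeFast(s)) // 0x1000
	fmt.Printf("%#x\n", nextFreeFast(s)) // 0x1010
}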
The refill interface, called from nextFree when the cached span is exhausted, swaps in a fresh span from mcentral:
func (c *mcache) refill(spc spanClass) {
	// Return the current cached span to the central lists.
	s := c.alloc[spc]
	if uintptr(s.allocCount) != s.nelems {
		throw("refill of span with free space remaining")
	}
	if s != &emptymspan {
		// Mark this span as no longer cached.
		if s.sweepgen != mheap_.sweepgen+3 {
			throw("bad sweepgen in refill")
		}
		atomic.Store(&s.sweepgen, mheap_.sweepgen)
	}
	// Get a new span from mcentral
	s = mheap_.central[spc].mcentral.cacheSpan() // explained below
	if s == nil {
		throw("out of memory")
	}
	if uintptr(s.allocCount) == s.nelems {
		throw("span has no free space")
	}
	// Indicate that this span is cached and prevent asynchronous
	// sweeping in the next sweep phase.
	s.sweepgen = mheap_.sweepgen + 3
	c.alloc[spc] = s
}
cacheSpan is the core interface for getting a span from mcentral:
func (c *mcentral) cacheSpan() *mspan {
	...
	// lock the mcentral
	lock(&c.lock)
	traceDone := false
	if trace.enabled {
		traceGCSweepStart()
	}
	sg := mheap_.sweepgen
retry:
	var s *mspan
	// Look for a usable span in the nonempty list
	for s = c.nonempty.first; s != nil; s = s.next {
		if s.sweepgen == sg-2 && atomic.Cas(&s.sweepgen, sg-2, sg-1) { // the span still needs sweeping
			c.nonempty.remove(s)  // remove the span from the nonempty list
			c.empty.insertBack(s) // add the span to the empty list
			unlock(&c.lock)
			s.sweep(true)
			goto havespan
		}
		if s.sweepgen == sg-1 {
			// the span is being swept by background sweeper, skip
			continue
		}
		// we have a nonempty span that does not require sweeping, allocate from it
		c.nonempty.remove(s)
		c.empty.insertBack(s)
		unlock(&c.lock)
		goto havespan
	}
	// Check the empty list as well: some spans may have been freed by GC but not swept yet
	for s = c.empty.first; s != nil; s = s.next {
		if s.sweepgen == sg-2 && atomic.Cas(&s.sweepgen, sg-2, sg-1) {
			// we have an empty span that requires sweeping,
			// sweep it and see if we can free some space in it
			c.empty.remove(s)
			// swept spans are at the end of the list
			c.empty.insertBack(s)
			unlock(&c.lock)
			s.sweep(true)
			freeIndex := s.nextFreeIndex()
			if freeIndex != s.nelems {
				s.freeindex = freeIndex
				goto havespan
			}
			lock(&c.lock)
			// the span is still empty after sweep
			// it is already in the empty list, so just retry
			goto retry
		}
		if s.sweepgen == sg-1 {
			// the span is being swept by background sweeper, skip
			continue
		}
		// already swept empty span,
		// all subsequent ones must also be either swept or in process of sweeping
		break
	}
	if trace.enabled {
		traceGCSweepDone()
		traceDone = true
	}
	unlock(&c.lock)
	...
}
When mcache has no usable span, it turns to mcentral: mcentral takes its lock, finds a usable span, removes it from the nonempty list, inserts it into the empty list, unlocks, and hands the span back to the requesting worker thread. (In the sweepgen checks above, sg-2 means the span still needs sweeping, sg-1 means the background sweeper is working on it, and sweepgen+3, which refill sets, marks a swept span that is currently cached by an mcache.) If mcentral itself does not have enough memory, it in turn requests more from mheap.
3. Large object memory allocation (> 32KB)
Large objects are allocated directly from mheap.
func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
	...
	var s *mspan
	shouldhelpgc = true
	systemstack(func() {
		// Allocate directly from mheap
		s = largeAlloc(size, needzero, noscan) // explained below
	})
	s.freeindex = 1
	s.allocCount = 1
	x = unsafe.Pointer(s.base())
	size = s.elemsize
	...
}
largeAlloc requests a span with size class 0:
// Large object allocation
func largeAlloc(size uintptr, needzero bool, noscan bool) *mspan {
	// Guard against overflow
	if size+_PageSize < size {
		throw("out of memory")
	}
	// Compute the number of pages the object needs; allocations above 32KB
	// always take a whole number of pages
	npages := size >> _PageShift
	if size&_PageMask != 0 {
		npages++
	}
	deductSweepCredit(npages*_PageSize, npages)
	// The actual allocation, using a span with sizeclass = 0
	s := mheap_.alloc(npages, makeSpanClass(0, noscan), needzero) // explained below
	if s == nil {
		throw("out of memory")
	}
	s.limit = s.base() + size
	// record the newly allocated span in the heap bitmap
	heapBitsForAddr(s.base()).initSpan(s)
	return s
}
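A quick worked example of the page-count rounding above, assuming 8KB pages (_PageShift = 13):

package main

import "fmt"

const (
	pageShift = 13
	pageSize  = 1 << pageShift
	pageMask  = pageSize - 1
)

// pagesFor mirrors the npages computation in largeAlloc.
func pagesFor(size uintptr) uintptr {
	npages := size >> pageShift
	if size&pageMask != 0 {
		npages++ // round up to a whole number of pages
	}
	return npages
}

func main() {
	fmt.Println(pagesFor(33 * 1024)) // 33KB -> 5 pages (40KB reserved)
	fmt.Println(pagesFor(64 * 1024)) // 64KB -> 8 pages (exact fit)
}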
largeAlloc in turn requests the span from mheap:
func (h *mheap) alloc(npages uintptr, spanclass spanClass, needzero bool) *mspan {
	var s *mspan
	systemstack(func() {
		// To prevent excessive heap growth, before allocating n pages
		// we need to sweep and reclaim at least n pages.
		if h.sweepdone == 0 {
			h.reclaim(npages)
		}
		// Get an mspan from the heap
		s = h.allocSpan(npages, false, spanclass, &memstats.heap_inuse) // explained below
	})
	if s != nil {
		if needzero && s.needzero != 0 {
			memclrNoHeapPointers(unsafe.Pointer(s.base()), s.npages<<_PageShift)
		}
		s.needzero = 0
	}
	return s
}

allocSpan first tries the per-P page cache and only then falls back to the page heap:
func (h *mheap) allocSpan(npages uintptr, manual bool, spanclass spanClass, sysStat *uint64) (s *mspan) {
	// Function-global state.
	gp := getg()                         // current g
	base, scav := uintptr(0), uintptr(0) // base address and amount of scavenged memory
	// If the allocation is small enough, try the page cache!
	pp := gp.m.p.ptr()
	// Use the per-P pageCache when npages < pageCachePages/4 = 16 pages (16 * 8KB = 128KB)
	// Note that the pages held in a pageCache are contiguous
	if pp != nil && npages < pageCachePages/4 {
		c := &pp.pcache
		// If the cache is empty, refill it.
		// If P's pageCache is empty, fill it from the page allocator (pageCache is explained below)
		if c.empty() {
			lock(&h.lock)
			// Fill the local pageCache
			*c = h.pages.allocToCache()
			unlock(&h.lock)
		}
		// Try to allocate from the cache.
		// Get the base address and scavenged size from P's page cache
		base, scav = c.alloc(npages) // explained below
		// The allocation from the cache succeeded
		if base != 0 {
			// try to grab an mspan object without taking the heap lock
			s = h.tryAllocMSpan()
			if s != nil && gcBlackenEnabled == 0 && (manual || spanclass.sizeclass() != 0) {
				goto HaveSpan
			}
		}
	}
	// For one reason or another, we couldn't get the
	// whole job done without the heap lock.
	lock(&h.lock)
	// P's page cache could not serve the request, so allocate from the page heap
	if base == 0 {
		// Try to acquire a base address.
		// mheap's global page allocator
		base, scav = h.pages.alloc(npages)
		if base == 0 {
			if !h.grow(npages) { // grow the heap: request more pages from the operating system
				unlock(&h.lock)
				return nil
			}
			base, scav = h.pages.alloc(npages) // retry after growing
			if base == 0 {
				throw("grew heap, but no adequate free space found")
			}
		}
	}
	...
}
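For reference, the threshold used in allocSpan works out as follows, assuming 8KB pages and pageCachePages = 64 (one bit per page in a single uint64):

package main

import "fmt"

func main() {
	const (
		pageSize       = 8 << 10 // 8KB pages
		pageCachePages = 64      // one uint64 bitmap entry per P
	)
	limit := pageCachePages / 4
	// Requests below 16 pages (128KB) are served from the per-P page cache.
	fmt.Println(limit, "pages =", limit*pageSize/1024, "KB") // 16 pages = 128 KB
}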
An allocation from the page cache must be satisfied with consecutive pages; if no such run exists, the cache returns 0 and the caller falls back to the page heap.
// Allocate from the pageCache
func (c *pageCache) alloc(npages uintptr) (uintptr, uintptr) {
	if c.cache == 0 {
		return 0, 0
	}
	// Fast path for a single page
	if npages == 1 {
		i := uintptr(sys.TrailingZeros64(c.cache))
		scav := (c.scav >> i) & 1
		c.cache &^= 1 << i // set bit to mark in-use
		c.scav &^= 1 << i  // clear bit to mark unscavenged
		return c.base + i*pageSize, uintptr(scav) * pageSize
	}
	// Allocate n consecutive pages
	return c.allocN(npages) // explained below
}
func (c *pageCache) allocN(npages uintptr) (uintptr, uintptr) {
	i := findBitRange64(c.cache, uint(npages))
	if i >= 64 {
		return 0, 0
	}
	mask := ((uint64(1) << npages) - 1) << i
	scav := sys.OnesCount64(c.scav & mask)
	c.cache &^= mask // clear the page bits to mark them in-use, e.g. 1110 &^ 0010 = 1100
	c.scav &^= mask  // clear scavenged bits
	return c.base + uintptr(i*pageSize), uintptr(scav) * pageSize
}
Here is how the pcache works. Go 1.14 manages memory pages with bitmaps, and each P holds its own pageCache. The pageCache is a bitmap in which a 1 bit means the page is still free and a 0 bit means it has been allocated.
type pageCache struct {
	base  uintptr // base address of the chunk covered by this cache
	cache uint64  // each bit records whether the page is free
	scav  uint64  // each bit records whether the page has been scavenged
}
One bit represents one page (8KB), so a pageCache covers at most 64 * 8KB = 512KB. If the number of pages requested is less than a quarter of that (128KB, i.e. 16 pages), the allocation goes through the cache.
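allocN above relies on findBitRange64 to locate a run of consecutive free pages in the bitmap. Here is a self-contained toy version with the same contract (assumed from its use above: return the index of the first run of n consecutive 1 bits, or a value of 64 or more if there is none):

package main

import "fmt"

// findBitRange64 returns the bit index of the first run of n consecutive
// 1 bits in c, or a value >= 64 if no such run exists.
func findBitRange64(c uint64, n uint) uint {
	run := uint(0)
	for i := uint(0); i < 64; i++ {
		if c&(1<<i) != 0 {
			run++
			if run == n {
				return i + 1 - n // start index of the run
			}
		} else {
			run = 0
		}
	}
	return 64
}

func main() {
	cache := uint64(0b111100)             // pages 2-5 are free
	fmt.Println(findBitRange64(cache, 3)) // 2: three contiguous free pages start at page 2
	fmt.Println(findBitRange64(cache, 5)) // 64: no run of 5 free pages
}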
If the page cache cannot satisfy the request, pages are taken from the heap's global page allocator, which grows by asking the operating system for more virtual memory when needed.
// Allocate pages from the page heap
func (s *pageAlloc) alloc(npages uintptr) (addr uintptr, scav uintptr) {
	// If the searchAddr refers to a region which has a higher address than
	// any known chunk, then we know we're out of memory.
	if chunkIndex(s.searchAddr) >= s.end {
		return 0, 0
	}
	// If npages has a chance of fitting in the chunk where the searchAddr is,
	// search it directly.
	searchAddr := uintptr(0)
	// If (pages per chunk - index of searchAddr within its chunk) >= npages,
	// the chunk containing searchAddr may still have room for this request
	if pallocChunkPages-chunkPageIndex(s.searchAddr) >= uint(npages) {
		// npages is guaranteed to be no greater than pallocChunkPages here.
		i := chunkIndex(s.searchAddr)
		if max := s.summary[len(s.summary)-1][i].max(); max >= uint(npages) {
			j, searchIdx := s.chunkOf(i).find(npages, chunkPageIndex(s.searchAddr))
			if j < 0 {
				print("runtime: max = ", max, ", npages = ", npages, "\n")
				print("runtime: searchIdx = ", chunkPageIndex(s.searchAddr), ", s.searchAddr = ", hex(s.searchAddr), "\n")
				throw("bad summary data")
			}
			addr = chunkBase(i) + uintptr(j)*pageSize
			searchAddr = chunkBase(i) + uintptr(searchIdx)*pageSize
			goto Found
		}
	}
	// We failed to use a searchAddr for one reason or another, so try
	// the slow path.
	addr, searchAddr = s.find(npages)
	if addr == 0 {
		if npages == 1 {
			// We failed to find a single free page, the smallest unit
			// of allocation. This means we know the heap is completely
			// exhausted. Otherwise, the heap still might have free
			// space in it, just not enough contiguous space to
			// accommodate npages.
			s.searchAddr = maxSearchAddr
		}
		return 0, 0
	}
	...
	return addr, scav
}
At this point, memory allocation for large objects is complete. Large objects are allocated directly from mheap, and if mheap does not have enough memory it requests more pages of virtual memory from the operating system. As the request travels further from the local caches, the cost of the allocation keeps going up.
From this walk through the source code, we can take away the following:
1. Compared with earlier versions, which managed free memory blocks with a tree structure, Go 1.14 manages pages with bitmaps, which is both more efficient and easier to reason about. The source is full of bit operations that achieve the same effect at very low cost; reading it, you can only admire the design (and if your bit-twiddling foundations are weak, you may need to brush up on them first).
2. Trading a little space for faster operations shows up everywhere, and it is a big source of the allocator's efficiency.
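As a final taste of those bit tricks: the &^ (AND NOT) operator that appears throughout the page allocator clears the selected bits in one step to mark pages in-use, and setting the bits again marks them free:

package main

import "fmt"

func main() {
	var cache uint64 = 0b1111 // four free pages (1 = free)
	mask := uint64(0b0110)    // allocate pages 1 and 2

	cache &^= mask              // clear their bits: mark them in-use
	fmt.Printf("%04b\n", cache) // 1001

	cache |= mask               // set the bits again: the pages are free once more
	fmt.Printf("%04b\n", cache) // 1111
}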
Next, we will learn how Go's GC works.