Original text: medium.com/a-journey-w…
This article is based on Go 1.12 and 1.13, and walks through the evolution of sync/pool.go between the two versions.
The sync package provides a powerful pool of reusable instances that reduces pressure on the garbage collector. Before using this package, you should benchmark your application with and without the pool: in some cases, if you don’t understand how the pool works internally, it can actually degrade performance.
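As a quick illustration of the API before we benchmark it, here is a minimal buffer-reuse sketch (the names `bufPool` and `render` are illustrative, not from the original article):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable *bytes.Buffer values; New is only
// called when the pool has nothing cached to hand back.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer bufPool.Put(buf) // return the buffer for reuse
	buf.Reset()            // it may still hold data from a previous use
	fmt.Fprintf(buf, "hello %s", name)
	return buf.String()
}

func main() {
	fmt.Println(render("pool")) // "hello pool"
}
```

Note the `Reset()` call: a pooled object arrives in whatever state its last user left it, so it must be cleaned before reuse.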
Limitations of pool
Let’s take a look at a basic example to see how it works in a fairly simple case (allocating a small struct 10K times per iteration):
type Small struct {
	a int
}

var pool = sync.Pool{
	New: func() interface{} { return new(Small) },
}

//go:noinline
func inc(s *Small) { s.a++ }

func BenchmarkWithoutPool(b *testing.B) {
	var s *Small
	for i := 0; i < b.N; i++ {
		for j := 0; j < 10000; j++ {
			s = &Small{a: 1}
			b.StopTimer()
			inc(s)
			b.StartTimer()
		}
	}
}

func BenchmarkWithPool(b *testing.B) {
	var s *Small
	for i := 0; i < b.N; i++ {
		for j := 0; j < 10000; j++ {
			s = pool.Get().(*Small)
			s.a = 1
			b.StopTimer()
			inc(s)
			b.StartTimer()
			pool.Put(s)
		}
	}
}
Here are the benchmark results with and without sync.Pool:
name           time/op      alloc/op     allocs/op
WithoutPool-8  3.02ms ± 1%  160kB ± 0%   10.0k ± 1%
WithPool-8     1.36ms ± 6%  1.05kB ± 0%  3.00 ± 0%
Since the inner loop runs 10K iterations, the benchmark without a pool makes 10K heap allocations, while the pooled version makes only 3. Those 3 allocations are made by the pool itself, yet only a single instance of the struct is ever allocated. So far, using a pool looks much friendlier in terms of memory handling and consumption.
However, in a real-world application your process will also be making many other heap allocations while it uses the pool, and when memory usage rises, garbage collection is triggered.
We can force garbage collection with runtime.GC() inside the benchmark to simulate this situation:
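The effect of a forced collection on a pool can be shown with a small sketch (the names here are illustrative; two runtime.GC() calls are used so the result holds on Go 1.13+ as well, where a single cycle only moves items into the victim cache described later, while on Go 1.12 one cycle already empties the pool):

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

var p = sync.Pool{New: func() interface{} { return "fresh" }}

// demo stores a value in the pool, forces two garbage collections,
// then asks the pool for a value back. Two full GC cycles are
// guaranteed to clear both the primary and the victim cache, so
// Get falls back to New.
func demo() string {
	p.Put("cached")
	runtime.GC()
	runtime.GC()
	return p.Get().(string)
}

func main() {
	fmt.Println(demo()) // "fresh": the cached value was reclaimed
}
```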
name           time/op     alloc/op     allocs/op
WithoutPool-8  993ms ± 1%  249kB ± 2%   10.9k ± 0%
WithPool-8     1.03s ± 4%  10.6MB ± 0%  31.0k ± 0%
We can now see that memory allocation is actually higher with the pool than without it. Let’s take a closer look at the package’s source code to understand why.
Internal workflow
A look at the sync/pool.go file reveals an init function that explains what we just saw:
func init() {
	runtime_registerPoolCleanup(poolCleanup)
}
This registers, at runtime, a method that cleans up the pools. The same method is triggered by the garbage collector, in the file runtime/mgc.go:
func gcStart(trigger gcTrigger) {
	[...]
	// clearpools before we start the GC
	clearpools()
This explains why performance degrades when garbage collection runs: pools are cleared on every garbage collection cycle. The documentation actually warns us:
Any item stored in the Pool may be removed automatically at any time without notification
Let’s walk through the workflow to understand how all of this is managed internally.
For each sync.Pool we create, Go generates an internal poolLocal attached to each processor (the P in the GMP model). Each poolLocal consists of two attributes, private and shared. The former is accessible only by its owner P (push and pop therefore require no locking), whereas shared can be read by any processor and must maintain concurrency safety itself. The pool is in fact not a simple per-goroutine cache; it can be used by any goroutine in our program.
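The Go 1.12 layout can be paraphrased as follows (field names follow the real sync/pool.go, but this is a simplified sketch declared outside the sync package for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// Paraphrase of Go 1.12's per-P pool slot: private needs no lock
// because only the owning P ever touches it; shared is a slice
// guarded by the embedded mutex because any P may take from it.
type poolLocalInternal struct {
	private interface{}
	shared  []interface{}
	sync.Mutex // protects shared
}

func main() {
	var l poolLocalInternal
	l.private = "owner-only"
	l.Lock()
	l.shared = append(l.shared, "stealable")
	l.Unlock()
	fmt.Println(l.private, len(l.shared))
}
```

It is this per-item mutex on shared that Go 1.13 eliminates.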
Go 1.13 improves access to shared, and also introduces a new cache that addresses the problem of pools being cleared on every garbage collection.
New lock-free pool and victim cache
Go 1.13 replaces the shared pool with a new doubly linked list, removing the lock and improving the efficiency of shared access. The main purpose of this change is to improve cache performance. Here is the flow of accessing shared:
In this new chained pool, each processor can push and pop at the head of its own list, while other processors access shared by popping from the tail. Each ring buffer doubles in size when it grows, and the buffers are connected through next/prev pointers. The initial ring holds 8 items, which means the second holds 16, the third 32, and so on. Locks are no longer needed; the operations rely on atomic instructions.
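The growth policy can be sketched as follows (a simplified, single-goroutine illustration; the real poolChain/poolDequeue in Go 1.13 uses atomic head/tail indices and lock-free push/pop, which are omitted here):

```go
package main

import "fmt"

// dequeue stands in for Go 1.13's poolDequeue: a ring buffer plus
// links to its neighbours in the chain.
type dequeue struct {
	buf        []interface{}
	head, tail int
	next, prev *dequeue
}

// chain stands in for poolChain: the producer pushes at the head
// ring; when it fills up, a ring twice as large is linked in.
type chain struct{ head *dequeue }

func (c *chain) push(v interface{}) {
	d := c.head
	if d == nil {
		d = &dequeue{buf: make([]interface{}, 8)} // first ring holds 8 items
		c.head = d
	}
	if d.tail-d.head == len(d.buf) { // ring is full: double the size
		nd := &dequeue{buf: make([]interface{}, 2*len(d.buf)), prev: d}
		d.next = nd
		c.head = nd
		d = nd
	}
	d.buf[d.tail%len(d.buf)] = v
	d.tail++
}

func main() {
	c := &chain{}
	for i := 0; i < 25; i++ {
		c.push(i)
	}
	fmt.Println(len(c.head.buf)) // 32: the head ring doubled twice (8 -> 16 -> 32)
}
```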
Regarding the new cache, the strategy is very simple: there are now two sets of pools, the active set (allPools) and the archived set (oldPools). When the garbage collector runs, it first drops each old pool’s victim cache, then moves each active pool’s primary cache into its victim cache before clearing it:
// Drop victim caches from all pools.
for _, p := range oldPools {
	p.victim = nil
	p.victimSize = 0
}

// Move primary cache to victim cache.
for _, p := range allPools {
	p.victim = p.local
	p.victimSize = p.localSize
	p.local = nil
	p.localSize = 0
}

// The pools with non-empty primary caches now have non-empty
// victim caches and no pools have primary caches.
oldPools, allPools = allPools, nil
With this strategy, thanks to the victim cache acting as a backup, the application gets one extra garbage collector cycle in which to reuse items before they are reclaimed. In the Get workflow, the victim cache is consulted last, after the shared pools.