Introduction
In programming, we often need to create and destroy objects of the same class frequently, and these operations can hurt performance. A common optimization in this situation is the object pool. When we need an object, we first look in the pool: if a free object exists, it is removed from the pool and handed to the caller. A new object is actually created only when the pool has no free objects. Conversely, objects are not destroyed after use; they are put back into the pool for later reuse. When objects are created and destroyed frequently, object pools can greatly improve performance. At the same time, to keep pooled objects from occupying too much memory, object pools typically also have a cleanup strategy. One example is sync.Pool in the Go standard library, whose objects are cleaned up by the garbage collector.
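As a concrete illustration, the Get/Put pattern with sync.Pool can be sketched like this (a minimal example of the standard library's pool, not bytebufferpool's code; the names bufPool and useBuffer are invented for this sketch):

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool hands out reusable byte slices. Get returns a pooled
// object if one is free; otherwise New creates a fresh one. Put
// returns the object for later reuse. Pooled objects may be
// reclaimed by the garbage collector at any time.
var bufPool = sync.Pool{
	New: func() interface{} { return make([]byte, 0, 64) },
}

func useBuffer() string {
	b := bufPool.Get().([]byte)
	b = append(b, "hello"...)
	s := string(b)
	bufPool.Put(b[:0]) // reset the length before returning to the pool
	return s
}

func main() {
	fmt.Println(useBuffer()) // prints "hello"
}
```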
A special class of objects is byte buffers (typically backed by a byte slice). When concatenating strings, we usually store intermediate results in a byte buffer for efficiency and generate the result string from the buffer once concatenation is complete. When sending and receiving network packets, incomplete packets also need to be held temporarily in byte buffers.
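For instance, concatenation through a byte buffer might look like this (a plain bytes.Buffer sketch; joinWords is a name invented for this example):

```go
package main

import (
	"bytes"
	"fmt"
)

// joinWords concatenates words through a bytes.Buffer so the
// intermediate result grows inside one reusable byte slice instead
// of allocating a new string for every "+" operation.
func joinWords(words []string) string {
	var buf bytes.Buffer
	for i, w := range words {
		if i > 0 {
			buf.WriteString(", ")
		}
		buf.WriteString(w)
	}
	return buf.String() // generate the final string from the buffer
}

func main() {
	fmt.Println(joinWords([]string{"hello", "world"})) // prints "hello, world"
}
```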
The bytes.Buffer type in the Go standard library wraps a byte slice and provides convenient interfaces on top of it. We know that a slice's capacity is limited, and the slice must grow when capacity runs out; frequent growth can cause performance jitter. bytebufferpool implements its own ByteBuffer type and uses a simple algorithm to reduce the performance cost of this growth. bytebufferpool is used in the well-known web framework fasthttp and the flexible templating library quicktemplate. In fact, all three libraries share the same author: valyala 😀.
Quick start
The code in this article uses Go Modules.
Create directory and initialize:
$ mkdir bytebufferpool && cd bytebufferpool
$ go mod init github.com/darjun/go-daily-lib/bytebufferpool
Install the bytebufferpool library:
$ go get -u github.com/valyala/bytebufferpool
The typical usage is to obtain a bytebufferpool.ByteBuffer object via the Get() function provided by bytebufferpool, call the object's methods to write data, and then return it to the object pool with bytebufferpool.Put(). Example:
package main

import (
    "fmt"

    "github.com/valyala/bytebufferpool"
)

func main() {
    b := bytebufferpool.Get()
    b.WriteString("hello")
    b.WriteByte(',')
    b.WriteString(" world!")

    fmt.Println(b.String())

    bytebufferpool.Put(b)
}
Calling the Get() and Put() functions of the bytebufferpool package directly operates on the default object pool defined in the package:
// bytebufferpool/pool.go
var defaultPool Pool

func Get() *ByteBuffer { return defaultPool.Get() }
func Put(b *ByteBuffer) { defaultPool.Put(b) }
Of course, we can create new object pools as needed, grouping objects with the same purpose together (for example, one pool to help receive network packets and another to help concatenate strings):
func main() {
    joinPool := new(bytebufferpool.Pool)
    b := joinPool.Get()
    b.WriteString("hello")
    b.WriteByte(',')
    b.WriteString(" world!")

    fmt.Println(b.String())

    joinPool.Put(b)
}
bytebufferpool does not provide a dedicated constructor; a Pool can simply be created with new (its zero value is ready to use).
Optimization details
When an object is put back into the pool, it is handled according to the capacity of its current slice. bytebufferpool divides sizes into 20 ranges:
| < 2^6 | 2^6 ~ 2^7-1 | ... | >= 2^25 |
If the capacity is less than 2^6, the object falls in the first range; if it is between 2^6 and 2^7-1, it falls in the second range; and so on. After a sufficient number of objects have been put back, bytebufferpool recalibrates: it determines which range holds the most objects and sets defaultSize to the maximum capacity of that range, which is 2^6 for the first range, 2^7 for the second, and 2^25 for the last. If there are no free objects in the pool when you request one with Get(), the newly created object's capacity is set to defaultSize. This avoids slice growth during use and thereby improves performance. The code below makes this concrete:
// bytebufferpool/pool.go
const (
    minBitSize = 6 // 2**6=64 is a CPU cache line size
    steps      = 20

    minSize = 1 << minBitSize
    maxSize = 1 << (minBitSize + steps - 1)

    calibrateCallsThreshold = 42000
    maxPercentile           = 0.95
)

type Pool struct {
    calls       [steps]uint64
    calibrating uint64

    defaultSize uint64
    maxSize     uint64

    pool sync.Pool
}
As you can see, bytebufferpool internally uses sync.Pool from the standard library.
There are 20 ranges (steps) in total. The calls array records how many times returned objects' sizes fell in each range.
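The mapping from a buffer length to one of the 20 ranges can be sketched like this (modeled on the library's unexported index function; rangeIndex is a name invented here, not a library export):

```go
package main

import "fmt"

const (
	minBitSize = 6  // ranges start at 2^6 = 64 bytes
	steps      = 20 // number of ranges
)

// rangeIndex returns which of the 20 ranges a buffer of length n
// falls into: index 0 covers lengths up to 2^6, index 1 covers
// 2^6+1 .. 2^7, and so on, with everything above the last bound
// clamped into the final range.
func rangeIndex(n int) int {
	n--
	n >>= minBitSize
	idx := 0
	for n > 0 {
		n >>= 1
		idx++
	}
	if idx >= steps {
		idx = steps - 1
	}
	return idx
}

func main() {
	fmt.Println(rangeIndex(64))      // prints 0: first range
	fmt.Println(rangeIndex(100))     // prints 1: 2^6 < 100 <= 2^7
	fmt.Println(rangeIndex(1 << 30)) // prints 19: clamped to the last range
}
```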
When we call Pool.Put() to put an object back, we first compute the range its slice length falls into and increment the corresponding element of the calls array:
// bytebufferpool/pool.go
func (p *Pool) Put(b *ByteBuffer) {
    idx := index(len(b.B))

    if atomic.AddUint64(&p.calls[idx], 1) > calibrateCallsThreshold {
        p.calibrate()
    }

    maxSize := int(atomic.LoadUint64(&p.maxSize))
    if maxSize == 0 || cap(b.B) <= maxSize {
        b.Reset()
        p.pool.Put(b)
    }
}
If an element of the calls array exceeds calibrateCallsThreshold = 42000, Pool.calibrate() is called to recalibrate:
// bytebufferpool/pool.go
func (p *Pool) calibrate() {
    // Avoid concurrent calibrations triggered by objects being put back
    if !atomic.CompareAndSwapUint64(&p.calibrating, 0, 1) {
        return
    }

    // step 1. Collect statistics and sort
    a := make(callSizes, 0, steps)
    var callsSum uint64
    for i := uint64(0); i < steps; i++ {
        calls := atomic.SwapUint64(&p.calls[i], 0)
        callsSum += calls
        a = append(a, callSize{
            calls: calls,
            size:  minSize << i,
        })
    }
    sort.Sort(a)

    // step 2. Calculate defaultSize and maxSize
    defaultSize := a[0].size
    maxSize := defaultSize

    maxSum := uint64(float64(callsSum) * maxPercentile)
    callsSum = 0
    for i := 0; i < steps; i++ {
        if callsSum > maxSum {
            break
        }
        callsSum += a[i].calls
        size := a[i].size
        if size > maxSize {
            maxSize = size
        }
    }

    // step 3. Save the corresponding values
    atomic.StoreUint64(&p.defaultSize, defaultSize)
    atomic.StoreUint64(&p.maxSize, maxSize)

    atomic.StoreUint64(&p.calibrating, 0)
}
Step 1. Collect statistics and sort
The calls array records how many times objects were put back into each range; the entries are then sorted by this count in descending order. Note: minSize << i is the capacity upper bound of range i.
Step 2. Calculate defaultSize and maxSize
defaultSize is easy to understand: it is simply the size of the first entry after sorting. maxSize records the largest capacity among the size classes that together account for more than 95% of put-backs; its purpose is to prevent rarely used, high-capacity objects from being put back into the pool and occupying too much memory. This logic appears in the second half of Pool.Put():
// If the capacity of the object being put back exceeds maxSize, it is not pooled
maxSize := int(atomic.LoadUint64(&p.maxSize))
if maxSize == 0 || cap(b.B) <= maxSize {
    b.Reset()
    p.pool.Put(b)
}
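A worked numeric example of step 2, with hypothetical names (classStat, pickSizes) and made-up call counts:

```go
package main

import (
	"fmt"
	"sort"
)

type classStat struct {
	calls uint64
	size  uint64
}

// pickSizes sketches calibrate's "step 2": stats must already be
// sorted by calls in descending order. defaultSize is the most
// popular size class; maxSize is the largest class among those that
// together cover 95% of all put-backs, so rare huge buffers are
// excluded. Illustrative code, not the library's.
func pickSizes(stats []classStat) (defaultSize, maxSize uint64) {
	var total uint64
	for _, s := range stats {
		total += s.calls
	}
	defaultSize = stats[0].size
	maxSize = defaultSize
	limit := uint64(float64(total) * 0.95)
	var sum uint64
	for _, s := range stats {
		if sum > limit {
			break
		}
		sum += s.calls
		if s.size > maxSize {
			maxSize = s.size
		}
	}
	return defaultSize, maxSize
}

func main() {
	stats := []classStat{
		{calls: 9000, size: 128},    // most buffers end up around 128 bytes
		{calls: 900, size: 256},     // some need 256
		{calls: 100, size: 1 << 20}, // a few rare 1 MiB outliers
	}
	sort.Slice(stats, func(i, j int) bool { return stats[i].calls > stats[j].calls })
	d, m := pickSizes(stats)
	// The 1 MiB outliers fall outside the 95th percentile, so they
	// will never be pooled.
	fmt.Println(d, m) // prints 128 256
}
```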
Step 3. Save the corresponding values
When you obtain an object with Pool.Get() and the pool has no free objects, the newly created object's default capacity is defaultSize. This capacity suffices in most cases, avoiding slice growth during use.
// bytebufferpool/pool.go
func (p *Pool) Get() *ByteBuffer {
    v := p.pool.Get()
    if v != nil {
        return v.(*ByteBuffer)
    }
    return &ByteBuffer{
        B: make([]byte, 0, atomic.LoadUint64(&p.defaultSize)),
    }
}
Some other details:
- The minimum size is 2^6 = 64, which is the CPU cache line size on typical 64-bit machines. Data of this size can be loaded into a cache line at once; anything smaller gains nothing.
- The code uses atomic operations in multiple places to avoid the performance cost of locking.
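The lock-free counting style looks like this (a minimal sketch, not the library's code; countConcurrently is a name invented here):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countConcurrently has n goroutines bump one shared counter with
// atomic.AddUint64. Unlike a mutex-protected counter, no goroutine
// ever blocks waiting for another, which is why the pool's calls
// array is updated this way.
func countConcurrently(n int) uint64 {
	var calls uint64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddUint64(&calls, 1) // lock-free increment
		}()
	}
	wg.Wait()
	return atomic.LoadUint64(&calls)
}

func main() {
	fmt.Println(countConcurrently(100)) // prints 100
}
```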
Of course, the library's drawback is also obvious: since most buffers use less capacity than defaultSize, some memory is wasted.
Conclusion
Excluding comments and blank lines, bytebufferpool implements a high-performance buffer object pool in about 150 lines of code, with details well worth savoring. Reading high-quality code like this helps improve your own coding skills and teaches you implementation details. I strongly recommend taking the time to read it!
If you find a fun and useful Go library, please submit an issue on GitHub 😄
References
- Bytebufferpool GitHub:github.com/valyala/byt…
- GitHub: github.com/darjun/go-d…
My blog: darjun.github.io
Welcome to follow my WeChat public account [GoUpUp]; let's learn and make progress together ~