Go caching scenarios typically use a map or a cache framework; for thread safety, either sync.Map or a thread-safe cache framework is used.
In caching scenarios, once the amount of data exceeds roughly the million level, the impact of the data types on GC needs special consideration (note that a string is a pointer plus a length, so string is a pointer type); this does not apply if both the cache key and value are non-pointer types. In practical applications, however, keys and values are usually pointer-typed data, so special attention should be paid to the cache framework's impact on GC. From the GC-impact perspective, cache frameworks fall roughly into two categories:
- Zero GC overhead: e.g. freecache or bigcache, which are based on a ringbuf and keep the number of pointers to a small fixed set;
- GC overhead: cache frameworks implemented directly on top of a map.
For a map, the GC scans all key/value pairs; but if both key and value are of basic (pointer-free) types, the GC skips scanning the buckets entirely.
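This effect can be observed directly. The sketch below (an illustrative benchmark, not from the original article) times a forced GC cycle over a pointer-heavy map versus a pointer-free map of the same size; absolute numbers vary by machine, only the gap matters.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// measureGC builds a map via build(), then times a forced GC cycle.
// The duration is dominated by how many pointers the GC must scan.
func measureGC(build func() any) time.Duration {
	m := build()
	runtime.GC() // settle outstanding garbage first
	start := time.Now()
	runtime.GC()
	d := time.Since(start)
	runtime.KeepAlive(m) // keep the map live through both GC cycles
	return d
}

func main() {
	const n = 1 << 20 // ~1M entries; adjust for your machine

	withPointers := measureGC(func() any {
		m := make(map[string]string, n)
		for i := 0; i < n; i++ {
			k := fmt.Sprintf("key-%d", i)
			m[k] = k // string headers contain pointers: GC scans every entry
		}
		return m
	})

	noPointers := measureGC(func() any {
		m := make(map[int64]int64, n)
		for i := int64(0); i < n; i++ {
			m[i] = i // no pointers: GC skips the buckets
		}
		return m
	})

	fmt.Printf("map[string]string GC: %v\n", withPointers)
	fmt.Printf("map[int64]int64  GC: %v\n", noPointers)
}
```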
The following uses freecache as an example to analyze its implementation. A usage example:
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"time"

	"github.com/coocood/freecache"
)

const N = 1000000 // entry count; N was left undefined in the original snippet

func main() {
	cacheSize := 100 * 1024 * 1024
	cache := freecache.NewCache(cacheSize)
	for i := 0; i < N; i++ {
		str := strconv.Itoa(i)
		_ = cache.Set([]byte(str), []byte(str), 1)
	}
	now := time.Now()
	runtime.GC()
	fmt.Printf("freecache, GC took: %s\n", time.Since(now))
	_, _ = cache.Get([]byte("aa"))
	now = time.Now()
	for i := 0; i < N; i++ {
		str := strconv.Itoa(i)
		_, _ = cache.Get([]byte(str))
	}
	fmt.Printf("freecache, Get took: %s\n\n", time.Since(now))
}
1 Initialization
freecache.NewCache(size) initializes the local cache, where size is the total storage space. freecache creates 256 segments, each with a ringbuf whose initial size is size/256. freecache's zero-GC claim comes from the fact that it holds only 512 fixed pointers: two per segment (rb and slotsData).
type segment struct {
	rb            RingBuf // ring buffer that stores data
	segId         int
	_             uint32 // padding
	missCount     int64
	hitCount      int64
	entryCount    int64
	totalCount    int64 // number of entries in ring buffer, including deleted entries.
	totalTime     int64 // used to calculate least recent used entry.
	timer         Timer // timer giving current time
	totalEvacuate int64 // used for debug
	totalExpired  int64 // used for debug
	overwrites    int64 // used for debug
	touched       int64 // used for debug
	vacuumLen     int64 // up to vacuumLen, new data can be written without overwriting old data.
	slotLens      [256]int32 // the actual length for every slot.
	slotCap       int32 // max number of entry pointers a slot can hold.
	slotsData     []entryPtr // index pointers
}

func NewCacheCustomTimer(size int, timer Timer) (cache *Cache) {
	cache = new(Cache)
	for i := 0; i < segmentCount; i++ {
		cache.segments[i] = newSegment(size/segmentCount, i, timer)
	}
	return
}

func newSegment(bufSize int, segId int, timer Timer) (seg segment) {
	seg.rb = NewRingBuf(bufSize, 0)
	seg.segId = segId
	seg.timer = timer
	seg.vacuumLen = int64(bufSize)
	seg.slotCap = 1
	seg.slotsData = make([]entryPtr, 256*seg.slotCap) // 256 slots, each starting with slotCap (1) units
	return
}
2 Read/Write Process
freecache's keys and values are both []byte slices; callers must serialize and deserialize values themselves. If you cache complex objects, this serialization cost cannot be ignored.
_ = cache.Set([]byte(str), []byte(str), 1)
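As a hedged sketch of what that serialization cost looks like in practice: the byteCache below is a toy stand-in (not freecache's API) so the example runs without the dependency; only the []byte-in/[]byte-out shape mirrors freecache, and the JSON round trip shows the encode/decode work each Set/Get pays.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User is a sample value type to be cached.
type User struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// byteCache is a toy stand-in for a []byte-in/[]byte-out cache.
type byteCache map[string][]byte

func (c byteCache) Set(key, value []byte) { c[string(key)] = value }
func (c byteCache) Get(key []byte) []byte { return c[string(key)] }

// roundTrip pays the serialization cost on Set and the deserialization
// cost on Get; for large or deeply nested values this can dominate.
func roundTrip(c byteCache, key []byte, u User) (User, error) {
	buf, err := json.Marshal(u) // serialize before storing
	if err != nil {
		return User{}, err
	}
	c.Set(key, buf)
	var out User
	err = json.Unmarshal(c.Get(key), &out) // deserialize after reading
	return out, err
}

func main() {
	c := byteCache{}
	u, _ := roundTrip(c, []byte("user:1"), User{ID: 1, Name: "gopher"})
	fmt.Println(u.Name) // prints "gopher"
}
```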
The Set flow first hashes the key. hashVal is a uint64: the low 8 bits (segID) select a segment, bits 8-15 (slotId) select a slot, and bits 16-31 (hash16) are stored in the []entryPtr index to speed up lookups within the slot. Note that each slot's []entryPtr array has size slotCap (initially 1); slotCap doubles on expansion.
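A minimal sketch of this bit-slicing: the helper below reproduces the segID/slotId/hash16 split. The FNV-1a hash is a stand-in for freecache's own hashFunc, and the 255 mask assumes segmentAndOpVal selects the low 8 bits (256 segments).

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// locate mirrors freecache's slicing of the 64-bit hash into routing bits.
func locate(key []byte) (segID uint64, slotId uint8, hash16 uint16) {
	h := fnv.New64a() // stand-in for freecache's hashFunc
	h.Write(key)
	hashVal := h.Sum64()
	segID = hashVal & 255          // low 8 bits -> one of 256 segments
	slotId = uint8(hashVal >> 8)   // bits 8-15 -> one of 256 slots
	hash16 = uint16(hashVal >> 16) // 16-bit fingerprint kept in the index
	return
}

func main() {
	segID, slotId, hash16 := locate([]byte("hello"))
	fmt.Println(segID, slotId, hash16)
}
```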
Each segment has its own lock (sync.Mutex), so freecache supports high concurrency, unlike sync.Map, which serializes writers on a single lock.
func (cache *Cache) Set(key, value []byte, expireSeconds int) (err error) {
	hashVal := hashFunc(key)
	segID := hashVal & segmentAndOpVal // low 8 bits
	cache.locks[segID].Lock()          // per-segment lock
	err = cache.segments[segID].set(key, value, hashVal, expireSeconds)
	cache.locks[segID].Unlock()
	return
}

func (seg *segment) set(key, value []byte, hashVal uint64, expireSeconds int) (err error) {
	slotId := uint8(hashVal >> 8)
	hash16 := uint16(hashVal >> 16)
	slot := seg.getSlot(slotId)
	idx, match := seg.lookup(slot, hash16, key)

	var hdrBuf [ENTRY_HDR_SIZE]byte
	hdr := (*entryHdr)(unsafe.Pointer(&hdrBuf[0]))
	if match { // existing entry: update
		matchedPtr := &slot[idx]
		seg.rb.ReadAt(hdrBuf[:], matchedPtr.offset)
		hdr.slotId = slotId
		hdr.hash16 = hash16
		hdr.keyLen = uint16(len(key))
		originAccessTime := hdr.accessTime
		hdr.accessTime = now
		hdr.expireAt = expireAt
		hdr.valLen = uint32(len(value))
		if hdr.valCap >= hdr.valLen {
			// the existing value space can hold the new value
			atomic.AddInt64(&seg.totalTime, int64(hdr.accessTime)-int64(originAccessTime))
			seg.rb.WriteAt(hdrBuf[:], matchedPtr.offset)
			seg.rb.WriteAt(value, matchedPtr.offset+ENTRY_HDR_SIZE+int64(hdr.keyLen))
			atomic.AddInt64(&seg.overwrites, 1)
			return
		}
		// delete the corresponding entryPtr (involves a slotsData memory copy);
		// the ringbuf entry is only marked as deleted
		seg.delEntryPtr(slotId, slot, idx)
		match = false
		// increase capacity and limit entry len.
		for hdr.valCap < hdr.valLen {
			hdr.valCap *= 2
		}
		if hdr.valCap > uint32(maxKeyValLen-len(key)) {
			hdr.valCap = uint32(maxKeyValLen - len(key))
		}
	} else { // no existing entry: insert
		hdr.slotId = slotId
		hdr.hash16 = hash16
		hdr.keyLen = uint16(len(key))
		hdr.accessTime = now
		hdr.expireAt = expireAt
		hdr.valLen = uint32(len(value))
		hdr.valCap = uint32(len(value))
		if hdr.valCap == 0 { // avoid infinite loop when increasing capacity.
			hdr.valCap = 1
		}
	}

	// the actual entry length is ENTRY_HDR_SIZE (24) + len(key) + valCap
	entryLen := ENTRY_HDR_SIZE + int64(len(key)) + int64(hdr.valCap)
	slotModified := seg.evacuate(entryLen, slotId, now)
	if slotModified {
		// the slot has been modified during evacuation, so 'idx' must be
		// looked up again; otherwise there would be an index out of bound error.
		slot = seg.getSlot(slotId)
		idx, match = seg.lookup(slot, hash16, key)
		// assert(match == false)
	}
	newOff := seg.rb.End()
	seg.insertEntryPtr(slotId, hash16, newOff, idx, hdr.keyLen)
	seg.rb.Write(hdrBuf[:])
	seg.rb.Write(key)
	seg.rb.Write(value)
	seg.rb.Skip(int64(hdr.valCap - hdr.valLen))
	atomic.AddInt64(&seg.totalTime, int64(now))
	atomic.AddInt64(&seg.totalCount, 1)
	seg.vacuumLen -= entryLen
	return
}
seg.evacuate checks whether the ringbuf has enough space for the key/value. If not, it scans from the oldest entry (oldOff := seg.rb.End() + seg.vacuumLen - seg.rb.Size()): if that entry has been logically deleted or has expired, its block is reclaimed directly. If the reclamation condition is not met, the entry is moved from the ring's head to its tail and the entry's index is updated; if more than five consecutive entries are moved this way without freeing space, the entry at the current oldOff is forcibly reclaimed to satisfy the memory requirement.

If the number of elements in a slot's []entryPtr exceeds seg.slotCap (initially 1), the slot must grow; see seg.expand for the corresponding method, which is not covered here.

Writing to the ringbuf is done via rb.Write.
func (seg *segment) evacuate(entryLen int64, slotId uint8, now uint32) (slotModified bool) {
	var oldHdrBuf [ENTRY_HDR_SIZE]byte
	consecutiveEvacuate := 0
	for seg.vacuumLen < entryLen {
		oldOff := seg.rb.End() + seg.vacuumLen - seg.rb.Size()
		seg.rb.ReadAt(oldHdrBuf[:], oldOff)
		oldHdr := (*entryHdr)(unsafe.Pointer(&oldHdrBuf[0]))
		oldEntryLen := ENTRY_HDR_SIZE + int64(oldHdr.keyLen) + int64(oldHdr.valCap)
		if oldHdr.deleted { // already marked deleted: reclaim directly
			consecutiveEvacuate = 0
			atomic.AddInt64(&seg.totalTime, -int64(oldHdr.accessTime))
			atomic.AddInt64(&seg.totalCount, -1)
			seg.vacuumLen += oldEntryLen
			continue
		}
		expired := oldHdr.expireAt != 0 && oldHdr.expireAt < now
		leastRecentUsed := int64(oldHdr.accessTime)*atomic.LoadInt64(&seg.totalCount) <= atomic.LoadInt64(&seg.totalTime)
		if expired || leastRecentUsed || consecutiveEvacuate > 5 {
			// the entry can be reclaimed
			seg.delEntryPtrByOffset(oldHdr.slotId, oldHdr.hash16, oldOff)
			if oldHdr.slotId == slotId {
				slotModified = true
			}
			consecutiveEvacuate = 0
			atomic.AddInt64(&seg.totalTime, -int64(oldHdr.accessTime))
			atomic.AddInt64(&seg.totalCount, -1)
			seg.vacuumLen += oldEntryLen
			if expired {
				atomic.AddInt64(&seg.totalExpired, 1)
			} else {
				atomic.AddInt64(&seg.totalEvacuate, 1)
			}
		} else {
			// evacuate an old entry that has been accessed recently for better cache hit rate.
			newOff := seg.rb.Evacuate(oldOff, int(oldEntryLen))
			seg.updateEntryPtr(oldHdr.slotId, oldHdr.hash16, oldOff, newOff)
			consecutiveEvacuate++
			atomic.AddInt64(&seg.totalEvacuate, 1)
		}
	}
	return
}
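The eviction test in the loop above can be distilled into a standalone predicate: an entry counts as least-recently-used when its accessTime is at or below the segment-wide average access time (totalTime / totalCount), with cross-multiplication avoiding the division. A minimal sketch:

```go
package main

import "fmt"

// leastRecentUsed mirrors freecache's approximate-LRU condition:
// accessTime <= totalTime/totalCount, rewritten without division as
// accessTime*totalCount <= totalTime.
func leastRecentUsed(accessTime uint32, totalCount, totalTime int64) bool {
	return int64(accessTime)*totalCount <= totalTime
}

func main() {
	// Average access time is 30/2 = 15: an entry touched at t=10 is
	// below average (evictable), one touched at t=20 is not.
	fmt.Println(leastRecentUsed(10, 2, 30)) // true
	fmt.Println(leastRecentUsed(20, 2, 30)) // false
}
```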
The freecache Get flow is comparatively simple: the hash selects the segment, slotId selects the index slot, and a binary search plus traversal finds the entry. If nothing is found, ErrNotFound is returned; otherwise the hit/miss and access-time metrics are updated.
func (cache *Cache) Get(key []byte) (value []byte, err error) {
	hashVal := hashFunc(key)
	segID := hashVal & segmentAndOpVal
	cache.locks[segID].Lock()
	value, _, err = cache.segments[segID].get(key, nil, hashVal, false)
	cache.locks[segID].Unlock()
	return
}

func (seg *segment) get(key, buf []byte, hashVal uint64, peek bool) (value []byte, expireAt uint32, err error) {
	hdr, ptr, err := seg.locate(key, hashVal, peek) // hash + lookup within the slot
	if err != nil {
		return
	}
	expireAt = hdr.expireAt
	if cap(buf) >= int(hdr.valLen) {
		value = buf[:hdr.valLen]
	} else {
		value = make([]byte, hdr.valLen)
	}
	seg.rb.ReadAt(value, ptr.offset+ENTRY_HDR_SIZE+int64(hdr.keyLen))
	return
}
After the data is located, it is read from the ringbuf. Unless the caller supplies a sufficiently large buf, the value is copied into newly allocated memory, so every Get involves a []byte copy.
3 Summary
Benchmarks of several common cache frameworks show that Set performance varies considerably (though not by an order of magnitude), while Get performance varies little. For most scenarios, then, Set/Get performance does not deserve excessive attention; the focus should be on whether the features meet business needs and on GC impact. A benchmark comparison is available at Golang2.eddycjy.com/posts/ch5/0… .
There is one special caching scenario: all data is cached in memory, and updates replace the entire data set in full. In this scenario, if size is not set properly, some data may be evicted during the replacement, which is not what you expect. To avoid this with freecache, set size "large enough", but be mindful of the memory footprint.
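As an illustration (not part of freecache), a common alternative pattern for this full-replacement scenario publishes an immutable snapshot via atomic.Pointer: readers never see a partially updated cache, and a full swap can never evict data mid-update.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// snapshot is an immutable view of the whole cached data set.
type snapshot map[string]string

var current atomic.Pointer[snapshot]

// replaceAll atomically swaps in a complete new data set.
func replaceAll(data snapshot) { current.Store(&data) }

// lookup reads from whichever snapshot is current; no locks needed
// because snapshots are never mutated after publication.
func lookup(key string) (string, bool) {
	s := current.Load()
	if s == nil {
		return "", false
	}
	v, ok := (*s)[key]
	return v, ok
}

func main() {
	replaceAll(snapshot{"a": "1", "b": "2"})
	v, ok := lookup("a")
	fmt.Println(v, ok)

	replaceAll(snapshot{"a": "9"}) // full swap: "b" is gone by design
	_, ok = lookup("b")
	fmt.Println(ok)
}
```

The trade-off versus freecache is memory: two full copies exist briefly during a swap, whereas freecache updates in place within its fixed-size ringbuf.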