This article is adapted and reorganized from the third go-zero session of "Go Open Source Talk". The original video is long and split into two parts; the content here has been trimmed and restructured.
Hi everyone, I'm glad to join Go Open Source Talk to share some of the stories, design ideas, and usage of an open source project. Today's project is go-zero, a web and RPC framework that integrates various engineering practices. I'm Kevin, the author of go-zero, and my GitHub ID is kevwan.
An overview of go-zero
Although go-zero was only open-sourced on August 7, 2020, it had already been battle-tested online at scale, and it embodies nearly 20 years of my engineering experience. After being open-sourced, it received positive feedback from the community: more than 6K stars in a little over 5 months, repeatedly topping GitHub's Go trending list for the day, week, and month, plus Gitee's Most Valuable Project (GVP) award and OSChina's Most Popular Project of the Year. The WeChat community is also very active, with a group of more than 3,000 go-zero enthusiasts sharing their experience with go-zero and discussing problems they run into.
How does Go-Zero automatically manage caches?
Cache Design Principles
We only delete caches, we never update them: once the data in the DB changes, we directly delete the corresponding cache instead of updating it.
Let’s take a look at the correct order to delete the cache.
- Delete the cache first, then update the DB
Consider two concurrent requests: request A needs to update the data and deletes the cache first; request B then reads the data, misses the cache, loads the data from the DB and writes it back to the cache; only after that does A update the DB. At this point the cache holds dirty data, and it stays dirty until the cache expires or the data is updated again. As shown in the figure.
- Update DB first, then delete cache
Request A updates the DB first, and then request B reads the data and gets the old value. At this point, request A's update can simply be considered not yet finished, and eventual consistency is acceptable.
Let’s look at the normal request flow again:
- The first request updates the DB and removes the cache
- The second request reads the cache, no data, reads the data from DB, and writes it back to the cache
- Subsequent read requests can be read directly from the cache
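The flow above can be sketched as a minimal cache-aside pattern. This is only an illustration, not go-zero's actual implementation: the two maps stand in for redis and the DB, and names like `updateUser`/`getUser` are made up for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	db          = map[int]string{1: "alice"} // stand-in for the DB
	cache       = map[int]string{}           // stand-in for redis
	errNotFound = errors.New("not found")
)

// updateUser writes the DB first, then deletes the cache (never updates it).
func updateUser(id int, name string) {
	db[id] = name     // 1. update the DB
	delete(cache, id) // 2. delete the cached row
}

// getUser reads the cache first; on a miss it loads the row from the DB
// and writes it back to the cache.
func getUser(id int) (string, error) {
	if name, ok := cache[id]; ok {
		return name, nil // cache hit
	}
	name, ok := db[id]
	if !ok {
		return "", errNotFound
	}
	cache[id] = name // write back so subsequent reads hit the cache
	return name, nil
}

func main() {
	name, _ := getUser(1) // miss: loads from DB, fills cache
	fmt.Println(name)
	updateUser(1, "bob") // update DB, invalidate cache
	name, _ = getUser(1) // miss again: reads the new value
	fmt.Println(name)
}
```

Note that deleting (rather than updating) the cache on write is what makes the two steps safe to reason about: the next read repopulates the cache from the DB.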
Let’s take a look at the DB query. Suppose the row contains ABCDEFG columns:
- Query only some of the columns, such as ABC, CDE, or EFG, as shown in the figure
- Query a single complete row, as shown in the figure
- Query some or all columns of multiple rows, as shown in the figure
For the above three cases: first, we do not cache partial-column queries, because once a partial result is cached and the data is later updated, it is impossible to tell which cached entries need to be deleted. Second, for multi-row queries, depending on the actual scenario and needs, we build the mapping from query conditions to primary keys in the business layer. For single-row full-record queries, go-zero has complete cache management built in. So the core principle is: go-zero only caches complete row records.
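The business-layer mapping for multi-row queries can be sketched as follows. This is an illustrative pattern, not go-zero code: a query condition maps to a list of primary keys, while each row itself is still cached and invalidated only by primary key. The type names and key format are made up for this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

type User struct {
	Id   int
	City string
}

// stand-in for the DB
var usersById = map[int]User{
	1: {1, "Shanghai"},
	2: {2, "Beijing"},
	3: {3, "Shanghai"},
}

// conditionCache maps a query condition ("city:Shanghai") to primary keys,
// so the full rows are still fetched through the per-row primary key cache.
var conditionCache = map[string][]int{}

func findUsersByCity(city string) []User {
	key := "city:" + city
	ids, ok := conditionCache[key]
	if !ok {
		// miss: scan the "DB" and cache the condition -> primary keys mapping
		for id, u := range usersById {
			if u.City == city {
				ids = append(ids, id)
			}
		}
		sort.Ints(ids)
		conditionCache[key] = ids
	}
	// fetch each complete row by primary key (in go-zero this step would
	// go through the primary key-based row cache)
	var out []User
	for _, id := range ids {
		out = append(out, usersById[id])
	}
	return out
}

func main() {
	fmt.Println(findUsersByCity("Shanghai"))
}
```

When a row changes, only its primary key cache and the affected condition keys need to be deleted, which is why this mapping is left to the business layer: only it knows which conditions a change invalidates.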
Let’s take a closer look at the cache handling for three scenarios built into Go-Zero:
- Primary key-based caching

```sql
PRIMARY KEY (`id`)
```

This kind of cache is relatively easy to handle: redis simply caches the row record with the primary key as the key.
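A minimal sketch of what that looks like: the complete row record is serialized (here to JSON) and stored under a key derived from the primary key. The key format and the in-memory store are assumptions for illustration, not go-zero's actual key scheme.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type User struct {
	Id   int64  `json:"id"`
	Name string `json:"name"`
}

// stand-in for redis
var redisLike = map[string]string{}

// primaryCacheKey builds the cache key from the primary key
// (the "cache:user:id:" prefix is made up for this sketch).
func primaryCacheKey(id int64) string {
	return fmt.Sprintf("cache:user:id:%d", id)
}

// cacheRow stores the complete row record as JSON under the primary key.
func cacheRow(u User) error {
	data, err := json.Marshal(u)
	if err != nil {
		return err
	}
	redisLike[primaryCacheKey(u.Id)] = string(data)
	return nil
}

// loadRow reads the row back from the cache; ok is false on a miss.
func loadRow(id int64) (User, bool) {
	var u User
	data, ok := redisLike[primaryCacheKey(id)]
	if !ok {
		return u, false
	}
	if err := json.Unmarshal([]byte(data), &u); err != nil {
		return u, false
	}
	return u, true
}

func main() {
	cacheRow(User{Id: 1, Name: "alice"})
	u, ok := loadRow(1)
	fmt.Println(u, ok)
}
```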
- Caching based on unique indexes

When designing index-based caching, I borrowed from the design of database indexes. When a database looks up data by index, the engine first finds the primary key in the index → primary key tree, and then fetches the row record by primary key; a layer of indirection resolves the index to the row record. go-zero's cache design follows the same principle.
Index-based caches are divided into single-column unique indexes and multi-column unique indexes:
- A single-column unique index looks like this:

```sql
UNIQUE KEY `product_idx` (`product`)
```
- A multi-column unique index looks like this:

```sql
UNIQUE KEY `vendor_product_idx` (`vendor`, `product`)
```
For go-zero, however, single-column and multi-column indexes differ only in how the cache key is generated; the control logic behind them is the same. On top of that, go-zero's built-in cache management gives better control over data consistency, and it has built-in protection against cache breakdown, penetration, and avalanche (these were discussed in detail in the GopherChina conference talk; see the GopherChina video).
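The "only the key generation differs" point can be sketched like this. The prefix and separator below are assumptions for illustration; go-zero's generated model code has its own key format.

```go
package main

import (
	"fmt"
	"strings"
)

// indexCacheKey builds a unique-index cache key: a single-column index
// passes one value, a multi-column index passes several, and everything
// else about the cache logic stays identical.
func indexCacheKey(table string, values ...interface{}) string {
	parts := make([]string, 0, len(values)+1)
	parts = append(parts, "cache:"+table)
	for _, v := range values {
		parts = append(parts, fmt.Sprint(v))
	}
	return strings.Join(parts, ":")
}

func main() {
	// single-column unique index: UNIQUE KEY (product)
	fmt.Println(indexCacheKey("goods", "apple"))
	// multi-column unique index: UNIQUE KEY (vendor, product)
	fmt.Println(indexCacheKey("goods", "acme", "apple"))
}
```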
In addition, Go-Zero has built-in cache traffic and access hit ratio statistics, as shown below:
```
dbcache(sqlc) - qpm: 5057, hit_ratio: 99.7%, hit: 5044, miss: 13, db_fails: 0
```
From such detailed statistics you can analyze cache usage; in cases where the hit ratio is very low or the request volume is very small, you can remove the cache, which also reduces cost.
Cache code interpretation
1. Cache logic based on primary keys
The implementation is as follows:

```go
func (cc CachedConn) QueryRow(v interface{}, key string, query QueryFn) error {
	return cc.cache.Take(v, key, func(v interface{}) error {
		return query(cc.db, v)
	})
}
```
The `QueryRow` logic: look the key up in the cache first and return the value on a hit; on a miss, load the complete row from the DB through `query`, write it back to the cache, and return it. The whole logic is fairly straightforward.
Let’s look at the implementation of Take in detail:
```go
func (c cacheNode) Take(v interface{}, key string, query func(v interface{}) error) error {
	return c.doTake(v, key, query, func(v interface{}) error {
		return c.SetCache(key, v)
	})
}
```
The logic of `Take` is:

- Look up the data in the cache with `key`
- If found, return the data
- If not found, read the data with the `query` method
- After reading, call `c.SetCache(key, v)` to set the cache
The `doTake` code, with explanations, is as follows:

```go
// v - the data object to read into
// key - the cache key
// query - the method used to read the complete data from the DB
// cacheVal - the method used to write data to the cache
func (c cacheNode) doTake(v interface{}, key string, query func(v interface{}) error,
	cacheVal func(v interface{}) error) error {
	// barrier is used to prevent cache breakdown: for a given key,
	// only one request actually loads the data
	val, fresh, err := c.barrier.DoEx(key, func() (interface{}, error) {
		// read the data from the cache
		if err := c.doGetCache(key, v); err != nil {
			// if a placeholder was previously set (to prevent cache
			// penetration), return the pre-defined errNotFound
			if err == errPlaceholder {
				return nil, c.errNotFound
			} else if err != c.errNotFound {
				// why we just return the error instead of querying the db:
				// on an unknown cache error we cannot send all requests
				// straight to the DB, which would break it under high
				// concurrency. fail fast, in case we bring down the dbs.
				return nil, err
			}
			// query the DB
			// if it returns errNotFound, set a placeholder in the cache
			// to prevent cache penetration
			if err = query(v); err == c.errNotFound {
				if err = c.setCacheWithNotFound(key); err != nil {
					logx.Error(err)
				}
				return nil, c.errNotFound
			} else if err != nil {
				// collect database failure statistics
				c.stat.IncrementDbFails()
				return nil, err
			}
			// write the data to the cache
			if err = cacheVal(v); err != nil {
				logx.Error(err)
			}
		}
		// return the json-serialized data
		return jsonx.Marshal(v)
	})
	if err != nil {
		return err
	}
	if fresh {
		return nil
	}
	// got the result from a previous in-flight query
	c.stat.IncrementTotal()
	c.stat.IncrementHit()
	// unmarshal the data into the caller's v object
	return jsonx.Unmarshal(val.([]byte), v)
}
```
2. Cache logic based on unique indexes
Because this part is a bit more complicated, I color-coded the code blocks and their corresponding logic (in the video). Block 2 is the same as primary key-based caching, so here I will focus on the logic of block 1.
Block 1 of the code covers two cases:

- The primary key can be found in the cache via the index
  - Use that primary key to run block 2's logic, i.e. the primary key-based caching flow above
- The primary key cannot be found in the cache via the index
  - Query the complete row record from the DB by index; on error, return it
  - Once the complete row record is found, write both the primary key → complete record cache and the index → primary key cache to redis
  - Return the desired row record data
```go
// v - the data object to read into
// key - the cache key generated from the index
// keyer - generates the primary key-based cache key from a primary key
// indexQuery - reads the complete data from the DB by index; returns the primary key
// primaryQuery - reads the complete data from the DB by primary key
func (cc CachedConn) QueryRowIndex(v interface{}, key string, keyer func(primary interface{}) string,
	indexQuery IndexQueryFn, primaryQuery PrimaryQueryFn) error {
	var primaryKey interface{}
	var found bool
	// look up the index -> primary key mapping in the cache
	if err := cc.cache.TakeWithExpire(&primaryKey, key, func(val interface{}, expire time.Duration) (err error) {
		// no index -> primary key cache, so query the complete data by index
		primaryKey, err = indexQuery(cc.db, v)
		if err != nil {
			return
		}
		// mark found as true so there is no need to read from the cache afterwards
		found = true
		// save the primary key -> complete data mapping in the cache;
		// TakeWithExpire already saves the index -> primary key mapping
		return cc.cache.SetCacheWithExpire(keyer(primaryKey), v, expire+cacheSafeGapBetweenIndexAndPrimary)
	}); err != nil {
		return err
	}
	// the data was already loaded via the index query
	if found {
		return nil
	}
	// read the data from the cache by primary key; on a miss, read it from
	// the DB with primaryQuery, write it back to the cache, and return it
	return cc.cache.Take(v, keyer(primaryKey), func(v interface{}) error {
		return primaryQuery(cc.db, v, primaryKey)
	})
}
```
Let's look at a practical example:
```go
func (m *defaultUserModel) FindOneByUser(user string) (*User, error) {
	var resp User
	// generate the index-based cache key
	indexKey := fmt.Sprintf("%s%v", cacheUserPrefix, user)
	err := m.QueryRowIndex(&resp, indexKey,
		// generate the complete-data cache key from the primary key
		func(primary interface{}) string {
			return fmt.Sprintf("user#%v", primary)
		},
		// index-based DB query method
		func(conn sqlx.SqlConn, v interface{}) (interface{}, error) {
			query := fmt.Sprintf("select %s from %s where user = ? limit 1", userRows, m.table)
			if err := conn.QueryRow(&resp, query, user); err != nil {
				return nil, err
			}
			return resp.Id, nil
		},
		// primary key-based DB query method
		func(conn sqlx.SqlConn, v, primary interface{}) error {
			query := fmt.Sprintf("select %s from %s where id = ?", userRows, m.table)
			return conn.QueryRow(&resp, query, primary)
		})
	// convert sqlc.ErrNotFound into the ErrNotFound defined in this package,
	// so callers don't perceive whether a cache is used, and the underlying
	// dependency stays isolated
	switch err {
	case nil:
		return &resp, nil
	case sqlc.ErrNotFound:
		return nil, ErrNotFound
	default:
		return nil, err
	}
}
```
All of the cache management code above can be generated automatically by goctl. Internally, our team's basic CRUD and cache code is all generated by goctl, which saves a lot of development time. Cache code is also very easy to get wrong: even with solid coding experience, it's hard to write it correctly every single time. So we recommend using the automatic cache code generation tool wherever possible to avoid mistakes.
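As a rough illustration, model code with caching can be generated from a DDL file with goctl; the exact flags may vary between goctl versions, so check `goctl model mysql --help` for your installation:

```bash
# generate model code from user.sql into ./model; -c includes the cache code
goctl model mysql ddl -src user.sql -dir ./model -c
```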
Need more?
If you want to get a better feel for the Go-Zero project, head over to the official website to learn about specific examples.
Video Playback Address
www.bilibili.com/video/BV1Jy…
The project address
Github.com/tal-tech/go…
You are welcome to use go-zero and give it a star to support us!