Hello, I'm "Looking at the Mountains".
This article is a translation of Jakob Jenkov's article on caching techniques from his Software Architecture series. Although it was written in 2014, it has not gone out of date as far as software architecture is concerned.
Caching
Caching is a technique for speeding up data lookups (data reads). Instead of reading data directly from its source, such as a database or another remote system, the data is read from a local cache.
A cache is storage that sits closer to the user than the source data, so read operations are faster. The cache medium is usually memory or disk; in most cases it is memory, which means the cached data is lost when the system restarts.
In a software system, caches often form a multi-level hierarchy. In a web application, there are at least three places where data can be cached, as shown in the following figure:
In web applications, data can be cached at each of these levels. A database can keep data in memory so that it can be read directly instead of from disk. A web server can cache images, CSS files, JS files, and so on in memory rather than fetching them from the hard disk every time they are needed. The web application itself can cache data read from the database so that it does not have to go over the network to the database every time the data is used. Finally, the browser can also cache static files and data; HTML5 browsers support this with localStorage, the application cache, local SQL storage, and other technologies.
When it comes to caching, there are a few things to consider:
• Writing data to the cache
• Keeping the cache in sync with the remote system
• Managing cache size
I'll discuss each of these in the following sections.
Writing data to the cache
The first challenge is reading data from the remote system and writing it to the cache. There are generally two approaches:
• Write-ahead caching
• Write-on-use caching
With write-ahead caching, data is written to the cache when the system starts up. To do this, you need to know in advance which data should be cached, but sometimes that is not known at startup.
With write-on-use caching, data is cached the first time it is used, so later reads can be served from the cache. The flow is: first check whether the data is in the cache; if it is, use it; if not, read it from the remote system and write it to the cache.
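To make this flow concrete, here is a minimal Java sketch of write-on-use caching (the `RemoteSystem` interface and its `read` method are hypothetical stand-ins for the database or remote system, not something from the original article):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Write-on-use (lazy) caching: data is cached the first time it is read. */
public class LazyCache<K, V> {

    /** Hypothetical stand-in for the database or other remote system. */
    public interface RemoteSystem<K, V> {
        V read(K key);
    }

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final RemoteSystem<K, V> remote;

    public LazyCache(RemoteSystem<K, V> remote) {
        this.remote = remote;
    }

    /** Check the cache first; on a miss, read from the remote system and cache it. */
    public V get(K key) {
        return cache.computeIfAbsent(key, remote::read);
    }
}
```

As a side benefit, `computeIfAbsent` on a `ConcurrentHashMap` makes the check-then-load step atomic per key, so two threads will not load the same value twice.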
The following table lists the advantages and disadvantages of write-ahead versus write-on-use caching:
| | Advantages | Disadvantages |
| --- | --- | --- |
| Write-ahead caching | Data is cached before it is first requested, reducing read latency | Initializing the cache at startup can take a long time, and some cached data may never be used |
| Write-on-use caching | Only data that is actually needed is cached, and there is no startup delay | The first use of each piece of data takes longer, which can make the user experience inconsistent |
Of course, in practice the two approaches can be combined: write-ahead caching for hot data, and write-on-use caching for everything else.
Keeping the cache in sync with the remote system
One of the big challenges of caching is keeping the cached data in sync with the remote system's data, that is, keeping the data consistent. There are several ways to do this, depending on the architecture of the system, and I'll cover them below.
Write-through caching
Write-through caching allows both reads and writes to go through the cache: the computer holding the cached data writes each value to the remote system at the same time as it writes it to the cache. Simply put, every write is written through to the remote system.
This only works if the remote system's data can be modified exclusively through the write-through cache. If all data reads and writes go through the write-through caching system, it is easy to propagate written data to the remote system and keep the cache consistent with it.
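A minimal sketch of this idea, assuming every write in the system really does go through this class (the `RemoteSystem` interface is again a hypothetical stand-in):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Write-through caching: every write goes to the cache and the remote system. */
public class WriteThroughCache<K, V> {

    /** Hypothetical stand-in for the database or other remote system. */
    public interface RemoteSystem<K, V> {
        V read(K key);
        void write(K key, V value);
    }

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final RemoteSystem<K, V> remote;

    public WriteThroughCache(RemoteSystem<K, V> remote) {
        this.remote = remote;
    }

    public void put(K key, V value) {
        remote.write(key, value); // write to the remote system...
        cache.put(key, value);    // ...and keep the cache in step with it
    }

    public V get(K key) {
        return cache.computeIfAbsent(key, remote::read);
    }
}
```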
Time-based expiration
If the remote system's data can be updated without going through the cache, write-through caching cannot keep the cache and the remote system in sync.
One way to keep cached data reasonably fresh is to give it an expiration time. When data expires, it is purged from the cache; the next time it is needed, the latest version is read from the remote system and cached again.
The right expiration time depends on the system's requirements. Some types of data (such as articles) may not need to be completely up to date at all times; an expiration time of one hour may be acceptable, and for some articles even 24 hours of staleness can be tolerated.
It is important to note that if the expiration time is short, the remote system may be read frequently, reducing the usefulness of the cache.
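A sketch of time-based expiration, where each entry remembers when it was cached and expired entries are purged on read (the purge-on-read design and the TTL parameter are my choices for illustration, not prescribed by the article):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Time-based expiration: entries older than the TTL are purged on read. */
public class ExpiringCache<K, V> {

    private static class Entry<V> {
        final V value;
        final long cachedAt = System.currentTimeMillis();
        Entry(V value) { this.value = value; }
    }

    private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public ExpiringCache(long ttlMillis) {
        this.ttlMillis = ttlMillis; // e.g. 3_600_000 for one hour
    }

    /** Returns the cached value, or null if absent or expired. */
    public V get(K key) {
        Entry<V> entry = cache.get(key);
        if (entry == null) {
            return null;
        }
        if (System.currentTimeMillis() - entry.cachedAt > ttlMillis) {
            cache.remove(key); // expired: the caller re-reads from the remote system
            return null;
        }
        return entry.value;
    }

    public void put(K key, V value) {
        cache.put(key, new Entry<>(value));
    }
}
```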
Active expiration
There is also active expiration, in which cached data is expired (or updated) proactively. For example, when the remote system updates some data, it sends a message to the caching system indicating that the data has been updated, and the corresponding cache entry can be expired.
The advantage of active expiration is that cached data is refreshed as soon as possible after the remote data changes. An added benefit over time-based expiration is that data which has not been modified is never refreshed unnecessarily.
The downside of active expiration is that you must be able to detect changes to the remote system's data. If the remote system is a relational database whose data can be updated through different mechanisms, each of those mechanisms must report which data it has updated; otherwise there is no way to send expiration messages to the system caching the data.
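One possible shape for this, sketched below: the caching system exposes a callback that each update mechanism (or a message queue they all publish to) invokes after a change. The `onRemoteUpdate` method name is hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Active expiration: the remote system pushes invalidation messages. */
public class InvalidatingCache<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();

    public V get(K key) {
        return cache.get(key);
    }

    public void put(K key, V value) {
        cache.put(key, value);
    }

    /**
     * Called when a "data updated" message arrives, for example from a
     * message queue that every update mechanism publishes to. The stale
     * entry is dropped and re-read from the remote system on next access.
     */
    public void onRemoteUpdate(K key) {
        cache.remove(key);
    }
}
```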
Managing cache size
Managing cache size is an important aspect of caching. Many systems store so much data that it is impossible to keep all of it in the cache, so a mechanism is needed to limit the amount of cached data, usually by clearing out unwanted entries to make enough space. Common approaches include:
• Time-based cleanup
• First in, first out (FIFO)
• First in, last out (FILO)
• Least used
• Minimum access interval
The time-based cleanup approach is similar to the time-based expiration described earlier. Besides keeping data in sync with the remote system, it also limits the amount of cached data. Cleanup can run in a separate background thread, or it can happen while reading or writing new values.
First-in-first-out cleanup means that when a new value is written to a full cache, the earliest-inserted value is deleted to make room. If there is enough space, nothing needs to be deleted.
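In Java, FIFO eviction can be expressed with a `LinkedHashMap` kept in insertion order, evicting the eldest entry once a size limit is exceeded (the limit is whatever fits your system):

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** FIFO eviction: once full, the earliest-inserted entry is removed. */
public class FifoCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    public FifoCache(int maxEntries) {
        super(16, 0.75f, false); // false = iteration in insertion order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict only when over capacity; with enough space, nothing is deleted.
        return size() > maxEntries;
    }
}
```

Passing `true` instead of `false` to the constructor switches the map to access order, which turns the same structure into a least-recently-used cache.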
FILO is the reverse of FIFO, and is useful when the data stored first is the hot data.
Least-used cleanup evicts the cached data that has been accessed the fewest times first. The goal is to avoid evicting hot data. To implement it, you need to record how many times each cached value is accessed. One issue to be aware of is that old values in the cache may have accumulated a high access count, which means they will never be cleaned up: an old article that used to be read heavily will survive on its historical access count even though its current traffic is low. To avoid this, count accesses only over the last N hours.
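A sketch of least-used cleanup with a plain access counter per entry; the N-hour windowing suggested above is left out to keep it short:

```java
import java.util.HashMap;
import java.util.Map;

/** Least-used eviction: the entry with the fewest accesses is evicted first. */
public class LeastUsedCache<K, V> {

    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Long> counts = new HashMap<>();
    private final int maxEntries;

    public LeastUsedCache(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    public synchronized V get(K key) {
        V value = values.get(key);
        if (value != null) {
            counts.merge(key, 1L, Long::sum); // record the access
        }
        return value;
    }

    public synchronized void put(K key, V value) {
        if (!values.containsKey(key) && values.size() >= maxEntries) {
            evictLeastUsed();
        }
        values.put(key, value);
        counts.putIfAbsent(key, 0L);
    }

    /** Linear scan for the entry with the fewest recorded accesses. */
    private void evictLeastUsed() {
        K coldest = null;
        long fewest = Long.MAX_VALUE;
        for (Map.Entry<K, Long> e : counts.entrySet()) {
            if (e.getValue() < fewest) {
                fewest = e.getValue();
                coldest = e.getKey();
            }
        }
        if (coldest != null) {
            values.remove(coldest);
            counts.remove(coldest);
        }
    }
}
```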
Minimum-access-interval cleanup takes the time between accesses into account. When cached data is accessed, its access time is recorded and its access count incremented; on each subsequent access the count is incremented again and the average access interval is recalculated. Data that used to be hot sees its average access frequency drop as the intervals between accesses grow; once it drops low enough, the data is cleared.
One variation is to consider only the last N accesses. N could be 100, or any other number that makes sense. Each time the access count reaches N, the count is reset to 0 and the access time recorded. This way, data whose popularity has dropped is cleaned up more quickly.
Another variation is to reset the access count periodically and base cleanup on the count alone. For example, every hour, each entry's access count from the previous hour is stored in a separate variable for use in cleanup decisions, and the counter is reset to 0 for the next hour. This mechanism has the same effect as the previous variation.
The difference between the two variants comes down to what is checked on each cache access: whether the access count has reached N, or whether the interval Y has elapsed. The first reads the system clock only once every N accesses, while the second reads the system clock on every access (to see whether the interval has passed). Since checking an integer is usually faster than reading the system clock, I would choose the first option.
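A sketch of that first variant for a single cache entry: the clock is read only once every N accesses, and the average interval over that window decides whether the entry has gone cold (`windowSize` and the cold threshold are example parameters):

```java
/**
 * Heat tracking for one cache entry: the system clock is read only once
 * every N accesses, and the average interval over that window decides
 * whether the entry has gone cold and can be evicted.
 */
public class AccessIntervalTracker {

    private final int windowSize;          // N: accesses per clock check
    private final long coldIntervalMillis; // average interval that counts as "cold"

    private int accessCount = 0;
    private long windowStart = System.currentTimeMillis();
    private boolean cold = false;

    public AccessIntervalTracker(int windowSize, long coldIntervalMillis) {
        this.windowSize = windowSize;
        this.coldIntervalMillis = coldIntervalMillis;
    }

    /** Call on every cache access; usually just a cheap integer check. */
    public synchronized void recordAccess() {
        accessCount++;
        if (accessCount >= windowSize) { // the clock is read only here
            long now = System.currentTimeMillis();
            long averageInterval = (now - windowStart) / windowSize;
            cold = averageInterval > coldIntervalMillis;
            accessCount = 0;
            windowStart = now;
        }
    }

    /** True once the recent access rate has dropped enough to evict. */
    public synchronized boolean isCold() {
        return cold;
    }
}
```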
Keep in mind that even with a cache size management system in place, cached data still has to be kept consistent with the remote system. Even if a piece of data is heavily accessed and therefore stays in the cache, it sometimes needs to be refreshed from the remote system.
Caches in server clusters
Caching on a single server is simpler, because you can ensure that all writes go through that one server and use write-through caching. In a distributed cluster, however, the situation is more complicated, as illustrated in the following figure:
Simply using write-through caching only updates the cache on the server that performs the write; the other servers in the cluster are completely unaware of the change and keep serving stale data.
In a server cluster, a time-based expiration policy or an active expiration policy can be used to keep cached data in sync with the remote system.
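One common way to apply active expiration here (my illustration, not prescribed by the article) is to broadcast invalidation messages to every server so each local cache drops its stale entry. The `Bus` interface below is a hypothetical stand-in for a message queue or pub/sub channel:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Cluster-wide active expiration via broadcast invalidation messages. */
public class ClusteredCache<K, V> {

    /**
     * Hypothetical message bus (e.g. a pub/sub channel). Assumed to
     * deliver published keys to the other servers' subscribers, not
     * back to the publisher itself.
     */
    public interface Bus<K> {
        void publish(K key);
        void subscribe(Consumer<K> onInvalidate);
    }

    private final Map<K, V> localCache = new ConcurrentHashMap<>();
    private final Bus<K> bus;

    public ClusteredCache(Bus<K> bus) {
        this.bus = bus;
        // Drop the local copy whenever another server invalidates the key.
        bus.subscribe(localCache::remove);
    }

    public V get(K key) {
        return localCache.get(key);
    }

    public void put(K key, V value) {
        localCache.put(key, value);
        bus.publish(key); // tell the other servers their copy is stale
    }
}
```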
Cache products
Implementing your own caching system is fairly easy; whether you should depends on how much customization you need. If you don't want to implement your own, you can use an existing caching product, such as:
- Memcached
- Ehcache
- Redis
I don't know whether these products will meet your needs, but I do know they are widely used.
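For a sense of scale, caching with one of these products is usually only a few lines of code. A hedged example using Redis through the Jedis client, assuming a Redis server on localhost:6379 and a Jedis version with this `setex` signature:

```java
import redis.clients.jedis.Jedis;

public class RedisCacheExample {
    public static void main(String[] args) {
        // Assumes a Redis server running on localhost:6379.
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // Cache a value with a one-hour expiration (in seconds),
            // i.e. the time-based expiration strategy discussed above.
            jedis.setex("article:42", 3600, "cached article body");

            String cached = jedis.get("article:42"); // null once expired
            System.out.println(cached);
        }
    }
}
```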
Recommended reading
- What are microservices?
- Microservices programming paradigm
- Infrastructure for microservices
- Feasible solutions for service registration and discovery in microservices
- From monolithic architecture to microservice architecture
- How to effectively use Git to manage code in microservices teams?
- Summary of data consistency in microservice systems
- Implementing DevOps in three steps
- System design series: how to design a URL shortening service
- System design series: task queues
- Software Architecture – Caching technology
- Software Architecture – Event-driven architecture
Hello, I'm "Looking at the Mountains". Swimming in code, enjoying life. If this article helped you, please like, bookmark, and follow. You're also welcome to follow my public account, "Mountain Hut", and discover a different world.