Previously, we described a high-performance read service and a low-cost, easy-to-implement upgrade to it, but two problems remain unsolved:
- The first is the distributed transaction required to keep cache updates real-time.
- The second is the latency spikes caused by lazy loading.
The following addresses these two issues and works with you to build a spike-free read service with an average latency of 100 ms or less, using a full cache.
The basic architecture of full caching
A full cache is an implementation that stores all of the database's data in the cache, with no expiration time set. The architecture of this implementation is shown in the following figure:
Because all data is stored in the cache, the read service no longer falls back to the database when querying; all requests depend entirely on the cache. At this point, the latency spikes caused by falling back to the database are resolved.
However, full caching does not solve the distributed transaction problem at update time; it actually magnifies it. A full cache places stricter requirements on data updates: all existing database data and all real-time changes must be fully synchronized to the cache, with no omissions.
An effective way to solve this problem is to implement data synchronization by subscribing to the database's Binlog.
Binlog-based full cache architecture
Before implementing the Binlog-based architecture, let me briefly introduce Binlog, which I will discuss with you in more detail in Lecture 05. First, let's look at how Binlog works, as shown below:
Binlog is the master/slave data synchronization mechanism of MySQL and most mainstream databases. The master database writes all of its changes to its local Binlog file in a fixed format. During master-slave synchronization, the slave establishes a connection with the master, reads the master's Binlog file serially through a specific protocol, and replays the Binlog locally, thereby completing master-slave replication.
Many open source tools (e.g. Alibaba's Canal, MySQL_Streamer, Maxwell, LinkedIn's Databus) can emulate this master-slave replication protocol. By reading the master database's Binlog file through the emulated protocol, they obtain all changes to the master database, and they expose interfaces through which business services can consume these changes.
The Binlog-based full cache architecture relies on this kind of middleware to complete data synchronization. Once the Binlog middleware is attached to the target database, all of the database's changes can be obtained in real time; after parsing, those changes can be written directly to the cache.
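To make the flow concrete, here is a minimal sketch of such a synchronization module, assuming Canal as the Binlog middleware and Redis (via the Jedis client) as the cache. The schema name, key format, and helper methods are illustrative assumptions, not a prescribed design:

import com.alibaba.otter.canal.client.CanalConnector;
import com.alibaba.otter.canal.client.CanalConnectors;
import com.alibaba.otter.canal.protocol.CanalEntry;
import com.alibaba.otter.canal.protocol.Message;
import redis.clients.jedis.Jedis;

import java.net.InetSocketAddress;

public class BinlogCacheSync {
    public static void main(String[] args) throws Exception {
        // Connect to a Canal server that subscribes to the target database's Binlog.
        CanalConnector connector = CanalConnectors.newSingleConnector(
                new InetSocketAddress("127.0.0.1", 11111), "example", "", "");
        Jedis jedis = new Jedis("127.0.0.1", 6379);
        connector.connect();
        connector.subscribe("demo\\..*"); // all tables in the hypothetical "demo" schema

        while (true) {
            // Fetch a batch of Binlog entries without acknowledging them yet.
            Message message = connector.getWithoutAck(100);
            long batchId = message.getId();
            if (batchId == -1 || message.getEntries().isEmpty()) {
                continue; // no new changes yet
            }
            try {
                for (CanalEntry.Entry entry : message.getEntries()) {
                    if (entry.getEntryType() != CanalEntry.EntryType.ROWDATA) {
                        continue; // skip transaction begin/end markers
                    }
                    CanalEntry.RowChange rowChange =
                            CanalEntry.RowChange.parseFrom(entry.getStoreValue());
                    // For brevity this sketch treats every change as an upsert;
                    // a real handler would also delete cache entries on DELETE events.
                    for (CanalEntry.RowData rowData : rowChange.getRowDatasList()) {
                        jedis.set(toCacheKey(rowData), toCacheValue(rowData));
                    }
                }
                // Acknowledge only after the cache writes succeed.
                connector.ack(batchId);
            } catch (Exception e) {
                // Not acknowledged: Canal redelivers the batch, so the data
                // is eventually written to the cache.
                connector.rollback(batchId);
            }
        }
    }

    // Illustrative helpers: derive the cache key/value from the changed row.
    private static String toCacheKey(CanalEntry.RowData rowData) {
        return "demo:" + rowData.getAfterColumns(0).getValue();
    }

    private static String toCacheValue(CanalEntry.RowData rowData) {
        StringBuilder value = new StringBuilder();
        for (CanalEntry.Column column : rowData.getAfterColumnsList()) {
            value.append(column.getName()).append('=').append(column.getValue()).append(';');
        }
        return value.toString();
    }
}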
After adopting the Binlog synchronization scheme, the full cache architecture becomes more complete, mainly in the following three aspects.
- Reduced latency
The cache is now quasi-real-time: database master/slave synchronization operates at the millisecond level, so database changes are reflected in the cache almost in real time.
- Resolved the distributed transaction problem
Binlog master/slave replication is based on an ACK mechanism. If writing to the cache fails, the consumed Binlog is not acknowledged; the next time that Binlog is consumed, the data is written to the cache again, so it eventually lands in the cache (see the ack/rollback handling in the earlier sketch). This solves the data loss caused by being unable to satisfy distributed transactions and guarantees eventual consistency of the data.
- Improved code simplicity and maintainability
Because all changes to the database are eventually reflected in the Binlog, the Binlog data handler stays fixed as long as the database's table structure is unchanged. Recall the active cache update approach from the previous lecture: there, every new interface that writes to the database needs its own cache-update code, so the maintenance cost and error probability are much higher than with Binlog.
With all the improvements that Binlog-based full caching brings, are there any drawbacks? The answer is yes. Any solution that improves one aspect must make trade-offs elsewhere; architecture is an art of balance.
Problems with full caching using Binlog
Two problems arise when using a Binlog-based full cache.
The first problem is that it increases the overall complexity of the system. When the architecture contains only the database as middleware, the system is relatively simple. Once Binlog is introduced, the data synchronization path becomes longer, and the points to monitor and the potential points of failure grow from one middleware component to two.
The second problem is that the size of the cache grows dramatically, and the resource cost rises accordingly. In scenarios that demand extreme performance and strong real-time behavior, this is a trade-off you accept: obtaining the enhanced capabilities has a price. Beyond accepting the trade-off, however, there are several technical optimizations that can reduce the cost.
First, filter the data before storing it: only data that has business meaning and will actually be queried goes into the cache. For example, bookkeeping fields that are common in database tables, such as modification time, creation time, modifier, and data flag bits, need not be stored in the cache.
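As a rough illustration of this filtering, the sketch below copies only the business fields that queries actually need into the object that is cached; all class and field names are hypothetical:

// ProductRow mirrors the database table; CachedProduct keeps only the
// fields that read requests actually use. Both classes are illustrative.
class ProductRow {
    long id;
    String name;
    long price;
    String modifier;   // audit field: who last changed the row
    long createTime;   // audit field: creation timestamp
    long modifyTime;   // audit field: modification timestamp
}

class CachedProduct {
    long id;
    String name;
    long price;

    static CachedProduct from(ProductRow row) {
        CachedProduct cached = new CachedProduct();
        cached.id = row.id;
        cached.name = row.name;
        cached.price = row.price;
        // modifier/createTime/modifyTime are deliberately dropped:
        // they are never queried, so they would only inflate the cache.
        return cached;
    }
}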
Second, the data stored in the cache can be compressed, using common compression algorithms such as Gzip or Snappy. Compression consumes CPU, however, so in practice you should load-test first and then evaluate the choice; if the CPU cost of compression is unacceptable, you can store the JSON data or Redis Hash data in the cache uncompressed.
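For example, here is a minimal sketch of the compression step using only the JDK's built-in Gzip support; the read path would decompress symmetrically with GZIPInputStream:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class CacheCompressor {
    // Compress a JSON string into bytes suitable for storing as a binary cache value.
    static byte[] compress(String json) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            gzip.write(json.getBytes(StandardCharsets.UTF_8));
        }
        return bos.toByteArray(); // safe to read after the gzip stream is closed
    }
}

Here I share with you three more cache-saving tips.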
Tip 1: When serializing data to JSON, you can annotate each field with a short alias, so that after serialization the field name is replaced by the alias. Suppose we have a DemoClass class annotated this way, as follows:
import com.google.gson.annotations.SerializedName;

// Gson's @SerializedName is one concrete way to declare the short alias.
class DemoClass {
    @SerializedName("1")
    private String field1;
    @SerializedName("2")
    private String field2;
}
Serialized this way, the data looks like this:
{"1":field1Value,"2":field2Value}
Without the aliases, the same data would serialize as:
{"field1":field1Value,"field2":field2Value}
In the example above, shortening the field1 and field2 keys saves only a few bytes per record. But if you store tens of millions or hundreds of millions of similar records in the cache, the total savings become very significant. The major JSON serialization tools already support this technique, for example Gson and FastJSON in Java; see each tool's documentation for usage.
Tip 2: If your cache is Redis and you store data in its Hash structure, the field names of the Hash can likewise be replaced with shorter identifiers, following the same pattern as the JSON aliases above. With a full cache, the savings are again significant.
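For instance, a minimal sketch with the Jedis client, where the Hash fields "field1" and "field2" are written under the short identifiers "1" and "2" (the key name is hypothetical):

import redis.clients.jedis.Jedis;

public class HashFieldDemo {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("127.0.0.1", 6379);
        jedis.hset("demo:1001", "1", "field1Value"); // "1" stands in for "field1"
        jedis.hset("demo:1001", "2", "field2Value"); // "2" stands in for "field2"
        jedis.close();
    }
}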
Tip 3: When a full cache serves all read requests, cache loss can go undetected. Redis and other cache implementations do provide persistence and master-slave backup, but for the sake of performance they do not provide database-like ACID guarantees, so in extreme cases data can still be lost.
To retain the advantages of the full cache while handling this extreme case, you can use asynchronous verification combined with alarms and automatic backfilling. The architecture of this solution is shown below:
When the read service finds no data in the cache, it returns empty data directly to the caller (see mark 1 in the figure above). At the same time, it sends a message through the MQ middleware (mark 2). The consumer of this message asynchronously queries the database (mark 3); if the data does exist in the database, it raises an alarm or writes a log, and automatically flushes the data back to the cache (mark 4).
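A minimal sketch of the message consumer in this flow is shown below; the Database, Cache, and Alarm interfaces are hypothetical stand-ins for your real components:

// All interfaces below are illustrative stand-ins for real components.
interface Database { String query(String key); }
interface Cache { void set(String key, String value); }
interface Alarm { void fire(String message); }

class CacheMissVerifier {
    private final Database database;
    private final Cache cache;
    private final Alarm alarm;

    CacheMissVerifier(Database database, Cache cache, Alarm alarm) {
        this.database = database;
        this.cache = cache;
        this.alarm = alarm;
    }

    // Invoked by the MQ consumer for every "cache returned empty" message.
    void onCacheMiss(String key) {
        String value = database.query(key); // mark 3: check the source of truth
        if (value != null) {
            // The data exists in the database but was missing from the cache:
            // raise an alarm and backfill the cache (mark 4).
            alarm.fire("cache miss for existing data, key=" + key);
            cache.set(key, value);
        }
        // If the database has no data either, the empty result was correct.
    }
}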
This is a lossy scheme: if the data exists in the database but not in the cache, the caller still receives empty data on the first call. So why use it at all?
In practice, the probability of this situation is extremely low. In my own experience, this asynchronous verification scheme ended up being turned off in production, mainly for the following four reasons.
- According to our statistics, the probability that data exists in the database but not in the cache is almost zero.
- A large number of useless asynchronous verification queries against the database degrades database performance.
- Even if data in the cache is lost, Binlog will flush it back to the cache as soon as it changes. If the data never changes, it is effectively dead data of little value.
- If you run this solution in production with asynchronous verification enabled and still see a lot of data loss, there is clearly room for improvement in how the caching middleware is used and tuned. After all, most such loss is caused by the middleware itself; we should not put the cart before the horse and have the business team do excessive work to compensate for middleware problems.
Although we ultimately did not keep this compensation scheme, the process of thinking it through is worth learning from. When you face a similar problem at work and need to decide whether to adopt a technical solution, you can use the same method: reason it out and verify with data before making the final decision.
Other optimization points
With the Binlog synchronization scheme, the whole data synchronization process becomes very simple: after receiving data from the Binlog, the data synchronization module performs routine data conversion and writes the result directly into the cache.
Real-time hot standby across multiple data centers
To improve performance and availability, the cache that the data synchronization module writes to can be expanded from one cluster to two. In terms of deployment, if resources permit, the two cache clusters can be placed in data centers in different cities, or in different zones of the same city, with read services deployed to the corresponding cities or zones. Each read service then relies only on the cache cluster in its own data center or zone when handling requests. This scheme has two benefits.
First: improved performance. This approach is consistent with the principle from the previous lecture that a read service should not be layered and should sit as close to its data as possible.
Second: improved availability. When a single data center fails, all traffic can be switched seamlessly to a healthy data center or zone. The switchover takes minutes or even seconds, so high availability is assured.
This solution improves performance and availability, but at the price of higher resource costs.
Asynchronous parallelization
In the simplest read scenario, a request interacts with storage only once, but in reality a single request often requires multiple interactions. For such scenarios, asynchronous parallelization can be adopted: after receiving a read request, the read service changes its interactions with storage from serial to asynchronous and parallel, as shown in the following figure:
If a read request requires three interactions with storage and each takes 10 ms, serial execution takes 30 ms in total; with asynchronous parallelization, the total is still roughly 10 ms, which is a large overall performance gain.
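As a rough sketch of this pattern with Java's CompletableFuture, the three storage interactions below run in parallel, so the total latency is roughly that of the slowest call; fetchFromStore is a hypothetical stand-in for a single storage interaction:

import java.util.concurrent.CompletableFuture;

public class ParallelReadDemo {
    public static void main(String[] args) {
        // Fire all three independent storage calls at once...
        CompletableFuture<String> f1 = CompletableFuture.supplyAsync(() -> fetchFromStore("key1"));
        CompletableFuture<String> f2 = CompletableFuture.supplyAsync(() -> fetchFromStore("key2"));
        CompletableFuture<String> f3 = CompletableFuture.supplyAsync(() -> fetchFromStore("key3"));
        // ...then wait for all of them; total latency is roughly the slowest
        // call, not the sum of all three.
        CompletableFuture.allOf(f1, f2, f3).join();
        System.out.println(f1.join() + " " + f2.join() + " " + f3.join());
    }

    // Hypothetical stand-in for a ~10 ms storage interaction.
    static String fetchFromStore(String key) {
        return "value-of-" + key;
    }
}

But asynchronous parallelization also brings some problems and limitations: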
- First, asynchronous parallelization consumes more threads: each parallel call occupies a thread, which increases CPU consumption.
- Second, asynchronous, multithreaded code is harder to write and maintain.
- Finally, asynchronous parallelization applies only when the interactions with storage are independent of one another, with no ordering dependency between them.
Beyond the scenario above, asynchronous parallelization can also be applied when a batch of data is queried in a single request. When a large batch is queried, most of the time is spent waiting serially on network transfers. The batch can be split into multiple sub-batches, with each sub-batch interacting with storage asynchronously and in parallel, which improves performance greatly. The appropriate number of sub-batches can be determined through load testing.
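As a closing sketch, the code below splits a large key list into sub-batches and fetches them in parallel with CompletableFuture; batchGet is a hypothetical stand-in for a multi-key storage query, and the sub-batch size here is just for demonstration:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

public class SubBatchQuery {
    static Map<String, String> queryAll(List<String> keys, int subBatchSize) {
        List<CompletableFuture<Map<String, String>>> futures = new ArrayList<>();
        // Split the full key list into sub-batches and query each one in parallel.
        for (int i = 0; i < keys.size(); i += subBatchSize) {
            List<String> subBatch = keys.subList(i, Math.min(i + subBatchSize, keys.size()));
            futures.add(CompletableFuture.supplyAsync(() -> batchGet(subBatch)));
        }
        // Merge the sub-batch results once all parallel queries finish.
        Map<String, String> result = new HashMap<>();
        for (CompletableFuture<Map<String, String>> future : futures) {
            result.putAll(future.join());
        }
        return result;
    }

    // Hypothetical stand-in for a multi-key query against the cache/storage.
    static Map<String, String> batchGet(List<String> keys) {
        Map<String, String> values = new HashMap<>();
        for (String key : keys) {
            values.put(key, "value-of-" + key);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(queryAll(List.of("k1", "k2", "k3", "k4", "k5"), 2));
    }
}

With appropriately sized sub-batches, the total wait approaches the latency of the slowest sub-batch rather than the sum of all of them.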