The current requirement is to design an underlying service system that exposes a set of basic data-request interfaces. This service is called P here. To keep P highly reliable and highly available, it should depend as little as possible on external middleware such as databases and message queues, so all the data it needs is held in a local cache. A separate service calls other interfaces or queries databases to collect the data, stores what it collects in Redis, and then notifies P to refresh its local cache. This collecting service is called D. The following is a diagram of P, D, and Redis.
Note that P does not depend strongly on Redis or D, because its local cache already holds the full data set. The trade-off is that the data is not guaranteed to be up to date, so D has to send update notifications periodically. This article discusses only how P and D synchronize data through Redis and leaves other details aside.
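As a rough illustration of why P can survive without Redis or D, here is a minimal Python sketch. The class name `PService`, the method names, and the `fetch_all` callable are all hypothetical and not part of the original design. The point is only that reads go to the local cache, and that a failed refresh leaves the old data in place.

```python
class PService:
    """Hypothetical stand-in for system P: serves reads from a local cache only."""

    def __init__(self, initial_data):
        # Full data set loaded once at startup.
        self.local_cache = dict(initial_data)

    def get(self, key):
        # Reads never touch Redis or D, so they work even if both are down.
        return self.local_cache.get(key)

    def on_update_notification(self, fetch_all):
        # fetch_all is an assumed callable that pulls the data set from Redis.
        try:
            fresh = fetch_all()
        except ConnectionError:
            # Refresh failed: keep serving the (possibly stale) local copy.
            return False
        self.local_cache = dict(fresh)
        return True
```

If the refresh fails, P keeps answering requests with the data it already has, which is the weak dependency described above.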
Data synchronization generally takes one of two forms: full or incremental.
During system initialization it is best to do a full load. Once the local cache holds a complete data set, later updates are best done incrementally. This has two benefits. First, it reduces network transfer and speeds up each update. Second, it avoids the case where a full overwrite of a large data set is slow enough to affect normal business requests. A slow overwrite can also mean that the next update notification arrives before the current update has finished, which traps the system in a backlog of updates and creates a vicious cycle that drags down performance. The following describes how the two synchronization methods are implemented.
Implementation of full overwrite:
After D pulls the full data, it deletes the old data structure in Redis outright, for example a Hash, and writes all the new data into a fresh Hash. When P receives the update notification, it pulls the entire Hash and overwrites its local cache.
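The full-overwrite flow can be sketched as follows. This is a minimal sketch that uses a plain Python dict in place of Redis so that it runs without a server. The function names `d_full_publish` and `p_full_refresh` are hypothetical, and the comments only indicate which Redis commands each step roughly corresponds to.

```python
def d_full_publish(store, key, records):
    """D side: drop the old hash, then write the complete new data set."""
    store.pop(key, None)        # roughly DEL key
    store[key] = dict(records)  # roughly HSET key field value [field value ...]


def p_full_refresh(store, key):
    """P side: pull the whole hash and return it as the new local cache."""
    return dict(store.get(key, {}))  # roughly HGETALL key


# Example: "stale" exists only in the old hash and disappears after the overwrite.
store = {"data": {"a": "1", "stale": "x"}}
d_full_publish(store, "data", {"a": "1", "b": "2"})
local_cache = p_full_refresh(store, "data")
```

Because the old structure is deleted before the new one is written, a reader that pulls between the two steps would see an empty hash. That is one reason this approach suits initialization better than frequent updates.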
Implementation of incremental overwrite:
Incremental changes come in three kinds: added data, deleted data, and modified data. The approach that comes to mind first is for D, each time it pulls the full new data, to compare the new data set against the set already in Redis. Suppose the new set has n records and the old set has m records. Taking the difference means checking, for each of the n new records, whether it exists among the m old ones, and inserting it if it does not. The time complexity is n*m. This approach also detects only added and modified data; deleted data goes unnoticed.
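The comparison described above can be sketched like this. It is a deliberately naive version that uses plain lists of `(key, value)` tuples instead of Redis, and the function name `naive_diff` is hypothetical. The nested loop makes the n*m cost visible.

```python
def naive_diff(new_records, old_records):
    """Return the new records that do not appear in the old set."""
    to_insert = []
    for rec in new_records:        # n iterations
        found = False
        for old in old_records:    # up to m comparisons per new record
            if rec == old:
                found = True
                break
        if not found:
            # Either a brand-new key or an existing key with a changed value.
            to_insert.append(rec)
    return to_insert


old = [("a", 1), ("b", 2), ("c", 3)]
new = [("a", 1), ("b", 20), ("d", 4)]
changes = naive_diff(new, old)
# changes holds ("b", 20) and ("d", 4); the removal of "c" is not reported.
```

In this example the modified record `b` and the added record `d` are found, but nothing signals that `c` was deleted, which is the limitation noted in the text.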
Optimizing incremental synchronization:
To address the defects of the scheme above, one optimization is to add a version number that controls data updates. The data is stored in a Zset or Hash, with the score serving as the version number. After D obtains the full data, it does not need to delete the old data in Redis. Instead, it looks up each new record directly in the existing set, and the lookup result describes the three kinds of change. If the record is found, it has not changed, and its score is raised to the new version. If the record is not found, it is either new or modified, so it is inserted directly with its score set to the same new version. When the whole pass is complete, the data in Redis can be told apart by score: records whose score was not raised are old copies of records that were modified or deleted, and records whose score was raised were either newly inserted or found unchanged. After receiving the update notification, P uses the version number to pull the records with the matching score and overwrites its local cache with them.
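The version-number idea can be sketched as follows. This is only an illustration under several assumptions: a dict mapping member to score stands in for a Redis Zset, each member is a serialized record such as `"b:20"`, and the function names `d_sync`, `p_pull`, and `stale_members` are hypothetical. In real Redis the write step would roughly correspond to ZADD and the read step to a score-range query.

```python
def d_sync(zset, new_members, version):
    """D side: stamp every current record with the new version."""
    unchanged = inserted = 0
    for member in new_members:
        if member in zset:
            unchanged += 1   # found: record did not change
        else:
            inserted += 1    # not found: record is new or modified
        zset[member] = version  # in both cases the score becomes the new version
    return unchanged, inserted


def p_pull(zset, version):
    """P side: pull the members whose score equals the notified version."""
    return {m for m, score in zset.items() if score == version}


def stale_members(zset, version):
    """Members left on an older score: old copies of modified or deleted records."""
    return {m for m, score in zset.items() if score < version}


zset = {"a:1": 1, "b:2": 1, "c:3": 1}          # state after version 1
d_sync(zset, ["a:1", "b:20", "d:4"], version=2)
current = p_pull(zset, 2)
leftover = stale_members(zset, 2)
```

After the pass, `current` contains `a:1`, `b:20`, and `d:4`, while `leftover` contains `b:2` and `c:3`. Each membership test is a single lookup, so the pass no longer needs the pairwise comparison of the earlier scheme. When and how the leftover members are cleaned up is not covered by the original text and is left open here.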