Source: www.cyningsun.com/03-07-2020/…
Contents
- Background
- Consistency perspectives
- Client consistency
- Server consistency
- Replication mechanism
Background
Whenever more than one replica (copy) of a piece of data exists, consistency becomes a problem. Data is replicated for the following reasons:
- Fault tolerance: when one copy fails, the data can be read from another copy, which avoids a single point of failure.
- Improved performance: keeping the same data in several locations reduces access latency.
  - Storing copies closer to the user. Typical example: CDN.
  - Storing data on a faster storage medium. Typical example: caching.
- Load sharing: because there are multiple copies, each copy can serve part of the query requests.
In general, accessing a copy should give the same result as accessing the original data, and the existence of copies should be transparent to external users. This is the common understanding of consistency. Everyone talks about consistency, but they do not necessarily mean the same thing.
Consistency perspectives
In practice, data is read from the storage system, processed by the service system, and then presented to end users.
Therefore, data consistency can be viewed from two perspectives:
- V1: server view
- V2: user perspective
Client consistency
First, define the following scenario:
- Storage system: stores user data.
- User A: writes data to the storage system and reads its own data and that of others.
- Users B and C: read their own data and that of others.
From the user’s point of view, consistency includes the following three cases:
- Strong consistency: once user A has written a value to the storage system, the storage system ensures that subsequent reads by A, B, and C all return the latest value. If the write “timed out”, however, it may have either succeeded or failed, and A should not assume either outcome.
- Weak consistency: once user A has written a value to the storage system, the storage system does not guarantee that subsequent reads by A, B, or C return the latest value.
- Eventual consistency: a special case of weak consistency. If user A writes a value to the storage system, the storage system guarantees that, provided no later write updates the same value, reads by A, B, and C will “eventually” return the value written by A. Eventual consistency comes with an “inconsistency window”: the time between A’s write and the point at which A, B, and C are all able to read the latest value. The size of this window depends on several factors: communication latency, the load on the system, and the number of replicas the replication protocol has to synchronize.
This description of eventual consistency is coarse. Common variants include:
- Read-your-writes consistency: after client A writes a value, all subsequent reads by client A return that value or a newer one. Other users (such as B or C) may not see it for a while.
- Session consistency: read-your-writes consistency is guaranteed for the whole session between the user and the storage system. If the original session becomes invalid for some reason and a new session is created, read-your-writes consistency is not guaranteed across the two sessions.
- Monotonic read consistency: once client A has read a value of an object, its subsequent reads never return an earlier value.
- Monotonic write consistency: the writes of client A are applied in the order they were issued. This means that every replica in the storage system must apply the operations of the same client in the same order.
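One way to picture read-your-writes and monotonic reads is a client session that remembers the highest version it has seen for each key and refuses answers from replicas that are older. The sketch below is a toy model under that assumption; the `Replica` and `Session` classes and the integer version numbers are hypothetical and not taken from any particular storage system.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy replica holding versioned values: key -> (version, value)."""
    data: dict = field(default_factory=dict)

    def apply(self, key, version, value):
        # Keep only the newest version so out-of-order delivery is harmless.
        current = self.data.get(key, (0, None))
        if version > current[0]:
            self.data[key] = (version, value)

    def get(self, key):
        return self.data.get(key, (0, None))


class Session:
    """Client session that tracks the highest version seen per key."""

    def __init__(self):
        self.seen = {}  # key -> highest version read or written in this session

    def write(self, replica, key, version, value):
        replica.apply(key, version, value)
        self.seen[key] = max(self.seen.get(key, 0), version)

    def read(self, replicas, key):
        # Try replicas in order; accept the first one that is not older
        # than what this session has already observed.
        for replica in replicas:
            version, value = replica.get(key)
            if version >= self.seen.get(key, 0):
                self.seen[key] = version
                return value
        raise RuntimeError("no replica is fresh enough for this session")


primary, lagging = Replica(), Replica()
session_a = Session()
# User A writes to the primary; replication to `lagging` has not happened yet.
session_a.write(primary, "profile", 1, "v1")

# A's read skips the stale replica and is served by the primary.
value_seen_by_a = session_a.read([lagging, primary], "profile")

# Another session (user B) has no version floor and may see the old state.
value_seen_by_b = Session().read([lagging, primary], "profile")
```

In this model A always sees its own write, while B may still read the old (empty) state from the lagging replica, which matches the read-your-writes description above.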
From the user’s perspective, a service system is generally expected to support read-your-writes consistency, session consistency, monotonic reads, and monotonic writes while providing high availability.
Server consistency
Some definitions first:
- N = the number of nodes that store a copy of the data
- W = the number of copies that must acknowledge an update before the update is considered complete
- R = the number of copies contacted when a data object is read
If W + R > N, the write set and the read set always overlap, so strong consistency is guaranteed. In a MySQL primary-backup setup with synchronous replication, N = 2, W = 2, and R = 1: whichever copy the client reads from, it gets the same result. With asynchronous replication and reads from the backup enabled, N = 2, W = 1, and R = 1. Here W + R = N, so consistency is not guaranteed.
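The overlap rule can be checked mechanically. The sketch below compares the W + R > N rule with a brute-force enumeration of every possible write set and read set; the function names are illustrative, and the three configurations are the ones mentioned in this section.

```python
from itertools import combinations


def quorums_always_overlap(n, w, r):
    """Brute force: does every write set of size w share a node with every read set of size r?"""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


def is_strongly_consistent(n, w, r):
    """The rule from the text: W + R > N."""
    return w + r > n


configs = {
    "sync primary/backup": (2, 2, 1),
    "async primary/backup, reads from backup": (2, 1, 1),
    "N=3 quorum": (3, 2, 2),
}
results = {name: is_strongly_consistent(*cfg) for name, cfg in configs.items()}
```

For small N the brute-force check and the inequality agree on every configuration, which is the pigeonhole argument behind the rule: W + R nodes cannot be picked from N nodes without reuse when W + R > N.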
Distributed storage systems that need high performance and high availability usually keep more than two replicas. Systems that focus only on fault tolerance typically use N = 3 with W = 2 and R = 2. WeChat’s early QuorumKV used this configuration.
When W + R <= N, the read and write sets may not overlap, so the system offers only weak or eventual consistency. Whether read-your-writes, session, and monotonic consistency can be achieved then usually depends on how “sticky” the client is to the server that executes the distributed protocol. If the client reaches the same server every time, read-your-writes and monotonic reads are relatively easy to guarantee. Stickiness makes load balancing and fault tolerance somewhat harder to manage, but it is a simple solution.
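A minimal sketch of client stickiness, assuming the simplest possible scheme: hash the client identifier and take it modulo the number of servers. The server names and the `pick_server` helper are hypothetical.

```python
import hashlib


def pick_server(client_id, servers):
    """Map a client to one server deterministically (stable hash, then modulo)."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["replica-0", "replica-1", "replica-2"]  # hypothetical server names
first = pick_server("user-a", servers)
again = pick_server("user-a", servers)  # same client, same server
```

As long as the server list is unchanged, a client always lands on the same server. When a server is added or removed the mapping can change, which is one way to see why stickiness complicates load balancing and fault tolerance.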
From the service system’s perspective, a storage system may support strong consistency or, for the sake of performance, only eventual consistency. A system that provides no consistency guarantee at all is cumbersome to use.
Replication mechanism
In terms of storage system availability, storage structure and replication mechanism combine into the following three modes:
- Single master, asynchronous replication
- Automatic master election, synchronous replication
- Multiple masters, synchronous replication
The three modes are listed in increasing order of implementation difficulty. For an “in-memory database + disk database” storage architecture, treating the two as a whole makes its “single-master” form easy to see:
- Most of the read requests hit the in-memory database and a few fall to the disk database
- Write requests are written to the disk database and then synchronized from the disk database to the in-memory database
In a generic architecture, because an in-memory database risks losing data, data is typically written to the disk database first and then written or synchronized to the in-memory database. As a result, many companies (such as Facebook and our company) update the cache through asynchronous replication: the commit log of the disk database (the binlog, in MySQL’s case) is used to update the cache automatically and asynchronously.
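The log-driven cache update can be modelled in a few lines. This is a toy sketch, not a real binlog consumer: the in-process `deque` stands in for the commit log stream, and `DiskDatabase` and `CacheUpdater` are hypothetical names.

```python
from collections import deque


class DiskDatabase:
    """Toy master: every committed write is appended to a commit log."""

    def __init__(self):
        self.rows = {}
        self.commit_log = deque()  # stands in for a binlog stream

    def write(self, key, value):
        self.rows[key] = value
        self.commit_log.append((key, value))


class CacheUpdater:
    """Consumes the commit log and applies entries to the cache, in order."""

    def __init__(self, database, cache):
        self.database = database
        self.cache = cache

    def drain(self):
        applied = 0
        while self.database.commit_log:
            key, value = self.database.commit_log.popleft()
            self.cache[key] = value
            applied += 1
        return applied


cache = {}
db = DiskDatabase()
updater = CacheUpdater(db, cache)

db.write("user:1", "alice")
stale_read = cache.get("user:1")  # updater has not run yet: inconsistency window
updater.drain()
fresh_read = cache.get("user:1")  # the cache has caught up
```

The gap between the write and the call to `drain()` is the inconsistency window: a reader that hits the cache in that interval does not see the new value.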
Of course, asynchronous replication provides only eventual consistency; it does not cover scenarios that need strong consistency from the user’s perspective. Those scenarios can be handled with the Cache-Aside pattern.
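A minimal single-threaded sketch of the Cache-Aside pattern, assuming plain dictionaries in place of a real cache and database; the `CacheAsideStore` class is hypothetical. Reads go to the cache first and fall back to the database on a miss; writes update the database and then invalidate the cached entry.

```python
class CacheAsideStore:
    """Cache-aside access: the application manages the cache explicitly."""

    def __init__(self, database, cache):
        self.database = database  # assumed dict-like source of truth
        self.cache = cache        # assumed dict-like cache

    def read(self, key):
        # 1. Serve from the cache on a hit.
        if key in self.cache:
            return self.cache[key]
        # 2. On a miss, load from the database and populate the cache.
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # 1. Update the source of truth first.
        self.database[key] = value
        # 2. Invalidate (not update) the cache; the next read reloads it.
        self.cache.pop(key, None)


database = {"user:1": "alice"}
cache = {}
store = CacheAsideStore(database, cache)

first_read = store.read("user:1")   # miss: loaded from the database
store.write("user:1", "bob")        # database updated, cache entry dropped
second_read = store.read("user:1")  # miss again: reloads the new value
```

After a write, the writer's next read reloads the new value instead of waiting for asynchronous replication. Under concurrent readers and writers a stale value can still be re-cached in rare interleavings, so this sketch narrows the window rather than proving strong consistency.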
Reference links:
- Transaction processing in distributed systems
- Eventually Consistent
- Introduction to WeChat background architecture and infrastructure
- En.wikipedia.org/wiki/Replic…
- Scaling Memcache at Facebook
- Design and implementation of Shopee Data Event Center
- Cache update routines