CAP, BASE, eventual consistency and the five-minute rule
CAP, BASE, and eventual consistency are the three cornerstones on which NoSQL databases rest. The five-minute rule is the theoretical basis for in-memory data storage. This is where it all started.
CAP
- C: Consistency
- A: Availability (data can be accessed quickly)
- P: Partition tolerance (tolerance of network partitions)
Ten years ago, Professor Eric Brewer proposed the famous CAP theorem, which Seth Gilbert and Nancy Lynch later proved correct. The CAP theorem tells us that a distributed system cannot satisfy consistency, availability, and partition tolerance at the same time; it can satisfy at most two of the three.
Since today's network hardware is bound to suffer problems such as delays and packet loss, partition tolerance is something we must provide. That leaves a trade-off between consistency and availability: no NoSQL system can guarantee all three. If you prioritize consistency, you must handle operations that fail because the system is unavailable; if you prioritize availability, you must accept that a read may not return the latest value written by a preceding write. Different systems therefore adopt different strategies, and only by truly understanding a system's requirements can we apply the CAP theorem well.
For an architect, there are generally two directions in which to apply the CAP theorem:
- Key-value stores, such as Amazon Dynamo, where you flexibly choose among database products according to the CAP principles.
- Domain model + distributed cache + storage (the Qi4j and NoSQL movement), where you design a flexible distributed solution tailored to your own project according to the CAP principles; this is considerably harder.

I would add a third option: implement a database whose CAP properties can be configured and adjusted dynamically.
- CA: traditional relational database
- AP: key-value database
For large websites, availability and partition tolerance have higher priority than data consistency, so they tend to design toward A and P and then meet the business's consistency requirements by other means. Rather than spending effort on a perfect distributed system that satisfies all three, architects should make trade-offs.
Different kinds of data have different consistency requirements. User reviews, for example, are insensitive to inconsistency and can tolerate it for relatively long periods without affecting transactions or the user experience. Product prices, on the other hand, are very sensitive: a price discrepancy usually cannot be tolerated for more than about ten seconds.
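One concrete way to tune this trade-off per data type, in Dynamo-style key-value stores, is quorum configuration: with N replicas, a read needs R answers and a write needs W acknowledgements, and choosing R + W > N makes reads and writes overlap. The sketch below is only an illustration of the idea, not the API of any particular product; the replica counts and data types are made up.

```python
# Minimal sketch of Dynamo-style quorum tuning (illustrative only).
# N = number of replicas, R = replicas that must answer a read,
# W = replicas that must acknowledge a write.

def quorum_profile(n: int, r: int, w: int) -> str:
    """Classify an (N, R, W) configuration by the CAP trade-off it leans toward."""
    if r + w > n:
        return "reads overlap the latest write (leans toward consistency)"
    return "reads may return stale values during the inconsistency window (leans toward availability)"

# Price data: inconsistency is costly, so make reads and writes overlap.
print("prices  :", quorum_profile(n=3, r=2, w=2))
# User reviews: staleness is tolerable, so answer from any single replica.
print("reviews :", quorum_profile(n=3, r=1, w=1))
```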
Proof of CAP theory: Brewer’s CAP Theorem
Eventual consistency
In a word: the process may be loose, but the result must be tight; the final result must be consistent.
To better describe client-side consistency, consider the following scenario, which involves three parties:
- The storage system
A storage system can be thought of as a black box that provides us with a guarantee of availability and durability.
- Process A
Process A performs write and read operations against the storage system.
- Process B and Process C
Processes B and C are independent of A, and independent of each other; they also read from and write to the storage system.
The following scenario describes the different levels of consistency:
- Strong consistency
Strong consistency (also called instantaneous consistency): if A first writes a value to the storage system, the storage system guarantees that subsequent reads by A, B, and C all return that latest value.
- Weak consistency
If A first writes a value to the storage system, the storage system cannot guarantee that subsequent reads by A, B, and C will return the latest value. This gives rise to the notion of an "inconsistency window": the period from A's write until subsequent reads by A, B, and C return the latest value.
- Eventual consistency
Eventual consistency is a special case of weak consistency: if A first writes a value to the storage system, and no other write updates that value before A, B, or C's subsequent reads, then the storage system guarantees that all reads will eventually return the latest value written by A. If no failures occur, the size of the "inconsistency window" depends on factors such as communication delay, the load on the system, and the number of replicas used in replication. The best-known eventually consistent system is DNS: after a domain name's IP address is updated, and depending on configuration and cache-control policies, all clients will eventually see the new value. (A toy simulation of this window follows.)
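The simulation below is purely illustrative and not modelled on any real system: a primary takes writes and propagates them to replicas asynchronously, so a reader on a lagging replica sees the old value until replication runs, after which all replicas converge.

```python
# Toy model of asynchronous replication and the inconsistency window (illustrative only).

class ReplicatedStore:
    def __init__(self, replica_count: int):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # (key, value) writes not yet applied on every replica

    def write(self, key, value):
        self.replicas[0][key] = value      # applied on the primary immediately
        self.pending.append((key, value))  # other replicas will see it later

    def read(self, key, replica: int):
        return self.replicas[replica].get(key)

    def replicate(self):
        """Deliver pending writes to all replicas, closing the inconsistency window."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica[key] = value
        self.pending.clear()

store = ReplicatedStore(replica_count=3)
store.write("x", 1)
print(store.read("x", replica=0))  # 1    -> A reads its own write on the primary
print(store.read("x", replica=2))  # None -> B/C are still inside the window
store.replicate()
print(store.read("x", replica=2))  # 1    -> eventually consistent
```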
Variants
- Causal consistency
If process A notifies process B that it has updated a data item, B's subsequent reads return the latest value written by A, while process C, which has no causal relationship with A, remains only eventually consistent.
- Read-your-writes consistency
If process A writes the latest value, A's own subsequent operations will read that value, but other users may not see it for a while (a sketch of this appears after this list).
- Session consistency
This requires read-your-writes consistency for the duration of a session between the client and the storage system. Hibernate's session provides this kind of consistency.
- Monotonic read consistency
If process A has read a particular value of an object, its subsequent reads will never return an earlier value.
- Monotonic write consistency
The system guarantees that all writes from the same process are serialized, i.e. applied in order.
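Read-your-writes (and, per session, session consistency) can be approximated on top of an eventually consistent store by pinning a session's reads, for the keys it wrote, to the replica that accepted those writes. The following is a minimal standalone sketch under that assumption, not the mechanism of any specific product.

```python
# Sketch of read-your-writes via session pinning (illustrative only).
# Two replicas; replication to replica 1 is assumed to lag behind.

replicas = [{}, {}]  # replica 0 takes writes, replica 1 lags

class Session:
    """Reads keys this session wrote from the write replica; other reads may be stale."""
    def __init__(self, write_replica: int = 0):
        self.write_replica = write_replica
        self.wrote_keys = set()

    def write(self, key, value):
        replicas[self.write_replica][key] = value  # replica 1 has not caught up yet
        self.wrote_keys.add(key)

    def read(self, key, preferred_replica: int = 1):
        replica = self.write_replica if key in self.wrote_keys else preferred_replica
        return replicas[replica].get(key)

s = Session()
s.write("profile", "updated")
print(s.read("profile"))            # 'updated' -> the session sees its own write
print(replicas[1].get("profile"))   # None      -> other users may not see it yet
```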
BASE
Amusingly, "BASE" literally means a base (alkali), while "ACID" means acid; the two really are at odds.
- Basically Available
- Soft state (flexible transactions)
Soft state is connectionless; hard state is connection-oriented.
- Eventual Consistency
Eventual consistency is also the ultimate goal of ACID.
The BASE model is the opposite of the ACID model: it sacrifices strong consistency in exchange for availability and reliability.
- Basically Available: the system supports partition failures (for example, a sharded database).
- Soft state: state may be out of date for a period of time and is updated asynchronously.
- Eventually consistent: data becomes consistent eventually, rather than being consistent at all times.
The BASE idea is realized mainly in two ways: 1. splitting the database by function, and 2. sharding the data.
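A minimal sketch of both ideas, splitting by business function first and then hashing keys onto shards; the database and shard names below are invented for illustration.

```python
# Sketch of splitting by function and then sharding by key (illustrative only).
import hashlib

FUNCTIONAL_DBS = {
    "users": ["users_shard_0", "users_shard_1"],                     # hypothetical shard names
    "orders": ["orders_shard_0", "orders_shard_1", "orders_shard_2"],
}

def route(function: str, key: str) -> str:
    """Pick a shard: first split by business function, then hash the key."""
    shards = FUNCTIONAL_DBS[function]
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return shards[digest % len(shards)]

print(route("users", "alice"))      # e.g. users_shard_1
print(route("orders", "order-42"))  # e.g. orders_shard_0
```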
BASE focuses on basic availability. If you want high availability and raw performance at the expense of some consistency or fault tolerance, a BASE-style solution has the edge in performance.
Other
The five-minute rule for I/O
In 1987, Jim Gray proposed the five-minute rule. In short: if a record is accessed frequently, it should be kept in memory; otherwise it should stay on disk and be fetched on demand. The tipping point is about five minutes. Although it reads like a rule of thumb, the five-minute figure came from a cost calculation: with the hardware of the time, keeping 1KB of data in memory cost about as much as accessing it on disk once every 400 seconds (close to five minutes). The rule was revisited around 1997 and shown to still hold (hard disks and memory had not made a qualitative leap), and later revisits examined the possible impact of newer hardware, namely SSDs.
With the advent of flash memory, the five-minute rule splits in two, depending on whether SSDs are used as a slow extension of the buffer pool or as a fast extension of the disk: moving small pages between memory and flash versus moving large pages between flash and disk. Twenty years after it was first proposed, the five-minute rule still holds in the flash era, but now for larger pages (64KB, a page size that reflects advances in hardware bandwidth and latency).
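The rule comes from a simple break-even calculation: keep a page in memory when the RAM it occupies costs less than the disk accesses it would otherwise cause. The sketch below uses the classic break-even-interval formula from the five-minute-rule papers, but the page size, prices, and access rates plugged in are placeholder assumptions, not historical or current figures.

```python
# Break-even interval sketch for the five-minute rule (illustrative numbers only).

def break_even_interval_seconds(pages_per_mb_ram: float,
                                accesses_per_sec_per_disk: float,
                                price_per_disk: float,
                                price_per_mb_ram: float) -> float:
    """Keep a page in memory if it is re-read more often than once per this interval."""
    return (pages_per_mb_ram / accesses_per_sec_per_disk) * (price_per_disk / price_per_mb_ram)

# Placeholder hardware figures (assumptions, not measured values):
interval = break_even_interval_seconds(
    pages_per_mb_ram=128,           # 8KB pages
    accesses_per_sec_per_disk=100,  # random IOPS of one spindle
    price_per_disk=80.0,
    price_per_mb_ram=0.005,
)
print(f"break-even interval: {interval:.0f} seconds")
```

Plugging in different page sizes and prices shows how the break-even interval drifts as hardware evolves, which is exactly why the rule has been revisited every decade or so.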
Don’t delete data
Oren Eini (also known as Ayende Rahien) advises developers to avoid soft deletes in databases, so readers might conclude that hard deletes are the reasonable alternative. In response to Ayende's article, however, Udi Dahan strongly recommends avoiding deleting data at all.
Soft-delete advocates add an IsDeleted column to the table so the data stays intact; a row is considered deleted when its IsDeleted flag is set. Ayende finds this approach "simple, easy to understand, easy to implement, easy to communicate", but "often wrong". The problem is:
Deleting a row or an entity is almost never a simple event. It affects not only the data in the model but also the shape of the model. That is why we have foreign keys: to ensure that an order line cannot exist without its parent order. And that is the simplest of cases. …
With soft deletes, whether we like it or not, data corruption creeps in easily. For example, a small adjustment that nobody notices may leave a customer's "latest order" pointing at an order that has been soft-deleted.
If developers are asked to remove data from the database and soft deletes are ruled out, hard deletes become the only option. To keep the data consistent, developers then have to delete not only the directly related rows but also cascade the delete further. But Udi Dahan reminds readers that the real world does not cascade:
Suppose the marketing department decides to remove an item from the catalogue. Does that mean all old orders containing that item disappear? Should all invoices for those orders be deleted as well? After this cascade of deletions, should the company's profit-and-loss statement be restated?
That makes no sense.
The problem seems to lie in how the word "delete" is interpreted. Dahan gives this example:
When I say "delete", I actually mean the product is "discontinued". We won't sell it any more, and once our stock is cleared we won't buy any more. Customers will no longer see the item when they search or browse the catalogue, but the warehouse staff will keep managing it for the time being. "Delete" is just a convenient shorthand.
He went on to offer some correct interpretations from the user’s perspective:
Orders are not deleted, they are “cancelled”. Cancelling orders too late can incur costs.
Employees are not deleted, they are “fired” (or perhaps retired). There are compensations to deal with.
Jobs are not deleted, they are “filled” (or the application is withdrawn).
In these examples, our focus should be on the task the user wants to accomplish, not the technical action that happens to an entity. In almost all cases, there is more than one entity to consider.
To replace the IsDeleted flag, Dahan recommends a field that represents the status of the data: active, disabled, cancelled, abandoned, and so on. Such a status field lets users look back at past data as a basis for decisions.
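A minimal sketch of that idea in code; the entity, field, and status names are invented for illustration rather than taken from Dahan's article.

```python
# Sketch of modelling domain state explicitly instead of an IsDeleted flag (illustrative only).
from dataclasses import dataclass
from enum import Enum

class OrderStatus(Enum):
    ACTIVE = "active"
    CANCELLED = "cancelled"   # "deleted" orders are really cancelled
    ABANDONED = "abandoned"

@dataclass
class Order:
    order_id: int
    status: OrderStatus = OrderStatus.ACTIVE

    def cancel(self):
        # Nothing is removed from storage; the row just changes state,
        # so past data stays available for reporting and decisions.
        self.status = OrderStatus.CANCELLED

order = Order(order_id=1001)
order.cancel()
print(order.status)  # OrderStatus.CANCELLED
```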
Besides damaging data consistency, deleting data has other negative consequences. Dahan suggests keeping all data in the database: "Don't delete it. Just don't delete it."
RAM is the new hard disk, and the hard disk is the new tape
"Memory is the new hard drive, and the hard drive is the new tape," as Jim Gray famously said. As "real-time" web applications keep emerging and more and more systems reach massive scale, how does this trend affect hardware and software?
Long before grid computing became a hot topic, Tim Bray was already discussing the advantages of RAM- and network-centric hardware architectures, which can be used to build RAM clusters far faster than disk clusters.
For random access, memory is orders of magnitude faster than disk (even the most advanced disk storage systems barely manage 1,000 seeks per second). Second, as network speeds in data centres increase, the cost of accessing memory falls further: reading another machine's memory over the network becomes cheaper than reading local disk. As I write this, Sun's Infiniband product line includes a switch with nine fully connected, non-blocking ports, each capable of 30 Gbit/s! Voltaire has even more ports; it's hard to imagine. (If you want to keep up with these ultra-high-performance networks, follow Andreas Bechtolsheim's courses at Stanford.)
Various operation times, based on a typically configured 1 GHz PC in the summer of 2001:
| Operation | Cost |
| --- | --- |
| Execute a single instruction | 1 nanosecond |
| Fetch a word from the L1 cache | 2 nanoseconds |
| Fetch a word from main memory | 10 nanoseconds |
| Fetch a contiguous word from disk | 200 nanoseconds |
| Seek to a new disk location and fetch a word | 8 milliseconds |
| Ethernet | 2 GB/s |
Tim also pointed out the truth in the second half of Jim Gray's quote: "Hard drives are unbearably slow for random access; but if you treat a hard disk like tape, it can swallow sequential data at an astonishing rate. That makes it a natural fit for logging and journaling in RAM-based applications."
Fast-forward a few years, and the hardware trends have continued for RAM and networking but stagnated for hard disks. Bill McColl writes about the emergence of massive-memory systems for parallel computing:
Memory is the new hard drive! Disk speeds grow slowly while memory chip capacity grows exponentially, and in-memory software architectures promise an order-of-magnitude performance improvement for all kinds of data-intensive applications. Small rack servers (1U, 2U) will soon hold a terabyte or more of memory, which will change the balance between memory and disk in server architectures. Hard disk will become the new tape: a sequential storage medium (sequential access is fast) rather than a random-access one (which is very slow). There are plenty of opportunities for new products that are ten or a hundred times better.
Dare Obasanjo points out what can happen if this mantra is not taken seriously, namely the trouble Twitter found itself in. On Twitter's content management, Obasanjo said: "If a design simply mirrors the problem description, you will fall into disk I/O hell when you implement it. Whether you use Ruby on Rails, Cobol on Cogs, C++, or hand-written assembly, the read and write load will kill you." In other words, push random operations into RAM and leave only sequential operations to the disk.
Tom White, a committer on Hadoop Core and a member of the Hadoop Project Management Committee, digs further into the second half of Gray's mantra, "the hard drive is the new tape". In discussing the MapReduce programming model, White points out why hard disks are still a viable storage medium for application data in tools like Hadoop:
In essence, MapReduce works by streaming data to and from hard disk: it sorts and merges data at the disk's transfer rate. A relational database, by contrast, accesses data at the disk's seek rate (the time needed to move the head to a specific location on the platter to read or write). Why emphasize this? Look at how seek time and transfer rate have evolved: seek times improve by roughly 5% per year, while transfer rates improve by roughly 20% per year. Seek time improves more slowly than transfer rate, so it pays to use a model whose performance is bounded by the transfer rate, and MapReduce is exactly that.
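The arithmetic behind this is easy to reproduce. With assumed (not measured) figures for seek time and transfer rate, reading a large dataset record by record with random seeks is orders of magnitude slower than streaming it sequentially:

```python
# Back-of-the-envelope: random seeks vs. sequential streaming (assumed disk figures).

SEEK_TIME_S = 0.008        # 8 ms per seek (assumption)
TRANSFER_RATE_BPS = 100e6  # 100 MB/s sequential transfer (assumption)

def random_read_seconds(total_bytes: float, record_bytes: float) -> float:
    """One seek per record, plus the transfer time for each record."""
    records = total_bytes / record_bytes
    return records * SEEK_TIME_S + total_bytes / TRANSFER_RATE_BPS

def sequential_read_seconds(total_bytes: float) -> float:
    return total_bytes / TRANSFER_RATE_BPS

dataset = 10e9  # 10 GB
print(f"random 1KB reads : {random_read_seconds(dataset, 1e3) / 3600:.1f} hours")
print(f"sequential scan  : {sequential_read_seconds(dataset) / 60:.1f} minutes")
```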
Whether solid-state drives (SSDs) will change the seek-time/transfer-rate ratio remains to be seen, but many commenters on White's post see SSDs as the balancing factor in the RAM/hard-disk debate.
Nati Shalom gives a well-reasoned account of the role of memory and hard disk in database deployment and use. Shalom looks at database clustering and partitioning as ways to address performance and scalability limits: "Database replication and database partitioning share the same basic problem," he says; "they both depend on file-system/disk performance, and database clustering is very complex." His proposed solution is to switch to an in-memory data grid (IMDG), backed by technologies such as Hibernate second-level caching or GigaSpaces Spring DAO, to provide persistence as a service to applications. Shalom explains that an IMDG
provides object-based database capabilities in memory, supporting core database functionality such as advanced indexing and queries, transactional semantics, and locking. An IMDG also abstracts the data topology away from the application code. In this way the database does not disappear entirely; it simply moves to the "right" place.
The advantages of an IMDG over direct RDBMS access are listed below; a minimal sketch of the in-process caching point follows the list:
- In memory, speed and concurrency are much better than file systems
- Data can be accessed by reference
- Perform data operations directly on objects in memory
- Reduce contention for data
- Parallel aggregated queries
- Local in-process caching
- Eliminates object-relational mapping (ORM)
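To illustrate just the "local in-process caching" point, here is a minimal read-through cache in front of a slower backing store. It is a generic sketch, not the API of GigaSpaces or Hibernate, and the loader function is a placeholder for a real database call.

```python
# Minimal read-through in-process cache sketch (illustrative; not a real IMDG API).

class ReadThroughCache:
    def __init__(self, loader):
        self._loader = loader  # called only on a cache miss (e.g. hits the RDBMS)
        self._cache = {}

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self._loader(key)  # slow path: load from backing store
        return self._cache[key]                   # fast path: served from memory

def load_from_database(key):  # placeholder for a real database call
    print(f"loading {key} from the backing store")
    return {"id": key}

cache = ReadThroughCache(load_from_database)
cache.get("user:1")  # miss -> loads from the store
cache.get("user:1")  # hit  -> served from memory, no disk access
```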
Whether you need to change the way you think about apps and hardware ultimately depends on what you’re trying to accomplish with them. But there seems to be a general consensus that it’s time for a change in the way developers approach performance and scalability.
Amdahl’s law and Gustafson’s law
Here, S(n) denotes the speedup an n-core system achieves on a given program, and K denotes the fraction of the computation time that is serial.
Speedup under Amdahl's law:
S(n) = serial computing time on 1 processor / parallel computing time on n processors
S(n) = 1 / (K + (1-K)/n) = n / (1 + (n-1)K)

Speedup under Gustafson's law:
S(n) = parallel workload completed by n processors / serial workload completed by 1 processor
S(n) = K + (1-K)·n

A bit dry, isn't it?
Roughly speaking, Amdahl's law fixes the total workload at 1, so n cores can only share the (1-K) portion of it; Gustafson's law fixes the workload of a single core at 1 and assumes that with n cores the parallel portion of the workload can grow to (1-K)·n.
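A small sketch comparing the two formulas for the same serial fraction K (the values chosen are arbitrary):

```python
# Amdahl vs. Gustafson speedup for a serial fraction K (illustrative values).

def amdahl_speedup(n: int, k: float) -> float:
    """Fixed total workload: only the (1-K) portion can be spread over n cores."""
    return 1.0 / (k + (1.0 - k) / n)

def gustafson_speedup(n: int, k: float) -> float:
    """Workload grows with n: each core adds (1-K) of parallel work."""
    return k + (1.0 - k) * n

K = 0.05  # 5% of the work is serial
for n in (2, 8, 64):
    print(f"n={n:3d}  Amdahl={amdahl_speedup(n, K):6.2f}  Gustafson={gustafson_speedup(n, K):6.2f}")
```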
Neither law accounts for the overhead introduced by distribution, such as networking and locking. These costs should be weighed carefully; more distribution is not automatically better.
And the added complexity of the coordination logic should be kept within a constant bound.