What is the CAP theorem?

In July 2000, Professor Eric Brewer of the University of California, Berkeley, proposed the CAP conjecture at the ACM PODC conference. Two years later, Seth Gilbert and Nancy Lynch of MIT published a formal proof, and CAP became an accepted theorem in the field of distributed computing.

The CAP theorem covers the following three properties. A distributed system can satisfy at most two of them at the same time.

Consistency (C)

All nodes see the same data at the same time

All nodes in a database cluster see the same data at the same point in time. In other words, data is synchronized across nodes in real time, so a read returns the same result no matter which node serves it.

Availability (A)

Reads and writes always succeed

Read and write operations are always successful. That is, the service is always available. Even if some nodes in the cluster fail, the cluster can still respond to read/write requests from clients.

Partition tolerance (P)

The system continues to operate despite arbitrary message loss or failure of part of the system

That is, the system keeps working even when messages between nodes are lost or part of the system fails. In practical terms, a partition is a time bound on communication: if the system cannot reach data consistency within the time limit, a partition has occurred, and the system must choose between C and A for the current operation.
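The choice described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real replication protocol: the `Node` class, its `mode` setting, and the `peer_reachable` flag (standing in for "the peer acknowledged within the time limit") are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field


class Unavailable(Exception):
    """Raised when a node refuses a request to protect consistency."""


@dataclass
class Node:
    mode: str                     # "CP" or "AP" (hypothetical setting)
    peer_reachable: bool = True   # stands in for "peer acknowledged in time"
    data: dict = field(default_factory=dict)       # this node's copy
    peer_data: dict = field(default_factory=dict)  # the peer's copy

    def write(self, key, value):
        if self.peer_reachable:
            # No partition: update both copies, so C and A both hold.
            self.data[key] = value
            self.peer_data[key] = value
            return "replicated"
        if self.mode == "CP":
            # Partition: keep the copies identical by rejecting the write.
            raise Unavailable("peer did not acknowledge in time")
        # Partition, AP mode: stay available and accept divergence for now.
        self.data[key] = value
        return "accepted locally"
```

With `peer_reachable` set to `False`, a node in "CP" mode raises `Unavailable` and leaves both copies untouched, while a node in "AP" mode accepts the write and the two copies temporarily differ.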

CAP tradeoff

1. Keep CA and abandon P

If you want to avoid partitions altogether, one way is to put all the (transaction-related) data on one machine. This does not guarantee that the system never fails, but it will not suffer the negative effects of a partition. Of course, this choice severely limits the scalability of the system.

For a distributed system, giving up P amounts to giving up distribution. Once concurrency is high, a single machine cannot bear the load.

Many banking services, for example, really do give up P and rely on a single minicomputer running Oracle to keep the service available.

2. Keep CP and abandon A

The opposite of giving up partition tolerance is giving up availability. Once a partition occurs, the affected services must wait for a certain period of time, and the system cannot serve external requests during that waiting period.

In a distributed system it is very likely that some services will be cut off by a partition at some point. If the entire service goes down because a few services fail, it is not a good distributed system at all.
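The article does not name a specific mechanism for the CP choice. One common approach, used here purely as an assumed illustration, is a majority quorum rule: the side of the partition that can still reach more than half of the nodes keeps serving, and the minority side rejects requests until the partition heals. The function names below are hypothetical.

```python
def has_quorum(reachable: int, cluster_size: int) -> bool:
    """True if this side of a partition holds a strict majority of nodes.

    `reachable` counts the nodes this side can talk to, including itself.
    """
    return reachable > cluster_size // 2


def handle_request(reachable: int, cluster_size: int) -> str:
    # A CP system under this rule serves requests only on the majority
    # side; the minority side gives up availability to stay consistent.
    if has_quorum(reachable, cluster_size):
        return "served"
    return "rejected: waiting for partition to heal"
```

In a five-node cluster split 3/2, the three-node side returns "served" and the two-node side rejects, so only part of the service becomes unavailable rather than all of it.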

3. Keep AP and abandon C

Abandoning consistency here does not mean giving up data consistency completely. It means giving up strong consistency: the guarantee that all nodes agree at every moment is dropped, while eventual consistency is retained.

Take online shopping as an example. If two orders are received for an item with only one left in stock, both may be accepted at first, and the later order is then notified that the item is sold out.

Many distributed service systems adopt this scheme to ensure availability. When some services are cut off by a partition, the system tolerates the inconsistency at first and then reaches eventual consistency through some compensating mechanism.
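The online shopping example can be sketched as a reconciliation step that runs once the nodes can communicate again. This is a minimal sketch under assumptions: the `reconcile` function and the order ids are hypothetical, and it assumes order timestamps from different nodes are comparable, which a real system would need to ensure.

```python
def reconcile(stock: int, orders: list[tuple[float, str]]) -> tuple[list[str], list[str]]:
    """Merge orders accepted by different nodes during a partition.

    orders: (timestamp, order_id) pairs collected from all nodes.
    Returns (confirmed, sold_out) lists of order ids.
    """
    confirmed, sold_out = [], []
    for _, order_id in sorted(orders):  # earliest order wins
        if stock > 0:
            stock -= 1
            confirmed.append(order_id)
        else:
            # Compensating action: this customer gets a sold-out notice.
            sold_out.append(order_id)
    return confirmed, sold_out


# Two nodes each accepted an order for the last item while partitioned.
node_a_orders = [(10.0, "order-A")]
node_b_orders = [(10.5, "order-B")]
confirmed, sold_out = reconcile(1, node_a_orders + node_b_orders)
print(confirmed, sold_out)  # ['order-A'] ['order-B']
```

Both orders succeed from the customer's point of view when they are placed, which preserves availability. Consistency is restored later, at the cost of notifying one customer after the fact.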
