This is the 16th day of my participation in the August More Text Challenge. For details, see: August More Text Challenge

  • 📢 Likes 👍, bookmarks ⭐ and comments 📝 are welcome. If you spot a mistake, please point it out. Give a rose, and the fragrance lingers on your hand!
  • 📢 This article is written by Webmote.
  • 📢 Author's motto: life is about tinkering. When you stop tinkering with life, life starts tinkering with you. Let's keep at it together! 💪 💪 💪

Preface 🎏

“If there’s an off-the-shelf middleware or open source library, use it. Don’t reinvent the wheel,” a former colleague once said.

Using something ready-made is easy. But when nothing is available, or what is available does not fit well enough, we need to know how it is built, and sometimes build a similar component ourselves.

So it pays to learn the ideas and principles behind other people's work. When you run into a problem, that knowledge broadens your thinking and may help you solve the problem at hand properly.

🎏 01. Origin of Quorum?

In 1979, David K. Gifford published a paper called Weighted Voting for Replicated Data, describing an algorithm, known as Quorum, for keeping replicated copies consistent in a distributed system.

In distributed systems, data is replicated across multiple nodes to keep it available: several nodes hold the same data, so the failure of part of the system stays hidden.

🎏 02. Quorum’s logic?

  • Each node that owns a copy of the file has one vote, and the total number of votes is n
  • Each read must collect r votes and each write must collect w votes, with r + w > n
  • Because r + w > n, every read quorum overlaps every write quorum in at least one node, so every read sees at least one copy of the latest written value
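
The overlap claim can be checked by brute force on a small cluster. The following is a minimal sketch, assuming one vote per node; the function name `quorums_always_intersect` is made up for illustration. It enumerates every possible read set of size r and write set of size w and tests whether they always share a node.

```python
from itertools import combinations


def quorums_always_intersect(n: int, r: int, w: int) -> bool:
    """Return True if every r-node read set shares a node with every w-node write set."""
    nodes = range(n)
    return all(
        set(read_q) & set(write_q)          # non-empty intersection is truthy
        for read_q in combinations(nodes, r)
        for write_q in combinations(nodes, w)
    )


print(quorums_always_intersect(4, 2, 3))  # r + w = 5 > 4  -> True
print(quorums_always_intersect(4, 2, 2))  # r + w = 4      -> False
```

With r + w = n it is possible to pick a read set and a write set that are disjoint, which is exactly the case where a read could miss the latest write.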

If we set r = 1 and w = n, the cluster runs in WARO (Write All Read One) mode, where every node holds a full, up-to-date copy. This mode is not highly available for writes: as soon as a single node fails, no write can collect n votes, and the cluster can only serve reads.

A more balanced configuration is 1 < r < w < n, for example r = 2 and w = 3 on a cluster of n = 4 nodes. Reads need fewer votes than writes here because most applications read far more often than they write.
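
To compare the two configurations, here is a small sketch that counts how many node failures each operation can survive. It assumes one vote per node, so a read still succeeds while r nodes are up and a write while w nodes are up; `tolerated_failures` is a hypothetical helper name.

```python
def tolerated_failures(n: int, r: int, w: int) -> dict:
    """Number of failed nodes that reads and writes can each survive."""
    if r + w <= n:
        raise ValueError("quorum condition r + w > n is violated")
    return {"read": n - r, "write": n - w}


print(tolerated_failures(4, 1, 4))  # WARO: {'read': 3, 'write': 0}
print(tolerated_failures(4, 2, 3))  # {'read': 2, 'write': 1}
```

WARO tolerates no failure at all on the write path, while r = 2, w = 3 gives up some read tolerance in exchange for surviving one failed node during writes.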

🎏 03. General process for Quorum

This introduces the concept of a timestamp, which records the version number of the data and which each node maintains itself. A read or write is considered successful as soon as it collects the required number of votes. The nodes do not need to synchronize with each other internally.

  1. How to read the latest data? Reading at most r copies is enough to obtain the latest data, provided that the timestamp of the most recent successful commit is known.

  2. How do you determine whether the data with the latest timestamp was successfully committed? Continue reading the other copies until that timestamp has appeared w times.
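
The two steps above can be sketched as follows. This is a simplified in-memory model, not a networked implementation: each replica is a hypothetical `(timestamp, value)` tuple, and `quorum_read` is an illustrative name. The function reads r copies, takes the newest timestamp it saw, and keeps reading further copies until that timestamp has appeared w times.

```python
from typing import List, Tuple

Replica = Tuple[int, str]  # (timestamp, value) as stored on one node


def quorum_read(replicas: List[Replica], r: int, w: int) -> Tuple[Replica, bool]:
    """Return the newest copy among the first r reads and whether its commit is confirmed."""
    seen = replicas[:r]
    newest = max(seen, key=lambda rep: rep[0])
    count = sum(1 for rep in seen if rep[0] == newest[0])
    # Step 2: keep reading other copies until the timestamp shows up w times.
    for rep in replicas[r:]:
        if count >= w:
            break
        if rep[0] == newest[0]:
            count += 1
    return newest, count >= w


cluster = [(2, "b"), (1, "a"), (2, "b"), (2, "b"), (2, "b")]
print(quorum_read(cluster, r=2, w=4))
```

When the flag comes back `False`, the newest timestamp belongs to a write that never reached w nodes. How to recover from that case (for example, by falling back to an older timestamp) is not covered by the description above and is left out of the sketch.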

🎏 04. Quorum optimization

The general process above works fine in most cases. But when the data objects are very large, reading full copies from several nodes becomes expensive. In that case, a hash of the data can be introduced:

  • Store the hash along with the data on each server
  • During a read, return only the hash and metadata, not the data itself
  • Vote on the hash
  • Once the correct hash is determined, fetch the data object from a single server
  • Compute the hash of the fetched data to verify its integrity
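
A minimal sketch of this optimization is shown below. The `Server` class and its methods are hypothetical stand-ins for real storage nodes, SHA-256 is an arbitrary choice of hash, and the "vote" is resolved by taking the hash that carries the newest timestamp, which is one possible reading of the steps above rather than the only one.

```python
import hashlib
from typing import List, Tuple


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class Server:
    """Hypothetical storage node that keeps a hash next to the data."""

    def __init__(self, timestamp: int, data: bytes):
        self.timestamp = timestamp
        self.data = data
        self.hash = digest(data)  # stored along with the data

    def read_metadata(self) -> Tuple[int, str]:
        return self.timestamp, self.hash  # cheap: no payload transferred

    def read_data(self) -> bytes:
        return self.data


def hashed_quorum_read(servers: List[Server], r: int) -> bytes:
    # Collect only metadata from r servers, then pick the newest entry.
    metadata = [(s, s.read_metadata()) for s in servers[:r]]
    source, (_, winning_hash) = max(metadata, key=lambda item: item[1][0])
    # Fetch the full object from a single server and verify it.
    data = source.read_data()
    if digest(data) != winning_hash:
        raise ValueError("data does not match the voted hash")
    return data


servers = [Server(1, b"old"), Server(2, b"new"), Server(2, b"new")]
print(hashed_quorum_read(servers, r=2))  # b'new'
```

Only one full copy of the object crosses the network; the other r - 1 reads carry just a timestamp and a short hash.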

🎏 05. System analysis

Read performance: the average response time of the nodes that take part in the read. Write performance: the average response time of the nodes that must take part in the write.

Although a cluster using the Quorum protocol is not strongly consistent internally (not every node receives every write), it retains high availability. Viewed as a black box from the outside, it provides both strong consistency and high availability.

🎏 06. Summary

The usual wrap-up, to be read with a rational eye!

What is a wrap-up? It is the loneliness of hoping for your like and not getting it. 😳 😳 😳

👓 You've read this far, so how about a like?

👓 You've already liked it, so how about a bookmark?

👓 You've already bookmarked it, so how about a comment?