Background
When we adopted APISIX, it used etcd as its storage. In Snowball's past experience, the infrastructure team rarely used etcd as a KV store; it mostly played the role of a service registry.
APISIX: why did they choose etcd as the configuration center?
For the configuration center, configuration storage is only the most basic function. APISIX also requires the following features:
- Clustering support
- Transactions
- Historical Version Management
- Change notification
- High performance
APISIX needs a configuration center, and many of the features listed above are not available in traditional relational databases or KV databases. etcd was compared with Consul, ZooKeeper, and other options; for more details, refer to etcd's own "why" documentation. Other configuration storage backends may be supported in the future.
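To make the change-notification requirement concrete, here is a minimal sketch of watching a key prefix with the official etcd Go client (`go.etcd.io/etcd/client/v3`). The endpoint and the `/conf/` prefix are placeholders for illustration, not the keys APISIX actually uses.

```go
// Watch a configuration prefix and print every change as it happens.
package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"}, // placeholder endpoint
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Watch every key under the prefix; each event carries the new value
	// and the revision at which the change happened.
	for resp := range cli.Watch(context.Background(), "/conf/", clientv3.WithPrefix()) {
		for _, ev := range resp.Events {
			log.Printf("%s %q = %q (revision %d)",
				ev.Type, ev.Kv.Key, ev.Kv.Value, ev.Kv.ModRevision)
		}
	}
}
```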
The following is based on etcd's own documentation (why.md)
About etcd
The main character of this article is etcd. The name "etcd" comes from two ideas: the Unix "/etc" folder and "d" for distributed systems. The "/etc" folder stores configuration data for a single system, while etcd stores configuration information for large-scale distributed systems. So a distributed "/etc" is "etcd".
etcd is designed as a general substrate for large distributed systems. These are systems that can never tolerate split-brain operation and are willing to sacrifice availability to achieve that end. etcd stores metadata in a consistent and fault-tolerant way, and etcd clusters are meant to provide key-value storage with strong stability, reliability, scalability, and performance.
Distributed systems use etcd as a consistent key-value store for configuration management, service discovery, and coordinating distributed work. Many organizations use etcd in production systems such as container schedulers, service discovery services, and distributed data stores. Common distributed patterns built on etcd include leader election, distributed locking, and monitoring machine liveness.
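For example, the leader-election pattern mentioned above is available directly in the Go client's `concurrency` package. A minimal sketch follows; the endpoint, election prefix, and node name are illustrative.

```go
// Leader election backed by an etcd lease: if the leader dies, the lease
// expires and another candidate wins the campaign.
package main

import (
	"context"
	"log"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:2379"}})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// A session is backed by a lease; if this process dies, the lease
	// expires and leadership is released automatically.
	sess, err := concurrency.NewSession(cli, concurrency.WithTTL(10))
	if err != nil {
		log.Fatal(err)
	}
	defer sess.Close()

	e := concurrency.NewElection(sess, "/election/demo")
	if err := e.Campaign(context.Background(), "node-1"); err != nil {
		log.Fatal(err)
	}
	log.Println("node-1 is now the leader")

	// ... do leader-only work, then step down explicitly.
	if err := e.Resign(context.Background()); err != nil {
		log.Fatal(err)
	}
}
```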
Use cases
- CoreOS Container Linux: applications running on Container Linux get automatic, zero-downtime Linux kernel updates. Container Linux coordinates updates with Locksmith, which implements a distributed semaphore on etcd to ensure that only a subset of the cluster is rebooting at any given time.
- Kubernetes stores configuration data in etcd for service discovery and cluster management; etcd's consistency is critical for correctly orchestrating containers. The Kubernetes API server persists cluster state to etcd and uses etcd's watch API to monitor the cluster and roll out critical configuration changes.
Multi-dimensional comparison
etcd may already look like a good fit, but as with all technology selection we need to proceed with caution. Although an objective comparison of technologies and features would be ideal, the author's expertise and bias are clearly skewed toward etcd (the comparisons and documentation were written by the etcd authors).
The table below is a quick, at-a-glance reference to the differences between ETCD and its most popular alternatives. Further instructions and details for each column are provided in the sections that follow the table.
| | etcd | ZooKeeper | Consul | NewSQL (Cloud Spanner, CockroachDB, TiDB) |
|---|---|---|---|---|
| Concurrency Primitives | Lock RPCs, election RPCs, command-line locks, command-line elections, recipes in Go | External Curator recipes in Java | Native lock API | Rare, if any |
| Linearizable Reads | Yes | No | Yes | Sometimes |
| Multi-version Concurrency Control | Yes | No | No | Sometimes |
| Transactions | Field compares, read, write | Version checks, write | Field compare, lock, read, write | SQL-style |
| Change Notification | Historical and current key intervals | Current keys and directories | Current keys and prefixes | Triggers (sometimes) |
| User Permissions | Role-based | ACLs | ACLs | Varies (per-table GRANT, per-database roles) |
| HTTP/JSON API | Yes | No | Yes | Rarely |
| Membership Reconfiguration | Yes | >= 3.5.0 | Yes | Yes |
| Maximum reliable database size | Several gigabytes | Hundreds of megabytes (sometimes several gigabytes) | Hundreds of megabytes | Terabytes+ |
| Minimum read linearization latency | Network RTT | No read linearization | RTT + fsync | Clock barriers (atomic, NTP) |
### ZooKeeper
ZooKeeper addresses the same problems as etcd: distributed system coordination and metadata storage. However, etcd stands on the shoulders of its predecessor, drawing on ZooKeeper's design and implementation experience. The lessons learned from ZooKeeper undoubtedly underpin etcd's design and its ability to support large systems such as Kubernetes. etcd's improvements over ZooKeeper include:
- Dynamic reconfiguration of cluster membership
- Stable reads and writes under high load
- A multi-version concurrency control data model
- Reliable key-value watches
- Lease primitives that decouple connections from sessions
- APIs for distributed shared locks
In addition, etcd supports multiple languages and frameworks out of the box. ZooKeeper has its own custom Jute RPC protocol, which is completely unique to ZooKeeper and limits the language bindings it supports, while etcd's client protocol is built on gRPC, a popular RPC framework with support for Go, C++, Java, and other languages. Likewise, gRPC can be serialized to JSON over HTTP, so even general-purpose command-line tools such as curl can talk to etcd. Because systems can choose from many options, they are built on etcd with native tooling rather than around etcd with a single fixed set of technologies.
Considering functionality, support, and stability, etcd is a better fit for consistent key-value storage than ZooKeeper.
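To make the lease primitive from the list above concrete, here is a small sketch: a key is attached to a lease and survives only as long as the lease keeps being renewed, independent of any single client connection. The endpoint, key, and value are placeholders.

```go
// Bind a key to a lease and keep the lease alive; if this process dies,
// the key expires after the TTL and watchers are notified.
package main

import (
	"context"
	"log"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:2379"}})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx := context.Background()

	// Grant a 10-second lease and attach a key to it.
	lease, err := cli.Grant(ctx, 10)
	if err != nil {
		log.Fatal(err)
	}
	if _, err := cli.Put(ctx, "/registry/service-a", "10.0.0.1:8080",
		clientv3.WithLease(lease.ID)); err != nil {
		log.Fatal(err)
	}

	// Renew the lease in the background for as long as the process runs.
	ch, err := cli.KeepAlive(ctx, lease.ID)
	if err != nil {
		log.Fatal(err)
	}
	for range ch {
		// Each message is a successful keep-alive heartbeat.
	}
}
```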
### Consul
Consul is an end-to-end service discovery framework. It provides built-in health checking, failure detection, and DNS services. Consul also exposes a key-value store through a RESTful HTTP API. As of Consul 1.0, the storage system does not scale as well in key-value operations as systems like etcd or ZooKeeper; systems with millions of keys will suffer from high latency and memory pressure. Most notably, Consul lacks multi-version keys, conditional transactions, and reliable streaming watches.
etcd and Consul address different problems. If you are looking for a distributed, consistent key-value store, etcd is a better choice than Consul. If you are looking for end-to-end cluster service discovery, etcd does not have enough features; choose Kubernetes, Consul, or SmartStack instead.
### NewSQL (Cloud Spanner, CockroachDB, TiDB)
etcd and NewSQL databases (e.g. CockroachDB, TiDB, Google Spanner) all offer strong data consistency guarantees with high availability. However, their different system designs lead to significantly different client APIs and performance characteristics.
NewSQL databases are designed to scale horizontally across data centers. These systems typically partition data across multiple consistent replication groups (shards), potentially far apart, and store data sets at the terabyte level and above. This kind of scaling makes them poor candidates for distributed coordination: they have high latencies and expect updates with mostly localized dependency graphs. NewSQL data is organized into tables, with SQL-style query capabilities whose semantics are richer than etcd's, but at the cost of additional complexity in processing and optimizing queries.
In short, choose etcd for storing metadata or for coordinating distributed applications. If you need to store more than a few gigabytes of data or need full SQL queries, choose a NewSQL database.
Using etcd to store metadata and configuration data
etcd replicates all data within a single replication group. This is the most efficient way to store up to a few gigabytes of data in a consistent order. Each change to the cluster state, which may modify multiple keys, is assigned a globally unique ID, called a revision in etcd, from a monotonically increasing counter so that changes can be ordered. Since there is only one replication group, a modification request only needs to go through the Raft protocol to commit. By limiting consensus to one replication group, etcd achieves distributed consistency with a simple protocol while delivering low latency and high throughput.
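A small sketch of what this revision counter looks like from the Go client's point of view (endpoint and keys are placeholders): every write returns the new cluster-wide revision in its response header, and thanks to MVCC a read can even be pinned to an older revision.

```go
// Observe the monotonically increasing revision across writes and read a
// key as of an earlier revision.
package main

import (
	"context"
	"log"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:2379"}})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx := context.Background()

	// Every write returns the new cluster-wide revision in its header.
	r1, err := cli.Put(ctx, "/meta/a", "1")
	if err != nil {
		log.Fatal(err)
	}
	r2, err := cli.Put(ctx, "/meta/b", "2")
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("revision after first put:  %d", r1.Header.Revision)
	log.Printf("revision after second put: %d", r2.Header.Revision) // strictly greater

	// MVCC lets a read be pinned to an older revision.
	resp, err := cli.Get(ctx, "/meta/a", clientv3.WithRev(r1.Header.Revision))
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("/meta/a at revision %d = %s", r1.Header.Revision, resp.Kvs[0].Value)
}
```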
etcd's replication cannot scale horizontally because it lacks data sharding. In contrast, NewSQL databases typically shard data across multiple consistent replication groups and store data sets at the terabyte level and above. However, to assign a globally unique and increasing ID to each change, every request must go through an additional coordination protocol among replication groups. This extra coordination step may conflict on the global ID, forcing ordered requests to be retried. The result is that, for strict ordering, the NewSQL approach is usually more complex and performs worse than etcd.
Choose etcd if your application is primarily driven by metadata or by the ordering of metadata (for example, coordinating processes). If your application needs a large data store spanning multiple data centers and does not depend heavily on a strong global ordering, choose a NewSQL database.
Using etcd as a distributed coordination component
etcd offers distributed coordination primitives such as event watches, leases, elections, and distributed locks out of the box. These primitives are maintained and supported by the etcd developers; leaving them to external libraries pushes the work of developing foundational distributed software onto users and essentially leaves the system incomplete. NewSQL databases usually expect these distributed coordination primitives to be written by third parties, and ZooKeeper likewise relies on a separate coordination library. Consul, which provides a native lock API, even apologizes that it is "not a bulletproof method" (after one client releases a lock, other clients cannot acquire it immediately, due to the lock-delay setting).
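As a concrete example, the distributed lock mentioned above comes with the Go client's `concurrency` package. A minimal sketch with an illustrative lock prefix and endpoint:

```go
// Acquire and release an etcd-backed distributed lock.
package main

import (
	"context"
	"log"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:2379"}})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// The session's lease guarantees the lock is released even if this
	// process crashes while holding it.
	sess, err := concurrency.NewSession(cli)
	if err != nil {
		log.Fatal(err)
	}
	defer sess.Close()

	m := concurrency.NewMutex(sess, "/locks/demo")
	if err := m.Lock(context.Background()); err != nil {
		log.Fatal(err)
	}
	log.Println("lock acquired")
	// ... critical section ...
	if err := m.Unlock(context.Background()); err != nil {
		log.Fatal(err)
	}
}
```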
In theory, these primitives can be built on any storage system that provides strong consistency, but the algorithms tend to be subtle: it is easy to develop a locking algorithm that appears to work, only to break down because of boundary conditions and timing skew. In addition, other primitives supported by etcd, such as transactional storage, depend on etcd's MVCC data model; strong consistency alone is not enough.
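For example, an optimistic update built on etcd's MVCC model can compare a key's mod revision before writing, so the transaction commits only if nobody changed the key in between. A sketch with a placeholder key and endpoint:

```go
// Compare-and-swap on a key using its mod revision (optimistic concurrency).
package main

import (
	"context"
	"log"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:2379"}})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx := context.Background()
	get, err := cli.Get(ctx, "/conf/feature-flag")
	if err != nil || len(get.Kvs) == 0 {
		log.Fatal("key missing or read failed")
	}
	rev := get.Kvs[0].ModRevision

	// Commit the write only if the key has not been modified since we read it.
	txn, err := cli.Txn(ctx).
		If(clientv3.Compare(clientv3.ModRevision("/conf/feature-flag"), "=", rev)).
		Then(clientv3.OpPut("/conf/feature-flag", "on")).
		Commit()
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("committed: %v", txn.Succeeded)
}
```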
For distributed coordination, choosing etcd can help avoid operational headaches and reduce the workload.
Takeaways
When building on APISIX, we were quite concerned about etcd version compatibility: APISIX's tight coupling to a specific etcd version adds extra work to every subsequent upgrade. This is a problem introduced by etcd, but how does APISIX, on the user side, work around it? Another open point is the one noted in the Consul section above: if you are looking for end-to-end cluster service discovery, etcd alone does not have sufficient functionality.
And how should this be designed in a multi-datacenter, active-active architecture?