Introduction
With the growing popularity of cloud native technology, running stateless applications on Kubernetes has become very mature, and such applications scale smoothly. For stateful applications, whose data needs persistent storage, there is still a lot of room for improvement and many challenges to face.
Challenges of cloud native storage
The figure above is from a CNCF survey on "challenges encountered in using/deploying containers." According to the report, the challenges of cloud native storage can be summarized in the following aspects:
Ease of use: storage services are complex to deploy and operate, are not cloud native enough, and lack integration with mainstream orchestration platforms.
High performance: a large number of applications are I/O intensive, demanding high IOPS and low latency.
High availability: cloud native storage is already used in production environments, so it must be highly reliable and highly available, with no single point of failure.
Agility: rapid PV creation and destruction, smooth expansion and contraction, fast PV migration along with Pod migration, and so on.
Common cloud native storage solutions
Rook-Ceph: Rook-Ceph is an Operator that provides Ceph cluster management capabilities, carrying out its responsibilities with the container management, scheduling, and orchestration capabilities of the underlying cloud native platform.
OpenEBS: The OpenEBS storage controller itself runs in a container. OpenEBS Volume consists of one or more containers that operate as microservices.
Advantages
1. Integration with cloud native orchestration systems, with good container data volume access capability.
2. Fully open source, with relatively active communities, plenty of online resources and documentation, and an easy start.
Disadvantages
Rook-Ceph shortcomings:
Poor performance: I/O performance, throughput, and latency are all weak, making it hard to use in high-performance service scenarios.
High maintenance cost: although deployment and getting started are simple, it has many components and a complex architecture, and troubleshooting is difficult. Once a problem occurs in operation it is very hard to resolve, and a strong technical team is needed as a safeguard.
OpenEBS-hostpath shortcomings: no high availability; it is a single point of failure.
OpenEBS-zfs-localpv shortcomings: ZFS is installed on the disk and volumes are created on top of ZFS; it likewise has no high availability.
Therefore, these solutions are mostly used in internal enterprise test environments and are rarely used to persist critical application data in production.
Why NeonIO is a good fit for cloud native storage
NeonIO profile
NeonIO is an enterprise-class distributed block storage system that supports containerized deployment. It provides Dynamic Provisioning of Persistent Volumes on the Kubernetes platform and supports Clone, Snapshot, Restore, Resize, and other functions. NeonIO looks like this:
NeonIO includes the following service components:
ZK/ETCD: provides cluster discovery, distributed coordination, and leader election services
MySQL: provides metadata storage services, such as PV storage volume metadata
Center: provides logical management services, such as creating PV volumes and snapshots
Monitor: provides monitoring services and exposes collected monitoring metrics to Prometheus
Store: the storage service that handles application I/O
Portal: provides UI services
CSI: provides CSI-standard I/O access services
Here’s why NeonIO is a good fit for cloud native storage:
Ease of use
1. Component containerization: the service components, CSI, and Portal all run as containers.
2. CSI support: provides standard I/O access capability and can create PVs both statically and dynamically (see the StorageClass/PVC sketch after this list).
3. UI interface for convenient operation and maintenance:
Storage O&M interface with visual management of alarms and monitoring;
Performance monitoring at PV granularity, such as IOPS and throughput, to quickly locate hotspot PVs;
QoS at PV granularity to guarantee quality of service for high-priority users.
4. Deep integration with the cloud native ecosystem:
Prometheus is supported: NeonIO's collected metrics are exposed to Prometheus through a ServiceMonitor and displayed graphically in Grafana (a ServiceMonitor sketch follows this list).
Prometheus can also be connected to the UI to display other cloud native monitoring metrics, such as Node-Exporter's disk I/O load and bandwidth.
Platform-based operation and maintenance: storage expansion, upgrades, and disaster recovery operations need only a few Kubernetes commands, without requiring much storage-specific O&M knowledge.
Service discovery and distributed coordination are backed by ETCD; metadata management uses CRDs.
5. One-click deployment: helm install neonio ./neonio --namespace kube-system
6. Simpler and more flexible deployment than Rook-Ceph.
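As a concrete illustration of item 2, dynamic provisioning through a CSI driver is normally wired up with a StorageClass and a PVC. The sketch below is a minimal example; the provisioner name and the replicas parameter are assumptions for illustration, not NeonIO's documented values:

```yaml
# Minimal sketch of dynamic provisioning via CSI.
# The provisioner name "csi.neonio.io" and the "replicas" parameter
# are assumptions for illustration, not documented NeonIO values.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: neonio-rwo
provisioner: csi.neonio.io       # hypothetical CSI driver name
parameters:
  replicas: "3"                  # hypothetical per-volume copy count
reclaimPolicy: Delete
allowVolumeExpansion: true       # needed for PV Resize
---
# A PVC that requests a volume from the StorageClass above;
# the CSI driver then creates the PV dynamically.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: neonio-rwo
  resources:
    requests:
      storage: 100Gi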
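For item 4, the usual way to expose a component's metrics to a prometheus-operator deployment is a ServiceMonitor. This is a minimal sketch; the label selector and port name are assumptions for illustration:

```yaml
# Sketch of exposing NeonIO metrics to Prometheus via a ServiceMonitor
# (requires prometheus-operator). The label selector and port name
# are assumptions for illustration.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: neonio-monitor
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: neonio-monitor        # hypothetical label on the Monitor service
  endpoints:
    - port: metrics              # hypothetical metrics port name
      interval: 30s
```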
High performance
A single PV delivers 100K IOPS with sub-millisecond latency.
1. All-flash distributed storage architecture
I/O performance increases linearly with the number of nodes in the cluster.
Storage media: NVMe SSDs are supported.
RDMA support: nodes are interconnected with high-speed RDMA technology.
2. Extremely short I/O path: the file system is discarded and a self-developed metadata management system keeps the I/O path extremely short.
3. Uses HostNetwork mode
Benefits:
The Store and CSI Pods use HostNetwork and access the physical network directly, reducing network layers (see the sketch after this list);
The management network, front-end network, and data synchronization network are separated to avoid network contention.
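A minimal sketch of what running the Store component with the host's network stack can look like; the workload name, labels, and image are placeholders, not NeonIO's actual manifests:

```yaml
# Sketch of running the Store component with the host's network stack.
# Names and image are placeholders, not NeonIO's actual manifests.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: neonio-store
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: neonio-store
  template:
    metadata:
      labels:
        app: neonio-store
    spec:
      hostNetwork: true                   # use the physical network directly
      dnsPolicy: ClusterFirstWithHostNet  # keep cluster DNS working with hostNetwork
      containers:
        - name: store
          image: neonio/store:latest      # placeholder image
```

Setting dnsPolicy to ClusterFirstWithHostNet keeps cluster DNS resolution working for a Pod that uses the host network.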
High availability
1. Reliability and availability of service components
Management services run with 3 Pod replicas by default; the replica count is configurable, and 3 or 5 replicas are recommended.
Probes detect whether a Pod's service is alive and available: when the service is found unavailable, the component is taken out of service, and when the Pod dies, it is restarted to restore the service (see the probe sketch after this list).
2. Data reliability and availability
A Volume is fragmented into Shards
Each Shard independently selects its storage location
The three copies of each Shard are stored on different physical nodes
Writes go to all three copies synchronously, with strong consistency
Reads are served only from the primary copy
The number of copies can be set per volume
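A minimal sketch of a management component running with three replicas plus liveness and readiness probes, as described above; the component name, image, and probe endpoint are assumptions:

```yaml
# Sketch of a 3-replica management service with liveness/readiness probes.
# Component name, image, and probe path/port are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: neonio-center
  namespace: kube-system
spec:
  replicas: 3                            # configurable; 3 or 5 recommended
  selector:
    matchLabels:
      app: neonio-center
  template:
    metadata:
      labels:
        app: neonio-center
    spec:
      containers:
        - name: center
          image: neonio/center:latest    # placeholder image
          readinessProbe:                # take the Pod out of service when not ready
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 10
          livenessProbe:                 # restart the container when it stops responding
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 10
```

The readiness probe removes an unhealthy Pod from the Service endpoints, while the liveness probe restarts a Pod whose service has died, matching the behavior described above.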
Agility
Pod cross-node rebuild efficiency: mounting/unmounting 2000 PVs takes 16 seconds.
Batch PV creation capability: 2000 PVs created in 5 minutes.
NeonIO performance
Test environment: NeonIO hyper-converged all-in-one cluster (3 nodes, 192.168.101.174-192.168.101.176).
Note: all tests use NVMe SSDs. Volume size = 1 TiB. Performance tool: github.com/leeliu/dben… .
The yellow bars are NeonIO. The first chart shows IOPS and the second shows latency in milliseconds. NeonIO has a clear advantage in both IOPS and latency, whether with a single copy or three copies.
NeonIO application scenarios
DevOps scenario: rapid batch creation and destruction of PVs (2000 PVs in five minutes).
Database scenario: web back-end databases such as MySQL get stable persistent storage with high IOPS and low latency.
Big data analysis scenario: provides large capacity (a PV can be expanded up to 100 TB).