Preface

CSI Snapshot is a storage feature developed by Huawei in the Kubernetes community; it entered Alpha in Kubernetes 1.12. Across two articles, we cover the snapshot create/delete API and restoring data volumes from a snapshot, using the CSI HostPath plug-in to demonstrate both features.

Kubernetes CSI Snapshot

Background

Many storage systems can create snapshots of storage volumes to protect against data loss. Snapshots can replace traditional backup systems for backing up and restoring primary and critical data. They back up data quickly (for example, creating a GCE PD snapshot takes a fraction of a second) and offer a fast recovery time objective (RTO) and recovery point objective (RPO). Snapshots can also be used for data replication, distribution, and migration. As early as Kubernetes 1.8, a prototype of the volume snapshot system was released; its implementation lives in the external-storage repository (github.com/kubernetes-…). The prototype is based on CRDs and provides two binaries, an external controller and an external provisioner, supporting storage volumes such as GCE PD, AWS EBS, OpenStack Cinder, GlusterFS, and Kubernetes hostPath.

This proposal adds snapshot support for CSI storage plug-ins, which is where Kubernetes community storage is heading. Because the trend in Kubernetes is to keep the core API as small as possible, we adopted a CRD implementation and added an external snapshot controller to handle volume snapshots. The external-provisioner will also be upgraded to support creating volumes from snapshots. For details of the CSI snapshot specification, see https://github.com/container-storage-interface/spec/pull/224.

Goals

For the first version of snapshot support in Kubernetes, we support only on-demand snapshot creation for CSI volume plug-ins.

  • Goal 1: Standardize snapshot operations and support REST APIs for creating, listing, and deleting snapshots. Currently, the API is implemented with CRDs (CustomResourceDefinitions).
  • Goal 2: Support CSI volume snapshots. The external-snapshotter will work together with the other external components of the CSI volume plug-in (for example, external-attacher and external-provisioner).
  • Goal 3: Provide a convenient way to create new storage volumes from snapshots and to restore existing volumes.

The following objectives will not be achieved at this stage but will be considered at a later stage.

  • Goal 4: Provide application-consistent snapshots by offering pre/post snapshot hooks that freeze/unfreeze applications and/or unmount/mount file systems.
  • Goal 5: Provide higher-level management, such as backing up and restoring Pods and StatefulSets, and creating consistent snapshot groups.

Detailed design

In this proposal, volume snapshots are treated as another storage resource managed by Kubernetes. As a result, the snapshot API and controller follow existing volume management design patterns.

  • VolumeSnapshot
  • VolumeSnapshotContent
  • VolumeSnapshotClass

These three APIs are similar in structure to PersistentVolumeClaim, PersistentVolume, and StorageClass, respectively. The external snapshot controller is similar to the in-tree PV controller. The proposal also recommends adding a new data source structure to the PersistentVolumeClaim (PVC) API to support restoring data volumes from snapshots. The following sections detail the API and controller design.
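To make the proposed PVC data source more concrete, here is a minimal sketch that builds a PVC manifest as a Python dictionary. The dataSource layout (name, kind, apiGroup) follows the Kubernetes PVC API as we understand it; the helper name pvc_from_snapshot and the object names passed to it are hypothetical and chosen only for illustration.

```python
import json


def pvc_from_snapshot(pvc_name, snapshot_name, storage_class, size):
    """Build a PVC manifest whose dataSource points at a VolumeSnapshot.

    All names passed in are illustrative; nothing here talks to a cluster.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": pvc_name},
        "spec": {
            "storageClassName": storage_class,
            # The data source tells the provisioner to pre-populate the new
            # volume from the referenced snapshot.
            "dataSource": {
                "name": snapshot_name,
                "kind": "VolumeSnapshot",
                "apiGroup": "snapshot.storage.k8s.io",
            },
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    manifest = pvc_from_snapshot(
        "restored-pvc", "demo-snapshot", "csi-hostpath-sc", "1Gi"
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl, provided that a snapshot and a storage class with those names actually exist in the cluster.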

Snapshot API design

The VolumeSnapshot and VolumeSnapshotContent APIs are modeled after PersistentVolumeClaim and PersistentVolume. In the first version, the VolumeSnapshot lifecycle is completely independent of its source (the PVC). When the PVC/PV is deleted, the corresponding VolumeSnapshot and VolumeSnapshotContent objects continue to exist. For some volume plug-ins, however, a snapshot depends on its storage volume. In future releases, we plan to implement full lifecycle management to better handle the relationship between snapshots and their volumes (for example, by adding a finalizer that prevents a storage volume from being deleted while snapshots depend on it).

VolumeSnapshot object

VolumeSnapshotContent object

VolumeSnapshotClass object

We will add a new API object, VolumeSnapshotClass, rather than reuse the existing StorageClass, to avoid mixing snapshot parameters with volume parameters. Each CSI volume plug-in can have its own default VolumeSnapshotClass. If a VolumeSnapshot does not specify a VolumeSnapshotClass, the default class is used. VolumeSnapshotClass carries the additional parameters used when taking a snapshot.
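The defaulting rule can be sketched as follows. This is a simplified model under stated assumptions, not the controller's actual code: the SnapshotClass type, its is_default flag, and the resolve_class helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SnapshotClass:
    name: str
    snapshotter: str                 # CSI volume plug-in this class belongs to
    is_default: bool = False         # hypothetical marker for the default class
    parameters: Dict[str, str] = field(default_factory=dict)


def resolve_class(requested: Optional[str], snapshotter: str,
                  classes: List[SnapshotClass]) -> SnapshotClass:
    """Return the class a snapshot should use.

    An explicitly requested class wins; otherwise fall back to the single
    default class registered for the given plug-in.
    """
    if requested is not None:
        for cls in classes:
            if cls.name == requested:
                return cls
        raise LookupError(f"snapshot class {requested!r} not found")

    defaults = [c for c in classes
                if c.is_default and c.snapshotter == snapshotter]
    if len(defaults) != 1:
        raise LookupError(
            f"expected exactly one default class for {snapshotter!r}, "
            f"found {len(defaults)}"
        )
    return defaults[0]


if __name__ == "__main__":
    classes = [
        SnapshotClass("hostpath-default", "csi-hostpath", is_default=True),
        SnapshotClass("fast", "csi-hostpath", parameters={"tier": "ssd"}),
    ]
    print(resolve_class(None, "csi-hostpath", classes).name)
    print(resolve_class("fast", "csi-hostpath", classes).parameters)
```

Treating zero or multiple defaults as an error is a design choice of this sketch; it keeps the fallback unambiguous for each plug-in.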

Snapshot controller design essentials

As shown in the following figure, the CSI snapshot controller architecture is built around the external-snapshotter, which communicates with the out-of-tree CSI volume plug-in through a socket (the default is /run/csi/socket and can be configured with -csi-address). The external-snapshotter is part of the Kubernetes implementation of the Container Storage Interface (CSI). It is an external controller that watches VolumeSnapshot and VolumeSnapshotContent objects and creates/deletes snapshots.

  • The external-snapshotter first calls ControllerGetCapabilities to verify that the CSI driver supports the CREATE_DELETE_SNAPSHOT capability.
  • The external-snapshotter creates/deletes snapshots and binds VolumeSnapshot and VolumeSnapshotContent objects. It follows the Kubernetes controller pattern and uses informers to watch VolumeSnapshot and VolumeSnapshotContent for create/update/delete events. It handles only VolumeSnapshot instances with Snapshotter == <CSI volume plug-in name> and processes these events using a work queue with exponential backoff.
  • A dynamically created snapshot should be associated with a VolumeSnapshotClass. You can explicitly specify the VolumeSnapshotClass in the VolumeSnapshot API object. If no VolumeSnapshotClass is specified, the default VolumeSnapshotClass created by the admin is used. This is similar to how the default StorageClass is applied to a PersistentVolumeClaim.
  • For statically bound snapshots, the user/admin must correctly specify the bidirectional references between VolumeSnapshot and VolumeSnapshotContent so that the controller knows how to bind them. Otherwise, if the VolumeSnapshot points to a nonexistent VolumeSnapshotContent, or if the VolumeSnapshotContent does not point back to the VolumeSnapshot, the VolumeSnapshot status is set to error.
  • The external-snapshotter runs as a sidecar alongside external-attacher and external-provisioner for each CSI volume plug-in.
  • In the current design, when the storage system fails to create a snapshot, the controller does not retry. When the time at which a snapshot is taken matters, as with a consistent snapshot or a scheduled snapshot, users may not want a retry. In future versions, a maxRetries flag or a retry termination timestamp will be added to let the user control whether a retry is needed.
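Three of the rules above (filtering by plug-in name, checking bidirectional references for static binding, and exponential backoff on the work queue) can be sketched in a few lines. This is a simplified model and not the external-snapshotter's code: the Snapshot and Content types, their field names, and the base and cap delays are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Snapshot:
    name: str
    snapshotter: str                     # simplified: plug-in name kept on the snapshot
    content_name: Optional[str] = None   # reference to a Content object
    error: Optional[str] = None          # stands in for the error status


@dataclass
class Content:
    name: str
    snapshot_name: Optional[str] = None  # back reference to the Snapshot


def handles(snapshot: Snapshot, driver_name: str) -> bool:
    """Only snapshots addressed to this plug-in are processed."""
    return snapshot.snapshotter == driver_name


def check_static_binding(snapshot: Snapshot,
                         contents: Dict[str, Content]) -> bool:
    """Bind only if both objects reference each other; otherwise set an error."""
    content = contents.get(snapshot.content_name)
    if content is None:
        snapshot.error = f"content {snapshot.content_name!r} does not exist"
        return False
    if content.snapshot_name != snapshot.name:
        snapshot.error = f"content {content.name!r} does not point back"
        return False
    snapshot.error = None
    return True


def backoff_delay(failures: int, base: float = 0.005,
                  cap: float = 1000.0) -> float:
    """Delay in seconds before requeueing an event: base * 2**failures, capped."""
    return min(base * (2 ** failures), cap)


if __name__ == "__main__":
    contents = {"content-a": Content("content-a", snapshot_name="snap-a")}
    good = Snapshot("snap-a", "csi-hostpath", content_name="content-a")
    bad = Snapshot("snap-b", "csi-hostpath", content_name="content-a")
    for snap in (good, bad):
        if handles(snap, "csi-hostpath"):
            print(snap.name, check_static_binding(snap, contents), snap.error)
    print([backoff_delay(n, base=1.0, cap=60.0) for n in range(8)])
```

Note that the backoff here applies to reprocessing queued events; per the last bullet, a failed snapshot creation in the storage system is not retried in the current design.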

This article focused on the snapshot API objects and on the architecture, design, and implementation of the external-snapshotter. In the next article, we will cover restoring data volumes from a snapshot and demonstrate how to use both features.