Persistent Volume (PersistentVolume)

PersistentVolume (PV) is a storage resource in a cluster that is either statically created by an administrator or dynamically provisioned using a StorageClass. Like a node, it is a cluster-level resource: it does not belong to any namespace and has a life cycle independent of the Pods that use it. A PersistentVolumeClaim (PVC) is how a user requests the storage resources they need.

1. Life cycle

In a Kubernetes cluster, a PV exists as a storage resource and a Pod consumes it through a PVC. The interaction between PV and PVC follows a life cycle made up of five stages:

  • Provisioning: creating the PV, either statically by an administrator or dynamically through a StorageClass;
  • Binding: assigning the PV to a PVC;
  • Using: the Pod uses the volume through the PVC;
  • Releasing: the Pod releases the volume and the PVC is deleted;
  • Reclaiming: reclaiming the PV, either keeping it for future use or removing it from the underlying storage.

Across these five stages, a storage volume can be in one of the following four states:

  • Available: the PV is ready for use and not yet bound to any PVC;
  • Bound: the PV has been bound to a PVC;
  • Released: the bound PVC has been deleted, but the PV has not yet been reclaimed by the cluster;
  • Failed: automatic reclamation of the PV has failed.
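
The current phase of a volume shows up in the STATUS column of kubectl get; a quick sketch of how to check (no particular resource names assumed):

kubectl get pv     # STATUS shows Available, Bound, Released, or Failed for each PV
kubectl get pvc    # STATUS shows Pending, Bound, or Lost for each PVC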

1.1 Provisioning

Provisioning makes storage volumes available to the cluster. Kubernetes supports two ways of provisioning persistent volumes: static and dynamic.

1.1.1 Static

A PV is created by the Kubernetes cluster administrator and represents real storage that Pods can use. With static provisioning, the cluster administrator creates the PVs in advance; the developer creates the PVC and the Pod, and the Pod uses the storage provided by the PV through the PVC. The static provisioning process is shown below:

1.1.2 Dynamic

With dynamic provisioning, when no statically created PV matches a user's PVC, the cluster tries to provision a volume for that PVC automatically, based on a StorageClass. For this to happen the PVC must request a storage class, and that storage class must have been created and configured by the administrator in advance. The cluster administrator also needs to enable the DefaultStorageClass admission controller on the API server. The dynamic provisioning process is shown in the figure below.
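
For reference, a StorageClass that could back this dynamic provisioning might look like the sketch below. The name slow matches the storage class used later in this article; the AWS EBS provisioner and the gp2 type are assumptions, so substitute whatever provisioner your cluster actually offers.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: slow                          # referenced by PVCs through storageClassName
provisioner: kubernetes.io/aws-ebs    # assumption: in-tree AWS EBS provisioner
parameters:
  type: gp2                           # assumption: EBS volume type
reclaimPolicy: Delete                 # PVs created from this class are deleted when released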

1.2 Binding

Kubernetes dynamically binds a PVC to an available PV. If a PV has been dynamically provisioned for a new PVC, that PV is always bound to that PVC. Beyond that, users always get at least the storage they asked for, but the volume may be larger than the request. Once bound, the PVC-to-PV binding is exclusive, regardless of how the binding was made.

If no matching PV exists, the PVC remains unbound indefinitely until one appears. For example, even if the cluster contains many 50Gi PVs, a PVC requesting 100Gi will not match any of them, and it will not be bound until a 100Gi PV is added to the cluster. A PVC is bound to a PV according to the following conditions; a PV must satisfy all applicable conditions to be selected:

  1. If the PVC specifies a storage class, only PVs of the same storage class can be bound;

  2. If the PVC configures a selector, the selector must match the PV's labels;

  3. If neither a storage class nor a selector is specified, the PVC is matched to a suitable PV based on storage capacity and access mode.
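
If a PVC stays unbound, its events usually say why (no matching PV, wrong storage class, and so on). A small sketch, with myclaim as a placeholder claim name:

kubectl get pvc myclaim        # STATUS stays Pending while no PV matches
kubectl describe pvc myclaim   # the Events section explains why binding has not happened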

1.3 Using

A Pod uses a PVC as a volume; the cluster looks up the PV bound to that PVC and attaches it to the Pod. For volumes that support multiple access modes, the user specifies the desired mode when using the PVC as a volume. Once a user has a bound PVC, the bound PV belongs to that user for as long as they need it; the Pod accesses it by referencing the PVC in its volumes section.

1.4 Releasing

When the user is done with a volume, they can delete the PVC through the API server. Once the PVC is deleted, the corresponding persistent volume is considered "released", but it cannot yet be used by another PVC. The previous claimant's data is still on the volume and is handled according to the reclaim policy.

1.5 Reclaiming

The PV reclaim policy tells the cluster what to do with a volume after its PVC releases it. Three policies are currently available: Retain, Recycle, and Delete. The Retain policy allows the resource to be reclaimed manually and reused later. Where the underlying storage supports it, the Delete policy removes both the PV and the associated storage, such as an AWS EBS, GCE PD, or Cinder volume. If the plugin supports it, the Recycle policy performs a basic scrub (rm -rf /thevolume/*) so the volume can be claimed again.
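
The reclaim policy of an existing PV can also be changed after creation with kubectl patch; a minimal sketch, assuming a PV named pv:

kubectl patch pv pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
kubectl get pv pv    # the RECLAIM POLICY column now shows Retain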

1.5.1 Retain

The Retain reclaim policy allows resources to be reclaimed manually. When the PVC is deleted, the PV keeps the existing data; although the volume is in the Released state, it is still not available to other PVCs. An administrator can manually reclaim the volume with the following steps:

  1. Delete the PV: after the PV is deleted, the storage asset associated with it remains in the external infrastructure.

  2. Manually delete data left in external storage.

  3. Manually delete the storage asset. To reuse the asset, create a new PV for it.

1.5.2 Recycle

This policy is deprecated; dynamic provisioning is the recommended alternative.

Recycling runs the basic erase command rm -rf /thevolume/* on the volume so that it can be made available to a new PVC.

1.5.3 Delete

For storage volume plug-ins that support the Delete reclaim policy, deletion removes the PV object from Kubernetes as well as the storage asset in the associated external infrastructure, such as an AWS EBS, GCE PD, Azure Disk, or Cinder volume.

2. Create a persistent storage volume

When creating a persistent volume (PV), you need to specify the type of the underlying storage. Each type is implemented as a plug-in; the following plug-ins are supported:

  • GCEPersistentDisk
  • AWSElasticBlockStore
  • AzureFile
  • AzureDisk
  • CSI
  • FC (Fibre Channel)
  • FlexVolume
  • Flocker
  • NFS
  • iSCSI
  • RBD (Ceph Block Device)
  • CephFS
  • Cinder (OpenStack block storage)
  • Glusterfs
  • VsphereVolume
  • Quobyte Volumes
  • HostPath (Single node testing only — local storage is not supported in any way and WILL NOT WORK in a multi-node cluster)
  • Portworx Volumes
  • ScaleIO Volumes
  • StorageOS

Below is a PersistentVolume YAML manifest. It requests 5Gi of storage, uses the Filesystem volume mode and the ReadWriteOnce access mode, reclaims the volume with the Recycle policy, belongs to the slow storage class, and uses the NFS plug-in.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv
spec:
  capacity:                               # capacity
    storage: 5Gi
  volumeMode: Filesystem                  # volume mode
  accessModes:                            # access modes
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle  # reclaim policy
  storageClassName: slow                  # storage class
  mountOptions:                           # mount options
  - hard
  - nfsvers=4.1
  nfs:
    path: /tmp
    server: 172.17.0.2
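
Assuming the manifest above is saved as pv.yaml (a placeholder filename), it can be created and checked like this:

kubectl apply -f pv.yaml
kubectl get pv pv    # STATUS shows Available until a matching PVC claims it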

2.1 Capacity

A PV uses the capacity attribute to declare its storage size. Currently, storage (size) is the only resource that can be set in capacity.

2.2 Volume Mode

The default volume mode is Filesystem. Since Kubernetes v1.9, users can also set volumeMode explicitly; in addition to file systems, raw block devices are supported (volumeMode: Block).
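
As a rough sketch of the raw block variant (all names here are placeholders), the PVC sets volumeMode: Block and the Pod consumes the device through volumeDevices instead of volumeMounts:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: block-pvc              # placeholder name
spec:
  accessModes:
  - ReadWriteOnce
  volumeMode: Block            # request a raw block device instead of a filesystem
  resources:
    requests:
      storage: 5Gi
---
kind: Pod
apiVersion: v1
metadata:
  name: block-pod              # placeholder name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeDevices:             # block volumes are attached as devices, not mounted paths
    - name: data
      devicePath: /dev/xvda    # device node exposed inside the container
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: block-pvc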

2.3 Access Modes

The possible access modes are:

  • ReadWriteOnce (RWO): the volume can be mounted read-write by a single node;
  • ReadOnlyMany (ROX): the volume can be mounted read-only by many nodes;
  • ReadWriteMany (RWX): the volume can be mounted read-write by many nodes.

Even if a storage volume plug-in supports multiple access modes, a volume can only be mounted with one access mode at a time. The access modes supported by each storage volume plug-in are listed below:

Storage volume plug-in    ReadWriteOnce    ReadOnlyMany    ReadWriteMany
AWSElasticBlockStore      ✓                -               -
AzureFile                 ✓                ✓               ✓
AzureDisk                 ✓                -               -
CephFS                    ✓                ✓               ✓
Cinder                    ✓                -               -
FC                        ✓                ✓               -
FlexVolume                ✓                ✓               depends on the driver
Flocker                   ✓                -               -
GCEPersistentDisk         ✓                ✓               -
Glusterfs                 ✓                ✓               ✓
HostPath                  ✓                -               -
iSCSI                     ✓                ✓               -
PhotonPersistentDisk      ✓                -               -
Quobyte                   ✓                ✓               ✓
NFS                       ✓                ✓               ✓
RBD                       ✓                ✓               -
VsphereVolume             ✓                -               - (works when Pods are collocated)
PortworxVolume            ✓                -               ✓
ScaleIO                   ✓                ✓               -
StorageOS                 ✓                -               -

2.4 Storage Class

The storageClassName attribute specifies the storage class, that is, the name of a StorageClass resource object. A PV of a particular class can only be bound to PVCs that request that class.

2.5 Reclaim Policy

The available reclaim policy values are:

  • Retain: after the PV is released, its data is kept and must be reclaimed manually;
  • Recycle: the volume is scrubbed with the basic erase command rm -rf /thevolume/* (supported by NFS and HostPath);
  • Delete: after the PVC is deleted, the PV and its associated storage data are deleted (supported by cloud volumes such as AWS EBS and GCE PD).

2.6 Mounting Parameters (Mount Options)

When a PV is mounted on a node, additional mount parameters may be needed; these can be set in the mountOptions field. Only the following storage volume types support mount options:

  • GCEPersistentDisk
  • AWSElasticBlockStore
  • AzureFile
  • AzureDisk
  • NFS
  • iSCSI
  • RBD (Ceph Block Device)
  • CephFS
  • Cinder (OpenStack block storage)
  • Glusterfs
  • VsphereVolume
  • Quobyte Volumes
  • VMware Photon

3. Persistent volume claim

Below is a PersistentVolumeClaim YAML manifest named myclaim. It uses the ReadWriteOnce access mode and the Filesystem volume mode, requests 8Gi of storage, specifies the slow storage class, and sets both a label selector and a match expression.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim
spec:
  accessModes:                # access modes
    - ReadWriteOnce
  volumeMode: Filesystem      # volume mode
  resources: # resources
    requests:
      storage: 8Gi
  storageClassName: slow # storage class
  selector: # selector
    matchLabels:
      release: "stable"
    matchExpressions: # match expressions
      - {key: environment, operator: In, values: [dev]}
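
Assuming the claim above is saved as pvc.yaml (a placeholder filename), creating it and checking the binding might look like this:

kubectl apply -f pvc.yaml
kubectl get pvc myclaim    # when bound, STATUS shows Bound and the VOLUME column names the matching PV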

3.1 Selector

In a PVC, PVs can be further filtered with a label selector. Two kinds of selection are available:

  • matchLabels: only PVs that carry exactly these labels are selected;
  • matchExpressions: a match expression consists of a key, a set of values, and an operator (In, NotIn, Exists, or DoesNotExist); only PVs satisfying the expression can be selected.

If matchLabels and matchExpressions are both set, the conditions are ANDed: only PVs that satisfy all of them are selected.

3.2 Storage Class

In addition to filtering PVs by label, a PVC can specify a storage class with the storageClassName attribute; only PVs of that storage class can be bound to the PVC.

4. Using a PVC in a Pod

A Pod accesses storage by using a PVC as a volume. The PVC must be in the same namespace as the Pod; the cluster finds the PV bound to that PVC and mounts it onto the host and into the Pod.

kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts: 
    - mountPath: "/var/www/html"
      name: mypd                  # name of the mounted volume
  volumes:
  - name: mypd
    persistentVolumeClaim:
      claimName: myclaim          # PVC name

5. Local persistent storage

A Local Persistent Volume stores data on the host where the Pod runs, so it must be guaranteed that the Pod is scheduled to the node that holds the local storage.

Why would you need this type of storage? Sometimes an application has high disk I/O requirements, and network storage cannot match the performance of local storage, especially local SSDs.

With non-local persistent storage, the PV is created first, then the PVC, and if the two match they are bound automatically. Even with dynamic provisioning, the PV is created for the PVC and bound before the Pod that uses it is scheduled to a node.

Local persistent storage has a problem, however: the PV must be prepared in advance, and not every node in the cluster has one, so the Pod cannot be scheduled arbitrarily. How do we ensure that the Pod lands on the node that has the PV? The PV must declare node affinity, and the scheduler must take the volume's location into account when placing the Pod.

  1. Here is the definition of the PV. Even though node02 also has a /data/vol1 directory, the PV cannot land on node02, because the nodeAffinity property sets the affinity to host node01.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local: # local type
    path: /data/vol1  # The specific path on the node
  nodeAffinity:     # node affinity is set here
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node01 # Here we use node01, which has /data/vol1 paths
  2. Define the storage class:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

Here, volumeBindingMode: WaitForFirstConsumer is critical: it means delayed binding (wait for the first consumer before binding). This setting changes when the PVC-PV binding happens; the PVC is bound to a PV only when the first Pod that uses the PVC appears.

A PVC is normally bound as soon as a suitable PV is available, so if the Pod were later scheduled to a node that does not hold that PV, the Pod would hang forever. The point of delayed binding is to let the scheduler take volume placement into account: when it schedules the Pod, it looks at where the required local PV lives, places the Pod on that node, performs the PVC binding, and finally mounts the volume into the Pod, guaranteeing that the Pod's node is the node that holds the local PV. In short, binding of the PVC is delayed until a Pod that uses it appears, and the binding is then decided together with scheduling.

  3. Define the PVC:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: local-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: local-storage

After creating it, you can see that the PVC stays in the Pending state; this is the delayed binding at work, because no Pod uses it yet.
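
This can be confirmed from the command line (a sketch; the exact event wording can differ between Kubernetes versions):

kubectl get pvc local-claim        # STATUS stays Pending until a Pod uses the claim
kubectl describe pvc local-claim   # events typically say it is waiting for the first consumer before binding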

  4. Define the Pod:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat-deploy
spec:
  replicas: 1
  selector:
    matchLabels:
      appname: myapp
  template:
    metadata:
      name: myapp
      labels:
        appname: myapp
    spec:
      containers:
      - name: myapp
        image: tomcat:8.5.38-jre8
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        volumeMounts:
          - name: tomcatedata
            mountPath: "/data"
      volumes:
        - name: tomcatedata
          persistentVolumeClaim:
            claimName: local-claim

At this point, since the PV is on node01, the Pod is scheduled to node01; even if the Pod is deleted and recreated, it is still scheduled to node01. The PVC is now in the Bound state.
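
A quick verification sketch, using the resource names from the manifests above:

kubectl get pod -o wide            # the NODE column should show node01
kubectl get pvc local-claim        # STATUS should now show Bound
kubectl get pv example-pv          # the CLAIM column shows the namespace/name of the bound claim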