Using Velero in TKE to migrate and replicate cluster resources

Overview

Velero (formerly known as Heptio Ark) is an open source tool for safely backing up and restoring Kubernetes cluster resources and persistent volumes, performing disaster recovery, and migrating them between clusters. Velero can be deployed in TKE clusters or self-built Kubernetes clusters to:

Back up the cluster and restore it if it is lost.
Migrate cluster resources to other clusters.
Replicate the production cluster to development and test clusters.

For more information about Velero, see the Velero website. This article describes how to use Velero to migrate or replicate cluster resources between TKE clusters.

Migration principle

Install a Velero instance on both the source cluster and the target cluster, and point both instances at the same Tencent Cloud COS object storage location. Run a Velero backup on the source cluster to generate backup data and store it in Tencent Cloud COS, then run a Velero restore on the target cluster to complete the migration. The migration principle is as follows:

Prerequisites

A Tencent Cloud account has been registered.
The Tencent Cloud COS service is available.
The source TKE cluster to be migrated (cluster A) and the target TKE cluster (cluster B) have been created. For details about how to create a TKE cluster, see Creating a Cluster.
Velero (version 1.5 or later) is installed in both cluster A and cluster B, and both instances use the same Tencent Cloud COS bucket as the Velero back-end storage. For the installation steps, see Configuring Storage and Installing Velero.

Precautions

Starting with version 1.5, Velero can back up all pod volumes with restic without requiring each pod to be annotated individually. With this feature enabled, all pod volumes are backed up with restic by default, except for the following:
- Volumes that mount the default service account secret
- Volumes of type hostPath
- Volumes that mount Kubernetes secrets and configmaps
This example requires Velero 1.5 or later and uses restic to back up persistent volume data. Ensure that the --use-restic and --default-volumes-to-restic parameters are enabled when you install Velero. For the installation steps, see Configuring Storage and Installing Velero.
Do not create, update, or delete cluster resources in either cluster during the migration. Otherwise, the data may be inconsistent after the migration.
Ensure that the CPU and memory configurations of the worker nodes in cluster B are the same as, or close to, those in cluster A. Otherwise, the migrated Pods may fail to be scheduled because of insufficient resources.

Steps

Create a backup in cluster A

You can perform the backup manually or set up regular automatic backups with Velero schedules (run velero schedule -h for details). This example uses the resources in the default and default2 namespaces for comparison before and after the migration. The following figure shows the Pod and PVC resources in the two namespaces of cluster A:

Tip: You can specify custom hook operations to be performed during a backup. For example, an application may need to persist its in-memory data to disk before the backup runs. For more information about backup hooks, see Backup Hooks.
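The regular automatic backups mentioned above could be set up along the following lines. This is a minimal sketch, not the article's own procedure: create_daily_schedule is a hypothetical helper name, and the default cron expression 0 1 * * * (01:00 every day) is an illustrative assumption. It relies only on velero schedule create with the --schedule and --exclude-namespaces flags.

```shell
#!/bin/sh
# Hypothetical helper (not part of Velero): create a recurring backup schedule
# that backs up everything except the velero namespace.
#   $1 = schedule name
#   $2 = cron expression (assumed default: every day at 01:00)
create_daily_schedule() {
  local name="$1"
  local cron="${2:-0 1 * * *}"
  velero schedule create "$name" --schedule="$cron" --exclude-namespaces velero
}

# Example usage (assumed names):
#   create_daily_schedule daily-all
#   create_daily_schedule hourly-all "0 * * * *"
```

Backups created by a schedule can then be listed with velero backup get, like manually created ones.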

The MinIO object storage service in the cluster uses persistent volumes, and some image data has been uploaded to it, as shown in the following figure:

Run the following command to back up all cluster resources except those in the velero namespace (the default namespace in which Velero is installed). To customize the range of cluster resources to back up, run velero backup create -h to view the supported resource filtering parameters.

velero backup create <BACKUP-NAME> --exclude-namespaces <NAMESPACE>

In this example, we create a “default-all” cluster backup, as shown in the following figure:

When the backup task status is “Completed”, the backup has finished. Run velero backup logs <BACKUP-NAME> | grep error to check whether any errors occurred during the backup. No output means that the backup completed without errors, as shown in the following figure:

Note: Please ensure that no error occurs during the backup process. If any error occurs during velero’s backup, please troubleshoot and perform the backup again.
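The backup and the error check described above can be combined into one step. The following is a sketch under stated assumptions: backup_and_check is a hypothetical helper name, and it uses the same loose grep for "error" as the command above, so it may also count harmless lines that merely contain that word. The --wait flag makes velero backup create block until the backup finishes.

```shell
#!/bin/sh
# Hypothetical helper (not part of Velero): create a backup that excludes the
# velero namespace, wait for it to finish, then count log lines mentioning "error".
#   $1 = backup name
backup_and_check() {
  local name="$1"
  local errors
  # Return early if the velero command itself fails.
  velero backup create "$name" --exclude-namespaces velero --wait || return 1
  # grep -c prints the number of matching lines; "|| true" keeps the count
  # even though grep exits non-zero when nothing matches.
  errors="$(velero backup logs "$name" | grep -ci 'error' || true)"
  echo "error lines: ${errors}"
  # Succeed only when no error lines were found.
  [ "$errors" -eq 0 ]
}

# Example usage (assumed backup name):
#   backup_and_check default-all || echo "inspect the log and back up again"
```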

After the backup is complete, temporarily set the backup storage location to read-only mode (optional; this prevents Velero from creating or deleting backup objects in the backup storage location during the restore):

kubectl patch backupstoragelocation default --namespace velero \
    --type merge \
    --patch '{"spec":{"accessMode":"ReadOnly"}}'

Perform the restore in cluster B

Before the restore operation, no workload resources exist in the default and default2 namespaces of cluster B. The following figure shows the result:

Temporarily set the Velero backup storage location in cluster B to read-only mode as well (optional; this prevents Velero from creating or deleting backup objects in the backup storage location during the restore):

kubectl patch backupstoragelocation default --namespace velero \
    --type merge \
    --patch '{"spec":{"accessMode":"ReadOnly"}}'

Tip: You can optionally specify custom hook operations to be performed during the restore or after a resource is restored. For example, you might need to run a custom database restore operation before the database application container starts. For more information about restore hooks, see Restore Hooks.

Before restoring, ensure that the Velero resources in cluster B are synchronized with the backup files in the cloud storage. The default synchronization interval is 1 minute, and you can change it with the --backup-sync-period parameter. Run the following command to check whether the backup of cluster A has been synchronized:

velero backup get <BACKUP-NAME>
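Because synchronization can take up to one interval, the check above may need to be repeated. The following polling loop is a sketch, assuming that velero backup get exits with a non-zero status while the backup is not yet known to cluster B. wait_for_backup_sync is a hypothetical helper name, and the default of 12 attempts at 10-second intervals is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper (not part of Velero): poll until the named backup is
# visible in the current cluster, or give up after a number of attempts.
#   $1 = backup name
#   $2 = maximum attempts (assumed default: 12)
#   $3 = seconds between attempts (assumed default: 10)
wait_for_backup_sync() {
  local name="$1"
  local attempts="${2:-12}"
  local delay="${3:-10}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # Assumption: a non-zero exit status means the backup has not synced yet.
    if velero backup get "$name" >/dev/null 2>&1; then
      echo "backup ${name} is visible in this cluster"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "backup ${name} not found after ${attempts} attempts" >&2
  return 1
}

# Example usage (assumed backup name):
#   wait_for_backup_sync default-all
```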

After the check is successful, run the following command to restore all data to cluster B:

velero restore create --from-backup <BACKUP-NAME>

This example performs the restore process as shown below:

After the restore task is complete, view the restore log. Run the following commands to check the restore for errors and skipped operations, where <RESTORE-NAME> is the name of the restore as listed by velero restore get:

# Check the restore log for errors
velero restore logs <RESTORE-NAME> | grep error
# Check which operations were skipped during the restore
velero restore logs <RESTORE-NAME> | grep skip

The restore log may contain many “skipped” entries even when no errors occurred. When a resource being restored conflicts with a resource that already exists in the target cluster, Velero skips restoring it. Such a restore is therefore normal. We can ignore these “skipped” log entries, or analyze them in special circumstances.
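To see both counts at a glance, the restore log can be piped through a small filter. This is a sketch: summarize_restore_log is a hypothetical helper name, and it uses the same loose patterns ("error" and "skip") as the grep commands above, so the counts are indicative rather than exact.

```shell
#!/bin/sh
# Hypothetical helper (not part of Velero): read a restore log on stdin and
# report how many lines mention "error" and how many mention "skip".
summarize_restore_log() {
  local log errors skipped
  log="$(cat)"
  # grep -c prints the match count; "|| true" keeps it when the count is 0.
  errors="$(printf '%s\n' "$log" | grep -ci 'error' || true)"
  skipped="$(printf '%s\n' "$log" | grep -ci 'skip' || true)"
  echo "errors=${errors} skipped=${skipped}"
  # Succeed only when no error lines were found; skipped lines are tolerated.
  [ "$errors" -eq 0 ]
}

# Example usage:
#   velero restore logs <RESTORE-NAME> | summarize_restore_log
```

The helper treats skipped lines as informational and fails only on error lines, which matches the reasoning above that skipped resources are expected when the target cluster already contains them.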

Verify migration results

Check the cluster resources in cluster B after the migration. The Pod and PVC resources in the default and default2 namespaces have been migrated as expected:

Log in to the web management page of the MinIO service in cluster B. The image data in the MinIO service is intact, indicating that the persistent volume data has been migrated as expected.

After the migration, remember to set the backup storage location back to read/write mode in both cluster A and cluster B, so that subsequent backup tasks can run successfully:

kubectl patch backupstoragelocation default --namespace velero \
   --type merge \
   --patch '{"spec":{"accessMode":"ReadWrite"}}'
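Since the same patch is applied several times in this procedure with only the access mode changing, it can be wrapped in a small function. This is a sketch: set_backup_location_mode is a hypothetical helper name, and it assumes the backup storage location is named default and lives in the velero namespace, as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Velero): switch the default backup storage
# location between ReadOnly and ReadWrite in the current kubectl context.
#   $1 = ReadOnly | ReadWrite
set_backup_location_mode() {
  local mode="$1"
  # Reject anything other than the two access modes used in this article.
  case "$mode" in
    ReadOnly|ReadWrite) ;;
    *)
      echo "mode must be ReadOnly or ReadWrite" >&2
      return 2
      ;;
  esac
  kubectl patch backupstoragelocation default --namespace velero \
    --type merge \
    --patch "{\"spec\":{\"accessMode\":\"${mode}\"}}"
}

# Example usage (run once per cluster):
#   set_backup_location_mode ReadOnly    # before the restore
#   set_backup_location_mode ReadWrite   # after the migration
```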

Conclusion

This article described the principle, precautions, and procedure for migrating cluster resources between TKE clusters with Velero, and walked through migrating the resources of sample cluster A to cluster B. The overall process is simple, which makes Velero a convenient option for migrating cluster resources.