Why back up logs

KubeSphere's logging system uses Fluent Bit and ElasticSearch for log collection and storage, and uses Curator to manage index life cycles and periodically clean up old logs. KubeSphere's default 7-day log retention policy is not sufficient for log audit and disaster recovery scenarios, and simply backing up the ElasticSearch data disks does not guarantee that the data can be recovered completely and consistently.

The ElasticSearch open source community provides the Snapshot API for long-term snapshot storage and recovery. This article describes how to implement log backup for the ElasticSearch (version 6.7.0) built into KubeSphere (version 2.1.0) to meet audit and disaster recovery requirements.

Note: For small data volumes with specific query conditions, you can export logs directly from KubeSphere or with elasticsearch-dump. KubeSphere users whose ElasticSearch includes X-Pack can also enable the Snapshot Lifecycle Management feature it provides.
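
As a rough illustration (not part of the KubeSphere tooling), an elasticsearch-dump export of a single day's index might look like the sketch below; the endpoint, index name, and output file are placeholders to adapt to your own environment.

# Hypothetical example: dump one day's index to a local JSON file with elasticsearch-dump
elasticdump \
  --input=http://elasticsearch-logging-data.kubesphere-logging-system.svc:9200/ks-logstash-log-2019.11.12 \
  --output=ks-logstash-log-2019.11.12.json \
  --type=data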

Prerequisites

Before you can store snapshots, you need to register a snapshot repository in the ElasticSearch cluster. A snapshot repository can use a shared file system such as NFS; other storage types, such as AWS S3, require a separate repository plug-in.

This article takes NFS as an example. The shared snapshot repository must be mounted on all ElasticSearch master and data nodes, and path.repo must be configured in elasticsearch.yml. NFS supports the ReadWriteMany access mode, which makes it a good fit.

The first step is to prepare an NFS server, such as the QingCloud vNAS service used in this tutorial, with a shared directory path of /mnt/shared_dir.

Next, prepare an NFS StorageClass in the KubeSphere environment; we will use it later to request a PersistentVolume for the snapshot repository. The environment in this article was configured with NFS storage at installation time, so no additional steps are required. If yours was not, refer to the official KubeSphere documentation: modify conf/common.yaml and run the install.sh script again.
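
Before moving on, it is worth confirming that an NFS-backed StorageClass actually exists in the cluster; the class name nfs-client referenced later in this article is just what this environment happens to use.

# List StorageClasses; an NFS-backed class (nfs-client in this environment) should appear
kubectl get storageclass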

1. ElasticSearch Setup

In KubeSphere, the ElasticSearch master nodes are managed by the StatefulSet elasticsearch-logging-discovery, and the data nodes by the StatefulSet elasticsearch-logging-data.

$ kubectl get sts -n kubesphere-logging-system
NAME                              READY   AGE
elasticsearch-logging-data        2/2     18h
elasticsearch-logging-discovery   1/1     18h

Step 1: Create a PersistentVolumeClaim to serve as the snapshot repository for ElasticSearch.

cat <<EOF | kubectl create -f -
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: elasticsearch-logging-backup
  namespace: kubesphere-logging-system
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 100Gi
  # Populate the storageClassName field based on your environment, e.g.:
  # storageClassName: nfs-client
EOF

Step 2: Modify the elasticsearch.yml configuration file to register the NFS directory path on every master and data node. In KubeSphere, elasticsearch.yml can be found in the ConfigMap elasticsearch-logging. Add the following line at the end: path.repo: ["/usr/share/elasticsearch/backup"]
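
A minimal sketch of this edit, assuming the ConfigMap name described above:

# Open the ConfigMap that contains elasticsearch.yml
kubectl edit configmap -n kubesphere-logging-system elasticsearch-logging

# Append the repository path at the end of the elasticsearch.yml key:
#   path.repo: ["/usr/share/elasticsearch/backup"]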

Step 3: Modify the StatefulSet YAML to mount the backup volume on every ElasticSearch node, and extend the chown command in the initContainer so that the owner and group of the snapshot repository folder are initialized to elasticsearch at startup.

Note: kubectl edit cannot be used to modify the StatefulSets directly; instead, export the YAML files, modify them, and re-create the objects with kubectl apply.

kubectl get sts -n kubesphere-logging-system elasticsearch-logging-data -oyaml > elasticsearch-logging-data.yml
kubectl get sts -n kubesphere-logging-system elasticsearch-logging-discovery -oyaml > elasticsearch-logging-discovery.yml

Modify the exported YAML files. Take elasticsearch-logging-data.yml for the data nodes as an example:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    # ...
  name: elasticsearch-logging-data
  namespace: kubesphere-logging-system
  # -------------------------------------------------
  # Comment out or delete all metadata fields other
  # than labels, name, and namespace
  # -------------------------------------------------
  # resourceVersion: "109019"
  # selfLink: /apis/apps/v1/namespaces/kubesphere-logging-system/statefulsets/elasticsearch-logging-data
  # uid: 423adffe-271f-4657-9078-1a75c387eedc
spec:
  # ...
  template:
    # ...
    spec:
      # ...
      containers:
        - name: elasticsearch
          # ...
          volumeMounts:
            - mountPath: /usr/share/elasticsearch/data
              name: data
            # ---------------------------
            # Add the backup volume mount
            # ---------------------------
            - mountPath: /usr/share/elasticsearch/backup
              name: backup
            - mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
              name: config
              subPath: elasticsearch.yml
          # ...
      initContainers:
        - name: sysctl
          # ...
        - name: chown
          # --------------------------------------------------
          # Modify the command so that the owner of the
          # snapshot repository folder is also adjusted
          # --------------------------------------------------
          command:
            - /bin/bash
            - -c
            - |
              set -e; set -x;
              chown elasticsearch:elasticsearch /usr/share/elasticsearch/data;
              for datadir in $(find /usr/share/elasticsearch/data -mindepth 1 -maxdepth 1 -not -name ".snapshot"); do
                chown -R elasticsearch:elasticsearch $datadir;
              done;
              chown elasticsearch:elasticsearch /usr/share/elasticsearch/logs;
              for logfile in $(find /usr/share/elasticsearch/logs -mindepth 1 -maxdepth 1 -not -name ".snapshot"); do
                chown -R elasticsearch:elasticsearch $logfile;
              done;
              chown elasticsearch:elasticsearch /usr/share/elasticsearch/backup;
              for backupdir in $(find /usr/share/elasticsearch/backup -mindepth 1 -maxdepth 1 -not -name ".snapshot"); do
                chown -R elasticsearch:elasticsearch $backupdir;
              done
          # ...
          volumeMounts:
            - mountPath: /usr/share/elasticsearch/data
              name: data
            # ---------------------------
            # Add the backup volume mount
            # ---------------------------
            - mountPath: /usr/share/elasticsearch/backup
              name: backup
          # ...
      tolerations:
        - key: CriticalAddonsOnly
          operator: Exists
        - effect: NoSchedule
          key: dedicated
          value: log
      volumes:
        - configMap:
            defaultMode: 420
            name: elasticsearch-logging
          name: config
        # ------------------------------------
        # Reference the PVC created in Step 1
        # ------------------------------------
        - name: backup
          persistentVolumeClaim:
            claimName: elasticsearch-logging-backup
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 20Gi
        storageClassName: nfs-client
        volumeMode: Filesystem
    # ----------------------------------------
    # Comment out or delete the status fields
    # ----------------------------------------
    # status:
    #   phase: Pending
# status:
#   ...

You can now delete the old ElasticSearch StatefulSets and re-apply the new YAML:

kubectl delete sts -n kubesphere-logging-system elasticsearch-logging-data
kubectl delete sts -n kubesphere-logging-system elasticsearch-logging-discovery

kubectl apply -f elasticsearch-logging-data.yml
kubectl apply -f elasticsearch-logging-discovery.yml
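
After re-applying, it is a good idea to check that the ElasticSearch pods return to the Running state and that the backup volume is mounted; the pod name below is illustrative.

# Wait until all ElasticSearch pods are Running again
kubectl get pods -n kubesphere-logging-system

# Optionally verify the backup mount inside a data node pod (pod name is an example)
kubectl exec -n kubesphere-logging-system elasticsearch-logging-data-0 -- df -h /usr/share/elasticsearch/backup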

Step 4: Call the Snapshot API to create a repository named ks-log-snapshots and enable compression.

curl -X PUT "elasticsearch-logging-data.kubesphere-logging-system.svc:9200/_snapshot/ks-log-snapshots? pretty" -H 'Content-Type: application/json' -d' { "type": "fs", "settings": { "location": "/usr/share/elasticsearch/backup", "compress": true } } 'Copy the code

If the command returns "acknowledged": true, the repository was created successfully and the ElasticSearch cluster is ready to take snapshots. From then on, you only need to call the Snapshot API periodically to create incremental backups; automated incremental backups can be handled by Curator, as described in the next section.
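
For example, a one-off snapshot can be taken manually with the standard create-snapshot endpoint before any automation is in place; the snapshot name snapshot-test below is arbitrary.

# Create a snapshot named snapshot-test in the ks-log-snapshots repository and wait for it to complete
curl -X PUT "elasticsearch-logging-data.kubesphere-logging-system.svc:9200/_snapshot/ks-log-snapshots/snapshot-test?wait_for_completion=true&pretty"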

2. Scheduled Snapshots with Curator

ElasticSearch Curator helps manage ElasticSearch indices and snapshots. Next, we use Curator to automate scheduled log backups. The KubeSphere logging component already includes a Curator, deployed as a CronJob that runs every day at 1 AM to manage indices, and we can reuse it. Curator's execution rules can be found in its ConfigMap.
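
The ConfigMap name may differ between KubeSphere releases, so the simplest way to find and edit it is something like the following (replace the placeholder with the actual name):

# Locate the Curator ConfigMap in the logging namespace
kubectl get configmap -n kubesphere-logging-system | grep -i curator

# Edit the action file it contains
kubectl edit configmap -n kubesphere-logging-system <curator-configmap-name>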

Here we need to add two actions to actionfile.yml: snapshot and delete_snapshots, giving both a higher priority than the existing delete_indices action. The new rules name each snapshot with the pattern snapshot-%Y%m%d%H%M%S and keep snapshots for 45 days. For the meaning of each parameter, see the Curator Reference.

actions:
  1:
    action: snapshot
    description: >-
      Snapshot ks-logstash-log prefixed indices with the default snapshot
      name pattern of 'snapshot-%Y%m%d%H%M%S'.
    options:
      repository: ks-log-snapshots
      name: 'snapshot-%Y%m%d%H%M%S'
      ignore_unavailable: False
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
      # If disable_action is set to True, Curator will ignore the current action
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      # You may change the index pattern below to fit your case
      value: ks-logstash-log-
  2:
    action: delete_snapshots
    description: >-
      Delete snapshots from the selected repository older than 45 days
      (based on creation_date), for 'snapshot' prefixed snapshots.
    options:
      repository: ks-log-snapshots
      ignore_empty_list: True
      # If disable_action is set to True, Curator will ignore the current action
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: snapshot-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y%m%d%H%M%S'
      unit: days
      unit_count: 45
  3:
    action: delete_indices
    # ... (existing rule, unchanged)
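
Once the action file is updated, you can confirm that the Curator CronJob is still scheduled to run nightly:

# Check the Curator CronJob and its schedule
kubectl get cronjob -n kubesphere-logging-system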

3. Restore and view logs

When we need to review logs from a few days ago, for example from November 12, we can restore them from a snapshot. First, list the snapshots to find the latest one:

curl -X GET "elasticsearch-logging-data.kubesphere-logging-system.svc:9200/_snapshot/ks-log-snapshots/_all? pretty"Copy the code

Then restore the index for the specified date from the latest snapshot (you can also restore all indices). This API restores the log indices to the data disk, so make sure the data disk has enough free space. Alternatively, you can back up the PVs directly and mount them to another ElasticSearch cluster to restore the logs there.

curl -X POST "elasticsearch-logging-data.kubesphere-logging-system.svc:9200/_snapshot/ks-log-snapshots/snapshot-20191112010008/_resto re? Pretty "-h 'Content-type: application/json' -d' {"indices":" ks-logstuck-log-2019.11.12 ", "ignore_UNAVAILABLE ": true, "include_global_state": true, } 'Copy the code

Depending on the log volume, the restore may take several minutes or more. Once it finishes, you can view the restored logs on the KubeSphere log dashboard.
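
If you want to confirm the restore from the command line before opening the dashboard, listing the matching indices is a quick check:

# Verify that the restored index is present
curl -X GET "elasticsearch-logging-data.kubesphere-logging-system.svc:9200/_cat/indices/ks-logstash-log-2019.11.12?v"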

Reference documentation

ElasticSearch Reference: Snapshot And Restore

Curator Reference: snapshot

About KubeSphere

KubeSphere (https://github.com/kubesphere/kubesphere) is an open-source, application-centric container management platform. It can be deployed on any infrastructure and provides an easy-to-use UI that greatly reduces the complexity of daily development, testing, and operations. It aims to address Kubernetes pain points around storage, networking, security, and ease of use, and helps enterprises handle scenarios such as agile development, automated monitoring and operations, end-to-end application delivery, microservice governance, multi-tenant management, multi-cluster management, service and network management, image registries, AI platforms, and edge computing.