Elasticsearch snapshot

Background

Although Elasticsearch has good disaster recovery, you still need a backup mechanism in the following cases:

Disaster recovery: if a cluster fails, data can be restored from the backup in a timely manner.
Archiving data: as data keeps growing, data that is not needed for now but must not be deleted can be backed up and archived for later viewing.
Migrating data: a backup can also be used to migrate data from one cluster to another.

You can back up Elasticsearch data in either of the following ways:

Export data to a text file. For example, tools such as ElasticDump and ESM can export the data stored in Elasticsearch to a file.
Create a snapshot by backing up the files in the Elasticsearch data directory. The Elasticsearch snapshot API supports this.

The first method is simple and practical when the data volume is small, but it becomes much less efficient with large data volumes. The rest of this article covers the second method, the Snapshot API.

Where to Back Up

In Elasticsearch, a repository defines the backup storage type and location. Supported storage types include a shared file system, AWS S3, and HDFS.

For the shared file system type used here, you must first register the backup path under path.repo in the elasticsearch.yml configuration file:

path.repo: /usr/local/elasticsearch-7.3.0/data/backup

Once this is configured, you can use the Snapshot API to create a repository. Here we create a repository named es_backup.

PUT /_snapshot/es_backup
{
  "type": "fs",
  "settings": {
    "location": "/usr/local/elasticsearch-7.3.0/data/backup"
  }
}
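If you would rather script this step than type it into a console, the same request can be assembled with the Python standard library. This is only a minimal sketch: the helper name build_repository_request and the node address http://localhost:9200 are assumptions made for illustration, and the request is built but not sent.

```python
import json
import urllib.request


def build_repository_request(base_url: str, repo: str, location: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that registers an fs repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return urllib.request.Request(
        url=f"{base_url}/_snapshot/{repo}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_repository_request(
    "http://localhost:9200",  # assumed address of a local node
    "es_backup",
    "/usr/local/elasticsearch-7.3.0/data/backup",
)
print(req.get_method(), req.full_url)
# To actually send it against a running cluster: urllib.request.urlopen(req)
```

Keeping the build step separate from the send step makes it easy to inspect the request before it touches a cluster.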

Once the repository is created, data can be backed up to it.

How to back up

With the repository in place, we back up data by creating a snapshot. Here we create a snapshot named snapshot_0910:

PUT /_snapshot/es_backup/snapshot_0910?wait_for_completion=true

If wait_for_completion is true, the API returns only after the backup has completed; otherwise the snapshot runs asynchronously, which is the default. To see the result immediately, we set this parameter here.
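The difference between the two modes is just a query parameter on the snapshot URL. The sketch below builds both forms; the helper name snapshot_url and the address http://localhost:9200 are assumptions for illustration, and nothing is sent to a cluster.

```python
from urllib.parse import quote, urlencode


def snapshot_url(base_url: str, repo: str, snapshot: str, wait: bool = False) -> str:
    """Return the URL used to create a snapshot, optionally in blocking mode."""
    url = f"{base_url}/_snapshot/{quote(repo, safe='')}/{quote(snapshot, safe='')}"
    if wait:
        # Ask the API to respond only once the snapshot has finished.
        url += "?" + urlencode({"wait_for_completion": "true"})
    return url


base = "http://localhost:9200"  # assumed address of a local node
print(snapshot_url(base, "es_backup", "snapshot_0910"))
print(snapshot_url(base, "es_backup", "snapshot_0910", wait=True))
```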

Run the following command to check the snapshot execution status:

GET /_snapshot/es_backup/snapshot_0910

The results are as follows:

{
  "snapshots" : [
    {
      "snapshot" : "snapshot_0910",
      "uuid" : "1rkfrgJsQMSWmVT0WePQ-Q",
      "version_id" : 7030099,
      "version" : "7.3.0",
      "indices" : [
        "blogs",
        "address_xc",
        "pattern",
        "student",
        "student2",
        "kibana_sample_data_flights",
        ".kibana_task_manager",
        "p_sc",
        ".kibana_1",
        "my_index"
      ],
      "include_global_state" : true,
      "state" : "SUCCESS",
      "start_time" : "2019-09-10T09:14:11.455Z",
      "start_time_in_millis" : 1568106851455,
      "end_time" : "2019-09-10T09:15:17.168Z",
      "end_time_in_millis" : 1568106917168,
      "duration_in_millis" : 65713,
      "failures" : [ ],
      "shards" : {
        "total" : 10,
        "failed" : 0,
        "successful" : 10
      }
    }
  ]
}
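In a script, the fields worth checking in this response are state, the shard counts, and the duration. The sketch below reads them from a trimmed copy of the response above; the helper name summarize_snapshot and the choice of fields are assumptions for illustration, not part of the API.

```python
import json

# Trimmed copy of the response shown above (only the fields used below).
sample = json.loads("""
{
  "snapshots": [
    {
      "snapshot": "snapshot_0910",
      "state": "SUCCESS",
      "duration_in_millis": 65713,
      "shards": {"total": 10, "failed": 0, "successful": 10}
    }
  ]
}
""")


def summarize_snapshot(info: dict) -> dict:
    """Reduce one snapshot entry to its name, a success flag, and the duration in seconds."""
    shards = info["shards"]
    return {
        "name": info["snapshot"],
        # Treat the snapshot as good only if it succeeded and no shard failed.
        "ok": info["state"] == "SUCCESS" and shards["failed"] == 0,
        "seconds": info["duration_in_millis"] / 1000,
    }


for snap in sample["snapshots"]:
    print(summarize_snapshot(snap))
```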

When to back up

The steps above created a backup successfully, but new data keeps arriving and needs to be backed up as well. To do that, simply create another snapshot, here named snapshot_new.

PUT /_snapshot/es_backup/snapshot_new?wait_for_completion=true

After the execution, the es_backup directory becomes larger, indicating that the new data has been backed up. Note that when you take multiple snapshots in the same repository, Elasticsearch checks whether the segment files to be backed up have changed. Unchanged segment files are skipped, and only the segment files that have changed are copied. This is essentially incremental backup.

If you are an experienced Elasticsearch user, you probably know force merge, which merges the segment files of an index down to a specified number of segments. After that API is called, incremental backup no longer saves anything for that index, because all of its segment files in the data directory have changed and must be copied again.
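The incremental behaviour and the effect of a force merge can be illustrated with a toy model. This is a deliberately simplified sketch and not how Elasticsearch tracks files internally: segments are identified only by made-up names such as "_0", and the function files_to_copy is hypothetical.

```python
from typing import Set


def files_to_copy(already_in_repo: Set[str], live_segments: Set[str]) -> Set[str]:
    """Return the segment files that the repository does not hold yet."""
    return live_segments - already_in_repo


repo: Set[str] = set()

# First snapshot: nothing is in the repository, so every segment is copied.
segments = {"_0", "_1", "_2"}
first = files_to_copy(repo, segments)
repo |= first

# New documents arrive and create one more segment; only that one is copied.
segments = segments | {"_3"}
second = files_to_copy(repo, segments)
repo |= second

# A force merge rewrites everything into a brand-new segment, so the next
# snapshot has to copy all of the index data again.
segments = {"_4"}
third = files_to_copy(repo, segments)

print(sorted(first), sorted(second), sorted(third))
```

In this model the second snapshot copies a single file, while the snapshot after the merge copies the whole index again, which is why it helps to think about when a force merge runs relative to your snapshots.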

Another issue is the timing of the backup. Although a snapshot does not use much CPU, disk, or network resources, it is still recommended to run backups when the cluster is lightly loaded.

How to restore

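Using the following API, we can restore index_1 from a snapshot under the new name restored_index_1 (the repository and snapshot names in this example are my_backup and snapshot_1). The recovery process is entirely file-based and therefore efficient.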

POST /_snapshot/my_backup/snapshot_1/_restore?wait_for_completion=true
{
  "indices": "index_1",
  "rename_pattern": "index_1",
  "rename_replacement": "restored_index_1"
}
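When restoring under a new name, the restore API expects rename_replacement to be paired with rename_pattern, a regular expression matched against the index names being restored. The sketch below builds such a body and previews the resulting name locally. The helper names are hypothetical, and Python's re.sub only approximates the server-side replacement (the two regex dialects differ, for example in how capture groups are referenced), so treat the preview as a rough check.

```python
import json
import re


def restore_body(index: str, pattern: str, replacement: str) -> dict:
    """Build the JSON body for a restore that renames the restored index."""
    return {
        "indices": index,
        "rename_pattern": pattern,
        "rename_replacement": replacement,
    }


def preview_rename(name: str, pattern: str, replacement: str) -> str:
    """Approximate locally what name an index would get after the rename."""
    return re.sub(pattern, replacement, name)


body = restore_body("index_1", "index_1", "restored_index_1")
print(json.dumps(body, indent=2))
print(preview_rename("index_1", body["rename_pattern"], body["rename_replacement"]))
```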

Other

Snapshots are version compatible within the same major version. For example, you can back up and restore between versions 5.1 and 5.6. But you cannot restore a backup from a higher version into a lower version, such as a 6.x backup into 5.x. Restoring a lower-version backup into a higher version is subject to the following rules:

5.x can be restored in 6.x
2.x can be restored in 5.x
1.x can be restored in 2.x
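These rules are easy to encode as a pre-flight check before attempting a restore. The sketch below covers only the pairs listed here, reading the middle entry as 2.x to 5.x, and compares major versions only. The table UPGRADE_PATHS and the function can_restore are hypothetical names, and newer releases should be checked against the official documentation.

```python
from typing import Dict, Set

# Snapshot major version -> newer major versions that can restore it,
# limited to the pairs listed in this section.
UPGRADE_PATHS: Dict[int, Set[int]] = {1: {2}, 2: {5}, 5: {6}}


def can_restore(snapshot_major: int, cluster_major: int) -> bool:
    """Return True if a snapshot taken on one major version may be restored on another."""
    if snapshot_major == cluster_major:
        return True  # same major version, e.g. between 5.1 and 5.6
    return cluster_major in UPGRADE_PATHS.get(snapshot_major, set())


for snap, cluster in [(5, 6), (6, 5), (1, 5), (5, 5)]:
    print(f"{snap}.x -> {cluster}.x:", can_restore(snap, cluster))
```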

Reference

To learn more, see the official documentation: www.elastic.co/guide/en/el…
