In my previous article, I covered index lifecycle management for Elasticsearch:
- Elasticsearch: An introduction to Index lifecycle management
- Elastic: Use index lifecycle management to implement hot/cold architecture
Index lifecycle management is useful for Time Series Data (TSD). So what is Time Series Data?
Application of Data Streams in index lifecycle management
What is Time Series Data?
TSD is always associated with a timestamp that identifies the point in time at which the event was created. For example, it could be sensor data (temperature measurements) or logs from security devices. What do these data have in common? Their importance tends to decline over time: documents about past events are less important than documents about recent events. You may no longer be interested in last month's sensor data, especially at full precision.
Therefore, the best choice for handling this kind of data in Elasticsearch is to use time-based indices.
Time Series Data has the following characteristics:
- It could be logs from servers, metrics from devices, social media streams, or other time-based events
- It consists of a timestamp plus data
- Searches usually target recent events
- Old documents become less important
- Time-based indices are the best choice
- A new index is created every day, week, month, or year (see the sketch after this list)
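As a minimal sketch (the index names below are illustrative, not from a real setup), daily time-based indices look like this in Kibana's Console:

# One index per day; applications write to the index for the current day
PUT logs-2020.12.01
PUT logs-2020.12.02

# A wildcard pattern searches across all daily indices at once
GET logs-*/_search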
Tackle the challenges of Time Series Data
When we process TSD, we face a number of challenges:
As the picture above shows, in terms of data validity, data loses importance over time, and the query rate for old data drops. We need a different storage strategy for aging data: delete old data, move older data to cheaper storage devices to save money, and make those indices read-only. We can even close or freeze some older indices to save further resources.
In addition, in terms of controlling the size of the index, it is difficult to predict the size of the index. When we started collecting data:
We only have a very small amount of data, and as the amount of data increases:
Eventually, we may see more data being collected:
This poses a challenge in planning the size of the cluster and the number of shards. We need to create indices on a daily or weekly basis to meet our business needs, and we may even delete older index data that is no longer needed. This is what is commonly known as ILM (Index Lifecycle Management).
Data stream
Data streams are a new feature in Elastic Stack 7.9. A data stream lets you store append-only time series data across multiple indices while giving you a single named resource for requests. Data streams are well suited for logs, events, metrics, and other continuously generated data.
You can submit indexing and search requests directly to a data stream. The stream automatically routes requests to the backing indices that store its data. You can use Index Lifecycle Management (ILM) to manage these backing indices automatically. For example, you can use ILM to automatically move older backing indices to cheaper hardware and delete indices that are no longer needed. As your data grows, ILM helps you reduce costs and overhead.
Backing indices
A data stream consists of one or more automatically generated, hidden backing indices.
Each data stream requires a matching index template. The template contains the mappings and settings used to configure the stream's backing indices.
Every document indexed to the data stream must contain an @timestamp field mapped as a date or date_nanos field type. If the index template does not specify a mapping for the @timestamp field, Elasticsearch maps @timestamp as a date field with default options.
The same index template can be used for multiple data streams. You cannot delete the index template that the data stream is using.
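As a sketch (the template and stream names here are illustrative), an index template that declares a data stream and maps @timestamp explicitly might look like this:

PUT _index_template/my-ts-template
{
  "index_patterns": ["my-stream-*"],
  "data_stream": {},
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        }
      }
    }
  }
}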
Read requests
When you submit a read request to a data stream, the stream routes the request to all of its backing indices.
The write index
The most recently created backing index is the data stream's write index. The stream adds new documents to this index only.
You cannot add new documents to any other backing index, even by sending requests directly to that index.
You also cannot perform the following operations on a write index, as they could hinder indexing:
- Clone
- Close
- Delete
- Freeze
- Shrink
- Split
Rollover
When you create a data stream, Elasticsearch automatically creates a backing index for the stream. This index also acts as the stream's first write index. A rollover creates a new backing index that becomes the stream's new write index.
We recommend using ILM to automatically roll over a data stream when the write index reaches a specified age or size. You can also roll over a data stream manually if needed.
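A manual rollover is a single request (the data stream name here is illustrative):

POST my-data-stream/_rollover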
Generation
Each data stream tracks its generation: a six-digit, zero-padded integer that acts as a cumulative count of the stream’s rollovers, starting at 000001.
Backing indices are named using the following convention:
.ds-<data-stream>-<generation>
Backing indices with a higher generation contain more recent data. For example, if the web-server-logs data stream has a generation of 34, the stream’s latest backing index is named .ds-web-server-logs-000034.
Certain operations, such as shrink or restore, can change a backing index’s name. These name changes do not remove the backing index from its data stream.
Append-only
Data streams are designed for use cases that rarely, if ever, update existing data. You cannot send update or delete requests for existing documents directly to a data stream. Instead, use the update by query and delete by query APIs.
If needed, you can update or delete a document by submitting the request directly to the document’s backing index.
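As a sketch of the query-based route (the data stream and field names here are illustrative), a delete by query request against a data stream looks like this:

POST my-data-stream/_delete_by_query
{
  "query": {
    "match": {
      "user.id": "some-user"
    }
  }
}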
Tip: If you frequently update or delete existing documents, use an index alias and an index template instead of a data stream. You can still manage the alias’s indices with ILM.
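A minimal sketch of that alternative (the names are illustrative): create an index whose alias designates it as the write index, so writes to the alias land in it while ILM rotates the underlying indices.

PUT my-index-000001
{
  "aliases": {
    "my-alias": {
      "is_write_index": true
    }
  }
}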
In the rest of this article, we will use a data stream with ILM.
Install the Elastic Stack
In today’s experiment, we will create a two-node Elasticsearch cluster just like the previous tutorial “Elasticsearch: Index Lifecycle Management Primer”.
See the article “Elasticsearch: Using shard filtering to control which node to assign the index to” to learn how to run a two-node cluster. After installing Elasticsearch, open a terminal and run the following command:
./bin/elasticsearch -E node.name=node1 -E node.attr.data=hot -E node.max_local_storage_nodes=2
It will run a node called node1. At the same time, run the following command in another terminal:
./bin/elasticsearch -E node.name=node2 -E node.attr.data=warm -E node.max_local_storage_nodes=2
It runs another node called node2. You can run the following command to check:
GET _cat/nodes?v
It displays the two nodes:
We can check the properties of both nodes with the following command:
GET _cat/nodeattrs?v&s=name
Obviously one of the nodes is hot and the other is warm.
Now we have created our Elasticsearch cluster.
The test environment is as follows:
Create a data stream
In the following hands-on practice, our task is very simple. To automate rollover and manage time-based indices with ILM, we take the following steps:
- Create a Lifecycle Policy
- Create an index template that applies to a data stream
- Create a data stream
- Send data to the data stream and verify that the indices move through the lifecycle phases
As shown above, a typical ILM policy has four phases: Hot, Warm, Cold, and Delete. You can enable whichever phases fit your business needs. For the following exercise, we omit the Cold phase.
Create an Index Lifecycle Policy
Above, I named the policy demo. I also defined the rollover conditions for the Hot phase. When any of the following conditions is satisfied:
- The index size is greater than 1 GB
- The number of documents is greater than 5
- The index spans more than 30 days
the index automatically rolls over to a new index.
We then define Warm Phase:
Above, we enable the Warm phase. In this phase, the data is stored on nodes with the warm tag. Since we only have one warm node, I set the number of replicas to 0 in this exercise. In practice, more replicas mean more read capacity; set this according to your business needs and configuration. I also enable Shrink index, which merges all primary shards into one in the Warm phase. The number of primary shards typically reflects ingest capacity, and in the Warm phase we usually no longer ingest data; data is only ingested on hot nodes.
Let’s define Delete Phase next:
Above, we enable the Delete phase. It indicates that the index will be deleted automatically 3 minutes after entering the Warm phase. Click the Save as new policy button above. We can retrieve the defined policy through the following API:
GET _ilm/policy/demo
{"demo" : {"version" :3, "modified_date" : "2020-12-03T14:33:30.508z ", "policy" : {" Phases" : {"warm" : { "min_age" : "0ms", "actions" : { "allocate" : { "number_of_replicas" : 0, "include" : { }, "exclude" : { }, "require" : { "data" : "warm" } }, "shrink" : { "number_of_shards" : 1 }, "set_priority" : { "priority" : 50 } } }, "hot" : { "min_age" : "0ms", "actions" : { "rollover" : { "max_size" : "1gb", "max_age" : "30d", "max_docs" : 5 }, "set_priority" : { "priority" : 100 } } }, "delete" : { "min_age" : "3m", "actions" : { "delete" : { "delete_searchable_snapshot" : true } } } } } } }Copy the code
The result shown above is from version 7.9.1. For 7.9.3 and 7.10 (the latest release at the time of writing), for delete to work properly we need to define the policy with the following API instead:
PUT _ilm/policy/demo
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_age": "30d",
            "max_size": "1gb",
            "max_docs": 5
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "actions": {
          "allocate": {
            "number_of_replicas": 0,
            "include": {},
            "exclude": {},
            "require": {
              "data": "warm"
            }
          },
          "shrink": {
            "number_of_shards": 1
          },
          "set_priority": {
            "priority": 50
          }
        }
      },
      "delete": {
        "min_age": "3m",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
Notice the delete section above: its actions object contains "delete": {}.
Define the index template
We enter the following command in Kibana’s Console:
# Create the template to apply the policy to every new backing index of the data stream
PUT _index_template/template_demo
{
"index_patterns": ["demo-*"],
"data_stream": {},
"priority": 200,
"template": {
"settings": {
"number_of_shards": 2,
"index.lifecycle.name": "demo",
"index.routing.allocation.require.data": "hot"
}
}
}
Copy the code
Above, we created an index template called template_demo. Note that we defined data_stream as an empty object. Also note that index.routing.allocation.require.data uses the same data attribute (hot/warm) we assigned to the nodes earlier, so new backing indices start on the hot node. We defined two primary shards.
Create a data stream
Creating a data stream is very simple:
# Create a data stream
PUT _data_stream/demo-ds
Run the command above. Since we created an index template with demo-* as its index_patterns, the creation succeeds. If we instead use the following command:
PUT _data_stream/demo
it will fail. The error tells you that there is no matching index template.
# Check the shards allocation
GET _cat/shards/demo-ds?v
It returns:
index              shard prirep state      docs store ip        node
.ds-demo-ds-000001 1     p      STARTED    0    208b  127.0.0.1 node1
.ds-demo-ds-000001 1     r      UNASSIGNED
.ds-demo-ds-000001 0     p      STARTED    0    208b  127.0.0.1 node1
.ds-demo-ds-000001 0     r      UNASSIGNED
Since we configured two primary shards but have only one hot node, the two replica shards shown above are unassigned.
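If you want to confirm why a replica is unassigned, the cluster allocation explain API can help; a minimal sketch:

GET _cluster/allocation/explain
{
  "index": ".ds-demo-ds-000001",
  "shard": 0,
  "primary": false
}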
To check the index of a data stream, run the following command:
# Verify data stream indexes
GET _data_stream/demo-ds
The command above shows:
{
"data_streams" : [
{
"name" : "demo-ds",
"timestamp_field" : {
"name" : "@timestamp"
},
"indices" : [
{
"index_name" : ".ds-demo-ds-000001",
"index_uuid" : "sPN5JEW8SVuTFNh4UqK9zw"
}
],
"generation" : 1,
"status" : "YELLOW",
"template" : "template_demo",
"ilm_policy" : "demo"
}
]
}
It shows that an index named .ds-demo-ds-000001 has been created; we did not have to create it ourselves.
We can also check the setting of the created index by using the following command:
GET .ds-demo-ds-000001/_settings
The command above shows:
{
".ds-demo-ds-000001" : {
"settings" : {
"index" : {
"lifecycle" : {
"name" : "demo"
},
"routing" : {
"allocation" : {
"require" : {
"data" : "hot"
}
}
},
"hidden" : "true",
"number_of_shards" : "2",
"provided_name" : ".ds-demo-ds-000001",
"creation_date" : "1606989315969",
"priority" : "100",
"number_of_replicas" : "1",
"uuid" : "sPN5JEW8SVuTFNh4UqK9zw",
"version" : {
"created" : "7100099"
}
}
}
}
As you can see above, the index is on the hot node, and hidden is true, which means it will not be matched by ordinary wildcard expressions.
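If you do want to list the hidden backing indices, one way (a sketch) is to include hidden indices in the wildcard expansion explicitly:

GET _cat/indices/.ds-demo-ds-*?v&expand_wildcards=all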
Send data to a data stream
We then execute the following command:
PUT _ingest/pipeline/add-timestamp
{
"processors": [
{
"set": {
"field": "@timestamp",
"value": "{{_ingest.timestamp}}"
}
}
]
}
The add-timestamp pipeline above adds an ingest timestamp to each document. Having created the pipeline, we execute the following command:
POST demo-ds/_doc?pipeline=add-timestamp
{
  "user": {
    "id": "liuxg"
  },
  "message": "This is so cool!"
}
We can see the following output:
{
"_index" : ".ds-demo-ds-000001",
"_type" : "_doc",
"_id" : "3JsVKHYBtwZVzHJGZXRc",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
It indicates that our document went into the .ds-demo-ds-000001 index. We can search it with the following command:
# Search the data stream
GET demo-ds/_search
The command above shows the result:
{ "took" : 351, "timed_out" : false, "_shards" : { "total" : 2, "successful" : 2, "skipped" : 0, "failed" : , "hits" : {0} "total" : {" value ": 1, the" base ":" eq "}, "max_score" : 1.0, "hits" : [{" _index ": ". Ds - demo - ds - 000001 ", "_type" : "_doc", "_id" : "3 jsvkhybtwzvzhjgzxrc", "_score" : 1.0, "_source" : {" @ timestamp ": "The 2020-12-03 T10:10:59. 541518 z", "message" : "This is so cool!", "user" : {" id ":" liuxg "}}}}}]Copy the code
It shows that one document was found, and its index name is .ds-demo-ds-000001.
We then execute the following command four more times:
POST demo-ds/_doc?pipeline=add-timestamp
{
  "user": {
    "id": "liuxg"
  },
  "message": "This is so cool!"
}
We should see output similar to the following:
{
"_index" : ".ds-demo-ds-000001",
"_type" : "_doc",
"_id" : "4JueKHYBtwZVzHJGfnRv",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1
}
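As an aside, instead of repeating the POST request we could index several documents in one call with the bulk API; note that a data stream only accepts the create action. A sketch:

POST demo-ds/_bulk?pipeline=add-timestamp
{ "create": {} }
{ "user": { "id": "liuxg" }, "message": "This is so cool!" }
{ "create": {} }
{ "user": { "id": "liuxg" }, "message": "This is so cool!" }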
So far, we have generated five documents, all in the .ds-demo-ds-000001 index. Remember the policy we defined earlier in ILM? When the number of documents exceeds 5, the index automatically rolls over and the previous index moves to the warm node. We then execute the following command:
POST demo-ds/_doc?pipeline=add-timestamp
{
  "user": {
    "id": "liuxg"
  },
  "message": "This is so cool!"
}
This brings the total number of documents to six. We can check the ILM status with the following command:
# Check ILM status per demo-ds data stream
GET demo-ds/_ilm/explain
The command above shows:
{ "indices" : { ".ds-demo-ds-000001" : { "index" : ".ds-demo-ds-000001", "managed" : true, "policy" : "Demo ", "lifecycle_date_millis" : 1606989315969, "age" : "2.86h", "phase" : "hot", "phase_time_millis" : 1606989317057, "action" : "rollover", "action_time_millis" : 1606989760360, "step" : "check-rollover-ready", "step_time_millis" : 1606989760360, "phase_execution" : { "policy" : "demo", "phase_definition" : { "min_age" : "0ms", "actions" : { "rollover" : { "max_size" : "1gb", "max_age" : "30d", "max_docs" : 5 }, "set_priority" : { "priority" : 100 } } }, "version" : 1, "modified_date_in_millis" : 1606988416663}}}}Copy the code
It shows the action as rollover. Let’s wait a while for the rollover to happen. The wait time is determined by the following parameter:
indices.lifecycle.poll_interval
This parameter defaults to 10 minutes, so we would normally need to wait a while. We can change the interval with the following command:
PUT _cluster/settings
{
"transient": {
"indices.lifecycle.poll_interval": "10s"
}
}
Elasticsearch will now run its ILM checks every 10 seconds.
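Keep in mind that this transient setting is only for experimentation; to restore the default afterwards, set the value back to null:

PUT _cluster/settings
{
  "transient": {
    "indices.lifecycle.poll_interval": null
  }
}

With the shorter interval in place, we can execute the following command again: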
GET demo-ds/_ilm/explain
The output from the command above shows:
{ "indices" : { ".ds-demo-ds-000002" : { "index" : ".ds-demo-ds-000002", "managed" : true, "policy" : "Demo ", "lifecycle_date_millis" : 1606999959439, "age" : "5.22m", "phase" : "hot", "phase_time_millis" : 1606999961410, "action" : "rollover", "action_time_millis" : 1607000204426, "step" : "check-rollover-ready", "step_time_millis" : 1607000204426, "phase_execution" : { "policy" : "demo", "phase_definition" : { "min_age" : "0ms", "actions" : { "rollover" : { "max_size" : "1gb", "max_age" : "30d", "max_docs" : 5 }, "set_priority" : { "priority" : 100 } } }, "version" : 1, "modified_date_in_millis" : 1606988416663 } }, ".ds-demo-ds-000001" : { "index" : ".ds-demo-ds-000001", "managed" : true, "policy" : "Demo ", "lifecycle_date_millis" : 1606999959458, "age" : "5.22m", "phase" : "warm", "phase_time_millis" : 1606999962564, "action" : "shrink", "action_time_millis" : 1607000207450, "step" : "shrink", "step_time_millis" : 1607000271877, "phase_execution" : { "policy" : "demo", "phase_definition" : { "min_age" : "0ms", "actions" : { "allocate" : { "number_of_replicas" : 0, "include" : { }, "exclude" : { }, "require" : { "data" : "warm" } }, "shrink" : { "number_of_shards" : 1 }, "set_priority" : { "priority" : 50 } } }, "version" : 1, "modified_date_in_millis" : 1606988416663 } } } }Copy the code
The .ds-demo-ds-000001 index has now entered the Warm phase, accompanied by a shrink action.
We execute the following command again:
GET demo-ds/_search
It shows:
{ "took" : 2, "timed_out" : false, "_shards" : { "total" : 3, "successful" : 3, "skipped" : 0, "failed" : 0 }, "hits" : {" total ": {" value" : 8, "base" : "eq"}, "max_score" : 1.0, "hits" : [{" _index ": "The shrink - ds - demo - ds - 000001", "_type" : "_doc", "_id" : "3 zuekhybtwzvzhjgdhru", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:40:41.812609z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "3pueKHYBtwZVzHJGd3Sd", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:40:42.653342z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "4JueKHYBtwZVzHJGfnRv", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:40:44.399382z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "4puiKHYBtwZVzHJGeHT3", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:45:05.143514z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "45unKHYBtwZVzHJGK3RB", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:50:12.929209z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "3JsVKHYBtwZVzHJGZXRc", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03T10:10:55.541518z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "35ueKHYBtwZVzHJGe3S4", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:40:43.704291z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "4ZuhKHYBtwZVzHJGb3Si", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:43:57.217932z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } } ] } }Copy the code
As shown above, all eight documents are now in the shrink-.ds-demo-ds-000001 index. This is what we defined in the Warm phase: shrink to a single primary shard.
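To double-check, we can look at the shard allocation again; given the setup in this exercise, the shrunken index’s single primary shard should now be on the warm node (node2):

GET _cat/shards/demo-ds?v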
We execute the following command again:
POST demo-ds/_doc?pipeline=add-timestamp
{
  "user": {
    "id": "liuxg"
  },
  "message": "This is so cool!"
}
The command output above shows:
{
"_index" : ".ds-demo-ds-000002",
"_type" : "_doc",
"_id" : "C5uzKHYBtwZVzHJGDHW6",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
From the above we can see that the new document is stored in an index called .ds-demo-ds-000002. This is a new index, different from the previous .ds-demo-ds-000001.
When we perform:
# Search the data stream
GET demo-ds/_search
It will display all nine documents:
{ "took" : 1, "timed_out" : false, "_shards" : { "total" : 3, "successful" : 3, "skipped" : 0, "failed" : 0 }, "hits" : {" total ": {9," value ":" base ":" eq "}, "max_score" : 1.0, "hits" : [{" _index ": "The shrink - ds - demo - ds - 000001", "_type" : "_doc", "_id" : "3 zuekhybtwzvzhjgdhru", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:40:41.812609z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "3pueKHYBtwZVzHJGd3Sd", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:40:42.653342z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "4JueKHYBtwZVzHJGfnRv", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:40:44.399382z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "4puiKHYBtwZVzHJGeHT3", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:45:05.143514z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "45unKHYBtwZVzHJGK3RB", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:50:12.929209z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "3JsVKHYBtwZVzHJGZXRc", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03T10:10:55.541518z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "35ueKHYBtwZVzHJGe3S4", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:40:43.704291z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000001", "_type" : "_doc", "_id" : "4ZuhKHYBtwZVzHJGb3Si", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t12:43:57.217932z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : ".ds-demo-ds-000002", "_type" : "_doc", "_id" : "C5uzKHYBtwZVzHJGDHW6", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03T13:03:11.546240z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } } ] } }Copy the code
They are located in the shrink-.ds-demo-ds-000001 and .ds-demo-ds-000002 indices, respectively.
We can also use the following command to view the current index information of the data stream:
GET _data_stream/demo-ds
{
"data_streams" : [
{
"name" : "demo-ds",
"timestamp_field" : {
"name" : "@timestamp"
},
"indices" : [
{
"index_name" : "shrink-.ds-demo-ds-000001",
"index_uuid" : "SMlpBdzdTSala5hMt0XmpQ"
},
{
"index_name" : ".ds-demo-ds-000002",
"index_uuid" : "1uZk3ug0SfmD-1UUgaeqDw"
}
],
"generation" : 2,
"status" : "YELLOW",
"template" : "template_demo",
"ilm_policy" : "demo"
}
]
}
The data stream now has two backing indices: shrink-.ds-demo-ds-000001 and .ds-demo-ds-000002.
Let’s wait a while and execute the following command:
GET demo-ds/_ilm/explain
The command above shows:
{ "indices" : { "shrink-.ds-demo-ds-000001" : { "index" : "shrink-.ds-demo-ds-000001", "managed" : true, "policy" : "Demo ", "lifecycle_date_millis" : 1606999959458, "age" : "13.83m", "phase" : "delete", "phase_time_millis" : 1607000275318, "action" : "complete", "action_time_millis" : 1607000275157, "step" : "complete", "step_time_millis" : 1607000275318, "phase_execution" : { "policy" : "demo", "phase_definition" : { "min_age" : "3m", "actions" : { } }, "version" : 1, "modified_date_in_millis" : 1606988416663 } }, ".ds-demo-ds-000002" : { "index" : ".ds-demo-ds-000002", "managed" : true, "policy" : "demo", "lifecycle_date_millis" : 1606999959439, "age" : "13.83m", "phase" : "hot", "phase_time_millis" : 1606999961410, "Action" : "rollover", "action_time_millis" : 1607000204426, "step" : "check-rollover-ready", "step_time_millis" : 1607000204426, "phase_execution" : { "policy" : "demo", "phase_definition" : { "min_age" : "0ms", "actions" : { "rollover" : { "max_size" : "1gb", "max_age" : "30d", "max_docs" : 5 }, "set_priority" : { "priority" : 100 } } }, "version" : 1, "modified_date_in_millis" : 1606988416663}}}}Copy the code
We can see that shrink-.ds-demo-ds-000001 is in the delete phase.
This matches the ILM policy we defined earlier: deletion starts after three minutes. We then execute the following command until there are more than 11 documents:
POST demo-ds/_doc?pipeline=add-timestamp
{
  "user": {
    "id": "liuxg"
  },
  "message": "This is so cool!"
}
Again, we use the following command:
# Search the data stream
GET demo-ds/_search
We can see the following results:
{ "took" : 1, "timed_out" : false, "_shards" : { "total" : 3, "successful" : 3, "skipped" : 0, "failed" : 0 }, "hits" : {" total ": {" value" : 6, "base" : "eq"}, "max_score" : 1.0, "hits" : [{" _index ": "The shrink - ds - demo - ds - 000002", "_type" : "_doc", "_id" : "PO34KHYB_Lwe3F0spf66", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03T14:19:12.698241z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000002", "_type" : "_doc", "_id" : "Pe34KHYB_Lwe3F0sp_7i", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t14:19:13.249898z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000002", "_type" : "_doc", "_id" : "Ou34KHYB_Lwe3F0soP6E", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03T14:19:11.364617z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000002", "_type" : "_doc", "_id" : "O-34KHYB_Lwe3F0so_54", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t14:19:12.120209z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000002", "_type" : "_doc", "_id" : "Pu34KHYB_Lwe3F0sqv4S", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t14:19:13.809864z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } }, { "_index" : "shrink-.ds-demo-ds-000002", "_type" : "_doc", "_id" : "P-34KHYB_Lwe3F0srP4N", "_score" : 1.0, "_source" : {"@timestamp" : "2020-12-03t14:19:14.317471z ", "message" : "This is so cool!", "user" : {"id" : "liuxg" } } } ] } }Copy the code
We can see that the earlier shrink-.ds-demo-ds-000001 index is gone: it was deleted after three minutes. We can obtain data stream statistics with the following command:
# Get data stream information
GET _data_stream/demo-ds/_stats
{
"_shards" : {
"total" : 5,
"successful" : 3,
"failed" : 0
},
"data_stream_count" : 1,
"backing_indices" : 2,
"total_store_size_bytes" : 29508,
"data_streams" : [
{
"data_stream" : "demo-ds",
"backing_indices" : 2,
"store_size_bytes" : 29508,
"maximum_timestamp" : 1607000591546
}
]
}
Delete a data stream
We can delete a data stream, together with all of its backing indices, with the following command:
DELETE _data_stream/demo-ds
Another way is to delete it via the Kibana interface:
We can delete the data stream by clicking the button above.