Preface

This article introduces the architecture and implementation of hot-cold data separation in ElasticSearch.

Introduction to the hot-cold separation architecture

Hot-cold separation is a very popular architecture in ES. It plays to the different strengths of the machines in a cluster to schedule and allocate resources sensibly. The indexing and query speed of an ES cluster depends mainly on disk I/O, so the key to separating hot and cold data is to store the hot data on solid-state drives. Using SSDs for everything is too expensive, and keeping cold data on SSDs is wasteful, so mixing ordinary mechanical disks with SSDs makes full use of resources and improves performance significantly. Therefore, real-time data (within 5 days) is stored on the hot nodes and historical data (older than 5 days) on the cold nodes, and ES's own shard allocation features are used to migrate data from the hot nodes to the cold nodes as it ages. Since we build one index per day here, data migration is quite convenient.

Architecture diagram:

A case in point

When using hot-cold separation, we create the index on the hot nodes and migrate it to the cold nodes after a certain period of time. The number of shards therefore needs to be chosen according to the number of hot nodes.

For example, suppose we have 6 hot nodes and 9 cold nodes, and the primary shards of an index hold about 500 GB of data. We create the index with 18 shards, all of them on the hot nodes, so the shard distribution is hot nodes: 18, cold nodes: 0. Once the data is no longer hot, all shards of the index are migrated to the cold nodes, and the distribution becomes hot nodes: 0, cold nodes: 18.

Example of hot and cold node shard distribution for a single index:

Date        Index name       Hot node shards   Cold node shards
20190707    TEST_20190703    18                0
20190708    TEST_20190703    0                 18

The final effect is shown below with screenshots from the cerebro interface.

Cerebro sample diagram: data has been written into the ES index and the shards are allocated on the hot nodes.

After some time, the shard data has been migrated to the cold nodes:

For more ElasticSearch related content, see my post: www.cnblogs.com/xuwujing/p/…

ElasticSearch hot-cold separation architecture implementation

Hot-cold separation in ElasticSearch is implemented with ElasticSearch's shard allocation routing: a routing attribute is set on the data nodes, and when an index is created we specify which nodes its shards are allocated to. After a period of time, according to business requirements, the data of these indices is migrated to other data nodes.

ElasticSearch node configuration

Only the data nodes need to be changed; the configuration of the other node types stays the same. ElasticSearch cluster + Kibana installation tutorial: ElasticSearch cluster + Kibana

Original elasticsearch.yml configuration of an ordinary data node:

cluster.name: pancm
node.name: data1
path.data: /home/elk/datanode/data
path.logs: /home/elk/datanode/logs
network.host: 0.0.0.0
network.publish_host: 192.169.0.23
transport.tcp.port: 9300
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.169.0.23:9301","192.169.0.24:9301","192.169.0.25:9301"]
node.master: false
node.data: true
node.ingest: false
index.number_of_shards: 5
index.number_of_replicas: 1
discovery.zen.minimum_master_nodes: 1
bootstrap.memory_lock: true
http.max_content_length: 1024mb

Compared with an ordinary data node, the main additions are these two configuration items:

node.attr.rack: r1
node.attr.box_type: hot

Hot node configuration example:

cluster.name: pancm
node.name: data1
path.data: /home/elk/datanode/data
path.logs: /home/elk/datanode/logs
network.host: 0.0.0.0
network.publish_host: 192.169.0.23
transport.tcp.port: 9300
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.169.0.23:9301","192.169.0.24:9301","192.169.0.25:9301"]
node.master: false
node.data: true
node.ingest: false
index.number_of_shards: 5
index.number_of_replicas: 1
discovery.zen.minimum_master_nodes: 1
bootstrap.memory_lock: true
http.max_content_length: 1024mb
node.attr.rack: r1
node.attr.box_type: hot

The cold node configuration is basically the same; only the values of these two items change:

node.attr.rack: r9
node.attr.box_type: cool

Cold node configuration example:

cluster.name: pancm
node.name: data1
path.data: /home/elk/datanode/data
path.logs: /home/elk/datanode/logs
network.host: 0.0.0.0
network.publish_host: 192.169.0.23
transport.tcp.port: 9300
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.169.0.23:9301","192.169.0.24:9301","192.169.0.25:9301"]
node.master: false
node.data: true
node.ingest: false
index.number_of_shards: 5
index.number_of_replicas: 1
discovery.zen.minimum_master_nodes: 1
bootstrap.memory_lock: true
http.max_content_length: 1024mb
node.attr.rack: r9
node.attr.box_type: cool

ElasticSearch index settings

When creating an index, we need to specify which nodes its shards belong on; if nothing is specified, ElasticSearch distributes the shards evenly across the data nodes by default. Here we create the index on the hot nodes by default, and then, once the business conditions are met, use a command or code to move the index onto the cold nodes.

Index example:

PUT TEST_20190717
{
  "index": "TEST_20190717",
  "settings": {
    "number_of_shards": 18,
    "number_of_replicas": 1,
    "refresh_interval": "10s",
    "index.routing.allocation.require.box_type": "hot"
  },
  "mappings": {
    "mt_task_hh": {
      "properties": {
        "accttype": {
          "type": "byte"
        },
        ....
      }
    }
  }
}
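For reference, here is a minimal Java sketch of creating such a daily index on the hot nodes through the same low-level REST client used in the migration code later in this article. The class name, the createHotIndex method and the client field are illustrative assumptions, and the mappings are omitted for brevity:

import java.io.IOException;
import java.util.Collections;

import org.apache.http.HttpEntity;
import org.apache.http.entity.ContentType;
import org.apache.http.nio.entity.NStringEntity;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class HotIndexCreator {

    // Assumed to be initialized elsewhere, as in the migration code below.
    private static RestHighLevelClient client;

    /**
     * Creates a daily index whose shards are required to be allocated on
     * hot nodes (box_type: hot), mirroring the DSL example above.
     */
    public static void createHotIndex(String index, int shards, int replicas) throws IOException {
        String source = "{"
                + "\"settings\": {"
                + "\"number_of_shards\": " + shards + ","
                + "\"number_of_replicas\": " + replicas + ","
                + "\"refresh_interval\": \"10s\","
                + "\"index.routing.allocation.require.box_type\": \"hot\""
                + "}}";
        HttpEntity entity = new NStringEntity(source, ContentType.APPLICATION_JSON);
        RestClient restClient = client.getLowLevelClient();
        // PUT /<index> creates the index with the hot routing requirement.
        restClient.performRequest("PUT", "/" + index, Collections.<String, String>emptyMap(), entity);
    }
}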

Cold node settings for the index

Depending on business requirements, we migrate the index data either with a DSL statement executed in Kibana or with Java code.

The DSL statement:

PUT TEST_20190717/_settings
{
  
    "index.routing.allocation.require.box_type":"cool"
  
}
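After this setting is applied, ES starts relocating the shards in the background. As an optional check (not part of the original article's code), you can list the index's shards and the nodes holding them via the cat shards API; a minimal Java sketch using the same low-level client:

import java.io.IOException;
import java.util.Collections;

import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class ShardChecker {

    /**
     * Prints the node assignment of every shard of the given index.
     * Once the migration has finished, all rows should show cold nodes.
     */
    public static void printShardAllocation(RestClient restClient, String index) throws IOException {
        // GET /_cat/shards/<index> lists each shard together with the node that holds it.
        Response response = restClient.performRequest(
                "GET", "/_cat/shards/" + index, Collections.singletonMap("v", "true"));
        System.out.println(EntityUtils.toString(response.getEntity()));
    }
}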

Java code implementation:

public static void setCool(String index) throws IOException {
    RestClient restClient = null;
    try {
        Objects.requireNonNull(index, "index is not null");
        restClient = client.getLowLevelClient();
        // Require the index's shards to be allocated only on cool nodes.
        String source = "{\"index.routing.allocation.require.box_type\": \"%s\"}";
        source = String.format(source, "cool");
        HttpEntity entity = new NStringEntity(source, ContentType.APPLICATION_JSON);
        restClient.performRequest("PUT", "/" + index + "/_settings",
                Collections.<String, String>emptyMap(), entity);
    } catch (IOException e) {
        throw e;
    } finally {
        if (restClient != null) {
            restClient.close();
        }
    }
}
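As an illustration of how this could be tied to the 5-day rule mentioned earlier, the following sketch computes the name of the daily index that has just left the hot window and moves it to the cold nodes. The TEST_yyyyMMdd naming pattern and the coolDownExpiredIndex method are assumptions for illustration, not part of the linked repository:

import java.io.IOException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Assuming this method lives in the same class as setCool(...) above.
public static void coolDownExpiredIndex() throws IOException {
    DateTimeFormatter day = DateTimeFormatter.ofPattern("yyyyMMdd");
    // The daily index that has just aged out of the 5-day hot window,
    // e.g. TEST_20190712 when this runs on 20190717.
    String index = "TEST_" + LocalDate.now().minusDays(5).format(day);
    setCool(index);
}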

Full code address: github.com/xuwujing/ja…

Other

In fact, the first draft of this article should have been finished back in 2019, but it was delayed by other things. Fortunately I found it while going through my drafts, so I am making it up now. Because of the long gap, the details are not as thorough as in the earlier articles, but at least it is finally written; in the future I will try to finish articles sooner, otherwise I start to forget the details. So far I have written 10 installments of the ElasticSearch series. Although the intervals between them are long, I will keep updating this series and try to write down what I have learned and understood. If anything is wrong, I hope you will point it out so we can discuss it.

ElasticSearch Combat Series

  • Kibana for ElasticSearch
  • ElasticSearch DSL statement for ElasticSearch
  • ElasticSearch: JAVA API for ElasticSearch
  • ElasticSearch: ElasticSearch
  • Metric Aggregations for ElasticSearch
  • ElasticSearch: Logstash Quick start
  • ElasticSearch: Logstash: ElasticSearch
  • Filebeat for ElasticSearch
  • Install the ELK log system for ElasticSearch


Original writing is not easy. If you found this helpful, please give it a recommendation! Your support is the biggest motivation for my writing! Copyright notice: www.cnblogs.com/xuwujing CSDN: blog.csdn.net/qazwsxpcm Juejin: juejin.cn/user/365003… Personal blog: www.panchengming.com