Preface

As we learned earlier, the master node has the following responsibilities:

  • Deciding which node each shard should be allocated to.
  • Moving shards between nodes to keep the cluster balanced, and so on.

Shard allocation – cluster-level configuration

Shard allocation is the process of assigning shards to nodes. It is triggered in the following scenarios:

  • Initial recovery
  • Replica allocation
  • Cluster rebalancing
  • A node being added to or removed from the cluster

Shard distribution has a significant impact on the whole ES cluster, so knowing how to control it is important. The cluster setting cluster.routing.allocation.enable controls which types of shards the cluster is allowed to allocate: all shards (all, the default), primary shards only (primaries), primary shards of newly created indices only (new_primaries), or no shards at all (none).

You might be wondering: when would you actually use this setting?

Every setting exists for a reason. This one is useful in cluster operations such as rolling restarts and maintenance, where you want to hold off shard reallocation. For example, when you manually shut down a node, the cluster notices the node is gone after a fixed time window and starts rebalancing data. With many shards holding large amounts of data, that rebalancing is quite time-consuming. In such cases you can disable shard allocation at the cluster level, carry out the maintenance, and then set it back.
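A minimal sketch of that workflow, assuming a cluster reachable at localhost:9200 (some versions recommend "primaries" rather than "none" for rolling restarts):

```
# before maintenance: stop shard allocation at the cluster level
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.enable": "none"
  }
}'

# ... perform the maintenance / restart the node ...

# after maintenance: reset the setting to its default (all)
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.enable": null
  }
}'
```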

Summary: the index setting below can be used to delay the reallocation of replica shards when you know a failed node will recover quickly.

index.unassigned.node_left.delayed_timeout

This gives your cluster time to detect whether the node will rejoin before triggering redistribution.
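For instance, the delay could be raised to five minutes for all indices through the index settings API (5m is only an illustrative value; the default is 1m):

```
# delay replica reallocation for 5 minutes after a node leaves the cluster
curl -X PUT "localhost:9200/_all/_settings" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m"
  }
}'
```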

  • Delayed allocation does not prevent replicas from being promoted to primaries. The cluster immediately performs whatever promotions are needed to bring itself back to yellow status. The only thing this setting delays is the rebuilding of the missing replicas; in other words, delayed_timeout only applies to replica reconstruction.

  • One thing not mentioned in the figure: if the node that went down was the master node, the cluster takes an extra step, because a new master must be elected from the master-eligible nodes (node.master: true) before the cluster can operate normally.

Preventing multiple instances on a single host

When a host has spare CPU and memory, it is tempting to start multiple instances on it. This makes fuller use of the hardware, but it also introduces risk. Take two instances on one host as an example.

As shown in the figure, two ES instances, Node-1 and Node-2, are deployed on 192.161.11.12.

Shard 1 and its replica are allocated to Node-1 and Node-2 respectively.

In this case availability is low: if the host goes down, all data in shard 1 of that index is lost.

How can this be avoided? You can, of course, simply avoid starting multiple instances on one host, or you can set cluster.routing.allocation.same_shard.host: true to forcibly prevent a shard and its replica from ending up on the same host.
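A minimal sketch of applying that setting dynamically, assuming a cluster at localhost:9200 (it can also be placed in elasticsearch.yml):

```
# forbid allocating copies of the same shard to different nodes on the same host
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.same_shard.host": true
  }
}'
```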

Shard allocation – based on disk awareness

Besides overall cluster balance, Elasticsearch also takes environmental factors such as available disk space into account when allocating shards.

  • cluster.routing.allocation.disk.watermark.low is the low watermark for disk usage and defaults to 85%. ES will not allocate shards to nodes whose disk usage exceeds this value. The low watermark has no effect on the primary shards of newly created indices, but it does prevent their replicas from being allocated.

Similarly, cluster.routing.allocation.disk.watermark.high defaults to 90%: Elasticsearch will try to relocate shards away from nodes whose disk usage exceeds this value. This setting also affects cluster balancing.

The last line of defense against a node running completely out of disk space is cluster.routing.allocation.disk.watermark.flood_stage (default: 95%); once a node crosses it, Elasticsearch enforces a read-only block (index.blocks.read_only_allow_delete) on every index that has a shard on that node.
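All three watermarks are dynamic cluster settings; adjusting them might look like the sketch below, where the values shown simply restate the defaults:

```
# adjust the disk watermarks at the cluster level (values shown are the defaults)
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "95%"
  }
}'
```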

Rack awareness

Rack awareness allows Elasticsearch to take the physical hardware layout into account when allocating shards. The benefit: if a hardware failure takes out a whole physical server, a rack, or an equipment room, awareness of the physical layout lets ES spread shard copies in a way that maximizes cluster availability.

However, a plain deployment does not know the physical layout by itself, so we have to tell ES through configuration. Two pieces of configuration are involved (see the sketch after the list below):

  • ./bin/elasticsearch -Enode.attr.rack_id=rack_one, which tags the node with a rack attribute at startup (it can equally be set in elasticsearch.yml).

  • cluster.routing.allocation.awareness.attributes: rack_id, specified in the configuration file, which tells the cluster to use that attribute when allocating shards.

With these attributes, ES can take the physical layout into account, for example:

  • Different nodes deployed on different physical machines.
  • Different nodes placed in different physical racks.
  • Different nodes assigned to different network zones.
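A minimal sketch of wiring the two pieces together, assuming a node that physically sits in a rack we label rack_one and a cluster reachable at localhost:9200:

```
# tag this node with its rack at startup
# (could also go in elasticsearch.yml as node.attr.rack_id: rack_one)
./bin/elasticsearch -Enode.attr.rack_id=rack_one

# tell the cluster to spread shard copies across values of that attribute
# (dynamic setting, so the settings API works as well as elasticsearch.yml)
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "rack_id"
  }
}'
```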

CAP theory

Rack awareness is something that must be weighed carefully (against reliability, availability, bandwidth consumption, and so on). Since Elasticsearch is a distributed system, the P (partition tolerance) in CAP has to be considered. In plain terms: under normal circumstances every node in an ES cluster can reach every other node, but environmental or human factors can cause node failures and split the network into several partitions. If a piece of data is stored on only one node and that node becomes unreachable, the data can no longer be accessed; such a system is not partition tolerant.

Shard allocation filtering

Shard allocation filtering allows certain nodes or groups of nodes to be excluded from allocation so that they can be decommissioned. A typical scenario: when we plan to take a node offline, we can keep new index shards from landing on it.
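For example, a node could be excluded by IP address (reusing the host address from the earlier example; _name and _host filters also exist):

```
# stop allocating shards to this node and move its existing shards elsewhere
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.exclude._ip": "192.161.11.12"
  }
}'
```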

Conclusion
  • The smallest unit of work in Elasticsearch is the shard, so it is important to understand how shards are allocated across nodes.
  • Shard allocation based on disk awareness.
  • Shard allocation based on rack awareness.
  • A brief look at P (partition tolerance) in CAP theory.
  • Excluding nodes from shard allocation so they can be taken offline flexibly.

Welcome to follow the public account [Notes on the development of Xia Dream]; let's make progress together.