You can use custom node attributes as awareness attributes so that Elasticsearch takes your physical hardware configuration into account when allocating shards. If Elasticsearch knows which nodes are on the same physical server, in the same rack, or in the same region, it can distribute each primary shard and its replica shards to minimize the risk of losing all shard copies in the event of a failure.

When shard allocation awareness is enabled with the cluster.routing.allocation.awareness.attributes setting, shards are only allocated to nodes that have a value set for the specified awareness attributes. If you use more than one awareness attribute, Elasticsearch considers each attribute separately when allocating shards.

Allocation awareness settings can be configured in elasticsearch.yml and updated dynamically with the cluster update settings API.

By default, Elasticsearch uses adaptive replica selection to route search or GET requests. However, when allocation awareness attributes are present, Elasticsearch prefers to process these requests with shards in the same location (with the same awareness attribute values). You can disable this behavior by specifying the system property export ES_JAVA_OPTS="$ES_JAVA_OPTS -Des.search.ignore_awareness_attributes=true" on every node of the cluster.

 

Unbalanced shard distribution

Suppose your hardware is distributed across two different physical racks:

 

Here the shards of my_index are spread across two physical racks, rack1 and rack2: P0 and R0 are both on rack1, while P1 and R1 are both on rack2. If either rack1 or rack2 becomes unavailable because of some failure, my_index becomes unavailable as well, because every copy of shard 0 or of shard 1 is gone.
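To make the risk concrete, here is a small Python sketch of the layout described above. It is only a toy model: the dictionary and the helper function shards_lost_if_rack_fails are made up for illustration and are not part of any Elasticsearch API.

```python
# Toy model of the unbalanced layout; not an Elasticsearch API.
# Each rack maps to the shard copies it holds:
# "P0" is the primary of shard 0, "R0" is the replica of shard 0.
unbalanced = {
    "rack1": ["P0", "R0"],
    "rack2": ["P1", "R1"],
}


def shards_lost_if_rack_fails(layout, failed_rack):
    """Return the shard numbers that have no surviving copy."""
    all_shards = {int(copy[1:]) for copies in layout.values() for copy in copies}
    surviving = {
        int(copy[1:])
        for rack, copies in layout.items()
        if rack != failed_rack
        for copy in copies
    }
    return sorted(all_shards - surviving)


for rack in unbalanced:
    print(rack, "fails -> shards lost:", shards_lost_if_rack_fails(unbalanced, rack))
```

Whichever rack fails, one shard loses both of its copies, which is exactly the situation we want to avoid.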

 

To avoid this, we can tell Elasticsearch about the physical layout of our hardware. This is called shard allocation awareness in Elasticsearch. It is very useful when multiple Elasticsearch nodes share the same resource: disk, host machine, network switch, rack, etc.

We can do this through the following two steps:

  1. Label our nodes
  2. Update our cluster configuration

 

Step 1: label the nodes

We can use node.attr to tag our nodes.

  • Node attributes can be any name and value you like

  • You can set them with the -E argument on the command line, for example

    ./bin/elasticsearch -Enode.attr.my_rack_id=rack1
  • Or you can define them directly in elasticsearch.yml, for example with the line node.attr.my_rack_id: rack1

Step 2: configure the cluster

We must tell Elasticsearch which attribute or attributes to use for shard allocation awareness. We do this with the cluster-level setting cluster.routing.allocation.awareness.attributes:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "my_rack_id"
  }
}
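If you would rather script this call than run it in a console, the same request can be sketched with the Python standard library. This is only an illustration: the address http://localhost:9200, the absence of authentication, and the helper names build_awareness_request and apply_awareness are all assumptions, not something taken from the Elasticsearch documentation.

```python
import json
import urllib.request


def build_awareness_request(attributes, base_url="http://localhost:9200"):
    """Build (but do not send) the PUT _cluster/settings request."""
    body = {
        "persistent": {
            # Several awareness attributes are passed as one comma-separated string.
            "cluster.routing.allocation.awareness.attributes": ",".join(attributes)
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def apply_awareness(attributes, base_url="http://localhost:9200"):
    """Send the request; needs a reachable cluster at base_url (assumed, no auth)."""
    with urllib.request.urlopen(build_awareness_request(attributes, base_url)) as resp:
        return json.loads(resp.read())


# Only build and inspect the request here; nothing is sent.
request = build_awareness_request(["my_rack_id"])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Building the request separately from sending it makes it easy to check the payload before anything touches the cluster.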

Balanced shard distribution

After completing steps 1 and 2 above, Elasticsearch spreads the copies of each shard across the racks, so with two racks and one replica every rack holds a copy of every shard of each index.

If either rack1 or rack2 fails, we can be sure that access to our data is uninterrupted. Of course, if both racks fail at the same time, there is nothing we can do.
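The idea of rack-aware placement can be sketched in a few lines of Python. This is a deliberately simplified toy, not Elasticsearch's real allocation algorithm: the node names, the nodes mapping, and the function place_copies are hypothetical, and it only handles one replica per shard.

```python
# Hypothetical node -> rack mapping, mirroring node.attr.my_rack_id.
nodes = {"node1": "rack1", "node2": "rack1", "node3": "rack2", "node4": "rack2"}


def place_copies(nodes, shard_count):
    """Place one primary and one replica per shard on nodes in different racks."""
    by_rack = {}
    for node, rack in nodes.items():
        by_rack.setdefault(rack, []).append(node)
    if len(by_rack) < 2:
        raise ValueError("need at least two racks to separate the copies")
    racks = sorted(by_rack)
    placement = {}
    for shard in range(shard_count):
        # Consecutive rack indexes always differ when there are two or more racks.
        primary_rack = racks[shard % len(racks)]
        replica_rack = racks[(shard + 1) % len(racks)]
        primary = by_rack[primary_rack][shard % len(by_rack[primary_rack])]
        replica = by_rack[replica_rack][shard % len(by_rack[replica_rack])]
        placement[shard] = {"primary": primary, "replica": replica}
    return placement


placement = place_copies(nodes, shard_count=2)
for shard, copies in placement.items():
    print(shard, copies, "->", nodes[copies["primary"]], "/", nodes[copies["replica"]])
```

Because the two copies of a shard never share a rack, losing one rack always leaves one copy of every shard available.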

 

Forced awareness

By default, if one location fails, Elasticsearch allocates all of the missing copies to the remaining locations. Although you may have enough resources across all locations to hold your primary and replica shards, a single location may not be able to hold all of them.

To prevent a single location from being overloaded when a failure occurs, you can set cluster.routing.allocation.awareness.force so that no replicas are allocated until nodes are available in another location.

For example, if you have an awareness attribute named my_rack_id and have nodes configured in rack1 and rack2, you can use forced awareness to prevent Elasticsearch from allocating replicas if only one rack is available:

cluster.routing.allocation.awareness.attributes: "my_rack_id"
cluster.routing.allocation.awareness.force.my_rack_id.values: "rack1,rack2"

With this example configuration, if you start two nodes with node.attr.my_rack_id set to rack1 and create an index with 5 shards and 1 replica, Elasticsearch creates the index and allocates the 5 primary shards but no replicas. Replicas are only allocated once nodes with node.attr.my_rack_id set to rack2 are available.
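The arithmetic behind this example can be sketched in Python. This is a simplified model of the behaviour described above, not Elasticsearch's actual code: the function allocate_with_forced_awareness is hypothetical, it assumes that each rack may hold at most ceil(copies / expected values) copies of a shard, and it assumes every available rack has enough nodes to hold its share.

```python
import math


def allocate_with_forced_awareness(available_racks, forced_values, shards, replicas):
    """Return (assigned, unassigned) copy counts under a simplified rule.

    Assumption: each rack may hold at most
    ceil(copies per shard / number of expected attribute values) copies of a shard.
    """
    copies_per_shard = 1 + replicas
    available = set(available_racks)
    # Forced values count as expected locations even when no node is there yet.
    expected = set(forced_values) | available
    max_per_rack = math.ceil(copies_per_shard / len(expected))
    assigned_per_shard = min(copies_per_shard, max_per_rack * len(available))
    assigned = assigned_per_shard * shards
    unassigned = (copies_per_shard - assigned_per_shard) * shards
    return assigned, unassigned


# Both running nodes are in rack1; rack2 is expected but not available yet.
print(allocate_with_forced_awareness(["rack1"], ["rack1", "rack2"], shards=5, replicas=1))
# Later, nodes in rack2 join the cluster.
print(allocate_with_forced_awareness(["rack1", "rack2"], ["rack1", "rack2"], shards=5, replicas=1))
```

In this model the first call leaves the 5 replicas unassigned, and the second call assigns all 10 copies once rack2 is available, which matches the scenario above.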

Reference:

[1] www.elastic.co/guide/en/el…