
1. ElasticSearch node types

This article covers two types of ElasticSearch nodes: master nodes and data nodes.

1. The Master node

When a node starts, it uses the discovery mechanism to find the other nodes in the cluster and connect to them. The cluster then elects a master node from the candidate (master-eligible) nodes.

Seed hosts used for discovery: discovery.seed_hosts: ["s201", "s202", "s203"]

Candidate master nodes: cluster.initial_master_nodes: ["MOE-ES-node1", "MOE-ES-node2", "MOE-ES-node3"]

Main responsibilities of the Master node

  • Managing indexes: creating indexes, deleting indexes, and allocating shards
  • Maintaining metadata
  • Managing the status of cluster nodes
  • It is not responsible for writing or querying data

2. The data node

An ElasticSearch cluster can contain any number of data nodes.

The data node is responsible for

  • Writing data
  • Retrieving data

Most of ElasticSearch's load falls on the data nodes, so in a production environment they should be given more memory.

2. Sharding and replica mechanisms

1. Shards

ElasticSearch is a distributed search engine. The data of an index is split into several parts that are distributed across different server nodes, and each of these parts is called a shard. ElasticSearch manages shards automatically and migrates them between nodes when they become unbalanced.

An index consists of multiple shards, which are distributed across different server nodes.

2. Replicas

To ensure high availability and fault tolerance for shards, ElasticSearch introduces a replica mechanism: each shard can have replica copies.

Each shard consists of one primary shard and several replica shards.

A primary shard and its replica shards are placed on different server nodes.
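The placement rule above can be illustrated with a small Python sketch. This is only a toy round-robin layout under assumptions of my own (the `allocate` function, the `P0`/`R0` labels and the node names are hypothetical), not ElasticSearch's real allocation algorithm. It only shows the one property described here: a replica never lands on the same node as its primary.

```python
from collections import defaultdict


def allocate(num_primaries: int, num_replicas: int, nodes: list[str]) -> dict[str, list[str]]:
    """Toy layout: copies of the same shard never share a node."""
    if num_replicas + 1 > len(nodes):
        # Simplification: this sketch just refuses layouts it cannot satisfy.
        raise ValueError("need at least one node per copy of a shard")
    layout = defaultdict(list)
    for shard in range(num_primaries):
        # Copy 0 is the primary; copies 1..num_replicas are its replicas.
        for copy in range(num_replicas + 1):
            node = nodes[(shard + copy) % len(nodes)]
            label = f"P{shard}" if copy == 0 else f"R{shard}"
            layout[node].append(label)
    return dict(layout)


layout = allocate(num_primaries=3, num_replicas=1, nodes=["node1", "node2", "node3"])
for node, shards in sorted(layout.items()):
    print(node, shards)
# node1 ['P0', 'R2']
# node2 ['R0', 'P1']
# node3 ['R1', 'P2']
```

If any single node is lost in this layout, every shard still has one copy on a surviving node, which is the point of keeping primaries and replicas apart.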

3. Specifying shards and replicas when creating an index

Example: create an index with a specified number of shards and replicas

PUT /moe_article
{
  "mappings": {
    "properties": {}
  },
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 2
  }
}

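Suppose the cluster has three server nodes.

number_of_shards: 3 means the index has three primary shards.

number_of_replicas: 2 means each primary shard has two replica shards, so the index has 3 × (1 + 2) = 9 shard copies in total.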
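To make the arithmetic behind these two settings concrete, here is a minimal Python sketch. The helper name `total_shard_copies` is my own and is not part of any ElasticSearch API. It assumes the three-node cluster from the example.

```python
def total_shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    # Each primary shard exists once, plus number_of_replicas replica copies of it.
    return number_of_shards * (1 + number_of_replicas)


total = total_shard_copies(number_of_shards=3, number_of_replicas=2)
print(total)       # 9 shard copies for the whole index
print(total // 3)  # 3 shard copies per node if they are spread evenly over 3 nodes
```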

3. Important ElasticSearch workflows

1. How ElasticSearch writes documents

  1. The client picks a node to send the request to. If Node2 (a data node) is selected, Node2 becomes the coordinating node for this request.
  2. The coordinating node calculates which shard the document is written to: shard = hash(routing) % number_of_primary_shards. The routing value is variable; by default it is the document's _id.
  3. The coordinating node forwards the request to the data node that holds the corresponding primary shard (Node1, if the primary shard is on Node1).
  4. The primary shard on Node1 processes the request, writes the data, and synchronizes it to its replica shards.
  5. After the primary shard and the replica shards have saved the data, the result is returned to the client.
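The routing formula in step 2 can be tried out with a short Python sketch. Note the assumption: CRC32 is used here only as a stand-in hash because it ships with the standard library and is stable across runs, while ElasticSearch itself uses a Murmur3 hash, so the shard numbers printed below will not match a real cluster. The function name `pick_shard` and the document ids are hypothetical.

```python
import zlib


def pick_shard(routing: str, number_of_primary_shards: int) -> int:
    # shard = hash(routing) % number_of_primary_shards
    # Stand-in hash: CRC32 (deterministic, unlike Python's per-process str hash()).
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


# By default the routing value is the document's _id.
for doc_id in ["article-1", "article-2", "article-3"]:
    print(doc_id, "-> shard", pick_shard(doc_id, 3))
```

Because the result depends on number_of_primary_shards, the same _id would map to a different shard if that number changed, which is why the number of primary shards is fixed when the index is created.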

2. How ElasticSearch searches documents

  1. The client picks a node to send the request to. If Node2 is selected, Node2 becomes the coordinating node.
  2. The coordinating node broadcasts the query to every data node, and the shards on those nodes process it.
  3. Each shard runs the query, places the matching documents in a priority queue, and returns their document IDs, node information, and shard information to the coordinating node.
  4. The coordinating node merges all the results and sorts them globally.
  5. The coordinating node sends a GET request to the shards that contain these document IDs; those shards return the document data to the coordinating node, which returns it to the client.
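The two phases above (query first, then fetch) can be sketched in plain Python. Everything here is a simplified assumption for illustration: the `SHARDS` dictionary, the pre-assigned scores and the function names are hypothetical, and no real query matching or network calls take place.

```python
import heapq

# Hypothetical in-memory shards: doc id -> (relevance score, document body).
SHARDS = {
    0: {"a": (0.9, "doc a"), "b": (0.2, "doc b")},
    1: {"c": (0.7, "doc c"), "d": (0.4, "doc d")},
    2: {"e": (0.8, "doc e")},
}


def query_phase(shard_id: int, size: int) -> list[tuple[float, str, int]]:
    """Each shard returns only (score, doc id, shard id) for its local top hits."""
    hits = [(score, doc_id, shard_id) for doc_id, (score, _) in SHARDS[shard_id].items()]
    return heapq.nlargest(size, hits)


def search(size: int) -> list[str]:
    # Steps 2-3: broadcast the query and collect each shard's local top hits.
    candidates = [hit for shard_id in SHARDS for hit in query_phase(shard_id, size)]
    # Step 4: merge and sort globally on the coordinating node.
    top = heapq.nlargest(size, candidates)
    # Step 5: fetch document bodies only for the global winners.
    return [SHARDS[shard_id][doc_id][1] for _, doc_id, shard_id in top]


print(search(size=3))  # ['doc a', 'doc e', 'doc c']
```

Returning only IDs and scores in the first phase keeps the data sent to the coordinating node small; full documents are transferred only for the hits that survive the global sort.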

4. Summary

ElasticSearch nodes are divided into master nodes and data nodes, and shards into primary shards and replica shards. Sharding lets an index hold massive amounts of data, which solves the capacity problem of a single disk, while replicas help ensure that data is not lost.

Follow the WeChat official account (MarkZoe) to learn and exchange ideas together.