I. Cluster management
1.1 Single Machine & Cluster
- A single Elasticsearch server has a finite load capacity. If the load exceeds that threshold, performance degrades sharply or the server becomes unavailable altogether. In a production environment, Elasticsearch therefore usually runs on a server cluster
- Besides load capacity, a single-server deployment has other problems
- A single machine has limited storage capacity
- A single server is a single point of failure and cannot provide high availability
- A single server supports only limited concurrency
- When configuring a server cluster, there is no limit on the number of nodes; two or more nodes can already be regarded as a cluster
- In practice, a cluster usually has three or more nodes for high performance and high availability.
1.2 Cluster
- A cluster is an organization of one or more server nodes that together hold the entire data set and provide indexing and search capabilities
- An Elasticsearch cluster is identified by a unique name, which defaults to "elasticsearch"
- This name is important; a node can only join a cluster by specifying that cluster's name
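- As a minimal sketch (the value my-cluster is purely illustrative), the cluster name is set in config/elasticsearch.yml:
# Every node that should join this cluster must use the same cluster name
cluster.name: my-cluster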
1.3 Node
- A cluster contains one or more servers, and a node is one of those servers
- As part of a cluster, it stores data and participates in the cluster’s indexing and search capabilities.
- A node is also identified by a name, which by default is the name of a random Marvel comic character assigned at startup. The name matters for administration, because it is how you determine which servers in the network correspond to which nodes in the Elasticsearch cluster
- By default, each node is set up to join a cluster named "elasticsearch". This means that if you start several nodes on your network and they can discover each other, they automatically form and join a cluster called "elasticsearch"
- An Elasticsearch cluster can contain as many nodes as you want. If no Elasticsearch node is currently running on your network, starting one creates and joins a single-node cluster named "elasticsearch" by default
- Cluster health status (see the example requests after this list)
- Green: all primary shards and replica shards are allocated and working normally
- Yellow: all primary shards are normal, but at least one replica shard is not
- Red: at least one primary shard is not working properly
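- Two quick ways to inspect this from the _cat API (a sketch; the requests can be sent to any node in the cluster): GET _cat/nodes?v lists every node, and the entry marked * in the master column is the elected master; GET _cat/health?v prints a one-line summary that includes the green/yellow/red status
GET _cat/nodes?v
GET _cat/health?v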
2. Windows cluster
- Make copies of the ES installation directory and name each copy after its port number, as follows
- Node layout: this test cluster uses one master node and two data nodes
- 9200: master node
- 9201: data node
- 9202: data node
- The cluster is configured incrementally on top of the single-node setup; for the prerequisite configuration, such as installing the JDK, see the single-node installation
2.1 Master Node
- Modify the config/elasticsearch.yml file and add the following configuration
# Cluster node 1 configuration
# The cluster name must be the same on all nodes
cluster.name: elasticsearch
# The node name must be unique within the cluster
node.name: node-9200
# Master node of the cluster
node.master: true
# Data node
node.data: true
# IP address
network.host: localhost
# HTTP port
http.port: 9200
# TCP listening port
transport.tcp.port: 9300
# Seed host list that new nodes use to join the cluster. Note that the ports are TCP transport ports; list multiple entries when there are multiple master-eligible nodes
#discovery.seed_hosts: ["localhost:9300", "localhost:9301", "localhost:9302"]
#discovery.zen.fd.ping_timeout: 1m
#discovery.zen.fd.ping_retries: 5
# List of nodes in the cluster that can be elected as the master node
#cluster.initial_master_nodes: ["node-9200", "node-9201", "node-9202"]
# Cross-origin (CORS) configuration
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"
- Start the master node and check that it starts up normally
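- A minimal sanity check (a sketch; a browser or any HTTP client works): request the node's root endpoint, which returns the node name, cluster name, and version
GET http://localhost:9200/
- The response should include "name" : "node-9200" and "cluster_name" : "elasticsearch"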
2.2 Data Nodes
- The configuration of the two data nodes is identical; only the HTTP and transport ports differ
- Modify the config/elasticsearch.yml file and add the following configuration
- Note: node.master: false makes the node a data-only node; set it to true if the node should be master-eligible
- Note: discovery.seed_hosts must be configured with the list of master nodes
- Configure elasticsearch-9201
# Cluster node 2 configuration
# The cluster name must be the same on all nodes
cluster.name: elasticsearch
# The node name must be unique within the cluster
node.name: node-9201
# Master node of the cluster
node.master: false
# Data node
node.data: true
# IP address
network.host: localhost
# HTTP port
http.port: 9201
# TCP listening port
transport.tcp.port: 9301
# Seed host list that new nodes use to join the cluster
discovery.seed_hosts: ["localhost:9300"]
discovery.zen.fd.ping_timeout: 1m
discovery.zen.fd.ping_retries: 5
# List of nodes in the cluster that can be elected as the master node
#cluster.initial_master_nodes: ["node-9200", "node-9201", "node-9202"]
# Cross-origin (CORS) configuration
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"
- Configure elasticsearch-9202
# Cluster node 3 configuration
# The cluster name must be the same on all nodes
cluster.name: elasticsearch
# The node name must be unique within the cluster
node.name: node-9202
# Master node of the cluster (false: this node is a data-only node)
node.master: false
# Data node
node.data: true
# IP address
network.host: localhost
# HTTP port
http.port: 9202
# TCP listening port
transport.tcp.port: 9302
# Seed host list that new nodes use to join the cluster. Note that the port is the TCP transport port
discovery.seed_hosts: ["localhost:9300"]
discovery.zen.fd.ping_timeout: 1m
discovery.zen.fd.ping_retries: 5
# List of nodes in the cluster that can be elected as the master node
#cluster.initial_master_nodes: ["node-9200", "node-9201", "node-9202"]
# Cross-origin (CORS) configuration
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"
- Start the two data nodes one by one and check that all three nodes are running normally
2.3 Cluster Test
- Obtain cluster health information
- GET _cluster/health
- number_of_nodes: the number of nodes in the cluster, 3
- number_of_data_nodes: the number of data nodes, also 3
{
  "cluster_name" : "elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 7,
  "active_shards" : 14,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
- Write data to the cluster through the master node
PUT cluster/_doc/1
{
  "name": "Cluster test",
  "message": "Write to primary node"
}
- Query the data from the two data nodes; you can see that the data has been synchronized successfully
- http://localhost:9201/cluster/_search
- http://localhost:9202/cluster/_search
{
  "took": 25,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "cluster",
        "_type": "_doc",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "Cluster test",
          "message": "Write to primary node"
        }
      }
    ]
  }
}
3. Linux cluster
- Prepare three Linux servers
- Prepare an ES installation, modify the config/elasticsearch.yml file, and add the following configuration
- Note: the JDK environment must be configured in advance; for details, see the single-node installation in the core module
# Cluster name
cluster.name: cluster-es
# Node name; each node's name must be unique
node.name: node-1
# IP address; each node's address must be different
network.host: linux1
# Master node
node.master: true
node.data: true
http.port: 9200
# Cross-origin (CORS) configuration
http.cors.allow-origin: "*"
http.cors.enabled: true
http.max_content_length: 200mb
# New in es7.x: this configuration is required to elect a master when a new cluster is initialized
cluster.initial_master_nodes: ["node-1"]
# New in es7.x: node discovery
discovery.seed_hosts: ["linux1:9300", "linux2:9300", "linux3:9300"]
gateway.recover_after_nodes: 2
network.tcp.keep_alive: true
network.tcp.no_delay: true
transport.tcp.compress: true
# Number of concurrent shard rebalance tasks allowed in the cluster; default is 2
cluster.routing.allocation.cluster_concurrent_rebalance: 16
# Number of concurrent recovery threads when nodes are added or removed or during rebalancing; default is 4
cluster.routing.allocation.node_concurrent_recoveries: 16
# Number of concurrent threads for initial primary-shard recovery; default is 4
cluster.routing.allocation.node_initial_primaries_recoveries: 16
- Distribute the configured ES directory to the other two servers. As the comments above note, each node needs its own node.name and a different network.host; the rest of the configuration stays the same
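- For example, the overrides on the second server might look like this (node-2 and linux2 assume the naming pattern above continues; adjust to your actual hostnames):
# config/elasticsearch.yml on the second server; all other settings are unchanged
node.name: node-2
network.host: linux2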
- Start the three servers in turn and run the same cluster tests as in the Windows cluster section