This is the 13th day of my participation in The August Wenwen Challenge.
Description of the elasticsearch.yml configuration file
Cluster configuration
Parameter | Default value | Description |
---|---|---|
cluster.name | elasticsearch | Cluster name. Only nodes with the same cluster name can form a logical cluster |
Node configuration
Parameter | Default value | Description |
---|---|---|
node.name | System generated | Node name. Each node name in the same cluster must be different. If this parameter is not specified, the system generates the node name by default |
node.attr.rack | | Custom rack attribute of the node (can be used for shard allocation awareness) |
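Taken together, the cluster and node parameters above might look like this in elasticsearch.yml (the node name and rack value are illustrative):

```yaml
# elasticsearch.yml -- cluster and node identity (example values)
cluster.name: es-application   # only nodes with the same cluster name join this cluster
node.name: node-1              # must be unique within the cluster
node.attr.rack: rack1          # custom node attribute, e.g. for allocation awareness
```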
Paths Configuration
Parameter | Default value | Description |
---|---|---|
path.conf | | Path to the configuration files |
path.data | | Data storage location. Multiple locations can be listed, separated by commas (,), for better reliability |
path.work | | Temporary file storage path |
path.logs | | Log file storage path |
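As a sketch, the path settings above might be configured like this (all paths are example values):

```yaml
# elasticsearch.yml -- path settings (example paths)
path.conf: /etc/elasticsearch
path.data: /data/es1,/data/es2     # multiple locations separated by commas
path.work: /tmp/elasticsearch
path.logs: /var/log/elasticsearch
```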
Memory configuration
Parameter | Default value | Description |
---|---|---|
bootstrap.mlockall | false | Elasticsearch performs poorly once the JVM starts writing to swap space, so swapping should be avoided. Set this to true to lock the memory of the Elasticsearch process so it cannot be swapped out |
Network and HTTP configuration
Parameter | Default value | Description |
---|---|---|
network.bind_host | 0.0.0.0 | The IP address to bind to; can be IPv4 or IPv6 |
network.publish_host | | The IP address other nodes use to communicate with this node. If unset, it is determined automatically; the value must be a real IP address |
network.host | | Sets both bind_host and publish_host at once |
transport.tcp.port | 9300 | TCP port used for communication between nodes |
transport.tcp.compress | false | Whether to compress data for TCP transport between nodes |
http.port | 9200 | HTTP port for external services |
http.max_content_length | 100mb | Maximum size of an HTTP request body |
http.enabled | true | Whether to serve external requests over HTTP |
http.cors.enabled | true | Enable CORS, e.g. so the head plugin can monitor cluster information |
http.cors.allow-origin | "*" | Origins allowed to make CORS requests (e.g. the head plugin) |
http.cors.allow-credentials | true | Whether credentials are allowed in CORS requests |
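A combined sketch of the network and HTTP settings above (the host IP is an example value):

```yaml
# elasticsearch.yml -- network and HTTP (example values)
network.host: 192.168.123.64       # sets both bind_host and publish_host
transport.tcp.port: 9300
transport.tcp.compress: false
http.port: 9200
http.max_content_length: 100mb
# CORS settings so the head plugin can reach the cluster
http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-credentials: true
```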
Gateway configuration
Parameter | Default value | Description |
---|---|---|
gateway.type | local | Gateway type. The default, local, stores cluster state on the local file system |
gateway.recover_after_nodes | 1 | Recovery starts only after N nodes in the cluster are up |
gateway.recover_after_time | 5m | Timeout for the initial recovery process, counted from the moment the N nodes configured above are up |
gateway.expected_nodes | 2 | Number of nodes expected in the cluster. Local shard recovery starts as soon as these N nodes have joined the cluster (provided recover_after_nodes is also satisfied) |
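For example, for a two-node cluster the gateway settings above could be combined like this (values are illustrative):

```yaml
# elasticsearch.yml -- gateway recovery for a 2-node cluster (example values)
gateway.recover_after_nodes: 1   # start recovery once at least 1 node is up...
gateway.recover_after_time: 5m   # ...after waiting at most 5 minutes
gateway.expected_nodes: 2        # recover immediately once both nodes have joined
```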
Recovery configuration (cluster recovery and high availability)
Parameter | Default value | Description |
---|---|---|
cluster.routing.allocation.node_initial_primaries_recoveries | 4 | Number of concurrent recovery threads during initial data recovery |
cluster.routing.allocation.node_concurrent_recoveries | 2 | Number of concurrent recovery threads when nodes are added or removed, or during rebalancing |
indices.recovery.max_bytes_per_sec | 0 | Throughput limit during recovery (for example 100mb). The default 0 means unlimited; if other services run on the same machine, it is better to set a limit |
indices.recovery.concurrent_streams | 5 | Maximum number of concurrent streams opened while recovering data from other shards |
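The recovery settings above could be tuned like this on a shared machine (the 100mb limit is an example value):

```yaml
# elasticsearch.yml -- recovery throttling (example values)
cluster.routing.allocation.node_initial_primaries_recoveries: 4
cluster.routing.allocation.node_concurrent_recoveries: 2
indices.recovery.max_bytes_per_sec: 100mb   # limit if other services share the host
indices.recovery.concurrent_streams: 5
```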
Discovery configuration (cluster discovery and master election)
Parameter | Default value | Description |
---|---|---|
discovery.zen.minimum_master_nodes | 1 | Minimum number of master-eligible nodes a node must see to take part in master election. The default is 1; for large clusters, set a larger value (2-4) |
discovery.zen.ping.timeout | 3s | Discovery ping timeout, 3 seconds by default. Increase it on a slow or unreliable network to help prevent split-brain |
discovery.zen.ping.multicast.enabled | true | Whether to use multicast to discover nodes. Disable it (and use unicast instead) when multicast is unavailable or the cluster spans network segments |
discovery.zen.ping.unicast.hosts | | Initial list of master-eligible nodes in the cluster that a node (master or data node) probes at startup |
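For a cluster that spans network segments, the discovery settings above are typically combined like this (host addresses are example values):

```yaml
# elasticsearch.yml -- unicast discovery across network segments (example hosts)
discovery.zen.minimum_master_nodes: 2        # (master-eligible nodes / 2) + 1
discovery.zen.ping.timeout: 10s              # raised to tolerate a poor network
discovery.zen.ping.multicast.enabled: false  # multicast off, use unicast below
discovery.zen.ping.unicast.hosts: ["192.168.123.64:9300", "192.168.123.65:9300"]
```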
Cache configuration
Parameter | Default value | Description |
---|---|---|
indices.cache.filter.size | 10% | Filter cache limit, for example 1gb or 20% |
index.cache.field.expire | | Field cache expiry time |
index.cache.field.max_size | | Maximum number of entries in the field cache |
index.cache.field.type | | Field cache type |
Translog configuration (transaction operation log)
Parameter | Default value | Description |
---|---|---|
index.translog.flush_threshold_ops | unlimited | Flush after this many operations have accumulated |
index.translog.flush_threshold_size | 512mb | Flush when the translog reaches this size |
index.translog.flush_threshold_period | 30m | Force a flush if none has occurred within this period |
index.translog.interval | 5s | How often the translog is checked to decide whether to flush; ES performs the flush at a random point between this value and twice this value |
index.gateway.local.sync | 5s | How often the translog is synced to disk |
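The translog flush triggers above work together: whichever threshold is hit first causes a flush. A combined sketch (the operation count is an example value):

```yaml
# elasticsearch.yml -- translog flush triggers (example values)
index.translog.flush_threshold_ops: 20000    # flush after 20k operations...
index.translog.flush_threshold_size: 512mb   # ...or when the translog reaches 512 MB...
index.translog.flush_threshold_period: 30m   # ...or after 30 minutes without a flush
index.translog.interval: 5s                  # how often the thresholds are checked
```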
ES operations (Part 1)
Index related operations (including mapping and setting)
A quick index health check can also be done with the elasticsearch-head plugin
```
# GET /_cluster/health
# e.g. http://192.168.123.64:9200/_cluster/health

# Response:
{
    "cluster_name": "es-application",   # cluster name
    # The most important piece of the response is the status field, which is one of three values:
    # green:  all primary and replica shards are allocated; the cluster is 100% operational.
    # yellow: all primary shards are allocated, but at least one replica is missing.
    #         No data is lost and search results remain intact, but high availability is
    #         weakened; if more shards disappear, you may lose data. Treat yellow as a
    #         warning that needs to be investigated promptly.
    # red:    at least one primary shard (and all its replicas) is missing. Data is missing:
    #         searches return only partial results, and writes assigned to that shard
    #         return an exception.
    "status": "green",
    # The green/yellow/red status is a great way to get an overview of the cluster at a
    # glance. The remaining metrics give an overview of the cluster state:
    "timed_out": false,
    # number_of_nodes and number_of_data_nodes are self-descriptive.
    "number_of_nodes": 1,
    "number_of_data_nodes": 1,
    # Number of primary shards in the cluster, summed over all indexes.
    "active_primary_shards": 0,
    # Total of ALL shards across all indexes, including replica shards.
    "active_shards": 0,
    # Shards currently migrating from one node to another. Normally 0, but the value rises
    # when Elasticsearch finds the cluster unbalanced, e.g. a node was added or taken offline.
    "relocating_shards": 0,
    # Shards just created. For example, when you create your first index its shards stay in
    # the initializing state for a short time; this is usually transient and shards should
    # not remain initializing for long. You may also see initializing shards when a node
    # restarts, while its shards are loaded from disk.
    "initializing_shards": 0,
    # Shards that exist in the cluster state but cannot actually be found in the cluster.
    # A common source is unallocated replicas: an index with 5 shards and 1 replica has
    # 5 unassigned replica shards on a single-node cluster. If the cluster is red, it keeps
    # unassigned shards for a long time (because primary shards are missing).
    "unassigned_shards": 0,
    "delayed_unassigned_shards": 0,
    "number_of_pending_tasks": 0,
    "number_of_in_flight_fetch": 0,
    "task_max_waiting_in_queue_millis": 0,
    "active_shards_percent_as_number": 100.0
}
```
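As a minimal sketch, the health response can also be interpreted programmatically. The field names below match the real `_cluster/health` response; the helper function and the trimmed sample dict are illustrative, not part of any Elasticsearch client:

```python
# Hypothetical helper: summarize a _cluster/health response body.
STATUS_MEANING = {
    "green": "all primary and replica shards are allocated",
    "yellow": "all primaries allocated, but at least one replica is missing",
    "red": "at least one primary shard (and all its replicas) is missing",
}

def summarize_health(health: dict) -> str:
    """Return a one-line summary of a _cluster/health response body."""
    status = health["status"]
    return (
        f"cluster '{health['cluster_name']}' is {status}: "
        f"{STATUS_MEANING[status]} "
        f"({health['active_shards']} active shards, "
        f"{health['unassigned_shards']} unassigned)"
    )

# A trimmed-down sample response, mirroring the fields shown above:
sample = {
    "cluster_name": "es-application",
    "status": "green",
    "active_shards": 0,
    "unassigned_shards": 0,
}
print(summarize_health(sample))
```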
Create an index
```
# PUT /{index name}
# e.g. PUT http://192.168.123.64:9200/index_jacquesh
{
    "settings": {
        "index": {
            "number_of_shards": "2",     # number of primary shards
            "number_of_replicas": "0"    # number of replicas per primary shard
        }
    }
}

# Response:
{
    "acknowledged": true,
    "shards_acknowledged": true,
    "index": "index_jacquesh"
}
# index created successfully
```
View index
```
# GET _cat/indices?v
```

Returns:
Remove the index
```
# DELETE /{index name}
# e.g. DELETE http://192.168.123.64:9200/index_jacquesh1

# Response:
{
    "acknowledged": true
}
# index deleted successfully
```