[This is my 14th day of The November Gwen Challenge. Check out the event details: The last Gwen Challenge 2021]

First, configuration methods

Elasticsearch ships with good defaults and requires very little configuration. The configuration file should contain node-specific settings as well as the settings a node needs in order to join a cluster.

Node settings: node.name, paths
Cluster settings: cluster.name, network.host

Elasticsearch has three configuration files:

elasticsearch.yml: used to configure Elasticsearch
jvm.options: used to configure the Elasticsearch JVM settings
log4j2.properties: used to configure Elasticsearch logging

The configuration files are located in the config directory, whose default location depends on the installation method. For example, with a Docker installation, the ES configuration files are located in /usr/share/elasticsearch/config. The configuration file also supports environment variable substitution, as shown below.

node.name:    ${HOSTNAME}
network.host: ${ES_NETWORK_HOST}

The value of an environment variable must be a simple string. To provide a list, use a comma-separated string, which Elasticsearch parses into a list.
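The substitution and list parsing described above can be pictured with a short Python sketch. It only illustrates the behaviour and is not how Elasticsearch implements it; the helper names `substitute_env` and `parse_list` and the `SEED_HOSTS` variable are hypothetical.

```python
import re

# Matches ${NAME} placeholders such as ${HOSTNAME}.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(value: str, env: dict) -> str:
    """Replace every ${NAME} in value with env[NAME] (hypothetical helper)."""
    def lookup(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]
    return _PLACEHOLDER.sub(lookup, value)


def parse_list(value: str) -> list:
    """Split a comma-separated string into a list of trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]


# Illustrative environment; SEED_HOSTS is an assumed variable name.
env = {"HOSTNAME": "es01", "SEED_HOSTS": "10.0.0.1,10.0.0.2"}
node_name = substitute_env("${HOSTNAME}", env)
seed_hosts = parse_list(substitute_env("${SEED_HOSTS}", env))
print(node_name, seed_hosts)
```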

Cluster and node settings are divided into dynamic and static settings. Dynamic settings can be updated on a running cluster and can be either transient or persistent. Static settings can only be configured in elasticsearch.yml, on a node that has not been started or has been shut down.

Elasticsearch applies settings in the following order of precedence:

Transient settings: removed after a full cluster restart
Persistent settings: survive a full cluster restart
Settings in the elasticsearch.yml file
Default values
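The precedence order above can be sketched as a layered lookup in which the first layer that defines a setting wins. This is a minimal illustration using Python's `ChainMap`; all values below are made up for the example.

```python
from collections import ChainMap

# Illustrative values at each level, listed from highest to lowest priority.
transient = {}
persistent = {"indices.recovery.max_bytes_per_sec": "50mb"}
yml_file = {"indices.recovery.max_bytes_per_sec": "20mb", "node.name": "es01"}
defaults = {"indices.recovery.max_bytes_per_sec": "40mb",
            "cluster.name": "elasticsearch"}

# ChainMap searches the mappings in order and returns the first match.
effective = ChainMap(transient, persistent, yml_file, defaults)

print(effective["indices.recovery.max_bytes_per_sec"])  # persistent beats the file
print(effective["node.name"])                           # only set in the file
print(effective["cluster.name"])                        # falls back to the default
```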

Transient and persistent settings are applied through the cluster settings API, and the top-level field of the request body ("transient" or "persistent") determines the type. For example:

PUT /_cluster/settings
{
    "persistent" : {
        "discovery.zen.minimum_master_nodes" : 2 
    },
    "transient" : {
        "indices.store.throttle.max_bytes_per_sec" : "50mb" 
    }
}

{
  "acknowledged" : true,
  "persistent" : {
    "discovery" : {
      "zen" : {
        "minimum_master_nodes" : "2"
      }
    }
  },
  "transient" : { }
}
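Note that the request uses dotted setting names while the response nests them. A small sketch of converting the nested form back to dotted keys, assuming a response shaped like the one above (the `flatten` helper is hypothetical):

```python
def flatten(settings: dict, prefix: str = "") -> dict:
    """Turn nested settings into dotted keys (hypothetical helper)."""
    flat = {}
    for key, value in settings.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Response body copied from the example above.
response = {
    "acknowledged": True,
    "persistent": {"discovery": {"zen": {"minimum_master_nodes": "2"}}},
    "transient": {},
}

persistent_flat = flatten(response["persistent"])
transient_flat = flatten(response["transient"])
print(persistent_flat)
```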

Second, important configuration

1. Set the path

Elasticsearch writes the data you index into indices and data streams to a data directory. It writes its own application logs, which contain information about cluster health and operations, to a logs directory. For example:

path:
  data: /var/data/elasticsearch
  logs: /var/log/elasticsearch

Warning: Do not modify anything within the data directory or run processes that might interfere with its contents. If something other than Elasticsearch modifies the contents of the data directory, Elasticsearch may fail, report corruption or other data inconsistencies, or appear to work correctly while having silently lost some of your data. Do not attempt to take file system backups of the data directory; there is no supported way to restore such a backup. Instead, use snapshot and restore to take backups safely. Do not run virus scanners on the data directory. A virus scanner can prevent Elasticsearch from working correctly and may modify the contents of the data directory. The data directory contains no executables, so a virus scan will only find false positives.

2. Set the cluster name

A node can join a cluster only when its cluster.name matches that of all the other nodes in the cluster. The default name is elasticsearch. Do not reuse the same cluster name in different environments; otherwise, a node might join the wrong cluster.

cluster.name: "docker-cluster"

3. Set the node name

The node name is used to describe the node and is returned in many responses.

node.name: "es02"

4. Set the network host

By default, Elasticsearch binds only to loopback addresses such as 127.0.0.1 and [::1]. While it is bound to a loopback address, ES runs in development mode, and failed bootstrap checks are reported as warnings instead of preventing startup. To bind to a non-loopback address:

network.host: 192.168.1.10

5. Configure cluster discovery

Without any network configuration, Elasticsearch binds to the available loopback addresses and scans local ports 9300 to 9305 to connect to other nodes running on the same server. This behavior lets nodes on one machine form a cluster automatically, without any configuration.

To form a cluster with nodes on other servers, list the other discoverable nodes in discovery.seed_hosts. Each address can be an IPv4 address, an IPv6 address (enclosed in square brackets), or a hostname. For example:

discovery.seed_hosts:
   - 192.168.1.10:9300
   - 192.168.1.11
   - seeds.mydomain.com
   - "[0:0:0:0:0:ffff:c0a8:10c]:9301"
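To make the accepted address formats concrete, here is a rough Python sketch that splits a seed host entry into host and port. It is not Elasticsearch's parser; the function name `split_host_port` is hypothetical, and the fallback port of 9300 is an assumption based on the default transport port.

```python
def split_host_port(entry: str, default_port: int = 9300):
    """Split a seed host entry into (host, port). Hypothetical helper;
    default_port=9300 is an assumed fallback when no port is given."""
    entry = entry.strip()
    if entry.startswith("["):
        # Bracketed IPv6 address, optionally followed by :port.
        host, _, rest = entry[1:].partition("]")
        port = int(rest[1:]) if rest.startswith(":") else default_port
        return host, port
    if entry.count(":") == 1:
        # IPv4 address or hostname with an explicit port.
        host, port = entry.split(":")
        return host, int(port)
    # No port given.
    return entry, default_port


seeds = [
    "192.168.1.10:9300",
    "192.168.1.11",
    "seeds.mydomain.com",
    "[0:0:0:0:0:ffff:c0a8:10c]:9301",
]
parsed = [split_host_port(s) for s in seeds]
for host, port in parsed:
    print(host, port)
```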

When an Elasticsearch cluster starts for the first time, a cluster bootstrapping step determines the set of master-eligible nodes whose votes are counted in the first election. You define this set with the cluster.initial_master_nodes setting. In development mode, if seed_hosts is not configured, the nodes perform this step automatically.

cluster.initial_master_nodes: es01,es02

6. Heap size settings

By default, ES automatically sets the JVM heap size based on the node's roles and total memory, and the default sizing is suitable for most production environments.

To override the default heap size, set the minimum heap size with Xms and the maximum heap size with Xmx. The minimum and maximum values must be the same, and neither should exceed 50% of the total memory. The heap size can be set in the jvm.options file.

With docker-compose, the heap size can be set through the ES_JAVA_OPTS environment variable:

es01:
    environment:
        - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
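The two rules above (equal Xms and Xmx, at most 50% of total memory) can be turned into a small helper that builds the option string. This is only a sketch; `heap_opts` is a hypothetical name, and the memory size is passed in by hand rather than detected.

```python
def heap_opts(total_memory_mb: int, fraction: float = 0.5) -> str:
    """Build matching -Xms/-Xmx flags (hypothetical helper).

    fraction is the share of total memory given to the heap and
    must not exceed 0.5, following the 50% rule described above.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("heap must be more than 0 and at most 50% of total memory")
    heap_mb = int(total_memory_mb * fraction)
    # Minimum and maximum heap are deliberately identical.
    return f"-Xms{heap_mb}m -Xmx{heap_mb}m"


# A container with 1 GB of memory gets the 512 MB heap used in the example.
opts = heap_opts(1024)
print(f"ES_JAVA_OPTS={opts}")
```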

7. Set the JVM heap dump path

By default, Elasticsearch configures the JVM to dump the heap to the default data directory when an out-of-memory exception occurs. To change the location, modify the -XX:HeapDumpPath entry in the jvm.options file.

8. GC log settings

By default, ES enables garbage collection (GC) logging. GC logging is configured in the jvm.options file, and the logs are written to the same default location as the Elasticsearch logs. With the default configuration, the logs rotate every 64 MB and can consume up to 2 GB of disk space. For example:

# Disable the previous logging configuration
-Xlog:disable
-Xlog:all=warning:stderr:utctime,level,tags
# Configure the log location and rotation
-Xlog:gc*,gc+age=trace,safepoint:file=/opt/my-app/gc.log:utctime,pid,tags:filecount=32,filesize=64m
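The 2 GB figure follows from the rotation options: 32 files of 64 MB each. As a quick check, the sketch below reads filecount and filesize from an -Xlog option and multiplies them. The parsing is deliberately simplistic (it assumes the rotation options are the last colon-separated field and that the size ends in m or g), and `max_gc_log_mb` is a hypothetical name.

```python
def max_gc_log_mb(xlog_option: str) -> int:
    """Upper bound in MB on disk used by rotated GC logs (hypothetical helper)."""
    # Assumption: the rotation options are the last colon-separated field.
    options = xlog_option.rsplit(":", 1)[-1]
    parsed = dict(item.split("=", 1) for item in options.split(","))
    filecount = int(parsed["filecount"])
    size = parsed["filesize"].lower()
    # Assumption: the size is written with an m (MB) or g (GB) suffix.
    units = {"m": 1, "g": 1024}
    size_mb = int(size[:-1]) * units[size[-1]]
    return filecount * size_mb


option = ("-Xlog:gc*,gc+age=trace,safepoint:file=/opt/my-app/gc.log:"
          "utctime,pid,tags:filecount=32,filesize=64m")
total_mb = max_gc_log_mb(option)
print(total_mb, "MB =", total_mb // 1024, "GB")
```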

9. Temporary directory settings

By default, ES uses a private temporary directory that the startup script creates directly under the system temporary directory. On some Linux distributions, a system utility cleans files and directories out of /tmp if they have not been accessed recently. As a result, the private temporary directory may be deleted while ES is running if the features that need it go unused for a long time. Deleting the directory causes problems if a feature that requires it is used afterwards.

To avoid this, create a dedicated temporary directory outside any path that is cleaned automatically, point the ES_TMPDIR environment variable at it before starting Elasticsearch, and set its permissions so that only the user running Elasticsearch can access it.
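As an illustration of the permission requirement, the following sketch creates a directory that only its owner can access (mode 0700 on a POSIX system). The `make_private_tmpdir` helper and the `es-tmp` directory name are hypothetical, and a throwaway base directory stands in for a real location; in practice you would run this as the Elasticsearch user and point the ES_TMPDIR environment variable at the result.

```python
import os
import stat
import tempfile


def make_private_tmpdir(path: str) -> str:
    """Create path so that only the owning user can access it (hypothetical helper)."""
    os.makedirs(path, mode=0o700, exist_ok=True)
    # makedirs is subject to the umask and skips existing directories,
    # so set the mode explicitly as well.
    os.chmod(path, 0o700)
    return path


# Throwaway base directory used only for this demonstration.
base = tempfile.mkdtemp()
es_tmp = make_private_tmpdir(os.path.join(base, "es-tmp"))

mode = stat.S_IMODE(os.stat(es_tmp).st_mode)
print(es_tmp, oct(mode))
```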

10. JVM fatal error log settings

By default, Elasticsearch configures the JVM to write fatal error logs to the default log directory. To change the path, modify the -XX:ErrorFile entry in the jvm.options file.

11. Cluster backup

In a disaster, snapshots can prevent permanent data loss. Taking snapshots is the only reliable and supported way to back up a cluster. You cannot back up an Elasticsearch cluster by copying the data directories of its nodes.

Third, references

Important Elasticsearch configuration
