[This is my 16th day of The November Gwen Challenge. Check out the event details: The last Gwen Challenge 2021]
The index module
Index modules are modules created per index; they control all aspects of the configuration associated with that index. Index-level settings are set per index and fall into two kinds:
- Static: can only be set when the index is created or on a closed index.
- Dynamic: can be changed at any time on a live index.
The index settings are classified as follows:
Static index setting
Sharding and primary shards
index.number_of_routing_shards is an integer used together with index.number_of_shards to route documents to a primary shard.
- Elasticsearch uses this value when splitting an index. The value must be an integer multiple of index.number_of_shards.
- The default value of this setting depends on the number of primary shards in the index. The default is designed to allow splitting by factors of 2 up to a maximum of 1024 shards.
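The relationship between the two settings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `valid_split_targets` is hypothetical, and it assumes a split target must be a multiple of the current primary shard count and a factor of number_of_routing_shards.

```python
from typing import List


def valid_split_targets(number_of_shards: int, number_of_routing_shards: int) -> List[int]:
    """Hypothetical helper: list shard counts an index could be split to.

    Assumes a target must be a multiple of number_of_shards and must
    divide number_of_routing_shards evenly.
    """
    if number_of_routing_shards % number_of_shards != 0:
        raise ValueError(
            "number_of_routing_shards must be an integer multiple of number_of_shards"
        )
    return [
        target
        for target in range(number_of_shards * 2, number_of_routing_shards + 1, number_of_shards)
        if number_of_routing_shards % target == 0
    ]


# A 5-shard index with 30 routing shards (5 x 2 x 3).
print(valid_split_targets(5, 30))  # [10, 15, 30]
```

Under these assumptions, a 5-shard index with 30 routing shards could be split by a factor of 2 or 3, which matches the three targets printed above.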
index.number_of_shards sets the number of primary shards for the index.
- The default value is 1.
- This setting can only be set at index creation time. It cannot be changed on a closed index.
- The number of shards is limited to 1024 per index. This is a safety limit that prevents an index with too many shards from destabilizing the cluster. The limit can be changed by specifying the following system property on every node in the cluster:
export ES_JAVA_OPTS="-Des.index.max_number_of_shards=128"
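Routing formula:
routing_factor = num_routing_shards / num_primary_shards
shard_num = (hash(_routing) % num_routing_shards) / routing_factor
For example, with num_routing_shards = 30 and num_primary_shards = 10, routing_factor = 30 / 10 = 3. The hash modulo falls in the range 0~29, and dividing it by 3 (integer division) gives a shard_num in the range 0~9.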
Formula values:
- num_primary_shards: the value of index.number_of_shards.
- num_routing_shards: the value of index.number_of_routing_shards.
- _routing: defaults to the document _id. A custom value can also be specified per document with the routing parameter.
Because documents are routed with this formula, the primary shards of an index can later be split.
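A minimal Python sketch of the routing formula may make the split property easier to see. Note the assumptions: `shard_for` is a hypothetical helper, and `zlib.crc32` is only a stand-in for the hash function Elasticsearch actually uses, so the concrete shard numbers will not match a real cluster.

```python
import zlib


def shard_for(routing: str, num_primary_shards: int, num_routing_shards: int) -> int:
    """Apply the routing formula with a stand-in hash (CRC32, an assumption)."""
    routing_hash = zlib.crc32(routing.encode("utf-8"))
    routing_factor = num_routing_shards // num_primary_shards
    return (routing_hash % num_routing_shards) // routing_factor


# 10 primary shards and 30 routing shards give routing_factor = 3.
doc_ids = [f"doc-{i}" for i in range(1000)]
before_split = {doc_id: shard_for(doc_id, 10, 30) for doc_id in doc_ids}

# After splitting to 30 primary shards, routing_factor becomes 1.
after_split = {doc_id: shard_for(doc_id, 30, 30) for doc_id in doc_ids}

# Every document in old shard s lands in new shard 3s, 3s+1 or 3s+2.
print(all(after_split[d] // 3 == before_split[d] for d in doc_ids))
```

Because the hash is always taken modulo num_routing_shards, changing the number of primary shards only changes the final division, so each original shard maps onto a fixed group of new shards and documents never need to move between groups.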
Custom routing
Indexing and fetching a document with a custom routing value:
PUT my-index-000001/_doc/1?routing=user1&refresh=true
{
  "title": "This is a document"
}

GET my-index-000001/_doc/1?routing=user1
With custom routing, the uniqueness of the _id is not guaranteed across all shards in the index. Documents with the same _id can end up on different shards if they are indexed with different _routing values. It is up to the user to make sure that IDs are unique across the index.
Making the routing parameter required:
PUT my-index-000002
{
  "mappings": {
    "_routing": {
      "required": true
    }
  }
}

PUT my-index-000002/_doc/1?routing=user1&refresh=true
{
  "title": "This is a document2"
}

GET my-index-000002/_doc/1?routing=user1
Setting required to true in the _routing mapping makes routing mandatory for the index. A request that does not supply a routing value fails with a routing_missing_exception.
index.routing_partition_size is the number of shards a custom routing value can go to. The default value is 1. This setting can only be set at index creation time, and its value must be less than index.number_of_shards unless index.number_of_shards is 1.
When this setting is present, the calculation formula becomes:
routing_value = hash(_routing) + hash(_id) % routing_partition_size
shard_num = (routing_value % num_routing_shards) / routing_factor
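The effect of the partitioned formula can be illustrated with a short Python sketch. As before, this rests on assumptions: `partitioned_shard` is a hypothetical helper and `zlib.crc32` stands in for the real hash function, so only the shape of the result is meaningful, not the specific shard numbers.

```python
import zlib


def _hash(value: str) -> int:
    # Stand-in hash for illustration only (assumption, not the real one).
    return zlib.crc32(value.encode("utf-8"))


def partitioned_shard(routing: str, doc_id: str, num_primary_shards: int,
                      num_routing_shards: int, routing_partition_size: int) -> int:
    """Apply the partitioned routing formula from the text."""
    routing_factor = num_routing_shards // num_primary_shards
    # % binds tighter than +, so only the _id hash is reduced by the partition size.
    routing_value = _hash(routing) + _hash(doc_id) % routing_partition_size
    return (routing_value % num_routing_shards) // routing_factor


# All documents share routing=user1; only the _id varies.
shards = {partitioned_shard("user1", str(i), 8, 8, 3) for i in range(500)}
print(sorted(shards))
```

With a partition size of 3, documents that share one routing value are spread over at most three shards instead of a single one, which eases hot spots while keeping searches for that routing value confined to a small subset of the index.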
In Elasticsearch 7.0.0 and later, the index.number_of_routing_shards setting affects how documents are distributed across shards. When reindexing an older index that uses custom routing, you must explicitly set index.number_of_routing_shards to keep the same document distribution.
Other
- Compression
index.codec sets the compression type for stored data. The default value, default, compresses stored data with LZ4. Setting it to best_compression uses DEFLATE instead, which gives a higher compression ratio at the cost of slower stored-field performance. If you update the compression type, the new type is applied only after segments are merged. A force merge can be used to trigger segment merging.
In computing, Deflate is a lossless data compression format that combines LZ77 and Huffman coding. It was designed by Phil Katz for version 2 of his PKZIP archiving tool.
- Soft deletes
index.soft_deletes.enabled indicates whether soft deletes are enabled on the index. Soft deletes can only be configured at index creation. The setting is deprecated in 7.6.0: creating an index with soft deletes disabled is deprecated and will be removed in a future version of Elasticsearch.
index.soft_deletes.retention_lease.period sets the maximum time to retain a shard history retention lease before it is considered expired. Shard history retention leases ensure that soft deletes are retained during merges on the Lucene index. The default is 12h.
- Other
index.load_fixed_bitset_filters_eagerly indicates whether cached filters are preloaded for nested queries. Possible values are true (the default) and false.
Elasticsearch automatically checks the integrity of shard contents at various points in the shard life cycle. The index.shard.check_on_startup setting determines whether Elasticsearch performs additional integrity checks when opening a shard. If these checks detect corruption, they prevent the shard from being opened. The accepted values are as follows:
Value | Description |
---|---|
false | Do not perform additional corruption checks when opening a shard. This is the default and recommended behavior |
checksum | Verify that the checksum of each file in the shard matches its contents |
true | Check for both physical and logical corruption. This is an expensive operation in terms of CPU and memory |
fix | This option was deprecated and has been removed as of 7.0 |
Dynamic index setting
Dynamic index settings can be changed on a live index using the update-index-settings API.
Setting | Description |
---|---|
index.number_of_replicas | Number of replicas per primary shard. Default is 1 |
index.auto_expand_replicas | Automatically expand the number of replicas based on the number of data nodes in the cluster. Set to a dash-delimited lower and upper bound (for example 0-5), or use all as the upper bound (for example 0-all). Default is false (disabled) |
index.search.idle.after | How long a shard can go without receiving a search or get request before it is considered search idle. Default is 30s |
index.refresh_interval | How often the refresh operation is performed. Default is 1s. Can be set to -1 to disable refresh |
index.max_result_window | Maximum value of from + size for searches on this index, where from is the offset of the first hit and size is the number of hits. Default is 10000. Requests that exceed it fail with a "Result window is too large" error |
index.max_inner_result_window | Maximum value of from + size for inner hits definitions and top hits aggregations on this index. Default is 100 |
index.max_rescore_window | Maximum value of window_size for rescore requests in searches on this index. Defaults to index.max_result_window, which is 10000 by default |
index.max_docvalue_fields_search | Maximum number of docvalue_fields allowed in a query. Default is 100 |
index.max_script_fields | Maximum number of script_fields allowed in a query. Default is 32 |
index.max_ngram_diff | Maximum allowed difference between min_gram and max_gram for the NGram tokenizer and NGram token filter. Default is 1 |
index.max_shingle_diff | Maximum allowed difference between max_shingle_size and min_shingle_size for the shingle token filter. Default is 3 |
index.max_refresh_listeners | Maximum number of refresh listeners available on each shard of the index. These listeners are used to implement refresh=wait_for |
index.analyze.max_token_count | Maximum number of tokens that can be produced with the _analyze API. Default is 10000 |
index.highlight.max_analyzed_offset | Maximum number of characters analyzed for a highlight request. Default is 1000000 |
index.max_terms_count | Maximum number of terms that can be used in a terms query. Default is 65536 |
index.max_regex_length | Maximum length of a regex that can be used in a regexp query. Default is 1000 |
index.query.default_field | One or more fields, or a wildcard (*) pattern matching fields, that queries use when no field is specified. Default is * |
index.routing.allocation.enable | Controls shard allocation for this index: all (the default, all shards), primaries (primary shards only), new_primaries (newly created primary shards only), none (no allocation allowed) |
index.routing.rebalance.enable | Enables shard rebalancing for this index: all (the default), primaries, replicas (replica shards only), none |
index.gc_deletes | How long the version number of a deleted document remains available for further versioned operations. Default is 60s |
index.default_pipeline | Default ingest pipeline for the index |
index.final_pipeline | Final ingest pipeline for the index. It always runs after the request pipeline (if specified) and the default pipeline (if it exists) |
index.mapping.dimension_fields.limit | Maximum number of time series dimensions for the index (for internal use by Elastic only) |
index.hidden | Indicates whether the index should be hidden by default. Hidden indices are not returned by default when matching a wildcard expression. Default is false |
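As a rough sketch of how dynamic settings are changed on a live index, the Python below only builds the request for the update-index-settings API and does not send it. The helper name `build_settings_update` and the index name are illustrative assumptions; the request shape (PUT to `/<index>/_settings` with settings under an `index` key) follows the API's documented form.

```python
import json
from typing import Tuple


def build_settings_update(index: str, **settings) -> Tuple[str, str, str]:
    """Hypothetical helper: return (method, path, body) for a settings update.

    No HTTP call is made here; pass the result to the HTTP client of your choice.
    """
    body = {"index": settings}
    return "PUT", f"/{index}/_settings", json.dumps(body, indent=2)


method, path, body = build_settings_update(
    "my-index-000001",          # illustrative index name
    number_of_replicas=2,
    refresh_interval="30s",
)
print(method, path)
print(body)
```

Only dynamic settings can be changed this way. Static settings such as index.number_of_shards are rejected on an open index.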
Other index settings
Other index settings are available in the following index modules:
- Analysis
- Settings that define analyzers, tokenizers, token filters, and character filters.
- Index shard allocation
- Controls where, when, and how shards are allocated to nodes.
- Mapping
- Enables or disables dynamic mapping for an index.
- Merging
- Controls how shards are merged by the background merge process.
- Similarities
- Configures custom similarity settings to customize how search results are scored.
- Slow log
- Controls how slow queries and fetch requests are logged.
- Store
- Configures the type of file system used to access shard data.
- Translog
- Controls the transaction log and background flush operations.
- History retention
- Controls the retention of the history of operations in the index.
- Indexing pressure
- Configures indexing back pressure limits.
Resources
index-modules