To protect against hardware failure and increase search capacity, Elasticsearch can store copies of an index's data across multiple shards on multiple nodes. When running a search request, Elasticsearch selects a node that contains a copy of the index's data and forwards the search request to that node's shards. This process is known as search shard routing, or simply routing.
Adaptive replica selection
By default, Elasticsearch uses adaptive replica selection to route search requests. This method selects an eligible node using shard allocation awareness and the following criteria:
- The response time of prior requests between the coordinating node and the eligible node
- How long the eligible node took to run previous searches
- The queue size of the eligible node's search thread pool
Adaptive replica selection is designed to reduce search latency. However, you can disable it by setting cluster.routing.use_adaptive_replica_selection to false using the Cluster Settings API. If it is disabled, Elasticsearch routes search requests using a round-robin method, which can result in slower searches.
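As a rough illustration of the settings change described above, the following Python sketch builds the request for the Cluster Settings API without sending it. The helper name cluster_settings_request and the choice of the persistent scope are assumptions made for this example, not something the original text prescribes.

```python
import json


def cluster_settings_request(setting, value, scope="persistent"):
    """Return (method, path, body) for a cluster settings update.

    Hypothetical helper for illustration: it only builds the request
    and never contacts a cluster.
    """
    if scope not in ("persistent", "transient"):
        raise ValueError("scope must be 'persistent' or 'transient'")
    body = json.dumps({scope: {setting: value}})
    return "PUT", "/_cluster/settings", body


# Build the request that would disable adaptive replica selection.
method, path, body = cluster_settings_request(
    "cluster.routing.use_adaptive_replica_selection", False
)
print(method, path)
print(body)
```

Sending the resulting body with any HTTP client is left out on purpose, since the host, port, and authentication depend on your cluster.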
Set a preference
By default, adaptive replica selection selects from all eligible nodes and shards. However, you might only want data from a local node, or you might want to route the search to a specific node based on its hardware. Alternatively, you might want to send repeated searches to the same shard to take advantage of caching.
To limit the set of nodes and shards eligible for a search request, use the preference query parameter of the search API.
For example, the following request searches my-index-000001 with a preference of _local. This restricts the search to shards on the local node. If the local node contains no shard copies of the index's data, the request falls back to adaptive replica selection and is routed to another eligible node.
GET /my-index-000001/_search?preference=_local
{
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
}
You can also use the preference parameter to route searches to specific shards based on a provided string. If the cluster state and the selected shards do not change, searches using the same preference string are routed to the same shards in the same order.
We recommend using a unique preference string, such as a user name or web session ID. The string cannot start with an underscore (_).
TIP: You can use this option to serve cached results for frequently used and resource-intensive searches. If the shard's data does not change, repeated searches with the same preference string retrieve results from the same shard request cache. For time series use cases, such as logging, data in older indices is rarely updated and can be served directly from this cache.
The following request searches the my-index-000001 index using the preference string my-custom-shard-string.
GET /my-index-000001/_search?preference=my-custom-shard-string
{
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
}
NOTE: If the cluster state or the selected shards change, the same preference string may not route searches to the same shards in the same order. This can happen for a number of reasons, including shard relocation and shard failure. A node can also reject a search request, in which case Elasticsearch re-routes the request to another node.
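One way to obtain a stable preference string that never starts with an underscore is to derive it from a session ID. The Python sketch below is only an illustration: the helper names preference_for and search_path, the "session-" prefix, and the use of a SHA-256 digest are assumptions for this example, not part of the search API.

```python
import hashlib
from urllib.parse import urlencode


def preference_for(session_id):
    """Derive a stable preference string from a session ID.

    Hashing keeps the value URL-safe, and the "session-" prefix
    guarantees the result never starts with "_". Hypothetical helper.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).hexdigest()[:16]
    return "session-" + digest


def search_path(index, preference):
    """Build the search path with a preference query parameter."""
    return "/" + index + "/_search?" + urlencode({"preference": preference})


# Even a session ID that starts with "_" yields a valid preference string.
pref = preference_for("_abc123")
print(search_path("my-index-000001", pref))
```

Because the same session ID always produces the same string, repeated searches from one session are candidates for the same shard request cache, subject to the caveats in the note above.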
Use a routing value
When indexing a document, you can specify an optional routing value that will route the document to a particular shard.
For example, the following index request uses my-routing-value to route documents.
POST /my-index-000001/_doc?routing=my-routing-value
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}
You can use the same routing value in the routing query parameter of the search API. This ensures that the search runs on the same shard that was used to index the document.
GET /my-index-000001/_search?routing=my-routing-value
{
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
}
You can also provide multiple comma-separated routing values:
GET /my-index-000001/_search?routing=my-routing-value,my-routing-value-2
{
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
}
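To see why the same routing value reaches the same shard at index time and at search time, it can help to model routing as a hash of the value taken modulo the number of primary shards. The Python sketch below is a simplified model under that assumption: zlib.crc32 is a stand-in hash chosen for this example and is not the hash Elasticsearch uses, and the helper names shard_for and shards_for are hypothetical.

```python
import zlib


def shard_for(routing_value, num_primary_shards):
    """Map a routing value to a shard number.

    Simplified model: hash the routing value, then take it modulo the
    number of primary shards. zlib.crc32 is a stand-in hash for this
    sketch; it is NOT the hash Elasticsearch uses.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


def shards_for(routing_param, num_primary_shards):
    """Resolve a comma-separated routing parameter to a set of shards."""
    values = [v for v in routing_param.split(",") if v]
    return {shard_for(v, num_primary_shards) for v in values}


# The same value always resolves to the same shard in this model.
print(shard_for("my-routing-value", 5))
# Several routing values resolve to the union of their shards.
print(sorted(shards_for("my-routing-value,my-routing-value-2", 5)))
```

In this model a search with several routing values only needs to visit the union of the shards those values map to, which is usually far fewer than all shards of the index.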
Search concurrency and parallelism
By default, Elasticsearch does not reject search requests based on the number of shards the request hits. However, hitting a large number of shards can significantly increase CPU and memory usage.
TIP: For tips on preventing indexes with a large number of shards, see Avoid Oversharding.
You can use the max_concurrent_shard_requests query parameter to control the maximum number of concurrent shards that a search request can hit per node. This prevents a single request from overloading the cluster. The parameter defaults to 5.
GET /my-index-000001/_search?max_concurrent_shard_requests=3
{
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  }
}
You can also use the action.search.shard_count.limit cluster setting to set a search shard limit and reject requests that hit too many shards. You can configure action.search.shard_count.limit using the Cluster Settings API.
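The effect of a search shard limit can be pictured as a simple check made before a search fans out to its shards. The Python sketch below is a conceptual model only: the function name check_shard_count, the error type, and the message are assumptions for this example and do not reproduce Elasticsearch's own implementation.

```python
def check_shard_count(target_shards, shard_count_limit):
    """Reject a search that would touch more shards than the limit.

    Conceptual model of what action.search.shard_count.limit does;
    the name and the error type are assumptions for this sketch.
    """
    count = len(target_shards)
    if count > shard_count_limit:
        raise ValueError(
            f"search would hit {count} shards, over the limit of {shard_count_limit}"
        )
    return count


# A search across three shards is rejected when the limit is two.
try:
    check_shard_count([0, 1, 2], shard_count_limit=2)
except ValueError as err:
    print("rejected:", err)
```

A request that stays at or under the limit passes the check unchanged, so the limit only affects searches that fan out unusually widely.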
See the website: www.elastic.co/guide/en/el…