Introduction

Elasticsearch is a distributed search engine. The story goes that its founder first built it while developing a recipe search app for his wife. The power of love is great indeed; without it there might be no Elasticsearch.

Elasticsearch is distributed, high-performance, and near real-time. It can search almost any type of data (basic value types, geospatial data, IP addresses, and so on) by using an index structure suited to each type. Later articles in this series will look at indexing in detail.

For the sake of brevity, we will shorten Elasticsearch to ES.

Cluster

As a distributed system, ES needs to group multiple instance nodes into a cluster. A cluster has to be managed, and in ES it is managed by the master node. The master is elected, which ensures high availability and avoids a single point of failure. Not all nodes in ES are candidates for master.

A single ES node can also form a cluster on its own.

The following figure is a common ES cluster architecture diagram:

Node

A single ES instance is called a node. Nodes in ES can have one or more of the following roles:

Master-eligible Node

Only master-eligible nodes can vote and be elected as master. Other nodes do not take part in the election.

The elected master node is responsible for creating and deleting indexes, tracking which nodes are part of the cluster and their status, and deciding which shards to allocate to which nodes. A stable master node is essential to the health of the cluster.

In general, it is best to run at least three dedicated master-eligible nodes for stability. Low-configuration machines are enough for them.

Data Node

A data node stores data and performs data-related operations such as adding, deleting, modifying, querying, and aggregating. Data nodes therefore place high demands on machine configuration and consume large amounts of CPU, memory, and I/O.

Data nodes are under the heaviest load in the cluster. It is better to separate data nodes from master nodes so that this load does not affect the stability of the master, which could otherwise lead to split brain, index and data inconsistency, and similar problems.

Recommendations for data nodes:

1. Use SSDs to improve disk read and write performance.

2. Leave memory for the file system cache in addition to the memory used by the JVM heap. This speeds up file access and avoids hitting the disk on every read.

3. Disable swap.

Coordinating Node

In general, every node can act as a coordinating node. Usually, the node that receives a client request is the coordinating node for that request. The coordinating node processes the client request, distributes it to the relevant nodes, merges the results, and returns them to the client.

In their resource needs, coordinating nodes fall between master nodes and data nodes, and they do not require high I/O capacity. Running dedicated coordinating nodes helps reduce the load on data nodes and keeps the roles from affecting one another.

In addition to the common node roles above, there are also the Ingest Node, Remote-eligible Node, Machine Learning Node, and Transform Node.

Shard

A data node can hold more than one shard. A shard is a Lucene instance, and an index consists of a series of shards. ES is a distributed search engine because it can spread data across multiple shards, which provides higher performance.

The number of shards must be set when the index is created, because each write has to be routed to the shard that will store the data. Once set, the number of shards of that index cannot be changed.
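
To make the routing argument concrete, here is a minimal sketch of hash-based routing. It assumes the common "hash of the routing key modulo the number of primary shards" scheme; zlib.crc32 is only a stand-in hash picked for this example, and the function names are hypothetical rather than part of any ES API.

```python
import zlib


def route_to_shard(routing_key: str, num_primary_shards: int) -> int:
    """Pick a shard as hash(routing_key) modulo the number of primary shards."""
    # Assumption: crc32 stands in for whatever hash the engine really uses.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def moved_documents(doc_ids, old_shards, new_shards):
    """Return the ids whose target shard changes when the shard count changes."""
    return [
        doc_id
        for doc_id in doc_ids
        if route_to_shard(doc_id, old_shards) != route_to_shard(doc_id, new_shards)
    ]


doc_ids = [f"doc-{i}" for i in range(1000)]
moved = moved_documents(doc_ids, old_shards=5, new_shards=6)
print(f"{len(moved)} of {len(doc_ids)} documents would route to a different shard")
```

Under this scheme, changing the shard count sends many existing ids to a shard that does not hold them, which is why the count is fixed when the index is created.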

Replica Shard

A primary shard can have zero or more replica shards. By default, each primary shard has one replica shard, which is never placed on the same node as its primary shard. Replica shards serve two main purposes:

1. Failover: when the primary shard fails, a replica shard can be promoted to primary shard.

2. Improved performance: GET and search requests can be served by either the primary shard or a replica shard.

Lucene

Lucene is a full-text retrieval library, and ES is built on it. A Lucene index contains many segments, and each segment is an index structure in its own right. A search runs against all segments. Each segment holds an inverted index that is used for retrieval.

Segments are immutable, so:

1. When a document is deleted, Lucene only marks the document as deleted.

2. When a document is updated, the old version is marked as deleted first and the new version is then inserted (into a new segment).
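
The two rules above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Lucene's actual data structures: the Segment and Index classes and their methods are hypothetical names, and a segment is reduced to a dictionary plus a set of delete markers.

```python
class Segment:
    """A batch of documents that is never modified, plus delete markers."""

    def __init__(self, docs):
        self._docs = dict(docs)  # doc_id -> source, written once
        self.deleted = set()     # delete markers kept beside the data

    def get(self, doc_id):
        if doc_id in self._docs and doc_id not in self.deleted:
            return self._docs[doc_id]
        return None


class Index:
    def __init__(self):
        self.segments = []

    def add_segment(self, docs):
        self.segments.append(Segment(docs))

    def delete(self, doc_id):
        # Deleting only marks the document; the segment data stays untouched.
        for segment in self.segments:
            if segment.get(doc_id) is not None:
                segment.deleted.add(doc_id)

    def update(self, doc_id, source):
        # Updating is "mark the old version deleted, write a new segment".
        self.delete(doc_id)
        self.add_segment({doc_id: source})

    def get(self, doc_id):
        for segment in self.segments:
            hit = segment.get(doc_id)
            if hit is not None:
                return hit
        return None


index = Index()
index.add_segment({"1": {"title": "soup"}, "2": {"title": "salad"}})
index.update("1", {"title": "tomato soup"})
index.delete("2")
print(index.get("1"), index.get("2"), len(index.segments))
```

The old version of document "1" is still physically present in the first segment; it is only hidden by its delete marker until a merge drops it.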

Write

When the coordinating node receives a write request, it finds the corresponding primary shard through routing and forwards the write request to it. After the primary shard finishes writing, it sends the write to its replica shards. Each replica shard writes the data and then responds to the primary shard. When all replica shards have written the data, the primary shard returns the response to the coordinating node.

When a node receives a write request, it first writes the data to the index buffer. At this point the data cannot be searched. The refresh mechanism (every 1 second by default) moves the data into the file system cache as a small immutable segment, and only then can the data be searched (near real-time search). After a refresh the data is in the file system cache and has not been persisted to disk yet; it is persisted later by a flush.

When data is written to the index buffer, it is also written to the translog to prevent data loss. With asynchronous durability the translog is fsynced to disk every 5 seconds by default, so up to 5 seconds of writes can theoretically be lost; recent ES versions default to fsyncing the translog before each request is acknowledged. The translog does not grow indefinitely: once the data has been flushed to disk, the corresponding translog can be cleaned up.
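
The interplay of index buffer, refresh, translog, and flush described above can be sketched as a small in-memory model. It is only an illustration: the Shard class and its attribute names are hypothetical, and disk persistence is reduced to a counter.

```python
class Shard:
    """Toy model of one shard's write path (names are illustrative only)."""

    def __init__(self):
        self.index_buffer = []  # in memory, not searchable
        self.segments = []      # searchable segments (file system cache)
        self.committed = 0      # number of segments a flush has persisted
        self.translog = []      # operations not yet covered by a flush

    def index(self, doc):
        # A write goes to the index buffer and to the translog together.
        self.index_buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # Turn the buffer into a new immutable, searchable segment.
        if self.index_buffer:
            self.segments.append(tuple(self.index_buffer))
            self.index_buffer = []

    def flush(self):
        # Persist all segments; afterwards the translog can be truncated.
        self.refresh()
        self.committed = len(self.segments)
        self.translog.clear()

    def search(self, word):
        return [doc for segment in self.segments for doc in segment if word in doc]


shard = Shard()
shard.index("tomato soup")
before_refresh = shard.search("soup")  # the buffer is not searchable
shard.refresh()
after_refresh = shard.search("soup")   # the new segment is searchable
unflushed_ops = len(shard.translog)    # still needed for recovery
shard.flush()
print(before_refresh, after_refresh, unflushed_ops, len(shard.translog))
```

In this model a document becomes visible at refresh time and durable at flush time, and the translog covers the gap between the two.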

Segment Flush

A flush is triggered in any of the following cases:

1. No indexing request has arrived within 5 minutes.

2. The translog reaches a certain size. The default value is 512 MB.

3. The flush API is called.

Flush all indexes in the cluster:

curl -XPOST "http://127.0.0.1:9200/_flush/synced"

Flush a single index:

curl -XPOST "http://127.0.0.1:9200/demo/_flush/synced"

Creating one segment per second produces a large number of files and slows down search. ES therefore merges segments automatically in the background, deleting the old segments and releasing their resources. A merge can also be forced manually when the load on ES is low.

curl -XPOST "http://127.0.0.1:9200/demo/_forcemerge"

Search

The first node that a search request reaches acts as the coordinating node and is responsible for processing the request. The process is as follows:

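If the search is paginated, suppose the client requests from = 50, size = 50. The request sent to each shard is rewritten to from = 0, size = from + size = 100, so the coordinating node has to collect and sort shard count * (from + size) results. Finally, it selects the requested page, and the fetch phase retrieves those documents and returns them to the client.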
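
The cost of this approach is easier to see in a small simulation. The sketch below assumes each shard holds (score, doc_id) pairs and that higher scores rank first; shard_query and search are hypothetical helper names, not ES APIs.

```python
import heapq


def shard_query(shard_docs, from_, size):
    """Query phase on one shard: return its top (from + size) entries."""
    return heapq.nlargest(from_ + size, shard_docs)


def search(shards, from_, size):
    """Coordinating node: gather per-shard candidates, merge, cut the page."""
    candidates = []
    for shard_docs in shards:
        candidates.extend(shard_query(shard_docs, from_, size))
    merged = sorted(candidates, reverse=True)
    return merged[from_:from_ + size], len(candidates)


# Three shards with 200 documents each; global scores run from 0 to 599.
shards = [
    [(i * 3 + s, f"shard{s}-doc{i}") for i in range(200)]
    for s in range(3)
]
page, gathered = search(shards, from_=50, size=50)
print(f"returned {len(page)} hits after gathering {gathered} candidates")
```

Only 50 hits are returned, but 3 * (50 + 50) = 300 candidates are gathered and sorted, and that number grows with both the page depth and the shard count.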

Because of this process, ES by default limits from + size to 10,000 results to protect the system from deep paging. So what are the alternatives?

1. Modify index.max_result_window, which can be changed dynamically

curl -XPUT "http://127.0.0.1:9200/demo/_settings" -H 'Content-Type: application/json' -d '{"index.max_result_window": "1000000"}'

2. Scroll

The first query passes the scroll parameter, which marks it as a scroll query and sets the expiration time of the scroll context:

curl -XGET "http://127.0.0.1:9200/demo/_search?scroll=10s" -H 'Content-Type: application/json' -d '{"size": 20, "query": {"match_all": {}}}'

The response contains a _scroll_id (the rest of the response is omitted):

{
  "_scroll_id": "i1BOVdVWjJyU..."
}

Subsequent queries take the _scroll_id returned last time as an argument:

curl -XGET "http://127.0.0.1:9200/_search/scroll" -H 'Content-Type: application/json' -d '{"scroll": "10s", "scroll_id": "i1BOVdVWjJyU..."}'

Scroll has to keep a snapshot context on the ES server that holds on to the segments. Segment merging can continue, but the old segments cannot be deleted to free resources until the context expires or is cleared manually:

curl -XDELETE "http://127.0.0.1:9200/_search/scroll" -H 'Content-Type: application/json' -d '{"scroll_id": "i1BOVdVWjJyU..."}'

When traversing large amounts of data, sliced scroll can be used to speed things up.

3. Search After

In a Search After request, from must always be 0:

curl -XGET "http://127.0.0.1:9200/demo/_search" -H 'Content-Type: application/json' -d '{"from": 0, "size": 10, "query": {"match_all": {}}, "search_after": ["<sort value of the last hit on the previous page>"], "sort": [{"_id": {}}]}'

You must specify a sort, and the sort order must be stable, so the last sort field is usually _id. The sort values of the last element on the previous page are used as the condition for the next query. In contrast to Scroll, Search After is stateless on the server and does not tie up ES resources.
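
The cursor idea behind Search After can be sketched in a few lines. This is a local simulation under assumptions, not a client for the ES API: search_after_page is a hypothetical function, and the price field is made up to show why _id is needed as a tiebreaker.

```python
def sort_key(doc):
    # Sort by price first, then by _id so that ties have a stable order.
    return (doc["price"], doc["_id"])


def search_after_page(docs, size, search_after=None):
    """Return up to `size` docs whose sort key comes after `search_after`."""
    ordered = sorted(docs, key=sort_key)
    if search_after is not None:
        ordered = [d for d in ordered if sort_key(d) > tuple(search_after)]
    return ordered[:size]


docs = [{"_id": f"{i:03d}", "price": i % 4} for i in range(10)]

seen, cursor = [], None
while True:
    page = search_after_page(docs, size=3, search_after=cursor)
    if not page:
        break
    seen.extend(d["_id"] for d in page)
    # The sort values of the last hit become the cursor for the next page.
    cursor = list(sort_key(page[-1]))

print(seen)
```

The only state carried between pages is the cursor held by the client, so nothing has to be kept alive on the server. Without the _id tiebreaker, documents with the same price could be skipped or repeated across pages.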

Conclusion

This article gave a brief overview of some basic ES concepts and the general flow of writing and searching. Does it look like most distributed databases? Later articles will analyze the principles and performance of ES further.
