Elasticsearch is a powerful open-source distributed search and analytics engine, currently used by many major Chinese Internet companies, including Ctrip, Didi, Toutiao, Ele.me, 360 Security, Xiaomi, and Vivo.

In addition to search, the Elastic Stack (Elasticsearch combined with Kibana, Logstash, and Beats) is widely used for near-real-time analysis of big data, including log analysis, metrics monitoring, and information security.

It can help you explore mountains of structured and unstructured data, create visual reports on demand, set alert thresholds on monitored data, and even automatically identify anomalies using machine learning techniques.

Today, we’ll take a top-down, outside-in look at how ElasticSearch works internally and try to answer the following questions:

Why does my search for *foo-bar* not match foo-bar?

Why can adding more documents make the index smaller?

Why does ElasticSearch take up so much memory?

ElasticSearch, Illustrated

Elasticsearch version: 2.2.0

① A cluster in the cloud

The diagram below:

② Boxes in the cluster

Each white square box inside the cloud represents a node (Node).

③ Between nodes

Multiple green squares grouped together on one or more nodes form an ElasticSearch index.

④ Little squares in the index

Small green squares distributed across multiple nodes under an index are called shards.

⑤ Shard = Lucene Index

An ElasticSearch Shard is essentially a Lucene Index.

Lucene is a full-text search library (one among many kinds of search libraries); ElasticSearch is built on top of Lucene.

Most of the rest of this story is really about how ElasticSearch works on top of Lucene.

Lucene, Illustrated

Mini index: Segment

A Lucene index is made up of many small segments; we can think of each Segment as a mini-index inside Lucene.

Inside a Segment

The Segment has a number of data structures inside it, as shown above:

Inverted Index

Stored Fields

Document Values

Cache

Perhaps the most important: the Inverted Index

The diagram below:

Inverted Index mainly consists of two parts:

An ordered term dictionary (each Term together with its frequency).

A postings list for each Term (that is, the documents in which the term appears).

When we search, we first break the query down into terms, then look each Term up in the dictionary to find the documents related to the search.
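To make this concrete, here is a minimal Python sketch of an inverted index. The documents and the crude lowercase-and-split analysis are made up for illustration; Lucene's real structures are far more elaborate:

```python
from collections import defaultdict

docs = {
    1: "the fury of the storm",
    2: "choice and coming storms",
}

postings = defaultdict(set)   # Term -> ids of documents containing it
frequency = defaultdict(int)  # Term -> total number of occurrences

for doc_id, text in docs.items():
    for term in text.lower().split():  # crude analysis: lowercase + split
        postings[term].add(doc_id)
        frequency[term] += 1

dictionary = sorted(postings)  # the term dictionary is kept ordered

def search(query):
    """Analyze the query the same way, then intersect the postings lists."""
    result = set(docs)
    for term in query.lower().split():
        result &= postings.get(term, set())
    return sorted(result)

print(search("The Fury"))  # -> [1]
```

This also hints at the answer to the first question above: a document only matches if the query analyzes into the same Terms that were indexed.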

① Query “The Fury”

The diagram below:

② Autocompletion (prefix)

If I want to find all terms starting with the letter “c”, I can simply locate terms such as “choice” and “coming” in the Inverted Index table by Binary Search, because the dictionary is ordered.
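A minimal sketch of that prefix lookup, using Python's bisect over an illustrative term dictionary:

```python
import bisect

dictionary = ["and", "choice", "coming", "fury", "our", "storm", "the"]

def terms_with_prefix(prefix):
    """Binary-search for the first candidate, then scan forward."""
    start = bisect.bisect_left(dictionary, prefix)
    matches = []
    for term in dictionary[start:]:
        if not term.startswith(prefix):
            break  # the dictionary is sorted, so we can stop here
        matches.append(term)
    return matches

print(terms_with_prefix("c"))  # -> ['choice', 'coming']
```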

③ Expensive search

If instead you want to find all terms that contain the letters “our”, the system has to scan the entire Inverted Index, which is very expensive.

In this case, if we want to optimize, the problem we face becomes: how do we generate suitable Terms?

④ Transforming the problem

The diagram below:

For these and similar problems, there are several possible solutions:

suffix → xiffus: if we want to use suffixes as search criteria, we can index the reversed Term, turning a suffix query into a prefix query.

(60.6384, 6.5017) → u4u8gyykk: GEO location information can be converted to a Geohash.

123 → {1-hundreds, 12-tens, 123}: for simple numbers, multiple Terms at different precisions can be generated, so a range query only has to match a few coarse Terms.
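A minimal sketch of two of these transformations (the Geohash encoding is omitted for brevity):

```python
def suffix_term(term):
    # Indexing the reversed Term turns a suffix query into a prefix query.
    return term[::-1]

def numeric_terms(n):
    # Index a number at several precisions, so a range query can match a
    # few coarse Terms instead of enumerating every value.
    s = str(n)
    return [s[:i] for i in range(1, len(s) + 1)]

print(suffix_term("suffix"))  # -> 'xiffus'
print(numeric_terms(123))     # -> ['1', '12', '123']
```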

⑤ Fix spelling mistakes

One Python library solves the spelling problem by generating, for each word, a tree-shaped state machine that encodes misspelling information (essentially a Levenshtein automaton).

⑥ Stored Field lookups

An inverted index answers “which documents contain this term”, but not “what are the original contents of this document”. To answer that, Lucene provides another data structure, Stored Fields.

Essentially, Stored Fields is a simple key-value store. By default, ElasticSearch stores the JSON source of the entire document in it.

⑦ Document Values for sorting and aggregation

Even so, the structures above still do not solve sorting, aggregations, and facets well, because they can force us to read a lot of information we don't need.

So yet another data structure solves this problem: Document Values. This structure is essentially columnar storage, highly optimized for fields whose values all have the same type.

To improve efficiency, ElasticSearch can read all the Document Values under an index into memory before operating on them. This greatly improves access speed, but also consumes a large amount of memory (a big part of why ElasticSearch is memory-hungry).
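A toy sketch of the difference, with made-up data: to aggregate over one field, the columnar layout only touches that field's values, while the row-oriented Stored Fields would have to load whole documents:

```python
# Stored Fields: row-oriented, one record per document.
stored_fields = [
    {"title": "doc one", "price": 10},
    {"title": "doc two", "price": 30},
    {"title": "doc three", "price": 20},
]

# Document Values: column-oriented, one array per field.
doc_values = {"price": [10, 30, 20]}

# Row storage must walk every document to aggregate one field...
avg_rows = sum(d["price"] for d in stored_fields) / len(stored_fields)

# ...while columnar storage reads only the values it needs.
avg_cols = sum(doc_values["price"]) / len(doc_values["price"])

print(avg_rows, avg_cols)  # -> 20.0 20.0
```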

These, then, are the data structures inside a Segment: the Inverted Index, Stored Fields, Document Values, and their caches.

When a search happens

When searching, Lucene queries every Segment, returns each Segment's results, and then consolidates them and presents them to the client.

Several properties of Lucene matter for this process:

**Segments are immutable:** What about deletes? When a delete occurs, all Lucene does is flag the document as deleted; the document itself stays where it was, unchanged.

What about updates? An update is essentially a delete followed by a re-index (see the sketch after this list).

**Compression everywhere:** Lucene is very good at compressing data, and almost every textbook compression method can be found in Lucene.

**Caches everything:** Lucene also caches all kinds of information, which greatly improves its query efficiency.
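A toy sketch of the immutability rules above, assuming a per-segment array of "live" flags in the spirit of Lucene's live-docs bitset (not its actual implementation):

```python
class Segment:
    """Once written, a segment's documents are never modified in place."""

    def __init__(self, docs):
        self.docs = list(docs)          # immutable after creation
        self.live = [True] * len(docs)  # a delete only flips a flag

    def delete(self, doc_id):
        self.live[doc_id] = False       # the document itself stays put

    def search(self, term):
        return [i for i, text in enumerate(self.docs)
                if self.live[i] and term in text.split()]

old = Segment(["the fury", "storm coming"])
old.delete(0)                        # delete = flag it, rewrite nothing
new = Segment(["the fury revised"])  # update = delete + re-index elsewhere
print(old.search("fury"), new.search("fury"))  # -> [] [0]
```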

The story of caching

When ElasticSearch indexes a document, it first buffers it in a cache and then refreshes the data periodically (every second by default) so that the document becomes searchable.
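That refresh interval is configurable per index. A minimal sketch using the REST API through the Python requests library, assuming a node at localhost:9200 and an index named my_index:

```python
import requests

# Relax the refresh interval for one index (the default is "1s");
# longer intervals trade search freshness for indexing throughput.
requests.put(
    "http://localhost:9200/my_index/_settings",
    json={"index": {"refresh_interval": "30s"}},
)
```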

Over time, we have a lot of Segments, like this:

So ElasticSearch merges these segments; in the process, the merged-away segments are deleted, and documents that were flagged as deleted are dropped for good.

This is why adding more documents can make the index take up less space: it can trigger a Merge, which results in more compression.
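A merge can also be triggered by hand through the force-merge API (known as _optimize before ElasticSearch 2.1). A sketch against the same assumed node and index:

```python
import requests

# Ask ElasticSearch to merge the index down to a single segment.
requests.post(
    "http://localhost:9200/my_index/_forcemerge",
    params={"max_num_segments": 1},
)
```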

An example

Two segments will Merge:

These two segments will be merged into a new Segment and then eventually deleted, as shown below:

At this point, the new Segment is Cold in the cache, while most of the other segments remain unchanged and stay Warm.

The above scenario often occurs inside Lucene Index, as shown below:

Searching within a Shard

ElasticSearch searches a Shard in the same way Lucene searches a Segment.

Unlike Lucene segments, though, shards may be distributed across different nodes, so all the information travels over the network when searching and returning results.

**One search across two shards = two separate shard searches.**

**Processing log files:** When we want to search for logs generated on a specific date, chunking log files by timestamp into separate indexes greatly improves search efficiency.

It also makes deleting old data very convenient: just delete the old index.
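A minimal sketch of this pattern, assuming a local node and illustrative index names:

```python
import datetime
import requests

BASE = "http://localhost:9200"
today = datetime.date.today()

# Write each log entry into the index for its day.
requests.post(
    f"{BASE}/logs-{today:%Y.%m.%d}/log",
    json={"@timestamp": today.isoformat(), "message": "disk usage at 91%"},
)

# Expiring old data is as cheap as deleting an old index.
cutoff = today - datetime.timedelta(days=30)
requests.delete(f"{BASE}/logs-{cutoff:%Y.%m.%d}")
```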

In the example above, each Index has two Shards.

How to Scale

The diagram below:

Shards are not split further, but they may be moved to different nodes.

So if the pressure on the cluster grows past a certain point, we may consider adding new nodes; but since shards cannot be split, giving an index more shards would require reindexing all of its data, which is undesirable.

Therefore, at planning time we need to think carefully about how to strike a balance between having too few and too many nodes and shards.

Node allocation and Shard optimization:

Allocate machines with better performance to the nodes holding the more important data indexes.

Ensure that each Shard has a Replica.

**Routing:** Every node keeps a routing table, so a request sent to any node can be forwarded by ElasticSearch to the shards on the nodes that hold the desired data for further processing.
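Routing can also be steered explicitly. A sketch with an assumed local node, index, and routing key, showing that documents sharing a routing value land on, and are searched from, the same shard:

```python
import requests

BASE = "http://localhost:9200"

# Index a document with an explicit routing key; documents that share
# a key are always placed on the same shard.
requests.put(
    f"{BASE}/my_index/user/1",
    params={"routing": "user123"},
    json={"name": "someone"},
)

# A search with the same routing key is forwarded to that shard only.
requests.post(
    f"{BASE}/my_index/_search",
    params={"routing": "user123"},
    json={"query": {"match_all": {}}},
)
```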

A real request

The diagram below:

① The Query

The diagram below:

The query is of the filtered type and wraps a multi_match Query.
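A sketch of such a request body, sent with the Python requests library; the index, fields, and filter condition here are assumptions, and note that the filtered query is ElasticSearch 2.x syntax (later replaced by the bool query):

```python
import requests

body = {
    "query": {
        "filtered": {                       # ElasticSearch 2.x syntax
            "query": {
                "multi_match": {
                    "query": "the fury",
                    "fields": ["title", "body"],
                }
            },
            "filter": {
                "term": {"status": "published"}  # scores are not affected
            },
        }
    }
}
requests.post("http://localhost:9200/articles/_search", json=body)
```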

② The Aggregation

The diagram below:

An aggregation on the author field is carried out to obtain the top 10 authors alongside the top 10 hits.
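A sketch of such an aggregation request, again with assumed index and field names:

```python
import requests

body = {
    "size": 10,  # the top 10 hits
    "aggs": {
        "top_authors": {
            "terms": {"field": "author", "size": 10}  # the top 10 authors
        }
    },
}
requests.post("http://localhost:9200/articles/_search", json=body)
```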

③ Request distribution

This request can be distributed to any node in the cluster, as shown below:

④ God node

The diagram below:

The node becomes the Coordinator of the current request and decides:

Based on the index information, which shards, on which nodes, the request should be routed to.

Which replicas are available.

And so on.

⑤ Routing

The diagram below:

⑥ Before the real search

ElasticSearch converts Query to Lucene Query.

The calculation is then performed in every Segment, as shown below:

The Filter conditions themselves are also cached, as shown below:

Queries are not cached, so if the same Query is executed repeatedly, the application needs to cache it itself.

So:

Filters can be used at any time.

Queries are used only when a Score is needed.

⑦ Returning the results

After the search completes, the results are returned layer by layer along the path the request came down, as shown below: