Elasticsearch Top 51 Most Important Interview Questions and Answers


The questions and answers in this list come from a foreign blog. The original answers are not accurate and contain mistakes, so to avoid misleading readers I have written my own answer to each question based on my own understanding.

The questions are very basic and the article is a bit long, but please bear with me and read through them. I hope this will help your Elastic job search!

1. What is Elasticsearch?

For the sake of rigor, the following paragraph is copied directly from the official website: www.elastic.co/cn/elastics…

Elasticsearch is a distributed, RESTful search and data analysis engine that addresses a variety of emerging use cases. At the heart of the Elastic Stack, it stores your data centrally, helping you find what you expect and what you don’t expect.

Elasticsearch is a search server based on Lucene. It provides a distributed, multi-user-capable full-text search engine with a RESTful web interface. Elasticsearch, developed in Java and released as open source under the Apache license, is a popular enterprise-level search engine.

The core features are as follows:

Distributed real-time document storage in which every field is indexed and searchable.
A distributed real-time analytics and search engine that responds to queries over massive data in near real time (within seconds).
A simple RESTful API that is naturally compatible with development in many languages.
Easy to scale out, handling petabytes of structured or unstructured data.

2. Can you specify the stable version of Elasticsearch currently available for download?

At the time of writing (21 November 2020), the latest version of Elasticsearch is 7.10.

Why ask this? ES is updated very quickly. A candidate who knows and uses the latest version is clearly following ES releases and, more broadly, cares about how technology iterates and evolves.

But ask yourself honestly: many job seekers know only the ES version they happen to use.

3. Do you need any components to install Elasticsearch?

Earlier versions of ES required a separately installed JDK. Since 7.x a JDK is bundled with the distribution, so no third-party dependencies are needed.

4. Can you explain how to start the Elasticsearch server?

There are several ways to start it. Generally, you run the following from the bin directory:

./elasticsearch -d 

The -d flag starts Elasticsearch in the background (as a daemon).

Open a browser and go to http://<ES_IP>:9200 to check whether the cluster started successfully.

If a startup error occurs, the log contains detailed information; check the errors one by one to resolve them.

5. Can you name 10 companies that use Elasticsearch as their search engine or database?

I (Ming Yi) originally wanted to delete this question. On reflection, though, it at least shows whether the candidate's horizons are broad enough.

If you have taken part in the Elastic Chinese community or follow it regularly, you will know that there are a great many such companies (in no particular order):

Alibaba
Tencent
Baidu
JD.com (Jingdong)
Meituan
Xiaomi
DiDi
Ctrip
Toutiao
Beike Zhaofang
360
IBM
SF Express

Elasticsearch is used by almost every Internet company we can think of.

Following the technology trends and technical blogs of top Internet companies is also a very good way to learn.

6. What is an Elasticsearch cluster?

An Elasticsearch cluster is a set of one or more Elasticsearch node instances that are connected together.

A cluster is designed to distribute tasks, searching, and indexing across all of its nodes.

7. What is an Elasticsearch node?

A node is a single instance of Elasticsearch. In practice we say, for example, that an ES cluster consists of three nodes, or of seven nodes.

A node is an independent Elasticsearch process, typically deployed on its own server, VM, or container.

Nodes can be classified into the following types based on roles:

Master node

Manages cluster-wide configuration and the adding and removing of nodes across the cluster.

Data node

Stores data and performs data operations such as CRUD (create/read/update/delete), search, and aggregation.

Client node (also called coordinating node)

Forwards cluster-level requests to the master node and data-related requests to the data nodes.

Ingest node

Preprocesses documents before indexing.


8. Explain the concept of an index in an Elasticsearch cluster.

An Elasticsearch cluster can contain multiple indexes. Compared with a relational database, an index is roughly equivalent to a database table.

Other analogous concepts were shown in a comparison table in the original post and are only mentioned in passing here.

9. Explain the concept of Type in Elasticsearch.

ES 5.X and the earlier 2.X and 1.X versions support multiple types in one index. For example, what the join field type does in ES 6.X was implemented with multiple types in earlier versions.

Indexes created in version 6.0.0 or later can contain only one mapping type.

Types are deprecated in the Elasticsearch 7.0.0 APIs and completely removed in 8.0.0.

Many people wonder why types were removed. See:

www.elastic.co/guide/en/el…

10. Can you define a mapping in Elasticsearch?

Mapping is the process of defining how documents and the fields they contain are stored and indexed.

For example, a mapping definition specifies:

Which string fields should be defined as text.
Which fields should be defined as: number, date, or geolocation type.
Custom rules to control the types of fields that are dynamically added.

11. What is a document in Elasticsearch?

A document is a JSON object stored in Elasticsearch. It is equivalent to a row in a relational database table.

12. What is sharding in Elasticsearch?

When the number of documents grows beyond the disk capacity and processing power of a single node, responses to client requests are delayed.

In that case the index data is divided into smaller pieces called shards. This process, called sharding, speeds up the retrieval of search results.
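How a document ends up on a particular shard can be sketched roughly as follows. This is a simplified illustration, not Elasticsearch's code: the shard is chosen from the document's routing value (the document ID by default) modulo the number of primary shards. The helper name `pick_shard` is hypothetical, and `zlib.crc32` is only a stand-in for the hash function Elasticsearch uses internally.

```python
import zlib
from collections import Counter


def pick_shard(routing_value: str, number_of_primary_shards: int) -> int:
    """Map a routing value (by default the document ID) to a shard number.

    zlib.crc32 is a stand-in hash for illustration only; Elasticsearch
    uses its own hash function internally.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_primary_shards


# Spread 1,000 hypothetical document IDs over 5 primary shards.
distribution = Counter(pick_shard(f"doc-{i}", 5) for i in range(1000))
print(sorted(distribution.items()))
```

The same routing value always maps to the same shard, which is one reason the number of primary shards cannot simply be changed after an index has been created.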

13. What are the benefits of defining and creating replicas?

A replica is a copy of a shard. Replicas improve query throughput under heavy load and provide high availability.

High availability here mainly means that if primary shard 1 fails, the corresponding replica shard 1 is promoted to primary, so the cluster stays available.

14. Explain how to add or create an index in an Elasticsearch cluster.

To add a new index, use the create index API. The request body can specify settings, field mappings, and index aliases.

Indexes can also be created from an index template.
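As a minimal sketch of what a create index request can look like, the snippet below builds a request body with the three sections mentioned above and prints it in the console style used in this article. The index name, field names, alias name, and shard counts are hypothetical values chosen for illustration.

```python
import json

index_name = "test_001"  # hypothetical index name

create_index_body = {
    # Index-level settings (values here are illustrative assumptions).
    "settings": {"number_of_shards": 3, "number_of_replicas": 1},
    # Field mappings: how each field is stored and indexed.
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "user_id": {"type": "keyword"},
            "created_at": {"type": "date"},
        }
    },
    # An alias that points at the new index.
    "aliases": {"test_alias": {}},
}

print(f"PUT {index_name}")
print(json.dumps(create_index_body, indent=2))
```

All three sections are optional; an index created with an empty body gets default settings and dynamic mappings.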

15. What is the syntax for deleting an index in Elasticsearch?

An existing index can be deleted with the following syntax:

DELETE <index_name>

Wildcard deletion is supported:

DELETE my_*

16. What is the syntax for listing all indexes of a cluster in Elasticsearch?


GET _cat/indices

17. What is the syntax for updating the mapping of an index?



PUT test_001/_mapping
{
  "properties": {
    "title": { "type": "keyword" }
  }
}

18. What is the syntax for retrieving documents by ID in Elasticsearch?

GET test_001/_doc/1

19. Explain relevance and scoring in Elasticsearch.

When you search the Internet for "apple", the results may be about the fruit or about the company named Apple.

You may want to buy the fruit online, look up recipes that use it, or read about the health benefits of eating apples.
Or you may want to visit Apple.com for the company's latest product range, or check the company's stock price and its performance on NASDAQ over the last six months, one year, or five years.

Likewise, when you search for documents in Elasticsearch, you want the information most relevant to your need. Relevance is calculated by the Lucene scoring algorithm, which estimates how well each document matches the query.

ES returns all matching content, with higher-scoring results first and lower-scoring results after them.

The two core factors in the score are term frequency (TF) and inverse document frequency (IDF, the rarity of the term across documents).

Roughly speaking: the more often a term appears in a single document, the higher that document's score; the rarer the term is across all documents, the higher the score.
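The two factors can be illustrated with a classic TF-IDF calculation over a few toy documents. This is only a sketch of the idea: the documents are invented, and Lucene's actual similarity (BM25 by default in recent versions) uses a different formula that also accounts for field length.

```python
import math

# Three hypothetical documents, already lowercased.
docs = {
    "d1": "apple pie recipe with fresh apple slices",
    "d2": "apple releases new phone",
    "d3": "fresh fruit salad recipe",
}


def tf_idf(term: str, doc_id: str) -> float:
    """Classic TF-IDF: term frequency times inverse document frequency."""
    words = docs[doc_id].split()
    tf = words.count(term) / len(words)
    docs_with_term = sum(1 for text in docs.values() if term in text.split())
    idf = math.log(len(docs) / docs_with_term) if docs_with_term else 0.0
    return tf * idf


# Rank all documents for the query term "apple", highest score first.
ranking = sorted(docs, key=lambda doc_id: tf_idf("apple", doc_id), reverse=True)
print(ranking)
```

Here d1 outranks d2 for "apple" because the term makes up a larger share of d1, while "phone" scores higher than "apple" within d2 because it occurs in fewer documents.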

20. What are the various possible ways we can perform searches in Elasticsearch?

The main approaches are as follows:

Method 1: Search with the query DSL. Elasticsearch provides a complete JSON-based query DSL for defining queries.

GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "color": "red" } },
        { "term": { "brand": "gucci" } }
      ]
    }
  }
}

Method 2: Search based on the URL (query-string parameters)

GET /my_index/_search?q=user:seina

Method 3: SQL search

POST /_sql?format=txt
{
  "query": "SELECT * FROM \"uint-2020-08-17\" ORDER BY itemid DESC LIMIT 5"
}

SQL support is not feature-complete, so this method is not recommended.

21. What types of queries does Elasticsearch support?

There are two main categories of query: exact match and full-text search.

Exact match queries, such as term, exists, terms_set, range, prefix, ids, wildcard, regexp, and fuzzy.
Full-text queries, such as match, match_phrase, multi_match, match_phrase_prefix, and query_string.

22. What are the differences between precise matching retrieval and full-text matching retrieval?

The essential difference between the two:

An exact match answers the question: is it exactly the same?

Example: zip codes and ID numbers are usually matched exactly.

Full-text search answers the question: is it relevant?

Example: searching Bilibili for keywords such as "Ma Baoguo video" is usually a fuzzy match; returning relevant results is enough.
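The difference can be sketched with a toy example. The functions below are illustrative stand-ins, not Elasticsearch code: `exact_match` compares the stored value with the query as-is, while `full_text_match` runs both sides through a simple hypothetical analyzer and matches if any term overlaps, similar in spirit to a match query with its default OR operator.

```python
import re


def analyze(text: str) -> list:
    """Toy analyzer: lowercase, then split on non-alphanumeric characters."""
    return [term for term in re.split(r"[^0-9a-z]+", text.lower()) if term]


def exact_match(field_value: str, query: str) -> bool:
    """Exact match: the stored value must equal the query exactly."""
    return field_value == query


def full_text_match(field_value: str, query: str) -> bool:
    """Full-text match: analyze both sides, match if any term overlaps."""
    return bool(set(analyze(field_value)) & set(analyze(query)))


title = "Quick Brown Fox"
print(exact_match(title, "quick brown fox"))   # case differs, so no exact match
print(full_text_match(title, "BROWN dog"))     # the term "brown" overlaps
```

This is also why exact match queries suit keyword fields such as IDs, while full-text queries suit analyzed text fields.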

23. What is aggregation in Elasticsearch?

Aggregations summarize the data matched by a search into statistical metrics for reporting or further analysis. Aggregations can help answer questions such as:

What is the average load time of my website?
Who is my most valuable customer based on transaction volume?
What would be considered a big file on my network?
How many products are there in each product category?

According to the official 7.10 documentation there are three categories of aggregation (earlier versions listed four):

Bucket aggregation

Groups documents into buckets (also known as bins) based on field values, ranges, or other criteria.

Metric aggregation

Calculates metrics (such as a sum or an average) from field values.

Pipeline aggregation

Takes its input from the output of other aggregations rather than from documents or fields.
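The three categories can be imitated on an in-memory list to make the distinction concrete. This is a plain Python sketch over invented order data, not the aggregation DSL; the steps are loosely analogous to a terms bucket aggregation, an avg metric aggregation, and a max_bucket pipeline aggregation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical order documents.
orders = [
    {"category": "shoes", "price": 80.0},
    {"category": "shoes", "price": 120.0},
    {"category": "shirts", "price": 30.0},
    {"category": "shirts", "price": 50.0},
    {"category": "hats", "price": 20.0},
]

# Bucket aggregation: group documents by a field value.
buckets = defaultdict(list)
for order in orders:
    buckets[order["category"]].append(order["price"])
doc_count = {category: len(prices) for category, prices in buckets.items()}

# Metric aggregation: compute a metric from field values inside each bucket.
avg_price = {category: mean(prices) for category, prices in buckets.items()}

# Pipeline aggregation: operate on the output of another aggregation.
max_avg_category = max(avg_price, key=avg_price.get)

print(doc_count, avg_price, max_avg_category)
```

The first result answers "how many products are there in each category?"; the pipeline step answers a question about the buckets themselves rather than about individual documents.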

24. Can you tell me about the data storage functionality in Elasticsearch?

Elasticsearch is a search engine. Writing data into ES is called indexing. During indexing, the data is serialized as a JSON document and stored according to the specified mapping.

25. What is an Elasticsearch analyzer?

Analyzers perform text analysis and can be built-in or custom. An analyzer has three core parts, applied in order: character filters, a tokenizer, and token filters.

See also the article: "Elasticsearch custom word segmentation, starting from one problem".

26. Can you list the various types of analyzers in Elasticsearch?

An Elasticsearch analyzer can be a built-in analyzer or a custom analyzer.

Standard Analyzer

The standard analyzer is the default; it is used when no analyzer is specified.

It is based on the Unicode text segmentation algorithm and works well for most languages.

Whitespace Analyzer

Splits text into terms at whitespace characters.

Stop Analyzer

Like the simple analyzer, but also removes stop words.

Keyword Analyzer

Returns the entire input string as a single term, without splitting it.

Template for a custom analyzer

A custom analyzer is defined in the index settings:

PUT my_custom_index
{
  "settings": {
    "analysis": {
      "char_filter": {},
      "tokenizer": {},
      "filter": {},
      "analyzer": {}
    }
  }
}

Keep the three-part structure of an analyzer in mind (character filters, tokenizer, token filters). In this template:

1. "char_filter": {} corresponds to the character filter part.

2. "tokenizer": {} corresponds to the part that splits text into tokens.

3. "filter": {} corresponds to the token filter part, applied after tokenization.

4. "analyzer": {} is the analyzer itself, which combines parts 1, 2, and 3.

27. How is a tokenizer used in Elasticsearch?

A tokenizer receives a character stream (the filtered stream if character filters are configured, otherwise the original stream) and splits it into tokens. It also records each token's order or position, its start offset (start_offset), and its end offset (end_offset); end_offset - start_offset is the length of the token in the original text.
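A minimal sketch of what a whitespace-style tokenizer records, written in plain Python. The function name `whitespace_tokenize` is hypothetical; the output keys mirror the token, position, start_offset, and end_offset fields described above.

```python
import re


def whitespace_tokenize(text: str) -> list:
    """Split text at whitespace and record position and character offsets."""
    tokens = []
    for position, match in enumerate(re.finditer(r"\S+", text)):
        tokens.append(
            {
                "token": match.group(),
                "position": position,
                "start_offset": match.start(),
                "end_offset": match.end(),
            }
        )
    return tokens


for token in whitespace_tokenize("Quick brown fox"):
    print(token)
```

The offsets point back into the original text, which is what makes features such as highlighting possible after the text has been split.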

28. How does a token filter work in Elasticsearch?

A token filter post-processes the token stream produced by the tokenizer, for example by lowercasing tokens, removing tokens (stop words), or adding tokens (synonyms).
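The three kinds of change (modify, remove, add) can be sketched as a chain of small functions over a token list. This is an illustration in plain Python; the stop word list and the synonym rule are hypothetical.

```python
STOP_WORDS = {"the", "a", "is"}   # hypothetical stop word list
SYNONYMS = {"quick": ["fast"]}    # hypothetical synonym rule


def lowercase_filter(tokens: list) -> list:
    """Modify tokens: lowercase each one."""
    return [token.lower() for token in tokens]


def stop_filter(tokens: list) -> list:
    """Remove tokens: drop stop words."""
    return [token for token in tokens if token not in STOP_WORDS]


def synonym_filter(tokens: list) -> list:
    """Add tokens: emit configured synonyms after the original token."""
    result = []
    for token in tokens:
        result.append(token)
        result.extend(SYNONYMS.get(token, []))
    return result


def apply_filters(tokens: list) -> list:
    """Run the filters in order, each on the output of the previous one."""
    for token_filter in (lowercase_filter, stop_filter, synonym_filter):
        tokens = token_filter(tokens)
    return tokens


print(apply_filters(["The", "Quick", "fox"]))
```

Order matters: lowercasing first lets the stop word and synonym lists be written in lowercase only.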

29. How does ingest work in Elasticsearch?

An ingest node can be regarded as a node for preprocessing and transforming data before it is indexed, and it supports configurable pipelines. Ingest can filter and transform data, much like the filter stage in Logstash, and it is quite powerful.
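As a rough sketch, an ingest pipeline is a named list of processors applied to each document in order. The snippet below prints a pipeline definition in the console style used in this article and imitates its two processors locally. The pipeline name, field names, and values are hypothetical, and `simulate` is a toy stand-in, not how Elasticsearch executes pipelines.

```python
import json

pipeline_body = {
    "description": "hypothetical pipeline: normalize log level, tag environment",
    "processors": [
        {"lowercase": {"field": "level"}},
        {"set": {"field": "env", "value": "prod"}},
    ],
}


def simulate(doc: dict, processors: list) -> dict:
    """Toy local imitation of the two processors above."""
    doc = dict(doc)  # work on a copy, leave the input untouched
    for processor in processors:
        (name, config), = processor.items()
        if name == "lowercase":
            doc[config["field"]] = doc[config["field"]].lower()
        elif name == "set":
            doc[config["field"]] = config["value"]
    return doc


print("PUT _ingest/pipeline/my_pipeline")
print(json.dumps(pipeline_body, indent=2))
print(simulate({"level": "ERROR", "message": "disk full"}, pipeline_body["processors"]))
```

Each processor sees the document as left by the previous one, which is the same chaining idea as a sequence of Logstash filters.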

30. What is the difference between a master node and a master-eligible node?

The master node is responsible for cluster-wide operations, such as creating or deleting indexes, keeping track of which nodes are part of the cluster, and deciding which shards to assign to which nodes.

Having a stable master node is important for cluster health.

Master-eligible nodes are the nodes configured as candidates; any of them can be elected as the master node.

31. What are enabled, index, and store in Elasticsearch?

enabled: false applies only to the top-level mapping definition and to object fields. It makes Elasticsearch skip parsing of the field's contents entirely.

The JSON can still be retrieved from the _source field, but it cannot be searched or stored in any other way.

If you set enabled: false on a field that is neither the top-level mapping nor an object field, the following error occurs:

"type": "mapper_parsing_exception",
"reason": "Mapping definition for [user_id] has unsupported parameters: [enabled : false]"

index: false. The index option controls whether field values are indexed. It accepts true or false and defaults to true. Fields that are not indexed cannot be queried.

If you search on such a field anyway, the following error is reported:

"type": "search_phase_execution_exception",
"reason": "Cannot search on field [user_id] since it is not indexed."

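store:

By default, field values are indexed so they can be searched, but they are not stored separately; they remain available through _source. If you only want the values of one or a few fields rather than the entire _source, source filtering is usually enough.

In some special cases, such as a document with a very large _source, it makes sense to store a field separately (store: true) so that it can be retrieved without loading the whole _source. This is where store comes in handy.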

32. How is the character filter of an Elasticsearch analyzer used?

A character filter receives the raw text as a character stream and can transform the stream by adding, deleting, or changing characters.

Character filters come in the following types:

HTML Strip Character Filter

Purpose: removes HTML elements such as <b> and decodes HTML entities such as &amp;.

Mapping Character Filter

Purpose: replaces specified characters or strings with configured replacements.

Pattern Replace Character Filter

Purpose: replaces characters that match a regular expression.
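The three character filter types can be imitated in a few lines of Python to show what each one does to the raw text. These are toy stand-ins written for illustration, not Elasticsearch code; the function names and sample inputs are assumptions.

```python
import html
import re


def html_strip(text: str) -> str:
    """Remove HTML tags, then decode HTML entities such as &amp;."""
    return html.unescape(re.sub(r"<[^>]+>", "", text))


def mapping_filter(text: str, mappings: dict) -> str:
    """Replace each configured string with its replacement."""
    for source, target in mappings.items():
        text = text.replace(source, target)
    return text


def pattern_replace(text: str, pattern: str, replacement: str) -> str:
    """Replace every match of a regular expression."""
    return re.sub(pattern, replacement, text)


print(html_strip("<b>Tom &amp; Jerry</b>"))
print(mapping_filter(":) ok", {":)": "_happy_"}))
print(pattern_replace("123-456-789", r"(\d+)-(?=\d)", r"\1_"))
```

All three work on the character stream before tokenization, so they can change what the tokenizer sees, for example by keeping a hyphenated number together.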

33. Please explain NRT in Elasticsearch.

By default, the delay between indexing (writing) a document and the document becoming searchable is one second, so Elasticsearch is a near real-time (NRT) search platform.

In other words, with default settings a newly written document becomes searchable after about one second at the earliest, not instantly.

When tuning write performance, we usually raise refresh_interval dynamically, for example to 30s or higher. Newly written data then becomes searchable later, in exchange for better indexing throughput.
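A minimal sketch of such a dynamic adjustment, using only the Python standard library. It builds the HTTP request for the index settings endpoint without sending it; the base URL and index name are assumptions, and `build_refresh_interval_request` is a hypothetical helper.

```python
import json
from urllib.request import Request


def build_refresh_interval_request(base_url: str, index: str, interval: str) -> Request:
    """Build (but do not send) a PUT <index>/_settings request."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed local cluster address and index name.
request = build_refresh_interval_request("http://localhost:9200", "my_index", "30s")
print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Against a running cluster the request could be sent with urllib.request.urlopen(request). Because refresh_interval is a dynamic setting, it can be raised during a bulk load and lowered again afterwards.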

34. What are the advantages of the REST API in Elasticsearch?

A REST API lets systems communicate over the Hypertext Transfer Protocol (HTTP), exchanging requests and data in formats such as XML and JSON.

REST is stateless and separates the user interface from the server and the stored data, which makes the user interface portable to any type of platform. It also improves scalability and allows components to be implemented independently, so applications become more flexible.

REST APIs are platform- and language-independent; only the data exchange format (XML or JSON) is fixed.

Using the REST API, you can easily view cluster information or troubleshoot problems.

35. When installing Elasticsearch, explain the different packages and their importance.

There is not much to say here: download the installation package for your operating system from the official website.

Some features are paid, such as machine learning and advanced security features like Kerberos authentication.

36. What configuration management tools does Elasticsearch support?

Ansible
Chef
Puppet
SaltStack

These are the configuration management tools, commonly used by DevOps teams, that Elasticsearch supports.

37. Can you explain the functionality and importance of X-Pack for Elasticsearch?

X-Pack is an extension that is installed with Elasticsearch.

X-Pack's features include security (role-based access, privileges/permissions, role and user security), monitoring, reporting, alerting, and more.

38. Can you list the X-Pack APIs?

I have only trialled the paid features (just answer truthfully in the interview).

Since the security features became free in 7.1, X-Pack can be used to create spaces, roles, and users, configure SSL encryption, and give different users different passwords and permissions.

Other APIs cover machine learning, Watcher, Migration, and so on.

39. Can you list the X-Pack commands you use?

Since the security features became free in 7.1, once security is enabled I use the setup-passwords command (bin/elasticsearch-setup-passwords) to set passwords for the built-in accounts and keep the cluster secure.

40. What is the cat API in Elasticsearch?

The cat APIs provide an overview of the state and health of an Elasticsearch cluster, including information about aliases, shard allocation, indexes, node attributes, and more.

The cat commands accept query-string parameters and return the information as compact, human-readable text, with column headers when ?v is given; JSON output is available with format=json.

41. What are the cat commands in Elasticsearch?

In an interview you can mention a few of the key ones, including but not limited to:

Meaning	Command
Aliases	GET _cat/aliases?v
Allocation	GET _cat/allocation
Count	GET _cat/count?v
Field data	GET _cat/fielddata?v
Health	GET _cat/health?v
Indices	GET _cat/indices?v
Master node	GET _cat/master?v
Node attributes	GET _cat/nodeattrs?v
Nodes	GET _cat/nodes?v
Pending tasks	GET _cat/pending_tasks?v
Plugins	GET _cat/plugins?v
Recovery	GET _cat/recovery?v
Repositories	GET _cat/repositories?v
Segments	GET _cat/segments?v
Shards	GET _cat/shards?v
Snapshots	GET _cat/snapshots?v
Tasks	GET _cat/tasks?v
Templates	GET _cat/templates?v
Thread pool	GET _cat/thread_pool?v

42. Can you explain the Explore API in Elasticsearch?

I have not used it; it belongs to the Graph API.

A brief answer is enough here. Questions like this can simply be looked up when you actually need the feature, so asking them has little value.

www.elastic.co/guide/en/el…

43. How is the Migration API used in Elasticsearch?

The migration APIs simplify upgrading X-Pack indexes from one version to another.

Again, a brief answer is enough. Questions like this can simply be looked up when you actually need the feature, so asking them has little value.

www.elastic.co/guide/en/el…

44. How do you search for data in Elasticsearch?

The search API finds and retrieves data from an index; with the routing parameter, a search can be directed to specific shards.

45. Can you list the main field data types available in Elasticsearch?

String types: text, which supports full-text search, and keyword, which supports exact matching.
Numeric types, such as byte, short, integer, long, float, double, half_float, and scaled_float.
Date (date), date nanoseconds (date_nanos), boolean, binary (Base64-encoded string), and so on.
Range types: integer_range, long_range, double_range, float_range, and date_range.
Complex types that hold objects: object and nested.
Geo types for locations.
Special cases such as arrays (all values in an array should have the same data type).

46. What is the ELK Stack and what does it contain?

The ELK Stack is a suite made up of a search and analytics engine (Elasticsearch), a collection and transformation tool (Logstash), a data management and visualization tool (Kibana), parsing and log collection tools (Beats, to be followed by Agent), and monitoring and reporting tools (such as X-Pack).

In effect, users need almost no third-party technology to cover the whole pipeline: data ingestion, storage, search, visual analysis, and more.

47. Where and how is Kibana used with Elasticsearch?

Kibana is part of the ELK Stack, a log analysis solution.

It is an open source visualization tool that lets you analyze data visually by dragging and dropping to build custom charts, which greatly lowers the barrier to data analysis.

In the future it is expected to become similar to Tableau, the business intelligence and analytics software.

48. How does Logstash work with Elasticsearch?

Logstash is an open source, server-side ETL engine that is part of the ELK Stack. It collects and processes data from a variety of sources.

Typical uses include synchronizing log and mail data, relational database data (MySQL and Oracle), non-relational database data (MongoDB), real-time data streams from Kafka, and data from the high-performance cache Redis.

49. How does Beats work with Elasticsearch?

Beats are open source data shippers that send data either directly to Elasticsearch or through Logstash, where it can be processed or filtered before being viewed in Kibana.

The types of data shipped include audit data, log files, cloud data, network traffic, and Windows event logs.

50. How is Elastic Reporting used?

This is a paid feature; a basic understanding is enough.

The Reporting API generates reports of search results in PDF, PNG image, and CSV spreadsheet formats, which can be shared or saved as needed.

51. Can you list the application scenarios related to ELK log analysis?

E-commerce search solutions
Fraud identification
Market intelligence
Risk management
Security analytics, etc.

Summary

These are very, very basic questions. See the Elastic interview series for more.

An interview should be "peaceful", not a feud. Elastic interviewers should play fair (observe "martial ethics", in Ma Baoguo's phrase) and be polite.

Applicants should also take note: do not be careless! Interviewers come "prepared", so be ready to "dodge" difficult questions in time and give it your all.

If the applicant cannot answer, the interviewer should advise them to "behave yourself" (the "mouse tail juice" pun), and the applicant should reflect and not make the same mistake again.

Reference:

www.softwaretestinghelp.com/elasticsear…

