Improve Elasticsearch search experience by going to......

Elasticsearch tutorial live replay

1. A real-world problem

Q: How do you get the best search results?

I have a search feature implemented with IK word segmentation combined with a multi query.

Along the way, I added a dictionary of terms from the customer’s field.

But customers keep reporting that the search experience is not good.

What else can I do to improve the search experience?

From: Dead Hit Elasticsearch knowledge planet

This is a very typical problem that I have encountered in actual product development.

2. Search experience from a few examples

Example 1: Screenshot of the search for Trigger in MOX Net.

Note: I typed “trigger”. The first result returned is fine, but the next several, which match only “touch” or “send”, have essentially nothing to do with my search.

From a user-experience perspective, the experience is poor: a lot of irrelevant data is returned.

Example 2: a question bank app does not support paging.

As shown below, the question bank contains 1703 questions, including true/false and multiple-choice questions.

The only controls are “Previous question” and “Next question”.

Actual scenario:

At question 100 or 200 you still see only multiple-choice questions; how many multiple-choice questions are there in total?
After you quit, you need to click a few hundred times to get back to the last question you did.

This is not a bad user experience, it is no user experience at all. The developers did not think the design through, and users are left to “doubt life”.

Example 3: E-commerce search for “first long johns of autumn”: what should be returned?

Zoom in to view the image, the highlights appear

This is a matter of opinion, and each e-commerce company has its own judgment on what to return.

Still, we can simply judge from the user’s point of view.

A comment from Ming Yi:

Pinduoduo

“No wonder you are growing so fast”: it returned the expected results and also thoughtfully recommended “long johns” items in the region.

Taobao

At least it returns long johns.

Jingdong (JD.com)

No matching product was found, so it recommends “long johns” instead. “Why recommend them? Just return them directly.”

Dangdang

Wow! The recommendations are all “autumn” items. As a user, what would you think?

“What the hell?”
“Mixed emotions”
“Unintelligible”

Basically, the bottom line is: companies that grow fast tend to have a good search experience.

3. Where there is data, there is search

In today’s flood of information, search is everywhere. It can be summed up as: “where there is data, there is search”.

Search may be one of the functions people use most: study, work, clothing, food, housing, and travel are all inseparable from it.

Study

Enter keywords to search for reliable free or paid web resources.

Work

When you encounter an error code, search Google for the answer.

Search WeChat chat history to find key pieces of valuable information.

Clothing

Buying clothes online is really a process of searching and choosing.

Food

When ordering takeout at noon every day, choosing the takeout is itself a search: close to the company + highly rated = high probability of ordering.

Housing

Booking a hotel for a business trip means searching, comparing, and choosing a cost-effective one.

Travel

Self-driving trip: before setting off, enter the destination in Autonavi, and pick a suitable route from the returned results.

As an analysis of the search experience points out: “The design and usability of the search box is an issue that cannot be ignored.

A good search experience may not make users feel good about your product, but a bad search experience can be fatal.

Therefore, whether to provide users with better service or to avoid a negative user experience, a good search experience is crucial for a content-oriented product.”

The minimum bar for a good search experience is that the search results meet the user’s needs. The following are research findings and user concerns that contribute to a good search experience:

Search box:

1) Make the search box visually prominent and use a search box with the magnifying glass icon;

2) Place the search box where users expect it;

3) Provide a search button;

4) Give it the right size.

To put it bluntly: “putting the search box in the most prominent position of the navigation bar is the minimum respect you owe the user”!

Searchable content hint: tell users what they can search for

Every page should have a search box
Use intelligent recommendation/matching mechanisms

Intelligent recommendation or matching saves users input effort.

Average users are not very good at phrasing a search: if they don’t articulate the question at the first step, they will have a hard time finding the right results.

When smart matching works, it helps users express their search clearly and find satisfactory answers.

In short, a good search experience is a good user experience, and a good user experience is naturally linked to retention and even company growth.

4. Breaking down the five core stages of a user search

“Search is like a conversation between the user and the App or website, where the user asks for information and the App or website responds by showing results.

Users expect a smooth search experience, and users often form quick judgments about the value of an App based on the quality of the search results.”

In the process of search, the user’s experience can be roughly divided into five parts: discovering the search, entering keywords, waiting for the results, viewing the results, and completing the search. The experience of each step is part of the overall experience and will affect the user’s final search experience.

4.1 Discovering the search

As mentioned earlier, the search box should be eye-catching, even independent of the header, and should occupy a focal position in the UI so that the user can easily find it.

4.2 Entering Keywords

Ideally, hint to the user what keywords to type.
Provide “search suggestions” based on the characters the user has typed so far, as in the earlier Google search screenshot.
Support complex combined searches, like Google Advanced Search, with auxiliary controls for date filters, excluded keywords, sort order, AND/OR/NOT expressions, and so on.

4.3 Waiting for results

Respond quickly: users’ patience is limited, and if nothing comes back within about 3 seconds you will probably lose them.
If the response really is slow, show a loading animation or a friendly progress message.
Recognize the user’s input, combine it where necessary with the user’s search history and habits, and return the best top-N results.

4.4 Viewing Results

This is the stage where the user reviews the results returned for the search.
If there are no results, it is better not to return an empty list. Offer other recommended information, or prompt the user to change the keywords.

4.5 Completing the Search

If the results meet the user’s needs, the search ends.
If they don’t, the user will either keep searching with other keywords or leave for another app or website.

To improve the user experience, all of these steps are necessary.

5. Elasticsearch search logic

An Elasticsearch search can be understood through the following two processes.

The following applies only to full-text retrieval on fields of type text.

5.1 Write Indexing Process

Elasticsearch does not just store the document as written: it builds an inverted index from the tokens produced by the analyzer defined in your mapping (default: standard).
The choice of analyzer determines the granularity of the tokens, and the granularity of the tokens determines whether later searches against the index can meet your requirements.
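
To make the link between token granularity and recall concrete, here is a minimal sketch in plain Python. It is a toy model, not Elasticsearch code: the two tokenizers (`word_tokens`, `bigram_tokens`), the helper `build_inverted_index`, and the sample documents are all assumptions made up for illustration.

```python
from collections import defaultdict

def word_tokens(text):
    """Coarse-grained: lowercase and split on whitespace."""
    return text.lower().split()

def bigram_tokens(text):
    """Fine-grained: every 2-character window of each word."""
    grams = []
    for word in word_tokens(text):
        grams.extend(word[i:i + 2] for i in range(len(word) - 1))
    return grams

def build_inverted_index(docs, tokenize):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

docs = {1: "Elasticsearch tutorial", 2: "search experience"}
word_index = build_inverted_index(docs, word_tokens)
gram_index = build_inverted_index(docs, bigram_tokens)

# The word index only knows whole words, so "search" misses document 1.
print(sorted(word_index.get("search", set())))   # [2]
# Every bigram of "search" occurs in both documents, so both are candidates.
hits = set.intersection(*(gram_index.get(g, set()) for g in bigram_tokens("search")))
print(sorted(hits))                              # [1, 2]
```

The finer tokenizer recalls more documents at the cost of a larger index and more false candidates, which is the trade-off the choice of analyzer controls.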

5.2 Data retrieval process

At retrieval time, it is not simply “whatever you type is what gets searched”: different query statements use different retrieval mechanisms.
So the query type you choose can produce completely different results.

For example, the fine-grained match query and the coarser-grained phrase query match_phrase return very different results.

match: first tokenizes the keywords you type, then retrieves on the resulting tokens.

match_phrase: retrieves what you type as a single phrase.
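
The difference can be sketched with a toy model in plain Python. This is only an approximation under stated assumptions: `analyze` is a stand-in whitespace analyzer, the functions return a yes/no answer instead of a relevance score, and the sample documents are invented. Real Elasticsearch scores and ranks the hits.

```python
def analyze(text):
    """Stand-in analyzer: lowercase and split on whitespace."""
    return text.lower().split()

def match(query, doc):
    """Like a match query with the default OR operator: any query term may hit."""
    doc_terms = set(analyze(doc))
    return any(term in doc_terms for term in analyze(query))

def match_phrase(query, doc):
    """Like match_phrase with slop 0: terms must appear adjacent and in order."""
    q, d = analyze(query), analyze(doc)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

docs = [
    "improve the search experience",
    "experience a better search",
    "user experience design",
]
query = "search experience"
print([match(query, d) for d in docs])         # [True, True, True]
print([match_phrase(query, d) for d in docs])  # [True, False, False]
```

The same query hits all three documents with match but only the first with match_phrase.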

6. Quantifiable metrics for Elasticsearch search experience

User experience is subjective, but the quality of search results needs to be quantified.

How do you quantify it? The essential metrics are precision (sometimes translated as accuracy rate) and recall.

6.1 Recall

Definition: the ratio of relevant documents contained in the search results to all relevant documents in the entire collection.

It measures how completely the search retrieves the relevant documents.

6.2 Precision

Definition: the proportion of relevant documents among the search results.

It measures how accurate the search results are.

Both can be understood in terms of a confusion matrix:

	Relevant	Not relevant
Returned	True positive (TP)	False positive (FP)
Not returned	False negative (FN)	True negative (TN)

Given the above matrix, precision and recall can be calculated as follows:

Recall = TP / (TP + FN) * 100%

Precision = TP / (TP + FP) * 100%

If that is still unclear, a popular explanation on Zhihu is:

Recall: of all the truly positive samples, how many did you retrieve (how complete).
Precision: of the samples you judged positive, how many really are positive (how accurate).
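
As a worked example of the two formulas, the following sketch computes both values for a single query. The function name `precision_recall` and the document ids are assumptions for illustration; in practice the set of relevant documents comes from human judgments.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for one query as fractions in [0, 1]."""
    returned, relevant = set(returned), set(relevant)
    tp = len(returned & relevant)   # relevant and returned
    fp = len(returned - relevant)   # returned but not relevant
    fn = len(relevant - returned)   # relevant but not returned
    precision = tp / (tp + fp) if returned else 0.0
    recall = tp / (tp + fn) if relevant else 0.0
    return precision, recall

# Hypothetical judgment: documents 1, 2, 3 and 5 are relevant to the query.
precision, recall = precision_recall(returned=[1, 2, 4, 6], relevant=[1, 2, 3, 5])
print(f"precision={precision:.0%} recall={recall:.0%}")  # precision=50% recall=50%
```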

7. How to improve the Elasticsearch search experience

As mentioned earlier, the five stages of search are linked together. The search experience involves design, front end, back end, decision-makers, and management, and cannot be treated as a purely technical issue.

On the Elasticsearch back-end side:

7.1 Select an appropriate tokenizer based on the business scenario

Note that there is no best tokenizer and no universal tokenizer for all business scenarios; you need to choose the most suitable one for your scenario.

If fine granularity is required, so that anything containing the input is recalled, ngram tokenization is suitable, or preferably the wildcard field type introduced in 7.9.
Compare tokenization results in advance to verify whether a given tokenizer meets the business need. For Chinese, the choices include IK, jieba, ANSJ, and others.

The core API for comparing tokenization is _analyze; learn to use it flexibly.

POST _analyze
{
  "text": "Providing the world's leading Cloud Computing services _ Helping Enterprises to get on the Cloud without worry",
  "analyzer": "ik_smart"
}

If you choose IK, distinguish between ik_smart and ik_max_word.

ik_smart is coarse-grained tokenization (returns as few tokens as possible, close to how a person would segment the text);

ik_max_word is fine-grained tokenization (returns as many tokens as possible).
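
A small helper can make the comparison of two analyzers repeatable. The sketch below builds the _analyze request body and diffs the token lists of two responses. The helper names are assumptions, and the two sample responses are hand-written stand-ins that only mimic the shape of an _analyze response; they are not real IK output.

```python
import json

def analyze_body(text, analyzer):
    """Request body for POST _analyze with a named analyzer."""
    return {"analyzer": analyzer, "text": text}

def tokens_of(response):
    """Pull the token strings out of an _analyze response."""
    return [t["token"] for t in response["tokens"]]

def only_in_first(first, second):
    """Tokens produced by the first analyzer but not by the second."""
    seen = set(second)
    return [t for t in first if t not in seen]

print(json.dumps(analyze_body("cloud computing services", "ik_smart")))

# Hand-written stand-ins for two responses, NOT real IK output.
coarse = {"tokens": [{"token": "cloud computing"}, {"token": "services"}]}
fine = {"tokens": [{"token": "cloud computing"}, {"token": "cloud"},
                   {"token": "computing"}, {"token": "services"}]}
print(only_in_first(tokens_of(fine), tokens_of(coarse)))  # ['cloud', 'computing']
```

Running the same text through both analyzers on a real cluster and diffing the results this way shows which extra tokens the fine-grained mode would add to the index.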

7.2 Pay attention to dictionary selection and updates

As the saying goes, “even the cleverest housewife cannot cook without rice”: the tokenizer is the “clever housewife” and the dictionary is the “rice”.

No matter how good the tokenizer is, it is useless without a reliable dictionary.

So the better the dictionary, the more accurate the segmentation.

Suggestion: once the base dictionary is reasonably complete, add your own industry and domain dictionaries based on your business scenarios.

Even with industry and domain dictionaries added, what about new words they still do not cover?

For example, new internet slang and industry terms can never be covered completely, which leads to incorrect segmentation and a poor user experience. What can be done?

Because IK is a plug-in, its original dictionary does not support dynamic updates once configured, so updates have to be implemented through a third-party mechanism.

For example, one way to update the IK dictionary dynamically: modify the IK tokenizer source and pair it with dynamically updated MySQL entries to refresh the dictionary.

7.3 Take data modeling in the mapping seriously

fielddata on text fields consumes a lot of memory and is not recommended unless you really need it.
Whether to add a keyword type depends on whether sorting or aggregation is required.
For fields that do not need to be searched, set index to false.
For fields that do not need to be stored, set store to false.
For large text such as the content of Word and PDF files, consider cutting it into small pieces before storing it.
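
These points can be read together in one mapping. The sketch below builds a mapping body as a Python dictionary that could be sent when creating an index. Every field name (`title`, `thumbnail_url`, `content_chunk`) is a made-up assumption; only the mapping parameters themselves (`fields`, `index`, `store`, `ignore_above`) are standard Elasticsearch options.

```python
import json

# All field names below are made up for illustration.
mapping = {
    "mappings": {
        "properties": {
            # Full-text field; the keyword sub-field serves sorting and aggregation,
            # so fielddata on the text field can stay off (its default).
            "title": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            },
            # Kept in _source but never searched, so no index is built for it.
            "thumbnail_url": {"type": "keyword", "index": False},
            # One chunk of a large document: searchable, not stored separately.
            "content_chunk": {"type": "text", "store": False},
        }
    }
}

print(json.dumps(mapping, indent=2))
```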

7.4 Select an appropriate query type based on your business scenario

As mentioned above, match and match_phrase suit different scenarios.

match: suits scenarios that need high recall; recall is high but precision is low.
match_phrase: matches a phrase; precision is high but recall is low.
wildcard fuzzy matching is not recommended unless necessary.

Of course, there are other query types, such as query_string and fuzzy, that also need to be chosen according to the business scenario.
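
Between the two extremes there are intermediate settings. As a sketch, the following Python helpers build query bodies that move from high recall toward high precision using the `operator` option of match and the `slop` option of match_phrase. The helper names and the field name `title` are assumptions; the right setting depends on your data and has to be verified against real queries.

```python
import json

def match_query(field, text, operator="or"):
    """match: analyzed terms combined with OR (wide recall) or AND (stricter)."""
    return {"query": {"match": {field: {"query": text, "operator": operator}}}}

def match_phrase_query(field, text, slop=0):
    """match_phrase: terms in order; slop allows that many position moves."""
    return {"query": {"match_phrase": {field: {"query": text, "slop": slop}}}}

# "title" is a hypothetical field name.
for body in (
    match_query("title", "search experience"),              # highest recall
    match_query("title", "search experience", "and"),       # all terms required
    match_phrase_query("title", "search experience", 1),    # near-phrase
    match_phrase_query("title", "search experience"),       # exact phrase
):
    print(json.dumps(body))
```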

7.5 Pursuing the best response speed requires trade-offs

Users’ patience is very limited, so do not keep them waiting.

Increase the ratio of data node memory to heap memory
Do not return the _source field unless necessary
Do not do complex business processing in the retrieval and return phase

Including but not limited to:

1) Two or more levels of nested aggregation

2) Wildcard or regexp retrieval

3) Custom highlighting

Choose the highlighter type according to the business scenario

Note: FVH (fast vector highlighter) highlighting is especially suitable for large files (>1MB).

Make business trade-offs

For example: the default deep-paging limit of 10000 for from + size is usually enough; if the product manager disagrees, discuss it and convince them.

For example, approximate aggregation results are Elasticsearch’s default mechanism, so either accept that or choose a different technology (such as ClickHouse), and do not get hung up on the details.
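
The paging limit can be enforced in the application before a request ever reaches Elasticsearch. The following is a minimal sketch assuming the default `index.max_result_window` of 10000; the function name `page_request` and the error message are made up for illustration.

```python
MAX_RESULT_WINDOW = 10000  # default value of index.max_result_window

def page_request(page, size, query=None):
    """Build a from/size search body, refusing pages past the result window."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    offset = (page - 1) * size
    if offset + size > MAX_RESULT_WINDOW:
        raise ValueError(
            f"from + size = {offset + size} exceeds {MAX_RESULT_WINDOW}; "
            "narrow the query or switch to search_after"
        )
    return {"from": offset, "size": size, "query": query or {"match_all": {}}}

print(page_request(page=3, size=20))
# {'from': 40, 'size': 20, 'query': {'match_all': {}}}
try:
    page_request(page=501, size=20)
except ValueError as err:
    print(err)
```

Rejecting deep pages early gives the product a clear place to show a "refine your search" hint instead of a slow or failing request.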

7.6 Use intelligent recommendation and matching mechanisms

Simple search box suggestions can be implemented with a prefix query.
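
The idea behind prefix-based suggestions can be sketched in plain Python before looking at the Elasticsearch request: keep the terms sorted, jump to the first term that could match, and read forward while the prefix still holds. The function name `prefix_suggest` and the sample names are assumptions for illustration, and this toy does not reflect how Elasticsearch implements the prefix query internally.

```python
from bisect import bisect_left

def prefix_suggest(sorted_terms, prefix, limit=5):
    """Return up to `limit` terms that start with `prefix` (case-sensitive)."""
    start = bisect_left(sorted_terms, prefix)
    out = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix) or len(out) == limit:
            break
        out.append(term)
    return out

# Hypothetical customer names, kept sorted like the terms of a keyword field.
names = sorted(["Eddie Underwood", "Edward Long", "Elyssa Lewis", "Mary Bailey"])
print(prefix_suggest(names, "Ed"))  # ['Eddie Underwood', 'Edward Long']
```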

GET kibana_sample_data_ecommerce/_search
{
  "_source": "customer_full_name",
  "query": {
    "prefix": {
      "customer_full_name.keyword": "Ed"
    }
  }
}

Slightly more complex suggestions that need spelling correction can be implemented with a suggester.

POST /blogs/_search
{
  "suggest": {
    "my-suggestion": {
      "text": "lucne rock",
      "term": {
        "suggest_mode": "missing",
        "field": "body"
      }
    }
  }
}

Suggesters have already been explained well elsewhere.

Recommended: Uncle Wood’s article:

elasticsearch.cn/article/142
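
To make the term suggester request above concrete, here is a toy version of the idea in plain Python: terms that are missing from the vocabulary are replaced by the closest known term within a small edit distance. This is a simplified sketch, not the Elasticsearch algorithm. The vocabulary is a hypothetical stand-in for the terms indexed in the body field, and a real suggester returns scored options rather than rewriting the text.

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic-programming rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute
        prev = curr
    return prev[-1]

def suggest(text, vocabulary, max_edits=2):
    """Replace each term missing from the vocabulary by its closest neighbour."""
    fixed = []
    for term in text.lower().split():
        if term in vocabulary:          # "missing" mode: known terms are kept
            fixed.append(term)
            continue
        best = min(sorted(vocabulary), key=lambda v: edit_distance(term, v))
        fixed.append(best if edit_distance(term, best) <= max_edits else term)
    return " ".join(fixed)

# Hypothetical terms that would be indexed in the "body" field.
vocabulary = {"lucene", "rock", "elasticsearch", "search"}
print(suggest("lucne rock", vocabulary))  # lucene rock
```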

More complex recommendations require user behavior analysis plus a recommendation engine.

A good recommendation engine tends toward personalized recommendations. Before recommending, it analyzes the valuable digital footprints it has collected about users (such as demographics, transaction details, interaction logs, purchase records, and browsing records) together with information about the products (such as specifications, user feedback, and comparisons with other products).

8. Summary

The search experience shapes the user experience, the user experience shapes how much the product is used, and that in turn determines whether the product succeeds.

Liang Ning, a well-known product expert, said in her course 30 Lectures on Product Thinking: “We see many new Internet companies whose system capabilities are inferior to those of traditional enterprises, yet they can take large numbers of users away from them by relying on user experience. When the gap in scale is that large, user experience can become a core competitive advantage; when competing at the same level, user experience is the most important competitive advantage of all.”

Search is the entry point for traffic and a battleground where apps and websites compete on user experience.

There is no end to iterating on the search experience, and you can never be too thorough or careful.

If you have good ideas or suggestions, you are welcome to share them.

Reference:

www.woshipm.com/ucd/1037490…
zhuanlan.zhihu.com/p/60826371
www.jianshu.com/p/677742838…
www.chanpin100.com/article/103…
www.uisdc.com/search-expe…
www.oreilly.com.cn/radar/?p=28
Do-it-yourself Recommendation Engine

