Background

A brief introduction to two commonly used plug-ins:

  1. A web-based visual front end for ES: elasticsearch-head
  2. A Chinese-friendly analyzer (word segmenter): elasticsearch-analysis-ik

Let's get them running first.

elasticsearch-head

  1. Search GitHub for elasticsearch-head, download it, and unzip it;
  2. Install dependencies: npm install;
  3. Start it: npm run start, then open http://localhost:9100.
  • Problem:

If cross-origin access (CORS) is not enabled in Elasticsearch, the browser reports an error when you open http://localhost:9100:

Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at http://localhost:9200/_all. (Reason: CORS header 'Access-Control-Allow-Origin' missing).

  • Solution:

Enable CORS in ES: edit config/elasticsearch.yml and append the following two lines at the end:

http.cors.enabled: true
http.cors.allow-origin: "*"

Restart ES, open http://localhost:9100 again, and click Connect. The overview now loads and shows that there are currently two indices.
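To confirm the setting without opening the browser, the check can be sketched in Python. This is a minimal sketch, not part of elasticsearch-head: the function names are made up for illustration, and it assumes ES listens on http://localhost:9200 and answers a request carrying an Origin header with an Access-Control-Allow-Origin header once CORS is enabled.

```python
import urllib.error
import urllib.request


def cors_allows(headers, origin):
    """Return True if the response headers permit requests from `origin`."""
    allowed = headers.get("Access-Control-Allow-Origin")
    return allowed is not None and allowed in ("*", origin)


def check_es_cors(es_url="http://localhost:9200", origin="http://localhost:9100"):
    """Send a request with an Origin header, as a browser would, and inspect the reply.

    Both default URLs are assumptions matching the local setup in this article.
    """
    req = urllib.request.Request(es_url, headers={"Origin": origin})
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return cors_allows(resp.headers, origin)
    except urllib.error.HTTPError:
        # An HTTP error status is treated here as "origin not allowed".
        return False
```

Calling check_es_cors() against a running node should return False before the two lines are added and True after the restart, if the assumptions above hold.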

As a data visualization client for ES, elasticsearch-head offers a main menu with Overview, Indices, Data Browse, Basic Query, Match Query, and so on. In the previous article, we used the index view under Data Browse.

elasticsearch-analysis-ik

  • Analyzers built into ES

Standard, Simple, Whitespace, Stop, Language, and so on.

However, when it comes to Chinese, they fall short: the standard analyzer, for example, splits Chinese text into single characters.

  • Third party: the IK analyzer
  1. Download

Find elasticsearch-analysis-ik on GitHub and download the release that matches your ES version (7.5.2 here):

Github.com/medcl/elast…

  2. Install

Unzip the archive into elasticsearch-7.5.2\plugins\ik. The directory name ik is arbitrary; you can choose your own. No other files need to be configured; just restart ES.

Remember to restart ES; without a restart, an error is reported.

After the restart, the console output shows that the IK analyzer has been loaded.

  3. Test
  • Chinese text

  • Mixed Chinese and English text

  4. Segmentation modes

IK provides two segmentation modes, ik_smart and ik_max_word; both were used in the examples above.

  • ik_smart: the coarsest-grained segmentation (fewest tokens)

  • ik_max_word: the finest-grained segmentation (as many words as possible)

The difference between the two modes is easy to see when you run the same text through both.
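One way to compare the two modes side by side is to send the same text to the ES _analyze API with each analyzer. The following is a rough sketch using only the Python standard library; the helper names are invented for illustration, and it assumes ES runs at http://localhost:9200 with the IK plug-in installed.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local single-node ES


def build_analyze_request(text, analyzer, es_url=ES_URL):
    """Build a POST request for the _analyze API with the given analyzer."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{es_url}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_tokens(response):
    """Pull the token strings out of a parsed _analyze response."""
    return [t["token"] for t in response.get("tokens", [])]


def analyze(text, analyzer, es_url=ES_URL):
    """Run `text` through `analyzer` on a live ES node and return its tokens."""
    req = build_analyze_request(text, analyzer, es_url)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return extract_tokens(json.load(resp))


def compare_modes(text, es_url=ES_URL):
    """Return the tokens produced by both IK modes for the same text."""
    return {mode: analyze(text, mode, es_url) for mode in ("ik_smart", "ik_max_word")}
```

Calling compare_modes(...) with a Chinese sentence returns a dictionary with one token list per mode, so the coarse and fine segmentations can be printed next to each other.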

  5. Custom words

Take "novel coronavirus pneumonia" as an example: I want the analyzer to treat the term "new crown" (the Chinese short form for the novel coronavirus) as one word. Because this term is not in IK's built-in dictionary, the IK analyzer splits it into two words, as we saw earlier. So we add the entry manually.

Add the entry in the dictionary directory (elasticsearch-7.5.2\plugins\ik\config); here it is written directly into main.dic.
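Editing main.dic by hand works; if you want to script it, a small helper along these lines could append the entry. This is only a sketch under stated assumptions: the function name is hypothetical, and it assumes the dictionary is a UTF-8 text file with one word per line.

```python
from pathlib import Path


def add_dictionary_word(dic_path, word):
    """Append `word` to the dictionary file unless it is already listed.

    Returns True if the word was added, False if it was already present.
    Assumes a UTF-8 file with one entry per line.
    """
    word = word.strip()
    if not word:
        raise ValueError("word must not be empty")

    path = Path(dic_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    if word in (line.strip() for line in text.splitlines()):
        return False

    # Start a new line first if the file does not end with a line break.
    prefix = "" if not text or text.endswith("\n") else "\n"
    with path.open("a", encoding="utf-8") as f:
        f.write(f"{prefix}{word}\n")
    return True
```

A call would pass the path to main.dic in the config directory above and the term to add. As with installing the plug-in, ES most likely needs a restart before the new entry takes effect.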

Now look at the result after adding the custom word:

The term "new crown" is kept as one word instead of being split into two.


If you have any questions or any bugs are found, please feel free to contact me.

Your comments and suggestions are welcome!
