Thank you for reading. I hope this post helps you. If you find flaws in it, please leave a comment or message me through my profile page; I appreciate the advice. I'm XiaoLin, a boy who can write bugs and rap.
7. IK analyzer
By default ElasticSearch uses the standard analyzer, which is not suitable for Chinese websites. To get better search results you need to switch ElasticSearch to an analyzer that handles Chinese well, and the IK analyzer is the one that supports Chinese word segmentation.
7.1. Install the IK analyzer online
Delete the original data from ElasticSearch (mandatory)
# Go to the ES installation directory and delete the data directory
rm -rf data
Install the IK analyzer
# Execute in the ES installation directory
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.8.0/elasticsearch-analysis-ik-6.8.0.zip
The IK analyzer version must be exactly the same as the ElasticSearch version. If you run another version, replace 6.8.0 in the URL with the version number you use.
Test
GET /_analyze
{
  "text": "中华人民共和国国歌",
  "analyzer": "ik_smart"
}
7.2. Install the IK analyzer offline
Go to the official website and download the IK analyzer release that matches your ElasticSearch version
# Download from the official website and upload the file, or fetch it with wget; change the version number for other versions
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.4/elasticsearch-analysis-ik-6.2.4.zip
Unpack it
unzip elasticsearch-analysis-ik-6.2.4.zip
Move it to the plugins directory in the ES installation directory
mv elasticsearch elasticsearch-6.2.4/plugins/
Restart ElasticSearch for the plugin to take effect
For a local installation, the IK configuration file is plugins/analysis-ik/config/IKAnalyzer.cfg.xml under the ES installation directory.
Test
DELETE /ems
PUT /ems
{
  "mappings": {
    "emp": {
      "properties": {
        "name": {"type": "text", "analyzer": "ik_max_word", "search_analyzer": "ik_max_word"},
        "age": {"type": "integer"},
        "bir": {"type": "date"},
        "content": {"type": "text", "analyzer": "ik_max_word", "search_analyzer": "ik_max_word"},
        "address": {"type": "keyword"}
      }
    }
  }
}
PUT /ems/emp/_bulk
{"index":{}}
{"name":"black","age":23,"bir":"2012-12-12","content":"For a development team, choosing a good MVC framework is a difficult thing; deciding among the many feasible options requires a high level of skill and experience","address":"Beijing"}
{"index":{}}
{"name":"wang black","age":24,"bir":"2012-12-12","content":"The Spring framework is a layered architecture consisting of seven well-defined modules. Spring modules are built on top of the core container, which defines how beans are created, configured, and managed","address":"Shanghai"}
{"index":{}}
{"name":"five small","age":8,"bir":"2012-12-12","content":"Spring Cloud, as a microservices framework for the Java language, relies on Spring Boot and is characterized by rapid development, continuous delivery, and easy deployment. Spring Cloud has many components that cover all aspects of microservices","address":"Wuxi"}
{"index":{}}
{"name":"Win7","age":9,"bir":"2012-12-12","content":"Spring aims to simplify Java development across the board. This is bound to lead to more explanations of how Spring simplifies Java development","address":"Nanjing"}
{"index":{}}
{"name":"plum","age":43,"bir":"2012-12-12","content":"Redis is an open source, network-capable, log-type key-value database written in ANSI C that can be memory-based and persistent","address":"Hangzhou"}
{"index":{}}
{"name":"ElasticSearch","age":59,"bir":"2012-12-12","content":"ElasticSearch is a Lucene-based search server. It provides a distributed, multi-user capable full-text search engine based on a RESTful web interface","address":"Beijing"}
GET /ems/emp/_search
{
  "query": {
    "term": {"content": "framework"}
  },
  "highlight": {
    "pre_tags": ["<span style='color:red'>"],
    "post_tags": ["</span>"],
    "fields": {"*": {}}
  }
}
7.3. Types of IK analyzers
7.3.1, ik_max_word
ik_max_word splits the text at the finest granularity. For example, it splits "中华人民共和国国歌" (National Anthem of the People's Republic of China) into "中华人民共和国, 中华人民, 中华, 华人, 人民共和国, 人民, 人, 民, 共和国, 共和, 和, 国国, 国歌", exhausting all possible combinations.
7.3.2, ik_smart
ik_smart does the coarsest split. For example, it splits "中华人民共和国国歌" into "中华人民共和国, 国歌".
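To make the difference in granularity concrete, here is a minimal Java sketch. It is not IK's actual algorithm: it assumes a tiny hand-made dictionary and contrasts emitting every dictionary word found in the text (in the spirit of ik_max_word) with a greedy longest-match pass (in the spirit of ik_smart). The class name, the dictionary, and both methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class GranularitySketch {
    // Hypothetical mini dictionary; the real IK dictionary is far larger.
    static final Set<String> DICT = new HashSet<>(Arrays.asList(
            "中华人民共和国", "中华人民", "中华", "华人",
            "人民共和国", "人民", "共和国", "共和", "国歌"));

    // Fine-grained: emit every dictionary word that occurs anywhere in the text.
    static List<String> fineGrained(String text) {
        List<String> tokens = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            for (int end = text.length(); end > start; end--) {
                String candidate = text.substring(start, end);
                if (DICT.contains(candidate)) {
                    tokens.add(candidate);
                }
            }
        }
        return tokens;
    }

    // Coarse: at each position take the longest dictionary word, then skip past it.
    static List<String> coarse(String text) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = text.length();
            while (end > start + 1 && !DICT.contains(text.substring(start, end))) {
                end--;
            }
            tokens.add(text.substring(start, end)); // falls back to a single character
            start = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "中华人民共和国国歌";
        System.out.println("fine-grained: " + fineGrained(text));
        System.out.println("coarse:       " + coarse(text));
    }
}
```

With this toy dictionary the fine-grained pass yields many overlapping tokens while the coarse pass yields only two, which mirrors why ik_max_word is usually chosen for indexing recall and ik_smart for shorter token lists.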
7.4. Configure extension words
IK supports a custom extension dictionary and a custom stop-word dictionary. The extension dictionary holds words that IK does not treat as keywords by default but that you want ES to use as keywords. The stop-word dictionary holds words that are keywords but that should not be searchable in your business scenario. Both dictionaries must be encoded in UTF-8, otherwise they do not take effect.
To define the extension dictionary and the stop-word dictionary, modify the IKAnalyzer.cfg.xml file in the config directory of the IK analyzer.
Modify IKAnalyzer.cfg.xml
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- Users can configure their own extension dictionary here -->
    <entry key="ext_dict">ext_dict.dic</entry>
    <!-- Users can configure their own extended stop word dictionary here -->
    <entry key="ext_stopwords">ext_stopword.dic</entry>
</properties>
Create the ext_dict.dic file in the config directory under the IK analyzer directory. The encoding must be UTF-8 to take effect.
Add the extension words to ext_dict.dic, one word per line.
Restart ElasticSearch for the changes to take effect
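Because the UTF-8 requirement is easy to get wrong when a dictionary is generated by a program, here is a minimal Java sketch that writes a dictionary file with an explicit UTF-8 charset, one word per line. The class name, the helper methods, the sample words, and the use of the system temp directory in place of the real IK config directory are all assumptions for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

class DictFileSketch {
    // Writes one word per line, explicitly encoded as UTF-8.
    static Path writeDict(Path dir, String fileName, List<String> words) {
        try {
            return Files.write(dir.resolve(fileName), words, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the dictionary back with the same charset to verify it round-trips.
    static List<String> readDict(Path file) {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The temp directory stands in for the IK config directory (assumption).
        Path configDir = Paths.get(System.getProperty("java.io.tmpdir"));
        Path dict = writeDict(configDir, "ext_dict.dic", Arrays.asList("小林", "微服务"));
        System.out.println(dict + " -> " + readDict(dict));
    }
}
```

Passing the charset explicitly avoids depending on the platform default encoding, which differs between operating systems and older Java versions.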
8. Filter Query
8.1. Filter query
Strictly speaking, ElasticSearch has two kinds of search operations: queries and filters.
- Query: the kind of query covered earlier. By default it calculates a relevance score for each returned document and sorts the results by that score.
- Filter: only selects the documents that match, does not calculate scores, and can cache its results.
Therefore, in terms of performance alone, filtering is faster than querying.
Filters suit narrowing down a large data set, while queries suit precise matching. In a typical application, filter the data first and then run the query against what remains.
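The split between the two operations can be sketched in plain Java over an in-memory list. This is only an illustration of the idea, not how ElasticSearch works internally: the `Doc` class, the method names, and the toy score (the number of times the term occurs) are assumptions, and real relevance scoring is more involved.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class FilterThenQuerySketch {
    static class Doc {
        final String content;
        final int age;

        Doc(String content, int age) {
            this.content = content;
            this.age = age;
        }
    }

    // Filter: a yes/no test per document; no score is computed.
    static List<Doc> filterByMinAge(List<Doc> docs, int minAge) {
        return docs.stream().filter(d -> d.age >= minAge).collect(Collectors.toList());
    }

    // Toy relevance score: how many times the term occurs in the content.
    static int score(Doc doc, String term) {
        int count = 0;
        for (String word : doc.content.split("\\s+")) {
            if (word.equalsIgnoreCase(term)) {
                count++;
            }
        }
        return count;
    }

    // Query: score every candidate, drop non-matches, sort by score descending.
    static List<Doc> query(List<Doc> docs, String term) {
        return docs.stream()
                .filter(d -> score(d, term) > 0)
                .sorted(Comparator.comparingInt((Doc d) -> score(d, term)).reversed())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("spring framework framework", 24),
                new Doc("mvc framework", 23),
                new Doc("redis database", 43),
                new Doc("spring framework", 8));
        List<Doc> candidates = filterByMinAge(docs, 10);   // cheap filter first
        for (Doc hit : query(candidates, "framework")) {   // then the scored query
            System.out.println(score(hit, "framework") + "  " + hit.content);
        }
    }
}
```

Running the filter first means the scoring work is only spent on documents that can still be part of the result.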
8.2. Filter syntax
GET /ems/emp/_search
{
  "query": {
    "bool": {
      "must": [
        {"match_all": {}}
      ],
      "filter": {
        "range": {"age": {"gte": 10}}
      }
    }
  }
}
When a filter and a query are combined, the filter is executed before the query. ElasticSearch automatically caches frequently used filters to improve performance.
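Why caching helps filters can be sketched as follows. This is a simplified illustration under stated assumptions, not ElasticSearch's real cache (which has its own eligibility and eviction rules): the result of a filter is remembered as one bit per document, so a repeated filter is answered without re-checking each document. The class name, the cache key format, and the `evaluations` counter are hypothetical.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

class FilterCacheSketch {
    private final int[] ages;                                  // one age value per document
    private final Map<String, BitSet> cache = new HashMap<>(); // filter key -> matching docs
    int evaluations = 0;                                       // how often a filter was really computed

    FilterCacheSketch(int[] ages) {
        this.ages = ages;
    }

    // Returns the set of matching document positions, computing it only on a cache miss.
    BitSet filter(String cacheKey, IntPredicate agePredicate) {
        return cache.computeIfAbsent(cacheKey, key -> {
            evaluations++;
            BitSet matches = new BitSet(ages.length);
            for (int doc = 0; doc < ages.length; doc++) {
                if (agePredicate.test(ages[doc])) {
                    matches.set(doc);
                }
            }
            return matches;
        });
    }

    public static void main(String[] args) {
        FilterCacheSketch index = new FilterCacheSketch(new int[] {23, 24, 8, 9, 43, 59});
        System.out.println(index.filter("age>=10", age -> age >= 10)); // computed
        System.out.println(index.filter("age>=10", age -> age >= 10)); // served from the cache
        System.out.println("evaluations: " + index.evaluations);
    }
}
```

A scored query cannot be reused this way as easily, because its result depends on per-document scores rather than a plain yes/no answer.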
8.3. Filter types
8.3.1, term
# Use a term filter
GET /ems/emp/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {"name": {"value": "black"}}}
      ],
      "filter": {
        "term": {"content": "framework"}
      }
    }
  }
}
8.3.2, terms
# Use a terms filter
GET /dangdang/book/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {"name": {"value": "Chinese"}}}
      ],
      "filter": {
        "terms": {"content": ["technology", "voice"]}
      }
    }
  }
}
8.3.3, range
GET /ems/emp/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {"name": {"value": "Chinese"}}}
      ],
      "filter": {
        "range": {"age": {"gte": 7, "lte": 20}}
      }
    }
  }
}
8.3.4, exists
Keeps only the index records that have the specified field with a non-empty value.
GET /ems/emp/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {"name": {"value": "Chinese"}}}
      ],
      "filter": {
        "exists": {"field": "aaa"}
      }
    }
  }
}
8.3.5, ids
Keeps only the index records whose IDs are in the specified list.
GET /ems/emp/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {"name": {"value": "Chinese"}}}
      ],
      "filter": {
        "ids": {"values": ["1", "2", "3"]}
      }
    }
  }
}
9. Operating ElasticSearch from Java
ElasticSearch does not replace the database. Its core and most powerful function is search, so you can put the data that users query into ElasticSearch.
9.1 Importing the dependencies
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>6.8.0</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>transport</artifactId>
<version>6.8.0</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.plugin</groupId>
<artifactId>transport-netty4-client</artifactId>
<version>6.8.0</version>
</dependency>
9.2 Create index and type
PUT /dangdang
{
  "mappings": {
    "book": {
      "properties": {
        "name": {"type": "text", "analyzer": "ik_max_word"},
        "age": {"type": "integer"},
        "sex": {"type": "keyword"},
        "content": {"type": "text", "analyzer": "ik_max_word"}
      }
    }
  }
}
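9.3 Operating ElasticSearch in Java
9.3.1 Creating a Client Object
import java.net.InetAddress;
import java.net.UnknownHostException;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;
import org.junit.Test;

// Create an ES client operation object (9300 is the transport port)
@Test
public void init() throws UnknownHostException {
    PreBuiltTransportClient preBuiltTransportClient = new PreBuiltTransportClient(Settings.EMPTY);
    preBuiltTransportClient.addTransportAddress(new TransportAddress(
            InetAddress.getByName("192.168.202.200"), 9300));
}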
9.3.2 Creating an index
// Create index
@Test
public void createIndex() throws UnknownHostException, ExecutionException, InterruptedException {
    PreBuiltTransportClient preBuiltTransportClient = new PreBuiltTransportClient(Settings.EMPTY);
    preBuiltTransportClient.addTransportAddress(new TransportAddress(
            InetAddress.getByName("192.168.202.200"), 9300));
    // Define the index request
    CreateIndexRequest ems = new CreateIndexRequest("ems");
    // Perform index creation
    CreateIndexResponse createIndexResponse = preBuiltTransportClient.admin().indices().create(ems).get();
    System.out.println(createIndexResponse.isAcknowledged());
}
9.3.3 Deleting an index
// Delete the index
@Test
public void deleteIndex() throws UnknownHostException, ExecutionException, InterruptedException {
    PreBuiltTransportClient preBuiltTransportClient = new PreBuiltTransportClient(Settings.EMPTY);
    preBuiltTransportClient.addTransportAddress(new TransportAddress(
            InetAddress.getByName("192.168.202.200"), 9300));
    // Define the index request
    DeleteIndexRequest ems = new DeleteIndexRequest("ems");
    // Perform index deletion
    AcknowledgedResponse acknowledgedResponse = preBuiltTransportClient.admin().indices().delete(ems).get();
    System.out.println(acknowledgedResponse.isAcknowledged());
}
9.3.4 Create index, type, and mapping
// Create the index, the type, and the mapping in one request
@Test
public void createIndexWithMapping() throws UnknownHostException, ExecutionException, InterruptedException {
    PreBuiltTransportClient preBuiltTransportClient = new PreBuiltTransportClient(Settings.EMPTY);
    preBuiltTransportClient.addTransportAddress(new TransportAddress(
            InetAddress.getByName("192.168.202.200"), 9300));
    // Create index
    CreateIndexRequest ems = new CreateIndexRequest("ems");
    // Define the mapping in JSON format
    String json = "{\"properties\":{\"name\":{\"type\":\"text\",\"analyzer\":\"ik_max_word\"},\"age\":{\"type\":\"integer\"},\"sex\":{\"type\":\"keyword\"},\"content\":{\"type\":\"text\",\"analyzer\":\"ik_max_word\"}}}";
    // Set the type and mapping
    ems.mapping("emp", json, XContentType.JSON);
    // Perform the creation
    CreateIndexResponse createIndexResponse = preBuiltTransportClient.admin().indices().create(ems).get();
    System.out.println(createIndexResponse.isAcknowledged());
}
9.3.5. Index a record
9.3.5.1 Index with a specified ID
// Index a document with a specified id
// (transportClient is a client created as in 9.3.1; JSONObject is fastjson's)
@Test
public void createIndexOptionId() throws JsonProcessingException {
    Emp emp = new Emp("xiaolin", 18, "Male", "I'm Xiaolin from China.");
    String s = JSONObject.toJSONString(emp);
    IndexResponse indexResponse = transportClient.prepareIndex("ems", "emp", "1").setSource(s, XContentType.JSON).get();
    System.out.println(indexResponse.status());
}
9.3.5.2 Index with an automatically generated ID
// Index a document and let ES generate the id
@Test
public void createIndexAutoId() throws JsonProcessingException {
    Emp emp = new Emp("XiaoLin", 18, "Male", "I am XiaoLin");
    String s = JSONObject.toJSONString(emp);
    IndexResponse indexResponse = transportClient.prepareIndex("ems", "emp")
            .setSource(s, XContentType.JSON).get();
    System.out.println(indexResponse.status());
}
9.3.6. Update a record
// Update a record
@Test
public void testUpdate() throws IOException {
    Emp emp = new Emp();
    emp.setName("Hello tomorrow.");
    String s = JSONObject.toJSONString(emp);
    UpdateResponse updateResponse = transportClient.prepareUpdate("ems", "emp", "1")
            .setDoc(s, XContentType.JSON).get();
    System.out.println(updateResponse.status());
}
9.3.6 Batch Update
// Batch update
@Test
public void testBulk() throws IOException {
    // Add the first record
    IndexRequest request1 = new IndexRequest("ems", "emp", "1");
    Emp emp = new Emp("Chinese Science and Technology", 23, "Male", "This is the good guy.");
    request1.source(JSONObject.toJSONString(emp), XContentType.JSON);
    // Add a second record
    IndexRequest request2 = new IndexRequest("ems", "emp", "2");
    Emp emp2 = new Emp("Chinese Science and Technology", 23, "Male", "This is the good guy.");
    request2.source(JSONObject.toJSONString(emp2), XContentType.JSON);
    // Update the record
    UpdateRequest updateRequest = new UpdateRequest("ems", "emp", "1");
    Emp empUpdate = new Emp();
    empUpdate.setName("China Power");
    updateRequest.doc(JSONObject.toJSONString(empUpdate), XContentType.JSON);
    // Delete a record
    DeleteRequest deleteRequest = new DeleteRequest("ems", "emp", "2");
    // Send all four operations in a single bulk request
    BulkResponse bulkItemResponses = transportClient.prepareBulk()
            .add(request1)
            .add(request2)
            .add(updateRequest)
            .add(deleteRequest)
            .get();
    // Print the status of each operation
    BulkItemResponse[] items = bulkItemResponses.getItems();
    for (BulkItemResponse item : items) {
        System.out.println(item.status());
    }
}
9.3.7 Retrieving documents
9.3.7.1 Querying all and sorting
/**
 * Query all documents and sort them.
 * addSort("age", SortOrder.ASC) or addSort("age", SortOrder.DESC) specifies the sort field and the sort direction.
 */
@Test
public void testMatchAllQuery() throws UnknownHostException {
    TransportClient transportClient = new PreBuiltTransportClient(Settings.EMPTY)
            .addTransportAddress(new TransportAddress(InetAddress.getByName("172.16.251.142"), 9300));
    SearchResponse searchResponse = transportClient.prepareSearch("dangdang")
            .setTypes("book")
            .setQuery(QueryBuilders.matchAllQuery())
            .addSort("age", SortOrder.DESC)
            .get();
    SearchHits hits = searchResponse.getHits();
    System.out.println("Number of eligible records: " + hits.totalHits);
    for (SearchHit hit : hits) {
        System.out.print("Current index score: " + hit.getScore());
        System.out.print(", corresponding result: =====>" + hit.getSourceAsString());
        System.out.println(", specified field result: " + hit.getSourceAsMap().get("name"));
        System.out.println("=================================================");
    }
}
9.3.7.2 Paging query
/**
 * Paging query.
 * from: the offset of the first result, from = (pageNow - 1) * size
 * size: how many results are returned each time, 10 by default
 */
@Test
public void testMatchAllQueryFormAndSize() throws UnknownHostException {
    TransportClient transportClient = new PreBuiltTransportClient(Settings.EMPTY)
            .addTransportAddress(new TransportAddress(InetAddress.getByName("172.16.251.142"), 9300));
    SearchResponse searchResponse = transportClient.prepareSearch("dangdang")
            .setTypes("book")
            .setQuery(QueryBuilders.matchAllQuery())
            .setFrom(0)
            .setSize(2)
            .get();
    SearchHits hits = searchResponse.getHits();
    System.out.println("Number of eligible records: " + hits.totalHits);
    for (SearchHit hit : hits) {
        System.out.print("Current index score: " + hit.getScore());
        System.out.print(", corresponding result: =====>" + hit.getSourceAsString());
        System.out.println(", specified field result: " + hit.getSourceAsMap().get("name"));
        System.out.println("=================================================");
    }
}
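The offset arithmetic used for paging can be isolated in a small helper. This is a minimal sketch assuming a 1-based page number; the class and method names are hypothetical, and the returned value is what would be passed to setFrom.

```java
class PagingSketch {
    // Offset of the first hit on a 1-based page: from = (pageNow - 1) * size.
    static int from(int pageNow, int size) {
        if (pageNow < 1 || size < 1) {
            throw new IllegalArgumentException("pageNow and size must be >= 1");
        }
        return (pageNow - 1) * size;
    }

    public static void main(String[] args) {
        int size = 10;
        for (int pageNow = 1; pageNow <= 3; pageNow++) {
            System.out.println("page " + pageNow + " -> from " + from(pageNow, size));
        }
    }
}
```

Page 1 starts at offset 0, so the setFrom(0).setSize(2) call in the paging example requests the first page with two results per page.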
9.3.7.3. Choose the returned fields
/**
 * Choose which fields are returned.
 * setFetchSource("*", "age") returns all fields except the age field.
 * setFetchSource(new String[]{}, new String[]{}) takes arrays of included and excluded fields.
 */
@Test
public void testMatchAllQuerySource() throws UnknownHostException {
    TransportClient transportClient = new PreBuiltTransportClient(Settings.EMPTY)
            .addTransportAddress(new TransportAddress(InetAddress.getByName("172.16.251.142"), 9300));
    SearchResponse searchResponse = transportClient.prepareSearch("dangdang")
            .setTypes("book")
            .setQuery(QueryBuilders.matchAllQuery())
            .setFetchSource("*", "age")
            .get();
    SearchHits hits = searchResponse.getHits();
    System.out.println("Number of eligible records: " + hits.totalHits);
    for (SearchHit hit : hits) {
        System.out.print("Current index score: " + hit.getScore());
        System.out.print(", corresponding result: =====>" + hit.getSourceAsString());
        System.out.println(", specified field result: " + hit.getSourceAsMap().get("name"));
        System.out.println("=================================================");
    }
}
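The include/exclude idea behind setFetchSource can be sketched on a plain map. This is an illustration only, with a deliberate simplification: it understands exact field names and the lone "*" wildcard, whereas real source filtering accepts richer patterns. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SourceFilterSketch {
    // Simplification: a pattern is either "*" (everything) or an exact field name.
    static boolean matches(String pattern, String field) {
        return pattern.equals("*") || pattern.equals(field);
    }

    // Keeps a field when it matches the include pattern and not the exclude pattern.
    static Map<String, Object> fetchSource(Map<String, Object> source, String include, String exclude) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            if (matches(include, entry.getKey()) && !matches(exclude, entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("name", "XiaoLin");
        source.put("age", 18);
        source.put("content", "I am XiaoLin");
        // Include everything, exclude age.
        System.out.println(fetchSource(source, "*", "age").keySet());
    }
}
```

The exclude pattern wins over the include pattern here, which is why "*" combined with "age" drops only the age field.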
9.3.7.8 Term query
/**
 * term query
 */
@Test
public void testTerm() throws UnknownHostException {
    TransportClient transportClient = new PreBuiltTransportClient(Settings.EMPTY)
            .addTransportAddress(new TransportAddress(InetAddress.getByName("172.16.251.142"), 9300));
    TermQueryBuilder queryBuilder = QueryBuilders.termQuery("name", "China");
    SearchResponse searchResponse = transportClient.prepareSearch("dangdang").setTypes("book").setQuery(queryBuilder).get();
}
9.3.7.9 Range query
/**
 * range query
 * lt: less than
 * lte: less than or equal to
 * gt: greater than
 * gte: greater than or equal to
 */
@Test
public void testRange() throws UnknownHostException {
    TransportClient transportClient = new PreBuiltTransportClient(Settings.EMPTY)
            .addTransportAddress(new TransportAddress(InetAddress.getByName("172.16.251.142"), 9300));
    // Matches 8 <= age < 45
    RangeQueryBuilder rangeQueryBuilder = QueryBuilders.rangeQuery("age").lt(45).gte(8);
    SearchResponse searchResponse = transportClient.prepareSearch("dangdang").setTypes("book").setQuery(rangeQueryBuilder).get();
}
9.3.7.10. Prefix query
/**
 * prefix query
 */
@Test
public void testPrefix() throws UnknownHostException {
    TransportClient transportClient = new PreBuiltTransportClient(Settings.EMPTY)
            .addTransportAddress(new TransportAddress(InetAddress.getByName("172.16.251.142"), 9300));
    PrefixQueryBuilder prefixQueryBuilder = QueryBuilders.prefixQuery("name", "In");
    SearchResponse searchResponse = transportClient.prepareSearch("dangdang").setTypes("book").setQuery(prefixQueryBuilder).get();
}
9.3.7.11. Wildcard query
/**
 * wildcard query
 */
@Test
public void testWildcardQuery() throws UnknownHostException {
    TransportClient transportClient = new PreBuiltTransportClient(Settings.EMPTY)
            .addTransportAddress(new TransportAddress(InetAddress.getByName("172.16.251.142"), 9300));
    WildcardQueryBuilder wildcardQueryBuilder = QueryBuilders.wildcardQuery("name", "*");
    SearchResponse searchResponse = transportClient.prepareSearch("dangdang").setTypes("book").setQuery(wildcardQueryBuilder).get();
}
9.3.7.12 Ids query
/**
 * ids query
 */
@Test
public void testIds() throws UnknownHostException {
    TransportClient transportClient = new PreBuiltTransportClient(Settings.EMPTY)
            .addTransportAddress(new TransportAddress(InetAddress.getByName("172.16.251.142"), 9300));
    IdsQueryBuilder idsQueryBuilder = QueryBuilders.idsQuery().addIds("1", "2");
    SearchResponse searchResponse = transportClient.prepareSearch("dangdang").setTypes("book").setQuery(idsQueryBuilder).get();
}
9.3.7.13 Fuzzy query
/**
 * fuzzy query
 */
@Test
public void testFuzzy() throws UnknownHostException {
    TransportClient transportClient = new PreBuiltTransportClient(Settings.EMPTY)
            .addTransportAddress(new TransportAddress(InetAddress.getByName("172.16.251.142"), 9300));
    FuzzyQueryBuilder fuzzyQueryBuilder = QueryBuilders.fuzzyQuery("content", "Chinese people");
    SearchResponse searchResponse = transportClient.prepareSearch("dangdang").setTypes("book").setQuery(fuzzyQueryBuilder).get();
}
9.3.7.14, Bool query
/**
 * bool query: combines should, mustNot, and must clauses
 */
@Test
public void testBool() throws UnknownHostException {
    TransportClient transportClient = new PreBuiltTransportClient(Settings.EMPTY)
            .addTransportAddress(new TransportAddress(InetAddress.getByName("172.16.251.142"), 9300));
    BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
    boolQueryBuilder.should(QueryBuilders.matchAllQuery());
    boolQueryBuilder.mustNot(QueryBuilders.rangeQuery("age").lte(8));
    boolQueryBuilder.must(QueryBuilders.termQuery("name", "China"));
    SearchResponse searchResponse = transportClient.prepareSearch("dangdang").setTypes("book").setQuery(boolQueryBuilder).get();
}
9.3.7.15. Highlight query
/**
 * Highlight query.
 * highlighter(highlightBuilder) specifies the highlight settings
 * requireFieldMatch(false) enables highlighting on multiple fields
 * field defines the highlighted fields
 * preTags("<span style='color:red'>") and postTags("</span>") define the tags wrapped around each match
 */
@Test
public void testHighlight() throws UnknownHostException {
    TransportClient transportClient = new PreBuiltTransportClient(Settings.EMPTY)
            .addTransportAddress(new TransportAddress(InetAddress.getByName("172.16.251.142"), 9300));
    TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("name", "China");
    HighlightBuilder highlightBuilder = new HighlightBuilder();
    highlightBuilder.requireFieldMatch(false).field("name").field("content")
            .preTags("<span style='color:red'>").postTags("</span>");
    SearchResponse searchResponse = transportClient.prepareSearch("dangdang")
            .setTypes("book")
            .highlighter(highlightBuilder)
            .setQuery(termQueryBuilder)
            .get();
    SearchHits hits = searchResponse.getHits();
    System.out.println("Number of eligible records: " + hits.totalHits);
    for (SearchHit hit : hits) {
        Map<String, Object> sourceAsMap = hit.getSourceAsMap();
        Map<String, HighlightField> highlightFields = hit.getHighlightFields();
        System.out.println("================ before highlighting ==========");
        for (Map.Entry<String, Object> entry : sourceAsMap.entrySet()) {
            System.out.println("key: " + entry.getKey() + " value: " + entry.getValue());
        }
        System.out.println("================ after highlighting ==========");
        for (Map.Entry<String, Object> entry : sourceAsMap.entrySet()) {
            // Use the highlighted fragment when the field has one, otherwise the original value
            HighlightField highlightField = highlightFields.get(entry.getKey());
            if (highlightField != null) {
                System.out.println("key: " + entry.getKey() + " value: " + highlightField.fragments()[0]);
            } else {
                System.out.println("key: " + entry.getKey() + " value: " + entry.getValue());
            }
        }
    }
}
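What the pre and post tags do to a matched term can be sketched with plain string handling. This is a simplified illustration, not how ElasticSearch highlights: the real highlighter works on analyzed tokens, while this sketch wraps every literal occurrence of the term. The class and method names are hypothetical.

```java
class HighlightSketch {
    // Wraps every literal occurrence of the term in the given tags.
    static String highlight(String text, String term, String preTag, String postTag) {
        if (term.isEmpty()) {
            return text; // nothing to highlight
        }
        return text.replace(term, preTag + term + postTag);
    }

    public static void main(String[] args) {
        String fragment = highlight(
                "The Spring framework is a layered architecture",
                "framework",
                "<span style='color:red'>",
                "</span>");
        System.out.println(fragment);
    }
}
```

Falling back to the original value when a field has no highlighted fragment, as the loop in the test method does, keeps fields without a match readable in the output.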