This is the 7th day of my participation in the August More Text Challenge

This article is based on Elasticsearch 7.4.2.

A library stores information about its collection of books in the book index. Here are two documents:

POST /book/_doc/1
{
  "body": "elasticsearch filter",
  "title": "elasticsearch basic query"
}
POST /book/_doc/2
{
  "body": "single value search",
  "title": "elasticsearch aggs query"
}

Suppose a reader, Ming, wants to find books about "elasticsearch aggs". He searches both the body and title fields with a bool query that uses should clauses:

POST /book/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {"body": "elasticsearch aggs"}},
        {"match": {"title": "elasticsearch aggs"}}
      ]
    }
  }
}

You would expect document 2 to be the better match for Ming's needs, but in the returned results document 1 has a higher relevance score than document 2:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.9372343,
    "hits" : [
      {
        "_index" : "book",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.9372343,
        "_source" : {
          "body" : "elasticsearch filter",
          "title" : "elasticsearch basic query"
        }
      },
      {
        "_index" : "book",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.87546873,
        "_source" : {
          "body" : "single value search",
          "title" : "elasticsearch aggs query"
        }
      }
    ]
  }
}

That is because the relevance score of the should query is calculated as follows:

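  1. The match query splits the search text "elasticsearch aggs" into two terms: elasticsearch and aggs.
  2. Each clause in should is then run against the same document.
  3. Document 1: both body and title contain elasticsearch, so both should clauses match.
  4. Document 2: title contains both elasticsearch and aggs, but body contains neither, so only the title clause matches.
  5. A bool query adds up the scores of all matching should clauses, so document 1 ends up scoring higher than document 2, even though document 2 is the better match.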
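
The summing behaviour described in the steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Elasticsearch code: `bool_should_score` and `rank` are hypothetical helpers, and the per-clause scores are assumptions chosen so that their sums match the `_score` values in the response (in particular, the split between body and title for document 1 is assumed).

```python
# Assumed per-clause scores (illustrative only; chosen so that the sums
# match the _score values reported in the response above).
CLAUSE_SCORES = {
    "1": {"body": 0.7549, "title": 0.1823},  # both should clauses match
    "2": {"body": 0.0, "title": 0.8755},     # only the title clause matches
}


def bool_should_score(clause_scores):
    """Add up the scores of all matching should clauses (hypothetical helper)."""
    return sum(score for score in clause_scores.values() if score > 0)


def rank(scores_by_doc, scorer):
    """Return document ids ordered from highest to lowest score."""
    return sorted(
        scores_by_doc,
        key=lambda doc_id: scorer(scores_by_doc[doc_id]),
        reverse=True,
    )


if __name__ == "__main__":
    for doc_id in rank(CLAUSE_SCORES, bool_should_score):
        print(doc_id, round(bool_should_score(CLAUSE_SCORES[doc_id]), 4))
```

Under these assumed numbers, two weak-to-moderate matches in document 1 add up to more than the single strong title match in document 2, which is the ranking problem described above.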

So how do you give the more closely matching document 2 a higher relevance score than document 1? Use a dis_max query. A dis_max query takes the score of the best-matching clause as the relevance score of the whole document. If tie_breaker is set (it defaults to 0.0), the scores of the other matching clauses are multiplied by tie_breaker and added to that best score.

POST /book/_search?size=1000
{
  "query": {
    "dis_max": {
      "tie_breaker": 0.3,
      "queries": [
        {"match": {"body": "elasticsearch aggs"}},
        {"match": {"title": "elasticsearch aggs"}}
      ]
    }
  }
}

The scores of document 1 and document 2 under the dis_max query are then calculated as follows. As a simplification, count one point per matched term and ignore tie_breaker:

  1. The match query splits the search text "elasticsearch aggs" into two terms: elasticsearch and aggs.
  2. Each clause in queries is then run against the same document.
  3. Document 1: body matches elasticsearch and title matches elasticsearch, so body gets 1 point, title gets 1 point, and the best clause gets 1 point.
  4. Document 2: title matches both elasticsearch and aggs and body matches neither, so title gets 2 points, body gets 0 points, and the best clause gets 2 points.
  5. Therefore the final score of document 2 is higher than that of document 1, which is what we want.

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.87546873,
    "hits" : [
      {
        "_index" : "book",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.87546873,
        "_source" : {
          "body" : "single value search",
          "title" : "elasticsearch aggs query"
        }
      },
      {
        "_index" : "book",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.80960923,
        "_source" : {
          "body" : "elasticsearch filter",
          "title" : "elasticsearch basic query"
        }
      }
    ]
  }
}
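
To see how tie_breaker shapes these numbers, here is a minimal Python sketch of the dis_max rule: the best clause score plus tie_breaker times the other matching clause scores. It is an illustration rather than Elasticsearch's implementation. `dis_max_score` is a hypothetical helper, and the per-clause scores are assumptions back-calculated from the `_score` values reported in this article, not values returned by Elasticsearch.

```python
def dis_max_score(clause_scores, tie_breaker=0.0):
    """Best clause score plus tie_breaker times the other matching scores.

    Hypothetical helper that mirrors the documented dis_max rule.
    """
    matching = [score for score in clause_scores if score > 0]
    if not matching:
        return 0.0
    best = max(matching)
    return best + tie_breaker * (sum(matching) - best)


# Assumed per-clause scores as [body, title] (illustrative only).
DOC_1 = [0.7549, 0.1823]  # both clauses match, neither strongly on title
DOC_2 = [0.0, 0.8755]     # only the title clause matches

if __name__ == "__main__":
    for tie_breaker in (0.0, 0.3, 1.0):
        score_1 = dis_max_score(DOC_1, tie_breaker)
        score_2 = dis_max_score(DOC_2, tie_breaker)
        winner = "2" if score_2 > score_1 else "1"
        print(f"tie_breaker={tie_breaker}: doc1={score_1:.4f} "
              f"doc2={score_2:.4f} -> document {winner} ranks first")
```

With these assumed inputs, a tie_breaker of 0.3 keeps document 2 on top, while a tie_breaker of 1.0 makes dis_max behave like the plain sum and puts document 1 back in front, so the value is worth keeping well below 1 for this kind of search.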