This article is based on Elasticsearch 7.4.2.
A library has a collection of books, and their information is stored in the book index. Here are two of the documents:
POST /book/_doc/1
{
"body": "elasticsearch filter",
"title": "elasticsearch basic query"
}
POST /book/_doc/2
{
"body": "single value search",
"title": "elasticsearch aggs query"
}
Ming wants to find books about elasticsearch aggs, so he searches both the body and title fields with a should query:
POST /book/_search
{
"query": {
"bool": {
"should": [
{"match": {"body": "elasticsearch aggs"}},
{"match": {"title": "elasticsearch aggs"}}
]
}
}
}
You would expect document 2 to be the better match for Ming's needs, but in the returned results document 1 has a higher relevance score than document 2:
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : 0.9372343,
    "hits" : [
      {
        "_index" : "book",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.9372343,
        "_source" : { "body" : "elasticsearch filter", "title" : "elasticsearch basic query" }
      },
      {
        "_index" : "book",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.87546873,
        "_source" : { "body" : "single value search", "title" : "elasticsearch aggs query" }
      }
    ]
  }
}
That is because the relevance score of a should query is calculated as follows:
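- The match query analyzes the search text "elasticsearch aggs" into two terms, elasticsearch and aggs.
- Each clause in should is then run against the same document:
  - Document 1: both body and title contain elasticsearch, so both should clauses match.
  - Document 2: title contains both elasticsearch and aggs, but body contains neither, so only the title clause matches.
- A should query adds up the scores of all matching clauses, so document 1 ends up scoring higher than document 2, even though document 2 is the better match.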
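The matching step described above can be sketched in a few lines of Python. This is only an illustration: `analyze` is a simplified stand-in for the standard analyzer (lowercase, split on whitespace), and the helper names `matched_terms` and `matching_clauses` are hypothetical, not part of Elasticsearch.

```python
def analyze(text):
    """Simplified stand-in for the standard analyzer: lowercase, split on whitespace."""
    return text.lower().split()


# The two documents from the book index.
DOCS = {
    "1": {"body": "elasticsearch filter", "title": "elasticsearch basic query"},
    "2": {"body": "single value search", "title": "elasticsearch aggs query"},
}


def matched_terms(query, field_text):
    """Return the query terms that also occur in the field."""
    field_terms = set(analyze(field_text))
    return [term for term in analyze(query) if term in field_terms]


def matching_clauses(query, doc, fields=("body", "title")):
    """Map each field to the query terms it matches, keeping only fields with a hit."""
    hits = {field: matched_terms(query, doc[field]) for field in fields}
    return {field: terms for field, terms in hits.items() if terms}


for doc_id, doc in DOCS.items():
    print(doc_id, matching_clauses("elasticsearch aggs", doc))
# 1 {'body': ['elasticsearch'], 'title': ['elasticsearch']}
# 2 {'title': ['elasticsearch', 'aggs']}
```

Document 1 matches two clauses with one term each, while document 2 matches a single clause with both terms, which is why summing clause scores can favor document 1.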
So how do we give the better-matching document 2 a higher relevance score than document 1? Use a dis_max query. A dis_max query takes the score of the best-matching clause as the relevance score of the whole document. If tie_breaker is set, the scores of the other matching clauses are multiplied by tie_breaker and added to that best score.
POST /book/_search?size=1000
{
"query": {
"dis_max": {
"tie_breaker": 0.3,
"queries": [
{"match": {"body": "elasticsearch aggs"}},
{"match": {"title": "elasticsearch aggs"}}
]
}
}
}
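When the same text has to be searched across several fields, the request body can be generated instead of written by hand. The sketch below assumes a hypothetical helper, `dis_max_query`, that builds one match clause per field; it only produces the JSON body and does not talk to a cluster.

```python
import json


def dis_max_query(text, fields, tie_breaker=0.0):
    """Build a dis_max request body with one match clause per field (hypothetical helper)."""
    return {
        "query": {
            "dis_max": {
                "tie_breaker": tie_breaker,
                "queries": [{"match": {field: text}} for field in fields],
            }
        }
    }


# Same body as the request above: search body and title with tie_breaker 0.3.
body = dis_max_query("elasticsearch aggs", ["body", "title"], tie_breaker=0.3)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of POST /book/_search with any HTTP client.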
The scores of document 1 and document 2 under the dis_max query are then calculated as follows:
- The match query analyzes the search text "elasticsearch aggs" into two terms, elasticsearch and aggs.
- Each clause in queries is then run against the same document. Counting one point per matched term as a simplification:
  - Document 1: body matches elasticsearch (1 point) and title matches elasticsearch (1 point), so its best clause is worth 1 point.
  - Document 2: title matches both elasticsearch and aggs (2 points) and body matches nothing (0 points), so its best clause is worth 2 points.
- Document 2 therefore ends up with a higher score than document 1, which is what we want.
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : 0.87546873,
    "hits" : [
      {
        "_index" : "book",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.87546873,
        "_source" : { "body" : "single value search", "title" : "elasticsearch aggs query" }
      },
      {
        "_index" : "book",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.80960923,
        "_source" : { "body" : "elasticsearch filter", "title" : "elasticsearch basic query" }
      }
    ]
  }
}
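The two rankings can be compared with a small Python sketch of both scoring rules. The per-clause scores in `CLAUSE_SCORES` are assumptions: Elasticsearch does not return them in the responses above, so they are approximate values chosen to be consistent with the two reported totals for each document. The function names are hypothetical as well.

```python
def should_score(clause_scores):
    """bool/should: add up the scores of all matching clauses."""
    return sum(clause_scores)


def dis_max_score(clause_scores, tie_breaker=0.0):
    """dis_max: best clause score plus tie_breaker times the remaining clause scores."""
    if not clause_scores:
        return 0.0
    best = max(clause_scores)
    return best + tie_breaker * (sum(clause_scores) - best)


# ASSUMED per-clause scores as [body, title]; a clause that does not match is 0.0.
# These are approximations consistent with the totals shown in the two responses.
CLAUSE_SCORES = {
    "1": [0.7549, 0.1823],
    "2": [0.0, 0.8755],
}

for doc_id, scores in CLAUSE_SCORES.items():
    print(doc_id,
          round(should_score(scores), 4),
          round(dis_max_score(scores, tie_breaker=0.3), 4))
# 1 0.9372 0.8096
# 2 0.8755 0.8755
```

Under should, document 1 wins because its two weaker clauses are summed. Under dis_max with tie_breaker 0.3, only 30% of document 1's second clause is added, so document 2's single strong title match ranks first.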