Preface

This article covers the most useful search techniques for Elasticsearch, along with their Java API implementations.

Data preparation

To illustrate the different types of ES searches, we will search a collection of book documents with the following fields:

title, authors, summary, publish_date, num_reviews, publisher

First, we create a new index and load the data in one batch with the bulk API.

# Set the index settings
PUT /bookdb_index
{ "settings": { "number_of_shards": 1 }}

# Bulk-index the data
POST /bookdb_index/book/_bulk
{ "index": { "_id": 1 }}
{ "title": "Elasticsearch: The Definitive Guide", "authors": ["clinton gormley", "zachary tong"], "summary": "A distibuted real-time search and analytics engine", "publish_date": "2015-02-07", "num_reviews": 20, "publisher": "oreilly" }
{ "index": { "_id": 2 }}
{ "title": "Taming Text: How to Find, Organize, and Manipulate It", "authors": ["grant ingersoll", "thomas morton", "drew farris"], "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization", "publish_date": "2013-01-24", "num_reviews": 12, "publisher": "manning" }
{ "index": { "_id": 3 }}
{ "title": "Elasticsearch in Action", "authors": ["radu gheorge", "matthew lee hinman", "roy russo"], "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand  advanced data science algorithms", "publish_date": "2015-12-03", "num_reviews": 18, "publisher": "manning" }
{ "index": { "_id": 4 }}
{ "title": "Solr in Action", "authors": ["trey grainger", "timothy potter"], "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr", "publish_date": "2014-04-05", "num_reviews": 23, "publisher": "manning" }

Note: The ES version used in this experiment is ES 6.3.0

1. Basic Match Query

1.1 Full-text Retrieval

There are two ways to perform full-text retrieval:

1) Use a retrieval API that contains parameters as part of the URL

Example: Perform a full-text search for “Guide” below

GET bookdb_index/book/_search?q=guide

[Results]
"hits": {
  "total": 2,
  "max_score": 1.3278645,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "4", "_score": 1.3278645,
      "_source": {
        "title": "Solr in Action",
        "authors": ["trey grainger", "timothy potter"],
        "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
        "publish_date": "2014-04-05",
        "num_reviews": 23,
        "publisher": "manning"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 1.2871116,
      "_source": {
        "title": "Elasticsearch: The Definitive Guide",
        "authors": ["clinton gormley", "zachary tong"],
        "summary": "A distibuted real-time search and analytics engine",
        "publish_date": "2015-02-07",
        "num_reviews": 20,
        "publisher": "oreilly"
      }
    }
  ]
}

2) Use the full ES DSL, with the JSON query as the request body. The result is the same as with method 1.

GET bookdb_index/book/_search
{
  "query": {
    "multi_match": {
      "query": "guide",
      "fields": ["_all"]
    }
  }
}

Explanation: multi_match is used instead of match as a convenient shorthand for running the same query against multiple fields. The fields attribute specifies which fields to query; in this case we want to query all fields in the document.

Note: In ES 6.x the _all field is not enabled by default. If fields is not specified, all fields are searched by default.

1.2 Searching a Specific Field

Both APIs also let you specify which fields to search. For example, to search for books with the words "in action" in the title field:

1) URL retrieval method

GET bookdb_index/book/_search?q=title:in action

[Results]
"hits": {
  "total": 2,
  "max_score": 1.6323128,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "3", "_score": 1.6323128,
      "_source": {
        "title": "Elasticsearch in Action",
        "authors": ["radu gheorge", "matthew lee hinman", "roy russo"],
        "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand  advanced data science algorithms",
        "publish_date": "2015-12-03",
        "num_reviews": 18,
        "publisher": "manning"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "4", "_score": 1.6323128,
      "_source": {
        "title": "Solr in Action",
        "authors": ["trey grainger", "timothy potter"],
        "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
        "publish_date": "2014-04-05",
        "num_reviews": 23,
        "publisher": "manning"
      }
    }
  ]
}

2) DSL retrieval method

The full-body DSL gives you more flexibility to create complex queries (as we will see later) and to specify how results are returned. In the following example, we specify the number of results to return, the offset to start from (useful for paging), the document fields to return, and term highlighting.

  • size: the number of results to return
  • from: the offset of the first result
  • _source: the fields to return
  • highlight: the fields to highlight

GET bookdb_index/book/_search
{
  "query": {
    "match": {
      "title": "in action"
    }
  },
  "size": 2,
  "from": 0,
  "_source": ["title", "summary", "publish_date"],
  "highlight": {
    "fields": {
      "title": {}
    }
  }
}

[Results]
"hits": {
  "total": 2,
  "max_score": 1.6323128,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "3", "_score": 1.6323128,
      "_source": {
        "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand  advanced data science algorithms",
        "title": "Elasticsearch in Action",
        "publish_date": "2015-12-03"
      },
      "highlight": {
        "title": ["Elasticsearch <em>in</em> <em>Action</em>"]
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "4", "_score": 1.6323128,
      "_source": {
        "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
        "title": "Solr in Action",
        "publish_date": "2014-04-05"
      },
      "highlight": {
        "title": ["Solr <em>in</em> <em>Action</em>"]
      }
    }
  ]
}

Note:

  1. For multi-word queries, the match query lets you use the AND operator instead of the default OR operator: "operator": "and"
  2. You can also specify the minimum_should_match option to tune the relevance of the returned results. For details, see the Elasticsearch Guide.
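
As a minimal sketch of how the paging and operator options fit together, the request body can be assembled in Python with only the standard library. The helper name build_match_search and its page-based arguments are hypothetical and not part of any Elasticsearch client; the sketch only builds the JSON body and does not send it.

```python
import json


def build_match_search(field, text, page=1, size=10, operator="or", source_fields=None):
    """Build a search body for a match query with paging and highlighting.

    Hypothetical helper for illustration only.
    """
    body = {
        # Long form of the match query, which accepts the "operator" option.
        "query": {"match": {field: {"query": text, "operator": operator}}},
        # "from" is the offset of the first hit, so page N starts at (N - 1) * size.
        "from": (page - 1) * size,
        "size": size,
        "highlight": {"fields": {field: {}}},
    }
    if source_fields is not None:
        body["_source"] = source_fields
    return body


body = build_match_search(
    "title", "in action", page=2, size=2, operator="and",
    source_fields=["title", "summary", "publish_date"],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a GET bookdb_index/book/_search request.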

2. Multi-field Search

As we have already seen, to search more than one document field (for example, to search for the same query string in both the title and the summary), use the multi_match query.

GET bookdb_index/book/_search
{
  "query": {
    "multi_match": {
      "query": "guide",
      "fields": ["title", "summary"]
    }
  }
}

[Results]
"hits": {
  "total": 3,
  "max_score": 2.0281231,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 2.0281231,
      "_source": {
        "title": "Elasticsearch: The Definitive Guide",
        "authors": ["clinton gormley", "zachary tong"],
        "summary": "A distibuted real-time search and analytics engine",
        "publish_date": "2015-02-07",
        "num_reviews": 20,
        "publisher": "oreilly"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "4", "_score": 1.3278645,
      "_source": {
        "title": "Solr in Action",
        "authors": ["trey grainger", "timothy potter"],
        "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
        "publish_date": "2014-04-05",
        "num_reviews": 23,
        "publisher": "manning"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "3", "_score": 1.0333893,
      "_source": {
        "title": "Elasticsearch in Action",
        "authors": ["radu gheorge", "matthew lee hinman", "roy russo"],
        "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand  advanced data science algorithms",
        "publish_date": "2015-12-03",
        "num_reviews": 18,
        "publisher": "manning"
      }
    }
  ]
}

Note: Document 4 (_id=4) matches in the results above because the word "guide" appears in its summary.

3. Boosting

Since we are searching across multiple fields, we may want to boost the scores from a particular field. In the example below, we boost scores from the summary field by a factor of 3 to increase the importance of that field, which in turn increases the relevance of document 4.

GET bookdb_index/book/_search
{
  "query": {
    "multi_match": {
      "query": "elasticsearch guide",
      "fields": ["title", "summary^3"]
    }
  },
  "_source": ["title", "summary", "publish_date"]
}

[Results]
"hits": {
  "total": 3,
  "max_score": 3.9835935,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "4", "_score": 3.9835935,
      "_source": {
        "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
        "title": "Solr in Action",
        "publish_date": "2014-04-05"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "3", "_score": 3.1001682,
      "_source": {
        "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand  advanced data science algorithms",
        "title": "Elasticsearch in Action",
        "publish_date": "2015-12-03"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 2.0281231,
      "_source": {
        "summary": "A distibuted real-time search and analytics engine",
        "title": "Elasticsearch: The Definitive Guide",
        "publish_date": "2015-02-07"
      }
    }
  ]
}

Note: Boosting does not simply multiply the calculated score by the boost factor. The boost value that is actually applied goes through normalization and some internal optimization. See the Elasticsearch Guide for more information.

4. Bool Query

We can use the AND/OR/NOT operators to fine-tune our search queries and get more relevant or more specific results.

This is done in the search API through the bool query. A bool query accepts a must parameter (equivalent to AND), a must_not parameter (equivalent to NOT), and a should parameter (equivalent to OR).

For example, suppose we want to search for a book with the word "Elasticsearch" or "Solr" in the title, AND authored by "clinton gormley", but NOT authored by "radu gheorge":

GET bookdb_index/book/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {"match": {"title": "Elasticsearch"}},
              {"match": {"title": "Solr"}}
            ]
          }
        },
        {"match": {"authors": "clinton gormely"}}
      ],
      "must_not": [
        {"match": {"authors": "radu gheorge"}}
      ]
    }
  }
}

[Results]
"hits": {
  "total": 1,
  "max_score": 2.0749094,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 2.0749094,
      "_source": {
        "title": "Elasticsearch: The Definitive Guide",
        "authors": ["clinton gormley", "zachary tong"],
        "summary": "A distibuted real-time search and analytics engine",
        "publish_date": "2015-02-07",
        "num_reviews": 20,
        "publisher": "oreilly"
      }
    }
  ]
}

There are two cases of should in a bool query:

  • When there is a must clause at the same level as should, the should conditions are optional. The more of them that are satisfied, the higher the score.
  • When there is no must clause, at least one should condition must be satisfied by default.

Note: As you can see, a bool query can contain any other query type, including other Boolean queries, to create arbitrarily complex or deeply nested queries
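
Because bool queries nest, it can be convenient to compose them from small helper functions. The following is a minimal sketch using plain dictionaries; the helpers match and bool_query are hypothetical names introduced for illustration, and the sketch only builds the request body.

```python
import json


def match(field, text):
    """Return a match clause for a single field (hypothetical helper)."""
    return {"match": {field: text}}


def bool_query(must=None, should=None, must_not=None):
    """Return a bool query containing only the clause lists that were given."""
    clauses = {}
    if must:
        clauses["must"] = must
    if should:
        clauses["should"] = should
    if must_not:
        clauses["must_not"] = must_not
    return {"bool": clauses}


# (title has "Elasticsearch" OR "Solr") AND author matches AND NOT the excluded author.
query = bool_query(
    must=[
        bool_query(should=[match("title", "Elasticsearch"), match("title", "Solr")]),
        match("authors", "clinton gormley"),
    ],
    must_not=[match("authors", "radu gheorge")],
)
body = {"query": query}
print(json.dumps(body, indent=2))
```

The inner bool has no must clause, so by the rule above at least one of its should conditions has to match.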

5. Fuzzy Queries

Fuzzy matching can be enabled on match and multi_match queries to catch spelling errors. The degree of fuzziness is specified as a Levenshtein distance from the original word.

GET bookdb_index/book/_search
{
  "query": {
    "multi_match": {
      "query": "comprihensiv guide",
      "fields": ["title", "summary"],
      "fuzziness": "AUTO"
    }
  },
  "_source": ["title", "summary", "publish_date"],
  "size": 2
}

[Results]
"hits": {
  "total": 2,
  "max_score": 2.4344182,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "4", "_score": 2.4344182,
      "_source": {
        "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
        "title": "Solr in Action",
        "publish_date": "2014-04-05"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 1.2871116,
      "_source": {
        "summary": "A distibuted real-time search and analytics engine",
        "title": "Elasticsearch: The Definitive Guide",
        "publish_date": "2015-02-07"
      }
    }
  ]
}

A fuzziness value of "AUTO" is equivalent to specifying a value of 2 when the term length is greater than 5. However, about 80% of spelling errors have an edit distance of 1, so setting the fuzziness to 1 may improve overall search performance. For more information, see the Typos and Misspellings chapter of the Elasticsearch guide.
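
To make the edit distance concrete, here is a small sketch in Python. The levenshtein function is the textbook dynamic-programming algorithm, not Elasticsearch's implementation (by default Elasticsearch also counts a transposition of two adjacent characters as a single edit, which this sketch does not). The auto_fuzziness function is a hypothetical helper that mirrors the default "AUTO" thresholds: terms of 0 to 2 characters must match exactly, 3 to 5 characters allow one edit, and longer terms allow two.

```python
def levenshtein(a, b):
    """Classic edit distance: insertions, deletions, and substitutions cost 1 each."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def auto_fuzziness(term):
    """Maximum edits allowed for a term under the default AUTO thresholds."""
    if len(term) < 3:
        return 0
    if len(term) < 6:
        return 1
    return 2


misspelled, indexed = "comprihensiv", "comprehensive"
distance = levenshtein(misspelled, indexed)
print(distance)                                 # 2
print(distance <= auto_fuzziness(misspelled))   # True
```

The misspelled query term is two edits away from "comprehensive" (one substitution and one insertion), which is within the two edits that AUTO allows for a 12-character term, so document 4 is found.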

6. Wildcard Query

Wildcard queries let you specify a pattern to match instead of an entire term.

  • ? matches any single character
  • * matches zero or more characters
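
The two wildcard characters behave like shell-style patterns, so their effect on individual terms can be sketched with Python's fnmatch module. This is only an analogy under stated assumptions: fnmatch additionally treats [seq] as a character class, which is not part of the two wildcards described here, and the list of terms below is a hand-picked sample of author tokens from the example data.

```python
from fnmatch import fnmatchcase

# Sample of lowercased terms as they might be produced from the authors field.
author_terms = ["clinton", "gormley", "zachary", "tong", "thomas", "morton", "trey", "grainger"]


def wildcard_matches(pattern, terms):
    """Return the terms that match the whole wildcard pattern."""
    return [term for term in terms if fnmatchcase(term, pattern)]


print(wildcard_matches("t*", author_terms))    # ['tong', 'thomas', 'trey']
print(wildcard_matches("t?ey", author_terms))  # ['trey']
```

Note that the pattern is matched against individual terms, not against the whole field value.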

For example, to find all records that have an author whose name begins with the letter "t":

GET bookdb_index/book/_search
{
  "query": {
    "wildcard": {
      "authors": {
        "value": "t*"
      }
    }
  },
  "_source": ["title", "authors"],
  "highlight": {
    "fields": {
      "authors": {}
    }
  }
}

[Results]
"hits": {
  "total": 3,
  "max_score": 1,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 1,
      "_source": {
        "title": "Elasticsearch: The Definitive Guide",
        "authors": ["clinton gormley", "zachary tong"]
      },
      "highlight": {
        "authors": ["zachary <em>tong</em>"]
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "2", "_score": 1,
      "_source": {
        "title": "Taming Text: How to Find, Organize, and Manipulate It",
        "authors": ["grant ingersoll", "thomas morton", "drew farris"]
      },
      "highlight": {
        "authors": ["<em>thomas</em> morton"]
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "4", "_score": 1,
      "_source": {
        "title": "Solr in Action",
        "authors": ["trey grainger", "timothy potter"]
      },
      "highlight": {
        "authors": ["<em>trey</em> grainger", "<em>timothy</em> potter"]
      }
    }
  ]
}

7. Regexp Query

Regexp queries let you specify more complex search patterns than wildcard queries, as shown in the following example:

POST bookdb_index/book/_search
{
  "query": {
    "regexp": {
      "authors": "t[a-z]*y"
    }
  },
  "_source": ["title", "authors"],
  "highlight": {
    "fields": {
      "authors": {}
    }
  }
}

[Results]
"hits": {
  "total": 1,
  "max_score": 1,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "4", "_score": 1,
      "_source": {
        "title": "Solr in Action",
        "authors": ["trey grainger", "timothy potter"]
      },
      "highlight": {
        "authors": ["<em>trey</em> grainger", "<em>timothy</em> potter"]
      }
    }
  ]
}

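A regexp pattern has to match an entire term, not just part of it. The sketch below illustrates this with Python's re.fullmatch. It is an approximation: Elasticsearch uses Lucene's regular expression syntax, which differs from Python's in places, but a simple pattern like this one behaves the same way in both. The term list is a hand-picked sample of author tokens from the example data.

```python
import re

# Sample of lowercased terms as they might be produced from the authors field.
author_terms = ["trey", "grainger", "timothy", "potter", "thomas", "morton", "tong"]

pattern = re.compile(r"t[a-z]*y")

# fullmatch requires the pattern to cover the whole term, like an anchored regexp.
matches = [term for term in author_terms if pattern.fullmatch(term)]
print(matches)  # ['trey', 'timothy']
```

Only "trey" and "timothy" start with "t" and end with "y", which is why document 4 is the only hit.
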
8. Match Phrase Query

The match_phrase query requires that all the terms in the query string be present in the document, in the order specified in the query string, and close to each other.

By default, the terms must be exactly adjacent, but you can specify a slop value that indicates how far apart terms are allowed to be while the document is still considered a match.

GET bookdb_index/book/_search
{
  "query": {
    "multi_match": {
      "query": "search engine",
      "fields": ["title", "summary"],
      "type": "phrase",
      "slop": 3
    }
  },
  "_source": ["title", "summary", "publish_date"]
}

[Results]
"hits": {
  "total": 2,
  "max_score": 0.88067603,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "4", "_score": 0.88067603,
      "_source": {
        "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
        "title": "Solr in Action",
        "publish_date": "2014-04-05"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 0.51429313,
      "_source": {
        "summary": "A distibuted real-time search and analytics engine",
        "title": "Elasticsearch: The Definitive Guide",
        "publish_date": "2015-02-07"
      }
    }
  ]
}

Note: In the example above, a non-phrase query would usually give document _id 1 a higher score and show it before document _id 4, because its field length is shorter.

However, as a phrase query, the proximity of the terms is taken into account, so document _id 4 scores better.

9. Match Phrase Prefix Query

The match_phrase_prefix query provides search-as-you-type, a "poor man's" version of autocomplete, at query time without the need to prepare the data in any way.

Like the match_phrase query, it accepts a slop parameter to make word order and relative position less strict. It also accepts the max_expansions parameter to limit the number of terms matched, in order to reduce resource intensity.

GET bookdb_index/book/_search
{
  "query": {
    "match_phrase_prefix": {
      "summary": {
        "query": "search en",
        "slop": 3,
        "max_expansions": 10
      }
    }
  },
  "_source": ["title", "summary", "publish_date"]
}

Note: Query-time search-as-you-type has a performance cost. A better solution is index-time search-as-you-type. See the Completion Suggester API or the use of Edge-Ngram filters for more information.
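
As a rough illustration of why max_expansions limits the work done at query time, the sketch below expands a prefix against a sorted list of terms and keeps only the first few candidates. This is a simplified model under stated assumptions, not Elasticsearch's actual implementation: the function expand_prefix is hypothetical, and the term list is invented for the example (only "engine" appears in the sample data).

```python
def expand_prefix(prefix, term_dictionary, max_expansions=50):
    """Return up to max_expansions terms, in sorted order, that start with prefix."""
    candidates = sorted(term for term in set(term_dictionary) if term.startswith(prefix))
    return candidates[:max_expansions]


# Invented sample of indexed terms for the summary field.
summary_terms = ["engine", "engines", "english", "enterprise", "search", "scalable", "extraction"]

print(expand_prefix("en", summary_terms, max_expansions=2))  # ['engine', 'engines']
print(expand_prefix("en", summary_terms))                    # all four "en" terms
```

With a small max_expansions the query stays cheap, but a term that sorts late in the dictionary may not be found until the user types more characters.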

10. Query String

The query_string query provides a concise shorthand syntax for executing multi_match queries, bool queries, boosting, fuzzy matching, wildcards, regexp, and range queries.

In the example below, we run a fuzzy search for the terms "search algorithm" where one of the book authors is "grant ingersoll" or "tom morton". We search all fields but apply a boost of 2 to the summary field.

GET bookdb_index/book/_search
{
  "query": {
    "query_string": {
      "query": "(saerch~1 algorithm~1) AND (grant ingersoll) OR (tom morton)",
      "fields": ["summary^2", "title", "authors", "publisher"]
    }
  },
  "_source": ["title", "summary", "authors"],
  "highlight": {
    "fields": {
      "summary": {}
    }
  }
}

[Results]
"hits": {
  "total": 1,
  "max_score": 3.571021,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "2", "_score": 3.571021,
      "_source": {
        "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
        "title": "Taming Text: How to Find, Organize, and Manipulate It",
        "authors": ["grant ingersoll", "thomas morton", "drew farris"]
      },
      "highlight": {
        "summary": [
          "organize text using approaches such as full-text <em>search</em>, proper name recognition, clustering, tagging"
        ]
      }
    }
  ]
}

11. Simple Query String

The simple_query_string query is a version of the query_string query that is better suited to a single search box exposed to users, because it replaces AND/OR/NOT with +/|/- respectively, and it discards the invalid parts of a query instead of throwing an exception when the user makes a mistake.

GET bookdb_index/book/_search
{
  "query": {
    "simple_query_string": {
      "query": "(saerch~1 algorithm~1) + (grant ingersoll) | (tom morton)",
      "fields": ["summary^2", "title", "authors", "publisher"]
    }
  },
  "_source": ["title", "summary", "authors"],
  "highlight": {
    "fields": {
      "summary": {}
    }
  }
}

[Results]
# Same result as above

12. Term/Terms Query

The examples in sections 1-11 above are all full-text searches. Sometimes we are more interested in a structured search, where we want to find an exact match and return the results.

In the following example, we search for all books in the index published by Manning Publications, using the term and terms queries.

GET bookdb_index/book/_search
{
  "query": {
    "term": {
      "publisher": {
        "value": "manning"
      }
    }
  },
  "_source": ["title", "publish_date", "publisher"]
}

[Results]
"hits": {
  "total": 3,
  "max_score": 0.35667494,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "2", "_score": 0.35667494,
      "_source": {
        "publisher": "manning",
        "title": "Taming Text: How to Find, Organize, and Manipulate It",
        "publish_date": "2013-01-24"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "3", "_score": 0.35667494,
      "_source": {
        "publisher": "manning",
        "title": "Elasticsearch in Action",
        "publish_date": "2015-12-03"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "4", "_score": 0.35667494,
      "_source": {
        "publisher": "manning",
        "title": "Solr in Action",
        "publish_date": "2014-04-05"
      }
    }
  ]
}

The terms query lets you specify multiple terms to search for.

GET bookdb_index/book/_search
{
  "query": {
    "terms": {
      "publisher": ["oreilly", "manning"]
    }
  }
}

13. Term Query (Sorted)

Term query results, like those of any other query, can easily be sorted. Multi-level sorts are also allowed.

GET bookdb_index/book/_search
{
  "query": {
    "term": {
      "publisher": {
        "value": "manning"
      }
    }
  },
  "_source": ["title", "publish_date", "publisher"],
  "sort": [
    {"publisher.keyword": {"order": "desc"}},
    {"title.keyword": {"order": "asc"}}
  ]
}

[Results]
"hits": {
  "total": 3,
  "max_score": null,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "3", "_score": null,
      "_source": {
        "publisher": "manning",
        "title": "Elasticsearch in Action",
        "publish_date": "2015-12-03"
      },
      "sort": ["manning", "Elasticsearch in Action"]
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "4", "_score": null,
      "_source": {
        "publisher": "manning",
        "title": "Solr in Action",
        "publish_date": "2014-04-05"
      },
      "sort": ["manning", "Solr in Action"]
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "2", "_score": null,
      "_source": {
        "publisher": "manning",
        "title": "Taming Text: How to Find, Organize, and Manipulate It",
        "publish_date": "2013-01-24"
      },
      "sort": ["manning", "Taming Text: How to Find, Organize, and Manipulate It"]
    }
  ]
}

Note: In Elasticsearch 6.x, text fields cannot be sorted on by default, so the sort above uses the keyword sub-fields (publisher.keyword and title.keyword).
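
The ordering produced by a multi-level sort (publisher descending, then title ascending) can be reproduced in plain Python as a sanity check. This sketch sorts in memory and has nothing to do with how Elasticsearch sorts internally; it uses all four sample books, without the term filter, so that the primary sort key is visible too.

```python
from operator import itemgetter

books = [
    {"publisher": "oreilly", "title": "Elasticsearch: The Definitive Guide"},
    {"publisher": "manning", "title": "Taming Text: How to Find, Organize, and Manipulate It"},
    {"publisher": "manning", "title": "Elasticsearch in Action"},
    {"publisher": "manning", "title": "Solr in Action"},
]

# Python's sort is stable, so sort by the secondary key first (title, ascending),
# then by the primary key (publisher, descending).
by_title = sorted(books, key=itemgetter("title"))
ordered = sorted(by_title, key=itemgetter("publisher"), reverse=True)

for book in ordered:
    print(book["publisher"], "|", book["title"])
```

The three Manning books come out in the same order as in the results above: "Elasticsearch in Action", "Solr in Action", then "Taming Text".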

14. Range Query

Another example of a structured search is the range query. In the example below, we search for books published in 2015.

GET bookdb_index/book/_search
{
  "query": {
    "range": {
      "publish_date": {
        "gte": "2015-01-01",
        "lte": "2015-12-31"
      }
    }
  },
  "_source": ["title", "publish_date", "publisher"]
}

[Results]
"hits": {
  "total": 2,
  "max_score": 1,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 1,
      "_source": {
        "publisher": "oreilly",
        "title": "Elasticsearch: The Definitive Guide",
        "publish_date": "2015-02-07"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "3", "_score": 1,
      "_source": {
        "publisher": "manning",
        "title": "Elasticsearch in Action",
        "publish_date": "2015-12-03"
      }
    }
  ]
}

Note: Range queries apply to date, number, and string type fields

15. Filtered Query

(Removed since version 5.0, so you can skip this section.)

Filtered queries let you filter the results of a query. In the following example, we query for books with the term "Elasticsearch" in the title or summary, but we want to filter the results down to those with 20 or more reviews.

POST /bookdb_index/book/_search
{
  "query": {
    "filtered": {
      "query": {
        "multi_match": {
          "query": "elasticsearch",
          "fields": ["title", "summary"]
        }
      },
      "filter": {
        "range": {
          "num_reviews": {
            "gte": 20
          }
        }
      }
    }
  },
  "_source": ["title", "summary", "publisher", "num_reviews"]
}

[Results]
"hits": [
  {
    "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 0.5955761,
    "_source": {
      "summary": "A distibuted real-time search and analytics engine",
      "publisher": "oreilly",
      "num_reviews": 20,
      "title": "Elasticsearch: The Definitive Guide"
    }
  }
]

Note: Filtered queries do not require a query to filter on. If no query is specified, the match_all query is run, which returns all documents in the index, and the filter is then applied to them. In practice, the filter is run first to reduce the surface area that needs to be queried. In addition, the filter is cached after its first use, which makes it very performant.

Update: Filtered queries have been removed in Elasticsearch 5.x in favor of bool queries. Here is the example above rewritten as a bool query. The results returned are exactly the same.

GET bookdb_index/book/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "elasticsearch",
            "fields": ["title", "summary"]
          }
        }
      ],
      "filter": {
        "range": {
          "num_reviews": {
            "gte": 20
          }
        }
      }
    }
  },
  "_source": ["title", "summary", "publisher", "num_reviews"]
}

16. Multiple Filters

(No longer supported in 5.x, so you can skip this section.) Multiple filters can be combined by using the bool filter.

In the next example, the filter determines that the returned results must have at least 20 reviews, must not have been published before 2015, and should be published by O'Reilly.

POST /bookdb_index/book/_search
{
  "query": {
    "filtered": {
      "query": {
        "multi_match": {
          "query": "elasticsearch",
          "fields": ["title", "summary"]
        }
      },
      "filter": {
        "bool": {
          "must": {
            "range": {"num_reviews": {"gte": 20}}
          },
          "must_not": {
            "range": {"publish_date": {"lte": "2014-12-31"}}
          },
          "should": {
            "term": {"publisher": "oreilly"}
          }
        }
      }
    }
  },
  "_source": ["title", "summary", "publisher", "num_reviews", "publish_date"]
}

[Results]
"hits": [
  {
    "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 0.5955761,
    "_source": {
      "summary": "A distibuted real-time search and analytics engine",
      "publisher": "oreilly",
      "num_reviews": 20,
      "title": "Elasticsearch: The Definitive Guide",
      "publish_date": "2015-02-07"
    }
  }
]

17. Function Score: Field Value Factor

There may be cases where you want to factor the value of a particular document field into the relevance score calculation. This is typical when you want to boost the relevance of a document based on its popularity.

In our example, we want to boost the more popular books (judged by the number of reviews). This can be done with the field_value_factor function score.

GET bookdb_index/book/_search
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query": "search engine",
          "fields": ["title", "summary"]
        }
      },
      "field_value_factor": {
        "field": "num_reviews",
        "modifier": "log1p",
        "factor": 2
      }
    }
  },
  "_source": ["title", "summary", "publish_date", "num_reviews"]
}

[Results]
"hits": [
  {
    "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 1.5694137,
    "_source": {
      "summary": "A distibuted real-time search and analytics engine",
      "num_reviews": 20,
      "title": "Elasticsearch: The Definitive Guide",
      "publish_date": "2015-02-07"
    }
  },
  {
    "_index": "bookdb_index", "_type": "book", "_id": "4", "_score": 1.4725765,
    "_source": {
      "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
      "num_reviews": 23,
      "title": "Solr in Action",
      "publish_date": "2014-04-05"
    }
  },
  {
    "_index": "bookdb_index", "_type": "book", "_id": "3", "_score": 0.14181662,
    "_source": {
      "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand  advanced data science algorithms",
      "num_reviews": 18,
      "title": "Elasticsearch in Action",
      "publish_date": "2015-12-03"
    }
  },
  {
    "_index": "bookdb_index", "_type": "book", "_id": "2", "_score": 0.13297246,
    "_source": {
      "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
      "num_reviews": 12,
      "title": "Taming Text: How to Find, Organize, and Manipulate It",
      "publish_date": "2013-01-24"
    }
  }
]

Note 1: We could instead run a regular multi_match query and sort by the num_reviews field, but then we would lose the benefit of relevance scoring.

Note 2: There are a number of additional parameters that tune how strongly the original relevance score is boosted (e.g. "modifier", "factor", "boost_mode"). See the Elasticsearch guide.
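
To get a feel for how the log1p modifier dampens the effect of large review counts, the factor can be computed by hand. The sketch below assumes the documented behaviour that the field value is first multiplied by factor and then passed through the modifier, with log1p meaning log10(value + 1), and that the result is multiplied into the query score (the default boost_mode). The function names are hypothetical, and only three modifiers are modelled.

```python
import math

# A subset of the available modifiers, modelled as plain functions.
MODIFIERS = {
    "none": lambda value: value,
    "log1p": lambda value: math.log10(value + 1),
    "sqrt": math.sqrt,
}


def field_value_factor(field_value, factor=1.0, modifier="none"):
    """Compute the score multiplier contributed by a numeric field."""
    return MODIFIERS[modifier](factor * field_value)


def boosted_score(query_score, field_value, factor=1.0, modifier="none"):
    """Combine the query score with the multiplier (boost_mode "multiply")."""
    return query_score * field_value_factor(field_value, factor, modifier)


for num_reviews in (12, 18, 20, 23):
    print(num_reviews, round(field_value_factor(num_reviews, factor=2, modifier="log1p"), 4))
```

Going from 12 to 23 reviews nearly doubles the raw value but raises the multiplier only from about 1.40 to about 1.67, so popularity nudges the ranking without overwhelming text relevance.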

18. Function Score: Decay Functions

Suppose that, instead of boosting the score incrementally by the value of a field, we have an ideal target value and want the boost to decay the further a document is from it. This is typically useful for price ranges, numeric field ranges, and date ranges. In our example, we are searching for books on "search engines" published around June 2014.

GET bookdb_index/book/_search
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query": "search engine",
          "fields": ["title", "summary"]
        }
      },
      "functions": [
        {
          "exp": {
            "publish_date": {
              "origin": "2014-06-15",
              "scale": "30d",
              "offset": "7d"
            }
          }
        }
      ],
      "boost_mode": "replace"
    }
  },
  "_source": ["title", "summary", "publish_date", "num_reviews"]
}

[Results]
"hits": {
  "total": 4,
  "max_score": 0.22793062,
  "hits": [
    {
      "_index": "bookdb_index", "_type": "book", "_id": "4", "_score": 0.22793062,
      "_source": {
        "summary": "Comprehensive guide to implementing a scalable search engine using Apache Solr",
        "num_reviews": 23,
        "title": "Solr in Action",
        "publish_date": "2014-04-05"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 0.0049215667,
      "_source": {
        "summary": "A distibuted real-time search and analytics engine",
        "num_reviews": 20,
        "title": "Elasticsearch: The Definitive Guide",
        "publish_date": "2015-02-07"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "2", "_score": 0.000009612435,
      "_source": {
        "summary": "organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization",
        "num_reviews": 12,
        "title": "Taming Text: How to Find, Organize, and Manipulate It",
        "publish_date": "2013-01-24"
      }
    },
    {
      "_index": "bookdb_index", "_type": "book", "_id": "3", "_score": 0.0000049185574,
      "_source": {
        "summary": "build scalable search applications using Elasticsearch without having to do complex low-level programming or understand  advanced data science algorithms",
        "num_reviews": 18,
        "title": "Elasticsearch in Action",
        "publish_date": "2015-12-03"
      }
    }
  ]
}

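Because boost_mode is "replace", each score in the results above is just the value of the decay function, which makes it possible to check the numbers by hand. The sketch below assumes the exponential decay formula from the Elasticsearch documentation, exp(lambda * max(0, |value - origin| - offset)) with lambda = ln(decay) / scale and a default decay of 0.5. The function name exp_decay is hypothetical, and distances are measured in whole days.

```python
import math
from datetime import date


def exp_decay(value, origin, scale_days, offset_days=0, decay=0.5):
    """Exponential decay score for a date field, with distances in days."""
    # Documents within offset_days of the origin are not penalized at all.
    distance = max(0, abs((value - origin).days) - offset_days)
    # At a distance of scale_days beyond the offset, the score equals decay.
    lam = math.log(decay) / scale_days
    return math.exp(lam * distance)


origin = date(2014, 6, 15)
publish_dates = {
    "Solr in Action": date(2014, 4, 5),
    "Elasticsearch: The Definitive Guide": date(2015, 2, 7),
}
for title, published in publish_dates.items():
    print(title, exp_decay(published, origin, scale_days=30, offset_days=7))
```

"Solr in Action" was published 71 days before the origin, so after the 7-day offset its distance is 64 days and its score is 0.5 ** (64 / 30), roughly 0.2279, in line with the first hit above.
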
19. Function Score: Script Scoring

In cases where the built-in scoring functions do not meet your needs, you have the option of specifying a Groovy script for scoring.

In our example, we specify a script that takes publish_date into account before deciding how much weight to give the number of reviews. Newer books may not have as many reviews yet, so they should not be penalized for that.

The scoring script is as follows:

publish_date = doc['publish_date'].value
num_reviews = doc['num_reviews'].value

if (publish_date > Date.parse('yyyy-MM-dd', threshold).getTime()) {
  my_score = Math.log(2.5 + num_reviews)
} else {
  my_score = Math.log(1 + num_reviews)
}
return my_score
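
For readers who do not know Groovy, the same scoring logic can be sketched in Python. This is only an illustration of the branching, not something Elasticsearch can run: the function name review_score is hypothetical, and dates are passed as date objects rather than parsed from strings.

```python
import math
from datetime import date


def review_score(publish_date, num_reviews, threshold):
    """Score by review count, with a small bonus for books newer than threshold."""
    if publish_date > threshold:
        # Recent books get a head start so a low review count hurts them less.
        return math.log(2.5 + num_reviews)
    return math.log(1 + num_reviews)


threshold = date(2015, 7, 30)
print(review_score(date(2015, 12, 3), 18, threshold))  # newer book: log(20.5)
print(review_score(date(2014, 4, 5), 23, threshold))   # older book: log(24)
```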

To use the scoring script dynamically, we use the script_score parameter

GET /bookdb_index/book/_search
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query": "search engine",
          "fields": ["title", "summary"]
        }
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "params": {
                "threshold": "2015-07-30"
              },
              "lang": "groovy",
              "source": "publish_date = doc['publish_date'].value; num_reviews = doc['num_reviews'].value; if (publish_date > Date.parse('yyyy-MM-dd', threshold).getTime()) { return log(2.5 + num_reviews) }; return log(1 + num_reviews);"
            }
          }
        }
      ]
    }
  },
  "_source": ["title", "summary", "publish_date", "num_reviews"]
}

Note 1: To use dynamic scripting, it must be enabled for your Elasticsearch instance in the config/elasticsearch.yml file. You can also use scripts that are stored on the Elasticsearch server. See the Elasticsearch reference docs for more information.

Note 2: JSON cannot contain embedded newline characters, so semicolons are used to separate statements.

By Tim Ojo, Aug. 05, 16 · Big Data Zone

Note: How can Groovy scripts be enabled in ES 6.3? Configuring script.allowed_types: inline and script.allowed_contexts: search, update did not work. (Groovy scripting support was removed in Elasticsearch 6.0, where Painless is the default scripting language.)

Java API implementation

The Java API implementations of the queries above are available at github.com/whirlys/ela…

23 Useful Elasticsearch Example Queries you need to know


For more, visit my personal blog: laijianfeng.org
