Preface

Structured search in Elasticsearch is search over data types such as numbers, dates, times, and booleans. These types have a precise format and are usually queried with exact term matching or prefix matching. This article also covers the “text” and “keyword” field types in newer versions, and the term query.

Structured search

Structured search is search over structured data. For example, dates, times, and numbers are structured: they have precise formats that we can operate on logically. Common operations include comparing ranges of numbers or times, determining which of two values is larger, prefix matching, and so on.

Text can also be structured. For example, colored pens come in a discrete set of colors: red, green, and blue. A blog post might be tagged with the keywords distributed and search. Products on e-commerce sites have UPCs (Universal Product Codes) or other unique identifiers that follow strict, structured formats.

A structured query gives only a “yes” or “no” answer: a document either matches or it does not. Depending on the scenario, we can decide whether a structured search needs scoring, but usually it does not.

Exact value lookup

Let’s start by creating and indexing some documents that represent products. Each document has the fields price, productID, show, createdAt, and tags (price, product ID, whether to display the product, creation time, and tag information):

POST products/_doc/_bulk
{ "index": { "_id": 1 }}
{ "price" : 10, "productID" : "XHDK-A-1293-#fJ3", "show":true, "createdAt":"2021-03-03", "tags":"abc" }
{ "index": { "_id": 2 }}
{ "price" : 20, "productID" : "KDKE-B-9947-#kL5", "show":true, "createdAt":"2021-03-04" }
{ "index": { "_id": 3 }}
{ "price" : 30, "productID" : "JODL-X-1937-#pV7", "show":false, "createdAt":"2021-03-05"}
{ "index": { "_id": 4 }}
{ "price" : 30, "productID" : "QQPX-R-3956-#aD8", "show":true, "createdAt":"2021-03-06"}

Numbers

Now we want to find all the products with a certain price. Suppose we want the items priced at $20. We can use the term query as follows:

GET products/_search
{
  "query": {
    "term": {
      "price": 20
    }
  }
}

Usually when looking for an exact value, we don’t want to score the query. We only want documents to be included or excluded, so we use the constant_score query to execute the term query in non-scoring mode with a uniform score of 1.0.

The final result is a constant_score query that wraps a term query:

GET products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "price": 20
        }
      }
    }
  }
}
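
The wrapping step can also be done programmatically when request bodies are built in application code. The following is a minimal sketch in plain Python that only constructs the JSON body and sends nothing; the helper name constant_score is an assumption made for illustration, not part of any Elasticsearch client.

```python
import json


def constant_score(filter_clause, boost=1.0):
    """Wrap a filter clause so that every matching document gets the same score.

    Hypothetical helper for illustration: it only builds the request body.
    """
    return {"query": {"constant_score": {"filter": filter_clause, "boost": boost}}}


# Same request body as the term query on price shown above, with the default score made explicit.
body = constant_score({"term": {"price": 20}})
print(json.dumps(body, indent=2))
```

Because the filter clause is passed in as a value, the same helper can wrap the range, terms, and exists clauses used in the rest of this article.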

Numbers also support range queries:

GET products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "price": {
            "gte": 10,
            "lte": 20
          }
        }
      }
    }
  }
}

Options supported by range

  • gt: greater than (>)
  • lt: less than (<)
  • gte: greater than or equal to (>=)
  • lte: less than or equal to (<=)
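
To make the meaning of the four options concrete, here is a small sketch that applies the same comparisons locally in plain Python. It only mimics the semantics on the prices of the four sample products; the function name matches_range is an assumption made for illustration.

```python
import operator

# Each range option maps to an ordinary comparison operator.
_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def matches_range(value, bounds):
    """Return True if value satisfies every bound, e.g. {"gte": 10, "lte": 20}."""
    return all(_OPS[name](value, limit) for name, limit in bounds.items())


# Prices of the sample products, keyed by document _id.
prices = {1: 10, 2: 20, 3: 30, 4: 30}

hits = [doc_id for doc_id, price in prices.items()
        if matches_range(price, {"gte": 10, "lte": 20})]
print(hits)  # [1, 2]
```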

Boolean value

GET products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "show": true
        }
      }
    }
  }
}

Dates

Search for documents within a time range

POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "createdAt": {
            "gte": "now-9d"
          }
        }
      }
    }
  }
}

POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "createdAt": {
            "gte": "2021-01-05"
          }
        }
      }
    }
  }
}

Date math expressions

  • y: years
  • M: months
  • w: weeks
  • d: days
  • h/H: hours
  • m: minutes
  • s: seconds
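
As an illustration of how an expression such as now-9d resolves to a concrete instant, the sketch below evaluates a small subset of date math in plain Python. It is a simplification under stated assumptions: only the fixed-length units (w, d, h/H, m, s) are handled, the calendar-based units y and M and rounding are left out, and the function name resolve_date_math is hypothetical.

```python
import re
from datetime import datetime, timedelta

# Fixed-length units only; y and M depend on the calendar and are omitted from this sketch.
_UNITS = {"w": "weeks", "d": "days", "h": "hours", "H": "hours",
          "m": "minutes", "s": "seconds"}


def resolve_date_math(expr, now):
    """Resolve expressions like 'now-9d' or 'now+1h-30m' relative to `now`."""
    if not expr.startswith("now"):
        raise ValueError("this sketch only supports expressions anchored at 'now'")
    rest = expr[3:]
    result = now
    pos = 0
    for match in re.finditer(r"([+-])(\d+)([wdhHms])", rest):
        if match.start() != pos:
            raise ValueError(f"unsupported expression: {expr!r}")
        sign, amount, unit = match.groups()
        delta = timedelta(**{_UNITS[unit]: int(amount)})
        result = result + delta if sign == "+" else result - delta
        pos = match.end()
    if pos != len(rest):
        raise ValueError(f"unsupported expression: {expr!r}")
    return result


reference = datetime(2021, 3, 12)
print(resolve_date_math("now-9d", reference))  # 2021-03-03 00:00:00
```

With that reference time, a filter of "gte": "now-9d" would cover all four sample documents, whose createdAt values run from 2021-03-03 to 2021-03-06.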

Text

POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "terms": {
          "productID.keyword": [
            "XHDK-A-1293-#fJ3",
            "KDKE-B-9947-#kL5"
          ]
        }
      }
    }
  }
}

The keyword in productID.keyword is not a reserved word. It is a subfield named keyword that Elasticsearch automatically generates for productID when the document is indexed.

Handling null values

Use exists to find documents that have a value for a field, and exists inside must_not to find documents that do not:

POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "exists": {
          "field": "tags"
        }
      }
    }
  }
}

POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must_not": {
            "exists": {
              "field": "tags"
            }
          }
        }
      }
    }
  }
}
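
The effect of the two filters on the sample data can be sketched locally in plain Python. This only mimics the semantics: a field counts as existing when it holds at least one non-null value, so null and an empty array do not count. The function name field_exists is an assumption made for illustration.

```python
def field_exists(doc, field):
    """Mimic the exists filter: True if the field holds at least one non-null value."""
    value = doc.get(field)
    if value is None:
        return False
    if isinstance(value, list):
        return any(item is not None for item in value)
    return True


# The four sample products, reduced to the fields that matter here.
docs = {
    1: {"price": 10, "tags": "abc"},
    2: {"price": 20},
    3: {"price": 30},
    4: {"price": 30},
}

with_tags = [doc_id for doc_id, doc in docs.items() if field_exists(doc, "tags")]
without_tags = [doc_id for doc_id, doc in docs.items() if not field_exists(doc, "tags")]
print(with_tags)     # [1]
print(without_tags)  # [2, 3, 4]
```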

Note that newer versions no longer support the “missing” query. It has been removed, and the same effect is now achieved by negating exists with “must_not”. Using “missing” produces an error like the following:

"reason": "no [query] registered for [missing]"

keyword

In 2.x, text was stored in fields of type string. Starting with 5.0, the string type was deprecated and the text and keyword types were introduced, both of which can store strings.

“Text” is used for full-text search and “keyword” is used for structured search. “Keyword” is similar to an enumeration in Java. In newer versions, if you do not define your own mapping, a string value is automatically mapped to “text” with a subfield of type “keyword”.

When indexed, “text” is split into terms by the analyzer, while “keyword” is left intact. For example, “Rabbit is jumping” might be indexed as terms such as “rabbit” and “jump” for “text”, depending on the analyzer, but as the single term “Rabbit is jumping” for “keyword”.
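
The difference can be sketched in a few lines of Python. This is only a rough approximation of the default standard analyzer (split on non-alphanumeric characters, then lowercase, with no stemming), and the function name index_terms is an assumption made for illustration.

```python
import re


def index_terms(value, field_type):
    """Return the terms that would end up in the inverted index for a string value.

    Rough stand-in: "text" is split on non-alphanumeric characters and lowercased,
    "keyword" is kept as one unmodified term.
    """
    if field_type == "keyword":
        return [value]
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", value)]


print(index_terms("XHDK-A-1293-#fJ3", "text"))     # ['xhdk', 'a', '1293', 'fj3']
print(index_terms("XHDK-A-1293-#fJ3", "keyword"))  # ['XHDK-A-1293-#fJ3']
```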

The term query

In ES, the term query does not analyze its input. It takes the input as a whole, looks up that exact term in the inverted index, and scores each document containing the term with the relevance scoring formula.

For example, the text field above (“productID”: “QQPX-R-3956-#aD8”) is split by the default standard analyzer into the lowercase terms “qqpx”, “r”, “3956”, and “ad8”, so a term query on productID for the original value finds nothing.

“productID.keyword” is of type keyword, so even if a match query is used against it, it is ultimately executed as a term query.

GET products/_search
{
  "query": {
    "match": {
      //"productID": "QQPX-R-3956-#aD8"
      //"productID": "qqpx"
      //"productID": "qqpx-r-3956-#ad8"
      //"productID.keyword": "QQPX-R-3956-#aD8"
      "productID.keyword": "qqpx-r-3956-#ad8"
    }
  }
}

GET products/_search
{
  "query": {
    "term": {
      "productID": "QQPX-R-3956-#aD8"
      //"productID": "qqpx"
      //"productID": "qqpx-r-3956-#ad8"
      //"productID.keyword": "QQPX-R-3956-#aD8"
      //"productID.keyword": "qqpx-r-3956-#ad8"
    }
  }
}
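
Which of these variants find the document can be worked out with a small local model of the inverted index. The sketch below is plain Python under stated assumptions: the analyzer is a rough approximation of the default standard analyzer, match uses its default "or" behavior, and the names analyze, term_query, and match_query are hypothetical helpers, not Elasticsearch APIs.

```python
import re

VALUE = "QQPX-R-3956-#aD8"
FIELD_TYPES = {"productID": "text", "productID.keyword": "keyword"}


def analyze(text):
    """Rough stand-in for the standard analyzer: split on non-alphanumerics, lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


# Terms stored in the inverted index for the single sample document.
INDEX = {
    "productID": set(analyze(VALUE)),  # text: {'qqpx', 'r', '3956', 'ad8'}
    "productID.keyword": {VALUE},      # keyword: the value as one term
}


def term_query(field, value):
    """term: the input is looked up as-is, with no analysis."""
    return value in INDEX[field]


def match_query(field, value):
    """match: the input is analyzed on text fields, left whole on keyword fields."""
    terms = analyze(value) if FIELD_TYPES[field] == "text" else [value]
    return any(term in INDEX[field] for term in terms)


print(term_query("productID", "QQPX-R-3956-#aD8"))           # False
print(term_query("productID", "qqpx"))                       # True
print(term_query("productID.keyword", "QQPX-R-3956-#aD8"))   # True
print(match_query("productID", "QQPX-R-3956-#aD8"))          # True
print(match_query("productID.keyword", "qqpx-r-3956-#ad8"))  # False
```

The last line shows the case described above: on the keyword subfield, match behaves like term, so a value that differs only in letter case does not match.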

References

  • www.elastic.co/guide/cn/el…
  • www.elastic.co/guide/en/el…
  • www.elastic.co/guide/en/el…