This article is based on Elasticsearch 6.4.x. One of the most important features of Elasticsearch is that it tries to get out of your way and lets you start exploring your data as quickly as possible. To index a document, you do not have to first create an index, define a mapping type, and define the fields: you just index the document, and the index, type, and fields are created automatically. For example:

PUT data/_doc/1 
{ "count": 5 }

When the request above is executed, the index data does not need to exist beforehand. Elasticsearch automatically creates the index data and the mapping type _doc, and adds to that mapping type a field named count whose datatype is long. This process of inferring a field's type from the values in a document and adding the mapping automatically is the subject of this article: the dynamic mapping mechanism. Dynamic mapping consists of the following two sets of rules:

  • Dynamic field mappings
  • Dynamic templates

The two dynamic mapping rules described above are introduced next.

1. Dynamic field mappings

By default, when Elasticsearch finds a previously unseen field in a document, it adds the new field to the type mapping. You can change this behavior at the document level and at the object level by setting the dynamic mapping parameter to false (new fields are ignored) or to strict (an exception is thrown when an unknown field is encountered). When dynamic field mapping is enabled, the datatype of the new field is chosen from the JSON datatype according to the following rules:

JSON datatype            Elasticsearch datatype
null                     No field is added
true or false            boolean
floating point number    float
integer                  long
object                   object
array                    Depends on the first non-null value in the array
string                   date (if the value passes date detection), double or long (if the value passes numeric detection), otherwise text with a keyword subfield
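
The table above can be read as a simple decision procedure. The following Python sketch is only a model of those rules, not Elasticsearch code: the function name detect_dynamic_type is hypothetical, and string values are reported as text because date and numeric detection are left out here.

```python
def detect_dynamic_type(value):
    """Return the datatype the table assigns to a parsed JSON value.

    Returns None when no field would be added. This is a model of the
    table only; date and numeric detection for strings are not covered.
    """
    if value is None:
        return None                      # null: no field is added
    if isinstance(value, bool):          # test bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # Arrays take the type of their first non-null element.
        for item in value:
            if item is not None:
                return detect_dynamic_type(item)
        return None
    if isinstance(value, str):
        return "text"                    # text with a keyword subfield
    raise TypeError(f"not a JSON value: {value!r}")
```

For the request at the top of the article, detect_dynamic_type(5) gives "long", which is the type the count field received.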

1.1 Date detection

If date_detection is enabled (the default), new string fields are checked to see whether their content matches any of the date patterns specified in dynamic_date_formats. If a pattern matches, the field is added as a date field whose format is the matching pattern. For example:

PUT my_index/_doc/1
{
  "create_date": "2015/09/02"
}

The create_date value is a string in JSON, but because date_detection is enabled it is mapped as a date field. Date detection can be disabled at the mapping type level by setting date_detection to false, as shown in the following example:

PUT my_index
{
  "mappings": {
    "_doc": {
      "date_detection": false
    }
  }
}

1.2 Customizing date detection formats

You can customize the detected date formats at the mapping type level with the dynamic_date_formats parameter, as shown in the following example:

PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic_date_formats": ["MM/dd/yyyy"]
    }
  }
}

PUT my_index/_doc/1
{
  "create_date": "09/25/2015"
}
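
The date detection step described above can be modeled in a few lines of Python. This is a sketch under assumptions, not the Elasticsearch implementation: detect_date_format and the JODA_TO_STRPTIME table are hypothetical, only the two patterns used in this article are translated, and Python's strptime is not guaranteed to accept exactly the same inputs as the Elasticsearch date parser.

```python
from datetime import datetime

# Hypothetical translation table from the date patterns used in this
# article to Python strptime codes. Only these two patterns are covered.
JODA_TO_STRPTIME = {
    "yyyy/MM/dd": "%Y/%m/%d",
    "MM/dd/yyyy": "%m/%d/%Y",
}


def detect_date_format(value, dynamic_date_formats, date_detection=True):
    """Return the first pattern that parses value, or None.

    None means the string is not treated as a date. A pattern missing
    from JODA_TO_STRPTIME raises KeyError.
    """
    if not date_detection:
        return None
    for pattern in dynamic_date_formats:
        strptime_format = JODA_TO_STRPTIME[pattern]
        try:
            datetime.strptime(value, strptime_format)
        except ValueError:
            continue                     # try the next configured pattern
        return pattern
    return None
```

With dynamic_date_formats set to ["MM/dd/yyyy"], the value "09/25/2015" is recognized, while "2015/09/02" no longer is, because the configured list replaces the defaults rather than extending them.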

1.3 Numeric detection

Similarly, JSON sometimes carries numbers as strings. If numeric detection is enabled, such strings are mapped to numeric fields instead of string fields. Numeric detection is enabled at the mapping type level as follows:

PUT my_index
{
  "mappings": {
    "_doc": {
      "numeric_detection": true
    }
  }
}

By default, numeric_detection is false.
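
As a rough illustration of numeric detection, the Python sketch below classifies a string value. It is a model only: detect_numeric_string is a hypothetical name, the floating-point case is labeled float on the assumption that it follows the table entry for floating point numbers, and Python's int() and float() accept some inputs (for example surrounding whitespace, or "nan") that Elasticsearch may treat differently.

```python
def detect_numeric_string(value, numeric_detection=False):
    """Return the modeled datatype for a new string field."""
    if not numeric_detection:
        return "text"                    # the default: strings stay strings
    try:
        int(value)
        return "long"
    except ValueError:
        pass                             # not an integer, try floating point
    try:
        float(value)
        return "float"                   # assumption, see the note above
    except ValueError:
        return "text"                    # not numeric at all
```

With numeric_detection enabled, "5" is classified as long and "foo" stays text; with the default setting both stay text.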

2. Dynamic templates

Dynamic field mappings apply the built-in Elasticsearch rules for choosing a datatype. Dynamic templates let you define custom mappings that are applied to dynamically added fields instead. Dynamic templates are defined as follows:

"dynamic_templates": [                 // @1
  {
    "my_template_name": {              // @2
      ... match conditions ...         // @3
      "mapping": { ... }               // @4
    }
  },
  ...
]

Code @1: dynamic_templates is an array of templates, defined on the mapping type.
Code @2: The name of the dynamic template.
Code @3: The match conditions, which can be expressed with match_mapping_type, match, match_pattern, unmatch, path_match, and path_unmatch.
Code @4: The mapping applied to fields that satisfy the conditions in @3 (the usual mapping parameters are supported).

The core of a dynamic template is its match conditions and its mapping. The following sections go through the match conditions one by one.

2.1 match_mapping_type

match_mapping_type matches on the datatype detected by the JSON parser. Because JSON cannot distinguish a long from an integer, or a double from a float, the parser always chooses the wider datatype: for example, the value 5 is detected as long. The following template maps all integer fields to integer instead of long, and all string fields to text with a keyword subfield:

PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic_templates": [
        {
          "integers": {
            "match_mapping_type": "long",
            "mapping": {
              "type": "integer"
            }
          }
        },
        {
          "strings": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "text",
              "fields": {
                "raw": {
                  "type":  "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      ]
    }
  }
}
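
The way such a template list is consulted can be modeled in Python. The sketch assumes that templates are tried in order and that the first matching template wins; pick_template is a hypothetical helper, and only the match_mapping_type condition is modeled.

```python
# The two templates from the request above, as a Python list.
TEMPLATES = [
    {"integers": {"match_mapping_type": "long",
                  "mapping": {"type": "integer"}}},
    {"strings": {"match_mapping_type": "string",
                 "mapping": {"type": "text",
                             "fields": {"raw": {"type": "keyword",
                                                "ignore_above": 256}}}}},
]


def pick_template(templates, detected_type):
    """Return (template_name, mapping) of the first matching template.

    Returns None when no template matches, in which case the default
    dynamic field mapping rules would apply.
    """
    for template in templates:
        for name, body in template.items():
            if body["match_mapping_type"] in ("*", detected_type):
                return name, body["mapping"]
    return None
```

A value detected as long selects the integers template, a string selects strings, and a boolean matches nothing and keeps its default mapping.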

In short, match_mapping_type selects fields by the datatype that dynamic field mapping (type detection) would assign, and the mapping in the template replaces that default.

2.2 match and unmatch

The match parameter uses a pattern to match the field name, while unmatch uses a pattern to exclude fields that match would otherwise select. The following example uses both:

PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic_templates": [
        {
          "longs_as_strings": {
            "match_mapping_type": "string",   // @1
            "match":   "long_*",              // @2
            "unmatch": "*_text",              // @3
            "mapping": {                      // @4
              "type": "long"
            }
          }
        }
      ]
    }
  }
}

PUT my_index/_doc/1
{
  "long_num": "5",      // @5
  "long_text": "foo"    // @6
}

Code @1: The template applies to new fields that the JSON parser detects as string.
Code @2: The field name must start with long_.
Code @3: Field names that end with _text are excluded.
Code @4: A new string field whose name starts with long_ and does not end with _text is mapped as long.
Code @5: long_num is mapped as long.
Code @6: long_text starts with long_ but also ends with _text, so the template does not apply. The field falls back to dynamic field mapping and is mapped as a text field with a keyword subfield.
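
The combined effect of match_mapping_type, match, and unmatch in this example can be checked with a small Python model. It assumes that * is the only wildcard in these patterns; simple_match and template_applies are hypothetical helpers, not Elasticsearch functions.

```python
import re


def simple_match(pattern, name):
    """Wildcard match in which * stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name) is not None


def template_applies(template, field_name, detected_type):
    """Return True when every condition of the template holds."""
    wanted_type = template.get("match_mapping_type")
    if wanted_type is not None and wanted_type not in ("*", detected_type):
        return False
    if "match" in template and not simple_match(template["match"], field_name):
        return False
    if "unmatch" in template and simple_match(template["unmatch"], field_name):
        return False                     # unmatch removes fields that match selected
    return True


LONGS_AS_STRINGS = {
    "match_mapping_type": "string",
    "match": "long_*",
    "unmatch": "*_text",
    "mapping": {"type": "long"},
}
```

Here template_applies(LONGS_AS_STRINGS, "long_num", "string") is true, while the same call with "long_text" is false because of the unmatch pattern.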

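2.3 match_pattern

The match_pattern parameter changes how match is interpreted: with match_pattern set to regex, match is treated as a regular expression over the field name instead of a simple wildcard pattern.

"dynamic_templates": [
  {
    "longs_as_strings": {
      "match_mapping_type": "string",
      "match_pattern": "regex",     // @1
      "match": "^profit_\\d+$",     // @2
      "mapping": {
        "type": "long"
      }
    }
  }
]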

Code @1: Sets the match mode to regex, so that match is interpreted as a Java regular expression.
Code @2: The Java regular expression.
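
To see which field names such a pattern selects, it can be tried out in Python. This is only an approximation: Python's re module is used in place of Java regular expressions (the two agree for this particular pattern), whole-name matching is assumed, and matches_pattern and as_json are hypothetical helpers. The second helper shows how the pattern looks once serialized into a JSON request body, where a backslash must be escaped.

```python
import json
import re

# The regular expression itself, written as a raw Python string.
PROFIT_PATTERN = r"^profit_\d+$"


def matches_pattern(pattern, field_name):
    """Return True when the whole field name matches the pattern."""
    return re.fullmatch(pattern, field_name) is not None


def as_json(pattern):
    """Serialize the pattern as the value of "match" in a JSON body."""
    return json.dumps({"match": pattern})
```

matches_pattern(PROFIT_PATTERN, "profit_2017") is true, while "profit_" and "my_profit_1" are rejected.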

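2.4 path_match and path_unmatch

path_match and path_unmatch work in the same way as match and unmatch, but operate on the full dotted path of the field rather than just its final name. This is especially useful for fields inside object or nested fields. The following example copies the contents of every field in the name object to the top-level full_name field, except for the middle field: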

PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic_templates": [
        {
          "copy_full_name": {
            "path_match":   "name.*",
            "path_unmatch": "*.middle",
            "mapping": {
              "type":       "text",
              "copy_to":    "full_name"
            }
          }
        }
      ]
    }
  }
}

PUT my_index/_doc/1
{
  "name": {
    "first":  "Alice",
    "middle": "Mary",
    "last":   "White"
  }
}
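
The selection made by path_match and path_unmatch in this example can be modeled in Python by flattening the document into dotted paths. The sketch assumes that * is the only wildcard and looks only at leaf fields; simple_match, flatten, and fields_copied are hypothetical helpers.

```python
import re


def simple_match(pattern, path):
    """Wildcard match in which * stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, path) is not None


def flatten(obj, prefix=""):
    """Yield (dotted_path, value) for every leaf field of a document."""
    for key, value in obj.items():
        path = prefix + key
        if isinstance(value, dict):
            yield from flatten(value, path + ".")
        else:
            yield path, value


def fields_copied(doc, path_match, path_unmatch):
    """Return the full paths selected by path_match minus path_unmatch."""
    return [
        path
        for path, _ in flatten(doc)
        if simple_match(path_match, path) and not simple_match(path_unmatch, path)
    ]
```

For the document above, fields_copied(doc, "name.*", "*.middle") selects name.first and name.last, so only those two values end up in full_name.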

2.5 {name} and {dynamic_type}

The {name} and {dynamic_type} placeholders can be used in the mapping of a template. {name} is replaced with the field name, and {dynamic_type} is replaced with the field type detected by the JSON parser.

PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic_templates": [
        {
          "named_analyzers": {                           // @1
            "match_mapping_type": "string",
            "match": "*",
            "mapping": {
              "type": "text",
              "analyzer": "{name}"
            }
          }
        },
        {
          "no_doc_values": {                         // @2
            "match_mapping_type":"*",
            "mapping": {
              "type": "{dynamic_type}",
              "doc_values": false
            }
          }
        }
      ]
    }
  }
}

PUT my_index/_doc/1
{
  "english": "Some English text", 
  "count":   5 
}

Code @1: Every field detected as string is mapped as text, and its analyzer is set to the analyzer whose name equals the field name. Use this with care: an analyzer with that name may not exist, and this example is only a demonstration.
Code @2: For a field of any detected type, the mapping uses the detected type itself and disables doc_values (doc_values is set to false). Because the first matching template wins, string fields are already handled by @1, so in effect @2 covers the non-string fields.
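
Placeholder substitution can be illustrated with a short Python sketch. It is a model of the behavior described above, not Elasticsearch code; render_mapping is a hypothetical helper that replaces the two placeholders in every string value of a template mapping.

```python
def render_mapping(mapping, field_name, dynamic_type):
    """Return a copy of mapping with {name} and {dynamic_type} filled in."""
    rendered = {}
    for key, value in mapping.items():
        if isinstance(value, str):
            value = value.replace("{name}", field_name)
            value = value.replace("{dynamic_type}", dynamic_type)
        elif isinstance(value, dict):
            value = render_mapping(value, field_name, dynamic_type)
        rendered[key] = value            # other values are kept as they are
    return rendered
```

For the document above, the english field gets the mapping with "analyzer": "english", and the count field gets "type": "long" with doc_values disabled.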

This article has described the dynamic mapping mechanism of Elasticsearch in detail.