By default, each hit in the search response includes the document _source, which is the entire JSON object that was provided when the document was indexed. To retrieve specific fields in the search response, use the fields parameter:
POST my-index-000001/_search
{
"query": {
"match": {
"message": "foo"
}
},
"fields": ["user.id", "@timestamp"],
"_source": false
}
The fields parameter consults both the document's _source and the index mappings to load and return values. Because it makes use of the mappings, fields has some advantages over referencing _source directly: it accepts multi-fields and field aliases, and it formats field values such as dates in a consistent way.
The document’s _source is stored in a single field in Lucene. Therefore, even if only a few fields are requested, the entire _source object must be loaded and parsed. To avoid this limitation, you can try another way of loading fields:
- Use the docvalue_fields parameter to get values for selected fields. This can be a good choice when returning a fairly small number of fields that support doc values, such as keywords and dates.
- Use the stored_fields parameter to get the values of specific stored fields (fields that use the store mapping option).
If needed, you can use the script_fields parameter to transform field values in the response with a script. However, scripts cannot take advantage of Elasticsearch's index structures or related optimizations, which can sometimes result in slower searches.
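The retrieval options above are all top-level keys of the search request body, so they can be combined in one request. The following Python sketch only assembles such a body as a dictionary and does not contact a cluster; the helper name build_search_body is hypothetical and not part of any Elasticsearch client.

```python
import json


def build_search_body(query, fields=None, docvalue_fields=None,
                      stored_fields=None, source=None):
    """Assemble a search request body, adding only the options that were set.

    Hypothetical helper for illustration; it performs no validation.
    """
    body = {"query": query}
    if fields is not None:
        body["fields"] = fields
    if docvalue_fields is not None:
        body["docvalue_fields"] = docvalue_fields
    if stored_fields is not None:
        body["stored_fields"] = stored_fields
    if source is not None:
        # False disables _source; a string, list, or object filters it.
        body["_source"] = source
    return body


# Same request as the first example: two fields, no _source.
body = build_search_body(
    {"match": {"message": "foo"}},
    fields=["user.id", "@timestamp"],
    source=False,
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the body of a _search request with whatever HTTP or Elasticsearch client you already use.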
You can find more details about each method in the following sections:
- Fields
- Doc value fields
- Stored fields
- Source filtering
- Script fields
Fields
WARNING: This feature is in beta and is subject to change. The design and code are less mature than the formal GA functionality and are provided as-is without warranty. Beta functionality is not subject to the support SLA of official GA functionality.
The fields parameter allows you to retrieve a list of document fields in the search response. It consults both the document _source and the index mappings to return each value in a standardized way that matches its mapping type. By default, date fields are formatted according to the date format parameter in their mappings.
The following search request uses the fields parameter to retrieve values for the user.id field, all fields starting with http.response., and the @timestamp field:
POST my-index-000001/_search
{
"query": {
"match": {
"user.id": "kimchy"
}
},
"fields": [
"user.id",
"http.response.*",
{
"field": "@timestamp",
"format": "epoch_millis"
}
],
"_source": false
}
Both full field names and wildcard patterns are accepted. Using object notation, you can pass a format parameter to apply a custom format to the field's values. The date fields date and date_nanos accept a date format. Spatial fields accept either geojson for GeoJSON (the default) or wkt for Well Known Text. Other field types do not support the format parameter.
The values are returned as a flat list in the fields section of each hit:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my-index-000001",
"_id" : "0",
"_score" : 1.0,
"_type" : "_doc",
"fields" : {
"user.id" : [
"kimchy"
],
"@timestamp" : [
"4098435132000"
],
"http.response.bytes": [
1070000
],
"http.response.status_code": [
200
]
}
}
]
}
}
Only leaf fields are returned; the fields parameter does not allow fetching entire objects.
The fields parameter handles field types such as field aliases and constant_keyword, whose values are not always present in _source. Other mapping options are also respected, including ignore_above, ignore_malformed, and null_value.
NOTE: The fields response always returns an array of values for each field, even when there is a single value in _source. This is because Elasticsearch has no dedicated array type, and any field could contain multiple values. The fields parameter also does not guarantee that array values are returned in a specific order. For more background, see the Arrays mapping documentation.
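Because every value in the fields section arrives wrapped in an array, client code often unwraps single-valued fields. The following Python sketch shows one way to do that on an already-parsed hit; the helper name single_values is hypothetical, and the sample hit mirrors the response shown above.

```python
def single_values(hit):
    """Map each field in a hit's "fields" section to a scalar when it holds
    exactly one value; multi-valued fields are left as lists.

    Hypothetical helper; it expects a hit already parsed from JSON.
    """
    result = {}
    for name, values in hit.get("fields", {}).items():
        result[name] = values[0] if len(values) == 1 else values
    return result


hit = {
    "_index": "my-index-000001",
    "_id": "0",
    "_score": 1.0,
    "fields": {
        "user.id": ["kimchy"],
        "@timestamp": ["4098435132000"],
        "http.response.bytes": [1070000],
        "http.response.status_code": [200],
    },
}

values = single_values(hit)
print(values["user.id"], values["http.response.status_code"])
```

Since the order of array values is not guaranteed, code that handles multi-valued fields should avoid relying on element positions.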
Doc value fields
You can use the docvalue_fields parameter to return doc values for one or more fields in the search response.
Doc values store the same values as _source, but in an on-disk, column-based structure that is optimized for sorting and aggregations. Because each field is stored separately, Elasticsearch only reads the requested field values and avoids loading the entire document _source.
By default, doc values are stored for supported fields. However, doc values are not supported for text or text_annotated fields.
The following search request uses the docvalue_fields parameter to retrieve doc values for the user.id field, all fields starting with http.response., and the @timestamp field:
GET my-index-000001/_search
{
"query": {
"match": {
"user.id": "kimchy"
}
},
"docvalue_fields": [
"user.id",
"http.response.*",
{
"field": "date",
"format": "epoch_millis"
}
]
}
Both full field names and wildcard patterns are accepted. Using object notation, you can pass a format parameter to apply a custom format to the field's doc values. Date fields support a date format. Numeric fields support a DecimalFormat pattern. Other field data types do not support the format parameter.
TIP: You cannot retrieve doc values of nested objects using the docvalue_fields parameter. If a nested object is specified, the search returns an empty array ([]) for that field. To access nested fields, use the docvalue_fields property of the inner_hits parameter.
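Entries in docvalue_fields can be either plain field names (or wildcard patterns) or objects carrying a format. The following Python sketch builds both kinds of entry; the helper name docvalue_field is hypothetical, and the code only constructs the request body without sending it.

```python
def docvalue_field(name, fmt=None):
    """Return a docvalue_fields entry: a plain name or pattern, or object
    notation when a format is given. Hypothetical helper for illustration."""
    return name if fmt is None else {"field": name, "format": fmt}


body = {
    "query": {"match": {"user.id": "kimchy"}},
    "docvalue_fields": [
        docvalue_field("user.id"),
        docvalue_field("http.response.*"),       # wildcard pattern
        docvalue_field("date", "epoch_millis"),  # date field with a format
    ],
}
print(body["docvalue_fields"])
```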
Stored fields
You can also use the store mapping option to store the values of individual fields. You can include these stored values in the search response using the stored_fields parameter.
WARNING: The stored_fields parameter is for fields that are explicitly marked as stored in the mapping, which is off by default and generally not recommended. Use source filtering instead to select a subset of the original source document to return.
The stored_fields parameter lets you selectively load specific stored fields for each document represented by a search hit.
GET /_search
{
"stored_fields" : ["user", "postDate"],
"query" : {
"term" : { "user" : "kimchy" }
}
}
The wildcard * can be used to load all stored fields from the document.
An empty array causes only the _id and _type of each hit to be returned, for example:
GET /_search
{
"stored_fields" : [],
"query" : {
"term" : { "user" : "kimchy" }
}
}
If the requested fields are not stored (with store mapping set to false), they are ignored.
Stored field values fetched from the document itself are always returned as an array. In contrast, metadata fields such as _routing are never returned as an array.
In addition, only leaf fields can be returned with the stored_fields option. If an object field is specified, it is ignored.
NOTE: On its own, stored_fields cannot be used to load fields in nested objects. If a field contains a nested object in its path, no data is returned for that stored field. To access nested fields, stored_fields must be used within an inner_hits block.
Disable stored fields
To disable stored fields (and metadata fields) completely, use _none_:
GET /_search
{
"stored_fields": "_none_",
"query" : {
"term" : { "user" : "kimchy" }
}
}
NOTE: The _source and version parameters cannot be activated if _none_ is used.
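The three forms of stored_fields described above (a list of names, an empty array, and _none_) can be sketched as a small Python request builder. The helper name stored_fields_body is hypothetical, and the client-side check simply mirrors the restriction in the note above; it is an assumption about where you might want to catch the problem, not a description of how Elasticsearch validates requests.

```python
def stored_fields_body(query, stored_fields, source=None, version=None):
    """Build a search body, rejecting _source/version when stored fields
    are disabled with "_none_". Hypothetical helper for illustration."""
    if stored_fields == "_none_" and (source or version):
        raise ValueError('"_source" and "version" cannot be used with "_none_"')
    body = {"query": query, "stored_fields": stored_fields}
    if source is not None:
        body["_source"] = source
    if version is not None:
        body["version"] = version
    return body


query = {"term": {"user": "kimchy"}}
selected = stored_fields_body(query, ["user", "postDate"])  # specific stored fields
ids_only = stored_fields_body(query, [])                    # only _id and _type
disabled = stored_fields_body(query, "_none_")              # no stored or metadata fields
print(selected, ids_only, disabled, sep="\n")
```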
Source filtering
You can use the _source parameter to select which fields to return from the source, which is called source filtering.
The following search API request sets the _source request body parameter to false. The document source is not included in the response.
GET /_search
{
"_source": false,
"query": {
"match": {
"user.id": "kimchy"
}
}
}
To return only a subset of the source fields, specify a wildcard (*) pattern in the _source parameter. The following search API request returns the source for only the obj field and its properties.
GET /_search
{
"_source": "obj.*",
"query": {
"match": {
"user.id": "kimchy"
}
}
}
You can also specify an array of wildcard patterns in the _source parameter. The following search API request returns the source for only the obj1 and obj2 fields and their properties.
GET /_search
{
"_source": [ "obj1.*", "obj2.*" ],
"query": {
"match": {
"user.id": "kimchy"
}
}
}
For finer control, you can specify an object containing arrays of includes and excludes patterns in the _source parameter.
If the includes property is specified, only source fields that match one of its patterns are returned. You can exclude fields from this subset using the excludes property.
If the includes property is not specified, the entire document source is returned, excluding any fields that match a pattern in the excludes property.
The following search API request returns the source for only the obj1 and obj2 fields and their properties, excluding any child description fields.
GET /_search
{
"_source": {
"includes": [ "obj1.*", "obj2.*" ],
"excludes": [ "*.description" ]
},
"query": {
"term": {
"user.id": "kimchy"
}
}
}
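To get a feel for how includes and excludes interact, the following Python sketch applies the same rules to a nested dictionary on the client side. It is a simplified approximation, not Elasticsearch's implementation: it works on flattened leaf paths, returns a flat mapping of dotted paths, and uses Python's fnmatch for wildcard matching, which also gives special meaning to ? and [ ]. The function names and the sample document are hypothetical.

```python
from fnmatch import fnmatchcase


def flatten(obj, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf in a nested dict."""
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from flatten(value, path + ".")
        else:
            yield path, value


def filter_source(source, includes=(), excludes=()):
    """Keep leaves matching an include pattern (all leaves when no includes
    are given), then drop leaves matching any exclude pattern."""
    kept = {}
    for path, value in flatten(source):
        if includes and not any(fnmatchcase(path, p) for p in includes):
            continue
        if any(fnmatchcase(path, p) for p in excludes):
            continue
        kept[path] = value
    return kept


doc = {
    "obj1": {"name": "a", "description": "first"},
    "obj2": {"size": 3, "description": "second"},
    "other": 1,
}
filtered = filter_source(doc, ["obj1.*", "obj2.*"], ["*.description"])
print(filtered)
```

With no includes, the same function keeps every leaf except those matched by excludes, which corresponds to the second rule described above.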
Script fields
You can use the script_fields parameter to retrieve a script evaluation (based on different fields) for each hit. For example:
GET /_search
{
"query": {
"match_all": {}
},
"script_fields": {
"test1": {
"script": {
"lang": "painless",
"source": "doc['price'].value * 2"
}
},
"test2": {
"script": {
"lang": "painless",
"source": "doc['price'].value * params.factor",
"params": {
"factor": 2.0
}
}
}
}
}
Script fields can work on fields that are not stored (price in the example above) and allow custom values to be returned (the evaluated value of the script).
Script fields can also access the actual _source document and extract specific elements from it by using params['_source']. Here is an example:
GET /_search
{
"query" : {
"match_all": {}
},
"script_fields" : {
"test1" : {
"script" : "params['_source']['message']"
}
}
}
Note the _source keyword here, which is used to navigate the JSON-like model.
It is important to understand the difference between doc['my_field'].value and params['_source']['my_field']. The first, using the doc keyword, causes the terms for that field to be loaded into memory (cached), which results in faster execution but more memory consumption. Also, the doc[...] notation only allows simple-valued fields (you cannot return a JSON object from it) and makes sense only for non-analyzed or single-term-based fields. However, using doc is still the recommended way to access values from the document whenever possible, because _source must be loaded and parsed every time it is used, which makes it very slow.
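The two access styles can be placed side by side in one request. The following Python sketch builds a script_fields section that uses doc for a numeric field and params['_source'] for a text field; the helper names and the result names price_scaled and raw_message are hypothetical, and the code only constructs the request body.

```python
def doc_access(field):
    """Painless expression reading a single value through doc values."""
    return f"doc['{field}'].value"


def source_access(field):
    """Painless expression reading a value from the parsed _source."""
    return f"params['_source']['{field}']"


def script_field(source, **params):
    """Wrap a Painless source string (and optional params) as a script field."""
    script = {"lang": "painless", "source": source}
    if params:
        script["params"] = params
    return {"script": script}


body = {
    "query": {"match_all": {}},
    "script_fields": {
        # Preferred where possible: doc values avoid parsing _source.
        "price_scaled": script_field(doc_access("price") + " * params.factor", factor=2.0),
        # Slower: _source is loaded and parsed for every hit.
        "raw_message": script_field(source_access("message")),
    },
}
print(body["script_fields"]["price_scaled"]["script"]["source"])
```

Passing the multiplier through params rather than embedding it in the script text follows the pattern of the test2 field in the earlier example.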
See the website: www.elastic.co/guide/en/el…