6. In-depth aggregation analysis

1. `Bucket` & `Metric` aggregation analysis and nested aggregation

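1.1 Understanding `Bucket` & `Metric Aggregation` through `SQL`

In SQL terms, a bucket corresponds roughly to a `GROUP BY` clause, and a metric to an aggregate function such as `COUNT` or `AVG`.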

1.2 `Aggregation` syntax

An aggregation is part of a search request. In general, set the request's `size` to 0 so that the response contains only the aggregation results and no hits.
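
To make the request shape concrete, here is a minimal Python sketch that builds such a body as a plain dictionary. The helper `build_search_body` is hypothetical and not part of any Elasticsearch client; it only shows where `size`, the aggregation name, and the aggregation type sit.

```python
import json


def build_search_body(agg_name, agg_type, field):
    """Build a search body that asks only for aggregation results.

    Hypothetical helper for illustration, not an Elasticsearch client API.
    """
    return {
        "size": 0,  # return no hits, only the aggregation results
        "aggs": {
            agg_name: {                     # a name you choose; it labels the result
                agg_type: {"field": field}  # the aggregation type and its options
            }
        },
    }


body = build_search_body("min_salary", "min", "salary")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after a `POST <index>/_search` line in the console.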

1.3 `Metric Aggregation`

1.3.1 Understanding `Metric` aggregations

Single-value analysis: outputs a single result
- min, max, avg, sum
- cardinality (similar to a distinct count)
Multi-value analysis: outputs multiple results
- stats, extended stats
- percentiles, percentile ranks (used when you want to find percentiles)
- top hits (the top-ranked documents)
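
The difference between single-value and multi-value metrics can be sketched in plain Python. This is only an emulation over five salaries taken from this section's sample employee data, not a call to Elasticsearch; note that the real `cardinality` aggregation is approximate, while `len(set(...))` is exact.

```python
# Five salaries from the sample employee data used in this section.
salaries = [35000, 50000, 18000, 22000, 18000]

# Single-value metrics: each one yields exactly one number.
min_salary = min(salaries)
max_salary = max(salaries)
sum_salary = sum(salaries)
avg_salary = sum_salary / len(salaries)
distinct_salaries = len(set(salaries))  # what `cardinality` approximates

# Multi-value metric: `stats` bundles several numbers into one result.
stats_salary = {
    "count": len(salaries),
    "min": min_salary,
    "max": max_salary,
    "avg": avg_salary,
    "sum": sum_salary,
}

print(min_salary, max_salary, avg_salary, distinct_salaries)
print(stats_salary)
```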

1.3.2 `Metric` aggregation `Demo`

1.3.2.1 Data Preparation

# Define the mapping of the employees index
PUT /employees
{
  "mappings": {
    "properties": {
      "age": { "type": "integer" },
      "gender": { "type": "keyword" },
      "job": {
        "type": "text",
        "fields": {
          "keyword": { "type": "keyword", "ignore_above": 50 }
        }
      },
      "name": { "type": "keyword" },
      "salary": { "type": "integer" }
    }
  }
}

# Add some data to the employees index
PUT /employees/_bulk
{ "index": { "_id": "1" } }
{ "name": "Emma", "age": 32, "job": "Product Manager", "gender": "female", "salary": 35000 }
{ "index": { "_id": "2" } }
{ "name": "Underwood", "age": 41, "job": "Dev Manager", "gender": "male", "salary": 50000 }
{ "index": { "_id": "3" } }
{ "name": "Tran", "age": 25, "job": "Web Designer", "gender": "male", "salary": 18000 }
{ "index": { "_id": "4" } }
{ "name": "Rivera", "age": 26, "job": "Web Designer", "gender": "female", "salary": 22000 }
{ "index": { "_id": "5" } }
{ "name": "Rose", "age": 25, "job": "QA", "gender": "female", "salary": 18000 }
{ "index": { "_id": "6" } }
{ "name": "Lucy", "age": 31, "job": "QA", "gender": "female", "salary": 25000 }
{ "index": { "_id": "7" } }
{ "name": "Byrd", "age": 27, "job": "QA", "gender": "male", "salary": 20000 }
{ "index": { "_id": "8" } }
{ "name": "Foster", "age": 27, "job": "Java Programmer", "gender": "male", "salary": 20000 }
{ "index": { "_id": "9" } }
{ "name": "Gregory", "age": 32, "job": "Java Programmer", "gender": "male", "salary": 22000 }
{ "index": { "_id": "10" } }
{ "name": "Bryant", "age": 20, "job": "Java Programmer", "gender": "male", "salary": 9000 }
{ "index": { "_id": "11" } }
{ "name": "Jenny", "age": 36, "job": "Java Programmer", "gender": "female", "salary": 38000 }
{ "index": { "_id": "12" } }
{ "name": "Mcdonald", "age": 31, "job": "Java Programmer", "gender": "male", "salary": 32000 }
{ "index": { "_id": "13" } }
{ "name": "Jonthna", "age": 30, "job": "Java Programmer", "gender": "female", "salary": 30000 }
{ "index": { "_id": "14" } }
{ "name": "Marshall", "age": 32, "job": "Javascript Programmer", "gender": "male", "salary": 25000 }
{ "index": { "_id": "15" } }
{ "name": "King", "age": 33, "job": "Java Programmer", "gender": "male", "salary": 28000 }
{ "index": { "_id": "16" } }
{ "name": "Mccarthy", "age": 21, "job": "Javascript Programmer", "gender": "male", "salary": 16000 }
{ "index": { "_id": "17" } }
{ "name": "Goodwin", "age": 25, "job": "Javascript Programmer", "gender": "male", "salary": 16000 }
{ "index": { "_id": "18" } }
{ "name": "Catherine", "age": 29, "job": "Javascript Programmer", "gender": "female", "salary": 20000 }
{ "index": { "_id": "19" } }
{ "name": "Boone", "age": 30, "job": "DBA", "gender": "male", "salary": 30000 }
{ "index": { "_id": "20" } }
{ "name": "Kathy", "age": 29, "job": "DBA", "gender": "female", "salary": 20000 }

1.3.2.2 View the lowest salary

POST employees/_search
{
  "size": 0,
  "aggs": {
    "min_salary": {
      "min": { "field": "salary" }
    }
  }
}

1.3.2.3 View the highest salary

POST employees/_search
{
  "size": 0,
  "aggs": {
    "max_salary": {
      "max": { "field": "salary" }
    }
  }
}

1.3.2.4 Outputting multiple values from one request

The first way is to define several single-value aggregations in the same request:

POST employees/_search
{
  "size": 0,
  "aggs": {
    "max_salary": {
      "max": { "field": "salary" }
    },
    "min_salary": {
      "min": { "field": "salary" }
    },
    "avg_salary": {
      "avg": { "field": "salary" }
    }
  }
}

The second way is to use a single multi-value aggregation such as `stats`:

POST employees/_search
{
  "size": 0,
  "aggs": {
    "stats_salary": {
      "stats": { "field": "salary" }
    }
  }
}

1.4 `Bucket Aggregation`

A bucket aggregation assigns documents to different buckets according to a rule, which classifies them. ES provides several common bucket aggregations:
- Terms
- Numeric types
  - Range / Date Range
  - Histogram / Date Histogram
Nesting is supported: buckets can be built inside buckets.

1.4.1 `Terms Aggregation`

A field must have `doc_values` or `fielddata` available before a `Terms Aggregation` can run on it:
- `keyword` fields support it by default through `doc_values`
- `text` fields need `fielddata` enabled in the mapping; the buckets are then built from the analyzed tokens rather than from the whole value
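
A small Python sketch of why the two field types produce different buckets. It is an emulation only, and it assumes that lowercasing plus whitespace splitting is a fair stand-in for the analyzer on these particular job titles.

```python
from collections import Counter

# Job titles of four sample employees.
jobs = ["Java Programmer", "Java Programmer", "Javascript Programmer", "QA"]

# keyword field: the whole value is a single term, so each title is one bucket.
keyword_buckets = Counter(jobs)

# text field with fielddata: every analyzed token is a term.
# Assumption: lower() + split() approximates the analyzer for these values.
# set() makes each document count once per token, like doc_count does.
text_buckets = Counter(
    token for job in jobs for token in set(job.lower().split())
)

print(keyword_buckets)
print(text_buckets)
```

The `text` variant groups all three programmers under `programmer`, which is rarely what a "jobs" report wants; this is why the demos aggregate on `job.keyword`.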

1.4.2 `Terms Aggregation` `Demo`

1.4.2.1 Aggregating on `job` and `job.keyword`

# Terms aggregation on the keyword sub-field
POST employees/_search
{
  "size": 0,
  "aggs": {
    "jobs": {
      "terms": { "field": "job.keyword" }
    }
  }
}

# Terms aggregation on the text field; this fails until fielddata is enabled
POST employees/_search
{
  "size": 0,
  "aggs": {
    "jobs": {
      "terms": { "field": "job" }
    }
  }
}

To aggregate on a field of type `text`, enable `fielddata` in the mapping:

# Enable fielddata on the text field so that it supports terms aggregation
PUT employees/_mapping
{
  "properties": {
    "job": {
      "type": "text",
      "fielddata": true
    }
  }
}

# cardinality works like a distinct count
POST employees/_search
{
  "size": 0,
  "aggs": {
    "cardinate": {
      "cardinality": { "field": "job.keyword" }
    }
  }
}

1.4.2.2 `Terms` aggregation on gender

POST employees/_search
{
  "size": 0,
  "aggs": {
    "gender": {
      "terms": { "field": "gender" }
    }
  }
}

1.4.2.3 Specifying the `bucket size`

# Specify the bucket size
POST employees/_search
{
  "size": 0,
  "aggs": {
    "ages_5": {
      "terms": {
        "field": "age",
        "size": 3
      }
    }
  }
}

This returns only three buckets.

1.4.3 `Bucket Size` & `Top Hits` `Demo`

Use case: after bucketing, list the best-matching documents at the top of each bucket
Size: bucket by age and return only the specified number of buckets
Top Hits: list the three oldest employees in each job category

POST employees/_search
{
  "size": 0,
  "aggs": {
    "jobs": {
      "terms": { "field": "job.keyword" },
      "aggs": {
        "old_employee": {
          "top_hits": {
            "size": 3,
            "sort": [
              { "age": { "order": "desc" } }
            ]
          }
        }
      }
    }
  }
}
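
What `top_hits` does inside a `terms` bucket can be sketched in plain Python: group the documents, sort each group, and keep the first few. This is an emulation over five sample employees; the function name `top_hits_by_job` is made up for illustration.

```python
from collections import defaultdict

# Five employees from the sample data.
employees = [
    {"name": "Rose", "age": 25, "job": "QA"},
    {"name": "Lucy", "age": 31, "job": "QA"},
    {"name": "Byrd", "age": 27, "job": "QA"},
    {"name": "Boone", "age": 30, "job": "DBA"},
    {"name": "Kathy", "age": 29, "job": "DBA"},
]


def top_hits_by_job(docs, size):
    """Bucket documents by job, then keep the `size` oldest per bucket."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc["job"]].append(doc)
    return {
        job: sorted(members, key=lambda d: d["age"], reverse=True)[:size]
        for job, members in buckets.items()
    }


oldest = top_hits_by_job(employees, size=2)
for job, members in oldest.items():
    print(job, [m["name"] for m in members])
```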

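1.4.4 Optimizing `Terms` aggregation performance

When a field is aggregated very frequently, its `Terms` aggregation can be sped up with a preloading switch in the mapping, most likely the `eager_global_ordinals` parameter. With it enabled, the field's global ordinals are built when the index is refreshed rather than during the first aggregation, which can greatly improve performance.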

1.4.5 `Range` & `Histogram` aggregations

Both split documents into buckets by numeric range.
In a `Range Aggregation` you can define a custom `key` for each range.
Demo:
- Bucket salaries with `range`
- Bucket salaries at fixed intervals with `histogram`

# Salary ranges; the last range has a custom key
POST employees/_search
{
  "size": 0,
  "aggs": {
    "salary_range": {
      "range": {
        "field": "salary",
        "ranges": [
          { "to": 10000 },
          { "from": 10000, "to": 20000 },
          { "key": ">20000", "from": 20000 }
        ]
      }
    }
  }
}

# Salary histogram: one bucket per 5000
POST employees/_search
{
  "size": 0,
  "aggs": {
    "salary_histogram": {
      "histogram": {
        "field": "salary",
        "interval": 5000,
        "extended_bounds": {
          "min": 0,
          "max": 100000
        }
      }
    }
  }
}
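
The bucketing rules behind the two requests can be sketched in Python. In a `range` aggregation, `from` is inclusive and `to` is exclusive; a `histogram` rounds each value down to a multiple of the interval. The labels of the first two ranges are illustrative, and the empty buckets that `extended_bounds` would add are left out.

```python
from collections import Counter

# Five salaries from the sample data.
salaries = [9000, 16000, 20000, 22000, 35000]


def range_key(salary):
    """Mirror the three ranges of the request: from inclusive, to exclusive."""
    if salary < 10000:
        return "*-10000"       # illustrative label for { "to": 10000 }
    if salary < 20000:
        return "10000-20000"   # illustrative label for the middle range
    return ">20000"            # the custom key given in the request


def histogram_key(salary, interval=5000):
    """Round down to the start of the interval the value falls into."""
    return (salary // interval) * interval


range_buckets = Counter(range_key(s) for s in salaries)
histogram_buckets = Counter(histogram_key(s) for s in salaries)

print(range_buckets)
print(histogram_buckets)
```

A salary of exactly 20000 lands in `>20000`, not in the middle range, because the upper bound of a range is exclusive.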

1.5 `Bucket Aggregation` + `Metric Aggregation`

A bucket aggregation can be analyzed further by adding sub-aggregations, which can be:
- Bucket
- Metric
Demo
- Bucket by job type and compute salary statistics
- Bucket first by job type, then by gender, and compute salary statistics

1.5.1 Nested aggregation `Demo`

# Bucket by job, then compute salary statistics per job
POST employees/_search
{
  "size": 0,
  "aggs": {
    "job": {
      "terms": { "field": "job.keyword" },
      "aggs": {
        "salary": {
          "stats": { "field": "salary" }
        }
      }
    }
  }
}

# Multiple levels of nesting: job, then gender, then salary statistics
POST employees/_search
{
  "size": 0,
  "aggs": {
    "job": {
      "terms": { "field": "job.keyword" },
      "aggs": {
        "gender": {
          "terms": { "field": "gender" },
          "aggs": {
            "stat_salary": {
              "stats": { "field": "salary" }
            }
          }
        }
      }
    }
  }
}
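
The two-level nesting amounts to grouping twice and then computing a metric at the innermost level. A plain-Python emulation over five sample employees, with a hypothetical `stats` helper that mirrors the fields of the `stats` aggregation:

```python
from collections import defaultdict

# (job, gender, salary) for five employees from the sample data.
employees = [
    ("QA", "female", 18000),
    ("QA", "female", 25000),
    ("QA", "male", 20000),
    ("DBA", "male", 30000),
    ("DBA", "female", 20000),
]


def stats(values):
    """Return the same five numbers a `stats` aggregation reports."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
        "sum": sum(values),
    }


# Outer bucket: job. Inner bucket: gender. Metric: salary stats.
grouped = defaultdict(lambda: defaultdict(list))
for job, gender, salary in employees:
    grouped[job][gender].append(salary)

result = {
    job: {gender: stats(values) for gender, values in genders.items()}
    for job, genders in grouped.items()
}

print(result["QA"]["female"])
```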

2. `Pipeline` aggregation analysis (aggregating again)

In short, a pipeline aggregation runs a further aggregation over the results of an earlier aggregation.

2.1 Example: `Pipeline: min_bucket`

Among the job types with the most employees, find the one with the lowest average salary.

A pipeline aggregation is marked by the `buckets_path` keyword, which names the aggregation whose results it reads. Whenever you see `buckets_path`, the aggregation is a `Pipeline` aggregation.

2.2 Understanding the `Pipeline` concept

Pipeline concept: run an aggregation on the results of another aggregation.
The result of a `Pipeline` aggregation is written into the original response. Depending on where it is placed, pipelines fall into two categories:
- Sibling: the result sits at the same level as the existing aggregation results (the `min_bucket` example is of the Sibling type)
  - Max, Min, Avg & Sum Bucket
  - Stats, Extended Stats Bucket
  - Percentiles Bucket
- Parent: the result is embedded in the buckets of the existing aggregation
  - Derivative
  - Cumulative Sum
  - Moving Function (sliding window)
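
Where the two kinds of pipeline put their output can be sketched with a response-like dictionary. The shape below is simplified, the key `cumulative_count` is a made-up name, and the numbers are what three job types and three ages of the sample data would yield.

```python
# Simplified aggregation results for part of the sample data.
response = {
    "jobs": {"buckets": [
        {"key": "DBA", "avg_salary": 25000.0},
        {"key": "QA", "avg_salary": 21000.0},
        {"key": "Web Designer", "avg_salary": 20000.0},
    ]},
    "ages": {"buckets": [
        {"key": 25, "doc_count": 3},
        {"key": 26, "doc_count": 1},
        {"key": 27, "doc_count": 2},
    ]},
}

# Sibling (min_bucket): the result becomes a new entry next to "jobs".
lowest = min(response["jobs"]["buckets"], key=lambda b: b["avg_salary"])
response["min_salary_by_job"] = {
    "value": lowest["avg_salary"],
    "keys": [lowest["key"]],
}

# Parent (cumulative sum): the result is written into every bucket of "ages".
running_total = 0
for bucket in response["ages"]["buckets"]:
    running_total += bucket["doc_count"]
    bucket["cumulative_count"] = running_total

print(response["min_salary_by_job"])
print(response["ages"]["buckets"])
```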

2.3 Examples

Note: the following examples use the same sample data prepared above.

2.3.1 Find the job type with the lowest average salary

POST employees/_search
{
  "size": 0,
  "aggs": {
    "jobs": {
      "terms": {
        "field": "job.keyword",
        "size": 10
      },
      "aggs": {
        "avg_salary": {
          "avg": { "field": "salary" }
        }
      }
    },
    "min_salary_by_job": {
      "min_bucket": {
        "buckets_path": "jobs>avg_salary"
      }
    }
  }
}

2.3.2 Percentiles of the average salary

POST employees/_search
{
  "size": 0,
  "aggs": {
    "jobs": {
      "terms": {
        "field": "job.keyword",
        "size": 10
      },
      "aggs": {
        "avg_salary": {
          "avg": { "field": "salary" }
        }
      }
    },
    "percentiles_salary_by_job": {
      "percentiles_bucket": {
        "buckets_path": "jobs>avg_salary"
      }
    }
  }
}

2.3.3 Derivative of the average salary by age

POST employees/_search
{
  "size": 0,
  "aggs": {
    "age": {
      "histogram": {
        "field": "age",
        "min_doc_count": 1,
        "interval": 1
      },
      "aggs": {
        "avg_salary": {
          "avg": { "field": "salary" }
        },
        "derivative_avg_salary": {
          "derivative": {
            "buckets_path": "avg_salary"
          }
        }
      }
    }
  }
}
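
A rough Python emulation of the derivative request, over five sample employees aged 25 to 27. It assumes the derivative is the plain difference between the averages of consecutive age buckets, with no value for the first bucket.

```python
from collections import defaultdict

# (age, salary) for five employees from the sample data.
employees = [(25, 18000), (25, 18000), (26, 22000), (27, 20000), (27, 20000)]

# Histogram with interval 1: one bucket per age, holding the salaries.
salaries_by_age = defaultdict(list)
for age, salary in employees:
    salaries_by_age[age].append(salary)

# Sub-aggregation: average salary per age bucket, in key order.
buckets = [
    {"key": age, "avg_salary": sum(values) / len(values)}
    for age, values in sorted(salaries_by_age.items())
]

# Parent pipeline: each bucket gets the change from the previous bucket.
# Assumption: the first bucket has no derivative because nothing precedes it.
previous = None
for bucket in buckets:
    if previous is not None:
        bucket["derivative_avg_salary"] = bucket["avg_salary"] - previous
    previous = bucket["avg_salary"]

for bucket in buckets:
    print(bucket)
```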

3. Scope and sorting

What if we want to aggregate over the result set of a query?

That is a question of the aggregation's scope.

By default, ES runs an aggregation over the result set of the `query`.
ES also supports the following ways of changing that scope:
- Filter
- Post Filter
- Global

3.1 `Query` scope

# The aggregation only sees documents matched by the query
POST employees/_search
{
  "size": 0,
  "query": {
    "range": {
      "age": { "gte": 40 }
    }
  },
  "aggs": {
    "jobs": {
      "terms": { "field": "job.keyword" }
    }
  }
}

3.2 `Filter` scope

A `filter` narrows the scope of one specific aggregation inside `aggs` without affecting the others.

# Filter
POST employees/_search
{
  "size": 0,
  "aggs": {
    "old_person": {
      "filter": {
        "range": {
          "age": {
            "from": 35  
          }
        }
      },
      "aggs": {
        "jobs": {
          "terms": {
            "field": "job.keyword"
          }
        }
      }
    },
    "all_jobs": {
      "terms": {
        "field": "job.keyword"
      }
    }
  }
}
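
In plain Python, the request above behaves roughly like two independent counts over the same documents, only one of which applies the age condition. This is an emulation over four sample employees, not a call to Elasticsearch.

```python
from collections import Counter

# (name, age, job) for four employees from the sample data.
employees = [
    ("Underwood", 41, "Dev Manager"),
    ("Jenny", 36, "Java Programmer"),
    ("Tran", 25, "Web Designer"),
    ("Bryant", 20, "Java Programmer"),
]

# "all_jobs": no filter, so every document is counted.
all_jobs = Counter(job for _, _, job in employees)

# "old_person" > "jobs": the filter applies to this aggregation only.
old_person_jobs = Counter(job for _, age, job in employees if age >= 35)

print(all_jobs)
print(old_person_jobs)
```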

3.3 `Post Filter` scope

When the aggregation is done and you only want to display the hits that meet a condition, use `post_filter`. It filters the returned hits and leaves the aggregation results unchanged.

# post_filter: the aggregation still covers all job types; only the hits are filtered
POST employees/_search
{
  "aggs": {
    "jobs": {
      "terms": { "field": "job.keyword" }
    }
  },
  "post_filter": {
    "match": {
      "job.keyword": "Dev Manager"
    }
  }
}
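
The order of operations is the point of `post_filter`, and a two-step Python emulation over three sample employees makes it visible: the buckets are computed first, and the filter trims only the hits.

```python
from collections import Counter

# Three employees from the sample data.
employees = [
    {"name": "Underwood", "job": "Dev Manager"},
    {"name": "Rose", "job": "QA"},
    {"name": "Lucy", "job": "QA"},
]

# Step 1: the aggregation runs on everything the query matched (here: all).
job_buckets = Counter(e["job"] for e in employees)

# Step 2: post_filter is applied afterwards and affects the hits only.
hits = [e for e in employees if e["job"] == "Dev Manager"]

print(job_buckets)
print([h["name"] for h in hits])
```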

3.4 `Global` scope

`global` lets an aggregation ignore the restriction imposed by the `query`.

# global
POST employees/_search
{
    "size": 0,
    "query": {
        "range": {
            "age": {
                "gte": 40
            }
        }
    },
    "aggs": {
        "jobs": {
            "terms": {
                "field": "job.keyword"
            }
        },
        "all": {
            "global": {},
            "aggs": {
                "salary_avg": {
                    "avg": {
                        "field": "salary"
                    }
                }
            }
        }
    }
}
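
A Python emulation of the request above over three sample employees: `jobs` only sees the documents the query matched, while the average under `global` is computed over every document in the index.

```python
from collections import Counter

# (name, age, job, salary) for three employees from the sample data.
employees = [
    ("Underwood", 41, "Dev Manager", 50000),
    ("Emma", 32, "Product Manager", 35000),
    ("Kathy", 29, "DBA", 20000),
]

# The query restricts the search to age >= 40.
matched = [e for e in employees if e[1] >= 40]

# "jobs" follows the query scope.
jobs = Counter(job for _, _, job, _ in matched)

# "all" > "salary_avg" ignores the query and uses every document.
salary_avg_all = sum(salary for _, _, _, salary in employees) / len(employees)

print(jobs)
print(salary_avg_all)
```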

3.5 Sorting

# Sort buckets by document count (ascending), then by key (descending)
POST employees/_search
{
  "size": 0,
  "query": {
    "range": {
      "age": { "gte": 20 }
    }
  },
  "aggs": {
    "jobs": {
      "terms": {
        "field": "job.keyword",
        "order": [
          { "_count": "asc" },
          { "_key": "desc" }
        ]
      }
    }
  }
}

POST employees/_search
{
    "size": 0,
    "aggs": {
        "jobs": {
            "terms": {
                "field": "job.keyword",
                "order": [
                    {
                        "avg_salary": "desc"
                    }
                ]
            },
            "aggs": {
                "avg_salary": {
                    "avg": {
                        "field": "salary"
                    }
                }
            }
        }
    }
}
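
Both ordering styles can be emulated in Python on three buckets whose counts and averages match the sample data. The first sort applies `_count` ascending with `_key` descending as the tie-breaker; the second orders by the value of the `avg_salary` sub-aggregation.

```python
# Buckets for three job types of the sample data.
buckets = [
    {"key": "DBA", "doc_count": 2, "avg_salary": 25000.0},
    {"key": "QA", "doc_count": 3, "avg_salary": 21000.0},
    {"key": "Web Designer", "doc_count": 2, "avg_salary": 20000.0},
]

# order: [{"_count": "asc"}, {"_key": "desc"}]
# Python's sort is stable, so sort by the secondary criterion first.
by_key_desc = sorted(buckets, key=lambda b: b["key"], reverse=True)
by_count_then_key = sorted(by_key_desc, key=lambda b: b["doc_count"])

# order: [{"avg_salary": "desc"}], i.e. by a sub-aggregation's value.
by_avg_salary = sorted(buckets, key=lambda b: b["avg_salary"], reverse=True)

print([b["key"] for b in by_count_then_key])
print([b["key"] for b in by_avg_salary])
```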
