Aggregation operations

Gets an aggregate action instance of a database collection

db.collection('scores').aggregate()
Copy the code

addFields

Aggregation phase. Adds a new field to the output record. After the addFields aggregation phase, all records output will have fields specified by addFields in addition to those entered with them.

API specification

AddFields is equivalent to the Project phase that specifies both existing and new fields.

addFieldsThe form is as follows:

addFields({  <The new fields>: <expression>})
Copy the code

AddFields specifies multiple new fields, the value of which is determined by the expression used.

If the specified new field has the same name as the original field, the value of the new field overrides the value of the original field. Note that addFields cannot be used to add elements to array fields.

Example 1: addFields twice in a row

Assume that set scores has the following records:

{
  _id: 1,
  student: "Maya",
  homework: [ 10.5.10 ],
  quiz: [ 10.8 ],
  extraCredit: 0
}
{
  _id: 2,
  student: "Ryan",
  homework: [ 5.6.5 ],
  quiz: [ 8.8 ],
  extraCredit: 8
}
Copy the code

AddFields is applied twice, the first time adding two fields as the sum of homework and quiz, the second time adding a field and then calculating the sum based on the previous two sums.

const $ = db.command.aggregate
db.collection('scores').aggregate()
  .addFields({
    totalHomework: $.sum('$homework'),
    totalQuiz: $.sum('$quiz')
  })
  .addFields({
    totalScore: $.add(['$totalHomework'.'$totalQuiz'.'$extraCredit'])}).end(a)Copy the code

The result is as follows:

{
  "_id" : 1."student" : "Maya"."homework" : [ 10.5.10]."quiz" : [ 10.8]."extraCredit" : 0."totalHomework" : 25."totalQuiz" : 18."totalScore" : 43
}
{
  "_id" : 2."student" : "Ryan"."homework" : [ 5.6.5]."quiz" : [ 8.8]."extraCredit" : 8."totalHomework" : 16."totalQuiz" : 16."totalScore" : 40
}
Copy the code

Example 2: Adding fields to a nested record

You can add fields to a nested record using dot notation. Suppose the vehicles collection contains the following records:

{ _id: 1, type: "car", specs: { doors: 4, wheels: 4 } }
{ _id: 2, type: "motorcycle", specs: { doors: 0, wheels: 2 } }
{ _id: 3, type: "jet ski" }
Copy the code

A new field, fuel_type, can be added to the specs field as follows, both set to the fixed string unleaded:

db.collection('vehicles').aggregate()
  .addFields({
    'spec.fuel_type': 'unleaded'}).end(a)Copy the code

The result is as follows:

{ _id: 1, type: "car",
   specs: { doors: 4, wheels: 4, fuel_type: "unleaded" } }
{ _id: 2, type: "motorcycle",
   specs: { doors: 0, wheels: 2, fuel_type: "unleaded" } }
{ _id: 3, type: "jet ski",
   specs: { fuel_type: "unleaded"}}Copy the code

Example 3: Set the field value to another field

You can set the value of a field to the value of another field by expressing the value as a string of $plus the field name.

Using the same collection example, you could add a field vehicle_type with the following operation, setting its value to the value of the Type field:

db.collection('vehicles').aggregate()
  .addFields({
    vehicle_type: '$type'}).end(a)Copy the code

The result is as follows:

{ _id: 1, type: "car", vehicle_type: "car",
   specs: { doors: 4, wheels: 4, fuel_type: "unleaded" } }
{ _id: 2, type: "motorcycle", vehicle_type: "motorcycle",
   specs: { doors: 0, wheels: 2, fuel_type: "unleaded" } }
{ _id: 3, type: "jet ski", vehicle_type: "jet ski",
   specs: { fuel_type: "unleaded"}}Copy the code

bucket

Aggregation phase. The input records are divided into different groups, each of which is a bucket, based on given conditions and boundaries.

API specification

Each group is output as a record, containing an _ID field with a lower bound and a count field with a value of the number of records in the group. Count is output by default when output is not specified.

A bucket is output only when there is at least one record in the group.

A bucket is of the following form:

bucket({
  groupBy: <expression>,
  boundaries: [<lowerbound1>.<lowerbound2>. ] .default: <literal>,
  output: {
    <output1>: <accumulator expr>.<outputN>: <accumulator expr>}})Copy the code

GroupBy is a grouping expression that is applied to each input record. You can use the $prefix plus the path of the field you want to group as an expression. Unless default values are specified with default, each record must contain the specified field and the field values must be within the boundaries specified.

Boundaries are an array where each element is the lower boundary of each group. At least two boundary values must be specified. Array values must be incremented of the same type.

Default Optional. After this parameter is specified, records that do not enter any group will enter a default group. The id of the group is determined by default. The value of default must be less than the minimum value in boundaries or greater than or equal to the maximum value. The value of default can be different from the value type of the boundaries element.

Output is optional. It determines which fields to include in the output record in addition to _id. The values of each field must be specified by an accumulator expression. When output is specified, the default count is not output by default and must be specified manually:

output: {  count: $.sum(1),...<outputN>: <accumulator expr>}
Copy the code

To use a bucket, at least one of the following conditions must be met, otherwise an error will be thrown:

Each input record applied to the groupBy expression must be a value within boundaries

Specifies a default value that is outside boundaries, or a different type from the value of the boundaries element.

The sample

Suppose the collection items have the following record:

{
  _id: "1",
  price: 10
}
{
  _id: "2",
  price: 50
}
{
  _id: "3",
  price: 20
}
{
  _id: "4",
  price: 80
}
{
  _id: "5",
  price: 200
}
Copy the code

Group the above records, divide [0, 50) into a group, [50, 100) into a group, and other records into a group:

const $ = db.command.aggregate
db.collection('items').aggregate()
  .bucket({
    groupBy: '$price',
    boundaries: [0.50.100].default: 'other',
    output: {
      count: $.sum(),
      ids: $.push('$_id')}}).end(a)Copy the code

The result is as follows:

[{"_id": 0."count": 2."ids": [
      "1"."3"] {},"_id": 50."count": 2."ids": [
      "2"."4"] {},"_id": "other"."count": 22."ids": [
      "5"]}]Copy the code

bucketAuto

Aggregation phase. Divide the input records into different groups, one bucket per group, based on the given criteria. One of the differences with buckets is that no boundaries are assigned; bucketAuto automatically tries to split the records as evenly as possible between each group.

Each group is output as a record, which contains an _ID field based on the object containing the maximum and minimum values of the group and a count field based on the number of records in the group. Count is output by default when output is not specified.

The form of bucketAuto is as follows:

bucketAuto({
  groupBy: <expression>,
  buckets: <number>,
  granularity: <string>,
  output: {
    <output1>: <accumulator expr>.<outputN>: <accumulator expr>}})Copy the code

GroupBy is a grouping expression that is applied to each input record. You can use the $prefix plus the path of the field you want to group as an expression. Unless default values are specified with default, each record must contain the specified field and the field values must be within the boundaries specified.

Buckets is a positive integer used to specify the number of partitions.

Granularity is a string of optional enumerated values used to ensure that automatically calculated boundaries conform to a given rule. This field can only be used if all groupBy values are numeric and there is no NaN. Enumeration values include R5, R10, R20, R40, R80, 1-2-5, E6, E12, E24, E48, E96, E192, and POWERSOF2.

Output is optional. It determines which fields to include in the output record in addition to _id. The values of each field must be specified by an accumulator expression. When output is specified, the default count is not output by default and must be specified manually:

output: {  count: $.sum(1),...<outputN>: <accumulator expr>}
Copy the code

The number of output groups may be smaller than the given number of groups in the following cases:

The number of input records is less than the number of packets

  • groupByThe calculated unique value is less than the number of groups
  • granularityThe spacing is less than the number of packets
  • granularityNot refined enough to divide evenly among groups

Granularity

Granularity is used to ensure that boundary values belong to a given sequence of numbers.

Renard sequence

The Renard sequence is a sequence of numbers between 1.0 and 10.0 (or 10.3 in the case of R80) derived from the 5/10/20/40/80th root of 10.

Granularity is set to R5 / R10 / R20 / R40 / R80 to limit the granularity to the sequence. If groupBy is not in the range of 1.0 to 10.0 (10.3 for R80), the sequence number is automatically multiplied by 10.

E sequence

The E sequence is a sequence of numbers between 1.0 and 10.0 derived from 10 to the 6/12/24/48/96/192 with a specific error.

The 1-2-5 series

The 1-2-5 sequence performed the same as the three-valued Renard sequence.

Two to the power

A sequential number consisting of the powers of 2.

The sample

Suppose the collection items have the following record:

{
  _id: "1",
  price: 10.5
}
{
  _id: "2",
  price: 50.3
}
{
  _id: "3",
  price: 20.8
}
{
  _id: "4",
  price: 80.2
}
{
  _id: "5",
  price: 200.3
}
Copy the code

The above records are automatically grouped into three groups:

const $ = db.command.aggregate
db.collection('items').aggregate()
  .bucket({
    groupBy: '$price',
    buckets: 3,}).end(a)Copy the code

The result is as follows:

{
  "_id": {
    "min": 10.5."max": 50.3
  },
  "count": 2
}
{
  "_id": {
    "min": 50.3."max": 200.3
  },
  "count": 2
}
{
  "_id": {
    "min": 200.3."max": 200.3
  },
  "count": 1
}
Copy the code

count

Aggregation phase. Computes the number of records entered in the previous aggregation phase to this phase and outputs a record where the value of the specified field is the number of records.

API specification

Count is of the following form:

count(<string>)
Copy the code

Is the name of the field that outputs the number of records. It cannot be an empty string, start with a $, and contain a. Characters.

The count phase is equivalent to the group + project operation:

const $ = db.command.aggregate
db.collection('items').aggregate()
  .group({
    _id: null,
    count: $.sum(1),
  })
  .project({
    _id: 0,}).end(a)Copy the code

The above operation outputs a record containing the count field.

The sample

Suppose the collection items have the following record:

{
  _id: "1",
  price: 10.5
}
{
  _id: "2",
  price: 50.3
}
{
  _id: "3",
  price: 20.8
}
{
  _id: "4",
  price: 80.2
}
{
  _id: "5",
  price: 200.3
}
Copy the code

Find the number of records where the price is greater than 50:

const $ = db.command.aggregate
db.collection('items').aggregate()
  .match({
    price: $.gt(50)}).count('expensiveCount').end(a)Copy the code

The result is as follows:

{  "expensiveCount": 3}
Copy the code

group

Aggregation phase. The input records are grouped by a given expression, each record represents a group when output, and the _id of each record is the key that distinguishes the different groups. Accumulative values can also be included in the output record, and setting the output field to the accumulative value computes the accumulative value from the group.

API specification

The form of group is as follows:

group({
  _id: <expression>.<field1>: <accumulator1>.<fieldN>: <accumulatorN>
})
Copy the code

The _id parameter is mandatory. If constant is specified, there is only one set. The other fields are optional and are aggregate values, using a $. Sum or other aggregator (const $= db.mand. Aggregate), but other expressions can also be used.

The accumulator must be one of the following operators:

See the accumulator operator for details

The operator instructions
addToSet Adds a value to an array without doing anything if the value already exists in the array
avg Returns the average value of data for a specified field in a set of collections
sum Computes and returns the sum of all values of a set of fields
first Returns the value of the first record of a specified field in a set of collections. This operation makes sense only if the set is sorted by some definition.
last Returns the value of the last record of a specified field in a set of collections. This operation makes sense only if the set is sorted by some definition.
max Returns the maximum value of a set of values
min Returns the minimum value of a set of values
push In the group phase, returns an array of the columns specified by the expression and their corresponding values
stdDevPop Returns the standard deviation of the values of a set of fields
stdDevSamp Calculate the sample standard deviation of the input values. If the input value on behalf of the general data, or no more data, please use db.com mand. Aggregate. StdDevPop
mergeObjects Combine multiple documents into a single document

Memory limit

This phase has a 100M memory usage limit.

Example 1: Group by field value

Assume that the set Avatar has the following records:

{
  _id: "1",
  alias: "john",
  region: "asia",
  scores: [40.20.80],
  coins: 100
}
{
  _id: "2",
  alias: "arthur",
  region: "europe",
  scores: [60.90],
  coins: 20
}
{
  _id: "3",
  alias: "george",
  region: "europe",
  scores: [50.70.90],
  coins: 50
}
{
  _id: "4",
  alias: "john",
  region: "asia",
  scores: [30.60.100.90],
  coins: 40
}
{
  _id: "5",
  alias: "george",
  region: "europe",
  scores: [20],
  coins: 60
}
{
  _id: "6",
  alias: "john",
  region: "asia",
  scores: [40.80.70],
  coins: 120
}
Copy the code
const $ = db.command.aggregate
db.collection('avatar').aggregate()
  .group({
    _id: '$alias',
    num: $.sum(1)}).end(a)Copy the code

The result is as follows:

{
  "_id": "john"."num": 3
}
{
  "_id": "authur"."num": 1
}
{
  "_id": "george"."num": 2
}
Copy the code

Example 2: Grouping by multiple values

The way _id is passed records can be grouped by multiple values. According to the above sample data, groups were divided according to regions with the same highest score, and the total amount of coins in each group was calculated:

const $ = db.command.aggregate
db.collection('avatar').aggregate()
  .group({
    _id: {
      region: '$region',
      maxScore: $.max('$scores')
    },
    totalCoins: $.sum('$coins')}).end(a)Copy the code

The result is as follows:

{
  "_id": {
    "region": "asia"."maxScore": 80
  },
  "totalCoins": 220
}
{
  "_id": {
    "region": "asia"."maxScore": 100
  },
  "totalCoins": 100
}
{
  "_id": {
    "region": "europe"."maxScore": 90
  },
  "totalCoins": 70
}
{
  "_id": {
    "region": "europe"."maxScore": 20
  },
  "totalCoins": 60
}
Copy the code

limit

Aggregation phase. Limits the number of records output to the next stage.

The sample

Suppose the collection items have the following record:

{
  _id: "1",
  price: 10
}
{
  _id: "2",
  price: 50
}
{
  _id: "3",
  price: 20
}
{
  _id: "4",
  price: 80
}
{
  _id: "5",
  price: 200
}
Copy the code

Returns the two smallest records with a price greater than 20:

const $ = db.command.aggregate
db.collection('items').aggregate()
  .match({
    price: $.gt(20)
  })
  .sort({
    price: 1,
  })
  .limit(2).end(a)Copy the code

The result is as follows:

{
  "_id": "3"."price": 20
}
{
  "_id": "4"."price": 80
}
Copy the code

lookup

Aggregation phase. Table query. Left outer JOIN left outer join left outer join left outer join left outer join left outer join For each input record in this phase, the lookup adds an array field to the record that is the list of records in the linked table that meet the matching criteria. The LOOKUP will output the result of the connection to the next stage.

API specification

Lookup can be used in two ways

Equal match

The following definition is used to equal match a field of the input record to a field of the joined collection:

lookup({
  from: <The name of the collection to connect to>,
  localField: <The field of the input record to be matched for equality>,
  foreignField: <The field of the connected collection for equality matching>.as: <The name of the array field output>
})
Copy the code

Parameter Description

Parameter field instructions
from The name of the other collection to connect to
localField The field name of the input record of the current pipeline that will be used for equal matching with the foreignField of the collection specified from. If the field is not present in the input record, the value of the field will be treated as null in a match
foreignField The name of the field of the connected collection that will be used for equal matching with localField. If the field does not exist in the records of the connected collection, the value of the field will be treated as NULL for matching
as Specifies the field names to store in the list of records matched by the join. This array contains matched records from the FROM collection. If the field is already in the input record, it will be overwritten

This operation is equivalent to the following pseudo-SQL operation:

SELECT *.<output array field>
FROM collection
WHERE <output array field> IN (SELECT *
                               FROM <collection to join>
                               WHERE <foreignField> = <collection.localField>);
Copy the code

Example:

  • Specify an equality matching condition
  • Apply equality matching to array fields
  • Combine mergeObjects to apply equality matching

Customize connection conditions and splice subqueries

This usage is not currently supported by Aliyun

If you need to specify join conditions other than equality matching, or specify multiple equality matching conditions, or you need to concatenate the results of a subquery of the joined set, you can use the following definition:

lookup({
  from: <The name of the collection to connect to>,
  let: { <variable1>: <expression1>. .<The variable n>: <The expression of n> },
  pipeline: [ <Pipelined operations on collections to be joined>].as: <The name of the array field output>
})
Copy the code

Parameter Description

Parameter field instructions
from The name of the other collection to connect to
let Optional. Specify a variable that can be used in pipeline. The value of the variable can refer to the input record field. For example, let: {userName: ‘$name’} represents the name field of the input record as the value of the variable userName. Fields of input records cannot be accessed directly in pipeline and must be accessed through let definitions in the expR operator
Variable name, for example Variable name, for example
The userName.
pipeline Specifies the aggregation operation to run in the connected collection. If the entire collection is returned, the field takes the value of an empty array []. Fields of input records cannot be accessed directly in pipeline and must be accessed through let definitions in the expR operator
Variable name, for example Variable name, for example
The userName.
as Specifies the field names to store in the list of records matched by the join. This array contains matched records from the FROM collection. If the field is already in the input record, it will be overwritten

This operation is equivalent to the following pseudo-SQL statement:

SELECT *.<output array field>
FROM collection
WHERE <output array field> IN (SELECT <documents as determined from the pipeline>
                               FROM <collection to join>
                               WHERE <pipeline> );
Copy the code

example

  • Specify multiple connection conditions
  • Concatenates subqueries of the joined collection

The sample

Specify an equality matching condition

Suppose the Orders collection has the following records:

[{"_id":4."book":"novel 1"."price":30."quantity":2},
  {"_id":5."book":"science 1"."price":20."quantity":1},
  {"_id":6}]Copy the code

The Books collection has the following records:

[{"_id":"book1"."author":"author 1"."category":"novel"."stock":10."time":1564456048486."title":"novel 1"},
  {"_id":"book3"."author":"author 3"."category":"science"."stock":30."title":"science 1"},
  {"_id":"book4"."author":"author 3"."category":"science"."stock":40."title":"science 2"},
  {"_id":"book2"."author":"author 2"."category":"novel"."stock":20."title":"novel 2"},
  {"_id":"book5"."author":"author 4"."category":"science"."stock":50."title":null},
  {"_id":"book6"."author":"author 5"."category":"novel"."stock":"60"}]Copy the code

The following aggregation operation can join the Orders and books collections with an equal match condition that matches the book field of the Orders collection and the title field of the books collection:

const db = cloud.database()
db.collection('orders').aggregate()
  .lookup({
    from: 'books',
    localField: 'book',
    foreignField: 'title'.as: 'bookList',}).end().then(res => console.log(res))
  .catch(err => console.error(err))
Copy the code

Results:

[{"_id": 4."book": "novel 1"."price": 30."quantity": 2."bookList": [{"_id": "book1"."title": "novel 1"."author": "author 1"."category": "novel"."stock": 10}]}, {"_id": 5."book": "science 1"."price": 20."quantity": 1."bookList": [{"_id": "book3"."category": "science"."title": "science 1"."author": "author 3"."stock": 30}]}, {"_id": 6."bookList": [{"_id": "book5"."category": "science"."author": "author 4"."stock": 50."title": null
      },
      {
        "_id": "book6"."author": "author 5"."stock": "60"."category": "novel"}}]]Copy the code

Applying equality matching to array fields assumes that the authors collection has the following records:

[{"_id": 1."name": "author 1"."intro": "Two-time best-selling sci-fiction novelist"},
  {"_id": 3."name": "author 3"."intro": "UCB assistant professor"},
  {"_id": 4."name": "author 4"."intro": "major in CS"}]Copy the code

The Books collection has the following records:

[{"_id":"book1"."authors": ["author 1"]."category":"novel"."stock":10."time":1564456048486."title":"novel 1"},
  {"_id":"book3"."authors": ["author 3"."author 4"]."category":"science"."stock":30."title":"science 1"},
  {"_id":"book4"."authors": ["author 3"]."category":"science"."stock":40."title":"science 2"}]Copy the code

The following operation gets authors and their respective published books, using the lookup operation to match the name field of the authors collection and the authors array field of the books collection:

const db = cloud.database()
db.collection('authors').aggregate()
  .lookup({
    from: 'books',
    localField: 'name',
    foreignField: 'authors'.as: 'publishedBooks',}).end().then(res => console.log(res))
  .catch(err => console.error(err))
Copy the code

The results of

[{"_id": 1."intro": "Two-time best-selling sci-fiction novelist"."name": "author 1"."publishedBooks": [{"_id": "book1"."title": "novel 1"."category": "novel"."stock": 10."authors": [
          "author 1"]}]}, {"_id": 3."name": "author 3"."intro": "UCB assistant professor"."publishedBooks": [{"_id": "book3"."category": "science"."title": "science 1"."stock": 30."authors": [
          "author 3"."author 4"] {},"_id": "book4"."title": "science 2"."category": "science"."stock": 40."authors": [
          "author 3"]}]}, {"_id": 4."intro": "major in CS"."name": "author 4"."publishedBooks": [{"_id": "book3"."category": "science"."title": "science 1"."stock": 30."authors": [
          "author 3"."author 4"]}]}Copy the code

Combine mergeObjects to apply equality matching

Suppose the Orders collection has the following records:

[{"_id":4."book":"novel 1"."price":30."quantity":2},
  {"_id":5."book":"science 1"."price":20."quantity":1},
  {"_id":6}]Copy the code

The Books collection has the following records:

[{"_id":"book1"."author":"author 1"."category":"novel"."stock":10."time":1564456048486."title":"novel 1"},
  {"_id":"book3"."author":"author 3"."category":"science"."stock":30."title":"science 1"},
  {"_id":"book4"."author":"author 3"."category":"science"."stock":40."title":"science 2"},
  {"_id":"book2"."author":"author 2"."category":"novel"."stock":20."title":"novel 2"},
  {"_id":"book5"."author":"author 4"."category":"science"."stock":50."title":null},
  {"_id":"book6"."author":"author 5"."category":"novel"."stock":"60"}]Copy the code

The following operations match the book field of Orders and the title field of books, and merge the books match result directly into the Orders record.

var db = cloud.database()
var $ = db.command.aggregate
db.collection('orders').aggregate()
  .lookup({
    from: "books",
    localField: "book",
    foreignField: "title",
    as: "bookList"
  })
  .replaceRoot({
    newRoot: $.mergeObjects([ $.arrayElemAt(['$bookList'.0]), '$$ROOT' ])
  })
  .project({
    bookList: 0}).end().then(res => console.log(res))
  .catch(err => console.error(err))
Copy the code

The results of

[{"_id": 4."title": "novel 1"."author": "author 1"."category": "novel"."stock": 10."book": "novel 1"."price": 30."quantity": 2
  },
  {
    "_id": 5."category": "science"."title": "science 1"."author": "author 3"."stock": 30."book": "science 1"."price": 20."quantity": 1
  },
  {
    "_id": 6."category": "science"."author": "author 4"."stock": 50."title": null}]Copy the code

Specify multiple connection conditions

Suppose the Orders collection has the following records:

[{"_id":4."book":"novel 1"."price":300."quantity":20},
  {"_id":5."book":"science 1"."price":20."quantity":1}]Copy the code

The Books collection has the following records:

[{"_id":"book1"."author":"author 1"."category":"novel"."stock":10."time":1564456048486."title":"novel 1"},
  {"_id":"book3"."author":"author 3"."category":"science"."stock":30."title":"science 1"}]Copy the code

The following operation joins the Orders and books collections and requires two conditions:

  • The book field of Orders is equal to the title field of books
  • The quantity field of Orders is greater than or equal to the stock field of books
const db = cloud.database()
const $ = db.command.aggregate
db.collection('orders').aggregate()
.lookup({
  from: 'books',
  let: {
    order_book: '$book',
    order_quantity: '$quantity'
  },
  pipeline: $.pipeline()
    .match(_.expr($.and([
      $.eq(['$title'.'$$order_book']),
      $.gte(['$stock'.'$$order_quantity'])
    ])))
    .project({
      _id: 0,
      title: 1,
      author: 1,
      stock: 1
    })
    .done(),
  as: 'bookList',}).end().then(res => console.log(res))
.catch(err => console.error(err))
Copy the code

Results:

[{"_id": 4."book": "novel 1"."price": 300."quantity": 20."bookList": []}, {"_id": 5."book": "science 1"."price": 20."quantity": 1."bookList": [{"title": "science 1"."author": "author 3"."stock": 30}}]]Copy the code

Concatenates subqueries of the joined collection

Suppose the Orders collection has the following records:

[{"_id":4."book":"novel 1"."price":30."quantity":2},
  {"_id":5."book":"science 1"."price":20."quantity":1}]Copy the code

The Books collection has the following records:

[{"_id":"book1"."author":"author 1"."category":"novel"."stock":10."time":1564456048486."title":"novel 1"},
  {"_id":"book3"."author":"author 3"."category":"science"."stock":30."title":"science 1"},
  {"_id":"book4"."author":"author 3"."category":"science"."stock":40."title":"science 2"}]Copy the code

Append to each output record an array field whose value is the result of a query statement on the books collection:

const db = cloud.database()
const $ = db.command.aggregate
db.collection('orders').aggregate()
  .lookup({
    from: 'books',
    let: {
      order_book: '$book',
      order_quantity: '$quantity'
    },
    pipeline: $.pipeline()
      .match({
        author: 'author 3'
      })
      .project({
        _id: 0,
        title: 1,
        author: 1,
        stock: 1
      })
      .done(),
    as: 'bookList',}).end().then(res => console.log(res))
  .catch(err => console.error(err))
Copy the code

The results of

[{"_id": 4."book": "novel 1"."price": 30."quantity": 20."bookList": [{"title": "science 1"."author": "author 3"."stock": 30
      },
      {
        "title": "science 2"."author": "author 3"."stock": 40}]}, {"_id": 5."book": "science 1"."price": 20."quantity": 1."bookList": [{"title": "science 1"."author": "author 3"."stock": 30
      },
      {
        "title": "science 2"."author": "author 3"."stock": 40}}]]Copy the code

match

Aggregation phase. Documents are filtered based on criteria, and those that meet the criteria are passed to the next pipeline phase.

API specification

The form of match is as follows:

match(<Query conditions>)
Copy the code

The query conditions are the same as those for common queries. You can use common query operators. The match phase is different from other aggregation phases.

//Use strings directlymatch({
  name: 'Tony Stark'
})
Copy the code
//Use the operator const _= db.command
match({
  age: _.gt(18)})Copy the code

The sample

Suppose that the collection articles has the following records:

{ "_id" : "1"."author" : "stark"."score" : 80 }
{ "_id" : "2"."author" : "stark"."score" : 85 }
{ "_id" : "3"."author" : "bob"."score" : 60 }
{ "_id" : "4"."author" : "li"."score" : 55 }
{ "_id" : "5"."author" : "jimmy"."score" : 60 }
{ "_id" : "6"."author" : "li"."score" : 94 }
{ "_id" : "Seven"."author" : "justan"."score" : 95 }
Copy the code

matching

Here is an example of a direct match:

db.collection('articles')
  .aggregate()
  .match({
    author: 'stark'}).end(a)Copy the code

The code here tries to find all articles whose author fields are Stark, so the match is as follows:

{ "_id" : "1"."author" : "stark"."score" : 80 }
{ "_id" : "2"."author" : "stark"."score" : 85 }
Copy the code

count

Once match filters out documents, it can also be used with other pipelining phases.

For example, we use group to count the number of documents whose score field is greater than 80:

const _ = db.command
const $ = _.aggregate
db.collection('articles')
  .aggregate()
  .match({
    score: _.gt(80)}).group({
      _id: null,
      count: $.sum(1)}).end(a)Copy the code

The return value is as follows:

{ "_id" : null."count" : 3 }
Copy the code

project

Aggregation phase. Passes the specified field, either an existing field or a calculated new field, to the next pipeline.

API specification

The form of project is as follows:

project({  <expression>})
Copy the code

Expressions can have the following formats:

format instructions
< field >: <1 or true> Specifies to include an existing field
_id: < > 0 or false Discard the _ID field
< field >: < expression > Add a new field or reset an existing field
< field >: <0 or false> Discard a field (if you specify that a non-_ID field is discarded, you cannot use other expressions in this project)

Specify include fields

The _id field is included in the output by default. Any other fields that you want to include in the output must be specified in the project. If you specify to include a field that does not yet exist, the project ignores the field and does not add it to the output document;

Specify excluded fields

If you specify to exclude a field in a project, all other fields will be included in the output. If a non-_ID field is specified to exclude, no other expressions can be used in this project.

Add a new field or reset an existing field

You can use special expressions to add new fields or reset an existing field.

Multiple layers of nested fields

Sometimes some fields are at the bottom of multiple layers of nesting, we can use dot notation:

"contact.phone.number": <1 or 0 orexpression>
Copy the code

You can also use the nested format directly:

contact: { phone: { number: <1 or 0 orexpression>}}Copy the code

The sample

Suppose we have a articles collection that contains the following documents:

{
    "_id": Awesome!."title": "This is title"."author": "Nobody"."isbn": "123456789"."introduction": "..."
}
Copy the code

Specifies to include certain fields

The following code uses project to make the output contain only the _ID, title, and author fields:

db.collection('articles')
  .aggregate()
  .project({
    title: 1,
    author: 1}).end(a)Copy the code

The output is as follows:

{ "_id" : Awesome!."title" : "This is title"."author" : "Nobody" }
Copy the code

Remove the _id field from the output

The _id is included in the output by default. If you don’t want it, you can specify to remove it:

db.collection('articles')
  .aggregate()
  .project({
    _id: 0.//Specify to remove _id field title:1,
    author: 1}).end(a)Copy the code

The output is as follows:

{ "title" : "This is title"."author" : "Nobody" }
Copy the code

Remove a non-_ID field

We can also specify that a non-_id field is removed from the output so that all other fields are printed:

db.collection('articles')
  .aggregate()
  .project({
    isbn: 0.//Specify to remove the ISBN field}).end(a)Copy the code

The output is as follows, without the ISBN field compared to the input:

{
    "_id" : Awesome!."title" : "This is title"."author" : "Nobody"."introduction": "..."
}
Copy the code

Add the computed new field

Suppose we have a students collection that contains the following documents:

{
    "_id": 1."name": "Xiao Ming"."scores": {
        "chinese": 80."math": 90."english": 70}}Copy the code

In the following code, we add a new field, totalScore, to the output using project:

const { sum } = db.command.aggregate
db.collection('students')
  .aggregate()
  .project({
    _id: 0,
    name: 1,
    totalScore: sum([
        "$scores.chinese",
        "$scores.math",
        "$scores.english"
    ])
  })
  .end(a)Copy the code

The output is:

{ "name": "Xiao Ming"."totalScore": 240 }
Copy the code

Adds new array fields

Suppose we have a points collection containing the following documents:

{ "_id": 1."x": 1."y": 1 }
{ "_id": 2."x": 2."y": 2 }
{ "_id": 3."x": 3."y": 3 }
Copy the code

In the following code, we use project to put the x and y fields into a new array field coordinate:

db.collection('points')
  .aggregate()
  .project({
    coordinate: ["$x", "$y"]
  })
  .end(a)Copy the code

The output is as follows:

{ "_id": 1."coordinate": [1.1]} {"_id": 2."coordinate": [2.2]} {"_id": 3."coordinate": [3.3]}Copy the code

replaceRoot

Aggregation phase. Specify an existing field as the root node for the output, or you can specify a computed new field as the root node.

API specification

ReplaceRoot is used as follows:

replaceRoot({    newRoot: <expression>})
Copy the code

The expression format is as follows:

format instructions
< field name > Specify an existing field as the root node for output (error if field does not exist)
The < object > Compute a new field and use the new field as the root node

The sample

Use the existing field as the root node

Suppose we have a collection of schools that looks like this:

{
  "_id": 1."name": "SFLS"."teachers": {
    "chinese": 22."math": 18."english": 21."other": 123}}Copy the code

The following code uses replaceRoot to output the teachers field as the root node:

db.collection('schools')
  .aggregate()
  .replaceRoot({
    newRoot: '$teachers'}).end(a)Copy the code

The output is as follows:

{
  "chinese": 22."math": 18."english": 21."other": 123
}
Copy the code

Use the computed new field as the root node

Suppose we have a set of roles that looks like this:

{ "_id": 1."first_name": "Four,"."last_name": "Yellow" }
{ "_id": 2."first_name": "James Bond"."last_name": "Horse" }
{ "_id": 3."first_name": "Shepherd"."last_name": "Zhang" }
Copy the code

The following code uses replaceRoot to put first_name and last_name together:

const { concat } = db.command.aggregate
db.collection('roles')
  .aggregate()
  .replaceRoot({
    newRoot: {
      full_name: concat(['$last_name'.'$first_name'])}}).end(a)Copy the code

The output is as follows:

{ "full_name": "Yellow Silang" }
{ "full_name": "Mabbond" }
{ "full_name": "Zhang Muzhi" }
Copy the code

sample

Aggregation phase. Randomly selects a specified number of records from the document.

API specification

The form of sample is as follows:

sample({    size: <Positive integer>})
Copy the code

Please note: size is a positive integer, otherwise an error will occur.

The sample

Assume that the document Users has the following records:

{ "name": "a"} {"name": "b" }
Copy the code

Randomly selected

If there is a raffle now, one lucky user needs to be selected. Then the call method of sample is as follows:

db.collection('users')
  .aggregate()
  .sample({
    size: 1}).end(a)Copy the code

Returns a record for a randomly selected user with the following result:

{ "_id": "696529e4-7e82-4e7f-812e-5144714edff6"."name": "b" }
Copy the code

skip

Aggregation phase. Specify a positive integer, skip the number of documents, and print the rest.

The sample

db.collection('users')
  .aggregate()
  .skip(5).end(a)Copy the code

This code skips the first five documents found and prints out the rest.

sort

Aggregation phase. Sorts the input documents according to the specified field.

API specification

The form is as follows:

sort({
    <The field name1>: <collation>.<The field name2>: <collation>,})Copy the code

< collation > can have the following values:

  • 1 represents ascending order (from smallest to largest);
  • -1 represents descending order (from largest to smallest);

The sample

Ascending/descending order

Suppose we have a collection articles containing the following data:

{ "_id": "1"."author": "stark"."score": 80."age": 18 }
{ "_id": "2"."author": "bob"."score": 60."age": 18 }
{ "_id": "3"."author": "li"."score": 55."age": 19 }
{ "_id": "4"."author": "jimmy"."score": 60."age": 22 }
{ "_id": "5"."author": "justan"."score": 95."age": 33 }
Copy the code

The code above does an aggregate search in the Students collection and sorts the results, descending first by the AGE field and then by the Score field.

The following output is displayed:

db.collection('articles')
  .aggregate()
  .sort({
      age: - 1,
      score: - 1}).end(a)Copy the code

sortByCount

Aggregation phase. Groups incoming collections based on incoming expressions. Then count the number of different groups and sort the groups by their number, returning the sorted result.

API specification

SortByCount is called as follows:

sortByCount(<expression>)
Copy the code

The expression is of the form: + specifies the field. Please note: do not omit the + specified field. Please note: do not omit the + specified field. Please note: do not omit symbols.

The sample

Basic types of statistics

But let’s say it was recorded like this:

{ "category": "Web" }
{ "category": "Web" }
{ "category": "Life" }
Copy the code

The following code counts the article categories and counts the number of each category. That is, perform the sortByCount aggregation on the category field.

db.collection('passages')
  .aggregate()
  .sortByCount('$category').end(a)Copy the code

The results returned are as follows: there are two articles under the Web category and one article under the Life category.

{ "_id": "Web"."count": 2 }
{ "_id": "Life"."count": 1 }
Copy the code

Unpacking array types

But suppose the collection passages are recorded as follows: The tags field is an array type of value.

{ "tags": [ "JavaScript"."C#"]} {"tags": [ "Go"."C#"]} {"tags": [ "Go"."Python"."JavaScript"]}Copy the code

How to count the tag information and count the number of tags? Because the tags field corresponds to an array, you need to unwind the tags field and then call sortByCount.

The following code implements this function:

db.collection('passages')
  .aggregate()
  .unwind(`$tags`)
  .sortByCount(`$tags`)
  .end(a)Copy the code

The result returned is as follows:

{ "_id": "Go"."count": 2 }
{ "_id": "C#"."count": 2 }
{ "_id": "JavaScript"."count": 2 }
{ "_id": "Python"."count": 1 }
Copy the code

unwind

Aggregation phase. Splits the document using each element in the specified array field. After splitting, the document changes from one to one or more, one for each element of the array.

API specification

Splits the document using each element in the specified array field. After splitting, the document changes from one to one or more, one for each element of the array.

Unwind can be used in two ways:

The argument is a field name

unwind(<The field name>)
Copy the code

The argument is an object

unwind({
    path: <The field name>,
    includeArrayIndex: <string>,
    preserveNullAndEmptyArrays: <boolean>
})
Copy the code
field type instructions
path string The field name of the array you want to split, starting with $.
includeArrayIndex string Optionally, pass in a new field name and the array index will be stored on the new field. New field names cannot start with $.
preserveNullAndEmptyArrays boolean If true, the document will still be printed if the path field is null, an empty array, or if the field does not exist. If false, the unwind will not output these documents. The default is false.

The sample

Break up the array

Suppose we have a Products collection containing the following data:

{ "_id": "1"."product": "tshirt"."size": ["S"."M"."L"]} {"_id": "2"."product": "pants"."size": []} {"_id": "3"."product": "socks"."size": null }
{ "_id": "4"."product": "trousers"."size": ["S"]} {"_id": "5"."product": "sweater"."size": ["M"."L"]}Copy the code

We split these documents based on the size field

db.collection('products')
  .aggregate()
  .unwind('$size').end(a)Copy the code

The output is as follows:

{ "_id": "1"."product": "tshirt"."size": "S" }
{ "_id": "1"."product": "tshirt"."size": "M" }
{ "_id": "1"."product": "tshirt"."size": "L" }
{ "_id": "4"."product": "trousers"."size": "S" }
{ "_id": "5"."product": "sweater"."size": "M" }
{ "_id": "5"."product": "sweater"."size": "L" }
Copy the code

After splitting, retain the index of the original array

After we split the document according to the size field, we want to keep the original array index in the new index field.

db.collection('products')
  .aggregate()
  .unwind({
      path: '$size',
      includeArrayIndex: 'index'}).end(a)Copy the code

The output is as follows:

{ "_id": "1"."product": "tshirt"."size": "S"."index": 0 }
{ "_id": "1"."product": "tshirt"."size": "M"."index": 1 }
{ "_id": "1"."product": "tshirt"."size": "L"."index": 2 }
{ "_id": "4"."product": "trousers"."size": "S"."index": 0 }
{ "_id": "5"."product": "sweater"."size": "M"."index": 0 }
{ "_id": "5"."product": "sweater"."size": "L"."index": 1 }
Copy the code

Preserve documents with empty fields

Notice that we have two rows of special null-valued data in our collection:

. {"_id": "2"."product": "pants"."size": []} {"_id": "3"."product": "socks"."size": null}...Copy the code

If you want to keep the size in our output is empty, null, array or there is no document, the size field preserveNullAndEmptyArrays parameters may be used

db.collection('products')
  .aggregate()
  .unwind({
      path: '$size',
      preserveNullAndEmptyArrays: true}).end(a)Copy the code

The output is as follows:

{ "_id": "1"."product": "tshirt"."size": "S" }
{ "_id": "1"."product": "tshirt"."size": "M" }
{ "_id": "1"."product": "tshirt"."size": "L" }
{ "_id": "2"."product": "pants"."size": null }
{ "_id": "3"."product": "socks"."size": null }
{ "_id": "4"."product": "trousers"."size": "S" }
{ "_id": "5"."product": "sweater"."size": "M" }
{ "_id": "5"."product": "sweater"."size": "L" }
Copy the code

end

Indicates that the aggregation operation definition is complete and initiates the actual aggregation operation

The return value

Promise.

attribute type instructions
list Array. List of aggregated results

The sample code

const $ = db.command.aggregate
db.collection('books').aggregate()
  .group({
    //Group _id by category field:'$category'.//Let each set of output records have an avgSales field whose value is the average of the sales fields for all records in the group, avgSales: $.avg('$sales')}).end().then(res => console.log(res))
  .catch(err => console.error(err))
Copy the code