preface

In the actual development and maintenance process, we may encounter situations that need to brush data, such as data type modification, enumeration value change and so on. For operations with small data sets, you can use a loop to modify one by one. For example, suppose that the ages of our previous employees were incorrectly recorded, and now the ages of each employee need to be increased by 1. In this case, we can use the snapshot of the dataset to update each employee one by one, as follows:

db.employees.find({age: {$exists: true.$type: 'number'}})
    ._addSpecial( "$snapshot".true).forEach(function(doc) {
  var newAge = doc.age + 1;
  db.employees.updateOne({_id: doc._id}, {$set: {age: newAge}});
});
Copy the code

The following code tells MongoDB to use a snapshot to read the entire data set at once.

db.employees.find()._addSpecial( "$snapshot".true)
Copy the code

Note that there are two other methods available in 3.x, but they are no longer available in 4.x.

db.employees.find().snapshot();
db.employees.find({$query: {}, $snapshot: true});
Copy the code

This approach is obviously inefficient and cannot be executed if the data set is too large, so MongoDB has introduced a batch operation API.

The batch operation

Batch operations Synchronize multiple pieces of data to the server in batch write mode. This eliminates the need to write data to the server every time in the forEach loop. The code for batch operations is as follows:

var snapshot = db.employees.find({age: {$exists: true.$type: 'number'}})
    ._addSpecial( "$snapshot".true);
bulkUpadteOps = [];

snapshot.forEach(function(doc) {
	var newAge = doc.age + 1;
	bulkUpadteOps.push({
		'updateOne': {
			'filter': {'_id': doc._id},
			'update': {$set: {age: newAge}}
		}
	});

	if (bulkUpadteOps.length === 100) { db.employees.bulkWrite(bulkUpadteOps); bulkUpadteOps = []; }});if (bulkUpadteOps.length > 0) {
		db.employees.bulkWrite(bulkUpadteOps);
}
Copy the code

In this way, the operations to be performed are first stored in an array of batch operations. When a certain amount is accumulated, bulkWrite is called to perform the corresponding operations in batches. Finally, it is possible that the number is not an integer multiple of the set number of batch operations, so if there are still operations remaining, one more run will complete all batch updates.

$type Specifies the query data type

The $type operator is used to find the matched data type when looking for snapshot data. $type can be specified as a string or a number. These data types are built-in data types of MongoDB, as shown below. Number represents the alias of a numeric type, including the following types:

  • Double
  • 32-bit integer
  • 64-bit integer
  • decimal
type digital The alias note
Double 1 “double”
String 2 “string”
Object 3 “object”
Array 4 “array”
Binary data 5 “binData”
Undefined 6 “undefined” Has been abandoned.
ObjectId 7 “objectId”
Boolean 8 “bool”
Date 9 “date”
Null 10 “null”
Regular Expression 11 “regex”
DBPointer 12 “dbPointer” Has been abandoned.
JavaScript 13 “javascript”
Symbol 14 “symbol” Has been abandoned.
JavaScript code with scope 15 “javascriptWithScope” Deprecated later than version 4.4
32-bit integer 16 “int”
Timestamp 17 “timestamp”
64-bit integer 18 “long”
Decimal128 19 “decimal” Added later than version 3.4
Min key – 1 “minKey”
Max key 127 “maxKey”

For details, see the MongoDB $type operator documentation

conclusion

This article introduces the batch operation API provided by MongoDB. With the help of the batch operation API, the efficiency of database maintenance can be improved, such as changing data types and adjusting data values in batches.