preface
In the actual development and maintenance process, we may encounter situations that need to brush data, such as data type modification, enumeration value change and so on. For operations with small data sets, you can use a loop to modify one by one. For example, suppose that the ages of our previous employees were incorrectly recorded, and now the ages of each employee need to be increased by 1. In this case, we can use the snapshot of the dataset to update each employee one by one, as follows:
db.employees.find({age: {$exists: true.$type: 'number'}})
._addSpecial( "$snapshot".true).forEach(function(doc) {
var newAge = doc.age + 1;
db.employees.updateOne({_id: doc._id}, {$set: {age: newAge}});
});
Copy the code
The following code tells MongoDB to use a snapshot to read the entire data set at once.
db.employees.find()._addSpecial( "$snapshot".true)
Copy the code
Note that there are two other methods available in 3.x, but they are no longer available in 4.x.
db.employees.find().snapshot();
db.employees.find({$query: {}, $snapshot: true});
Copy the code
This approach is obviously inefficient and cannot be executed if the data set is too large, so MongoDB has introduced a batch operation API.
The batch operation
Batch operations Synchronize multiple pieces of data to the server in batch write mode. This eliminates the need to write data to the server every time in the forEach loop. The code for batch operations is as follows:
var snapshot = db.employees.find({age: {$exists: true.$type: 'number'}})
._addSpecial( "$snapshot".true);
bulkUpadteOps = [];
snapshot.forEach(function(doc) {
var newAge = doc.age + 1;
bulkUpadteOps.push({
'updateOne': {
'filter': {'_id': doc._id},
'update': {$set: {age: newAge}}
}
});
if (bulkUpadteOps.length === 100) { db.employees.bulkWrite(bulkUpadteOps); bulkUpadteOps = []; }});if (bulkUpadteOps.length > 0) {
db.employees.bulkWrite(bulkUpadteOps);
}
Copy the code
In this way, the operations to be performed are first stored in an array of batch operations. When a certain amount is accumulated, bulkWrite is called to perform the corresponding operations in batches. Finally, it is possible that the number is not an integer multiple of the set number of batch operations, so if there are still operations remaining, one more run will complete all batch updates.
$type Specifies the query data type
The $type operator is used to find the matched data type when looking for snapshot data. $type can be specified as a string or a number. These data types are built-in data types of MongoDB, as shown below. Number represents the alias of a numeric type, including the following types:
- Double
- 32-bit integer
- 64-bit integer
- decimal
type | digital | The alias | note |
---|---|---|---|
Double | 1 | “double” | |
String | 2 | “string” | |
Object | 3 | “object” | |
Array | 4 | “array” | |
Binary data | 5 | “binData” | |
Undefined | 6 | “undefined” | Has been abandoned. |
ObjectId | 7 | “objectId” | |
Boolean | 8 | “bool” | |
Date | 9 | “date” | |
Null | 10 | “null” | |
Regular Expression | 11 | “regex” | |
DBPointer | 12 | “dbPointer” | Has been abandoned. |
JavaScript | 13 | “javascript” | |
Symbol | 14 | “symbol” | Has been abandoned. |
JavaScript code with scope | 15 | “javascriptWithScope” | Deprecated later than version 4.4 |
32-bit integer | 16 | “int” | |
Timestamp | 17 | “timestamp” | |
64-bit integer | 18 | “long” | |
Decimal128 | 19 | “decimal” | Added later than version 3.4 |
Min key | – 1 | “minKey” | |
Max key | 127 | “maxKey” |
For details, see the MongoDB $type operator documentation
conclusion
This article introduces the batch operation API provided by MongoDB. With the help of the batch operation API, the efficiency of database maintenance can be improved, such as changing data types and adjusting data values in batches.