1 Introduction

As you probably know, MongoDB organizes data in documents. Besides JSON-like data, it can also store regular files, such as images and text files, as binary data in BSON documents, which is very convenient. However, a BSON document cannot exceed 16 MB, which makes it inconvenient to store large files. Fortunately, MongoDB provides GridFS, a file storage component that lets us store files larger than 16 MB, and it works for small files too. Let's take a look at GridFS.

2 Basic Principles and Concepts

GridFS is a simple system that splits large files into smaller pieces (chunks) for storage. By default, saving a file uses two collections, fs.files and fs.chunks: fs.files stores the file information, and fs.chunks stores the file content as binary data in BSON documents.

A record of fs.files is as follows:

{ "_id" : ObjectId("5ec6b44af3760d5999bd1c91"), "length" : NumberLong(1048576), "chunkSize" : 261120, "uploadDate" : ISODate("2020-05-21T17:03:06.217Z"), "filename" : "pkslow.txt", "metadata" : {} }

Field Description:

_id: the primary key ID.

length: the file size in bytes.

chunkSize: the size of each chunk in bytes. It determines how many chunks the file is split into.

uploadDate: the time the file was uploaded.

filename: the file name.

metadata: additional, customizable information about the file, which makes later retrieval and use easier.
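To make the relationship between length and chunkSize concrete, here is a minimal sketch of the arithmetic, applied to the sample record above. The helper names chunk_count and last_chunk_size are made up for this illustration and are not part of any MongoDB API.

```python
def chunk_count(length: int, chunk_size: int) -> int:
    """Number of chunks needed to hold `length` bytes (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Ceiling division using integers only.
    return (length + chunk_size - 1) // chunk_size


def last_chunk_size(length: int, chunk_size: int) -> int:
    """Number of bytes stored in the final chunk (0 for an empty file)."""
    count = chunk_count(length, chunk_size)
    if count == 0:
        return 0
    return length - (count - 1) * chunk_size


# Values taken from the sample fs.files record.
print(chunk_count(1048576, 261120))      # 5
print(last_chunk_size(1048576, 261120))  # 4096
```

So the 1 MB sample file needs four full chunks of 261120 bytes plus a fifth chunk holding the remaining 4096 bytes.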

One entry from fs.chunks is as follows:

{
    "_id" : ObjectId("5ec6b44af3760d5999bd1c94"),
    "files_id" : ObjectId("5ec6b44af3760d5999bd1c91"),
    "n" : 2,
    "data" : { "$binary" : "xxxxxxxxx", "$type" : "00" }
}

Field Description:

_id: the primary key ID.

files_id: the ID of the file this content belongs to. The value is the same as _id in fs.files.

n: the chunk index, starting from 0.

data: the file content.

Looking at the fields in both collections gives a good idea of how GridFS organizes data. When we save a file that fits within chunkSize, its file information is saved as a single record in fs.files and its content as a single record in fs.chunks. When the file is larger than chunkSize, fs.files still gets one record, but fs.chunks gets multiple records that together hold the file content.
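The one-to-many relationship between the two collections can be modeled in a few lines of plain Python. This is only a sketch of the data layout, not how GridFS or any driver is implemented: the functions split_into_documents and reassemble are hypothetical, the dictionaries stand in for documents, and fields such as uploadDate and the chunk _id are left out.

```python
def split_into_documents(file_id, filename, content: bytes, chunk_size: int):
    """Build one fs.files-like dict and a list of fs.chunks-like dicts."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    files_doc = {
        "_id": file_id,
        "length": len(content),
        "chunkSize": chunk_size,
        "filename": filename,
        "metadata": {},
    }
    chunk_docs = [
        # Each chunk points back to its file and carries its index n.
        {"files_id": file_id, "n": n, "data": content[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(content), chunk_size))
    ]
    return files_doc, chunk_docs


def reassemble(chunk_docs) -> bytes:
    """Rebuild the file content by joining the chunks in order of n."""
    ordered = sorted(chunk_docs, key=lambda doc: doc["n"])
    return b"".join(doc["data"] for doc in ordered)


content = bytes(range(10))
files_doc, chunk_docs = split_into_documents("file-1", "pkslow.txt", content, 4)
print(len(chunk_docs))                    # 3
print(reassemble(chunk_docs) == content)  # True
```

A 10-byte file with a 4-byte chunk size yields one file record and three chunk records (4, 4, and 2 bytes), while the same file with a 16-byte chunk size yields a single chunk.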

MongoDB also creates indexes for us to speed up queries: one on the file name and upload time in fs.files, and one on the file ID and n in fs.chunks.

3 Use the mongofiles command

With the basics out of the way, let's use the commands MongoDB provides to try things out. First, of course, you need a running database; for example, you can install the latest version of MongoDB with Docker.

All operations use the mongofiles command, which needs several parameters. For example, the following command lists all files:

mongofiles --username user --password 123456 --host 127.0.0.1 --port 27017 --authenticationDatabase admin --db testdb list

To avoid typing such a long command each time, let's define an alias:

alias mf='mongofiles --username user --password 123456 --host 127.0.0.1 --port 27017 --authenticationDatabase admin --db testdb'

List files:

mf list

Save file: The saved file name is the same as the local file name.

mf put pkslow.txt

Read files:

mf get pkslow.txt

Find a file:

mf search pkslow

Delete file:

mf delete pkslow.txt

Specify a custom file name:

mf --local pkslow.txt put /com/pkslow.txt

4 Summary

Use your imagination: GridFS can do a lot. It can store images, audio, video, and other files, and it is also useful when you only want to read part of a large file.
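Reading only part of a large file is possible because each chunk covers a known byte range, so only the chunks that overlap the requested range need to be fetched. The sketch below shows that calculation in plain Python. The helpers chunks_for_range and read_range are hypothetical, and the dictionary mapping n to bytes stands in for the fs.chunks records of one file; this is not a driver API.

```python
def chunks_for_range(start: int, end: int, chunk_size: int) -> list:
    """Indexes n of the chunks that cover bytes [start, end)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if start < 0 or end < start:
        raise ValueError("invalid byte range")
    if start == end:
        return []
    first = start // chunk_size
    last = (end - 1) // chunk_size
    return list(range(first, last + 1))


def read_range(chunks: dict, start: int, end: int, chunk_size: int) -> bytes:
    """Read bytes [start, end) by touching only the chunks that are needed."""
    needed = chunks_for_range(start, end, chunk_size)
    if not needed:
        return b""
    joined = b"".join(chunks[n] for n in needed)
    # Offset of `start` inside the first fetched chunk.
    offset = start - needed[0] * chunk_size
    return joined[offset:offset + (end - start)]


# With the sample chunkSize, bytes 300000..599999 live in chunks 1 and 2.
print(chunks_for_range(300000, 600000, 261120))  # [1, 2]
```

In this model, reading about 300 KB from the middle of the file touches two chunks instead of all of them.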

Note: The MongoDB version used in this article is 4.2.1.


Visit pumpkin Talk www.pkslow.com for more exciting articles!
