Author: coolyuantao

Background

Recently our team received a small requirement involving two systems, A and B. System A lets users build skin packs online; users can then download a finished skin pack and import it into system B. The skin pack itself is just a ZIP archive. After receiving it, system B decompresses it and performs some routine checks, such as version verification and content validity verification. Overall, the feature is fairly simple.

However, our testers soon found a problem. They bypassed system A, took several video files, changed the extension to .zip, and uploaded them directly. Each time, system B only discovered that the file was invalid after the upload had finished: until the whole file has arrived, system B cannot unzip it, so it has no way of knowing whether the content is valid. A lot of bandwidth and time is wasted before the user is told that there is a problem with the skin pack.

There are two issues here. Let’s look at them:

  1. How to encrypt files so that users cannot reverse-engineer them and sensitive information inside the archive is not leaked.
  2. How the server, while receiving the upload stream, can judge the validity of the archive before the transmission is complete and inform the user early.

AES vs. RSA

When it comes to encryption, most people think of symmetric AES and asymmetric RSA. The names are fairly self-explanatory. Symmetric encryption uses the same key for encryption and decryption; it is very fast, offers a high level of security, and (in stream-like modes such as GCM) produces ciphertext the same size as the plaintext. Asymmetric encryption uses a public key (PK) and a private key (SK). RSA is built on two large primes whose product forms the public modulus, and its security depends on the difficulty of factoring large numbers, for which no efficient method is publicly known. Its security therefore depends heavily on the key length; 1024 bits was long regarded as sufficient, but 2048 bits or more is the usual recommendation today. RSA is much slower than symmetric algorithms, so it is generally used only to encrypt small amounts of data, and the data to be encrypted must be shorter than the key (padding takes up part of that space).
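
As a rough illustration of the two size properties described above, the following sketch checks them directly. It is not part of the original project: the key pair is generated in memory purely for the demo, and `rsaCanEncrypt` is a hypothetical helper. With a 2048-bit key and the default OAEP padding of `crypto.publicEncrypt`, the plaintext limit works out to 214 bytes.

```javascript
const crypto = require('crypto');

// AES-256-GCM: the ciphertext has the same length as the plaintext
// (the 16-byte authentication tag is returned separately).
const aesKey = crypto.randomBytes(32);
const iv = crypto.randomBytes(12);
const plaintext = Buffer.alloc(1000, 'a');
const cipher = crypto.createCipheriv('aes-256-gcm', aesKey, iv);
const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
const tag = cipher.getAuthTag();

// RSA: a throwaway 2048-bit key pair, used only for this demo.
const { publicKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

// Returns true if RSA with the default OAEP padding can encrypt `size` bytes.
function rsaCanEncrypt(size) {
    try {
        crypto.publicEncrypt(publicKey, Buffer.alloc(size));
        return true;
    } catch (err) {
        return false; // OpenSSL rejects data that is too large for the key
    }
}

console.log(plaintext.length, ciphertext.length, tag.length);
console.log(rsaCanEncrypt(214), rsaCanEncrypt(215));
```

This is why RSA is reserved for small payloads such as keys, while the file itself goes through AES.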

Use AES to encrypt files

Weighing the advantages and disadvantages of the two methods, we use AES to encrypt and decrypt the files. The main reasons are:

  1. Performance: AES is extremely fast, often cited as roughly 1000 times faster than RSA.
  2. RSA limits the length of the plaintext, which cannot exceed the length of the key.

AES is built into the Node.js crypto module; see the official documentation for details.

AES encryption logic

const crypto = require('crypto');
const algorithm = 'aes-256-gcm';

/**
 * AES-encrypt a buffer
 * @param {Buffer} buffer Contents to be encrypted
 * @param {String} key The secret key
 * @param {String} iv Initialization vector
 * @return {{key: string, iv: string, tag: Buffer, buffer: Buffer}}
 */
function aesEncrypt (buffer, key, iv) {
    // Initialize the encryption algorithm
    const cipher = crypto.createCipheriv(algorithm, key, iv);
    let encrypted = cipher.update(buffer);
    let end = cipher.final();
    // Generate the authentication tag, used to verify the integrity of the ciphertext
    const tag = cipher.getAuthTag();
    return {
        key,
        iv,
        tag,
        buffer: Buffer.concat([encrypted, end])
    };
}

AES decryption logic

Decryption mirrors encryption; only the function names change:

const crypto = require('crypto');
const algorithm = 'aes-256-gcm';

/**
 * AES-decrypt a buffer
 * @param {{key: string, iv: string, tag: Buffer, buffer: Buffer}} ret The result of aesEncrypt: key, initialization vector, authentication tag and ciphertext
 * @return {Buffer}
 */
function aesDecrypt ({key, iv, tag, buffer}) {
    // Initialize the decryption algorithm
    const decipher = crypto.createDecipheriv(algorithm, key, iv);
    // Set the authentication tag so that final() can verify the ciphertext
    decipher.setAuthTag(tag);
    let decrypted = decipher.update(buffer);
    let end = decipher.final();
    return Buffer.concat([decrypted, end]);
}

Specific use of AES

With these two functions, we can implement simple symmetric encryption:

const key = 'abcdefghijklmnopqrstuvwxyz123456'; // 32-byte shared key; its length must match the algorithm
const iv = 'abcdefghijklmnop';  // The length of the initialization vector must also suit the algorithm
let fileBuffer = Buffer.from('abc');

// Encrypt
let encrypted = aesEncrypt(fileBuffer, key, iv);

// Decrypt
let context = aesDecrypt(encrypted);
console.log(context.toString());
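
The authentication tag is what lets the receiver notice a modified file. A minimal, self-contained sketch of that behaviour follows; `tryDecrypt` is a hypothetical helper written for this example, not one of the article's functions.

```javascript
const crypto = require('crypto');

const key = crypto.randomBytes(32);
const iv = crypto.randomBytes(16);

// Encrypt a small payload with AES-256-GCM and keep the authentication tag.
const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
const encrypted = Buffer.concat([cipher.update(Buffer.from('skin pack bytes')), cipher.final()]);
const tag = cipher.getAuthTag();

// Try to decrypt; returns the plaintext string, or null if authentication fails.
function tryDecrypt(data) {
    const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
    decipher.setAuthTag(tag);
    try {
        return Buffer.concat([decipher.update(data), decipher.final()]).toString();
    } catch (err) {
        return null; // final() throws when the tag does not match the data
    }
}

// Copy the ciphertext and flip the bits of its first byte.
const tampered = Buffer.from(encrypted);
tampered[0] ^= 0xff;

console.log(tryDecrypt(encrypted));
console.log(tryDecrypt(tampered));
```

Any change to the ciphertext, the tag, the key, or the IV makes `final()` throw, so a failed decryption can be treated as a sign that the package is not trustworthy.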

This key is critical: if it leaks, the encryption is meaningless. Therefore, the key and IV are generated dynamically from random bytes, for example:

const key = crypto.randomBytes(32);
const iv = crypto.randomBytes(16);

With this adjustment, encrypting and decrypting files is relatively easy. Back to our business systems: the archive generated by system A must eventually be consumed by system B. The two systems are isolated, so how can the key and IV be passed to system B? The generated key is unknown to system B.

At this point you may think that the key and IV can simply be sent along with the archive. On reflection, though, the key cannot be sent in plaintext; otherwise, what is the point of encrypting at all?

This is where RSA asymmetric encryption comes in.

Using RSA to encrypt the key asymmetrically

RSA is also built into the Node.js crypto module; see the official documentation for details.

Generate the public and private keys of RSA

The RSA public and private keys can be generated directly with openssl. For the specific commands, see: www.scottbrady91.com/OpenSSL/Cre… .

# Generate the private key (2048 bits; 1024-bit RSA keys are no longer considered secure)
openssl genrsa -out private.pem 2048

# Derive the public key
openssl rsa -in private.pem -pubout -out public.pem
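
If openssl is not at hand, a key pair can also be produced from Node.js itself. The following is a minimal sketch using `crypto.generateKeyPairSync`; the 2048-bit size, the PEM encodings, and the in-memory round trip are choices made for this example rather than part of the original setup.

```javascript
const crypto = require('crypto');

// Generate a 2048-bit RSA key pair and export both halves as PEM strings.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', {
    modulusLength: 2048,
    publicKeyEncoding: { type: 'spki', format: 'pem' },
    privateKeyEncoding: { type: 'pkcs8', format: 'pem' }
});

// Quick round trip to confirm that the two keys belong together.
const secret = crypto.randomBytes(32);
const recovered = crypto.privateDecrypt(privateKey, crypto.publicEncrypt(publicKey, secret));

console.log(publicKey.split('\n')[0]);
console.log(recovered.equals(secret));
```

The PEM strings could then be written to disk with `fs.writeFileSync` and used in place of the files produced by openssl.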

The openssl commands above produce private.pem and public.pem. Next, we use Node.js to implement the encryption and decryption logic.

RSA encryption logic

const fs = require('fs');
const crypto = require('crypto');
const PK = fs.readFileSync('./public.pem', 'utf-8');

/**
 * RSA-encrypt a buffer with the public key
 * @param {Buffer} buffer Contents to be encrypted
 * @return {Buffer}
 */
function rsaEncrypt (buffer) {
    return crypto.publicEncrypt(PK, buffer);
}

RSA decryption logic

const fs = require('fs');
const crypto = require('crypto');
const SK = fs.readFileSync('./private.pem', 'utf-8');

/**
 * RSA-decrypt a buffer with the private key
 * @param {Buffer} buffer Contents to be decrypted
 * @return {Buffer}
 */
function rsaDecrypt (buffer) {
    return crypto.privateDecrypt(SK, buffer);
}

Using RSA

With these functions in place, the AES key can be encrypted before it is transmitted. Server B keeps the RSA private key; server A encrypts the data with the RSA public key and then sends it.

/**
 * Encrypt a file
 * @param {Buffer} fileBuffer
 * @return {{file: Buffer, key: Buffer}}
 */
function encrypt (fileBuffer) {
    const key = crypto.randomBytes(32);
    const iv = crypto.randomBytes(16);
    // aesEncrypt returns the ciphertext under the name `buffer`
    const { tag, buffer: file } = aesEncrypt(fileBuffer, key, iv);
    return {
        file,
        key: rsaEncrypt(Buffer.concat([key, iv, tag]))     // The lengths are fixed, so the parts can be concatenated directly
    };
}

/**
 * Decrypt a file
 * @param {{file: Buffer, key: Buffer}} ret
 * @return {Buffer}
 */
function decrypt ({file, key}) {
    // Keep the result as a Buffer: converting it to a string would corrupt the binary key
    const source = rsaDecrypt(key);
    const k = source.slice(0, 32);
    const iv = source.slice(32, 48);
    const tag = source.slice(48);
    return aesDecrypt({
        key: k,
        iv,
        tag,
        buffer: file
    });
}
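
To check that the combination really round-trips, here is a condensed, self-contained variant of the hybrid scheme. It is a sketch under assumptions: the key pair is generated in memory instead of being read from public.pem and private.pem, and the AES and RSA calls are inlined rather than going through the helper functions.

```javascript
const crypto = require('crypto');

// Throwaway key pair standing in for public.pem / private.pem.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

// System A: AES-encrypt the file, then RSA-encrypt key + iv + tag (32 + 16 + 16 bytes).
function encrypt(fileBuffer) {
    const key = crypto.randomBytes(32);
    const iv = crypto.randomBytes(16);
    const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
    const file = Buffer.concat([cipher.update(fileBuffer), cipher.final()]);
    const tag = cipher.getAuthTag();
    return { file, key: crypto.publicEncrypt(publicKey, Buffer.concat([key, iv, tag])) };
}

// System B: recover key, iv and tag with the private key, then AES-decrypt the file.
function decrypt({ file, key }) {
    const source = crypto.privateDecrypt(privateKey, key);
    const aesKey = source.subarray(0, 32);
    const iv = source.subarray(32, 48);
    const decipher = crypto.createDecipheriv('aes-256-gcm', aesKey, iv);
    decipher.setAuthTag(source.subarray(48));
    return Buffer.concat([decipher.update(file), decipher.final()]);
}

const original = Buffer.from('contents of the skin pack');
const packed = encrypt(original);
const restored = decrypt(packed);
console.log(restored.equals(original), packed.key.length);
```

The RSA-encrypted `key` field has a fixed size equal to the RSA modulus (256 bytes for a 2048-bit key), regardless of how large the file is.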

With this combination, the archive generated by server A can be decrypted by server B as long as it contains {file, key}. This basically achieves the first goal: encrypting files so that users cannot reverse-engineer them and sensitive information inside the archive is not leaked.

However, the second problem remains: how can the server, while receiving the upload stream, judge the validity of the archive before the transmission is complete and inform the user early?

Optimizing encrypted files

The output so far consists of the encrypted file contents and the encrypted key. Beyond that, a package usually carries some additional information, such as the version number, the generation time, and a description. Conventionally, this information would be split into several files that are then bundled into one archive, for example:

// zip file
- pkg
    manifest.json   // Additional information
    key.json        // Encrypted key
    file.json       // Encrypted file

If the package is saved as a zip like this, you would typically encrypt the file and change its extension. However, server B still has to receive the entire zip file, decompress it into a temporary directory, and read the contents before it can perform any verification.

There is another common requirement: the product team usually wants a preliminary check in the browser when the file is selected for upload, so that obviously invalid files are flagged to the user right away, which makes for a better user experience.

The solution is not too difficult. All of this extra information can be inserted into the head of the file in binary form, for example:

Package field description: |-- inserted additional information --|-- the real file follows --|
Binary file:               010101010101010101010                  XXXXXXXXXXXXXXXXXXXXXXXXXXXX

File header field design

We concatenate all of this information in binary form and deliver a single combined file, for example:

// theme pkg.
0                8                16
|------flag------|--extra length--|
|----------extra data...----------|
|-------------data...-------------|

  • flag:

Fixed to THEME and 8 bytes long. It marks the package as a skin pack, so the package type can be identified quickly.

  • extra length:

The actual length of extra data, stored as an 8-character hexadecimal string (8 bytes). For example, a length of 35 is 0x23 in hexadecimal, so the field is 00000023.

  • extra data:

The RSA-encrypted data is stored here, such as the key, the version number, the description, and so on.

  • data:

The AES-encrypted data, which can be decrypted with the key saved in extra data.
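
The fixed 16-byte prefix described above can be sketched on its own, without any encryption. `buildPrefix` and `parsePrefix` are hypothetical helper names, and padding the flag with NUL bytes is an assumption; the article only states that the flag is THEME and 8 bytes long.

```javascript
// Hypothetical helpers (names are not from the article) for the 16-byte prefix.
const FLAG = 'THEME';
const PREFIX_LENGTH = 16;

// Build flag (8 bytes, NUL-padded: an assumption) + extra length (8 hex characters).
function buildPrefix(extraLength) {
    const prefix = Buffer.alloc(PREFIX_LENGTH); // zero-filled
    prefix.write(FLAG, 0, 'ascii');
    prefix.write(extraLength.toString(16).padStart(8, '0'), 8, 'ascii');
    return prefix;
}

// Parse the prefix; returns null when the flag or the length field is invalid.
function parsePrefix(buffer) {
    if (buffer.length < PREFIX_LENGTH) return null;
    const flag = buffer.toString('ascii', 0, 8).replace(/\0+$/, '');
    const hex = buffer.toString('ascii', 8, 16);
    if (flag !== FLAG || !/^[0-9a-f]{8}$/.test(hex)) return null;
    return { flag, extraLength: parseInt(hex, 16) };
}

const prefix = buildPrefix(35);
console.log(prefix.toString('ascii', 8, 16));
console.log(parsePrefix(prefix));
console.log(parsePrefix(Buffer.from('not a skin pack at all')));
```

Because both fields have a fixed width, a reader needs only the first 16 bytes to decide whether the rest of the file is worth processing.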

Generating the new encrypted file

With the theory above in place, we can put it into practice. The code is as follows:

/**
 * Encrypt a file
 * @param {Buffer} fileBuffer
 * @return {Buffer}
 */
function encrypt (fileBuffer) {
    const key = crypto.randomBytes(32);
    const iv = crypto.randomBytes(16);
    const version = 'v1.1';

    // Collect all the extra information to store, such as the version number and the original key.
    // Buffers are stored as base64 strings to keep the JSON small
    const extraJSON = {
        version,
        key: key.toString('base64'),
        iv: iv.toString('base64')
    };
    // Complete the AES encryption of the file and obtain the authentication tag
    const { tag, buffer: file } = aesEncrypt(fileBuffer, key, iv);
    extraJSON.tag = tag.toString('base64');

    // RSA-encrypt extraJSON as a whole (about 136 bytes, so the RSA key must be at least 2048 bits)
    const extraData = rsaEncrypt(Buffer.from(JSON.stringify(extraJSON)));
    const extraLength = extraData.length;

    // The flag field is 8 bytes: 'THEME' followed by zero padding
    const flag = Buffer.alloc(8);
    flag.write('THEME');

    // Finally, merge all the data together
    return Buffer.concat([
        flag,
        Buffer.from(extraLength.toString(16).padStart(8, '0')),
        extraData,
        file
    ]);
}

With this approach, the relevant information sits in the file header and can be read easily without processing the entire file. Decryption is simply the reverse operation.

Decrypt the newly generated file

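/**
 * Decrypt a file
 * @param {Buffer} fileBuffer
 * @return {Buffer}
 */
function decrypt (fileBuffer) {
    const type = fileBuffer.slice(0, 8);    // THEME
    if (!type.toString().startsWith('THEME')) {
        throw new Error('Not a skin pack');
    }
    const extraLength = parseInt(fileBuffer.slice(8, 16).toString(), 16);
    const extraDataEndIndex = 16 + extraLength;

    // Decrypt the RSA-encrypted data
    const extraData = rsaDecrypt(fileBuffer.slice(16, extraDataEndIndex));
    const extraJSON = JSON.parse(extraData.toString());
    // key, iv and tag must be Buffers again: accept base64 strings or Buffer#toJSON() objects
    const toBuffer = (v) => typeof v === 'string' ? Buffer.from(v, 'base64') : Buffer.from(v);
    // Finally, use AES to decrypt the rest, which is the actual file
    return aesDecrypt({
        key: toBuffer(extraJSON.key),
        iv: toBuffer(extraJSON.iv),
        tag: toBuffer(extraJSON.tag),
        buffer: fileBuffer.slice(extraDataEndIndex)
    });
}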

With this approach, extraData can be RSA-decrypted and checked before the whole file has been processed. If any step throws an exception, the file has been tampered with or is not a valid package. This works much better than a plain compressed package, especially for huge files, which can be processed as a stream and rejected as soon as something is found to be wrong.
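
To illustrate the idea of stopping as soon as something is wrong, here is a minimal sketch of an incremental check on the first 16 bytes of an upload. `PrefixValidator` is a hypothetical name, and the NUL padding of the flag is an assumption; the article only fixes the flag at 8 bytes.

```javascript
// Expected flag field: 'THEME' followed by 3 zero bytes (the padding is an assumption).
const FLAG_BYTES = Buffer.concat([Buffer.from('THEME'), Buffer.alloc(3)]);

// Hypothetical incremental checker: feed it upload chunks as they arrive.
class PrefixValidator {
    constructor() {
        this.received = Buffer.alloc(0);
    }

    // Returns 'need-more', 'invalid', or 'ok' once the 16-byte prefix is complete.
    push(chunk) {
        if (this.received.length < 16) {
            this.received = Buffer.concat([this.received, chunk]).subarray(0, 16);
        }
        // Reject as soon as the bytes seen so far cannot match the flag.
        const seen = Math.min(this.received.length, 8);
        if (!this.received.subarray(0, seen).equals(FLAG_BYTES.subarray(0, seen))) {
            return 'invalid';
        }
        if (this.received.length < 16) return 'need-more';
        const hex = this.received.toString('ascii', 8, 16);
        return /^[0-9a-f]{8}$/i.test(hex) ? 'ok' : 'invalid';
    }
}

// A valid prefix arriving in two chunks.
const header = Buffer.concat([FLAG_BYTES, Buffer.from('00000023')]);
const pack = new PrefixValidator();
const first = pack.push(header.subarray(0, 3));
const second = pack.push(header.subarray(3));

// The first bytes of an MP4-like file are rejected immediately.
const video = new PrefixValidator().push(Buffer.from([0x00, 0x00, 0x00, 0x18, 0x66, 0x74, 0x79, 0x70]));

console.log(first, second, video);
```

In an HTTP upload handler, an 'invalid' result would be the point at which to abort the request, long before a renamed video file has been fully received.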

How does the browser parse this file

Since the whole file is a binary stream and modern browsers can read and process binary data, users' files can also be given a preliminary check before upload, which greatly improves the experience.

Binary data can be read with DataView; for details, see the DataView documentation. A preliminary implementation is as follows:

/**
 * Convert part of a binary buffer to the corresponding ASCII characters
 * @param {DataView} dv     View over the binary data
 * @param {Number}   start  Start position
 * @param {Number}   end    End position (exclusive)
 * @return {String}
 */
function buffer2Char (dv, start, end) {
    let ret = [];
    for (let i = start; i < end; i++) {
        let charCode = dv.getUint8(i);
        let code = String.fromCharCode(charCode);
        ret.push(code);
    }
    return ret.join('');
}

function test () {
    let fileDom = document.getElementById('file');
    let file = fileDom.files[0];
    let reader = new FileReader();
    reader.addEventListener("load", function(e) {
        // e.target.result is the ArrayBuffer produced by readAsArrayBuffer
        let dv = new DataView(e.target.result);
        let flag = buffer2Char(dv, 0, 8);   // THEME
        var extraLength = parseInt(buffer2Char(dv, 8, 16), 16);
        var extraData = buffer2Char(dv, 16, 16 + extraLength);

        console.log(flag, extraLength, extraData);
    });
    reader.readAsArrayBuffer(file);
}
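
For a quick validity check, the browser does not need to read the whole file; the first 16 bytes are enough. The sketch below is one possible way to do that, assuming the flag is NUL-padded to 8 bytes (the article does not say how it is padded). `readPrefix` is a hypothetical helper name.

```javascript
// Parse the 16-byte prefix from an ArrayBuffer using DataView.
function readPrefix(arrayBuffer) {
    if (arrayBuffer.byteLength < 16) return null;
    const dv = new DataView(arrayBuffer, 0, 16);
    let text = '';
    for (let i = 0; i < 16; i++) {
        text += String.fromCharCode(dv.getUint8(i));
    }
    const flag = text.slice(0, 8).replace(/\0+$/, ''); // strip the assumed NUL padding
    const hex = text.slice(8, 16);
    if (flag !== 'THEME' || !/^[0-9a-f]{8}$/i.test(hex)) return null;
    return { flag, extraLength: parseInt(hex, 16) };
}

// In a browser, only the first 16 bytes need to be read from the selected file:
//   const file = document.getElementById('file').files[0];
//   file.slice(0, 16).arrayBuffer().then((ab) => console.log(readPrefix(ab)));

// Stand-alone check with hand-built bytes (runs in Node.js and in browsers).
const bytes = new Uint8Array(16);
bytes.set(new TextEncoder().encode('THEME'), 0);
bytes.set(new TextEncoder().encode('00000023'), 8);
console.log(readPrefix(bytes.buffer));
console.log(readPrefix(new Uint8Array(32).buffer));
```

A `null` result lets the page warn the user before any upload starts; the server still has to repeat the check, since browser-side validation can be bypassed.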

Of course, this approach has a prerequisite: some non-sensitive information must be left unencrypted so that the browser can read it. As long as the front end and back end agree on the format, this method can be used for a preliminary check of the package. The back-end verification is still required, of course.

At this point we have covered file encryption, decryption, and parsing in the browser. I hope it is helpful to you.

Conclusion

Encrypting and decrypting files on the back end is a very common operation. Besides AES and RSA there are many other encryption schemes; see the Node.js crypto module, which has more built-in schemes that can be used directly.

Of course, files can also be encrypted directly with compression tools such as zip or 7z combined with a password, which is generally sufficient. Custom requirements are inevitable, though, and the approaches are often combined: for example, the fileBuffer above could itself be a file that was first compressed and encrypted with one of these tools. In the end the use case matters most, and combining several approaches often works better.

That's all for file encryption and decryption. If you have any other questions, feel free to discuss them in the comments. Thank you.