Compression formats

Zip and gzip are two of the most common compression formats, although gzip is, of course, rarely used on Windows. Tar is an archive format that does not compress by itself; to get a compressed archive you combine it with gzip, producing a tar.gz file, usually abbreviated as tgz.

Why is rar not mentioned? Because it is a proprietary, patented format: the decompression tool is free, but the compression tool is paid. As a result, rar archives are rarely provided in the application scenarios we usually deal with.

This article will show you how to compress and decompress gzip, tar, tgz, and zip in Node.js.

Sample files to compress

The files compressed throughout this article come from the urllib repository, which needs to be cloned into a local directory:

git clone https://github.com/node-modules/urllib.git nodejs-compressing-demo

gzip

In the Linux world each tool has a single, narrow responsibility. gzip, for example, only compresses files; how folders get packaged and compressed is not its concern, that is tar's job.

Compressing a file with the gzip command line

For example, if we gzip the nodejs-compressing-demo/lib/urllib.js file, we get a urllib.js.gz file and the source file is deleted.

$ ls -l nodejs-compressing-demo/lib/urllib.js
-rw-r--r--  1 a  a  31318 Feb 12 11:27 nodejs-compressing-demo/lib/urllib.js

$ gzip nodejs-compressing-demo/lib/urllib.js

$ ls -l nodejs-compressing-demo/lib/urllib.js.gz
-rw-r--r--  1 a  a  8909 Feb 12 11:27 nodejs-compressing-demo/lib/urllib.js.gz

$ gunzip nodejs-compressing-demo/lib/urllib.js.gz

The file size dropped from 31318 bytes to 8909 bytes, a compression ratio of more than 3.5x.

You can also pipe a file through gzip with cat and save the compressed output under any name you like:

$ ls -l nodejs-compressing-demo/README.md
-rw-r--r--  1 a  a  13747 Feb 12 11:27 nodejs-compressing-demo/README.md

$ cat nodejs-compressing-demo/README.md | gzip > README.md.gz

$ ls -l README.md.gz
-rw-r--r--  1 a  a  4903 Feb 12 11:50 README.md.gz

Node.js gzip

Of course, we are not going to implement the gzip algorithm and tooling from scratch. In the Node.js world the basic libraries are already there, ready to use out of the box. This article uses the compressing module for all of the compression and decompression code.

Why choose compressing? Because it has solid code quality and unit-test coverage, is actively maintained, has a very friendly API, and also supports a streaming interface.
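
If you want to follow along, compressing (and the pump module used in the streaming examples below) can be installed from npm in the usual way; the exact versions are up to you:

$ npm install compressing pump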


Promise interface

const compressing = require('compressing');

// Select the gzip format, then call its compressFile method
compressing.gzip.compressFile('nodejs-compressing-demo/lib/urllib.js',
  'nodejs-compressing-demo/lib/urllib.js.gz')
  .then(() => {
    console.log('success');
  })
  .catch(err => {
    console.error(err);
  });

// Decompression is the mirror process; the interface is uniformly named uncompress
compressing.gzip.uncompress('nodejs-compressing-demo/lib/urllib.js.gz',
  'nodejs-compressing-demo/lib/urllib.js2')
  .then(() => {
    console.log('success');
  })
  .catch(err => {
    console.error(err);
  });

Combined with the async/await programming model, the code reads like ordinary asynchronous I/O operations:

const compressing = require('compressing');

async function main() {
  // Compress
  try {
    await compressing.gzip.compressFile('nodejs-compressing-demo/lib/urllib.js',
      'nodejs-compressing-demo/lib/urllib.js.gz');
    console.log('success');
  } catch (err) {
    console.error(err);
  }

  // Decompress
  try {
    await compressing.gzip.uncompress('nodejs-compressing-demo/lib/urllib.js.gz',
      'nodejs-compressing-demo/lib/urllib.js2');
    console.log('success');
  } catch (err) {
    console.error(err);
  }
}

main();

The Stream interface

Note that when programming in Stream mode you need to handle the error event on every stream and destroy all the streams manually:

const fs = require('fs');
const compressing = require('compressing');

// A minimal error handler shared by all streams below
function handleError(err) {
  console.error(err);
}

// Compress
fs.createReadStream('nodejs-compressing-demo/lib/urllib.js')
  .on('error', handleError)
  .pipe(new compressing.gzip.FileStream()) // It's a transform stream
  .on('error', handleError)
  .pipe(fs.createWriteStream('nodejs-compressing-demo/lib/urllib.js.gz2'))
  .on('error', handleError);

// Decompress works the same way in reverse
fs.createReadStream('nodejs-compressing-demo/lib/urllib.js.gz2')
  .on('error', handleError)
  .pipe(new compressing.gzip.UncompressStream()) // It's a transform stream
  .on('error', handleError)
  .pipe(fs.createWriteStream('nodejs-compressing-demo/lib/urllib.js3'))
  .on('error', handleError);

According to the official Backpressuring in Streams recommendation, we should use the pump module when programming with streams and let pump take care of cleaning up the streams.

const fs = require('fs');
const pump = require('pump');
const compressing = require('compressing');

// Compress
const source = fs.createReadStream('nodejs-compressing-demo/lib/urllib.js');
const target = fs.createWriteStream('nodejs-compressing-demo/lib/urllib.js.gz2');

pump(source, new compressing.gzip.FileStream(), target, err => {
  if (err) {
    console.error(err);
  } else {
    console.log('success');
  }
});

// Decompress
pump(fs.createReadStream('nodejs-compressing-demo/lib/urllib.js.gz2'),
  new compressing.gzip.UncompressStream(),
  fs.createWriteStream('nodejs-compressing-demo/lib/urllib.js3'),
  err => {
    if (err) {
      console.error(err);
    } else {
      console.log('success');
    }
  });
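
A side note beyond the original article: since Node.js 10 the built-in stream.pipeline provides error handling and cleanup similar to pump, so the same compression can be sketched without the extra dependency. A minimal sketch:

const fs = require('fs');
const { pipeline } = require('stream');
const compressing = require('compressing');

// Same gzip compression as above, with cleanup handled by the built-in pipeline
pipeline(
  fs.createReadStream('nodejs-compressing-demo/lib/urllib.js'),
  new compressing.gzip.FileStream(),
  fs.createWriteStream('nodejs-compressing-demo/lib/urllib.js.gz2'),
  err => {
    if (err) {
      console.error(err);
    } else {
      console.log('success');
    }
  }
);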

Advantages of the Stream interface

The Stream interface looks much more complex than the Promise interface, so why bother with it? In the field of HTTP services the Stream model has a big advantage: an HTTP request is itself a stream. If you want to gzip a file the user uploads and return it, the Stream interface lets you consume the upload stream directly instead of saving the uploaded file to local disk first.

Taking Egg's file-upload example code, a slight modification is enough to compress the upload with gzip and return it:

const pump = require('pump');
const compressing = require('compressing');

class UploadFormController extends Controller {
  // ... other codes
  async upload() {
    const stream = await this.ctx.getFileStream();
    // Assign the compressed stream to ctx.body directly,
    // so the response is streamed back while it is being compressed
    this.ctx.body = pump(stream, new compressing.gzip.FileStream());
  }
}
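
The same idea is not tied to Egg. Below is a minimal sketch of my own (not from the original example) using Node's built-in http module: whatever the client uploads is gzipped on the fly and streamed straight back, with no temporary file on disk.

const http = require('http');
const pump = require('pump');
const compressing = require('compressing');

// Gzip the request body on the fly and stream it back as the response
http.createServer((req, res) => {
  res.setHeader('Content-Type', 'application/octet-stream');
  pump(req, new compressing.gzip.FileStream(), res, err => {
    if (err) console.error(err);
  });
}).listen(3000);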

tar | gzip > tgz

As we learned in the gzip section, tar is the tool responsible for packaging folders 📦. For example, to package the entire nodejs-compressing-demo folder into a single file and send it to someone, you can use the tar command:

$ tar -c -f nodejs-compressing-demo.tar nodejs-compressing-demo/

$ ls -l nodejs-compressing-demo.tar
-rw-r--r--  1 a  a  206336 Feb 12 14:01 nodejs-compressing-demo.tar

As you can see, a tar package tends to be large because it is uncompressed; its size is close to the total size of the folder. So we usually compress at the same time as we pack.

$ tar -c -z -f nodejs-compressing-demo.tgz nodejs-compressing-demo/

$ ls -l nodejs-compressing-demo.tgz
-rw-r--r--  1 a  a  39808 Feb 12 14:07 nodejs-compressing-demo.tgz

The more than 5x size difference between the tar and tgz files can greatly reduce the network bandwidth needed to transfer them.

Node.js tgz

Promise interface

First use compressing.tar.compressDir(sourceDir, targetFile) to pack a folder into a tar file, then gzip that tar file into a tgz file the same way as above.

const compressing = require('compressing');

compressing.tar.compressDir('nodejs-compressing-demo', 'nodejs-compressing-demo.tar')
  .then(() => {
    return compressing.gzip.compressFile('nodejs-compressing-demo.tar',
      'nodejs-compressing-demo.tgz');
  })
  .then(() => {
    console.log('success');
  })
  .catch(err => {
    console.error(err);
  });

// Decompress
compressing.gzip.uncompress('nodejs-compressing-demo.tgz', 'nodejs-compressing-demo.tar')
  .then(() => {
    return compressing.tar.uncompress('nodejs-compressing-demo.tar',
      'nodejs-compressing-demo2');
  })
  .then(() => {
    console.log('success');
  })
  .catch(err => {
    console.error(err);
  });

Combined with the async/await programming model, the code is much easier to read:

const compressing = require('compressing');

async function main() {
  // Compress
  try {
    await compressing.tar.compressDir('nodejs-compressing-demo', 'nodejs-compressing-demo.tar');
    await compressing.gzip.compressFile('nodejs-compressing-demo.tar',
      'nodejs-compressing-demo.tgz');
    console.log('success');
  } catch (err) {
    console.error(err);
  }

  // Decompress
  try {
    await compressing.gzip.uncompress('nodejs-compressing-demo.tgz', 'nodejs-compressing-demo.tar');
    await compressing.tar.uncompress('nodejs-compressing-demo.tar', 'nodejs-compressing-demo2');
    console.log('success');
  } catch (err) {
    console.error(err);
  }
}

main();
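
The article does the two steps explicitly, which makes the mechanics clear. The compressing module also ships a tgz type that combines them; assuming that support is available in the version you install, the same result can likely be achieved in one call per direction:

const compressing = require('compressing');

async function main() {
  // Pack and gzip in one step (assumes compressing's tgz support)
  await compressing.tgz.compressDir('nodejs-compressing-demo', 'nodejs-compressing-demo.tgz');

  // Unpack a tgz directly into a folder
  await compressing.tgz.uncompress('nodejs-compressing-demo.tgz', 'nodejs-compressing-demo2');
  console.log('success');
}

main().catch(err => console.error(err));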

The Stream interface

The compressing.tar.Stream class lets you dynamically add any file or folder to a tar stream object, which is very flexible.

const fs = require('fs');
const pump = require('pump');
const compressing = require('compressing');

const tarStream = new compressing.tar.Stream();
// dir
tarStream.addEntry('dir/path/to/compress');
// file
tarStream.addEntry('file/path/to/compress');
// buffer
tarStream.addEntry(buffer);
// stream
tarStream.addEntry(stream);

const destStream = fs.createWriteStream('path/to/destination.tgz');
pump(tarStream, new compressing.gzip.FileStream(), destStream, err => {
  if (err) {
    console.error(err);
  } else {
    console.log('success');
  }
});
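
The article only shows the Promise interface for unpacking a tgz. A minimal sketch of the streaming direction, using only APIs already shown above, is to strip the gzip layer with pump first and then hand the resulting tar file to compressing.tar.uncompress (the paths are placeholders):

const fs = require('fs');
const pump = require('pump');
const compressing = require('compressing');

// Step 1: strip the gzip layer, leaving a plain tar file on disk
pump(fs.createReadStream('path/to/destination.tgz'),
  new compressing.gzip.UncompressStream(),
  fs.createWriteStream('path/to/destination.tar'),
  err => {
    if (err) return console.error(err);
    // Step 2: unpack the tar file into a folder
    compressing.tar.uncompress('path/to/destination.tar', 'path/to/dest-folder')
      .then(() => console.log('success'))
      .catch(err => console.error(err));
  });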

zip

Zip can be seen as a "commercial" combination of tar and gzip: it lets users compress both files and folders without having to care about the distinction between archiving and compressing.

Example of using the zip command line tool to compress a folder:

$ zip -r nodejs-compressing-demo.zip nodejs-compressing-demo/
  adding: nodejs-compressing-demo/ (stored 0%)
  adding: nodejs-compressing-demo/test/ (stored 0%)
  ...
  adding: nodejs-compressing-demo/.travis.yml (deflated 36%)

$ ls -l nodejs-compressing-demo.*
-rw-r--r--  1 a  a  206336 Feb 12 14:06 nodejs-compressing-demo.tar
-rw-r--r--  1 a  a   39808 Feb 12 14:07 nodejs-compressing-demo.tgz
-rw-r--r--  1 a  a   55484 Feb 12 14:34 nodejs-compressing-demo.zip

Comparing the tgz file size with the zip file size, you can see that gzip compresses better than zip under the default parameters.

Node.js zip

The code is similar to the tar example, except that zip compresses by default, so no separate gzip step is needed.

const compressing = require('compressing');

compressing.zip.compressDir('nodejs-compressing-demo', 'nodejs-compressing-demo.zip')
  .then(() => {
    console.log('success');
  })
  .catch(err => {
    console.error(err);
  });

// Decompress
compressing.zip.uncompress('nodejs-compressing-demo.zip', 'nodejs-compressing-demo3')
  .then(() => {
    console.log('success');
  })
  .catch(err => {
    console.error(err);
  });
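
Streaming zip creation should mirror the tar.Stream example above. A sketch under the assumption that compressing exposes a matching compressing.zip.Stream class with the same addEntry API:

const fs = require('fs');
const pump = require('pump');
const compressing = require('compressing');

// Assumed to mirror compressing.tar.Stream: add entries, then pipe to a destination
const zipStream = new compressing.zip.Stream();
zipStream.addEntry('dir/path/to/compress');
zipStream.addEntry('file/path/to/compress');

const destStream = fs.createWriteStream('path/to/destination.zip');
pump(zipStream, destStream, err => {
  if (err) {
    console.error(err);
  } else {
    console.log('success');
  }
});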

Conclusion

Is compression and decompression in Node.js easier than you expected? Thanks to the giant that is npm, writing it in code can be as simple as using a command-line tool. The Promise interface and the Stream interface each have the scenarios they are best suited to; which will you choose?

Now that you have compression and decompression at your fingertips, what kind of services and features will you build with them?

Hope you found this article useful ❤️