Instead of generating a ZIP file on your server and transferring it to the client, why not download the data and compress it in the browser?
I recently worked on a side project that generated reports based on user requests. For each request, our back end generates a report, uploads it to Amazon S3 storage, and returns its URL to the client. Because a report takes some time to generate, the output file is stored and the server caches its URL against the request parameters. If a user requests the same report again, the back end simply returns the URL of the existing file.
A few days ago, I got a new requirement: instead of a single file, users should be able to download a ZIP file containing hundreds of reports. The first solution I came up with was:
- Prepare the compressed file on the server
- Upload it to Amazon S3 storage
- Provide the client with a download URL
However, this solution has some disadvantages:
- The logic for generating ZIP files is complex. I would need to either generate all the files for each request, or mix reusing existing files with generating new ones. Both approaches seem complicated, would take time to process, and would require a lot of coding, testing, and maintenance later.
- It doesn’t take advantage of the functionality I’ve already built. Although each ZIP file is a different set of reports, most of the individual reports were likely generated by earlier requests. So while the ZIP files themselves are unlikely to be reusable, the individual files are. With the approach above, I would have to redo the whole process every time, which isn’t efficient.
- It takes a long time to generate a ZIP file. Since my back end is a single-threaded process, this operation could block other requests for a while, and the request itself might time out in the meantime.
- Tracking progress on the client side is difficult, and I like to show progress bars on the site. If everything is handled on the back end, I’d need to find another way to report status to the front end, which isn’t easy.
- I want to save on infrastructure costs. It would be great to move some computation to the front end and reduce infrastructure costs. My customers don’t mind waiting a few seconds or spending an extra MB of RAM on their laptops.
The final solution I came up with was to download all the files into the browser and then compress them there. In this article, I’ll show you how to do it.
Disclaimer: In this article, I assume that you already have a basic knowledge of JavaScript and promises. If you don’t, I suggest you get to know them first and then come back here 🙂
Downloading a single file
My system already allowed downloading a single report file before the new solution. There are many ways to do this: the back end can return the contents of the file directly in the HTTP response, or it can upload the file to separate storage and return the file URL. I chose the second method because I want to cache all the generated files.
Once you have the file URL, the job on the client side is simple: open the URL in a new tab. The browser does the rest to download the file.
const downloadViaBrowser = url => {
  window.open(url, '_blank');
};
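As a side note, a small variation (my own sketch, not part of the original solution) avoids opening a new tab by clicking a temporary anchor element. Keep in mind that the download attribute is only honoured for same-origin or blob: URLs, so it won't force a download from a different domain:

// A sketch (assumption, not the article's method): trigger the download
// through a temporary <a> element instead of opening a new tab.
// The download attribute only takes effect for same-origin or blob: URLs.
const downloadViaAnchor = url => {
  const a = document.createElement('a');
  a.href = url;
  a.download = '';            // hint: download instead of navigating
  document.body.appendChild(a);
  a.click();
  a.remove();
};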
Download multiple files and store them in memory
When downloading and compressing multiple files, we can no longer use the simple method above.
- If a JS script tries to open many links at once, the browser may treat it as a security threat (pop-up abuse) and warn the user before allowing it. The user can confirm to continue, but it’s not a good experience.
- You have no control over the downloaded files; the browser manages their content and location
A way around this problem is to use fetch to download the files and keep the data in memory as Blobs. We can then write each Blob to a file, or merge the Blob data into a ZIP file.
const download = url => {
  return fetch(url).then(resp => resp.blob());
};
This function returns a promise that resolves with a Blob. We can combine it with Promise.all() to download multiple files: Promise.all() starts all of the promises at once, resolving when every child promise resolves, and rejecting as soon as any one of them rejects.
const downloadMany = urls => {
  return Promise.all(urls.map(url => download(url)));
};
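One caveat: because Promise.all() rejects on the first failure, a single broken URL discards every other download. If you would rather keep the successful ones, Promise.allSettled() is a possible alternative (a sketch under that assumption, not what I ended up shipping):

// Sketch: keep the blobs that downloaded successfully and drop the failures.
// This uses the built-in Promise.allSettled(), available in modern browsers.
const downloadManySettled = urls => {
  return Promise.allSettled(urls.map(download)).then(results =>
    results
      .filter(result => result.status === 'fulfilled')
      .map(result => result.value)
  );
};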
Downloading files in groups
But what if we need to download a lot of files at once? Let’s say I have 1,000 files. Promise.all() may no longer be a good idea, because the code would send a thousand requests at the same time. There are many problems with this approach:
- The number of concurrent connections supported by the operating system and browser is limited, so the browser can only handle several requests at a time. The remaining requests are queued while their timeout clocks keep ticking. As a result, most of your requests would time out before they could even be sent.
- Sending too many requests at once can also overload the back end
The solution I considered was to split the files into groups. Say I have 1,000 files to download. Rather than starting all the downloads at once with Promise.all(), I’ll download 5 files at a time. After those 5 finish, I’ll start another batch, downloading 200 batches in total.
To do this, we could write some custom logic (see the plain-JS sketch further below). Or, an easier way, we can use a third-party library called Bluebird, which implements a number of useful promise functions. In this use case, I’ll use Promise.map(). Note that the Promise here is the custom promise class provided by the library, not the built-in Promise.
import Promise from 'bluebird';

const downloadByGroup = (urls, files_per_group = 5) => {
  return Promise.map(
    urls,
    async url => {
      return await download(url);
    },
    {concurrency: files_per_group}
  );
};
With the above implementation, the function takes an array of URLs and starts downloading them, keeping at most files_per_group downloads in flight at any moment. It returns a promise that resolves when all URLs have been downloaded and rejects if any of them fail.
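If you prefer not to pull in a dependency, here is a minimal plain-JavaScript sketch of the same grouping idea. Note that it runs batches strictly one after another, whereas Bluebird’s Promise.map() keeps a rolling window of concurrent requests in flight, so Bluebird is usually a bit faster:

// Plain-JS sketch of the batching idea (my assumption of equivalent logic):
// download the URLs in sequential batches of files_per_group.
const downloadByGroupVanilla = async (urls, files_per_group = 5) => {
  const results = [];
  for (let i = 0; i < urls.length; i += files_per_group) {
    const batch = urls.slice(i, i + files_per_group);
    // Download one batch concurrently, then move on to the next batch.
    results.push(...(await Promise.all(batch.map(download))));
  }
  return results;
};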
Creating a ZIP file
Now I’ve downloaded everything into memory. As mentioned above, the downloaded content is stored as Blobs. The next step is to create a compressed file from the Blob data.
import JsZip from 'jszip';
import FileSaver from 'file-saver';

const exportZip = blobs => {
  const zip = JsZip();
  // Add each blob to the archive under a generated name.
  blobs.forEach((blob, i) => {
    zip.file(`file-${i}.csv`, blob);
  });
  // Return the promise so callers know when the ZIP has been saved.
  return zip.generateAsync({type: 'blob'}).then(zipFile => {
    const currentDate = new Date().getTime();
    const fileName = `combined-${currentDate}.zip`;
    return FileSaver.saveAs(zipFile, fileName);
  });
};
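In the snippet above, every entry gets a generated name like file-0.csv. If you want to keep the original names instead, one option is to derive them from the URL path (a hypothetical helper, assuming each URL ends with the file name):

// Hypothetical helper (assumption, not from the original code): derive a
// file name from the URL path, falling back to an index-based name.
const fileNameFromUrl = (url, i) => {
  try {
    const pathname = new URL(url).pathname;
    const name = pathname.substring(pathname.lastIndexOf('/') + 1);
    return name || `file-${i}.csv`;
  } catch (e) {
    return `file-${i}.csv`;
  }
};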
The final code
Let’s put all the code together right here.
import Promise from 'bluebird';
import JsZip from 'jszip';
import FileSaver from 'file-saver';

const download = url => {
  return fetch(url).then(resp => resp.blob());
};

const downloadByGroup = (urls, files_per_group = 5) => {
  return Promise.map(
    urls,
    async url => {
      return await download(url);
    },
    {concurrency: files_per_group}
  );
};

const exportZip = blobs => {
  const zip = JsZip();
  blobs.forEach((blob, i) => {
    zip.file(`file-${i}.csv`, blob);
  });
  return zip.generateAsync({type: 'blob'}).then(zipFile => {
    const currentDate = new Date().getTime();
    const fileName = `combined-${currentDate}.zip`;
    return FileSaver.saveAs(zipFile, fileName);
  });
};

const downloadAndZip = urls => {
  return downloadByGroup(urls, 5).then(exportZip);
};
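A minimal usage sketch (the bucket URLs below are placeholders, not real endpoints):

// Example usage with placeholder URLs; substitute your own report links.
const urls = [
  'https://my-bucket.s3.amazonaws.com/reports/report-1.csv',
  'https://my-bucket.s3.amazonaws.com/reports/report-2.csv',
];

downloadAndZip(urls)
  .then(() => console.log('ZIP file saved'))
  .catch(err => console.error('Download or zip failed:', err));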
Conclusion
Leveraging the capabilities of the client side can sometimes be useful to reduce the workload and complexity of the back end.
Don’t send too many requests at once. You’ll have trouble on both the front end and the back end. Instead, break the work into small pieces.
I introduced three third-party libraries: bluebird, jszip, and file-saver. They work well for me and may be helpful to you 🙂
Source: levelup.gitconnected.com