Business requirements
Product Manager: Xiaoming, we need to build an attachment upload feature. The content may be pictures, PDFs, or videos.
Xiaoming: That can be done, but we should limit the file size. It had better not exceed 30MB: larger uploads are slow and put more pressure on the server.
Product Manager: After some discussion, video is a must. Let's limit it to 50MB or less.
Xiaoming: Sure.
Student: This file upload is too slow. I tried a 50MB file and it took a minute.
Xiaoming: What's going on? Why is it so slow?
Product Manager: This won't do, the upload takes too long. Try to optimize it.
Optimization path
Locating the problem
The overall call chain for a file upload is: front end, then back-end service, then file service.
Xiaoming found that it took nearly 30 seconds for the front end's upload request to reach the back end, probably because of the time the browser spends parsing the file.
The back-end service's requests to the file service were also slow.
The solution
Xiaoming: Does the file service have an asynchronous interface?
File service: Not at the moment.
Xiaoming: This upload is really slow. Any suggestions for optimizing it?
File service: No, that's just how slow it is.
Xiaoming: …
In the end, Xiaoming decided to change the back end's synchronous response to an asynchronous one to reduce the user's waiting time.
The back-end implementation was adjusted to fit the business: the front end receives an asynchronous identifier from the upload call, and the back end uses that identifier to look up the result that the file service returned synchronously.
The disadvantage is also obvious: if the asynchronous upload fails, the user is not aware of it.
But time was short, so after weighing the pros and cons, this went live as a temporary solution.
Xiaoming has had some free time recently, so he wants to implement a file service himself.
File service
Because the existing file service is very primitive, Xiaoming wanted to implement one himself and optimize it in the following respects:
(1) Compression
(2) Asynchronous processing
(3) Instant upload
(4) Concurrency
(5) Direct connection
Compression
In day-to-day development, communicate clearly with the product team where possible, and let users upload and download compressed archives.
Network transmission is very time consuming, so sending fewer bytes saves time.
Compressed files also save storage space, a cost we don't usually consider.
Pros: simple to implement, and very effective.
Cons: it needs to be aligned with the business, and the product team must be convinced. If the product wants images to be previewable and videos to be playable, compression is not very suitable.
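As a rough illustration of the compression idea, here is a minimal sketch that uses the JDK's built-in GZIP streams to shrink a payload before upload and restore it after download. The class name GzipSketch and its method names are hypothetical and not part of any file service described here; readAllBytes assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: GZIP-compress bytes before upload, decompress after download. */
final class GzipSketch {

    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so close it before reading the buffer
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        }
        return buffer.toByteArray();
    }

    static byte[] decompress(byte[] compressed) throws IOException {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gzip.readAllBytes(); // Java 9+
        }
    }
}
```

Repetitive content such as text or logs shrinks a lot with this approach. Images and video are usually already compressed, so the gain for them is likely to be small, which is one more reason to check the file types with the product team first.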
Asynchronous
For time-consuming operations, we naturally want to perform them asynchronously to reduce the time the user spends waiting synchronously.
When the server receives the file content, it returns a request id and executes the processing logic asynchronously.
So how does the caller get the result?
There are two common schemes:
(1) Provide a result query interface
This is relatively simple, but some queries are wasted because the result is not ready yet.
(2) Provide an asynchronous result callback
This is more troublesome to implement, but the caller gets the result as soon as it is available.
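The two schemes above can be sketched together in a few lines. This is a minimal in-memory illustration, not the real file service: the class AsyncUploadSketch, its Status enum, and the store placeholder are all assumptions made for the example. submit returns a request id immediately, query covers scheme (1), and the optional callback covers scheme (2).

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

/** Hypothetical in-memory sketch of an asynchronous upload endpoint. */
final class AsyncUploadSketch {
    enum Status { PROCESSING, SUCCESS, FAILED, UNKNOWN }

    private final Map<String, Status> results = new ConcurrentHashMap<>();
    private final ExecutorService worker = Executors.newFixedThreadPool(4);

    /** Returns a request id at once; the slow work runs on the worker pool. */
    String submit(byte[] content, BiConsumer<String, Status> callback) {
        String requestId = UUID.randomUUID().toString();
        results.put(requestId, Status.PROCESSING);
        worker.submit(() -> {
            Status status;
            try {
                store(content);
                status = Status.SUCCESS;
            } catch (RuntimeException e) {
                status = Status.FAILED;
            }
            results.put(requestId, status);
            // Scheme (2): push the result to the caller as soon as it is known
            if (callback != null) {
                callback.accept(requestId, status);
            }
        });
        return requestId;
    }

    /** Scheme (1): the caller polls with the request id. */
    Status query(String requestId) {
        return results.getOrDefault(requestId, Status.UNKNOWN);
    }

    /** Placeholder for the real storage call; rejects empty content to show a failure. */
    private void store(byte[] content) {
        if (content.length == 0) {
            throw new IllegalArgumentException("empty file");
        }
    }

    /** Stops accepting work and waits briefly for running uploads to finish. */
    void shutdown() throws InterruptedException {
        worker.shutdown();
        worker.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

Recording FAILED under the request id is what lets the front end eventually tell the user that an asynchronous upload did not succeed, instead of failing silently.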
Instant upload
You have probably used a cloud drive. Sometimes a very large file is uploaded to a cloud drive and the upload completes instantly.
How does this work?
Each file's content corresponds to a file hash value that is, for practical purposes, unique.
Before uploading, we can query whether the hash value already exists. If it does, we can simply add a reference to the existing file and skip the file transfer step.
Of course, this only pays off if your users' files are large in volume and have a certain repetition rate.
The pseudocode is as follows:
public FileUploadResponse uploadByHash(final String fileName,
                                       final String fileBase64,
                                       final boolean asyncFlag) {
    FileUploadResponse response = new FileUploadResponse();

    // Check whether a file with the same content hash already exists
    String fileHash = Md5Util.md5(fileBase64);
    FileInfoExistsResponse fileInfoExistsResponse = fileInfoExists(fileHash);
    if (!RespCodeConst.SUCCESS.equals(fileInfoExistsResponse.getRespCode())) {
        // The existence check itself failed, so pass its error back to the caller
        response.setRespCode(fileInfoExistsResponse.getRespCode());
        response.setRespMessage(fileInfoExistsResponse.getRespMessage());
        return response;
    }

    Boolean exists = fileInfoExistsResponse.getExists();
    FileUploadByHashRequest request = new FileUploadByHashRequest();
    request.setFileName(fileName);
    request.setFileHash(fileHash);
    request.setAsyncFlag(asyncFlag);

    // Only send the file content when the server does not have it yet
    if (!Boolean.TRUE.equals(exists)) {
        request.setFileBase64(fileBase64);
    }

    // Call the server
    return fillAndCallServer(request, "api/file/uploadByHash", FileUploadResponse.class);
}
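The pseudocode relies on an Md5Util helper that is not shown. As one possible way to compute a content hash, here is a minimal sketch built on the JDK's MessageDigest. It swaps in SHA-256 instead of MD5 and streams the content rather than hashing a Base64 string, so a large file never has to sit in memory as a whole. The class name ContentHashSketch is hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that computes a content hash for the existence check. */
final class ContentHashSketch {

    /** Streams the content through SHA-256 and returns the digest as lowercase hex. */
    static String sha256Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Two files with identical bytes produce the same hex string, which is what allows the server to add a reference instead of receiving the content again. The client and the server must of course agree on the algorithm and on what exactly is hashed.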
Concurrency
Another approach is to split a large file into shards.
For example, a 100MB file is cut into 10 sub-files that are uploaded concurrently. Each file corresponds to a unique batch number.
To download, fetch the sub-files concurrently by batch number and splice them into the complete file.
The pseudocode is as follows:
public FileUploadResponse concurrentUpload(final String fileName,
                                           final String fileBase64) throws InterruptedException {
    // Split the content into about 10 segments
    int limitSize = Math.max(1, fileBase64.length() / 10);
    final List<String> segments = StringUtil.splitByLength(fileBase64, limitSize);

    // Upload the segments concurrently
    int size = segments.size();
    final ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
    final CountDownLatch lock = new CountDownLatch(size);
    for (int i = 0; i < size; i++) {
        final int index = i;
        Thread t = new Thread() {
            @Override
            public void run() {
                try {
                    // Upload segments.get(index) and record its result in map under index
                } finally {
                    // Count down even if the upload fails, so await() cannot hang
                    lock.countDown();
                }
            }
        };
        t.start();
    }

    // Wait for completion
    lock.await();

    // Process the upload results in map and build the FileUploadResponse to return
}
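To make the split, upload, and splice steps concrete, here is a minimal self-contained sketch. It works on raw bytes rather than a Base64 string and uses a thread pool in place of one thread per segment. The class ChunkedTransferSketch and the "batchNo/index" key format are assumptions for illustration, and the in-memory map only stands in for a real file service.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of sharded upload and download keyed by a batch number. */
final class ChunkedTransferSketch {
    // Stand-in for the file service: "batchNo/index" -> chunk bytes
    private final Map<String, byte[]> store = new ConcurrentHashMap<>();

    /** Splits data into chunks of at most chunkSize bytes. */
    static List<byte[]> split(byte[] data, int chunkSize) {
        List<byte[]> chunks = new ArrayList<>();
        for (int from = 0; from < data.length; from += chunkSize) {
            chunks.add(Arrays.copyOfRange(data, from, Math.min(data.length, from + chunkSize)));
        }
        return chunks;
    }

    /** Uploads all chunks in parallel and returns the chunk count. */
    int upload(String batchNo, byte[] data, int chunkSize) throws Exception {
        List<byte[]> chunks = split(data, chunkSize);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < chunks.size(); i++) {
                final int index = i;
                pending.add(pool.submit(() -> {
                    store.put(batchNo + "/" + index, chunks.get(index));
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // rethrows if a chunk upload failed
            }
        } finally {
            pool.shutdown();
        }
        return chunks.size();
    }

    /** Splices the chunks back together in index order. */
    byte[] download(String batchNo, int chunkCount) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < chunkCount; i++) {
            byte[] chunk = store.get(batchNo + "/" + i);
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }
}
```

The download here reads the chunks one after another to keep the sketch short. With a remote service the fetches could run in parallel as well, as long as the final splice follows the index order.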
Direct connection
Another strategy is for the client to access the file service directly and skip the back-end service.
This requires the file service to provide an HTTP file upload interface.
You also need to consider security. It is better for the front end to call the back end first to obtain an authorization token, and then carry that token when uploading files.
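One possible shape for such a token is a short-lived value signed with a secret shared by the back end and the file service. The sketch below is an assumption for illustration, not a description of any existing service: the class UploadTokenSketch and the "userId:expiresAtMillis:signature" layout are made up for this example, and the signature uses HMAC-SHA256 from the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch of an expiring, HMAC-signed upload token. */
final class UploadTokenSketch {
    private final byte[] secret;

    UploadTokenSketch(byte[] secret) {
        this.secret = secret.clone();
    }

    /** Back end: issues "userId:expiresAtMillis:signature". */
    String issue(String userId, long expiresAtMillis) throws Exception {
        String payload = userId + ":" + expiresAtMillis;
        return payload + ":" + sign(payload);
    }

    /** File service: accepts only unexpired tokens with a valid signature. */
    boolean verify(String token, long nowMillis) throws Exception {
        int cut = token.lastIndexOf(':');
        if (cut < 0) {
            return false;
        }
        String payload = token.substring(0, cut);
        String signature = token.substring(cut + 1);
        // Constant-time comparison of the expected and the presented signature
        boolean signatureOk = MessageDigest.isEqual(
                sign(payload).getBytes(StandardCharsets.UTF_8),
                signature.getBytes(StandardCharsets.UTF_8));
        if (!signatureOk) {
            return false;
        }
        long expiresAt = Long.parseLong(payload.substring(payload.lastIndexOf(':') + 1));
        return nowMillis < expiresAt;
    }

    private String sign(String payload) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(secret, "HmacSHA256"));
        byte[] raw = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }
}
```

In this design the back end calls issue after authenticating the user, the front end sends the token along with the upload, and the file service calls verify before accepting any bytes. The secret never reaches the browser.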
Further reading
4 ways to improve file upload Performance, can you?
7 ways to realize asynchronous query to synchronization
Java Compression archive algorithm framework tool COMPRESS
Summary
File upload is a very common business requirement, and upload performance is definitely something to consider and optimize.
The methods above can be combined flexibly to suit your own business.
I hope this article is helpful to you. If you like it, please bookmark and share it.
I am Old Ma, and I look forward to meeting you next time.