This article is an updated version of my earlier post on multi-file resumable upload, chunked upload, instant upload (second pass), and the retry mechanism. For the original implementation, please see that article.
Know the what, and know the why
I'm sure many of you have dealt with file upload. Perhaps you've run into large files that take a long time to upload and fail again and again, forcing you to start over, which is very annoying. Let's first look at why uploads fail.
As far as I know, there are probably the following reasons:
- Server configuration: for example, PHP's default upload limit is 8M (post_max_size = 8M); if a single request body exceeds 8M, an exception occurs
- Request timeout: if the interface timeout is set to 10s, uploading a large file fails whenever the response takes longer than 10s
- Network fluctuation: an uncontrollable and fairly common factor
For these reasons, clever people came up with the idea of splitting a file into several small files and uploading them one by one, which is called chunked upload. Network fluctuation really is uncontrollable: one gust of wind and the connection drops. Since disconnects can't be prevented, the next best thing is to skip whatever has already been uploaded and send only the rest, which greatly speeds up re-uploading; hence the term "resumable upload" (breakpoint continuation). At this point someone in the crowd chimes in: "Some files I've uploaded before, why are they uploading again? Stop wasting my bandwidth and time." Well, that's easy: before each upload, check whether the file already exists on the server, and if it does, skip the upload entirely; that's "instant upload" (second pass). Since then, these three brothers have ruled the file-upload world together.
Note that the code in this article is not the production code; please go to GitHub for the latest code: github.com/pseudo-god.
Chunked upload
HTML
The native input is ugly, so here we overlay a styled button on top of it.
<div class="btns">
<el-button-group>
<el-button :disabled="changeDisabled">
<i class="el-icon-upload2 el-icon--left" size="mini"></i>Select file<input
v-if="!changeDisabled"
type="file"
:multiple="multiple"
class="select-file-input"
:accept="accept"
@change="handleFileChange"
/>
</el-button>
<el-button :disabled="uploadDisabled" @click="handleUpload()"><i class="el-icon-upload el-icon--left" size="mini"></i>Upload</el-button>
<el-button :disabled="pauseDisabled" @click="handlePause"><i class="el-icon-video-pause el-icon--left" size="mini"></i>Pause</el-button>
<el-button :disabled="resumeDisabled" @click="handleResume"><i class="el-icon-video-play el-icon--left" size="mini"></i>Resume</el-button>
<el-button :disabled="clearDisabled" @click="clearFiles"><i class="el-icon-video-play el-icon--left" size="mini"></i>Clear</el-button>
</el-button-group>
<slot></slot>
</div>

// data
var chunkSize = 10 * 1024 * 1024; // Chunk size
var fileIndex = 0; // Index of the file currently being traversed

data: () => ({
  container: { files: null },
  tempFilesArr: [], // Stores the files' information
  cancels: [], // Stores requests so they can be cancelled
  tempThreads: 3, // Default concurrency
  status: Status.wait
}),
A slightly nicer UI comes out.
Selecting files
During file selection several hooks are exposed; anyone familiar with elementUI will recognize them, as they behave basically the same way: onExceed fires when the number of files exceeds the limit, and beforeUpload fires before a file is uploaded.
fileIndex is important: because this is a multi-file upload, we need a way to locate the file currently being processed.
handleFileChange(e) {
const files = e.target.files;
if (!files) return;
Object.assign(this.$data, this.$options.data()); // Reset all of data
fileIndex = 0; // Reset the file subscript
this.container.files = files;
// Determine the number of files selected
if (this.limit && this.container.files.length > this.limit) {
this.onExceed && this.onExceed(files);
return;
}
// Copy a filelist object because the filelist is not editable
var index = 0; // Index of the selected file; used to keep the original file list aligned with the temporary list after a file is removed
for (const key in this.container.files) {
if (this.container.files.hasOwnProperty(key)) {
const file = this.container.files[key];
if (this.beforeUpload) {
const before = this.beforeUpload(file);
if (before) {
  this.pushTempFile(file, index);
}
}
if (!this.beforeUpload) {
  this.pushTempFile(file, index);
}
index++;
}
}
},

// Store into tempFilesArr; this code is split out of the hook above
pushTempFile(file, index) {
// Attach additional initial values
const obj = {
status: fileStatus.wait,
chunkList: [],
uploadProgress: 0,
hashProgress: 0,
index
};
for (const k in file) {
obj[k] = file[k];
}
console.log('pushTempFile -> obj', obj);
this.tempFilesArr.push(obj);
}
Uploading chunks
- Create chunks: loop over the file, slicing off one piece at a time
createFileChunk(file, size = chunkSize) {
const fileChunkList = [];
var count = 0;
while (count < file.size) {
fileChunkList.push({
file: file.slice(count, count + size)
});
count += size;
}
return fileChunkList;
}
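For a quick sanity check, here is a hypothetical run, treating createFileChunk as a standalone function and assuming the 10 MB chunkSize above:

```js
// Hypothetical usage: a 25 MB file split with the 10 MB default chunk size
// yields three chunks of 10 MB, 10 MB and 5 MB.
const file = new File([new ArrayBuffer(25 * 1024 * 1024)], 'demo.bin');
const chunks = createFileChunk(file);
console.log(chunks.length);       // 3
console.log(chunks[2].file.size); // 5242880 (the 5 MB remainder)
```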
- Loop to create chunks: since we support multiple files, an outer loop processes each file in turn, creating its chunks and then uploading them.
async handleUpload(resume) {
if (!this.container.files) return;
this.status = Status.uploading;
const filesArr = this.container.files;
var tempFilesArr = this.tempFilesArr;
for (let i = 0; i < tempFilesArr.length; i++) {
fileIndex = i;
// Create a slice
const fileChunkList = this.createFileChunk(
filesArr[tempFilesArr[i].index]
);
tempFilesArr[i].fileHash = 'xxxx'; // Placeholder
tempFilesArr[i].chunkList = fileChunkList.map(({ file }, index) => ({
fileHash: tempFilesArr[i].hash,
fileName: tempFilesArr[i].name,
index,
hash: tempFilesArr[i].hash + '-' + index,
chunk: file,
size: file.size,
uploaded: false,
progress: 0, // Upload progress of each chunk
status: 'wait' // Upload status, used to display progress state
}));
// Upload slices
await this.uploadChunks(this.tempFilesArr[i]);
  }
}
- The uploadChunks method is only responsible for constructing the data to be passed to the back end. The core uploading function is in the sendRequest method
async uploadChunks(data) {
var chunkData = data.chunkList;
const requestDataList = chunkData
.map(({ fileHash, chunk, fileName, index }) => {
const formData = new FormData();
formData.append('md5', fileHash);
formData.append('file', chunk);
formData.append('fileName', index); // The file name uses the subscript of the slice
return { formData, index, fileName };
});
try {
await this.sendRequest(requestDataList, chunkData);
} catch (error) {
// Upload is rejected
this.$message.error('Upload failed, consider trying again.' + error);
return;
}
// merge slices
const isUpload = chunkData.some(item => item.uploaded === false);
console.log('created -> isUpload', isUpload);
if (isUpload) {
alert('Failed slice exists');
} else {
// Perform the merge
await this.mergeRequest(data);
  }
}
- sendRequest: uploading is the most important part, and also where failures are most likely. With 10 chunks, simply firing 10 requests at once easily hits the browser's connection limit, so the requests need controlled concurrency.
- Concurrency: a for loop starts the initial pool of requests, and the handler function then calls itself to keep the pool full. Inside handler, the array method shift simulates a queue, popping the next chunk to upload.
- Retry: the retryArr array accumulates the retry count of each chunk request. For example, [1, 0, 2] means chunk 0 errored once and chunk 2 errored twice. const index = formInfo.index keeps the count attached to the right chunk; we simply carry the index inside the data. When a request fails, the failed chunk is pushed back onto the queue.
- I wrote a small demo about concurrency and retry; if anything is unclear you can study it yourself. The file address is github.com/pseudo-god. The retry code seems to have been lost on my end; a minimal sketch of the pattern follows below, and if there's demand I'll restore the demo!
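Since the demo's retry code was lost, here is a minimal standalone sketch of the queue-plus-retry pattern described above. It is not the component's actual code; the names sendAll, send, threads and maxRetry are my own, and each task is assumed to be of the shape { index, formData }.

```js
// Minimal sketch: a shared task queue, `threads` workers pulling from it,
// and per-task retry counting with a cap.
function sendAll(tasks, send, threads = 3, maxRetry = 3) {
  const retryArr = [];          // retry count per task index
  let finished = 0;
  const total = tasks.length;
  return new Promise((resolve, reject) => {
    const handler = () => {
      if (!tasks.length) return;            // nothing left to start
      const task = tasks.shift();           // dequeue the next task
      send(task)
        .then(() => {
          finished++;
          if (finished >= total) return resolve('done');
          handler();                        // keep the pool full
        })
        .catch(() => {
          retryArr[task.index] = (retryArr[task.index] || 0) + 1;
          if (retryArr[task.index] >= maxRetry) {
            return reject(new Error('chunk ' + task.index + ' failed too often'));
          }
          tasks.push(task);                 // requeue the failed task
          handler();
        });
    };
    for (let i = 0; i < threads; i++) handler(); // start the initial pool
  });
}
```

Called as, say, sendAll(queue, task => instance.post('fileChunk', task.formData)), it resolves once every chunk has succeeded and rejects after any chunk exhausts its retries.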
// Concurrent processing
sendRequest(forms, chunkData) {
var finished = 0;
const total = forms.length;
const that = this;
const retryArr = []; // Retry count per chunk; e.g. [1,0,2] means chunk 0 errored once and chunk 2 errored twice
return new Promise((resolve, reject) => {
const handler = () => {
if (forms.length) {
// Dequeue the next chunk
const formInfo = forms.shift();
const formData = formInfo.formData;
const index = formInfo.index;
instance.post('fileChunk', formData, {
onUploadProgress: that.createProgresshandler(chunkData[index]),
cancelToken: new CancelToken(c => this.cancels.push(c)),
timeout: 0
}).then(res => {
console.log('handler -> res', res);
// Change the state
chunkData[index].uploaded = true;
chunkData[index].status = 'success';
finished++;
handler();
})
.catch(e => {
// If paused, retry is prohibited
if (this.status === Status.pause) return;
if (typeof retryArr[index] !== 'number') {
retryArr[index] = 0;
}
// Update the status
chunkData[index].status = 'warning';
// Add up the number of errors
retryArr[index]++;
// Retry 3 times
if (retryArr[index] >= this.chunkRetry) {
return reject('Retry failed', retryArr);
}
this.tempThreads++; // Release the currently occupied channel
// Rejoin the failed queue
forms.push(formInfo);
handler();
});
}
if (finished >= total) {
  resolve('done');
}
};
// Control concurrency
for (let i = 0; i < this.tempThreads; i++) {
  handler();
}
});
}
- The upload progress of each chunk is tracked through axios's onUploadProgress event, combined with the createProgresshandler method
// Slice upload progress
createProgresshandler(item) {
return p => {
item.progress = parseInt(String((p.loaded / p.total) * 100));
this.fileProgress();
};
}
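The fileProgress method called above is never shown in this article. Here is a plausible sketch (my assumption, not the actual code) that rolls per-chunk progress up into the file-level uploadProgress as a size-weighted average:

```js
// Plausible sketch of fileProgress: the file's overall progress is the
// size-weighted average of its chunks' progress.
fileProgress() {
  const currentFile = this.tempFilesArr[fileIndex];
  if (!currentFile || !currentFile.chunkList.length) return;
  const loaded = currentFile.chunkList
    .map(item => item.size * (item.progress / 100))
    .reduce((acc, cur) => acc + cur, 0);
  currentFile.uploadProgress = parseInt(String((loaded / currentFile.size) * 100), 10);
}
```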
Hash computation
This is really just computing the file's MD5 value, which is used in several places throughout the project:
- Instant upload: the MD5 value is sent to check whether the file already exists on the server.
- Resumable upload: the MD5 value serves as the unique key for the chunk records.
This project computes the hash in a Web Worker, which greatly improves performance and responsiveness. Since there are multiple files, the hash progress must be reflected on each file, so the global fileIndex variable is used to locate the file currently being processed.
// Generate file hash (web-worker)
calculateHash(fileChunkList) {
return new Promise(resolve => {
this.container.worker = new Worker('./hash.js');
this.container.worker.postMessage({ fileChunkList });
this.container.worker.onmessage = e => {
const { percentage, hash } = e.data;
if (this.tempFilesArr[fileIndex]) {
  this.tempFilesArr[fileIndex].hashProgress = Number(percentage.toFixed(0));
}
if (hash) {
  resolve(hash);
}
};
});
}
Because the code runs in a Web Worker, we can't simply import the MD5 package from npm; you need to download the spark-md5.min.js file and pull it in with importScripts.
//hash.js
self.importScripts("/spark-md5.min.js"); // Import the script
// Generate the file hash
self.onmessage = e => {
const { fileChunkList } = e.data;
const spark = new self.SparkMD5.ArrayBuffer();
let percentage = 0;
let count = 0;
const loadNext = index => {
const reader = new FileReader();
reader.readAsArrayBuffer(fileChunkList[index].file);
reader.onload = e => {
count++;
spark.append(e.target.result);
if (count === fileChunkList.length) {
self.postMessage({
percentage: 100,
hash: spark.end()
});
self.close();
} else {
percentage += 100 / fileChunkList.length;
self.postMessage({ percentage });
loadNext(count);
}
};
};
loadNext(0);
};
File merging
When all the chunks are uploaded, the files need to be merged. On the front end we only need to call the merge endpoint.
mergeRequest(data) {
const obj = {
md5: data.fileHash,
fileName: data.name,
fileChunkNum: data.chunkList.length
};
instance.post('fileChunk/merge', obj,
{
timeout: 0
})
.then(res => {
this.$message.success('Upload successful');
});
}
Done: the chunked upload feature is complete.
Resumable upload
As the name implies, resume from wherever the upload broke off; the idea is simple. There are two ways to do it: either the server tells the browser where to resume, or the browser keeps track itself. Both approaches have pros and cons; this project uses the second.
The file's hash serves as the key. Each successfully uploaded chunk is recorded, and when resuming, recorded chunks are simply skipped. This project stores the records in localStorage, wrapped in the pre-packaged addChunkStorage and getChunkStorage methods (a minimal sketch of these follows below).
Data stored in storage
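The two helpers are not listed in the article; here is a minimal sketch of what they might look like (the 'chunk_' key prefix is my assumption, not the component's actual storage format):

```js
// Minimal sketch of the pre-packaged localStorage helpers.
addChunkStorage(fileHash, index) {
  const key = 'chunk_' + fileHash;
  const record = JSON.parse(localStorage.getItem(key)) || [];
  if (!record.includes(index)) {
    record.push(index);                       // remember this chunk index
    localStorage.setItem(key, JSON.stringify(record));
  }
},
getChunkStorage(fileHash) {
  // Returns the array of uploaded chunk indexes, or null if nothing is stored
  return JSON.parse(localStorage.getItem('chunk_' + fileHash));
}
```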
Cache handling
In the axios success callback of the chunk upload, store the index of the successfully uploaded chunk:
instance.post('fileChunk', formData, { /* config as before */ })
  .then(res => {
// Store the uploaded slice subscript
+ this.addChunkStorage(chunkData[index].fileHash, index);
handler();
})
Before uploading the chunks, check localStorage and set each chunk's initial state:
async handleUpload(resume) {
+ const getChunkStorage = this.getChunkStorage(tempFilesArr[i].hash);
  tempFilesArr[i].chunkList = fileChunkList.map(({ file }, index) => ({
+   uploaded: getChunkStorage && getChunkStorage.includes(index), // Flag: whether the chunk is already uploaded
+   progress: getChunkStorage && getChunkStorage.includes(index) ? 100 : 0,
+   status: getChunkStorage && getChunkStorage.includes(index)
+     ? 'success'
+     : 'wait' // Upload status, used to display progress state
}));
}
Then, when uploading, filter out the chunks that are already uploaded:
async uploadChunks(data) {
var chunkData = data.chunkList;
const requestDataList = chunkData
+ .filter(({ uploaded }) => !uploaded)
  .map(({ fileHash, chunk, fileName, index }) => {
const formData = new FormData();
formData.append('md5', fileHash);
formData.append('file', chunk);
formData.append('fileName', index); // The file name uses the subscript of the slice
return { formData, index, fileName };
});
}
Garbage file cleanup
As uploads accumulate, so do junk files: some uploads are abandoned halfway, others fail outright, and the orphaned chunk files pile up. So far I have two candidate solutions in mind (a rough sketch of the first follows this list):
- The front end sets an expiry time in localStorage; once it is exceeded, it sends a request asking the back end to clean up the chunk files, and clears its own cache at the same time.
- Front and back ends agree that each cache entry lives at most 12 hours after it is created and is cleared automatically afterwards.
Both schemes seem to have a problem: a clock difference between the front and back ends could make chunk uploads break unexpectedly. I'll update this with a proper solution later.
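To make the first idea concrete, here is a rough sketch. It assumes each record also carries a createdAt timestamp and that a fileChunk/clear endpoint exists on the back end; neither is part of the actual project.

```js
// Rough sketch of scheme 1: purge expired chunk records on the front end and
// ask the back end to delete the corresponding chunk files.
const CACHE_TTL = 12 * 60 * 60 * 1000; // agreed 12-hour lifetime

function purgeExpiredChunkRecords() {
  Object.keys(localStorage)
    .filter(key => key.startsWith('chunk_'))
    .forEach(key => {
      const record = JSON.parse(localStorage.getItem(key));
      if (record && Date.now() - record.createdAt > CACHE_TTL) {
        localStorage.removeItem(key); // clear the front-end cache
        // Hypothetical endpoint asking the server to remove its chunk files
        instance.post('fileChunk/clear', { md5: key.slice('chunk_'.length) });
      }
    });
}
```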
Instant upload (second pass)
It's the easiest feature, but it sounds the most impressive. The principle: compute the hash of the entire file, then, before uploading, send a request carrying the MD5 value. The back end looks the file up; if it already exists on the server, all subsequent steps are skipped and the upload completes instantly. As shown below:
async handleUpload(resume) {
if (!this.container.files) return;
const filesArr = this.container.files;
var tempFilesArr = this.tempFilesArr;
for (let i = 0; i < tempFilesArr.length; i++) {
const fileChunkList = this.createFileChunk(
filesArr[tempFilesArr[i].index]
);
// Hash check, whether it is transmitted in seconds
+ tempFilesArr[i].hash = await this.calculateHash(fileChunkList);
+ const verifyRes = await this.verifyUpload(
+ tempFilesArr[i].name,
+ tempFilesArr[i].hash
+ );
+ if (verifyRes.data.presence) {
+ tempFilesArr[i].status = fileStatus.secondPass;
+ tempFilesArr[i].uploadProgress = 100;
+ } else {
console.log('Start uploading sliced files ----', tempFilesArr[i].name);
await this.uploadChunks(this.tempFilesArr[i]);
}
}
}
// Check the file before uploading: check whether the file exists
verifyUpload(fileName, fileHash) {
return new Promise(resolve => {
const obj = {
md5: fileHash,
fileName,
...this.uploadArguments // Pass through other parameters
};
instance
.post('fileChunk/presence', obj)
.then(res => {
resolve(res.data);
})
.catch(err => {
console.log('verifyUpload -> err', err);
});
});
}
Done: that's instant upload.
Back-end processing
The article is getting a bit long, so I won't post the specific back-end code for now; leave a message if you'd like it and I'll update when I have time.
Node version
Please go to github.com/pseudo-god…. to view. (A minimal sketch of the merge endpoint follows below.)
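For reference, here is a minimal Node/Express-style sketch of what the fileChunk/merge endpoint could look like. This is not the repo's actual implementation; the uploads/<md5>/<index> layout and field handling are assumptions.

```js
const fs = require('fs');
const path = require('path');

// Sketch: merge chunks saved as uploads/<md5>/<index> into the final file.
async function mergeChunks(req, res) {
  const { md5, fileName, fileChunkNum } = req.body;
  const chunkDir = path.join(__dirname, 'uploads', md5);
  const chunkNames = fs.readdirSync(chunkDir);
  if (chunkNames.length !== fileChunkNum) {
    return res.status(400).json({ message: 'incorrect number of chunks' });
  }
  chunkNames.sort((a, b) => a - b); // file names are chunk indexes
  const target = path.join(__dirname, 'uploads', fileName);
  for (const name of chunkNames) {
    fs.appendFileSync(target, fs.readFileSync(path.join(chunkDir, name)));
  }
  fs.rmSync(chunkDir, { recursive: true, force: true }); // clean up chunks
  res.json({ message: 'merged' });
}
```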
Java version
Should be updated next week.
PHP version
I haven't written PHP in over a year, but I'll add it when I have time.
To be improved
- Chunk size: compute this dynamically later, deriving an appropriate chunk size from the current file's size so as to avoid producing too many chunks (see the sketch after this list).
- Appending files: files cannot currently be added to the queue while an upload is in progress. (I have no idea yet how to handle this.)
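One possible approach to the dynamic chunk size (my own sketch, not implemented in the component; the bounds and target count are arbitrary):

```js
// Sketch: pick a chunk size that keeps the chunk count bounded, clamped
// between a minimum and maximum chunk size.
function computeChunkSize(fileSize) {
  const MIN = 2 * 1024 * 1024;   // 2 MB floor
  const MAX = 50 * 1024 * 1024;  // 50 MB ceiling
  const TARGET_CHUNKS = 100;     // aim for at most ~100 chunks per file
  const size = Math.ceil(fileSize / TARGET_CHUNKS);
  return Math.min(MAX, Math.max(MIN, size));
}
```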
Changelog
The component has been running for a while now, and testing has surfaced several problems. I thought it was bug-free, but the bugs turned out to be quite serious.
Bug-1: Upload failure occurs when multiple files with the same content but different file names are uploaded at the same time.
Expected result: After the first file is successfully uploaded, subsequent same files are directly uploaded in seconds
Actual result: After the first file is uploaded successfully, the other identical files fail with an "incorrect number of chunks" error.
Cause: As soon as the first file's chunks finish uploading, the loop starts on the next file, before the server can report that the file is now eligible for instant upload, so the upload fails.
Solution: After the current file fragment has been uploaded and the merging interface has been requested, the next loop is performed.
That is, make the sub-methods mergeRequest and uploadChunks synchronous (awaited).
Bug-2: When the same file is selected twice in a row, the second selection does not fire the input's change event, so the beforeUpload hook never triggers and the whole flow breaks.
Cause: The data of the last selected input file was not cleared each time the file was selected. The change event of input will not be triggered if the data is the same.
Solution: Clear the data every time you click input. I optimized the other code along the way, see the commit record for details.
<input
  v-if="!changeDisabled"
  type="file"
  :multiple="multiple"
  class="select-file-input"
  :accept="accept"
+ onclick="f.outerHTML=f.outerHTML"
  @change="handleFileChange"
/>
Rewrote the pause and resume functionality; essentially this meant adding explicit pause and resume states.
The previous logic was too crude and caused many problems. The state now lives on each file, so files that are already done can be skipped when uploads resume.
Packaging the component
That was a lot of code, but you don't have to copy it all: I've packaged it as a component. You can download it from GitHub, where there are usage examples. If it's useful, please remember to give it a star, thanks!
I'll be lazy and not list the packaging code itself; download the files to look through it, and leave a message if anything is unclear. A hypothetical usage sketch follows.
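Here is a hypothetical usage sketch based on the attribute table below; the <chunk-upload> tag name and slot contents are assumptions, so check the repo's examples for the real registration:

```html
<chunk-upload
  :headers="{ Authorization: token }"
  :before-upload="beforeUpload"
  accept=".zip,.mp4"
  :upload-arguments="{ dir: 'videos' }"
  :limit="5"
  base-url="/api"
  :chunk-size="10 * 1024 * 1024"
  :threads="3"
  :chunk-retry="3"
>
  <template #header><!-- extra buttons --></template>
  <template #tip>Uploads are chunked, resumable and deduplicated</template>
</chunk-upload>
```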
Component documentation
Attributes
Parameter | Type | Description | Default | Note
---|---|---|---|---
headers | Object | Sets the request headers | |
before-upload | Function | Hook called before a file is uploaded; returning false stops the upload | |
accept | String | Accepted upload file types | |
upload-arguments | Object | Extra parameters carried with the upload | |
with-credentials | Boolean | Whether to send cookies | false |
limit | Number | Maximum number of files allowed | 0 | 0 means no limit
on-exceed | Function | Hook called when the file count exceeds the limit | |
multiple | Boolean | Whether multiple selection is enabled | true |
base-url | String | This component has axios built in; if you need to go through a proxy, configure your base path here | |
chunk-size | Number | Size of each chunk | 10M |
threads | Number | Number of concurrent requests | 3 | Higher concurrency puts more load on the server; prefer the default
chunk-retry | Number | Error retry count | 3 | Retry count for a failed chunk request
Slot
Name | Description | Parameter | Note
---|---|---|---
header | Button area | none |
tip | Prompt text | none |
Back-end interface documentation: implement the endpoints according to the docs.
Code address: github.com/pseudo-god….
Interface docs: docs.apipost.cn/view/…