This article is an updated version of my earlier post on multi-file resumable upload, chunked upload, instant upload, and the retry mechanism. For the original implementation, please see that article.
Know what it is, and why it is
Most of us have dealt with file uploads, and probably also with the pain of uploading a large file: the upload takes a long time, fails often, and every failure means uploading again from scratch, which is maddening. Let's first look at the causes of those failures!
As far as I know, there are probably the following reasons:
- Server configuration: for example, PHP's default upload limit is 8M (post_max_size = 8M); putting more than 8M of content in one request body causes an error.
- Request timeout: if the interface timeout is set to 10s, uploading a large file fails whenever the request takes longer than 10s.
- Network fluctuation: an uncontrollable and very common factor.
For these reasons, clever people came up with the idea of splitting a file into several small pieces and uploading them one by one: this is chunked upload.

Network fluctuation really is out of our control; a gust of wind and the connection drops. Since disconnects cannot be prevented, the next best thing is to skip the parts of the file that were already uploaded, which greatly speeds up re-uploading. Hence the term "resumable upload" (breakpoint continuation).

At this point someone in the crowd chimes in: "Some files I have already uploaded before. Why upload them again and waste my bandwidth and time?" Fair enough, and it is easy to solve: before each upload, check whether the file already exists on the server; if it does, skip the upload entirely. This is "instant upload" (second pass). Together, these three brothers have taken over the file-upload world.
Note that the code in this article is not the final code; please see the latest code on GitHub: github.com/pseudo-god/…

Chunked upload
HTML
The native input element is ugly, so here we overlay it on a styled button.
```html
<div class="btns">
  <el-button-group>
    <el-button :disabled="changeDisabled">
      <i class="el-icon-upload2 el-icon--left" size="mini"></i>Select file<input
        v-if="!changeDisabled"
        type="file"
        :multiple="multiple"
        class="select-file-input"
        :accept="accept"
        @change="handleFileChange"
      />
    </el-button>
    <el-button :disabled="uploadDisabled" @click="handleUpload()"><i class="el-icon-upload el-icon--left" size="mini"></i>Upload</el-button>
    <el-button :disabled="pauseDisabled" @click="handlePause"><i class="el-icon-video-pause el-icon--left" size="mini"></i>Pause</el-button>
    <el-button :disabled="resumeDisabled" @click="handleResume"><i class="el-icon-video-play el-icon--left" size="mini"></i>Resume</el-button>
    <el-button :disabled="clearDisabled" @click="clearFiles"><i class="el-icon-video-play el-icon--left" size="mini"></i>Clear</el-button>
  </el-button-group>
</div>
```
```js
var chunkSize = 10 * 1024 * 1024; // slice size
var fileIndex = 0; // index of the file currently being processed

data: () => ({
  container: {
    files: null
  },
  tempFilesArr: [], // stores file information
  cancels: [], // stores cancel tokens for in-flight requests
  tempThreads: 3, // number of concurrent requests
  status: Status.wait // default state
}),
```
A slightly nicer UI comes out.
File selection

Several hooks are exposed during file selection; anyone familiar with Element UI will recognize them, since they are basically the same. onExceed: the hook for when the number of files exceeds the limit; beforeUpload: the hook called before a file is uploaded.

fileIndex is important because this is a multi-file upload: it is what locates the file that is currently being uploaded.
```js
handleFileChange(e) {
  const files = e.target.files;
  if (!files) return;
  Object.assign(this.$data, this.$options.data()); // reset all data
  fileIndex = 0; // reset the file index
  this.container.files = files;
  // limit the number of selected files
  if (this.limit && this.container.files.length > this.limit) {
    this.onExceed && this.onExceed(files);
    return;
  }
  // copy the FileList into a temporary list; the index in the temporary list
  // does not necessarily match the index in the original FileList
  var index = 0;
  for (const key in this.container.files) {
    if (this.container.files.hasOwnProperty(key)) {
      const file = this.container.files[key];
      if (this.beforeUpload) {
        const before = this.beforeUpload(file);
        if (before) {
          this.pushTempFile(file, index);
        }
      }
      if (!this.beforeUpload) {
        this.pushTempFile(file, index);
      }
      index++;
    }
  }
},
// store a copy of the file plus some extra state
pushTempFile(file, index) {
  const obj = {
    status: fileStatus.wait,
    chunkList: [],
    uploadProgress: 0,
    hashProgress: 0,
    index
  };
  for (const k in file) {
    obj[k] = file[k];
  }
  console.log('pushTempFile -> obj', obj);
  this.tempFilesArr.push(obj);
}
```
Creating and uploading slices

- Create the slices: loop over the file, cutting it into fixed-size pieces.
```js
createFileChunk(file, size = chunkSize) {
  const fileChunkList = [];
  var count = 0;
  while (count < file.size) {
    fileChunkList.push({
      file: file.slice(count, count + size)
    });
    count += size;
  }
  return fileChunkList;
}
```
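The slicing logic only depends on `size` and `slice`, so it can be sanity-checked against any Blob-like stub. The standalone version below is a hypothetical extraction of the method above, not code from the component:

```javascript
// Standalone copy of the slicing loop; `file` is anything that has
// `size` and `slice(start, end)`, e.g. a File or Blob in the browser.
const CHUNK_SIZE = 10 * 1024 * 1024; // 10 MB, same as the article's chunkSize

function createFileChunk(file, size = CHUNK_SIZE) {
  const fileChunkList = [];
  let count = 0;
  while (count < file.size) {
    // Blob.slice clamps the end index, so the last chunk is simply shorter
    fileChunkList.push({ file: file.slice(count, count + size) });
    count += size;
  }
  return fileChunkList;
}
```

A 25 MB stub therefore produces three chunks, the last one 5 MB.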
- Loop to create the slices: since multiple files are supported, we iterate over the files, creating the slices for each one in turn and then uploading them.
```js
async handleUpload(resume) {
  if (!this.container.files) return;
  this.status = Status.uploading;
  const filesArr = this.container.files;
  var tempFilesArr = this.tempFilesArr;

  for (let i = 0; i < tempFilesArr.length; i++) {
    fileIndex = i;
    // create the slices for the current file
    const fileChunkList = this.createFileChunk(filesArr[tempFilesArr[i].index]);
    tempFilesArr[i].fileHash = 'xxxx'; // placeholder; the real hash is added later
    tempFilesArr[i].chunkList = fileChunkList.map(({ file }, index) => ({
      fileHash: tempFilesArr[i].hash,
      fileName: tempFilesArr[i].name,
      index,
      hash: tempFilesArr[i].hash + '-' + index,
      chunk: file,
      size: file.size,
      uploaded: false, // whether this slice has finished uploading
      progress: 0 // upload progress of this slice
    }));
    // upload the slices
    await this.uploadChunks(this.tempFilesArr[i]);
  }
}
```
- The uploadChunks method is only responsible for constructing the data passed to the back end; the core upload logic lives in the sendRequest method.
```js
async uploadChunks(data) {
  var chunkData = data.chunkList;
  const requestDataList = chunkData.map(({ fileHash, chunk, fileName, index }) => {
    const formData = new FormData();
    formData.append('md5', fileHash);
    formData.append('file', chunk);
    formData.append('fileName', index); // the slice is named by its index
    return { formData, index, fileName };
  });

  try {
    await this.sendRequest(requestDataList, chunkData);
  } catch (error) {
    // upload failed
    this.$message.error('Upload failed, please try again: ' + error);
    return;
  }

  // check whether every slice was uploaded successfully
  const isUpload = chunkData.some(item => item.uploaded === false);
  console.log('created -> isUpload', isUpload);
  if (isUpload) {
    alert('Some slices failed to upload');
  } else {
    // merge the slices
    await this.mergeRequest(data);
  }
}
```
- sendRequest: the actual upload happens here, and it is also where most failures happen. If a file produces, say, 10 slices and we simply fire 10 requests at once, we can easily hit the browser's concurrency bottleneck, so the requests need to be processed with controlled concurrency.

- Concurrency: a for loop starts the initial batch of workers, and each worker function calls itself again once its request finishes, keeping the concurrency constant. Inside the handler, Array.prototype.shift pops the next slice off the queue, which gives a queue-like effect for uploading slices.

- Retry: the retryArr array accumulates the retry count for each slice request. For example, [1, 0, 2] means slice 0 has failed once and slice 2 twice. const index = formInfo.index ties the count to the right slice, so the index is read from the request data rather than the loop counter. When a request fails, the failed slice is pushed back onto the queue.

- I wrote a small demo of concurrency and retry; if this is unclear, you can study it at github.com/pseudo-god/… . The retry part of that demo seems to have been lost; if anyone needs it, I will add it back!
```js
// Upload the slices with controlled concurrency and retry
sendRequest(forms, chunkData) {
  var finished = 0;
  const total = forms.length;
  const that = this;
  const retryArr = []; // per-slice retry count, keyed by slice index

  return new Promise((resolve, reject) => {
    const handler = () => {
      if (forms.length) {
        // take the next slice off the queue
        const formInfo = forms.shift();
        const formData = formInfo.formData;
        const index = formInfo.index;

        instance
          .post('fileChunk', formData, {
            onUploadProgress: that.createProgresshandler(chunkData[index]),
            cancelToken: new CancelToken(c => this.cancels.push(c)),
            timeout: 0
          })
          .then(res => {
            console.log('handler -> res', res);
            // mark the slice as uploaded
            chunkData[index].uploaded = true;
            chunkData[index].status = 'success';
            finished++;
            handler();
          })
          .catch(e => {
            // do not retry while the upload is paused
            if (this.status === Status.pause) return;
            if (typeof retryArr[index] !== 'number') {
              retryArr[index] = 0;
            }
            // update the slice status and retry count
            chunkData[index].status = 'warning';
            retryArr[index]++;
            // give up once the retry limit is reached
            if (retryArr[index] >= this.chunkRetry) {
              return reject('Retry failed', retryArr);
            }
            this.tempThreads++; // release a thread slot
            // push the failed slice back onto the queue
            forms.push(formInfo);
            handler();
          });
      }
      if (finished >= total) {
        resolve('done');
      }
    };

    // start the initial batch of concurrent requests
    for (let i = 0; i < this.tempThreads; i++) {
      handler();
    }
  });
}
```
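Since the retry part of the demo mentioned above was lost, here is a minimal, hypothetical reconstruction of the queue + retry pattern, independent of axios and the component (function and variable names are my own):

```javascript
// Run `tasks` (an array of () => Promise) with at most `threads` in flight;
// a failed task is re-queued until it has failed `maxRetry` times.
function sendWithRetry(tasks, threads = 3, maxRetry = 3) {
  const queue = tasks.map((run, index) => ({ run, index }));
  const retryArr = []; // retry count per original task index
  let finished = 0;
  const total = tasks.length;

  return new Promise((resolve, reject) => {
    const handler = () => {
      if (!queue.length) return; // nothing waiting; in-flight tasks keep going
      const task = queue.shift(); // queue effect, same as the article
      task
        .run()
        .then(() => {
          finished++;
          if (finished === total) {
            resolve('done');
          } else {
            handler(); // this worker picks up the next queued task
          }
        })
        .catch(() => {
          retryArr[task.index] = (retryArr[task.index] || 0) + 1;
          if (retryArr[task.index] >= maxRetry) {
            reject(new Error('chunk ' + task.index + ' failed too often'));
            return;
          }
          queue.push(task); // re-queue the failed task
          handler();
        });
    };
    // start the initial batch of workers
    for (let i = 0; i < Math.min(threads, queue.length); i++) handler();
  });
}
```

A task that fails twice and then succeeds still lets the whole batch resolve, while its retry count is tracked per index, exactly like retryArr above.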
- The upload progress of each slice is maintained through axios's onUploadProgress event, combined with the createProgresshandler method.
```js
createProgresshandler(item) {
  return p => {
    item.progress = parseInt(String((p.loaded / p.total) * 100));
    this.fileProgress();
  };
}
```
Hash calculation

This is simply calculating the MD5 value of a file; MD5 is used in several places across the project:

- Instant upload: the server checks whether a file already exists by its MD5 value.
- Resumable upload: MD5 serves as the storage key, since it uniquely identifies the file.

The project runs the calculation in a web worker, so hashing a large file does not block the main thread.

Since multiple files are supported, the hash progress must be shown per file; the global fileIndex variable again locates the file currently being processed.
```js
// generate the file hash (in a web worker)
calculateHash(fileChunkList) {
  return new Promise(resolve => {
    this.container.worker = new Worker('./hash.js');
    this.container.worker.postMessage({ fileChunkList });
    this.container.worker.onmessage = e => {
      const { percentage, hash } = e.data;
      if (this.tempFilesArr[fileIndex]) {
        this.tempFilesArr[fileIndex].hashProgress = Number(percentage.toFixed(0));
      }
      if (hash) {
        resolve(hash);
      }
    };
  });
}
```
Because the code runs in a worker, we cannot simply import an MD5 package from npm; instead, download the spark-md5.min.js file and pull it in with importScripts.
```js
// hash.js
self.importScripts('/spark-md5.min.js'); // import the script

// generate the file hash
self.onmessage = e => {
  const { fileChunkList } = e.data;
  const spark = new self.SparkMD5.ArrayBuffer();
  let percentage = 0;
  let count = 0;
  const loadNext = index => {
    const reader = new FileReader();
    reader.readAsArrayBuffer(fileChunkList[index].file);
    reader.onload = e => {
      count++;
      spark.append(e.target.result);
      if (count === fileChunkList.length) {
        self.postMessage({ percentage: 100, hash: spark.end() });
        self.close();
      } else {
        percentage += 100 / fileChunkList.length;
        self.postMessage({ percentage });
        loadNext(count);
      }
    };
  };
  loadNext(0);
};
```
File merging
Once all the slices are uploaded, the file needs to be merged; here the front end only needs to call the merge endpoint.
```js
mergeRequest(data) {
  const obj = {
    md5: data.fileHash,
    fileName: data.name,
    fileChunkNum: data.chunkList.length
  };
  instance.post('fileChunk/merge', obj, { timeout: 0 }).then(res => {
    this.$message.success('Upload succeeded');
  });
}
```
Done: the chunked upload function is now complete.

Resumable upload

As the name implies, a resumable upload picks up from where it broke off; the idea is simple. There are two ways to implement it: either the server tells the client where to resume, or the browser keeps track of it itself. Each approach has pros and cons; this project uses the second one.

The file hash is used as the key. Each slice that uploads successfully is recorded, and when an upload needs to resume, the recorded slices are simply skipped. This project stores the records in localStorage, wrapped in the pre-packaged addChunkStorage and getChunkStorage methods.
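The addChunkStorage and getChunkStorage helpers themselves are not shown in the article; the sketch below is my assumption of what they might look like. `storage` is any localStorage-compatible object (getItem/setItem), e.g. window.localStorage in the browser:

```javascript
// Hypothetical sketch of the addChunkStorage / getChunkStorage helpers;
// names come from the article, the implementation is assumed.
function addChunkStorage(storage, fileHash, chunkIndex) {
  const key = 'chunks_' + fileHash;
  const list = JSON.parse(storage.getItem(key) || '[]');
  if (!list.includes(chunkIndex)) list.push(chunkIndex); // record each slice once
  storage.setItem(key, JSON.stringify(list));
}

function getChunkStorage(storage, fileHash) {
  const raw = storage.getItem('chunks_' + fileHash);
  return raw ? JSON.parse(raw) : null; // null means nothing recorded for this file
}
```

getChunkStorage then returns the recorded slice indexes, or null when the file has never been seen.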
Data stored in Storage
Cache handling
In the axios success callback of the slice upload, record the successfully uploaded slice:
```js
instance.post('fileChunk', formData, { /* ... */ }).then(res => {
  // record the index of the successfully uploaded slice
+ this.addChunkStorage(chunkData[index].fileHash, index);
  handler();
});
```
Before uploading the slices, check localStorage and set each slice's uploaded flag:
```js
async handleUpload(resume) {
+ const getChunkStorage = this.getChunkStorage(tempFilesArr[i].hash);
  tempFilesArr[i].chunkList = fileChunkList.map(({ file }, index) => ({
+   uploaded: getChunkStorage && getChunkStorage.includes(index), // whether the slice is already uploaded
+   progress: getChunkStorage && getChunkStorage.includes(index) ? 100 : 0,
+   status: getChunkStorage && getChunkStorage.includes(index)
+     ? 'success'
+     : 'wait' // upload status, used for the progress display
  }));
}
```
Slices whose uploaded flag is already true are then filtered out:
```js
async uploadChunks(data) {
  var chunkData = data.chunkList;
  const requestDataList = chunkData
+   .filter(({ uploaded }) => !uploaded) // skip slices already on the server
    .map(({ fileHash, chunk, fileName, index }) => {
      const formData = new FormData();
      formData.append('md5', fileHash);
      formData.append('file', chunk);
      formData.append('fileName', index);
      return { formData, index, fileName };
    });
}
```
Garbage file cleanup
As uploads accumulate, so do junk files: uploads abandoned halfway, failed uploads, and so on all leave fragment files behind, and their number keeps growing. So far I can think of two solutions:

- The front end keeps a cache expiry time in localStorage; once it is exceeded, it asks the backend to clean up the fragment files and clears its own cache as well.
- The front and back ends agree that each cache entry lives for at most 12 hours after creation and is cleared automatically afterwards.

Both schemes above seem to have problems: a time difference between the front and back ends could make slices expire while an upload is still in progress. If you have a better solution, please share it, and I will update this later.
Done: the resumable upload function is now complete.
Instant upload (second pass)

It is the simplest of the three features, but it sounds the most impressive. The principle: compute the hash of the entire file, and before uploading, send a request carrying the MD5 value. The backend looks the file up; if it already exists on the server, no further work is needed and the upload completes instantly. The code:
```js
async handleUpload(resume) {
  if (!this.container.files) return;
  const filesArr = this.container.files;
  var tempFilesArr = this.tempFilesArr;

  for (let i = 0; i < tempFilesArr.length; i++) {
    const fileChunkList = this.createFileChunk(
      filesArr[tempFilesArr[i].index]
    );

    // hash check: can the file be instant-uploaded?
+   tempFilesArr[i].hash = await this.calculateHash(fileChunkList);
+   const verifyRes = await this.verifyUpload(
+     tempFilesArr[i].name,
+     tempFilesArr[i].hash
+   );
+   if (verifyRes.data.presence) {
+     tempFilesArr[i].status = fileStatus.secondPass;
+     tempFilesArr[i].uploadProgress = 100;
+   } else {
      console.log('Start uploading the slice files ----', tempFilesArr[i].name);
      await this.uploadChunks(this.tempFilesArr[i]);
    }
  }
}
```
```js
// ask the server whether the file already exists
verifyUpload(fileName, fileHash) {
  return new Promise(resolve => {
    const obj = {
      md5: fileHash,
      fileName,
      ...this.uploadArguments // pass any extra arguments along
    };
    instance
      .post('fileChunk/presence', obj)
      .then(res => {
        resolve(res.data);
      })
      .catch(err => {
        console.log('verifyUpload -> err', err);
      });
  });
}
```
Done: the instant-upload function is complete.
The back-end processing
The article is getting a bit long, so the back-end code logic is not posted for now; leave a comment if you want it and I will update when I have time.
The Node version
Please go to github.com/pseudo-god/… To view
JAVA version
Should be updated next week.
PHP version
I have not written PHP for over a year, but I will add it when I have time.
To be improved

- Slice size: this will be calculated dynamically later, choosing an appropriate slice size automatically based on the size of the file being uploaded, to avoid producing too many slices.

- File appending: files cannot currently be added to the queue while an upload is in progress. (I have no good idea yet how to handle this.)
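For the slice-size item above, one possible heuristic (purely hypothetical; the names and numbers are my own, not from the project) is to cap the number of slices while keeping a minimum slice size:

```javascript
// Hypothetical dynamic slice size: aim for at most `maxChunks` slices,
// but never let a slice drop below `minChunk` bytes.
function dynamicChunkSize(fileSize, maxChunks = 100, minChunk = 2 * 1024 * 1024) {
  return Math.max(minChunk, Math.ceil(fileSize / maxChunks));
}
```

Small files then use the minimum slice size, while very large files automatically get bigger slices so the slice count stays bounded.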
Update record
The component has been running for a while, and several problems surfaced during testing. I thought it was bug-free, but some of the bugs turned out to be quite serious.
1. When multiple files with the same content but different file names are uploaded at the same time, the upload fails.
Expected result: After the first file is successfully uploaded, subsequent same files are directly uploaded in seconds
Actual result: after the first file is uploaded successfully, the other identical files fail with an "incorrect number of chunks" error.

Cause: after the first file's chunks finished uploading, the loop immediately started on the next file, before the server had registered the first file. The instant-upload check therefore could not see that the file already existed, and the upload failed.

Solution: only start the next iteration after the current file's slices have finished uploading and the merge endpoint has returned.

Change the sub-methods, mergeRequest and uploadChunks, to be awaited in sequence.
2. If the same file is selected a second time, the input's change event does not fire, so beforeUpload is never triggered and the whole flow fails.

Cause: the input's data from the previous selection was never cleared; when the same file is selected again, the change event of the input does not fire.

Solution: clear the input's data on every click. I optimized some other code along the way; see the commit history for details.
```html
<input
  v-if="!changeDisabled"
  type="file"
  :multiple="multiple"
  class="select-file-input"
  :accept="accept"
+ onclick="f.outerHTML = f.outerHTML"
  @change="handleFileChange"
/>
```
Rewrote the pause and resume functionality; mainly, explicit pause and resume states were added.

The previous logic was too crude and had many problems. The state now lives on each individual file, so that when an upload resumes, files that are already done can be skipped.

Rewrote the file-selection logic, added the ability to append files, and optimized the code.

Because a FileList cannot be modified, I previously copied and processed it, which was hard to implement, bloated the code, and made appending files difficult. Reading the Element UI source revealed a better approach: Array.prototype.slice.call(files) turns the FileList into a real Array.
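The trick works because a FileList is array-like (indexed entries plus a length) without being a real Array; slice.call copies it into one. A quick illustration with a plain array-like stub:

```javascript
// Stand-in for a FileList: indexed entries plus length, but no Array methods.
const fileListLike = { 0: 'a.txt', 1: 'b.txt', length: 2 };

// What Element UI does: copy the array-like object into a real Array.
const postFiles = Array.prototype.slice.call(fileListLike);
```

postFiles is now a genuine Array, so push/forEach work and new files can simply be appended.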
Only the key changes are listed below; in fact the whole code base was tidied up, which improved robustness and readability somewhat. Later I may introduce some design patterns for cleaner structure.
```js
handleFileChange(e) {
  const files = e.target.files;
  if (!files) return;
  fileIndex = 0; // reset the file index
  if (this.limit && files.length > this.limit) {
    this.onExceed && this.onExceed(files);
    return;
  }
  this.status = Status.wait;
+ const postFiles = Array.prototype.slice.call(files);
+ postFiles.forEach((item) => {
+   this.handleStart(item);
+ });
}

// formats the file, runs the beforeUpload hook, and appends the file;
// appending is also controlled by this function
+ handleStart(rawFile) {
+   // initialize some custom attributes
+   rawFile.status = fileStatus.wait;
+   rawFile.chunkList = [];
+   rawFile.uploadProgress = 0;
+   rawFile.hashProgress = 0;
+   if (this.beforeUpload) {
+     const before = this.beforeUpload(rawFile);
+     if (before) {
+       this.uploadFiles.push(rawFile); // append the file
+     }
+   }
+   if (!this.beforeUpload) {
+     this.uploadFiles.push(rawFile); // append the file
+   }
+ }
```
4. Fix the progress bar moving backwards after pause and resume

When we click resume, slices that did not finish are uploaded again from zero, so the progress bar jumps backwards. This is easy to handle:

Define a temporary field, fakeUploadProgress, that stores the current progress at each pause. After the upload resumes, the displayed value is only updated once the real progress exceeds fakeUploadProgress.
```js
handleStart(rawFile) {
  // initialize some custom attributes
  ...
  rawFile.uploadProgress = 0;
+ rawFile.fakeUploadProgress = 0; // fake progress bar, prevents the bar from moving backwards after resuming
  rawFile.hashProgress = 0;
  if (this.beforeUpload) ...
}
```
When paused, the current progress is assigned to the temporary variable
```js
handlePause() {
  this.status = Status.pause;
  if (this.uploadFiles.length) {
    const currentFile = this.uploadFiles[fileIndex];
    currentFile.status = fileStatus.pause;
    // freeze the current progress into the fake progress bar
+   currentFile.fakeUploadProgress = currentFile.uploadProgress;
  }
  while (this.cancels.length > 0) {
    this.cancels.pop()('Cancel the request');
  }
}
```
Process the true and false progress bars when calculating progress
```js
fileProgress() {
  const currentFile = this.uploadFiles[fileIndex];
  if (currentFile) {
    const uploadProgress = currentFile.chunkList
      .map((item) => item.size * item.progress)
      .reduce((acc, cur) => acc + cur);
    const currentFileProgress = parseInt(
      (uploadProgress / currentFile.size).toFixed(2)
    );

    // progress handling: only move forward past the frozen value
+   if (!currentFile.fakeUploadProgress) {
+     currentFile.uploadProgress = currentFileProgress;
+     this.$set(this.uploadFiles, fileIndex, currentFile);
+   } else if (currentFileProgress > currentFile.fakeUploadProgress) {
+     currentFile.uploadProgress = currentFileProgress;
+     this.$set(this.uploadFiles, fileIndex, currentFile);
+   }
  }
}
```
Packaging the component

That was a lot of code, but you do not actually have to copy it: I have packaged it all into a component. You can download it from GitHub, where there are usage examples. If you find it useful, please remember to give it a star, thanks!

Being a bit lazy here, I will not list the packaging code itself; download the files and look directly, and leave a comment if anything is unclear.
Component documentation

Attributes

Parameter | Type | Description | Default | Note
---|---|---|---|---
headers | Object | Sets the request headers | |
before-upload | Function | Hook before a file is uploaded; returning false stops the upload | |
accept | String | Accepted file types | |
upload-arguments | Object | Extra parameters carried when uploading a file | |
with-credentials | Boolean | Whether to send cookies | false |
limit | Number | Maximum number of uploads allowed | 0 | 0 means no limit
on-exceed | Function | Hook for when the file count exceeds the limit | |
multiple | Boolean | Whether multi-select is enabled | true |
base-url | String | The component has axios built in; if you need to go through a proxy, configure your base path here | |
chunk-size | Number | Size of each slice | 10M |
threads | Number | Number of concurrent requests | 3 |
chunk-retry | Number | Number of retries on error | 3 |
### Slot

Name | Description | Parameter | Note
---|---|---|---
header | Button area | None |
tip | Prompt text | None |
Back-end interface documentation: implement your endpoints according to the documentation below.
Code address: github.com/debug-null/…
Interface documentation address: docs.apipost.cn/view/0e19f1…