Large file upload (slice upload)

  • Convert the large file to a binary stream (Blob)
  • Split the binary stream into multiple slices, using the fact that a Blob can be sliced
  • Send the same number of requests as there are slices, in parallel or in series
  • After listening for all slice requests to complete successfully, send a merge signal to the server
Train of thought
  • Step 1: Use the Blob object, whose prototype has a slice method, so a large file can be sliced
<div>
  <input type="file" @change="handleFileChange" />
  <el-button @click="handleUpload"></el-button>
</div>

data() {
  return {
    fileObj: {
      file: null
    }
  }
},

handleFileChange(e) {
  const [file] = e.target.files
  if (!file) return
  this.fileObj.file = file
},

// Slice size: 10 MB
const SIZE = 10 * 1024 * 1024;

// Split the file into slices of SIZE bytes using Blob.prototype.slice
createFileChunk(file, size = SIZE) {
  const fileChunkList = [];
  let cur = 0;
  while (cur < file.size) {
    fileChunkList.push({ file: file.slice(cur, cur + size) });
    cur += size;
  }
  return fileChunkList;
},
  • Step 2: Upload the slices concurrently
async handleUpload() {
  const fileObj = this.fileObj;
  if (!fileObj.file) return;
  const fileChunkList = this.createFileChunk(fileObj.file);
  this.fileObj.chunkList = fileChunkList.map(({ file }, index) => ({
    file,
    size: file.size,
    percent: 0,
    chunkName: `${fileObj.file.name}+${index}`,
    fileName: fileObj.file.name,
    index,
  }));
  await this.uploadChunks(); // upload the slices
},
// Upload the slices
async uploadChunks() {
  const requestList = this.fileObj.chunkList
    .map(({ file, fileName, index, chunkName }) => {
      const formData = new FormData();
      formData.append("file", file);
      formData.append("fileName", fileName);
      formData.append("chunkName", chunkName);
      return { formData, index };
    })
    .map(({ formData, index }) =>
      this.axiosRequest({
        url: "http://localhost:3000/upload",
        data: formData,
        // callback that monitors this slice's upload progress
        onUploadProgress: this.createProgressHandler(this.fileObj.chunkList[index]),
      })
    );
  await Promise.all(requestList); // send all slice requests with Promise.all
  await this.mergeRequest(); // merge the slices
},

// Per-slice upload progress
createProgressHandler(item) {
  return (e) => {
    // set this slice's progress percentage
    item.percent = parseInt(String((e.loaded / e.total) * 100));
  };
},
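The axiosRequest wrapper called above is not shown in the original. A minimal sketch based on the native XMLHttpRequest might look like this; the requestList parameter, used later for pausing, is an added assumption:

// Sketch of a request wrapper around native XMLHttpRequest;
// the requestList parameter (used later for pausing) is an assumption
axiosRequest({ url, method = "post", data, headers = {}, onUploadProgress = e => e, requestList }) {
  return new Promise(resolve => {
    const xhr = new XMLHttpRequest();
    xhr.open(method, url);
    Object.keys(headers).forEach(key => xhr.setRequestHeader(key, headers[key]));
    xhr.upload.onprogress = onUploadProgress; // per-slice progress callback
    xhr.onload = e => {
      // drop the finished xhr so requestList only holds in-flight uploads
      if (requestList) {
        const index = requestList.findIndex(item => item === xhr);
        if (index > -1) requestList.splice(index, 1);
      }
      resolve({ data: e.target.response });
    };
    xhr.send(data);
    if (requestList) requestList.push(xhr); // save the xhr so it can be aborted
  });
},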
// Merge the slices
async mergeRequest() {
  await this.request({
    url: "http://localhost:3000/merge",
    headers: { "content-type": "application/json" },
    data: JSON.stringify({ filename: this.fileObj.file.name })
  });
},
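The server side is not shown here. As a rough illustration of the merge step described in the conclusion (the server stores slices and merges them with streams), a minimal Node.js sketch might look like the following; the folder layout, the mergeChunks helper name, and the slice naming are all assumptions:

// Minimal Node.js sketch of the merge step, assuming slices were saved in a
// per-file folder with names like `${fileName}+${index}` as in the front end
const fs = require("fs");
const path = require("path");

function mergeChunks(fileName, chunkDir, targetDir) {
  const chunkNames = fs.readdirSync(chunkDir)
    // sort by the numeric index after the "+" so slices merge in order
    .sort((a, b) => a.split("+").pop() - b.split("+").pop());
  const writeStream = fs.createWriteStream(path.resolve(targetDir, fileName));
  chunkNames.forEach(chunkName => {
    // append each slice in order; a production version would pipe
    // read streams sequentially instead of buffering whole slices
    writeStream.write(fs.readFileSync(path.resolve(chunkDir, chunkName)));
  });
  writeStream.end();
}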
conclusion
  • Clicking the upload button calls createFileChunk to slice the file; the number of slices is determined by the file size and the slice size
  • createFileChunk uses a while loop and the slice method to collect the slices into a fileChunkList array and return it
  • When generating file slices, each slice needs an identifier to serve as its hash; for now the file name + index form is used
  • uploadChunks uploads all file slices: it puts each slice, its hash, and the file name into a FormData, calls the request function from the previous step to get a promise for each, and finally uses Promise.all to upload all slices concurrently
  • The front end calls mergeRequest to actively notify the server to merge; the server merges the slices when it receives the request

Resumable upload

The principle of resumable upload is that the front end/server remembers the uploaded slices, so the next upload can skip the slices that were uploaded before.

Train of thought
  • 1: Generate the hash from the file content using the spark-md5 library.

For a large file, reading the whole content to compute the hash is very time-consuming. To avoid blocking the UI, use a web worker to compute the hash in a worker thread.

When instantiating the web worker, the parameter is a JS file path that cannot cross domains, so a separate hash.js file needs to be created and placed in the public directory. In addition, DOM access is not allowed inside a worker, but it provides the importScripts function for importing external scripts, which is used here to import spark-md5.

// public/hash.js
self.importScripts('/spark-md5.min.js') // import the spark-md5 script

// generate the file hash
self.onmessage = e => {
  const { fileChunkList } = e.data;
  const spark = new self.SparkMD5.ArrayBuffer();
  let percentage = 0;
  let count = 0;
  const loadNext = index => {
    const reader = new FileReader();
    reader.readAsArrayBuffer(fileChunkList[index].file);
    reader.onload = e => {
      count++;
      spark.append(e.target.result);
      if (count === fileChunkList.length) {
        self.postMessage({ percentage: 100, hash: spark.end() });
        self.close();
      } else {
        percentage += 100 / fileChunkList.length;
        self.postMessage({ percentage });
        // recursively compute the next slice
        loadNext(count);
      }
    };
  };
  loadNext(0);
};

In the worker thread, the fileChunkList of file slices is received; FileReader reads each slice's ArrayBuffer and keeps feeding it into spark-md5. After each slice is computed, a progress event is posted to the main thread via postMessage, and when all slices are done the final hash is sent to the main thread. spark-md5 must compute the hash from all the slices fed in one by one; do not append the entire file to the calculation directly, otherwise different files can end up with the same hash.

  • 2: Communication logic between main thread and worker thread
calculateHash(fileChunkList) {
  return new Promise(resolve => {
    // add a worker property to the container
    this.container.worker = new Worker('/hash.js')
    this.container.worker.postMessage({ fileChunkList });
    this.container.worker.onmessage = e => {
      const { percentage, hash } = e.data
      this.hashPercentage = percentage;
      if (hash) {
        resolve(hash)
      }
    }
  })
},

async handleUpload() {
  const fileObj = this.fileObj;
  if (!fileObj.file) return;
  const fileChunkList = this.createFileChunk(fileObj.file);
  this.fileObj.hash = await this.calculateHash(fileChunkList);
  this.fileObj.chunkList = fileChunkList.map(({ file }, index) => ({
    fileHash: this.fileObj.hash,
    chunk: file,
    percentage: 0,
    hash: this.fileObj.file.name + '-' + index
  }));
  await this.uploadChunks()
}

The main thread uses postMessage to pass the fileChunkList of all slices to the worker thread and listens for message events from the worker thread to get the file hash.

  • Pause upload (implement breakpoint)

The idea is to use the abort method of XMLHttpRequest, which can cancel an XHR request, and to save the XHR object used to upload each slice. Each time a slice is uploaded successfully, its XHR is removed from requestList, so requestList only holds the XHRs of slices still uploading.
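A minimal sketch of the pause handler under that idea, assuming this.requestList holds the in-flight xhr objects pushed by the request wrapper sketched earlier:

// Pause: abort every in-flight xhr; already-uploaded slices stay on the server
handlePause() {
  this.requestList.forEach(xhr => xhr && xhr.abort());
  this.requestList = [];
},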

  • Resume upload

When file slices are uploaded, the server creates a folder to store all the uploaded slices of that file. Therefore, before each upload the front end can call an interface, the server returns the names of the slices it already has, and the front end skips those slices, achieving the resume effect.
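A sketch of the skipping step under that idea; the verify interface and the uploadedList field are assumptions, and the chunk field names follow the chunkList built in the hash section:

// Skip slices the server already has before re-uploading;
// uploadedList (names of stored slices) is assumed to come from a verify interface
async uploadChunks(uploadedList = []) {
  const requestList = this.fileObj.chunkList
    .filter(({ hash }) => !uploadedList.includes(hash))
    .map(({ chunk, hash, fileHash }) => {
      const formData = new FormData();
      formData.append("chunk", chunk);
      formData.append("hash", hash);
      formData.append("fileHash", fileHash);
      return formData;
    })
    .map(formData => this.axiosRequest({
      url: "http://localhost:3000/upload",
      data: formData,
      requestList: this.requestList // save xhrs so uploads can be paused
    }));
  await Promise.all(requestList);
},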

Instant upload

Definition: the uploaded resource already exists on the server, so when the user uploads it again, a success message is returned immediately.

Train of thought

Instant upload depends on the hash generated in the previous step. That is, before uploading, the front end computes the file hash and sends it to the server for verification; if the server finds a file with the same hash, it directly returns a message indicating the upload succeeded.

async handleUpload() {
  if (!this.fileObj.file) return;
  const fileChunkList = this.createFileChunk(this.fileObj.file);
}
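The snippet above breaks off after slicing. A hedged sketch of the remaining steps, assuming a /verify interface that returns a shouldUpload flag (both the endpoint and the field name are assumptions, not shown in the original):

// Sketch: verify the hash before uploading; /verify and shouldUpload are assumptions
async handleUpload() {
  if (!this.fileObj.file) return;
  const fileChunkList = this.createFileChunk(this.fileObj.file);
  this.fileObj.hash = await this.calculateHash(fileChunkList);
  const { shouldUpload } = await this.verifyUpload(
    this.fileObj.file.name,
    this.fileObj.hash
  );
  if (!shouldUpload) {
    // the server already has this file: report success immediately
    alert("File uploaded successfully (instant upload)");
    return;
  }
  await this.uploadChunks(); // otherwise fall through to the normal slice upload
},

async verifyUpload(filename, fileHash) {
  // assumes the request helper resolves with { data: rawResponse }
  const { data } = await this.request({
    url: "http://localhost:3000/verify",
    headers: { "content-type": "application/json" },
    data: JSON.stringify({ filename, fileHash })
  });
  return JSON.parse(data);
},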

Supplement: Web Worker

JS uses a single-threaded model and can only do one thing at a time. The role of the web worker is to create a multi-threaded environment for JS: the main thread can create worker threads and assign some tasks to them. While the main thread runs, the worker thread runs in the background, and the two do not interfere with each other.

Worker thread

Worker threads need to have a listener function inside that listens for message events.

self.addEventListener('message',function(e){
  self.postMessage('hello' + e.data)
},false)

self represents the child thread itself, i.e. the child thread's global object. self.addEventListener() specifies the listener function; it can also be specified with self.onmessage. The listener receives an event object whose data property contains the data sent by the main thread. The self.postMessage() method is used to send messages to the main thread, and self.close() shuts the worker down from inside the worker.
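For completeness, the main-thread side of this example could look like the following (the '/worker.js' path is assumed):

// Main thread: create the worker, send it data, listen for the reply
const worker = new Worker('/worker.js');
worker.postMessage('world');
worker.onmessage = function (e) {
  console.log(e.data); // "helloworld"
  worker.terminate();  // shut the worker down from the main thread
};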

conclusion

  • Large file upload
    • When the front end uploads a large file, it slices the file using Blob.prototype.slice, uploads multiple slices concurrently, and finally sends a merge request to notify the server to merge the slices
    • The server receives and stores the slices, and after receiving the merge request uses streams to merge the slices into the final file
    • The upload.onprogress hook of the native XMLHttpRequest listens for each slice's upload progress
    • A Vue computed property derives the upload progress of the whole file from the progress of each slice (see the sketch after this list)
  • Resumable upload
    • Use spark-md5 to calculate the file hash based on the file content
    • With the hash, the server can determine whether the file has already been uploaded and directly tell the user the upload succeeded, achieving instant upload
    • Slice uploads are paused through the abort method of XMLHttpRequest
    • Before uploading, the server returns the names of the slices already uploaded, and the front end skips uploading those slices
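As referenced in the list above, a minimal sketch of computing the whole file's progress as a Vue computed property; the size and percent fields follow the chunkList built in handleUpload:

// Total upload progress: a size-weighted average of each slice's percentage
computed: {
  uploadPercentage() {
    if (!this.fileObj.file || !this.fileObj.chunkList) return 0;
    // sum of (slice size * slice percent) over the total file size
    const loaded = this.fileObj.chunkList
      .map(item => item.size * item.percent)
      .reduce((acc, cur) => acc + cur, 0);
    return parseInt((loaded / this.fileObj.file.size).toFixed(2));
  }
}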