Preface

I came across an article on Juejin about single-file chunked uploading and resumable (breakpoint) uploads that piqued my interest, so I started studying the topic myself. After a period of study, I wrote a small demo. This article records the coding problems I ran into and summarizes the ideas behind the demo.

Technical keywords

Front end: @vue/cli-service + element-ui + axios

Back end: Node.js + Koa

Analysis of coding ideas

Drag and drop to upload

Drag-and-drop upload is built on HTML5's drag-and-drop feature; for detailed usage, see the MDN documentation on drag and drop.

Files are handled with two events: dragover, which fires while something is dragged over the target container, and drop, which fires when the mouse button is released to complete the drag.
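
To make this concrete, here is a minimal sketch of the two handlers (the .drop-area selector is hypothetical, not taken from the demo):

const dropArea = document.querySelector('.drop-area');

// dragover fires repeatedly while something is dragged over the target;
// preventDefault() is required, or the browser will refuse the drop
dropArea.addEventListener('dragover', (e) => {
  e.preventDefault();
});

// drop fires when the mouse button is released over the target
dropArea.addEventListener('drop', (e) => {
  e.preventDefault();
  const files = Array.from(e.dataTransfer.files); // the dragged-in files
  console.log(files.map((f) => f.name));
});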

File Fragment upload

The general idea of file chunking is that the front end splits a large file into small chunks and sends them to the back end, which saves each chunk. Once all chunks are uploaded, the front end calls the back end's merge interface, telling it to read the saved chunks and write them into a new file.

Before implementing chunked upload, the front end needs to consider the following questions:

  1. How to get the files selected by the user

  2. How to split a large file, and how big each chunk should be

  3. How to record the order of a large file's chunks so the back end knows which chunk to read first

  4. How to distinguish between multiple large files

  5. When a user uploads multiple large files, what structure the front end should use to hold the chunks belonging to each file

  6. How to send each chunk's data to the back end

  7. How to notify the back end that all chunks have been uploaded

Before implementing chunked upload, the back end should consider these questions:

  1. What fields the chunk-saving interface needs from the front end to tell different files' chunks apart

  2. How the merge interface should read the chunks and generate the new file

None of these problems is hard to implement once the thinking is clear, and working through them is what I learned most from.

First, the change event is used to get the files; multiple file selection is enabled by setting the multiple attribute on the input tag. Once a large file is obtained, it can be cut with the slice method, and the chunk size (SIZE) can be configured. Because users may upload several files at once, the chunks belonging to each large file must be kept together so the data does not get mixed up. Here I use an object for processing and saving; the code is as follows:

// Receive the selected files
handleChange(e){
  this.filesAry = Array.from(e.target.files);
  this.data = this.createChunk(this.filesAry);
},
// Cut a single file into chunks
handleChunk(file){
  let current = 0;
  let fileList = [];
  while(current < file.size){
    fileList.push({
      file: file.slice(current, current + this.SIZE)
    });
    current += this.SIZE;
  }
  return fileList;
},
// Cut every selected large file
createChunk(files = []){
  let filesObj = files.reduce((pre, cur, index) => {
    pre[`${cur.name}_${index}`] = this.handleChunk(cur);
    return pre;
  }, {});
  return filesObj;
},

In the end, this.data is the container for everything the user uploads. Each key is the file name plus its index, and the corresponding value is an array holding that large file's chunks.
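
For example, after selecting two files, this.data would have roughly this shape (the file names are made up for illustration):

// Each key is `${file.name}_${index}`; each value is that file's chunk list
{
  'movie.mp4_0':  [{file: Blob}, {file: Blob} /* ... */], // chunks of file 1
  'photos.zip_1': [{file: Blob}, {file: Blob} /* ... */]  // chunks of file 2
}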

At this point, front-end questions 1 and 2 are basically solved.

Next, each chunk in this.data is data to be sent to the back end, which means making requests. In the demo, requests go through axios and the data is sent as a FormData form, so axios is wrapped like this:

createRequest({method = 'post', url = '', data = {}}){
  return axios({
    method: method,
    url: url,
    data: data,
  });
},

So before each request, every chunk in this.data has to be assembled into a FormData object carrying the following fields: file (the chunk itself), index (the chunk's index), hash (large file name + large file index), nameHash (the file's MD5), and filename (the file name). I send all of these so the back end can use whichever field it needs directly when handling the file.

Assembling the FormData requires the file's MD5 (which serves as its unique identifier), so spark-md5 is used; see its official docs for details. The MD5 is generated as follows:

// SparkMD5 comes from the spark-md5 package
createMd5(fileChunkList = []){
  let currentChunk = 0, md5;
  let reader = new FileReader();
  let spark = new SparkMD5.ArrayBuffer();
  function readFile(){
    if(fileChunkList[currentChunk].file){
      reader.readAsArrayBuffer(fileChunkList[currentChunk].file);
    }
  }
  readFile();
  return new Promise(resolve => {
    reader.onload = e => {
      currentChunk++;
      spark.append(e.target.result);
      if(currentChunk < fileChunkList.length){
        readFile();
      }else{
        md5 = spark.end();
        resolve(md5);
      }
    };
  });
},

We can then generate a FormData object for each chunk and call the request function on each one, completing the chunk-request encapsulation. The code is as follows:

// Assemble each chunk into a FormData object and build its request
createFormDataRequest(files = [], prop = '', nameHash = '', fileName = '', fileChunk = []){
  let target = files.map((file, index) => {
    let formdata = new FormData();
    formdata.append('file', file.file);
    formdata.append('index', index);
    formdata.append('hash', prop);
    formdata.append('nameHash', nameHash);
    formdata.append('filename', fileName);
    return {formdata, index};
  }).map(({formdata, index}) => {
    return this.createRequest({
      method: 'post',
      url: 'http://localhost:3001/api/handleUpload',
      data: formdata,
    });
  });
  return target;
},

Finally, in the upload button's click handler, generate the MD5 and pass it to createFormDataRequest. Since multiple large files may be uploaded, loop over this.data and hand each large file's chunk array to createMd5. The code is as follows:

// Upload-button click handler
async handleUpload(e){
  this.targetRequest = {};
  for(let prop in this.data){
    if(this.data.hasOwnProperty(prop)){
      let fileName = splitFilename(prop);
      let nameHash = await this.createMd5(this.data[prop]);
      this.targetRequest[`${fileName}_${nameHash}`] = this.createFormDataRequest(this.data[prop], prop, nameHash, fileName);
    }
  }
  // Send the requests, and call merge once they all complete
  Object.keys(this.targetRequest).forEach(async key => {
    let {filename, nameHash} = splitFileHash(key);
    await Promise.all(this.targetRequest[key]).then(async res => {
      this.createRequest({
        method: 'post',
        url: 'http://localhost:3001/api/handleMerge',
        data: {
          filename,
          nameHash,
          SIZE: this.SIZE
        },
      }).then(res => {
        this.$message.success({
          message: `${filename} uploaded successfully ~`
        });
      });
    });
  });
},

Each value in targetRequest is an array of promises, one per chunk. All requests are sent concurrently, and once they have completed, the merge interface is called to tell the back end it can merge the file.

With that, the front end of large-file chunked upload is basically done; the back end's job is to write the chunk-saving and chunk-merging interfaces.

When saving chunks, the back end creates a folder to hold each file's chunks. The folder is named after nameHash, and each chunk is renamed to filename_index.

const Router = require('koa-router');
const apiRouter = new Router();
const path = require('path');
const fs = require('fs');
const targetPath = path.resolve(__dirname, '../target/');

const splitExt = (filename = '') => {
  let name = filename.slice(0, filename.lastIndexOf('.'));
  let ext = filename.slice(filename.lastIndexOf('.') + 1, filename.length);
  return {name, ext};
}

// Save a chunk
/**
 * hash: large file name + large file index
 * nameHash: MD5 of the large file
 * filename: file name
 * index: index of the chunk
 */
apiRouter.post('/api/handleUpload', async (ctx) => {
  // ctx.request.body and ctx.request.files are assumed to be populated
  // by a multipart body-parsing middleware
  const {hash, nameHash, filename, index} = ctx.request.body;
  const chunkPath = path.resolve(targetPath, `${nameHash}`);
  if(!fs.existsSync(chunkPath)){
    fs.mkdirSync(chunkPath);
  }
  const {name, ext} = splitExt(filename);
  console.log(ctx.request.files.file.path, '-', index);
  fs.renameSync(ctx.request.files.file.path, `${chunkPath}/${filename}_${index}`);
  return ctx.response.status = 200;
});

module.exports = apiRouter;

When handling the merge, the key is to know the target file's name and extension, then write the chunks into the target file using createReadStream piped into createWriteStream. Part of the code is as follows:

const Router = require('koa-router');
const apiRouter = new Router();
const path = require('path');
const fs = require('fs');
const targetPath = path.resolve(__dirname, '../target/');
const splitExt = (filename = '') => {
  let name = filename.slice(0, filename.lastIndexOf('.'));
  let ext = filename.slice(filename.lastIndexOf('.') + 1, filename.length);
  return {name, ext};
}
// Merge chunks
/**
 * filename: name of the large file
 * nameHash: MD5 of the file
 * SIZE: the chunk size used when slicing
 */
apiRouter.post('/api/handleMerge', async (ctx) => {
  const {filename, nameHash, SIZE} = ctx.request.body;
  const targetFilePath = path.resolve(targetPath, `${filename}`);
  // Pipe one chunk into the target file through the given write stream
  const pipStream = (chunkPath, writeStream) => {
    return new Promise(resolve => {
      const readStream = fs.createReadStream(chunkPath);
      readStream.on('end', () => {
        // fs.unlinkSync(chunkPath); // optionally delete the chunk afterwards
        resolve();
      });
      readStream.pipe(writeStream, {end: false});
    });
  };
  fs.readdir(path.resolve(targetPath, nameHash), async (err, files) => {
    if(err) return console.log('err:', err);
    // Sort chunks by their index suffix so they are written in order
    files.sort((a, b) => a.split('_')[1] - b.split('_')[1]);
    files = files.map(file => path.resolve(targetPath, nameHash, file));
    Promise.all(files.map((file, index) => {
      // Each write stream starts at that chunk's byte offset in the target file
      return pipStream(file, fs.createWriteStream(targetFilePath, {
        start: index * SIZE,
      }));
    }));
  });
  ctx.response.status = 200;
})

module.exports=apiRouter;


Resumable upload

The idea behind resumable upload is that when the user clicks the pause button, the chunk requests currently in flight are aborted; after clicking the resume button, uploading continues from the chunks that were already uploaded.

The main questions are:

  1. How the front end obtains an abort function for each chunk request, and when to invoke and clean up those abort functions

  2. How to find out which chunks have already been uploaded when resume is clicked

  3. Once those chunks are known, how to filter them out of the original chunk array

In the demo, the names of the uploaded chunks are returned by the back end, so an interface is provided that receives the large file's MD5 and name and returns the chunks already uploaded. The code is as follows:

const Router = require('koa-router');
const apiRouter = new Router();
const path = require('path');
const fs = require('fs');
const targetPath = path.resolve(__dirname, '../target/');

const splitExt = (filename = '') => {
  let name = filename.slice(0, filename.lastIndexOf('.'));
  let ext = filename.slice(filename.lastIndexOf('.') + 1, filename.length);
  return {name, ext};
}

// Return the chunks already uploaded for a file
/**
 * nameHash: MD5 of the large file
 * filename: file name
 */
apiRouter.post('/api/handleAgain', async (ctx) => {
  const {nameHash, filename} = ctx.request.body;
  console.log(nameHash, filename);
  const chunkPath = path.resolve(targetPath, `${nameHash}`);
  if(fs.existsSync(chunkPath)){
    let filesChunk = fs.readdirSync(chunkPath);
    return ctx.body = {fileChunk: filesChunk, filename: filename, flag: true}; // folder found
  }else{
    return ctx.body = {fileChunk: [], filename: filename, flag: false}; // folder not found
  }
});

module.exports = apiRouter;

Besides the uploaded chunk list, the interface returns a flag (whether the chunk folder was found) and echoes the file name back to the front end. Since several files may be uploading at once, this makes the data easier to process in the resume logic later.
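
Based on the route above, the two possible response shapes look like this (the chunk names are illustrative):

// flag === true: the chunk folder exists; fileChunk lists the saved chunks
{fileChunk: ['movie.mp4_0', 'movie.mp4_1'], filename: 'movie.mp4', flag: true}

// flag === false: no chunk folder was found for this nameHash
{fileChunk: [], filename: 'movie.mp4', flag: false}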

The demo generates an abort function for each chunk request via axios's CancelToken and collects them in the request interceptor. Clicking the pause button invokes these functions and then clears the list. Of course, abort functions whose requests have already completed must be filtered out, which happens in the response interceptor. The code added to createRequest:

axios.interceptors.request.use((config) => {
  let CancelToken = axios.CancelToken;
  // Store this request's cancel function
  config.cancelToken = new CancelToken((c) => {
    this.cancelAry.push({fn: c, url: config.url});
  });
  return config;
}, (err) => {
  return Promise.reject(err);
});
axios.interceptors.response.use((response) => {
  let {config} = response;
  // The request has finished, so drop its cancel function
  this.cancelAry = this.cancelAry.filter(cancel => cancel.url !== config.url);
  return response;
}, (err) => {
  return Promise.reject(err);
});

The abort functions are invoked in the pause button's click handler:

// Pause-button click handler
handleCancel(e){
  this.cancelAry.forEach(cancel => cancel.fn());
  this.cancelAry = [];
},

The next step is the resume logic: call /api/handleAgain to fetch the uploaded chunks, filter the data, and finally call the upload and merge interfaces.

To avoid redundant code, the filtering logic is added inside the createFormDataRequest function, so FormData objects are only generated for the chunks that survive the filter. The new code is as follows:

files.filter((file, index) => !fileChunk.includes(`${fileName}_${index}`))

The back end's flag field is used to tell whether this is a first-time upload. If flag is true, meaning the back end found the file's chunk folder, the chunk data has to be filtered; if everything is already there, the user can be shown an instant-upload success message. If the large file is being uploaded for the first time and the back end has nothing saved, no filtering is needed. This check also has to run in the upload click handler, because the target file should continue uploading from where it left off. So the resume handler is called inside the upload handler; the new code is as follows:

let {result, fileChunk} = await this.handleAgain(nameHash, fileName);
if(!result){
  return;
}
this.targetRequest[`${fileName}_${nameHash}`] = this.createFormDataRequest(this.data[prop], prop, nameHash, fileName, fileChunk);

The resume-upload handler itself looks like this:

// Resume-upload click handler
async handleAgain(nameHash, filename, requestChunk = []){
  let result, fileChunk = [];
  await this.restoreFile(nameHash, filename).then(res => {
    if(res.status == 200){
      if(res.data.flag == false){
        this.$message('Instant upload succeeded ~~');
        result = true;
      }else{
        fileChunk = res.data.fileChunk;
      }
    }
  });
  return {result, fileChunk};
},

And with that, the resumable upload part is basically done.

File preview

The demo currently only supports image preview; this will be improved later. It mainly uses FileReader: a reader object is created, and readAsDataURL produces a URL for the image.
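
A minimal sketch of that preview logic, assuming an img element with a hypothetical ref named preview:

// Read a selected image file and show it in an <img> element
handlePreview(file){
  const reader = new FileReader();
  reader.onload = (e) => {
    // e.target.result is a data: URL usable as an <img> src
    this.$refs.preview.src = e.target.result;
  };
  reader.readAsDataURL(file);
},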

Record of coding problems

Problem 1

Keywords: illegal operation on a directory. While learning to use the createWriteStream API, I got this error because the file path passed as the first argument collided with the name of an existing directory. Personally tested ~~
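
A minimal repro, assuming ./target already exists as a directory:

const fs = require('fs');

// Pointing a write stream at an existing directory fails with
// "EISDIR: illegal operation on a directory, open ..."
const ws = fs.createWriteStream('./target');
ws.on('error', (err) => console.log(err.message));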

Problem 2

err: Error: write after end
    at writeAfterEnd (_stream_writable.js:236:12)
    at Form.Writable.write (_stream_writable.js:287:5)
    at IncomingMessage.ondata (_stream_readable.js:639:20)
    at emitOne (events.js:116:13)
    at IncomingMessage.emit (events.js:211:7)
    at addChunk (_stream_readable.js:263:12)
    at readableAddChunk (_stream_readable.js:250:11)
    at IncomingMessage.Readable.push (_stream_readable.js:208:10)
    at HTTPParser.parserOnBody (_http_common.js:130:22)

Problem 3

Keywords: BadRequestError: stream ended unexpectedly. The multipart parsing was set up outside the interface's handler logic rather than inside it, so several chunks of the same large file ended up sharing one multipart object. Each chunk request should get its own multipart object.
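
Here is a sketch of the fix, assuming the multiparty package is the multipart parser (the demo itself may use different middleware):

const multiparty = require('multiparty');

apiRouter.post('/api/handleUpload', async (ctx) => {
  // Create a fresh Form for every request; sharing one instance across
  // chunk requests is what caused "stream ended unexpectedly"
  const form = new multiparty.Form();
  const {fields, files} = await new Promise((resolve, reject) => {
    form.parse(ctx.req, (err, fields, files) => {
      if(err) return reject(err);
      resolve({fields, files});
    });
  });
  // ... save the chunk as shown earlier ...
});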

Conclusion

I gained a lot from this study and research. The demo still has points to optimize, such as how to limit concurrent requests. The demo code will be updated later, so stay tuned ~~

This article took a long time to put together and I'm quite happy with it; a like as encouragement would be much appreciated ~~

koa-vue-uploadfile-demo is the repository for the drag-and-drop multi-large-file upload demo

The code was developed on a small branch of the out-of-the-box demo scaffold I built earlier for learning; the code address is project-dev-demo

If you find it useful, a star would be appreciated ~~

References

ByteDance interviewer: implement large-file upload and breakpoint resume

A guide to all kinds of front-end file uploads for newcomers, from small images to resumable large files