1. The concept

The core of file sharding is to use the Blob.prototype.slice method to cut a file into slices and upload multiple slices simultaneously, taking advantage of HTTP concurrency.
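
As a rough illustration of the idea (this is not the component implemented below; the '/upload/chunk' endpoint and the 10 MB chunk size are assumptions for the sketch):

// A minimal sketch: cut a File (File inherits from Blob) into 10 MB pieces
// and send them in parallel. The '/upload/chunk' endpoint is hypothetical.
const CHUNK_SIZE = 10 * 1024 * 1024

function sliceFile(file) {
    const chunks = []
    for (let cur = 0; cur < file.size; cur += CHUNK_SIZE) {
        chunks.push(file.slice(cur, cur + CHUNK_SIZE))
    }
    return chunks
}

function uploadChunks(file) {
    const requests = sliceFile(file).map((chunk, index) => {
        const formData = new FormData()
        formData.append('chunkNumber', index)
        formData.append('file', chunk)
        // The browser issues these requests concurrently, subject to its connection limit
        return fetch('/upload/chunk', { method: 'POST', body: formData })
    })
    return Promise.all(requests)
}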

2. Implementation

The shard-upload component is a secondary wrapper around the el-upload component from Element UI

<template>
    <div>
        <el-upload class="uploadComponent" ref="uploadFile" :drag=! "" disabled"
            v-bind="$attrs" :on-change="handleUpload" :disabled="disabled"
        >
            <i class="el-icon-upload" v-show=! "" disabled"></i>
        </el-upload>
    </div>
</template>

Attributes bound to the component are passed through to the inner el-upload component via v-bind="$attrs", so any property already supported by el-upload can be bound directly on the wrapper component.

<file-chunks-upload-component 
    :disabled=! "" isEdit" ref="videoUpload" :file-list="videoFileList"
    action="" :auto-upload="false" list-type="fileList"
    accept=".mp4, .avi, .wmv, .rmvb, .rm" validType="video"
    validTypeTip="Upload files can only be mp4, AVI, WMV, RMVB format"
    :getFileChunks="handleUpload" :sliceSize="100 * 1024 * 1024"
    :on-preview="handleVideoPreview" :before-remove="handleVideoRemove"
/>

In addition to the attributes of el-upload, the component accepts several extra props:

  • disabled: whether the component is editable
  • validType: the file type(s) to validate. A single type can be passed as a string, such as 'video' or 'image'; multiple types can be passed as an array, such as ['video', 'image']
  • validTypeTip: prompt text shown when file validation fails
  • chunkSize: size of each fragment
  • getFileChunks: callback that receives the fragment information
  • sliceSize: the threshold size above which the file is fragmented

The on-change hook is bound to the el-upload component to monitor file changes, obtain the file information, and run it through a series of processing steps

The processing flow:

  1. Check whether the file size is larger than sliceSize. If it is, continue with fragmentation; if not, return the file directly
// If the file size is smaller than sliceSize, the file is returned without fragmentation
if(this.sliceSize && file.size <= this.sliceSize){
    this.getFileChunks && this.getFileChunks({file})
    return
}
  2. Check the file type

File types are verified by reading the file header information; this is encapsulated as a utility method

/**
 * Check the file type by using the file header information
 * https://blog.csdn.net/tjcwt2011/article/details/120333846?utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7Edefault-3.no_search_link&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7Edefault-3.no_search_link
 */

const fileType = Object.assign(Object.create(null), {
    'isMP4': str => ['00 00 00 14', '00 00 00 18', '00 00 00 1c', '00 00 00 20'].some(s => str.indexOf(s) === 0),
    'isAVI': str => str.indexOf('52 49 46 46') === 0,
    'isWMV': str => str.indexOf('30 26 b2 75') === 0,
    'isRMVB': str => str.indexOf('2e 52 4d 46') === 0,
    'isJPG': str => str.indexOf('ff d8 ff') === 0,
    'isPNG': str => str.indexOf('89 50 4e 47') === 0,
    'isGIF': str => str.indexOf('47 49 46 38') === 0
})

const validFileType = Object.assign(Object.create(null), {
    'video': str => fileType['isMP4'](str) || fileType['isAVI'](str) || fileType['isWMV'](str) || fileType['isRMVB'](str),
    'image': str => fileType['isJPG'](str) || fileType['isPNG'](str) || fileType['isGIF'](str)
})

const bufferToString = async buffer => {
    let str = [...new Uint8Array(buffer)].map(b => b.toString(16).toLowerCase().padStart(2, '0')).join(' ')
    return str
}

const readBuffer = (file, start = 0, end = 4) => {
    return new Promise((resolve, reject) => {
        const reader = new FileReader()
        reader.onload = () => {
            resolve(reader.result)
        }
        reader.onerror = reject
        reader.readAsArrayBuffer(file.slice(start, end))
    })
}

const validator = async (type, file) => {
    const buffer = await readBuffer(file)
    const str = await bufferToString(buffer)

    switch(Object.prototype.toString.call(type)){
        case '[object String]':
            return validFileType[type](str)
        case '[object Array]':
            return [...type].some(t => validFileType[t](str))
    }
}

export default validator
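
As a usage example (a hypothetical snippet; file is assumed to be a File object taken from an upload control):

import validator from '@/utils/fileTypeValidate'

async function checkFile(file){
    // true when the first bytes match one of the video signatures (mp4 / avi / wmv / rmvb)
    const isVideo = await validator('video', file)
    // several types can also be checked at once
    const isVideoOrImage = await validator(['video', 'image'], file)
    return isVideo || isVideoOrImage
}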

Note: the header signatures of common file types can be looked up in the file header mapping tables linked in the references at the end of this article

  3. Generate slices

Use Blob.prototype.slice to cut a large file into smaller pieces

createFileChunk(file){
    let chunks = []
    let cur = 0
    if(file.size > this.chunkSize){
        while(cur < file.size){
            chunks.push({ file: file.slice(cur, cur + this.chunkSize) })
            cur += this.chunkSize
        }
    }else{
        chunks.push({ file: file })
    }
    return chunks
},
  4. Generate a hash

To support resumable upload and instant upload, the front end needs to generate a unique identifier for the file and provide it to the back end. Since this identifier must stay the same across uploads, the correct approach is to generate a hash from the file content. The spark-md5 library is used to compute the hash value of a file based on its content.

Since computing the hash is time-consuming and would block the UI, a Web Worker is used here to compute the hash on a worker thread. Because the script URL passed to the Worker constructor must not be cross-origin, we create a separate hash.js file in the public directory and import spark-md5 through importScripts

hash.js

// Import spark-md5
self.importScripts('spark-md5.min.js')

self.onmessage = e => {
  // Receive data from the main thread
  let { chunks } = e.data
  const spark = new self.SparkMD5.ArrayBuffer()

  let count = 0

  const loadNext = index => {
    const reader = new FileReader()
    reader.readAsArrayBuffer(chunks[index].file)
    reader.onload = e => {
      count++
      spark.append(e.target.result)

      if(count === chunks.length){
        self.postMessage({
          hash: spark.end()
        })
      }else{
        loadNext(count)
      }
    }
  }
  loadNext(0)
}

The more fragments there are and the larger they are, the longer it takes to compute the hash. To speed up the calculation, sampling is used to compute the hash at the cost of some precision

There are two cases:

1. If the file is smaller than 50M, the hash is computed over all fragments

2. If the file is larger than 50M, only the even-indexed fragments are sampled, and only the first 10M of each sampled fragment is used to compute the hash

The hash calculation method:

async calculateHashWorker(chunks, size){
    var _this = this
    // If the file size exceeds 50M, generate the hash by sampling
    const MAX_FILE_SIZE = 50 * 1024 * 1024
    const SAMPLE_SIZE = 10 * 1024 * 1024
    if(size >= MAX_FILE_SIZE){
        chunks = chunks.map((item, index) => {
            if(index % 2 === 0){
                return { file: item.file.slice(0, SAMPLE_SIZE) }
            }
        }).filter(item => item)
    }

    return new Promise(resolve => {
        _this.worker = new Worker('/hash.js')
        _this.worker.postMessage({ chunks })
        _this.worker.onmessage = e => {
            const { hash } = e.data
            if (hash) {
                resolve(hash)
            }
        }
    })
},
  5. Return the file slice information

The slice collection, hash value, and file information are returned through the getFileChunks callback bound to the component

async handleUpload(params){
    if(!params.raw) return
    let file = params.raw
    // If the file size is smaller than sliceSize, the file is returned without fragmentation
    if(this.sliceSize && file.size <= this.sliceSize){
        this.getFileChunks && this.getFileChunks({file})
        return
    }
    // Verify the file type
    if(this.validType && !await validator(this.validType, file)){
        this.$message({
            type: 'warning',
            message: this.validTypeTip,
            duration: 3000,
        })
        this.clearUploadFiles()
        return
    }

    const loading = this.$loading({
        lock: true,
        text: 'Generating slices, please wait patiently'
    })

    let chunks = this.createFileChunk(file)
    let hash = await this.calculateHashWorker(chunks, file.size)

    this.getFileChunks && this.getFileChunks({chunks, hash, file})

    loading.close()
},

The complete code of the shard-upload component:

<template>
    <div>
        <el-upload class="uploadComponent" ref="uploadFile" :drag="!disabled"
            v-bind="$attrs" :on-change="handleUpload" :disabled="disabled"
        >
            <i class="el-icon-upload" v-show="!disabled"></i>
        </el-upload>
    </div>
</template>

<script>
import validator from '@/utils/fileTypeValidate'

export default {
    name: 'FileChunksUploadComponent',
    props: {
        disabled: {  // Whether the component is editable
            type: Boolean,
            default: false
        },
        validType: {  // File verification type, a string or an array of strings
            type: [String, Array],
            default: ''
        },
        validTypeTip: {  // Prompt shown when file verification fails
            type: String,
            default: 'Upload file format is not correct!'
        },
        chunkSize: {  // Size of each fragment
            type: Number,
            default: 50 * 1024 * 1024
        },
        getFileChunks: {  // Callback that receives the fragment information
            type: Function,
            default: null
        },
        sliceSize: {  // Threshold size above which the file is fragmented
            type: Number,
            default: 0
        }
    },
    methods: {
        async handleUpload(params){
            if(!params.raw) return
            let file = params.raw
            // If the file size is smaller than sliceSize, the file is returned without fragmentation
            if(this.sliceSize && file.size <= this.sliceSize){
                this.getFileChunks && this.getFileChunks({file})
                return
            }
            // Verify the file type
            if(this.validType && !await validator(this.validType, file)){
                this.$message({
                    type: 'warning',
                    message: this.validTypeTip,
                    duration: 3000,
                })
                this.clearUploadFiles()
                return
            }

            const loading = this.$loading({
                lock: true,
                text: 'Generating slices, please wait patiently'
            })

            let chunks = this.createFileChunk(file)
            let hash = await this.calculateHashWorker(chunks, file.size)

            this.getFileChunks && this.getFileChunks({chunks, hash, file})

            loading.close()
        },
        createFileChunk(file){
            let chunks = []
            let cur = 0
            if(file.size > this.chunkSize){
                while(cur < file.size){
                    chunks.push({ file: file.slice(cur, cur + this.chunkSize) })
                    cur += this.chunkSize
                }
            }else{
                chunks.push({ file: file })
            }
            return chunks
        },
        async calculateHashWorker(chunks, size){
            var _this = this
            // If the file size exceeds 50M, generate the hash by sampling
            const MAX_FILE_SIZE = 50 * 1024 * 1024
            const SAMPLE_SIZE = 10 * 1024 * 1024
            if(size >= MAX_FILE_SIZE){
                chunks = chunks.map((item, index) => {
                    if(index % 2 === 0){
                        return { file: item.file.slice(0, SAMPLE_SIZE) }
                    }
                }).filter(item => item)
            }

            return new Promise(resolve => {
                _this.worker = new Worker('/hash.js')
                _this.worker.postMessage({ chunks })
                _this.worker.onmessage = e => {
                    const { hash } = e.data
                    if (hash) {
                        resolve(hash)
                    }
                }
            })
        },
        clearUploadFiles(){
            this.$refs.uploadFile.clearFiles()
        }
    }
}
</script>

3. Practical experience

To implement fragment uploading of large files, the back end provides three interfaces: a fragment check interface, a fragment upload interface, and a fragment merge interface.

  • Fragment check interface: based on the hash generated by the component, checks which fragments (or the whole file) have already been uploaded, enabling resumable upload and instant upload and avoiding uploading the same file twice
  • Fragment upload interface: uploads the fragments that have not been uploaded yet
  • Fragment merge interface: after all fragments of a file are uploaded, tells the back end to merge them into the file

Merging fragments takes a certain amount of time, and for small files fragment uploading would actually increase the total upload time, so whether to fragment at all is controlled through the sliceSize attribute bound to the component.

3.1 Fragment Upload Process

  1. Invoke the fragment check interface to check whether the file, or some of its fragments, has already been uploaded

  2. Filter out the fragments that have already been uploaded and invoke the fragment upload interface for the rest

  3. After all fragments are uploaded, invoke the fragment merge interface to merge them into the file

The complete code

<file-chunks-upload-component 
    :disabled=! "" isEdit" ref="videoUpload" :file-list="videoFileList"
    action="" :auto-upload="false" list-type="fileList"
    accept=".mp4, .avi, .wmv, .rmvb, .rm" validType="video"
    validTypeTip="Upload files can only be mp4, AVI, WMV, RMVB format"
    :getFileChunks="handleUpload" :sliceSize="100 * 1024 * 1024"
    :on-preview="handleVideoPreview" :before-remove="handleVideoRemove"
/>
async handleUpload({chunks, hash, file}){
    this.loading = true
    this.loadingText = 'Uploading, please be patient'
    try{
        let res = null
        if(chunks && hash){
            res = await this.sliceChunksUpload(chunks, hash, file)
        }else{
            res = await this.videoFileUpload(file)
        }
        if(res){
            this.videoFileList.push({name: res, url: res})
            this.$messageInfo('Upload successful')
        }
    }catch(err){
        this.$messageWarn('Upload failed')
    }
    this.$refs.videoUpload.clearUploadFiles()
    this.loading = false
    this.loadingText = ''
},
// Upload fragmented video files
sliceChunksUpload(chunks, hash, file){
    return new Promise(async (resolve, reject) => {
        chunks = chunks.map((item, index) => {
            return {
                file: item.file,
                name: `${index}_${hash}`,
                index,
                hash
            }
        })
        try{
            let {data} = await sewingCraftApi.checkChunkUpload({taskId: hash})
            if(data && data.completeUpload){
                resolve(data.uploadUrl)
                return
            }

            if(data.map){
                chunks = chunks.filter(({file, name}) => {
                    return !data.map[name] || data.map[name] !== file.size
                })
            }
            if(chunks.length){
                let uploadRes = await httpUtil.multiRequest(chunks, uploadRequest)
                let flag = uploadRes.every(res => res.data)
                if(!flag){
                    reject('Upload failed!')
                    return
                }
            }
            let extName = file.name.substring(file.name.lastIndexOf('.') + 1)
            sewingCraftApi.mergeChunk({taskId: hash, extName}).then(res => {
                resolve(res.data)
            })
        }catch(err){
            reject(err)
        }

        function uploadRequest({hash, index, name, file}){
            let formData = new FormData()
            formData.append('taskId', hash)
            formData.append('chunkNumber', index)
            formData.append('identifier', name)
            formData.append('file', file)
            return sewingCraftApi.uploadChunkUpload(formData)
        }
    })
}
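
Note that sliceChunksUpload relies on httpUtil.multiRequest, a project-specific helper that is not shown in this article. Conceptually it sends the fragment requests concurrently and resolves with all of their responses. A minimal sketch of such a helper, assuming a fixed concurrency limit of 4, might look like this:

// Hypothetical sketch of a concurrency-limited request pool.
// `items` are the chunk descriptors, `request` maps one item to a Promise.
function multiRequest(items, request, limit = 4) {
    return new Promise(resolve => {
        const results = new Array(items.length)
        let next = 0
        let finished = 0

        if (!items.length) return resolve(results)

        function run() {
            if (next >= items.length) return
            const index = next++
            request(items[index])
                .then(res => { results[index] = res })
                .catch(err => { results[index] = err })
                .finally(() => {
                    finished++
                    if (finished === items.length) {
                        resolve(results)
                    } else {
                        run()
                    }
                })
        }

        // Start up to `limit` requests; each finished request starts the next one
        for (let i = 0; i < Math.min(limit, items.length); i++) run()
    })
}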

References

ByteDance interviewer: please implement large file upload and resumable upload

Ultimate validation of file types

128 common file header information comparison table