0 Background
In the current service flow, a user uploads a video; once the upload succeeds, a frame-capture service runs in the background and returns images to display as recommended covers. Because frame capture can only start after the whole video has been uploaded, the user has to wait a long time.
We therefore considered capturing frames on the front end, so that recommended covers are generated as soon as the upload starts and the user experience improves.
1 Scheme Comparison
1.1 Canvas
Frame capture
Set the video element's currentTime (videoObject.currentTime = seconds), then draw the current frame onto a canvas. There is a related open source library, and you can try its demo. However,
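The canvas approach can be sketched roughly as follows. This is a minimal illustration, not the code of the library mentioned above: `captureFrameAt` is a hypothetical helper name, and the sketch assumes the video's metadata has already loaded so that `videoWidth` and `videoHeight` are known.

```javascript
// Hypothetical helper: seek a <video> element to `seconds` and draw that frame onto `canvas`.
// Assumes the video's metadata is already loaded (videoWidth/videoHeight are set).
function captureFrameAt(video, seconds, canvas) {
  return new Promise((resolve, reject) => {
    const cleanup = () => {
      video.removeEventListener('seeked', onSeeked);
      video.removeEventListener('error', onError);
    };
    const onSeeked = () => {
      cleanup();
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      canvas.getContext('2d').drawImage(video, 0, 0, canvas.width, canvas.height);
      resolve(canvas);
    };
    const onError = () => {
      cleanup();
      reject(new Error('Failed to seek the video'));
    };
    video.addEventListener('seeked', onSeeked);
    video.addEventListener('error', onError);
    // Setting currentTime starts the seek; 'seeked' fires once the frame is available.
    video.currentTime = seconds;
  });
}
```

In a page this might be called as `captureFrameAt(videoElement, 3, document.createElement('canvas'))`, after which `canvas.toBlob()` can turn the drawn frame into an image.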
1.2 WebAssembly
Frame capture
FFmpeg, a powerful library written in C/C++, is compiled with Emscripten into WASM plus JS glue code, and JS then calls it to capture video frames. In terms of compatibility, WebAssembly is supported by all major browsers; for the few browsers that still do not support it, the old scheme is used instead.
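The fallback decision can be sketched as a simple feature check. This is only an illustration under assumptions: the function names and the two strategy labels (`'wasm-frontend'`, `'server-background'`) are hypothetical and do not come from the original project.

```javascript
// Returns true when the runtime exposes a usable WebAssembly API.
function supportsWebAssembly(globalObj = globalThis) {
  const wasm = globalObj.WebAssembly;
  if (typeof wasm !== 'object' || wasm === null) return false;
  if (typeof wasm.instantiate !== 'function' || typeof wasm.validate !== 'function') return false;
  // The 8-byte header ("\0asm" + version 1) is the smallest valid module.
  const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
  try {
    return wasm.validate(emptyModule);
  } catch (e) {
    return false;
  }
}

// Hypothetical strategy labels: capture on the front end, or fall back to the old background scheme.
function chooseCaptureStrategy(globalObj = globalThis) {
  return supportsWebAssembly(globalObj) ? 'wasm-frontend' : 'server-background';
}
```

A check like this only confirms that the API exists; a browser can still fail later when loading or running the module, so the actual load should also be wrapped in error handling.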
1.2.1 ffmpeg.wasm
Currently, there is an open source library, ffmpeg.wasm. It consists of:
- @ffmpeg/core: compiles FFmpeg to generate ffmpeg-core.wasm plus JS glue code.
- @ffmpeg/ffmpeg: wraps the calls to the glue code generated in the previous step and provides APIs such as load and run. If @ffmpeg/core does not meet a developer's needs, a custom ffmpeg-core.wasm can be built instead.
So, can we use it as is? A few problems need to be solved first:
- Browser compatibility: the browser's JS thread is single-threaded and mutually exclusive with the rendering thread. So as not to block page rendering and the JS main thread, @ffmpeg/core enables pthreads when compiling FFmpeg, and the resulting JS glue code uses SharedArrayBuffer. SharedArrayBuffer allows data sharing between the main thread and a worker, as well as among multiple workers, which is ideal for this scenario. However, because of security issues, all major browsers disable it by default. Enabling it requires configuring additional response headers, and browser support is not good enough to meet our production standard.
- Redundant WASM: the ffmpeg-core.wasm compiled by @ffmpeg/core includes almost all FFmpeg features, many of which are not needed for frame capture.
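Whether a pthreads build could run in a given browser can be checked at runtime. The sketch below is an illustration, not part of ffmpeg.wasm: `canUseSharedMemory` is a hypothetical name, and it assumes the "additional response headers" are the cross-origin isolation headers (`Cross-Origin-Opener-Policy: same-origin` together with `Cross-Origin-Embedder-Policy`, for example `require-corp`), which the original text does not name.

```javascript
// Returns true when a pthreads build (which needs SharedArrayBuffer) could run.
// `globalObj` defaults to the current global (window, a worker's self, or globalThis).
function canUseSharedMemory(globalObj = globalThis) {
  const hasSab = typeof globalObj.SharedArrayBuffer === 'function';
  // Browsers set crossOriginIsolated to true only when the page was served with
  // the cross-origin isolation headers. If the flag is absent, rely on hasSab alone.
  const isolated = 'crossOriginIsolated' in globalObj
    ? globalObj.crossOriginIsolated === true
    : true;
  return hasSab && isolated;
}
```

Because this check fails in any browser or deployment without those headers, a build that avoids SharedArrayBuffer entirely sidesteps the problem.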
Therefore, the final solution implements frame capture with WebAssembly as follows:
- Custom-compile FFmpeg to produce an ffmpeg-core.wasm + JS that does not use SharedArrayBuffer, and optimize the size of ffmpeg-core.wasm.
- Use @ffmpeg/ffmpeg to call the FFmpeg build from the previous step to capture frames.
- Use a web worker to run the frame-capture business code so that the main thread is not blocked.
2 Compiling FFmpeg
2.1 Understanding Concepts
First, let’s look at the concepts and principles involved.
2.1.1 FFmpeg
FFmpeg is an excellent C/C++ audio and video processing library that can capture screenshots from video. Libraries involved in taking a screenshot:
- libavcodec: encodes and decodes audio and video.
- libavformat: muxes and demuxes audio and video containers.
- libavutil: common utility functions, including arithmetic, string manipulation, etc.
- libswscale: image scaling and pixel format conversion.
Components involved:
- Demuxer: unpacks the video container.
- Decoder: decodes the video stream.
- Encoder: encodes the decoded frame as the output image.
- Muxer: writes the encoded image to its output format.
2.1.2 WebAssembly
WebAssembly, or WASM, is a new format that is portable, compact, fast to load, and compatible with the Web. It provides an efficient compilation target for languages such as C, C++ and Rust, enabling code written in multiple languages to run at near-native speed on the Web platform.
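To make the idea of a compilation target concrete, here is a minimal sketch that loads a tiny hand-assembled module from JavaScript. The bytes are an illustrative stand-in for what a compiler would emit for a C function such as `int add(int a, int b) { return a + b; }`; they are not taken from the FFmpeg build.

```javascript
// A hand-assembled module that exports add(a, b) = a + b on 32-bit integers.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic "\0asm" + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code: local.get 0, local.get 1, i32.add, end
]);

// Synchronous compilation is acceptable for a module this small; large modules
// such as ffmpeg-core.wasm are normally loaded with the asynchronous APIs.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});
const sum = wasmInstance.exports.add(2, 3);
console.log(sum); // 5
```

Note that the exported function only accepts and returns numbers, which is why passing strings to FFmpeg later requires copying them into the module's memory first.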
2.1.3 emscripten
Emscripten is a compiler toolchain that targets WebAssembly. Specifically, C/C++ source is compiled by the Clang front end into LLVM intermediate representation (IR), and the LLVM IR is then compiled into WASM. The browser downloads the WebAssembly module and compiles it into the assembly code of the target machine and finally into machine code (x86/ARM, etc.).
What are LLVM and Clang?
LLVM is a compiler framework in which different front ends and back ends share the same intermediate representation (LLVM IR). Clang is a sub-project of LLVM: a C/C++/Objective-C compiler front end built on the LLVM architecture.
- Frontend: lexical analysis, syntax analysis, semantic analysis, and intermediate code generation
- Optimizer: Intermediate code optimization (loop optimization, remove useless code, etc.)
- Backend: Generates object code. If the object code is absolute instruction code (machine code), the object code can be executed immediately. If the object code is assembly instruction code, it must be assembled by the assembler (generating machine code) before it can be run.
2.2 Build an Emscripten environment
You can install Emscripten by following the instructions on the official website. A more recommended approach is to install Docker Desktop and run a prebuilt Emscripten image with Docker, which avoids the pitfalls of setting up a local development environment. Download the FFmpeg source code; here, following @ffmpeg/core, we use version 4.3.1. In the source directory, write the following script to run Docker:
#!/bin/bash
set -euo pipefail
EM_VERSION=2.0.8
docker pull emscripten/emsdk:$EM_VERSION
# -v bind-mounts the current directory to /src inside the container
docker run \
  --rm \
  -v "$PWD":/src \
  emscripten/emsdk:$EM_VERSION \
  sh -c 'bash ./build.sh'
You can run docker run --help to view the description of each option. The next step is to write the build script, build.sh.
2.3 Configuring FFMPEG compilation Parameters
Use emconfigure to set the appropriate environment variables and configure FFmpeg's build options. Documentation for the configuration:
- Run emconfigure ./configure --help to view all available options.
- A detailed description of the FFmpeg configuration can be viewed here.
- Documentation for FFmpeg muxers and demuxers
- Documentation for FFmpeg encoders and decoders
# Collect the options in a bash array so that each one can carry a comment
CONFIG_ARGS=(
  --target-os=none        # use none to prevent any OS-specific configuration
  --arch=x86_32           # use x86_32 to achieve minimal architectural optimization
  --enable-cross-compile  # enable cross compile
  --disable-x86asm        # disable x86 asm
  --disable-inline-asm    # disable inline asm
  --disable-stripping     # disable stripping
  --disable-programs      # disable programs build (incl. ffplay, ffprobe & ffmpeg)
  --disable-doc           # disable doc
  --nm="llvm-nm"
  --ar=emar
  --ranlib=emranlib
  --cc=emcc
  --cxx=em++
  --objcc=emcc
  --dep-cc=emcc
  # Remove components that are not needed
  --disable-avdevice
  --disable-swresample
  --disable-postproc
  --disable-network
  --disable-pthreads
  --disable-w32threads
  --disable-os2threads
  # Demuxers, muxers, codecs, etc.
  --disable-everything    # the key to reducing WASM size: disable all components except those enabled below
  --enable-filters
  --enable-muxer=image2
  --enable-demuxer=mov    # mov,mp4,m4a,3gp,3g2,mj2
  --enable-demuxer=flv
  --enable-demuxer=h264
  --enable-demuxer=asf
  --enable-encoder=mjpeg
  --enable-decoder=hevc
  --enable-decoder=h264
  --enable-decoder=mpeg4
  --enable-protocol=file
)
emconfigure ./configure "${CONFIG_ARGS[@]}"
emmake make -j4 # build
2.4 Generating JS + WASM
Link the object code produced by the make step above into JavaScript + WebAssembly. You can view emcc's options with emcc --help and Clang's options with clang --help.
# Collect the options in a bash array so that each one can carry a comment
ARGS=(
  -I. -I./fftools  # add directories to the include search path
  -Llibavcodec -Llibavdevice -Llibavfilter -Llibavformat -Llibavresample -Llibavutil -Llibpostproc -Llibswscale -Llibswresample  # add directories to the library search path
  -Qunused-arguments  # don't emit warnings for unused driver arguments
  -o wasm/dist/ffmpeg-core.js fftools/ffmpeg_opt.c fftools/ffmpeg_filter.c fftools/ffmpeg_hw.c fftools/cmdutils.c fftools/ffmpeg.c  # output file, followed by the source files
  -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lm  # libraries to link
  -s USE_SDL=2  # use SDL2
  -s MODULARIZE=1  # use the modularized version to be more flexible
  -s EXPORT_NAME="createFFmpegCore"  # assign the export name for the browser
  -s EXPORTED_FUNCTIONS="[_main]"  # export the main function
  -s EXTRA_EXPORTED_RUNTIME_METHODS="[FS, cwrap, ccall, setValue, writeAsciiToMemory]"  # export extra runtime methods
  -s INITIAL_MEMORY=33554432  # 33554432 bytes = 32 MB
  -s ALLOW_MEMORY_GROWTH=1  # allow total memory to grow with the demands of the application
  --post-js wasm/post-js.js  # append a file after the emitted code; used to expose the exit function
  -O3  # optimize the generated code
)
emcc "${ARGS[@]}"
The final ffmpeg-core.wasm is 5 MB in size and will be even smaller after gzip.
3 Implementing Frame Capture
Here is a simple implementation that runs the compiled ffmpeg-core.js.
3.1 Calling the JS glue code
The code that calls the JS glue code is already implemented in the open source @ffmpeg/ffmpeg library, so we can simply use its API.
const { createFFmpeg } = require('@ffmpeg/ffmpeg');
const ffmpeg = createFFmpeg({ log: true });
(async () => {
  await ffmpeg.load();
  // Capture 8 key frames; the single quotes protect the comma inside the filter expression
  await ffmpeg.run('-i', 'example.mp4', '-r', '80', '-vf', "select='eq(pict_type,I)'", '-frames', '8', 'frame-%04d.jpg');
})();
So how are the load and run methods implemented?
The first thing to understand is that JavaScript can only pass Number values when exchanging data with C. From a language perspective, JavaScript and C/C++ have completely different type systems, and Number is their only intersection, so when they call each other they are essentially exchanging numbers.
Therefore, if an argument is a string, an array, or any other non-Number type, the call has to be split into the following steps:
- Use Module._malloc() to allocate memory in the Module heap and get its address, ptr.
- Copy the string/array data into memory at ptr.
- Call the C/C++ function with ptr as an argument.
- Use Module._free() to release ptr.
As you can see, the call process is quite tedious. To simplify it, Emscripten provides the cwrap function.
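The four manual steps can be sketched as follows. This is an illustration under assumptions, not code from @ffmpeg/ffmpeg: `callWithString` is a hypothetical helper, the C function it calls is a placeholder, and the sketch assumes the Emscripten module exposes its `HEAPU8` view.

```javascript
// Sketch of the four manual steps for passing a string to C.
// `Module` is an Emscripten module instance; `cFunctionName` is the exported symbol
// (with its leading underscore) of a hypothetical C function that takes a char*.
function callWithString(Module, cFunctionName, text) {
  const bytes = new TextEncoder().encode(text);
  // 1. Allocate memory in the module heap (+1 for the terminating NUL byte).
  const ptr = Module._malloc(bytes.length + 1);
  try {
    // 2. Copy the string data into the heap at ptr.
    Module.HEAPU8.set(bytes, ptr);
    Module.HEAPU8[ptr + bytes.length] = 0;
    // 3. Call the C function with the pointer (a Number) as its argument.
    return Module[cFunctionName](ptr);
  } finally {
    // 4. Release the memory, even if the call throws.
    Module._free(ptr);
  }
}
```

With cwrap, a declaration such as `Module.cwrap('count_bytes', 'number', ['string'])` (where `count_bytes` is again a placeholder name) performs the string copy automatically on each call.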
Here is part of the source code of @ffmpeg/ffmpeg:
const createFFmpegCore = require('path/to/ffmpeg-core.js');
let Core;
let ffmpeg;
// Load the core module
const load = async () => {
  Core = await createFFmpegCore({
    print: (message) => {},
  });
  // cwrap wraps the exported main function (cwrap adds the leading underscore itself)
  ffmpeg = Core.cwrap('main', 'number', ['number', 'number']);
};
// Copy the argument strings into the heap and build a C-style argv array
const parseArgs = (Core, args) => {
  const argsPtr = Core._malloc(args.length * Uint32Array.BYTES_PER_ELEMENT);
  args.forEach((s, idx) => {
    const buf = Core._malloc(s.length + 1);
    Core.writeAsciiToMemory(s, buf);
    Core.setValue(argsPtr + (Uint32Array.BYTES_PER_ELEMENT * idx), buf, 'i32');
  });
  return [args.length, argsPtr]; // [array length, array pointer]
};
// Run the ffmpeg command
const run = (..._args) => {
  return new Promise((resolve) => {
    ffmpeg(...parseArgs(Core, _args)); // pass the command arguments
    // This excerpt omits the step that calls resolve once ffmpeg has finished.
  });
};
module.exports = {
  load,
  run,
};
3.2 Frame capture optimization
Capturing n frames in a single run is very slow, because ffmpeg reads frame by frame until it reaches the frame at the specified time. The later the timestamp, the longer the capture takes.
We therefore changed the approach: seek directly to the key frame at each specified time and capture a single frame there, which is about a hundred times faster.
// ...
(async () => {
  // ...
  // Obtain the duration first
  const frameNum = 8;
  const per = duration / (frameNum - 1);
  for (let i = 0; i < frameNum; i++) {
    await ffmpeg.run('-ss', `${Math.floor(per * i)}`, '-i', 'example.mp4', '-s', '960x540', '-f', 'image2', '-frames', '1', `frame-${i + 1}.jpeg`);
  }
})();
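The snippet above leaves open how the duration is obtained. One possible way, shown here purely as an assumption, is to parse the "Duration: HH:MM:SS.xx" line that ffmpeg prints for an input, for example from a log callback. Both `parseDuration` and `buildSeekArgs` are hypothetical helper names; the second one only restates the loop above as a pure function so the seek timestamps are easy to inspect.

```javascript
// Parse the "Duration: HH:MM:SS.xx" text that ffmpeg logs for an input.
// Returns the duration in seconds, or null if the line does not contain it.
function parseDuration(logLine) {
  const m = /Duration:\s*(\d+):(\d{2}):(\d{2}(?:\.\d+)?)/.exec(logLine);
  if (!m) return null;
  return Number(m[1]) * 3600 + Number(m[2]) * 60 + Number(m[3]);
}

// Build one ffmpeg argument list per cover frame, evenly spaced over the duration.
function buildSeekArgs(duration, frameNum, input = 'example.mp4') {
  const per = frameNum > 1 ? duration / (frameNum - 1) : 0;
  const jobs = [];
  for (let i = 0; i < frameNum; i++) {
    jobs.push([
      '-ss', String(Math.floor(per * i)), // seek before opening the input
      '-i', input,
      '-s', '960x540',
      '-f', 'image2',
      '-frames', '1',
      `frame-${i + 1}.jpeg`,
    ]);
  }
  return jobs;
}
```

Each returned array could then be passed on as `await ffmpeg.run(...args)`. Placing `-ss` before `-i` is what lets ffmpeg seek in the input instead of decoding every frame up to the timestamp.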
4 Web Worker
Because -s USE_PTHREADS=1 is not set in the build, the ffmpeg calls above block the JS main thread and page rendering. For example, while the recommended covers are being generated, the upload progress cannot be updated and the page does not respond to button clicks. Therefore, ffmpeg needs to run in a web worker.
Web workers are scripts that run on threads separate from the page's main thread and can take almost all heavy processing off it. The main thread and the worker communicate through the postMessage() method and the onmessage event:
main.js
var myWorker = new Worker('worker.js');
myWorker.onmessage = function(e) {
  result.textContent = e.data;
  console.log('Message received from worker');
}
first.onchange = function() {
  myWorker.postMessage([first.value,second.value]);
  console.log('Message posted to worker');
}
worker.js
onmessage = function(e) {
  console.log('Message received from main script');
  var workerResult = 'Result: ' + (e.data[0] * e.data[1]);
  console.log('Posting message back to main script');
  postMessage(workerResult);
}
Comlink (1.1 KB) is recommended here; it makes this message-based API friendlier by providing an RPC implementation. For example, the frame-capture communication can be implemented as follows:
main.js
import * as Comlink from 'comlink';
async function onFileUpload(file) {
  const ffmpegWorker = Comlink.wrap(new Worker('./worker.js'));
  const frameU8Arrs = await ffmpegWorker.getFrames(file);
  console.log('get frameU8Arrs from worker', frameU8Arrs);
}
worker.js
import * as Comlink from 'comlink';
async function getFrames(file) {
  // ...
  // Obtain the video duration (duration) first
  const frameNum = 8;
  const per = duration / (frameNum - 1);
  const frameU8Arrs = [];
  for (let i = 0; i < frameNum; i++) {
    await ffmpeg.run('-ss', `${Math.floor(per * i)}`, '-i', 'example.mp4', '-s', '960x540', '-f', 'image2', '-frames', '1', `frame-${i + 1}.jpeg`);
  }
  // Read the image binary data (Uint8Array) from MEMFS
  for (let i = 0; i < frameNum; i++) {
    const fileName = `frame-${i + 1}.jpeg`;
    const u8arr = await ffmpeg.FS('readFile', fileName);
    frameU8Arrs.push(u8arr);
    // Delete the file from MEMFS once it has been read
    ffmpeg.FS('unlink', fileName);
  }
  return frameU8Arrs;
}
Comlink.expose({
  getFrames,
});
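Once the main thread receives the Uint8Array frames, they still have to be turned into something the page can display. A minimal sketch, assuming the frames are JPEG data as in the worker above; `frameToBlob` and `framesToObjectUrls` are hypothetical helper names.

```javascript
// Wrap the JPEG bytes returned by the worker in a Blob.
function frameToBlob(u8arr) {
  return new Blob([u8arr], { type: 'image/jpeg' });
}

// Create one object URL per frame, suitable for <img src="...">.
// Call URL.revokeObjectURL on each URL once the cover has been chosen.
function framesToObjectUrls(frameU8Arrs) {
  return frameU8Arrs.map((u8arr) => URL.createObjectURL(frameToBlob(u8arr)));
}
```

The same Blob can also be uploaded as the selected cover, so the image does not need to be generated again on the server.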
In addition, if you are using Webpack, you can configure worker-plugin so that the correct worker.js path is loaded.
5 Results in Production
After launch, in browsers that support this solution, users can select and edit the video cover without waiting for the video to finish uploading. Frame capture takes on average 50% less time than with the old scheme. Points still to be optimized:
- Some browsers report errors, so we will continue to optimize and improve browser support (for example, QQ Browser does not support WebAssembly.Memory, and some Safari versions fail to fetch the WASM file).
- The WASM size can still be reduced (for example, compiling with --enable-filters includes all filters).