WebCodecs
- Draft: wicg.github.io/web-codecs
- GitHub: github.com/WICG/web-co…
An API that allows Web applications to encode and decode audio and video.
To try it out in Chrome >= 86:
- Open chrome://flags/#enable-experimental-web-platform-features and set it to Enabled
- Or enable the feature when launching Chrome from the command line:
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --enable-blink-features=WebCodecs
// Check whether the current browser supports the WebCodecs API
if ('VideoEncoder' in window) {
  // The WebCodecs API is supported
}
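Since the demo later in this article also uses VideoDecoder, a slightly fuller guard can check both halves of the API. A minimal sketch (not required by the spec, just a convenience):

// Sketch: check both the encoder and decoder halves of WebCodecs
if (!('VideoEncoder' in window) || !('VideoDecoder' in window)) {
  // Neither encoding nor decoding is available; warn or fall back
  console.warn('WebCodecs is unavailable; enable the experimental flag in Chrome >= 86');
}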
Why the WebCodecs API exists
There are already a number of Web APIs for media manipulation: the Media Stream API, the Media Recording API, the Media Source API, and the WebRTC API. But none of them give Web developers low-level access to individual video frames or to encoded chunks.
Many audio and video editors work around this by bringing codecs into the browser with WebAssembly. The problem is that browsers already ship low-level audio and video codecs, many of them heavily tuned and hardware-accelerated, so repackaging those capabilities with WebAssembly seems like a waste of both human and machine resources.
Hence the WebCodecs API, which exposes media capabilities the browser already has, such as:
- Video and audio decoding
- Video and audio encoding
- Raw video frames
- Image decoding
With these, Web media projects such as video editors, video conferencing, and video streaming become much easier to build.
Track the project's progress at github.com/WICG/web-co…
The WebCodecs processing pipeline
Frames are central to video processing, so most WebCodecs classes either consume frames or produce frames. A VideoEncoder converts frames into encoded chunks, and a VideoDecoder converts encoded chunks back into frames. All of this happens asynchronously off the main thread, which keeps the main thread responsive.
Currently, the only way WebCodecs can display a frame on the page is to convert it into an ImageBitmap and draw the bitmap on a canvas, or to convert it into a WebGLTexture.
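That display path can be sketched in a few lines, using frame.createImageBitmap() and frame.destroy() as they appear in the draft API this article targets (paintFrame is a hypothetical helper name; ctx is assumed to be a 2D canvas context):

async function paintFrame(frame) {
  // VideoFrame -> ImageBitmap, the only display path available today
  const bitmap = await frame.createImageBitmap();
  // Read anything you need from the frame, then release its memory
  frame.destroy();
  // ImageBitmap -> canvas
  ctx.drawImage(bitmap, 0, 0);
}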
WebCodecs in practice
Encoding
There are currently two ways to turn an existing image into a VideoFrame:
- Create a frame directly from an ImageBitmap
- Use VideoTrackReader and set a callback to handle frames coming from a [MediaStreamTrack](https://developer.mozilla.org/en-US/docs/Web/API/MediaStreamTrack); this is useful when you need to capture the video stream from a camera or the screen
ImageBitmap
let cnv = document.createElement('canvas');
// Draw something on the canvas.
let bitmap = await createImageBitmap(cnv);
let frameFromBitmap = new VideoFrame(bitmap, { timestamp: 0 });
The first way is to create the frame directly from an ImageBitmap: call the new VideoFrame() constructor and supply it with the bitmap and a display timestamp.
VideoTrackReader
const framesFromStream = [];
const stream = await navigator.mediaDevices.getUserMedia({ ... });
const vtr = new VideoTrackReader(stream.getVideoTracks()[0]);
vtr.start((frame) => {
  framesFromStream.push(frame);
});
VideoEncoder encodes frames into EncodedVideoChunk objects. It requires two objects:
- An init object with the output and error callbacks; it is passed to the VideoEncoder constructor and cannot be modified afterwards
- An encoder configuration object containing the parameters of the output video stream; configure() can be called again later to change these parameters
const init = {
  output: handleChunk,
  error: (e) => {
    console.log(e.message);
  },
};

let config = {
  codec: 'vp8',
  width: 640,
  height: 480,
  bitrate: 8_000_000, // 8 Mbps
  framerate: 30,
};

let encoder = new VideoEncoder(init);
encoder.configure(config);
After the encoder is set up, it is ready to accept frames. When frames start arriving from the media stream, the callback passed to VideoTrackReader.start() is executed. Pace the frames you hand to the encoder: check how many outputs are still pending so that too many in-flight frames don't overwhelm it. Note that encoder.configure() and encoder.encode() return immediately, without waiting for the actual work to complete; when an encoded chunk is ready, the output() callback is invoked with it. Also note that encoder.encode() consumes the frame: if the frame is needed later, call clone() to copy it first.
let frameCounter = 0;
let pendingOutputs = 0;
const vtr = new VideoTrackReader(stream.getVideoTracks()[0]);
vtr.start((frame) => {
  if (pendingOutputs > 30) {
    // Too many frames are still being encoded; drop this one
    return;
  }
  frameCounter++;
  pendingOutputs++;
  const insert_keyframe = frameCounter % 150 === 0;
  encoder.encode(frame, { keyFrame: insert_keyframe });
});
Finally, implement the handleChunk method, which typically sends the encoded chunks over the network or muxes them into a media container:
function handleChunk(chunk) {
  const data = new Uint8Array(chunk.data); // actual bytes of encoded data
  const timestamp = chunk.timestamp; // media time in microseconds
  const is_key = chunk.type === 'key'; // can also be 'delta'
  pendingOutputs--;
  fetch(`/upload_chunk?timestamp=${timestamp}&type=${chunk.type}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/octet-stream' },
    body: data,
  });
}
Sometimes you need to make sure all pending encoding requests have completed; do that by calling flush():
await encoder.flush();
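One encoding detail worth a concrete sketch: as noted above, encoder.encode() consumes the frame it is given, so a frame that is still needed afterwards must be cloned first. A minimal sketch (showPreview is a hypothetical helper standing in for any later use of the frame):

vtr.start((frame) => {
  // encode() consumes `frame`; keep a copy if it is needed afterwards
  const copy = frame.clone();
  encoder.encode(frame);
  showPreview(copy); // hypothetical: any later use of the cloned frame
});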
Decoding
Setting up a VideoDecoder is similar: an init object and a config object are passed in.
const init = {
  output: handleFrame,
  error: (e) => {
    console.log(e.message);
  },
};

const config = {
  codec: 'vp8',
  codedWidth: 640,
  codedHeight: 480,
};

const decoder = new VideoDecoder(init);
decoder.configure(config);
Once the decoder is set up, you can feed it EncodedVideoChunk objects; a chunk is created from a [BufferSource](https://developer.mozilla.org/en-US/docs/Web/API/BufferSource) of encoded data.
const responses = await downloadVideoChunksFromServer(timestamp);
for (let i = 0; i < responses.length; i++) {
const chunk = new EncodedVideoChunk({
timestamp: responses[i].timestamp,
data: new Uint8Array(responses[i].body),
});
decoder.decode(chunk);
}
await decoder.flush();
Rendering a decoded frame on the page takes three steps:
- Convert the VideoFrame into an [ImageBitmap](https://developer.mozilla.org/en-US/docs/Web/API/ImageBitmap)
- Wait for the right time to show the frame
- Draw the image on the canvas
When a frame is no longer needed, call destroy() to release it manually rather than waiting for garbage collection; this reduces the page's memory footprint.
const cnv = document.getElementById("canvas_to_render");
const ctx = cnv.getContext("2d", { alpha: false });
const readyFrames = [];
let underflow = true;
let timeBase = 0;

function handleFrame(frame) {
  readyFrames.push(frame);
  if (underflow) setTimeout(renderFrame, 0);
}

function delay(time_ms) {
  return new Promise((resolve) => {
    setTimeout(resolve, time_ms);
  });
}

function calculateTimeTillNextFrame(timestamp) {
  if (timeBase === 0) timeBase = performance.now();
  const media_time = performance.now() - timeBase;
  // Frame timestamps are in microseconds; convert to milliseconds
  return Math.max(0, timestamp / 1000 - media_time);
}

async function renderFrame() {
  if (readyFrames.length === 0) {
    underflow = true;
    return;
  }
  const frame = readyFrames.shift();
  underflow = false;
  const bitmap = await frame.createImageBitmap();
  // Based on the frame's timestamp, calculate how long to wait
  // before the next frame should be displayed
  const timeTillNextFrame = calculateTimeTillNextFrame(frame.timestamp);
  frame.destroy(); // read the timestamp before destroying the frame
  await delay(timeTillNextFrame);
  ctx.drawImage(bitmap, 0, 0);
  // Immediately schedule rendering of the next frame
  setTimeout(renderFrame, 0);
}
Demo
Demo address: ringcrl.github.io/static/webc…
If it doesn't open, refer to the instructions above for trying it out in Chrome >= 86.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<style>
canvas {
padding: 10px;
background: gold;
}
button {
background-color: #555555;
border: none;
color: white;
padding: 15px 32px;
width: 150px;
text-align: center;
display: block;
font-size: 16px;
}
</style>
</head>
<body>
<button onclick="playPause()">Pause</button>
<canvas id="dst" width="640" height="480"></canvas>
<canvas style="visibility: hidden;" id="src" width="640" height="480"></canvas>
<script src="./main.js"></script>
</body>
</html>
const codecString = "vp8";
let keepGoing = true;
function playPause() {
  keepGoing = !keepGoing;
  const btn = document.querySelector("button");
  if (keepGoing) {
    btn.innerText = "Pause";
  } else {
    btn.innerText = "Play";
  }
}

function delay(time_ms) {
  return new Promise((resolve) => {
    setTimeout(resolve, time_ms);
  });
}
async function startDrawing() {
const cnv = document.getElementById("src");
const ctx = cnv.getContext("2d", { alpha: false });
ctx.fillStyle = "white";
const { width } = cnv;
const { height } = cnv;
const cx = width / 2;
const cy = height / 2;
const r = Math.min(width, height) / 5;
const drawOneFrame = function drawOneFrame(time) {
const angle = Math.PI * 2 * (time / 5000);
const scale = 1 + 0.3 * Math.sin(Math.PI * 2 * (time / 7000));
ctx.save();
ctx.fillRect(0, 0, width, height);
ctx.translate(cx, cy);
ctx.rotate(angle);
ctx.scale(scale, scale);
ctx.font = "30px Verdana";
ctx.fillStyle = "black";
const text = "📹 📷 Hello WebCodecs 🎥 🎞️";
const size = ctx.measureText(text).width;
ctx.fillText(text, -size / 2, 0);
ctx.restore();
window.requestAnimationFrame(drawOneFrame);
};
window.requestAnimationFrame(drawOneFrame);
}
function captureAndEncode(processChunk) {
const cnv = document.getElementById("src");
const fps = 60;
let pendingOutputs = 0;
let frameCounter = 0;
const stream = cnv.captureStream(fps);
const vtr = new VideoTrackReader(stream.getVideoTracks()[0]);
const init = {
output: (chunk) => {
pendingOutputs--;
processChunk(chunk);
},
error: (e) => {
  console.log(e.message);
  vtr.stop();
},
};

const config = {
  codec: codecString,
  width: cnv.width,
  height: cnv.height,
  bitrate: 10e6, // 10 Mbps
  framerate: fps,
};
const encoder = new VideoEncoder(init);
encoder.configure(config);
vtr.start((frame) => {
  if (!keepGoing) return;
if (pendingOutputs > 30) {
// Too many frames in flight, encoder is overwhelmed
// let's drop this frame.
return;
}
frameCounter++;
pendingOutputs++;
const insert_keyframe = frameCounter % 150 === 0;
encoder.encode(frame, { keyFrame: insert_keyframe });
});
}
// Convert a frame to a bitmap, taking at least timeout_ms
// (a helper that the rest of this demo does not call)
async function frameToBitmapInTime(frame, timeout_ms) {
  const options = { colorSpaceConversion: "none" };
  const convertPromise = frame.createImageBitmap(options);
  if (timeout_ms <= 0) return convertPromise;
  const results = await Promise.all([convertPromise, delay(timeout_ms)]);
  return results[0];
}
function startDecodingAndRendering() {
const cnv = document.getElementById("dst");
const ctx = cnv.getContext("2d", { alpha: false });
const readyFrames = [];
let underflow = true;
let timeBase = 0;
function calculateTimeTillNextFrame(timestamp) {
if (timeBase === 0) timeBase = performance.now();
const mediaTime = performance.now() - timeBase;
return Math.max(0, timestamp / 1000 - mediaTime);
}
async function renderFrame() {
if (readyFrames.length === 0) {
underflow = true;
return;
}
const frame = readyFrames.shift();
underflow = false;
const bitmap = await frame.createImageBitmap();
// Based on the frame timestamp, calculate the real-time wait time required before the next frame is displayed
const timeTillNextFrame = calculateTimeTillNextFrame(frame.timestamp);
await delay(timeTillNextFrame);
ctx.drawImage(bitmap, 0, 0);
// Render the next frame immediately
setTimeout(renderFrame, 0);
frame.destroy();
}
function handleFrame(frame) {
  readyFrames.push(frame);
  if (underflow) {
    underflow = false;
    setTimeout(renderFrame, 0);
  }
}

const init = {
  output: handleFrame,
  error: (e) => {
    console.log(e.message);
  },
};

const config = {
  codec: codecString,
  codedWidth: cnv.width,
  codedHeight: cnv.height,
};
const decoder = new VideoDecoder(init);
decoder.configure(config);
return decoder;
}
function main() {
if(! ("VideoEncoder" in window)) {
document.body.innerHTML = "<h1>WebCodecs API is not supported.</h1>";
return;
}
startDrawing();
const decoder = startDecodingAndRendering();
captureAndEncode((chunk) => {
decoder.decode(chunk);
});
}
document.body.onload = main;
References
Video processing with WebCodecs