WebCodecs

  • Draft: wicg.github.io/web-codecs
  • GitHub: github.com/WICG/web-co…

An API that allows Web applications to encode and decode audio and video

Experience it in Chrome >= 86

  • Open chrome://flags/#enable-experimental-web-platform-features and set it to Enabled
  • Or enable it from the command line: /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --enable-blink-features=WebCodecs
// Check whether the current browser supports the WebCodecs API
if ('VideoEncoder' in window) {
  // The WebCodecs API is supported
}

Why the WebCodecs API exists

There are already a number of Web APIs for media manipulation: the Media Stream API, the MediaRecorder API, the Media Source API, and the WebRTC API. However, none of them give Web developers low-level access to the individual frames of an encoded video stream, or a way to pack and unpack encoded chunks.

Many Web-based audio and video editors work around this by bringing codecs into the browser with WebAssembly. But the problem is that browsers already ship these codecs natively, often heavily tuned for hardware acceleration; repackaging those capabilities in WebAssembly seems like a waste of both human and computing resources.

Hence the WebCodecs API, which exposes media primitives built on capabilities browsers already have, such as:

  • Video and audio decoding
  • Video and audio encoding
  • Raw video frames
  • Image decoding

With these in place, Web media projects such as video editors, video conferencing, and video streaming become much easier to build.

Check the project progress: github.com/WICG/web-co…

The WebCodecs processing pipeline

Frames are central to video processing, so most WebCodecs classes either consume frames or produce them: a VideoEncoder converts frames into encoded chunks, and a VideoDecoder converts encoded chunks back into frames. All of this happens asynchronously off the main thread, keeping the main thread responsive.
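As a quick orientation, the round trip can be sketched as a loopback in which the encoder's output callback feeds chunks straight into a decoder; the demo at the end of this article wires things up the same way:

const decoder = new VideoDecoder({
  output: (frame) => { /* render or store the decoded frame */ },
  error: (e) => console.log(e.message),
});
decoder.configure({ codec: 'vp8', codedWidth: 640, codedHeight: 480 });

const encoder = new VideoEncoder({
  output: (chunk) => decoder.decode(chunk), // encoded chunks go straight back in
  error: (e) => console.log(e.message),
});
encoder.configure({ codec: 'vp8', width: 640, height: 480, bitrate: 8_000_000, framerate: 30 });
// Every frame passed to encoder.encode() eventually reaches the decoder's output callback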

Currently, the only way WebCodecs offers to display a frame on a page is to convert it to an ImageBitmap, then either draw the bitmap on a canvas or turn it into a WebGLTexture.
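The canvas path is shown in detail in the rendering section below. For the WebGLTexture path, a minimal sketch could look like the following (frameToTexture is an illustrative helper, assuming a WebGLRenderingContext gl and the same frame.createImageBitmap()/destroy() calls used throughout this article):

async function frameToTexture(gl, frame) {
  const bitmap = await frame.createImageBitmap();
  frame.destroy(); // release the frame once we have the bitmap

  const texture = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, texture);
  // An ImageBitmap is a valid TexImageSource, so it can be uploaded directly
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, bitmap);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
  bitmap.close();
  return texture;
}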

WebCodecs in practice

Encoding

There are currently two ways to turn an existing image into a VideoFrame:

  • One is to create the frame from an ImageBitmap
  • The other is to set a callback on a VideoTrackReader to handle frames from a [MediaStreamTrack](https://developer.mozilla.org/en-US/docs/Web/API/MediaStreamTrack); this API is useful when you need to capture a video stream from the camera or the screen

ImageBitmap

let cnv = document.createElement('canvas');
// draw something on the canvas
let bitmap = await createImageBitmap(cnv);
let frameFromBitmap = new VideoFrame(bitmap, { timestamp: 0 });

The first approach creates the frame directly from an ImageBitmap: simply call the new VideoFrame() constructor, passing it the bitmap and a display timestamp.

VideoTrackReader

const framesFromStream = [];
const stream = await navigator.mediaDevices.getUserMedia({ ... });
const vtr = new VideoTrackReader(stream.getVideoTracks()[0]);
vtr.start((frame) => {
  framesFromStream.push(frame);
});

To encode frames into EncodedVideoChunk objects you use a VideoEncoder, which requires two objects:

  • An init object with output and error callbacks; once passed to the VideoEncoder it cannot be modified
  • An encoder configuration object containing parameters for the output video stream; these parameters can be changed later by calling configure() again
const init = {
  output: handleChunk,
  error: (e) => {
    console.log(e.message);
  }
};

let config = {
  codec: 'vp8',
  width: 640,
  height: 480,
  bitrate: 8_000_000, // 8 Mbps
  framerate: 30
};

let encoder = new VideoEncoder(init);
encoder.configure(config);

Once the encoder is set up, it is ready to accept frames. When frames start arriving from the media stream, the callback passed to VideoTrackReader.start() is executed. Before passing each frame to the encoder, check how many outputs are still pending, so that a flood of frames does not overwhelm it.

Note: encoder.configure() and encoder.encode() return immediately, without waiting for the actual work to complete. When an encoded chunk is ready, the output() callback is invoked with it. Also note that encoder.encode() consumes the frame; if the frame is needed later, call clone() to copy it first.

let frameCounter = 0;
let pendingOutputs = 0;
const vtr = new VideoTrackReader(stream.getVideoTracks()[0]);

vtr.start((frame) => {
  if (pendingOutputs > 30) {
    // Too many frames in flight; the encoder is overwhelmed, so drop this frame
    return;
  }
  frameCounter++;
  pendingOutputs++;
  const insert_keyframe = frameCounter % 150 === 0;
  encoder.encode(frame, { keyFrame: insert_keyframe });
});
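The snippet above simply drops frames under backpressure; the clone() case mentioned in the note is not shown. Here is a minimal sketch of keeping a frame usable after encoding (drawPreview is a hypothetical helper standing in for any later use of the frame):

vtr.start((frame) => {
  const copy = frame.clone();                 // copy before encode() consumes the original
  encoder.encode(frame, { keyFrame: false }); // `frame` must not be used after this call
  drawPreview(copy);                          // hypothetical: use the clone elsewhere
  copy.destroy();                             // release the copy when done
});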

Finally, implement the handleChunk method, which typically sends the encoded chunks over the network or muxes them into a media container.

function handleChunk(chunk) {
  let data = new Uint8Array(chunk.data);  // actual bytes of encoded data
  let timestamp = chunk.timestamp;        // media time in microseconds
  let is_key = chunk.type == 'key';       // can also be 'delta'
  pendingOutputs--;
  fetch(`/upload_chunk?timestamp=${timestamp}&type=${chunk.type}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/octet-stream' },
    body: data
  });
}

Sometimes you need to make sure all pending encode requests have completed; do this by calling flush():

await encoder.flush();

Decoding

Setting up a VideoDecoder is similar to the above: you pass init and config objects when creating it.

const init = {
  output: handleFrame,
  error: (e) => {
    console.log(e.message);
  }
};

const config = {
  codec: "vp8",
  codedWidth: 640,
  codedHeight: 480
};

const decoder = new VideoDecoder(init);
decoder.configure(config);

Once the decoder is set up, you can start feeding it EncodedVideoChunk objects, which are created from a [BufferSource](https://developer.mozilla.org/en-US/docs/Web/API/BufferSource) of encoded data.

const responses = await downloadVideoChunksFromServer(timestamp);
for (let i = 0; i < responses.length; i++) {
  const chunk = new EncodedVideoChunk({
    timestamp: responses[i].timestamp,
    data: new Uint8Array(responses[i].body),
  });
  decoder.decode(chunk);
}
await decoder.flush();


Rendering a decoded frame on the page takes three steps:

  • Convert the VideoFrame into an [ImageBitmap](https://developer.mozilla.org/en-US/docs/Web/API/ImageBitmap)
  • Wait for the right time to display the frame
  • Draw the image on the canvas

When a frame is no longer needed, call destroy() to release it manually, ahead of garbage collection; this reduces the page's memory footprint.

const cnv = document.getElementById("canvas_to_render");
const ctx = cnv.getContext("2d", { alpha: false });
const readyFrames = [];
let underflow = true;
let timeBase = 0;

function handleFrame(frame) {
  readyFrames.push(frame);
  if (underflow) setTimeout(renderFrame, 0);
}

function delay(time_ms) {
  return new Promise((resolve) => {
    setTimeout(resolve, time_ms);
  });
}

function calculateTimeTillNextFrame(timestamp) {
  if (timeBase == 0) timeBase = performance.now();
  const media_time = performance.now() - timeBase;
  // Frame timestamps are in microseconds; divide by 1000 to compare in milliseconds
  return Math.max(0, timestamp / 1000 - media_time);
}

async function renderFrame() {
  if (readyFrames.length === 0) {
    underflow = true;
    return;
  }
  const frame = readyFrames.shift();
  underflow = false;

  const bitmap = await frame.createImageBitmap();
  // Based on the frame timestamp, calculate how long to wait before showing the next frame
  const timeTillNextFrame = calculateTimeTillNextFrame(frame.timestamp);
  frame.destroy(); // destroy only after reading the timestamp
  await delay(timeTillNextFrame);
  ctx.drawImage(bitmap, 0, 0);

  // Render the next frame immediately
  setTimeout(renderFrame, 0);
}

Demo

Experience it at: ringcrl.github.io/static/webc…

If you can’t open it, refer to [Experience it in Chrome >= 86] above.

<!DOCTYPE html>
<html>

<head>
  <meta charset="UTF-8">
  <style>
    canvas {
      padding: 10px;
      background: gold;
    }

    button {
      background-color: #555555;
      border: none;
      color: white;
      padding: 15px 32px;
      width: 150px;
      text-align: center;
      display: block;
      font-size: 16px;
    }
  </style>
</head>

<body>
  <button onclick="playPause()">Pause</button>
  <canvas id="dst" width="640" height="480"></canvas>
  <canvas style="visibility: hidden;" id="src" width="640" height="480"></canvas>
  <script src="./main.js"></script>

</body>

</html>
The main.js referenced above:
const codecString = "vp8";
let keepGoing = true;

function playPause() {
  keepGoing = !keepGoing;
  const btn = document.querySelector("button");
  if (keepGoing) {
    btn.innerText = "Pause";
  } else {
    btn.innerText = "Play";
  }
}

function delay(time_ms) {
  return new Promise((resolve) => {
    setTimeout(resolve, time_ms);
  });
}

async function startDrawing() {
  const cnv = document.getElementById("src");
  const ctx = cnv.getContext("2d", { alpha: false });

  ctx.fillStyle = "white";
  const { width } = cnv;
  const { height } = cnv;
  const cx = width / 2;
  const cy = height / 2;
  const r = Math.min(width, height) / 5;
  const drawOneFrame = function drawOneFrame(time) {
    const angle = Math.PI * 2 * (time / 5000);
    const scale = 1 + 0.3 * Math.sin(Math.PI * 2 * (time / 7000));
    ctx.save();
    ctx.fillRect(0, 0, width, height);

    ctx.translate(cx, cy);
    ctx.rotate(angle);
    ctx.scale(scale, scale);

    ctx.font = "30px Verdana";
    ctx.fillStyle = "black";
    const text = "😊 📹 📷 Hello WebCodecs 🎥 🎞 ️ 😊";
    const size = ctx.measureText(text).width;
    ctx.fillText(text, -size / 2, 0);
    ctx.restore();
    window.requestAnimationFrame(drawOneFrame);
  };
  window.requestAnimationFrame(drawOneFrame);
}

function captureAndEncode(processChunk) {
  const cnv = document.getElementById("src");
  const fps = 60;
  let pendingOutputs = 0;
  let frameCounter = 0;
  const stream = cnv.captureStream(fps);
  const vtr = new VideoTrackReader(stream.getVideoTracks()[0]);

  const init = {
    output: (chunk) => {
      pendingOutputs--;
      processChunk(chunk);
    },
    error: (e) => {
      console.log(e.message);
      vtr.stop();
    }
  };

  const config = {
    codec: codecString,
    width: cnv.width,
    height: cnv.height,
    bitrate: 10e6, // 10 Mbps
    framerate: fps,
  };

  const encoder = new VideoEncoder(init);
  encoder.configure(config);

  vtr.start((frame) => {
    if (!keepGoing) return;
    if (pendingOutputs > 30) {
      // Too many frames in flight, encoder is overwhelmed
      // let's drop this frame.
      return;
    }
    frameCounter++;
    pendingOutputs++;
    const insert_keyframe = frameCounter % 150 === 0;
    encoder.encode(frame, { keyFrame: insert_keyframe });
  });
}

async function frameToBitmapInTime(frame, timeout_ms) {
  const options = { colorSpaceConversion: "none" };
  const convertPromise = frame.createImageBitmap(options);

  if (timeout_ms <= 0) return convertPromise;

  const results = await Promise.all([convertPromise, delay(timeout_ms)]);
  return results[0];
}

function startDecodingAndRendering() {
  const cnv = document.getElementById("dst");
  const ctx = cnv.getContext("2d", { alpha: false });
  const readyFrames = [];
  let underflow = true;
  let timeBase = 0;

  function calculateTimeTillNextFrame(timestamp) {
    if (timeBase === 0) timeBase = performance.now();
    const mediaTime = performance.now() - timeBase;
    return Math.max(0, timestamp / 1000 - mediaTime);
  }

  async function renderFrame() {
    if (readyFrames.length === 0) {
      underflow = true;
      return;
    }
    const frame = readyFrames.shift();
    underflow = false;

    const bitmap = await frame.createImageBitmap();
    // Based on the frame timestamp, calculate the real-time wait time required before the next frame is displayed
    const timeTillNextFrame = calculateTimeTillNextFrame(frame.timestamp);
    await delay(timeTillNextFrame);
    ctx.drawImage(bitmap, 0, 0);

    // Render the next frame immediately
    setTimeout(renderFrame, 0);
    frame.destroy();
  }

  function handleFrame(frame) {
    readyFrames.push(frame);
    if (underflow) {
      underflow = false;
      setTimeout(renderFrame, 0);
    }
  }

  const init = {
    output: handleFrame,
    error: (e) => {
      console.log(e.message);
    }
  };

  const config = {
    codec: codecString,
    codedWidth: cnv.width,
    codedHeight: cnv.height,
  };

  const decoder = new VideoDecoder(init);
  decoder.configure(config);
  return decoder;
}

function main() {
  if(! ("VideoEncoder" in window)) {
    document.body.innerHTML = "<h1>WebCodecs API is not supported.</h1>";
    return;
  }
  startDrawing();
  const decoder = startDecodingAndRendering();
  captureAndEncode((chunk) => {
    decoder.decode(chunk);
  });
}

document.body.onload = main;

References

Video processing with WebCodecs