This is the second article in the danmaku player series, on how video sites like iQiyi, Youku, and Tencent Video play streaming media. For the previous article, see Developing a Danmaku Video Player From Scratch (1). Official site: nplayer.js.org/.

If you inspect a video on any major video site, you'll see that the src attribute of its video element is a string starting with blob:.

Why do video URLs have a blob: prefix? Because these sites play video with MSE, the subject of this article.

Media Source Extensions

Although the video element is very powerful, there are still many features it does not support on its own, such as live streaming, switching video quality on the fly, and dynamically changing the audio language.

Addressing these issues is MSE (Media Source Extensions), a W3C specification that is supported by most browsers today.

It uses the video tag plus JavaScript to implement these complex features: set the video's src to a URL representing a MediaSource object, fetch media data over HTTP, and pass it to the SourceBuffers of the MediaSource to play the video.

How do you connect a MediaSource to a video element? With URL.createObjectURL, which creates a DOMString representing the given File, Blob, or MediaSource object. The lifetime of this URL is bound to the document of the window in which it was created.

const video = document.querySelector('video')
const mediaSource = new MediaSource()

mediaSource.addEventListener('sourceopen', ({ target }) => {
    URL.revokeObjectURL(video.src)
    const mime = 'video/webm; codecs="vorbis, vp8"'
    const sourceBuffer = target.addSourceBuffer(mime) // target is the mediaSource
    fetch('/static/media/flower.webm')
        .then(response => response.arrayBuffer())
        .then(arrayBuffer => {
            sourceBuffer.addEventListener('updateend', () => {
                if (!sourceBuffer.updating && target.readyState === 'open') {
                    target.endOfStream()
                    video.play()
                }
            })
            sourceBuffer.appendBuffer(arrayBuffer)
        })
})

video.src = URL.createObjectURL(mediaSource)

The addSourceBuffer method creates a new SourceBuffer object based on the given MIME type, and then appends it to MediaSource’s SourceBuffers list.

We need to pass a specific codecs string: here the first codec is the audio (vorbis) and the second is the video (vp8); the two positions can also be swapped. Knowing the exact codecs, the browser can tell whether the type is supported without downloading any media data. If the type is not supported, addSourceBuffer throws a NotSupportedError. For more about MIME codecs strings, see RFC 4281.
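For example, you can probe several candidate types with MediaSource.isTypeSupported (covered in detail below) before calling addSourceBuffer. A minimal sketch; the MIME strings are just examples:

const candidates = [
    'video/webm; codecs="vorbis, vp8"',
    'video/mp4; codecs="avc1.4d401e, mp4a.40.2"'
]
// Pick the first type the browser claims to support.
const mime = candidates.find(type => MediaSource.isTypeSupported(type))
if (!mime) throw new Error('none of the candidate types is supported')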

URL.revokeObjectURL is also called at the start of the sourceopen handler. This does not destroy any object; it can be called at any time after the MediaSource is attached to the video, and it lets the browser garbage-collect the object URL when appropriate.

Media data is not pushed to the MediaSource directly, but to its SourceBuffers; a MediaSource contains one or more SourceBuffer objects. Each is associated with a content type, which may be video, audio, or muxed audio and video.
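For example, separated audio and video streams each get their own SourceBuffer. A sketch; the codec strings and file URLs are placeholders for your actual streams:

const video = document.querySelector('video')
const ms = new MediaSource()

ms.addEventListener('sourceopen', () => {
    // one SourceBuffer per content type
    const videoSb = ms.addSourceBuffer('video/mp4; codecs="avc1.4d401e"')
    const audioSb = ms.addSourceBuffer('audio/mp4; codecs="mp4a.40.2"')
    fetch('/static/media/video-only.mp4')
        .then(r => r.arrayBuffer())
        .then(ab => videoSb.appendBuffer(ab))
    fetch('/static/media/audio-only.m4a')
        .then(r => r.arrayBuffer())
        .then(ab => audioSb.appendBuffer(ab))
})

video.src = URL.createObjectURL(ms)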

Video format

When the HTML5 standard was being drafted, making one video format part of the standard would have required all browsers to implement it. Vendors disagreed mainly over H.264: it is very capable (high picture quality, high compression ratio, mature encoders and decoders), but it carries hefty licensing fees. Mozilla, which earns no direct revenue from the browser it develops, considered it unacceptable to pay licensing fees for a format written into the standard. In the end the idea of a single mandated video format was abandoned, and browser vendors were left free to choose which formats to support.

However, all major browsers now support H.264 decoding, so H.264 is the preferred choice when selecting a video encoding.

MediaSource API

MediaSource attributes

  • sourceBuffers: returns a SourceBufferList containing all of the MediaSource's SourceBuffer objects
  • activeSourceBuffers: a subset of the previous property; returns the SourceBuffer objects backing the currently selected video track, the enabled audio tracks, and the shown or hidden text tracks
  • duration: gets or sets the duration of the current media presentation; setting a negative value or NaN throws InvalidAccessError, and setting it while readyState is not open or while any SourceBuffer.updating is true throws InvalidStateError
  • readyState: the current state of the MediaSource

ReadyState has the following values:

  • closed: not attached to a media element
  • open: attached to a media element and ready to receive SourceBuffer objects
  • ended: attached to a media element, but the stream has been ended by MediaSource.endOfStream()

MediaSource methods

  • addSourceBuffer(mime): creates a new SourceBuffer object based on the given MIME type and appends it to the MediaSource's sourceBuffers list
  • removeSourceBuffer(sourceBuffer): removes the specified SourceBuffer from the MediaSource; throws NotFoundError if it is not found
  • endOfStream(endOfStreamError): signals the end of the stream; takes an optional DOMString argument indicating the error to raise when the end of the stream is reached
  • setLiveSeekableRange(start, end): sets the range the user can seek to, in seconds (Double); negative values and other invalid arguments throw an exception
  • clearLiveSeekableRange(): clears the previously set live seekable range
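For example, setLiveSeekableRange can limit seeking in a live stream. A small sketch; liveEdge is a hypothetical value tracking the end of the newest appended segment, in seconds:

// Let the user seek only within the last 60 seconds of the live stream.
function updateSeekableRange(ms, liveEdge) {
    if (ms.readyState !== 'open') return
    ms.setLiveSeekableRange(Math.max(0, liveEdge - 60), liveEdge)
}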

addSourceBuffer may throw the following errors:

  • InvalidAccessError: the submitted mimeType is an empty string
  • InvalidStateError: MediaSource.readyState is not equal to open
  • NotSupportedError: the mimeType is not supported by the current browser
  • QuotaExceededError: the browser cannot handle any more SourceBuffer objects

The parameters of endOfStream may be the following two strings:

  • network: terminates playback and signals a network error
  • decode: terminates playback and signals a decoding error

If MediaSource.readyState is not equal to open, or any SourceBuffer.updating is true, calling endOfStream throws InvalidStateError.

It also has a static method:

  • isTypeSupported(mime): returns whether the given MIME type is supported; true means it is probably supported, but this is no guarantee

MediaSource events

  • sourceopen: readyState changes from closed or ended to open
  • sourceended: readyState changes from open to ended
  • sourceclose: readyState changes from open or ended to closed

SourceBuffer API

The SourceBuffer passes chunks of media data through a MediaSource to the media element for playback.

SourceBuffer attributes

  • mode: controls how the sequence of media segments is handled. With segments, segment timestamps determine playback order; with sequence, the order of appending determines playback order. The initial value is set by MediaSource.addSourceBuffer(): segments if the media segments have timestamps, otherwise sequence. It can only be changed from segments to sequence, not the other way around.
  • updating: whether an update is in progress, i.e. an appendBuffer() or remove() call is still being processed
  • buffered: returns a TimeRanges object of the currently buffered ranges
  • timestampOffset: controls the timestamp offset applied to media segments; defaults to 0
  • audioTracks: returns an AudioTrackList of the currently contained AudioTracks
  • videoTracks: returns a VideoTrackList of the currently contained VideoTracks
  • textTracks: returns a TextTrackList of the currently contained TextTracks
  • appendWindowStart: sets or gets the start timestamp of the append window
  • appendWindowEnd: sets or gets the end timestamp of the append window

The append window is a timestamp range used to filter appended coded frames. Coded frames whose timestamps fall inside the range are added to the SourceBuffer; frames outside it are filtered out.
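A minimal sketch of the append window; the 10 to 20 second range and the mediaSegment variable are just examples:

// Only coded frames with timestamps between 10 s and 20 s are kept;
// frames outside the append window are dropped by the segment parser.
sourceBuffer.appendWindowStart = 10
sourceBuffer.appendWindowEnd = 20
sourceBuffer.appendBuffer(mediaSegment) // mediaSegment: a fetched ArrayBuffer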

SourceBuffer methods

  • appendBuffer(source): appends a media data segment (an ArrayBuffer or ArrayBufferView) to the SourceBuffer
  • abort(): aborts the current segment and resets the segment parser, making updating become false
  • remove(start, end): removes media data in the given range
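For example, remove() is handy for evicting already-watched data, such as after appendBuffer throws QuotaExceededError. A sketch under that assumption:

// Remove everything more than 30 seconds behind the playhead.
function evictBack(sb, video) {
    if (sb.updating || sb.buffered.length === 0) return
    const start = sb.buffered.start(0)
    const end = video.currentTime - 30
    if (end > start) sb.remove(start, end) // asynchronous; wait for 'updateend'
}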

SourceBuffer events

  • updatestart: updating changes from false to true
  • update: an append or remove completed successfully; updating changes from true to false
  • updateend: an append or remove has ended; fired after update
  • error: an error occurred during an append; updating changes from true to false
  • abort: an append or remove was interrupted by abort(); updating changes from true to false

Things to note

  1. Almost all MediaSource methods (and property setters) throw InvalidStateError when MediaSource.readyState is not open. Check the current state before calling a method or setting a property, even inside event callbacks, because the state may have changed before the callback runs. endOfStream additionally throws this exception when any SourceBuffer's updating is true.
  2. Before calling a SourceBuffer method or setting one of its properties, check that SourceBuffer.updating is false.
  3. When MediaSource.readyState is ended, calling appendBuffer() or remove(), or setting mode or timestampOffset, flips readyState back to open and fires the sourceopen event, so be prepared to handle sourceopen more than once (see the sketch after this list).
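A minimal sketch combining these notes: queue media data and only touch the SourceBuffer when the MediaSource is open and no update is in progress:

function createAppender(ms, sb) {
    const queue = []
    const drain = () => {
        if (ms.readyState !== 'open' || sb.updating || !queue.length) return
        sb.appendBuffer(queue.shift())
    }
    sb.addEventListener('updateend', drain) // drain after each append finishes
    return data => { queue.push(data); drain() }
}

// Usage: const append = createAppender(ms, sb); append(arrayBuffer)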

Initialization Segment

If you take an arbitrary MP4 file and play it with the example above, you will find that it does not play. That is because a SourceBuffer accepts two types of data:

  1. Initialization Segment: the initial segment, containing all of the initialization information needed to decode the sequence of media segments that follows.
  2. Media Segment: packetized, timestamped media data containing a portion of the media timeline.

MSE requires the fMP4 (fragmented MP4) format. MP4 files use an object-oriented format made up of boxes (also called atoms), and each box can contain other boxes. The box structure of an MP4 file can be inspected with an online MP4 box viewer.

In an ordinary MP4 file, you can see one large mdat (media data) box.

In fragmented MP4, the ISO BMFF initialization segment is defined as a single File Type Box (ftyp) followed by a single Movie Box (moov). For more information, see the ISO BMFF Byte Stream Format spec.

The moov box contains only basic information about the video (type, codec, and so on). The location and size of each sample are described in moof boxes, and each moof box is followed by an mdat box that contains the samples described by the preceding moof.

To check whether a video is fMP4, look at whether ftyp is followed by moov and then alternating moof and mdat boxes.
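This check can even be done in the browser. A minimal sketch that walks the top-level boxes of an MP4 ArrayBuffer (a box header is a 32-bit big-endian size followed by a 4-character type):

function listTopLevelBoxes(arrayBuffer) {
    const view = new DataView(arrayBuffer)
    const types = []
    let offset = 0
    while (offset + 8 <= view.byteLength) {
        let size = view.getUint32(offset) // box size, big-endian
        const type = String.fromCharCode(
            view.getUint8(offset + 4), view.getUint8(offset + 5),
            view.getUint8(offset + 6), view.getUint8(offset + 7)
        )
        types.push(type)
        if (size === 1) size = Number(view.getBigUint64(offset + 8)) // 64-bit size
        if (size < 8) break // 0 means "extends to end of file"; stop either way
        offset += size
    }
    return types // fMP4 looks like ['ftyp', 'moov', 'moof', 'mdat', 'moof', ...]
}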

To convert an ordinary MP4 to fMP4, download Bento4 and run the following command:

mp4fragment ./friday.mp4 ./friday.fmp4

Bento4's /bin directory is full of useful MP4 tools, and its /utils directory contains Python utility scripts.

Tools

Besides Bento4, there are many other useful tools. With the tools below, you can quickly produce video material for experimenting with MSE.

chrome://media-internals/

chrome://media-internals/ is Chrome's built-in tool for debugging media playback; enter it in the address bar to open it.

Shaka Packager

Shaka Packager is a small media packaging tool from Google; the binary is only about 5 MB. It can be used to inspect video information, demux audio and video, and package for HLS and DASH.

Its command lines are in the following format:

packager stream_descriptor[,stream_descriptor] [flags]

For example, to view stream information:

packager in=friday.mp4 --dump_stream_info

File "friday.mp4":
Found 2 stream(s).
Stream [0] type: Audio
 time_scale: 44100
 duration: 271360 (6.2 seconds)
 is_encrypted: false
 codec: AAC
 sample_bits: 16
 num_channels: 2
 sampling_frequency: 44100
 language: und
Stream [1] type: Video
 time_scale: 3000
 duration: 18500 (6.2 seconds)
 is_encrypted: false
 codec: H264
 width: 640
 height: 480
 pixel_aspect_ratio: 1:1
 trick_play_factor: 0
 nalu_length_size: 4

See the official documentation for more.

FFmpeg

FFmpeg is a very powerful open source video-processing tool; many video players use it as their core. Examples in subsequent articles will use it.

For example, to convert an ordinary MP4 to fMP4, you can use the following command:

ffmpeg -i ./friday.mp4 -movflags empty_moov+frag_keyframe+default_base_moof f.mp4

The command line format is as follows:

ffmpeg [global_options] {[input_file_options] -i input_url} ... {[output_file_options] output_url} ...

It can take any number of input and output files: -i precedes each input URL, and anything that cannot be parsed as an option is treated as an output file.

 _______              ______________
|       |            |              |
| input |  demuxer   | encoded data |   decoder
| file  | ---------> | packets      | -----+
|_______|            |______________|      |
                                           v
                                       _________
                                      |         |
                                      | decoded |
                                      | frames  |
                                      |_________|
 ________             ______________       |
|        |           |              |      |
| output | <-------- | encoded data | <----+
| file   |   muxer   | packets      |   encoder
|________|           |______________|

FFmpeg demuxes the input file's container, decodes the data inside, re-encodes it in the specified format, and muxes the result into the output container. Between decoding and encoding, the decoded frames can be processed with filters, such as effects, rotation, or sharpening. Filters are divided into simple and complex; complex filters can handle multiple input streams.

If we just want to change the container of the video, we can speed it up by skipping the decoding and encoding process.

ffmpeg -i input.avi -c copy output.mp4

-c specifies the codec: -c copy copies the stream directly without re-encoding, -c:v applies to the video stream, and -c:a to the audio stream. For example, -c:v libx264 encodes the video as H.264 on the CPU, while -c:v h264_nvenc uses an NVIDIA GPU and is much faster. -c:a copy copies the audio stream directly.

ffmpeg -y -i myvideo.mp4 -c:v copy -an myvideo_video.mp4
ffmpeg -y -i myvideo.mp4 -c:a copy -vn myvideo_audio.m4a

-an removes the audio stream, -vn removes the video stream, and -y overwrites an existing file of the same name without asking for confirmation.

ffmpeg -i input.mp4   # view video information
ffmpeg -formats       # view supported containers
ffmpeg -codecs        # view supported codecs
ffmpeg -encoders      # view built-in encoders

See the official documentation for more information about FFmpeg.

Video thumbnail preview

With the tools above in hand, let's use FFmpeg to implement a small player feature.

On today's video sites, when you hover over the progress bar, a small thumbnail previews the content at that point in time.

ffmpeg -i ./test.webm -vf 'fps=1/10:round=zero:start_time=-9,scale=160x90,tile=5x5' M%d.jpg

The command above generates a sprite sheet composed of 25 preview images, each 160×90.

-vf is followed by the filters: multiple filters are separated by commas, and multiple parameters of a single filter are separated by colons.

fps=1/10 produces one image every 10 seconds (fps=1/60 would be one per minute); round=zero rounds the timestamp toward zero; start_time=-9 shifts sampling so that the first image is taken at about second 1 instead of second 0, skipping a possibly black first frame.

scale=160x90 sets the output image resolution; tile=5x5 stitches the small images into a 5×5 grid; M%d.jpg outputs JPEG files named M1.jpg, M2.jpg, and so on.

If you want to do this from Node.js, you can use node-fluent-ffmpeg's thumbnails method.
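A sketch with fluent-ffmpeg, assuming FFmpeg itself is installed; thumbnails() is an alias of the screenshots() method used here, and it produces individual images rather than a sprite sheet:

const ffmpeg = require('fluent-ffmpeg')

ffmpeg('./test.webm')
    .on('end', () => console.log('thumbnails generated'))
    .on('error', err => console.error(err))
    .screenshots({
        count: 25,           // number of thumbnails to take
        size: '160x90',      // thumbnail resolution
        filename: 'M%i.png', // %i is the thumbnail index
        folder: './thumbs'
    })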

With the sprite sheet in place, we can add thumbnail preview to the player built in the previous article, using a CSS background.

const thumb = document.querySelector('.thumb')
const gapSec = 10 // each thumbnail covers 10 seconds
const images = [...] // sprite image URLs
const row = 5, col = 5 // rows and columns per sprite image
const width = 160, height = 90 // thumbnail width and height
const thumbQuantityPerImg = col * row

function updateThumbnail(seconds) { // show the thumbnail for the given time
    const thumbNum = (seconds / gapSec) | 0 // index of the current thumbnail
    const url = images[(thumbNum / thumbQuantityPerImg) | 0]
    const x = (thumbNum % col) * width // x offset
    const y = ~~((thumbNum % thumbQuantityPerImg) / col) * height // y offset

    thumb.style.backgroundImage = `url(${url})`
    thumb.style.backgroundPosition = `-${x}px -${y}px`
}


This shows only the thumbnail update logic, ignoring the timing, event handling, and styling code.
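A hedged sketch of the omitted event handling, assuming the progress bar is a hypothetical .progress element: map the mouse position to a time, then update the thumbnail:

const progressBar = document.querySelector('.progress')
const video = document.querySelector('video')

progressBar.addEventListener('mousemove', e => {
    const rect = progressBar.getBoundingClientRect()
    const ratio = (e.clientX - rect.left) / rect.width
    updateThumbnail(ratio * video.duration)
})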

See NPlayer for the complete code.

Online demo: nplayer.js.org/

Video slicing

With MSE, a video can be split into many small segments. We can then control buffering to save bandwidth, encode the video at several resolutions and dynamically load lower-bitrate segments when the user's network is poor, insert ads between segments, switch the audio language on the fly, and so on.

./audio/
├── ./128kbps/
│   ├── segment0.mp4
│   ├── segment1.mp4
│   └── segment2.mp4
└── ./320kbps/
    ├── segment0.mp4
    ├── segment1.mp4
    └── segment2.mp4
./video/
├── ./240p/
│   ├── segment0.mp4
│   ├── segment1.mp4
│   └── segment2.mp4
└── ./720p/
    ├── segment0.mp4
    ├── segment1.mp4
    └── segment2.mp4

Alternatively, instead of physically slicing the video, you can fetch byte ranges of a single file with HTTP Range requests.
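A sketch of the Range approach; the file URL is a placeholder and the server must support range requests:

// Request the first megabyte of a single fMP4 file.
fetch('/static/media/friday.fmp4', {
    headers: { Range: 'bytes=0-1048575' }
})
    .then(res => {
        if (res.status !== 206) throw new Error('range requests not supported')
        return res.arrayBuffer()
    })
    .then(ab => sourceBuffer.appendBuffer(ab)) // sourceBuffer created elsewhere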

ffmpeg -i ./friday.mp4 -f segment -segment_time 2 -segment_format_options movflags=dash ff%04d.mp4

The command above cuts the video into 2-second fMP4 segments.

window.MediaSource = window.MediaSource || window.WebKitMediaSource
const video = document.querySelector('video')
const mime = 'video/mp4; codecs="avc1.4d401e, mp4a.40.2"'
const segments = [
    '/static/media/ff0000.5b66d30e.mp4',
    '/static/media/ff0001.89895c46.mp4',
    '/static/media/ff0002.44cfe1e4.mp4'
]
const segmentData = [] // queue of fetched segments waiting to be appended
let currentSegment = 0

if ('MediaSource' in window && MediaSource.isTypeSupported(mime)) {
    const ms = new MediaSource()
    video.src = URL.createObjectURL(ms)
    ms.addEventListener('sourceopen', sourceOpen)
}

function sourceOpen({ target }) {
    URL.revokeObjectURL(video.src)
    target.removeEventListener('sourceopen', sourceOpen)
    const sb = target.addSourceBuffer(mime)
    fetchSegment(ab => sb.appendBuffer(ab))
    sb.addEventListener('updateend', () => {
        if (!sb.updating && segmentData.length) {
            sb.appendBuffer(segmentData.shift())
        }
    })
    video.addEventListener('timeupdate', function timeupdate() {
        if (
            currentSegment >= segments.length &&
            !segmentData.length &&
            !sb.updating &&
            target.readyState === 'open'
        ) { // all segments are loaded
            target.endOfStream()
            video.removeEventListener('timeupdate', timeupdate)
        } else if (
            video.currentTime > currentSegment * 2 * 0.8
            // each segment is 2 seconds long; request the next segment
            // once 80% of the current one has been played
        ) {
            fetchSegment(ab => {
                if (sb.updating) {
                    segmentData.push(ab)
                } else {
                    sb.appendBuffer(ab)
                }
            })
        }
    })
    video.addEventListener('canplay', () => {
        video.play()
    })
}

function fetchSegment(cb) {
    if (currentSegment >= segments.length) return // no more segments
    fetch(segments[currentSegment])
        .then(response => response.arrayBuffer())
        .then(arrayBuffer => {
            currentSegment += 1
            cb(arrayBuffer)
        })
}

This example is deliberately simple; it does not handle seeking, adaptive bitrate, or other complex features.

Conclusion

Almost all video sites now use MSE to play video; it provides a better user experience and significant cost savings. In practice we usually do not use MSE directly but rely on open source libraries such as hls.js and dash.js, yet those libraries are built on MSE underneath, so understanding MSE is the best way to understand them.

  • Official site: nplayer.js.org
  • Source code: github.com/woopen/npla…
  • Codesandbox demo: codesandbox.io/s/ancient-s…

Related articles

  • NPlayer: a video player supporting any streaming media, with a Bilibili-style danmaku experience
  • Developing a Danmaku Video Player From Scratch (1)
  • How video sites like iQiyi, Youku, and Tencent Video play streaming media

References

  • MediaSource
  • Media Source Extensions ™
  • Media Source Extensions
  • Streaming media on demand with Media Source Extensions
  • Streaming a video with Media Source Extensions