Nowadays, more and more businesses want to show video on their own pages. An app that can't play video today is as unimaginable as an app that couldn't display images N years ago.
Playing video on the web naturally revolves around the HTML5 video element, so in this article I want to start from the essence of video and work toward understanding how the video tag behaves and performs.
I. Usage
MDN: The HTML `<video>` element is used to embed a media player into an HTML or XHTML document, supporting video playback within the document.
```html
<video controls width="250">
  <source src="/media/cc0-videos/flower.webm" type="video/webm">
  <source src="/media/cc0-videos/flower.mp4" type="video/mp4">
  Sorry, your browser doesn't support embedded videos.
</video>
```
The example above shows the basic usage of the `<video>` element. The content between the opening and closing `<video>` tags is fallback content, displayed when the browser does not support the element.
Browsers don't all support the same video formats, so you can provide multiple video sources via `<source>` elements and let the browser use the first one it can play.
II. Browser and player
The element is easy to use, but the sentence "browsers don't all support the same video format" carries two important pieces of information.
1. Different browsers support different video formats
Or to put it another way: different browsers support different video formats because they ship with different built-in video players.
The video element is a **replaced element**. A replaced element is one whose rendering the browser determines from the tag and its attributes; such elements typically have no actual content of their own, i.e. they are effectively empty elements.
In Chrome, video is handled by Chrome's own built-in player, while in Safari it is handled by Safari's built-in player.
Different players naturally support different video formats, and they also differ in UI, controllability, and several other respects.
So to discuss which video formats the video tag supports, you must first specify which browser you are talking about.
2. Browsers only support specific video formats
The following chart shows the mainstream video formats supported by the major browsers
If you look at the first row of the table, Chrome's supported formats, you'll see that the familiar names Ogg, MP4, and WebM only appear in parentheses. What actually determines whether a video can play are the names outside the parentheses: the less familiar H.264, HEVC, VP8, and so on.
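For a quick check in code, the video element's canPlayType() method answers exactly this question. A minimal sketch, assuming a few common codec strings (they would need to match the actual file):

```js
// Ask the browser whether it can play a given container + codec combination.
const video = document.createElement('video');

console.log(video.canPlayType('video/mp4'));                           // container only
console.log(video.canPlayType('video/mp4; codecs="avc1.42E01E"'));     // H.264 in MP4
console.log(video.canPlayType('video/mp4; codecs="hvc1.1.6.L93.B0"')); // HEVC in MP4
console.log(video.canPlayType('video/webm; codecs="vp8, vorbis"'));    // VP8 in WebM

// Each call returns "probably", "maybe", or "" — and it is the codec part of
// the MIME type, not the container, that really decides the answer.
```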
So what exactly is a video format? To answer that, we need to start from how a video is composed.
III. Video format
The essence of video is simple: a sequence of images played back quickly.
A few simple concepts:
- Frames: Each frame represents a still image
- Frame rate (FPS): the number of images displayed per second. The higher the frame rate, the smoother the picture; the lower the frame rate, the choppier it looks
- 12 FPS: because of how human vision works, a sequence of images shown at roughly 10–12 frames per second or more is perceived as continuous motion
- 24 FPS: sound film is shot and played back at 24 frames per second, which most people find acceptable
- 30 FPS: in early, highly dynamic video games, frame rates below 30 FPS looked jerky because there was no motion blur to smooth the motion
- 60 FPS: in practice, 60 FPS feels noticeably better than 30 FPS
- 85 FPS: roughly the upper limit of what the human brain can process for video
1. Source file
Let's estimate the size of a video in its simplest, uncompressed form.
- In RGB, each color channel takes a value from 0 to 255, i.e. 256 possibilities, which is exactly 1 byte. So one pixel (R, G, B) has 256 × 256 × 256 possible colors and takes 3 B
- At a resolution of 980 × 720, a single uncompressed image takes 980 × 720 × 3 B ≈ 2 MB
- A typical frame rate is 30, meaning 30 images are shown per second. So one minute of video is 2 MB × 30 × 60 ≈ 3.5 GB

In other words, one minute of raw 980 × 720 video at 30 FPS is about 3.5 GB.
That is exactly how big raw video source files (YUV or RGB) are.
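As a sanity check, here is the same back-of-the-envelope arithmetic in a few lines of JavaScript; the numbers simply mirror the example above:

```js
// Raw (uncompressed) video size: 980×720, 3 bytes per RGB pixel, 30 FPS, 60 s.
const width = 980;
const height = 720;
const bytesPerPixel = 3;   // one byte each for R, G, B
const fps = 30;
const seconds = 60;

const frameBytes = width * height * bytesPerPixel;   // bytes per frame
const totalBytes = frameBytes * fps * seconds;       // bytes for one minute

console.log((frameBytes / 1024 / 1024).toFixed(2), 'MB per frame');        // ≈ 2 MB
console.log((totalBytes / 1024 / 1024 / 1024).toFixed(2), 'GB per minute'); // ≈ 3.5 GB
```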
2. Encoding
A video of this size is obviously unusable: at current Internet speeds, a few minutes of footage could take half a day to download.
So the first task is compression, technically called encoding. Video is generally compressed in two ways.
2.1 Intra-frame coding
Intra-frame coding compresses each image on its own, for example by encoding every frame as a JPEG (the underlying idea being that the human eye is more sensitive to brightness than to color). This step alone can usually shrink the video by about 90%.
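As a rough in-browser illustration of this idea, a frame can be grabbed from a playing video and re-encoded as a JPEG via a canvas. A minimal sketch, where the element id and the quality value are assumptions:

```js
// Grab one frame from a <video> that is already playing and re-encode it as JPEG.
const video = document.getElementById('source-video'); // assumed element id

const canvas = document.createElement('canvas');
canvas.width = video.videoWidth;
canvas.height = video.videoHeight;
canvas.getContext('2d').drawImage(video, 0, 0);

// Compare sizes: raw RGB bytes vs. the lossy JPEG at 70% quality.
canvas.toBlob((blob) => {
  const rawBytes = canvas.width * canvas.height * 3;
  console.log('raw frame ≈', rawBytes, 'bytes, JPEG ≈', blob.size, 'bytes');
}, 'image/jpeg', 0.7);
```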
2.2 Interframe coding
Inter-frame coding exploits the correlation between frames. For example, if only the sun and the people move between the first and second frame, the second frame only needs to store those changes (similar to git, which only stores diffs). Compressed frames are divided into I-frames, P-frames, and B-frames:
- I-frame: key frame, compressed using intra-frame techniques only
- P-frame: forward-predicted frame; during compression it only references frames processed before it, using inter-frame compression
- B-frame: bidirectionally predicted frame; during compression it references both the frames before and after it, using inter-frame compression
Besides I/P/B frames there is also the GOP (group of pictures): the image sequence between two I-frames, with exactly one I-frame per sequence.
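A toy sketch of the inter-frame idea, with frames reduced to plain arrays of pixel values (purely made-up data, nothing like a real codec):

```js
// Instead of storing frame 2 in full, store only the pixels that differ from
// frame 1 — a "diff", much like git.
function diffFrame(prevFrame, nextFrame) {
  const changes = [];
  for (let i = 0; i < nextFrame.length; i++) {
    if (nextFrame[i] !== prevFrame[i]) {
      changes.push([i, nextFrame[i]]); // position + new value
    }
  }
  return changes;
}

function applyDiff(prevFrame, changes) {
  const next = prevFrame.slice();
  for (const [i, value] of changes) next[i] = value;
  return next;
}

const frame1 = [9, 9, 9, 9, 9, 9, 9, 9]; // "I-frame": stored in full
const frame2 = [9, 9, 3, 9, 9, 9, 7, 9]; // only two pixels changed
const delta = diffFrame(frame1, frame2);  // "P-frame": store the diff only

console.log(delta);                    // [[2, 3], [6, 7]]
console.log(applyDiff(frame1, delta)); // reconstructs frame2
```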
3. Encoding format
The general ideas behind compression fall into those two categories; how each standard implements them in detail is up to its designers.
Different companies, organizations, and generations produce different solutions, with very different efficiency and cost.
Common coding standards (listed roughly from lower to higher coding efficiency; the further down the table, the more efficient):
Standard | Organization | Notes |
---|---|---|
H.262, MPEG-2 Part 2 | MPEG / ITU-T | The predecessor of MPEG-4; MPEG-2 was the standard codec for DVDs and early Blu-ray discs. It is rarely used for streaming video. |
AVC (H.264), MPEG-4 Part 10 | MPEG / ITU-T | H.264 provides high-quality video at low bit rates and is the most common and mainstream standard. |
VP8 | Google | Comparable to H.264. |
HEVC (H.265) | MPEG / ITU-T | Designed to deliver higher-quality web video over limited bandwidth; roughly half the size of H.264 at the same visual quality. |
VP9 | Google | Comparable to H.265. |
AV1 | AOM | Aims to reduce bit rate significantly compared with VP9 and HEVC, targeting roughly a 30% reduction, with decoding complexity targeted at roughly twice that of VP9. |
Of course, these are just specifications (much like ES6 and ES10 on the front end): no matter how advanced a spec is, it is useless if nobody promotes and adopts it.
Whether a coding standard becomes widely used depends not only on its efficiency but also on factors such as the organization behind it and patent licensing. The most popular and widely used coding standard today is H.264.
Put another way, H.264 is currently the most popular video format.
4. Container format
The xxx.mp4 and xxx.avi files we usually see are, strictly speaking, video container formats: a container holds the video stream (e.g. H.264), the audio stream (e.g. AAC), synchronization information, subtitles, metadata (e.g. the title), and so on.
Codec (short name) | Full codec name | Supported containers |
---|---|---|
AV1 | AOMedia Video 1 | MP4, WebM |
AVC (H.264) | Advanced Video Coding | 3GP, MP4, WebM, TS, FLV |
H.263 | H.263 Video | 3GP |
HEVC (H.265) | High Efficiency Video Coding | MP4 |
MP4V-ES | MPEG-4 Video Elemental Stream | 3GP, MP4 |
MPEG-1 | MPEG-1 Part 2 Visual | MPEG, QuickTime |
MPEG-2 | MPEG-2 Part 2 Visual | MP4, MPEG, QuickTime |
Theora | Theora | Ogg |
VP8 | Video Processor 8 | 3GP, Ogg, WebM |
VP9 | Video Processor 9 | MP4, Ogg, WebM |
Now go back to the earlier support table and take Chrome as an example: H.264 and H.265 (HEVC) can both be packaged into MP4, but Chrome only supports the former.
5. Audio
A complete "video" contains a video stream, an audio stream, subtitles, and other information.
Video and audio encodings generally exist side by side. Like video, audio goes from source file to encoding to encapsulation; the process can be understood by analogy, so this article won't go into further detail.
Common audio coding standards
Standard | Organization | Notes |
---|---|---|
AAC | MPEG | General purpose (newer) |
AC-3 | Dolby Inc. | Film |
MP3 | MPEG | General purpose (older) |
WMA | Microsoft Inc. | Microsoft platforms |
The most widely used audio encoding format is AAC
IV. Video processing
As we've seen, whether a player can play a video depends on the video's encoding format, but the container format can't be ignored either.
Take the widely used FLV format as an example:
1. Packing and unpacking (remux and demux)
H.264 and AAC can be packaged as either MP4 or FLV, yet Chrome can only play the former and not the latter, which seems odd when you think about it: it's as if the same two items can be packed into box A but not into box B.
That's because what essentially matters is the contents (H.264, H.265) rather than the box (MP4, FLV): browser support is really about the video encoding, and Chrome supports H.264. So we still need to understand how the contents and the box are converted between each other.
Packing encoded streams into a container file is called remuxing; taking a container file apart back into encoded streams is called demuxing.
Encapsulation, abstractly, is just arranging the encoded data according to a particular layout.
Suppose the H.264 binary data is 123456789. Different containers lay it out differently (the letters stand for other data such as audio, subtitles, etc.):
- MP4: a b 123456789 c d
- FLV: a b 123 c 456 d 789
So in principle we can demux an FLV back into H.264, audio, and subtitles, and then remux them into an MP4.
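A toy demux/remux along the lines of the made-up layouts above, just to show that the payload itself never changes when the box does (real containers are binary and far more involved):

```js
// Pull the "H.264 payload" (the digits) out of the FLV-like layout,
// then lay it out the MP4-like way.
function demux(flvLike) {
  // keep the digits (the encoded video), drop the container's own markers
  return flvLike.filter((part) => /^\d+$/.test(part)).join('');
}

function remuxToMp4Like(payload) {
  return ['a', 'b', payload, 'c', 'd']; // metadata around one contiguous stream
}

const flv = ['a', 'b', '123', 'c', '456', 'd', '789'];
const h264 = demux(flv);            // "123456789"
console.log(remuxToMp4Like(h264));  // ['a', 'b', '123456789', 'c', 'd']
```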
2. MSE
There is an even easier way: we don't have to repackage into a file at all.
With the Media Source Extensions API, the demuxed video and audio data can be fed to the player and played directly (in practice the data is wrapped into fragments the browser accepts, such as fragmented MP4).
The Media Source Extensions API (MSE) provides the capability to implement plug-in-free, web-based streaming media. With MSE, media streams can be created in JavaScript and played using the `<audio>` and `<video>` elements.
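A minimal MSE sketch, assuming the fetched file is already in a format the SourceBuffer accepts (typically fragmented MP4) and that the URL and codec string are placeholders:

```js
// Create a MediaSource, attach it to a <video>, and append a fetched segment.
const video = document.querySelector('video');
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', async () => {
  const mime = 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"'; // assumed codecs
  const sourceBuffer = mediaSource.addSourceBuffer(mime);

  const data = await (await fetch('/media/fragmented.mp4')).arrayBuffer();
  sourceBuffer.addEventListener('updateend', () => mediaSource.endOfStream());
  sourceBuffer.appendBuffer(data);
});
```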
The best-known library in this space is flv.js, open-sourced by a former Bilibili employee. It demuxes FLV files containing supported encodings into H.264 and AAC and plays them through MSE. MSE is broadly supported today, with the main exceptions being iOS and IE.
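Typical flv.js usage looks roughly like the sketch below; the stream URL and element id are placeholders, and the flv.js documentation covers the full option list:

```js
import flvjs from 'flv.js';

// flv.js demuxes the FLV, repackages the H.264/AAC streams, and feeds them
// to the browser through MSE.
if (flvjs.isSupported()) {          // effectively: does this browser have MSE?
  const videoElement = document.getElementById('live-video'); // assumed id
  const player = flvjs.createPlayer({
    type: 'flv',
    url: 'https://example.com/stream.flv', // assumed stream URL
  });
  player.attachMediaElement(videoElement);
  player.load();
  player.play();
}
```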
3. Other approaches
Taking this one step further: if JS can demux, can it also decode? Looking at how browsers implement video, you'll find decoding is handled by FFmpeg under the hood, and in fact JS or Wasm can take over this step as well.
Suppose we now have a video format the browser doesn't support:
- After demuxing, we find the encoding format isn't supported either
- We grit our teeth and finish decoding with JS, or with FFmpeg compiled to Wasm, obtaining the raw video data
Then there are two options:
- Re-encode it into H.264 and play it through MSE
- Take the raw YUV/RGB data, reconstruct each frame as an image, paint the frames onto a canvas one after another to form the video, and play the audio in parallel through an audio element
Both approaches have real-world implementations in the industry, but neither is widely used because of performance and compatibility issues.
Still, in theory any video format can be played this way.
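A rough sketch of the canvas approach from the second option above, assuming a purely hypothetical getNextDecodedFrame() standing in for a Wasm decoder's output, and a placeholder canvas id:

```js
// Paint decoded RGBA frames onto a canvas at the video's frame rate.
const canvas = document.getElementById('player-canvas'); // assumed id
const ctx = canvas.getContext('2d');
const fps = 30;
let lastDrawn = 0;

function drawLoop(now) {
  if (now - lastDrawn >= 1000 / fps) {
    const frame = getNextDecodedFrame(); // hypothetical: Uint8ClampedArray of RGBA
    if (frame) {
      const image = new ImageData(frame, canvas.width, canvas.height);
      ctx.putImageData(image, 0, 0);
      lastDrawn = now;
    }
  }
  requestAnimationFrame(drawLoop);
}

requestAnimationFrame(drawLoop);
```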
V. Live streaming
Everything above assumed we were playing a complete video. In reality, even a few minutes of video easily exceeds 10 MB, and the major video sites serve dramas and variety shows running tens of minutes, as well as live streams of all sizes.
1. Video transmission
In these cases you obviously can't wait for the whole video to download before playing it. There are two mainstream solutions for handling large amounts of video data:
- Sliced files: split the video into 3–10 s clips and download and play them one by one
- Continuous stream: transmit the video as a stream and play it frame by frame (essentially even smaller segments)
There are four common video transmission protocols
Protocol | HTTP-FLV | RTMP | HLS | DASH |
---|---|---|---|---|
Transport | HTTP | TCP | HTTP | HTTP |
Container format | FLV | FLV tags | TS segments | MP4 / 3GP / WebM segments |
Latency | Low | Low | High | High |
Data segmentation | Continuous stream | Continuous stream | Sliced files | Sliced files |
HTML5 playback | Demuxed and played via MSE (flv.js) | Not supported | Native in some browsers, or via hls.js | Played directly, or demuxed and played via MSE |
The takeaway: sliced files are more compatible and better suited to VOD (video on demand); a continuous stream has lower latency and is better suited to live streaming.
In fact, the two solutions above only solve the transmission problem. If a 10 s video is naively split into ten 1 s clips, the player has to reload after each 1 s clip finishes.
In practice that means the progress bar jumps from start to end every second, and at the code level the video is re-initialized and torn down every second. That experience is nowhere near usable.
2. Streaming media
There are two solutions
2.1 Demux the data and play it through MSE
The binary data managed by MSE can be appended and removed at will, so new data arriving on the stream is appended directly to the media source and the video never reloads.
However, iOS Safari, with its poor MSE support, can't use this solution.
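A sketch of that idea: keep appending chunks from a live HTTP stream into a SourceBuffer, so the video element never reloads. The URL and MIME string are placeholders, and the chunks must already be MSE-compatible segments:

```js
// Read a live stream with fetch() and append chunks as they arrive.
const video = document.querySelector('video');
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', async () => {
  const sb = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E, mp4a.40.2"');
  const reader = (await fetch('https://example.com/live-stream')).body.getReader();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // wait for the previous append to finish before appending the next chunk
    if (sb.updating) {
      await new Promise((r) => sb.addEventListener('updateend', r, { once: true }));
    }
    sb.appendBuffer(value);
  }
});
```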
2.2 Let the browser handle it
The first solution requires us to handle all of the following ourselves:
- Keep requesting the latest video data
- Demux it into video and audio streams
- Push the encoded data into MSE
HTTP Live Streaming (HLS) is a live-streaming protocol proposed by Apple, so browsers on Apple's platforms support it natively.
Simply assign the HLS address to the video's src attribute and the browser does all of the above for you.
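A common pattern combining the two cases, assuming hls.js as the MSE-based fallback and a placeholder stream URL:

```js
import Hls from 'hls.js';

// Prefer the browser's native HLS support (Safari/iOS); otherwise fall back
// to hls.js, which plays the stream through MSE.
const video = document.querySelector('video');
const src = 'https://example.com/live/playlist.m3u8'; // assumed stream URL

if (video.canPlayType('application/vnd.apple.mpegurl')) {
  video.src = src;              // the built-in player handles everything
} else if (Hls.isSupported()) { // i.e. the browser supports MSE
  const hls = new Hls();
  hls.loadSource(src);
  hls.attachMedia(video);
}
```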
And, as mentioned at the beginning, video is a replaced element, so many browsers (especially mobile browsers in China) substitute their own players that differ from the standard one.
- The downside is that they are less controllable than the standard video element, for example offering no control over stacking order, with the player always rendered on top of the page…
- The upside is that they handle many video and even streaming protocols for you
For example, QQ Browser and UC Browser can play FLV and HLS streams directly.
With that, I put together a decision flow for live playback that always picks the best available option. (A small number of browsers whose built-in player takes over the video element don't support HTTP-FLV streams but do support HLS; such cases can be handled with a whitelist.)
Reference:
- What happens between recording the video and playing it in the browser
- Web video codec guide
- HTML5 Video Support by Codec
- Wiki: HTML 5 Video
- Multimedia Front End Manual